Can you retrieve the value of a different attribute for a field built from meta header tags?
Is there a way to retrieve an attribute other than "content" from the meta tags? The website we are spidering has an "id" attribute on the meta tag that holds the unique id for the text in the "content" attribute. We need this unique id to match the data against another source, and the "id" value gives us the best chance of a match.
*Update (07/07/14): I am using the web connector to index a public website (not a Sitecore site). I have mapped the meta tags in the header of each HTML page to the fields in my source. For example, the meta tag whose "name" attribute is "citation_author" maps to the field "Authors". One of the meta tags I need to map has both a "name" and an "id" attribute. The "id" attribute will be our primary key to match this data to an item in our Sitecore data. The "content" attribute holds the item name of the data, but will not always be an exact match. Is there a way I can map the "id" attribute value to a field in my fieldset?
In CES, you can add a post-conversion script to your Web Crawler source that parses the meta tags in the HTML head, retrieves the information you need (here, the "id" attribute), and assigns it to custom metadata. You can write that post-conversion script in VBScript, JScript or C#.
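As a rough illustration of the parsing step only, here is a minimal sketch in plain Python. It is not a CES conversion script and uses no Coveo API; the logic would have to be ported to VBScript, JScript or C# and wired to whatever the conversion script exposes for reading the document and setting metadata. The meta name "item_name", the id value and the sample HTML are hypothetical, since the actual tag was not shown in the question.

```python
from html.parser import HTMLParser


class MetaAttributeExtractor(HTMLParser):
    """Collect the "id" and "content" attributes of <meta> tags, keyed by "name"."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            self.meta[name] = {
                "id": attributes.get("id"),        # None when the tag has no id
                "content": attributes.get("content"),
            }


def extract_meta(html):
    """Return {name: {"id": ..., "content": ...}} for every named <meta> tag."""
    parser = MetaAttributeExtractor()
    parser.feed(html)
    return parser.meta


# Hypothetical page head; "item_name" and "12345" are made up for illustration.
sample = """<html><head>
<meta name="citation_author" content="Jane Doe">
<meta name="item_name" id="12345" content="Example Item">
</head><body></body></html>"""

meta = extract_meta(sample)
print(meta["item_name"]["id"])  # prints 12345
```

In a real post-conversion script, the value found this way would be assigned to a custom metadata name, which can then be mapped to a field in the fieldset like any other meta tag.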
You can check https://developers.coveo.com/display/Converter/Analyzing+Text+to+Find+Metadata for more information on your specific case and https://developers.coveo.com/display/Converter/Conversion+Scripts for general information about the conversion process.