Question by psomtri, Sep 2, 2015 1:29 PM

Controlling Web Connector excerpt / custom field populating

In the Web connector, I know we can populate "Title" with one of the HTML metadata fields. I am looking for something similar to populate the excerpt. Is that possible? If not the excerpt, is it possible to populate a custom field on the indexed document with the content of a meta tag from the HTML of the page being crawled? If so, please point me to the documentation on how to do it.

1 Reply
Answer by Jean-François L'Heureux, Sep 2, 2015 2:27 PM

The excerpt is not a field or property on indexed documents, and it is not generated at indexing time. It is instead a property of search results, meaning it is generated at query time. CES generates an excerpt for each result of a search query based on the search terms of the query. The excerpt contains the parts of the original document body that contain the searched terms. There is no way to customize the excerpt of search results.

With the Web connector, all the <meta name="..." content="..."> tags can be used to fill CES custom fields. This is done automatically by CES.
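To picture what "extracting the meta tags" means, here is a minimal sketch in Python using only the standard library. It is not CES code and says nothing about how CES implements extraction; the names `MetaTagCollector` and `extract_meta` are made up for illustration. It simply collects every `<meta name="..." content="...">` pair from a page into a dictionary.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Hypothetical helper: collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but keeps
        # attribute values (such as "MyCustomMetaName") unchanged.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip meta tags without both attributes, e.g. <meta charset="utf-8">.
        if name is not None and content is not None:
            self.metadata[name] = content


def extract_meta(html):
    """Return a dict mapping each meta name to its content value."""
    collector = MetaTagCollector()
    collector.feed(html)
    return collector.metadata
```

Running `extract_meta` on a page is a quick way to check which metadata names are available to map onto custom fields before rebuilding a source.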

Let's suppose you have a <meta name="MyCustomMetaName" content="..."> meta tag in your HTML document and you want its content attribute value to be set in a CES custom field on your CES document.

First, you need to create a CES custom field in the field set used by your web source. You have 2 options to create the custom field:

  1. Name: "MyCustomMetaName", Metadata Name: blank
  2. Name: Any other name (e.g.: "foobar"), Metadata Name: "MyCustomMetaName"

When indexing HTML documents, CES tries to fill all the fields of the field set with the metadata that it extracted from the document:

  1. If the CES field has a metadata name set, CES looks in the metadata extracted from the document for an entry with that name. If it finds one, it fills the CES field with that metadata value and stops.
  2. However, if it doesn't find metadata with that name, or if the CES field doesn't have a metadata name set, CES looks in the extracted metadata for an entry with the CES field name. Again, if it finds one, it fills the CES field with that metadata value.
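The two lookup steps above can be sketched as follows. This is only an illustration of the described order, not CES code: the function name `fill_fields`, the dictionary keys `name` and `metadata_name`, and the exact-match (case-sensitive) comparison are all assumptions made for the example.

```python
def fill_fields(fields, metadata):
    """Hypothetical sketch of the lookup order described above.

    fields:   list of dicts with "name" and optional "metadata_name"
    metadata: dict of metadata extracted from one document
    """
    values = {}
    for field in fields:
        name = field["name"]
        metadata_name = field.get("metadata_name")
        if metadata_name and metadata_name in metadata:
            # Step 1: the field's metadata name matched, so use it and stop.
            values[name] = metadata[metadata_name]
        elif name in metadata:
            # Step 2: fall back to metadata named like the field itself.
            values[name] = metadata[name]
        # Otherwise the field stays unfilled for this document.
    return values


# Both options for creating the custom field end up with the same value.
fields = [
    {"name": "MyCustomMetaName", "metadata_name": ""},
    {"name": "foobar", "metadata_name": "MyCustomMetaName"},
]
metadata = {"MyCustomMetaName": "hello"}
result = fill_fields(fields, metadata)
```

Under these assumptions, `result` holds the value "hello" for both `MyCustomMetaName` and `foobar`, which is why either option works for the custom field.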

Then, you need to re-index the HTML documents by rebuilding your web source. You should then see the new field with its value on your indexed documents after the indexing transaction is applied in the index.

Hope this helps.

Jeff

Comment by abhisfortitude, Sep 3, 2015 10:30 AM

That was helpful Jeff! Thank you
