Question by fcote, May 29, 2015 10:09 AM

How to configure search to take into account the ™ ligature

We're using Coveo for Sitecore 891

On our site, many product names include the ™ ligature (e.g. productname™), with no space between the product name and the trademark symbol. Our users typically do not type the trademark symbol when searching, nor do they append tm to the end of the product name (which I see Coveo does support). As a result, a search for productname does not return results that contain productname™.

We want an indexed document that contains productname™ to be returned whether a user searches for productname, productname™, or productnametm in the search bar. The solution also needs to be general, so that documents containing productname1™, productname2™, or productname3™ are returned by searching for productname1, productname2, and productname3, respectively.

I was considering using a postconversion script to programmatically append [productname] to any metadata or document body that contains [productname]™. Is this the correct approach, or is there a better one?

In any case, how exactly should we go about doing this?

Answer by Daniel Lavoie, Jun 2, 2015 11:16 AM

In the next major release it will be possible to configure the index tokenizer to treat this ligature as a character that is not indexed, but for now a conversion script is the way to go.
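
As a rough illustration of the text transformation such a script would perform, here is a minimal sketch in Python. It is not written against the Coveo postconversion API: `expand_trademarks` is a hypothetical helper name, and the logic would have to be ported to whatever language and hooks your postconversion script actually uses. It assumes a product name is a run of word characters immediately followed by ™, and it appends each bare name once to the end of the text so the indexer sees it as a separate term.

```python
import re

# A run of word characters immediately followed by the trademark sign (U+2122).
# Assumption: product names contain no spaces or punctuation.
TRADEMARKED = re.compile(r"(\w+)\u2122")


def expand_trademarks(text: str) -> str:
    """Return text with the bare form of each trademarked name appended once."""
    names = []
    for match in TRADEMARKED.finditer(text):
        name = match.group(1)
        if name not in names:  # keep first-seen order, skip repeats
            names.append(name)
    if not names:
        return text  # nothing to add, leave the text untouched
    return text + " " + " ".join(names)
```

The same function would be applied to the document body and to each metadata value that can contain a product name. Appending rather than replacing keeps productname™ in the text, so searches that include the symbol are unaffected.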

