
Question by timothevs, Sep 23, 2015 10:17 AM

Matching periods in query

In Coveo for Sitecore, we are trying to match queries against user generated data (one of which is a field called title). Because it isn't all uniform, we have users entering different formats of the same title. For instance one might enter PhD and another, Ph.D.

Since these titles are among the searchable fields, we'd like to ensure that when a person searches for PhD we also return Ph.D., and so on. I could normalize them at index time and store/display them uniformly, but is there a way to do this at search time as well?

The problem isn't restricted to just PhD. Other examples in use are MS/M.S., MD/M.D., MPharm/M.Pharm. etc.

Any suggestions/tips/tricks?

3 Replies

Answer by Simon, Sep 23, 2015 10:27 AM

I would offer two options.

The manual option would be to use the thesaurus.

The semi-automatic option would be to use a conversion script to parse the Title and add the synonym to a custom field (e.g. extendedTitle), which would be free-text searchable.

So the item would have:

  • Title: MS
  • extendedTitle: M.S

If you are using Coveo for Sitecore, you would want to use item processors instead of conversion scripts.
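
To make the alias idea concrete, here is a minimal sketch in Python. It is not Coveo or Sitecore API code, and the function name title_aliases is hypothetical; a real item processor would be written against the Coveo for Sitecore APIs. It assumes titles are simple abbreviations in which each part starts with a capital letter (PhD, M.S., MPharm) and returns the alternative spellings that could be stored in extendedTitle.

```python
import re


def title_aliases(title: str) -> list[str]:
    """Return dotted/undotted variants of an abbreviation, excluding the input.

    Assumption: each part of the abbreviation starts with one capital
    letter followed by optional lowercase letters (Ph + D, M + Pharm).
    """
    compact = title.replace(".", "").strip()
    parts = re.findall(r"[A-Z][a-z]*", compact)

    # Skip anything that is not a multi-part abbreviation
    # (plain words, titles with spaces or digits, and so on).
    if len(parts) < 2 or "".join(parts) != compact:
        return []

    dotted = ".".join(parts) + "."
    # Keep only the spellings that differ from what the user entered.
    return sorted({compact, dotted} - {title})


if __name__ == "__main__":
    for raw in ["PhD", "Ph.D.", "MS", "MPharm", "Professor"]:
        print(raw, "->", title_aliases(raw))
```

The values returned for an item would go into the custom field, so a search for either spelling can match the same item.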


Comment by timothevs, Sep 24, 2015 12:26 PM

That is a very clever way of creating aliases. Thanks for that. I am almost tempted to do that, but I do want to check the Thesaurus out. So exciting, every day something new to learn :)


Comment by Simon, Sep 24, 2015 12:28 PM

Good to hear. Just one thing, however: you might want to use computed fields instead of the pipeline. It is simply cleaner.


Comment by Simon, Oct 2, 2015 5:30 PM

Indeed, a computed field would be the right option:

Simply add a condition to detect when a product code contains a space or a dash, then merge the two halves into a single term and store it in a custom field. Making this field free-text searchable would then allow the search to return items on both the main title and the newly formed alias.
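
As a rough sketch of that condition, written in Python for brevity rather than as an actual computed field, and assuming codes look like uppercase letters, one space or dash, then digits (as in PGR-3300). The pattern and the function name merged_code_aliases are hypothetical and would need adjusting to the real part-number formats.

```python
import re

# Assumed code shape: 2+ uppercase letters, one space or dash, then digits.
CODE_PATTERN = re.compile(r"\b([A-Z]{2,})[\s-]([0-9]+)\b")


def merged_code_aliases(text: str) -> list[str]:
    """Return merged forms (e.g. 'PGR3300') of the codes found in text."""
    aliases: list[str] = []
    for prefix, number in CODE_PATTERN.findall(text):
        alias = prefix + number
        if alias not in aliases:  # keep first-seen order, drop duplicates
            aliases.append(alias)
    return aliases


if __name__ == "__main__":
    print(merged_code_aliases("PGR-3300 ground fault relay"))
    print(merged_code_aliases("Replaces PGR 3300 and PGR-3300"))
```

Codes that are already written without a separator produce no alias, since the original field already contains that form.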

Hope it helps, Simon


Answer by Jean-François L'Heureux, Sep 23, 2015 10:27 AM

The feature you are referring to is called a Thesaurus. It is a dictionary of query expansions, mainly for synonyms and acronyms. You can define thesaurus entries in the CES Admin Tool for your entire index. Here are a few documentation links about the thesaurus and how to use it:


Comment by timothevs, Sep 24, 2015 12:25 PM

Thanks! That does open up a lot of possibilities for us!


Answer by dtegethoff, Sep 29, 2015 5:15 PM

This answer looks like it might address my needs but I'm not sure. I'm on the business user side of things.

We have quite a list of parts such as PGR-3300. We find that people search for part names with a dash, with a space, and with no separator at all. Coveo for Sitecore returns results for queries that use the dash or a space, but not when people combine both halves into one term such as "PGR3300."

I know I can use the thesaurus, but that would be time-consuming given the number of parts we have. Would the item processor address this? Thanks!
