Question by rmcclanahan, Nov 1, 2016 11:25 AM

Coveo Performance Tuning

I am reviewing performance on our current Coveo for Sitecore indexes and I'm looking to improve a few things, but I am having trouble finding information on how to do this. One example of what we are trying to do is improve the performance of the autocomplete on our custom search box. The autocomplete is essentially doing a "StartsWith" on the title field, but this is taking between 0.35 and 0.45 seconds in the Coveo admin, which is way too long with no load on the server. Is there any guidance/documentation on things that could be done to improve performance from an index structure standpoint? For example, could I pull all of the titles into a new index with just the title field and query that index rather than my web index?

Comment by ssartell, Nov 1, 2016 12:07 PM

Is this a custom type-ahead feature or Omnibox? If it's custom, are you doing a values query on a field via the REST API or a full query on the title?

Comment by rmcclanahan, Nov 1, 2016 3:30 PM

This is a custom implementation. I'm not sure what you mean by a values query, but our query ends up being something like @title*="Some Que*". This query will eventually be run through the REST API, but right now we are just testing it in the index browser in the CES Admin.

Comment by ssartell, Nov 1, 2016 4:22 PM

Coveo has a couple different REST API endpoints that you might want to look at besides the standard query endpoint. My guess is that you might see better performance through these endpoints than doing a standard wildcard query.

https://developers.coveo.com/display/public/SearchREST/Listing+Values+of+a+Field
https://developers.coveo.com/display/public/SearchREST/Getting+Query+Suggestions

2 Replies
Answer by Jean-François L'Heureux, Nov 2, 2016 8:26 AM

@ssartell is right. You should use the first link he posted. Configure your @title field as a facet and use the REST call to list all its values. This call also accepts wildcards. The performance should improve significantly.
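As a rough illustration of the field values call suggested above, here is a minimal Python sketch that only builds the request URL for a type-ahead prefix. The base URL is a placeholder, and the `/values` path and the parameter names `field`, `pattern`, `patternType` and `maximumNumberOfValues` are assumptions that should be checked against the "Listing Values of a Field" page linked earlier in the thread.

```python
from urllib.parse import urlencode


def build_values_url(base_url, field, prefix, max_values=10):
    """Build a field-values request URL for a type-ahead prefix.

    The endpoint path and parameter names are assumptions; verify them
    against the Search REST API documentation for your Coveo version.
    """
    params = {
        "field": field,                       # facet-enabled field, e.g. "@title"
        "pattern": prefix + "*",              # trailing wildcard emulates StartsWith
        "patternType": "Wildcards",           # assumed name of the wildcard mode
        "maximumNumberOfValues": max_values,  # keep the suggestion list short
    }
    return base_url.rstrip("/") + "/values?" + urlencode(params)


# Placeholder host; replace with your own REST endpoint.
url = build_values_url("https://search.example.com/rest/search", "@title", "Some Que")
print(url)
```

The idea is that the index returns distinct title values matching the pattern rather than full result documents, which is why it may be cheaper than a wildcard query on the standard query endpoint.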

Comment by rmcclanahan, Nov 2, 2016 11:18 AM

Thanks @ssartell and @jflheureux. This reduces the query time to about 0.13 seconds, which is still higher than I would like. This was one example of a few performance issues we are running into, so I wanted to see if there was some general guidance on how to structure an index for performance. For example, should the goal be to have as little data as possible, with the fewest boxes checked in the field editor (e.g. no faceting, sorting, full-text search, etc.)? The reason I'm asking is that the tests have been taking a long time to build out, so it would be helpful to understand how things like number of indexes, index size, number of fields, etc. impact performance. That way I can identify what is an issue and either resolve it or push back on my stakeholders. I know this is somewhat vague, so if needed I can open a support ticket.

Also, do you know if this values REST call is going to respect security? For example, if the Sitecore user does not have access to the document, will they still see that value come back?

Comment by Jean-François L'Heureux, Nov 3, 2016 9:30 AM

All the index calls respect security. The Usage Analytics Top Queries call does not, however, as it is not an index call.

Comment by deepakjg, May 1, 2017 6:49 PM

Hello,

May I know how I can clear the "cache" when too many constant expressions get created? I had some code that dynamically added constant expressions. After reading this thread, I changed it to use advanced expressions. However, I'd like to reset the cache to improve indexing performance.

Thanks.

Deepak

Answer by Jean-François L'Heureux, Nov 3, 2016 9:06 AM

I don't think we have a performance guide available. Here's what I know:

  • If you use LINQ queries somewhere, follow the Optimizing LINQ Query Performance guide. Option 1 of this guide also applies if you are not using LINQ.
  • Avoid using wildcard, fuzzy match, phonetic match and other special field query operators. They are the slowest operators available. Whenever possible, stick with simple free-text search, field contains keyword (=) and field exact match (==) operators.
  • Avoid setting all the fields as free-text searchable, facet or sortable. Only set those attributes on needed fields.
  • Avoid creating a free-text searchable computed index field with all the values of all the item fields. Use the Coveo indexed document body/binaryData feature instead. It is a property on an indexed document instead of a field. It is free-text searchable by nature and also used to create the document quickview and provide search result excerpt. (see Indexing Documents with HTML Content Processor)
  • Limit the number of indexed fields to the minimum (either with includeField/excludeField config sections or coveoIndexingGetFields pipeline). Coveo for Sitecore indexes all the fields by default for customers to avoid complex configuration. A large number of fields will slow down the indexing and search processes. We are currently creating new intelligence to select fields to index in the upcoming December 2016 release.
  • Connect only one Sitecore farm to a CES server or Coveo Cloud organization. Every Sitecore farm creates a ton of fields, 1 security provider and 2 sources and a few other things. The more you connect, the slower it will get.
  • If you have a farm of more than one Sitecore server (CM/CD) connected to the same Coveo index, ensure your farmName value is the same in all the Sitecore instances to avoid having 2 Sitecore servers battling to update the Coveo index configuration. (see Installing Coveo for Sitecore in a CM or CD Configuration)
  • If you use an on-premises CES index and you experience slow indexing, check the ConfigObjectCache size.
  • Avoid asking for hundreds of search results per query. The Coveo JavaScript Search Framework supports paging. Use reasonable page sizes like 10, 20 or 25.
  • Avoid building a search interface that issues more than one search query per actual user search. One user query should execute only one search call to the index. Do not separate search results per type of content. Use a content type facet instead to drill down search results. This will ensure better relevancy, better performance, and accurate usage analytics data.
  • Do not disable the 15-minute Coveo Cloud query/result cache by decreasing the maximumAge parameter value on queries.
  • Do not insert dynamic query expressions in the constant expression part of the search query. This expression should be used only for really static filters that are hardcoded and will never change. All the constant expressions received by the index are inserted in a cache refreshed when documents are indexed. The bigger this cache, the slower indexing gets.
  • When using a Coveo Cloud index and using numeric fields in your query filters, ensure these numeric fields are marked to be cached in the index with useCacheForNumericQuery. (see Understanding the Coveo Search Provider's Configuration File)
  • Avoid configuring a field as a facet when most of the indexed documents have a different field value. This creates a field with a very high cardinality, which is slower.
  • Avoid repeating a field name in an OR query like @field=="Value1" OR @field=="Value2". Use parentheses instead, like @field==("Value1","Value2").
  • Avoid adding multiple Query Ranking Expressions (QRE) with very similar expressions only different by their field value. Use a Query Function (QF) instead (see Query Function)
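A few of the points above (one field name per OR group, static filters only in the constant expression, reasonable page sizes) can be sketched in Python as follows. This is only an illustration under assumptions: the helper names are hypothetical, the example fields `contenttype` and `language` are made up, and the payload keys `q`, `aq`, `cq`, `firstResult` and `numberOfResults` are assumed to match the Search REST API query parameters and should be verified against its documentation. Quote handling is deliberately simplistic.

```python
def field_in(field, values):
    """Build @field==("Value1","Value2") instead of repeating the field in an OR."""
    # Simplification: strip embedded double quotes rather than escaping them.
    quoted = ",".join('"' + v.replace('"', "") + '"' for v in values)
    return "@" + field + "==(" + quoted + ")"


def build_query(user_text, static_filter, dynamic_filters, page=0, page_size=10):
    """Assemble query parameters, keeping dynamic filters out of the constant expression."""
    return {
        "q": user_text,                   # what the user typed
        "cq": static_filter,              # constant expression: hardcoded filters only
        "aq": " ".join(dynamic_filters),  # advanced expression: per-request filters
        "firstResult": page * page_size,  # paging offset
        "numberOfResults": page_size,     # keep pages small (10, 20, 25)
    }


# Hypothetical field names, for illustration only.
types_filter = field_in("contenttype", ["Article", "News"])
payload = build_query("coveo tuning", '@language=="en"', [types_filter], page=2)
print(types_filter)
print(payload)
```

Keeping anything that varies per request in the advanced expression follows the advice above that the constant expression cache grows with every distinct expression it receives.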

I hope this helps,

Jeff

Comment by ssartell, Nov 3, 2016 1:56 PM

This is an awesome list. Definitely bookmarking this page now. Thanks @jflheureux!
