Question by ncipollina, Oct 19, 2016 2:04 PM

Coveo returning duplicate results

We have a weird issue that affects only one of the template types on our site, and only on our production Content Delivery site. When we query for an item, we get multiple results for the same item. When I look at the response from the Coveo REST call, I see the same item returned with different versions. I've added the enableDuplicateFiltering parameter to the query to keep the duplicates out of the UI, but can anyone tell me why they might be showing up on the site this way?

Thanks in advance, Nick

Comment by Sébastien Belzile, Oct 19, 2016 2:20 PM

Is the same query sent to the index on both your CM and your CD?

Comment by ssartell, Oct 19, 2016 2:29 PM

Can you look at the request in your browser's dev tools and see if the cq expression has @fz95xlatestversionXXXX=="1" in it?

Comment by ncipollina, Oct 19, 2016 2:33 PM

Yes, the cq expression looks like this:

(@fz95xlanguage79869=="en" @fz95xlatestversion79869=="1")

Comment by Sébastien Belzile, Oct 19, 2016 2:58 PM

Can you confirm that it is not this bug: https://answers.coveo.com/questions/5587/_latestversion-virtualfield-showing-1-for-all-items ?

Comment by ncipollina, Oct 19, 2016 3:01 PM

We are using Sitecore 8.1 and according to that article, this should be fixed now.

Comment by ncipollina, Oct 19, 2016 3:42 PM

One thing to also note, if I do a full index rebuild, the duplicates disappear from the index.

Comment by Sébastien Belzile, Oct 19, 2016 4:45 PM

  1. Do you have the same symptoms, though: do all the items in your index have the field _latestversion == 1?

  2. Are we talking about the web index?

  3. If this is the case, do you have the inbound filter IndexLatestVersionInboundFilter enabled in your configuration?

  4. Do you use Sitecore 8.1 update 1 or above?

  5. Version of C4SC?

Comment by ncipollina, Oct 19, 2016 4:55 PM

  1. Yes.
  2. Yes, it's actually called pubweb, but it is my Content Delivery web index.
  3. Yes, it is enabled.
  4. Sitecore 8.1 rev 160519
  5. Version 4.

One thing I just noticed is that I'm missing the onPublishEndAsync strategy on the index. Could that be the culprit?

Comment by Jean-François L'Heureux, Oct 20, 2016 4:09 PM

That could be it, since the onPublishEndAsync strategy handles item deletions in the web (or pubweb in your case) database by deleting them from the index too. You should not use the referenced one, as it is configured for the web database. You should define the entire strategy with its own database configuration.

Comment by ncipollina, Oct 20, 2016 4:57 PM

According to the installation instructions, the strategy should not be on the CD instances.

https://developers.coveo.com/display/public/SitecoreV4/Installing+Coveo+for+Sitecore+in+a+CM+or+CD+Configuration

I do have this strategy on the CM instance.

Comment by ncipollina, Oct 21, 2016 1:58 PM

Another problem with using the enableDuplicateFiltering flag is that while it only returns the latest version, the facet values reflect all versions. Is this something I should create a support ticket for?

Comment by François Lachance-Guillemette, Oct 24, 2016 8:40 AM

Adding enableDuplicateFiltering to avoid documents duplicated by versions does not fix the initial problem that you have. This option will filter out documents that look similar, meaning it might filter some documents that you don't expect to be filtered.

Regarding: "One thing to also note, if I do a full index rebuild, the duplicates disappear from the index."

Then the documents only get duplicated when you publish a new version of this item?

Have you compared all the fields for the duplicate items? What are their @fz95xlatestversion79869 field values?

Comment by ncipollina, Oct 24, 2016 9:04 AM

We had just started to go down this path to see what is going on with the latest version field. We actually have two issues going on. The first is that publishing the item through workflow does not cause the index to be rebuilt for that item. The second is this duplicate item issue. When I rebuild the index on the bucket that my item is in, I have both versions of the same item, and they both have latest version set to true.

Comment by ncipollina, Oct 24, 2016 10:06 AM

I have one other observation about our situation as well. We currently have three indexes defined:

Coveomasterindex
Coveowebindex
Coveopubwebindex

The master and web indexes are both getting updated correctly on publish but pubweb is not.

Comment by Jean-François L'Heureux, Oct 24, 2016 11:09 AM

What are the strategies on your pubweb index?

Comment by ncipollina, Oct 24, 2016 11:14 AM

We literally have the web and pubweb indexes defined exactly the same.

1 Reply

Answer by Jean-François L'Heureux, Oct 26, 2016 4:28 PM

In your last comment to the question, you mention that your pubweb index strategy is:

<strategies hint="list:AddStrategy">
  <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync"/>
</strategies>

That is the problem! Your strategy is a reference to contentSearch/indexConfigurations/indexUpdateStrategies/onPublishEndAsync. If you check your showconfig.aspx and look for that config node, you'll see that this strategy is hardcoded to listen to publish events for the web database, not the pubweb database.

You have to copy the referenced strategy XML elements and replace your reference with that copy. Then change web to pubweb.
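
As a rough sketch, the inlined strategy could look something like the following. This assumes the stock Sitecore 8.x definition of onPublishEndAsync; the type name and the CheckForThreshold element are what a default install usually shows, not something taken from your configuration, so compare them against the node in your own showconfig.aspx before copying.

```xml
<strategies hint="list:AddStrategy">
  <!-- Inline copy of the onPublishEndAsync strategy instead of a ref.
       Assumption: type and child elements match a default Sitecore 8.x install;
       verify them against your own showconfig.aspx output. -->
  <strategy type="Sitecore.ContentSearch.Maintenance.Strategies.OnPublishEndAsynchronousStrategy, Sitecore.ContentSearch">
    <!-- The referenced strategy has "web" here; this index needs "pubweb". -->
    <param desc="database">pubweb</param>
    <CheckForThreshold>true</CheckForThreshold>
  </strategy>
</strategies>
```

The only intended difference from the referenced node is the database parameter, so the pubweb index listens to publish events for the pubweb database.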

Comment by Aljosa Asanovic, Oct 26, 2016 5:55 PM

The documentation has also been updated to describe this change more precisely.

Step 7 d) of the "Configuring the pub Database Search Index" section is the one you can take a look at for clarification. Notice the change from "ref" to "type" which allows you to define the strategy for use with the pub index instead of referencing the existing one which is hardcoded to the web index.

Comment by ncipollina, Nov 4, 2016 11:08 AM

This was the magic bullet for us. Thanks!