Question by Pedro, Feb 2, 2018 12:54 PM

How to get the index structure to match the Sitecore tree

After setting up Coveo for Sitecore 4 on differently scaled environments, I often get different index structures and can't pinpoint what dictates them.

As an example, on a test environment (environment 1), after running a full index rebuild through the index manager, I get a structure with all documents directly under the index name folder:

On environment 2, the documents are more organized, but there's an unintended separation between http (for content items) and https (for media and global items), which I find strange. On top of that, the media items appear under a shell folder even though that's not where they live in the Sitecore tree:

As far as I can tell, all document links work correctly when queried from Sitecore, even those for the media documents.

Can someone explain what might cause this type of discrepancy? I am looking to understand the issue so that, in the end, the index matches the Sitecore tree as closely as possible.

Thank you.

Comment by François Lachance-Guillemette, Feb 6, 2018 3:21 PM

Have you set the `serverUrl` parameter in your configuration for only one of those sites?

Here is an article that explains how the clickable URI is computed: Understanding How The ClickableUri Value Is Computed

Comment by Pedro, Feb 6, 2018 4:09 PM

For test.env2 (the environment I consider to be incorrect), I've set different serverurl values while trying to troubleshoot the problem:

Try 1:
serverurl on CM instances was the author website
serverurl on CD instances was the public hostname

Try 2:
serverurl on all CM and CD instances was the public hostname

Try 3:
serverurl on CM instances was empty
serverurl on CD instances was the public hostname.

For all of the above, the result in the Coveo index was always similar to the one in the screenshot, with the author URL and the media items under shell.
The author website URL still shows up in the index even when serverurl is set to something else on all instances, which makes me believe the value is coming from somewhere else.
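
A rough way to compare the attempts above is to export a handful of document URIs from the index after each rebuild and group them by scheme and host. The sketch below is only an illustration: the function name `group_by_origin` and the sample URIs are hypothetical, and it uses nothing but the Python standard library, not any Coveo or Sitecore API.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_origin(uris):
    """Group URIs by (scheme, host) and collect the first path segment of each."""
    groups = defaultdict(set)
    for uri in uris:
        parts = urlsplit(uri)
        # Keep only non-empty path segments, e.g. "/sitecore/shell/x" -> ["sitecore", "shell", "x"]
        segments = [s for s in parts.path.split("/") if s]
        first = segments[0] if segments else ""
        groups[(parts.scheme, parts.hostname)].add(first)
    return dict(groups)


# Hypothetical sample URIs; replace them with values exported from your own index.
sample = [
    "http://author.example.com/en/news/article-1",
    "https://author.example.com/sitecore/shell/-/media/logo.png",
    "https://author.example.com/sitecore/shell/-/media/banner.jpg",
]

for (scheme, host), first_segments in sorted(group_by_origin(sample).items()):
    print(scheme, host, sorted(first_segments))
```

If the same host appears under both http and https, or media URIs keep starting with a shell path regardless of the serverurl value, that would support the idea that the value is being resolved from somewhere other than that setting.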

I've been through this documentation more than once now, and all I can think of that might affect it is the SiteName option, which I am not passing. Could that be making a difference? I certainly didn't use it on previous CFS 3.0 installations. If so, given that this installation is composed of 8 Sitecore sites, how could I make use of that setting and still have items from all sites crawled?
