Index Item and associated Media Item
I have an Item Template "DocumentItem" in Sitecore which is just a placeholder with a link to a Media Library Item (mostly PDFs) and some custom properties.
I'm looking for a way to index my DocumentItem so that the excerpt in the search results comes from the content of the linked Media Library file.
Is there any way I can achieve this?
Simon's answer about the coveoPostItemProcessingPipeline pipeline is right. You should create a custom processor for this pipeline. But you don't need to put the linked document text in a custom field, nor change the UI to display this custom field.
Instead, in your processor, detect whether the item about to be indexed is an instance of your "DocumentItem" template. When that is the case, get the linked document's binary data from the Sitecore API and set it as the value of the CoveoItem.BinaryData property of the CoveoPostItemProcessingPipelineArgs object passed to your processor.
All the other fields on the item will be handled by Coveo for Sitecore. You just need to set the BinaryData property. Then, when CES indexes the document, it will automatically detect the type of file contained in the binary data, extract its text and allow full-text search on all the words in the document. The excerpt in the search results will be generated automatically from the original document text.
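To make the control flow concrete, here is a rough, language-neutral sketch written in Python. It is not Coveo or Sitecore code: the real processor would be a C# class registered in the coveoPostItemProcessingPipeline, and every class, field and function name below (SourceItem, IndexedItem, PostItemProcessingArgs, process, load_media_bytes, linked_media_id) is a hypothetical stand-in. Only the idea is taken from the answer above: check the template, fetch the linked media bytes, and assign them to the binary data of the item being indexed.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Template name from the question; everything else here is a stand-in.
DOCUMENT_TEMPLATE = "DocumentItem"


@dataclass
class SourceItem:
    """Stand-in for the Sitecore item about to be indexed."""
    template_name: str
    linked_media_id: Optional[str] = None  # hypothetical link to the media item


@dataclass
class IndexedItem:
    """Stand-in for the CoveoItem carried by the pipeline args."""
    binary_data: Optional[bytes] = None


@dataclass
class PostItemProcessingArgs:
    """Stand-in for CoveoPostItemProcessingPipelineArgs."""
    item: SourceItem
    coveo_item: IndexedItem = field(default_factory=IndexedItem)


def process(
    args: PostItemProcessingArgs,
    load_media_bytes: Callable[[str], Optional[bytes]],
) -> None:
    """Attach the linked file's bytes to DocumentItem instances only."""
    item = args.item
    if item.template_name != DOCUMENT_TEMPLATE:
        return  # other templates keep their default indexing behaviour
    if item.linked_media_id is None:
        return  # placeholder without a linked file: nothing to attach
    data = load_media_bytes(item.linked_media_id)
    if data:
        # The indexer is then expected to extract text from these bytes.
        args.coveo_item.binary_data = data


# Usage: a dict plays the role of the Media Library lookup.
media_library = {"media-1": b"%PDF-1.4 sample"}

document_args = PostItemProcessingArgs(SourceItem("DocumentItem", "media-1"))
process(document_args, media_library.get)

other_args = PostItemProcessingArgs(SourceItem("ArticleItem", "media-1"))
process(other_args, media_library.get)
```

In the real processor, the lookup passed in as load_media_bytes would be replaced by whatever Sitecore API call you use to read the media item's stream. The two early returns matter: the processor runs for every indexed item, so it should leave anything that is not a DocumentItem untouched.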
I have never tried this particular case, but my guess would be to use the Coveo Item Processing pipeline.
From there you can grab the body of the linked document and place it in a custom field. Then you can replace the out-of-the-box excerpt with your custom field.
Try it and if you hit a roadblock with your code, post it as a comment and we can work on this together.