Question by fcote, Nov 23, 2015 10:08 AM

Indexing Media Item and Postprocessing in Coveo for Sitecore

We've been noticing this issue in our logs quite a bit on our dev system. We have built a resource library, using Sitecore items in a non-public folder in Sitecore which we then expose via both the Coveo search as well as a custom built interface on the website. When the resource items from the resource folder are indexed by Coveo, we attach the binary data of the related media item (resource such as .pdf or .doc, etc) to it (code at bottom). Please advise if this approach is what's causing the below errors, or if it's something else.

We are specifically looking for the reason behind the errors so we can correct the problem. Thanks!

If I recall correctly, we're using the October Coveo release (3.0.1123) as well as Sitecore 8 Update 4 (build 150621)

9432 16:42:57 ERROR An error occurred while calling method "GetItemBinaryData".
Exception: Coveo.Connectors.Sitecore2.SitecoreWebServiceExceptions.SitecoreWebServiceItemNotFoundOrAccessDeniedException
Message: The Sitecore item identified by "sitecore://master/{4890D27E-3EB4-4A80-BD10-219D195E11E2}?lang=vi&ver=1" couldn't be found or was denied with the current credentials.
Source: Coveo.Connectors.Sitecore2.SitecoreWebService
   at Coveo.Connectors.Sitecore2.SitecoreWebService.Wrapper.BaseSitecoreWrapper.GetItemBinaryData(String p_ItemUri)
   at Coveo.Connectors.Sitecore2.SitecoreWebService.SitecoreWebService.TryCatchWrapper[T](Func`1 p_Action, String p_MethodName)

12136 16:42:57 ERROR An error occurred while calling method "GetItemBinaryData".
Exception: Coveo.Connectors.Sitecore2.SitecoreWebServiceExceptions.SitecoreWebServiceItemNotFoundOrAccessDeniedException
Message: The Sitecore item identified by "sitecore://master/{F3D0F4CE-2DAA-40B7-AE8F-3D949F61F0DC}?lang=zh-Hant&ver=1" couldn't be found or was denied with the current credentials.
Source: Coveo.Connectors.Sitecore2.SitecoreWebService
   at Coveo.Connectors.Sitecore2.SitecoreWebService.Wrapper.BaseSitecoreWrapper.GetItemBinaryData(String p_ItemUri)
   at Coveo.Connectors.Sitecore2.SitecoreWebService.SitecoreWebService.TryCatchWrapper[T](Func`1 p_Action, String p_MethodName)

9988 16:42:57 ERROR An error occurred while calling method "GetItemBinaryData".
Exception: Coveo.Connectors.Sitecore2.SitecoreWebServiceExceptions.SitecoreWebServiceItemNotFoundOrAccessDeniedException
Message: The Sitecore item identified by "sitecore://master/{448B2FB1-BF89-4CBC-B898-A41DA04D7B96}?lang=ko&ver=1" couldn't be found or was denied with the current credentials.
Source: Coveo.Connectors.Sitecore2.SitecoreWebService
   at Coveo.Connectors.Sitecore2.SitecoreWebService.Wrapper.BaseSitecoreWrapper.GetItemBinaryData(String p_ItemUri)
   at Coveo.Connectors.Sitecore2.SitecoreWebService.SitecoreWebService.TryCatchWrapper[T](Func`1 p_Action, String p_MethodName)

Here's an excerpt from our "ResourceMediaItemPostProcessor" pipeline processor:

var resourceField = ((FileField)item.Fields["Resource"]);
if (resourceField.MediaItem != null)
{
    var mediaItem = MediaManager.GetMedia(resourceField.MediaItem);
    // Skip video files; their binary data is not attached to the resource item.
    if (mediaItem != null && mediaItem.Extension != "mp4" && mediaItem.Extension != "m4v")
    {
        var memoryStream = mediaItem.GetStream();
        if (memoryStream == null) return;

        // Get the underlying stream
        var stream = memoryStream.Stream;

        // Create a byte array large enough to hold the whole media file
        byte[] byteArray = new byte[memoryStream.Length];

        // Set the pointer to the beginning of the stream
        stream.Position = 0;

        // Read until the array is full; a single Read call may return fewer bytes than requested
        int offset = 0;
        while (offset < byteArray.Length)
        {
            int read = stream.Read(byteArray, offset, byteArray.Length - offset);
            if (read == 0) break;
            offset += read;
        }

        p_Args.CoveoItem.BinaryData = byteArray;
    }
}
1 Reply

Answer by Sébastien Belzile, Nov 23, 2015 12:09 PM

Hi,

I see two possible solutions to your problem.

The way Coveo for Sitecore indexes binary data changed in September 2015. The binary data of a Sitecore item is no longer sent into the RabbitMQ queue. Instead, Coveo sends information about the binary data so that the crawler can download it. This behavior fixes RAM issues some clients had with RabbitMQ.

The first way to fix that would be to bring back the old behavior. This article explains how to do this.

A second solution is to modify the message sent to the queue with your postItemProcessor by setting the parameter MustDownloadBinaryData to false on the CoveoItem: p_Args.CoveoItem.MustDownloadBinaryData = false
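As a rough illustration of that second option, here is a minimal, self-contained sketch. `CoveoItemStandIn` is a hypothetical stand-in class, not the real Coveo type: it only models the two members named in this thread (`BinaryData` and `MustDownloadBinaryData`), and the default value of `true` is an assumption made for the example. In a real processor the same two assignments would be made on `p_Args.CoveoItem`.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the Coveo item; only the two members
// mentioned in this thread are modeled here.
class CoveoItemStandIn
{
    public byte[] BinaryData { get; set; }

    // Assumed default for this sketch: the crawler downloads the binary data itself.
    public bool MustDownloadBinaryData { get; set; } = true;
}

static class BinaryAttachment
{
    // Copies the whole source stream into the item, then flags the item so
    // that the crawler does not try to download binary data on its own.
    public static void Attach(CoveoItemStandIn item, Stream source)
    {
        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            item.BinaryData = buffer.ToArray();
        }
        item.MustDownloadBinaryData = false;
    }

    static void Main()
    {
        var item = new CoveoItemStandIn();

        // An in-memory stream plays the role of the media item's stream.
        using (var source = new MemoryStream(new byte[] { 1, 2, 3 }))
        {
            Attach(item, source);
        }

        Console.WriteLine(item.BinaryData.Length);
        Console.WriteLine(item.MustDownloadBinaryData);
    }
}
```

Setting the flag in the same place where the binary data is attached keeps the two decisions together, so an item never carries custom binary data while still asking the crawler to fetch its own.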

Comment by cwatkins, Nov 23, 2015 2:37 PM

Hi,

Do we know the actual underlying meaning or reason for the error?

I don't particularly mind or have any issue with the "new" approach (allowing the queue crawler to fetch the binary item from Sitecore instead of sending the binary data to RabbitMQ).

Is the issue that I'm attaching the binary data to the actual Sitecore item, so that when the queue crawler pulls the item out of RabbitMQ, it takes the item's ID and tries to retrieve the binary data for it, which doesn't exist because the binary data is actually associated with a different item (the media item)?

If this is the case, then I guess I have to send the media item initially to RabbitMQ.

Thanks.

Comment by Sébastien Belzile, Nov 23, 2015 3:53 PM

This error occurs when:

  1. The Coveo Queue Crawler does not have the rights to see one of the items you are indexing.

  2. The item you are indexing does not exist (this may occur when adding false items with the coveoItemProcessingPipeline pipeline, for example).

  3. Any other exception occurs while retrieving the binary data in the Sitecore web service. We would need logs to tell for sure.

In all of these cases, the Coveo Queue Crawler is trying to get the binary data of your item. If it could reach your item, the binary data you added to your item would be overwritten by the data retrieved from the SitecoreWebService.

This is why you need to tell the crawler not to download the binary data in your coveoPostItemProcessingPipeline processor (or to restore sending the binary data through the queue as the default behavior).

Comment by cwatkins, Nov 24, 2015 2:17 PM

It appears the errors are being reduced after making this change. I'll see what happens after a full re-index. Thanks!