Gravatar for adam@borsi.ca

Question by Adam, Mar 25, 2015 10:05 AM

Source Filter seemingly not Filtering - Exclusion Filter

Greetings,

I've attempted to create an Exclusion filter to keep an entire subpage/directory from being indexed. Unfortunately, I've not been successful with either of the Exclusion filter types, Wildcards or Regular Expression.

I've tried both typing the filter in by hand and creating one via the Index Browser (Collection -> Source -> Folder tree, then Folder Actions -> Add an exclusion filter, with my desired directory checked off).

An example of the thing I'm attempting to exclude is: http://www.mywebsite.com/site website/database one/sitecore/content/home/pleasedontindex/something/ok.aspx

My Wildcard Filter: http://www.mywebsite.com/site website/database one/sitecore/content/home/pleasedontindex/*
As I understand it, this should keep anything under the "pleasedontindex" directory from being indexed.

My RegEx Filter: (\/please(?:dont|DONT)index\/)
As I understand it, anything that matches this pattern won't be indexed, so all files under that directory, including the directory itself.
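A quick way to rule out a syntax mistake is to test both patterns against the example URL outside the indexer. The sketch below uses Python's `fnmatch` and `re` modules as stand-ins for the indexer's own matching, which is an assumption; the function names are hypothetical. It suggests the patterns themselves match the public URL, so the mismatch is more likely in the form of URL the filter is compared against. It also shows that neither pattern covers the folder URL when it has no trailing slash.

```python
import fnmatch
import re

# Example public URL from the question (spaces kept as written).
URL = ("http://www.mywebsite.com/site website/database one/"
       "sitecore/content/home/pleasedontindex/something/ok.aspx")

WILDCARD = ("http://www.mywebsite.com/site website/database one/"
            "sitecore/content/home/pleasedontindex/*")
REGEX = re.compile(r"(\/please(?:dont|DONT)index\/)")


def wildcard_excludes(url: str) -> bool:
    # fnmatchcase is only a stand-in for the indexer's wildcard matching.
    return fnmatch.fnmatchcase(url, WILDCARD)


def regex_excludes(url: str) -> bool:
    # search() looks for the pattern anywhere in the string.
    return REGEX.search(url) is not None


if __name__ == "__main__":
    # The page under the folder matches both patterns.
    print(wildcard_excludes(URL), regex_excludes(URL))
    # The folder URL without a trailing slash matches neither.
    folder = URL.rsplit("/something/", 1)[0]
    print(wildcard_excludes(folder), regex_excludes(folder))
```

If the source stores items under a different URL form than the public one, the same check can be rerun with that string in place of `URL`.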

What am I missing here? Did I set something up incorrectly?

Thank you kindly, Adam

Comment by Adam, Mar 25, 2015 10:09 AM

If it's of any importance, I'm using Sitecore 6.5 with CES 7.0 & Sitecore 2 connector. Thank you kindly, Adam

1 Reply
Gravatar for slangevin@coveo.com

Answer by Simon, Mar 25, 2015 10:23 AM

Hi,

Normal filters with the Sitecore 2 connector are based on the long URL. To see what I am talking about:

1- Go to Index Browser
2- Search for an item to exclude
3- Click on Action
4- Click on Add an Exclusion Filter

Then go take a look at the Exclusion tab and you will see the type of URL on which we filter. If you need to filter out a whole path, you will need to use a conversion script:

https://developers.coveo.com/display/Converter/Conversion+Scripts
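The conversion script API is not shown in this thread, so the following is only a sketch of the decision such a script would need to make, written in Python for illustration. `should_reject` and `EXCLUDED_SEGMENT` are hypothetical names, and the sketch assumes the folder name appears as a path segment in whatever URI the script receives.

```python
from urllib.parse import unquote, urlsplit

# Folder name taken from the question; everything at or below it is rejected.
EXCLUDED_SEGMENT = "pleasedontindex"


def should_reject(uri: str) -> bool:
    """Return True if any path segment equals the excluded folder name."""
    # Decode percent-escapes (e.g. %20) before splitting the path.
    path = unquote(urlsplit(uri).path)
    # Compare whole segments, case-insensitively, ignoring empty ones.
    segments = [segment.lower() for segment in path.split("/") if segment]
    return EXCLUDED_SEGMENT in segments
```

Comparing whole segments rather than substrings covers the folder page itself, with or without a trailing slash, and avoids rejecting unrelated items whose names merely contain the same text.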

Regards,
Simon

Comment by Adam, Mar 25, 2015 10:46 AM

Hi Simon,

Thank you for the response. I do have to apologize, though, because I don't quite follow.

I thought that by following these 4 steps I could filter out content using built-in functionality without having to go the scripting route.

I have already followed these steps, created the "long url" wildcard filter, but Coveo still indexes everything under the undesirable dir.

I'm just looking to exclude everything under the "pleasedontindex" dir/page, including the page itself.

Can I not utilize the Exclusion Filter functionality?

Really appreciate the feedback.

Thank you kindly, Adam

Comment by Simon, Mar 25, 2015 10:52 AM

Unfortunately, this built-in feature works for every connector except Sitecore 2. With the new Coveo for Sitecore, we bind ourselves to the Sitecore inbound filtering pipeline, which makes filtering very efficient. With the Sitecore 2 connector, however, you would need to use a conversion script.

Comment by Adam, Mar 25, 2015 11:04 AM

Hi Simon,

Ahh and therein lies the rub.

Thank you for the insight. Really appreciate it.

More questions to follow :)
