Question by Greg Jankowski, Mar 17, 2018 1:22 PM

Exclude directories from a sitemap search

I have a sitemap that also contains localized versions of some of the blog posts.

I do not want to index any pages whose URL matches one of these patterns:

*/es/* */fr/* */de/* */jp/* */tw/*

I cannot find a way to exclude directories, as you can with the shared Web source.

Any suggestions?

Answer by Etienne, Mar 19, 2018 2:49 PM

Would you not prefer to index them all and add a metadata field for the language, based on the URL? You could eventually display that field as a facet.

If you don't want to index them, you can reject documents with an Indexing Pipeline Extension, or you could try the inclusion filters directly on the source.
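For either approach, the core step is detecting the language segment in the URL. Here is a minimal sketch in Python, assuming the localized pages are identified by one of the path segments listed in the question. The function name `language_from_url` and the example URLs are hypothetical, and the commented usage assumes the extension runtime exposes a `document` object with `uri` and `reject()`, which should be checked against the Coveo documentation.

```python
from urllib.parse import urlparse

# Language path segments taken from the question; adjust to match your site.
EXCLUDED_LANGUAGES = {"es", "fr", "de", "jp", "tw"}


def language_from_url(url):
    """Return the first excluded language segment in the URL path, or None."""
    # Split the path into segments so that "es" matches "/es/" but not "/estimates/".
    segments = urlparse(url).path.lower().split("/")
    for segment in segments:
        if segment in EXCLUDED_LANGUAGES:
            return segment
    return None


# Hypothetical use inside an indexing pipeline extension (assumption: the
# runtime provides `document.uri` and `document.reject()`):
#
#     if language_from_url(document.uri) is not None:
#         document.reject()
```

The same helper could instead feed a language metadata field if you decide to index everything and filter with a facet later.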

Comment by Greg Jankowski, Mar 19, 2018 3:21 PM

At this point, I'd rather not index them.

I'll look at both methods you described.


Comment by Greg Jankowski, Mar 19, 2018 3:23 PM

The issue with the inclusion (and exclusion) filters is that they are not available on a Sitemap source.

Answer by Greg Jankowski, Mar 19, 2018 4:53 PM

What I did was edit the Tab filter expression to:

@syssource=="Sitemap - Ipswitch Blog" AND language="English"

That seems to have worked fine.

Any downside to this approach?

