Question by AlexShapo, Feb 20, 2018 7:30 PM

How to customize source for specific URLs

Hello,

I need to create a source that contains only the News & Media section of a website.

The URL looks like this: http://www.site.com/news-and-media/releases and under that root we have all our news, which look like this:

http://www.site.com/news-and-media/releases/show/this-is-my-news-url

How can I make the source crawl only links that look like the one above? Currently it is crawling everything, even though I specify the correct start link.

1 Reply

Answer by Etienne, Feb 20, 2018 8:44 PM

Check the "inclusion filters" section of this documentation page:

https://onlinehelp.coveo.com/en/cloud/add_edit_web_source.htm#Add_or_Edit_a_Web_Source

Let me know if that solves your problem.

Comment by AlexShapo, Feb 20, 2018 9:26 PM

Is the following the correct way to set up the filter? The Build Index operation fails with the message "Web no document indexed due to filters".

I also tried a regex, but it fails because it can't find any documents:

(.+)\/news-and-media\/releases\/(.*)

Comment by AlexShapo, Feb 22, 2018 2:23 PM

It looks like the filter does not work with URLs that have multiple levels. The following wildcard works for me:

*/news-and-media/*
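
For anyone comparing the two filters, here is a rough Python sketch that checks both patterns against the URLs from the question. It is only an approximation: Python's `re` and `fnmatch` modules stand in for the source's filter matching, which may behave differently, and the helper names are made up for illustration. One thing it does show is that the regex requires a slash after `releases`, so under these assumptions the start URL itself would not match it, while the wildcard matches both URLs.

```python
import re
from fnmatch import fnmatchcase

# URLs taken from the question.
START_URL = "http://www.site.com/news-and-media/releases"
NEWS_URL = "http://www.site.com/news-and-media/releases/show/this-is-my-news-url"

# Filters taken from the comments.
REGEX_FILTER = r"(.+)\/news-and-media\/releases\/(.*)"
WILDCARD_FILTER = "*/news-and-media/*"


def regex_includes(url: str) -> bool:
    """Hypothetical helper: True if the whole URL matches the regex filter."""
    return re.fullmatch(REGEX_FILTER, url) is not None


def wildcard_includes(url: str) -> bool:
    """Hypothetical helper: True if the URL matches the wildcard filter.

    Assumes '*' matches any run of characters, including '/'.
    """
    return fnmatchcase(url, WILDCARD_FILTER)


if __name__ == "__main__":
    for url in (START_URL, NEWS_URL):
        print(url)
        print("  regex:   ", regex_includes(url))
        print("  wildcard:", wildcard_includes(url))
```

If the crawler evaluates filters in a similar way, a start URL that is excluded by the filter could be one explanation for the "no document indexed due to filters" message, but that is a guess, not something confirmed in this thread.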