Question by kurgen, Mar 10, 2017 1:24 PM

Crawling a webpage but only want a small subsection of the page

I set up a crawl of a certain webpage URL and it runs fine, but there are datasheets on the page, and they are the only thing I want to crawl. Each datasheet opens as a PDF in the browser at a different URL from the one being crawled. I see the option of using an inclusion filter to capture URLs beyond the one being crawled, but it does not seem to work. Is it possible that JavaScript on the page is interfering with this? Any ideas?
