Document Filtered (redirected)
I'm trying to figure out what the log description "Document Filtered (redirect)" means. I know that "Document Filtered" means that the document isn't being indexed in the source, but I'm not fully sure about (redirect). Does that mean that the URL of the page being indexed redirected to another page?
The Document Filtered (redirect) message appears when the crawler gets redirected to another page and that page does not match any of the inclusion filters. This usually happens for one of the following reasons:
- The web server is configured to redirect the page.
- The page requires a logon, so the crawler is redirected to the logon page.
So if the redirected page is the one that you want to crawl, make sure that your source has an inclusion filter that matches that page's link. Also make sure that the filter doesn't include any escape characters. For example, %20 should be written as a blank space.
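If you want to check this outside the crawler, here is a minimal Python sketch of the idea: find out where a URL ends up after redirects, then test that final URL against your filters. This is not how the crawler itself does it. The function names are hypothetical, and I'm assuming the filters are simple wildcard patterns (`*`), which may not match your source's actual filter syntax. Both the URL and the filter are percent-decoded before comparing, so `%20` and a blank space are treated the same.

```python
from fnmatch import fnmatchcase
from urllib.parse import unquote
from urllib.request import urlopen


def resolve_redirect(url: str, timeout: float = 10.0) -> str:
    """Return the URL actually served once HTTP redirects have been followed."""
    # urlopen follows redirects by default; geturl() reports the final address.
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def matches_inclusion(url: str, filters: list[str]) -> bool:
    """Return True if the URL matches at least one wildcard filter.

    Assumption: filters are shell-style wildcard patterns. Both sides are
    percent-decoded first, so "%20" and a blank space compare as equal.
    """
    decoded_url = unquote(url)
    return any(fnmatchcase(decoded_url, unquote(pattern)) for pattern in filters)
```

To use it, call `resolve_redirect` on the page that was filtered and pass the result to `matches_inclusion` together with your filter list. If the final URL is a logon page, or any other address your filters don't cover, the result is False, which corresponds to the situation described above.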
The above is what we normally see. However, while writing this answer I ran a few tests and was able to reproduce the same message in an unexpected way: you can also get it when you try to crawl a link that isn't part of your starting address but for which an inclusion filter has been created. Is this your case? Do you have a link that you want to crawl that isn't part of your starting address and that you already have an inclusion filter for? If so, please open a new case with support so that we can work on this with you.