Hi,
I have a list of sitemap files, which I generate programmatically.
These sitemap files are huge, with thousands of URLs each.
It is very difficult to check each and every URL manually.
So I have written a utility that parses each sitemap file and uses Apache Commons HttpInvoker to check whether each URL is valid.
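
For reference, the parsing step looks roughly like the sketch below. It is a simplified illustration, not my exact code: it assumes the sitemap follows the standard sitemaps.org layout with one <loc> element per URL, and the class and method names (SitemapUrls, extractLocs) are made up for this post. It uses only the JDK's built-in XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: the name and structure are illustrative only.
public class SitemapUrls {

    /** Returns the trimmed text of every <loc> element in the given sitemap XML. */
    public static List<String> extractLocs(String sitemapXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Sitemaps declare a default namespace, so parse namespace-aware.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(sitemapXml.getBytes(StandardCharsets.UTF_8)));

            // "*" matches <loc> elements regardless of their namespace.
            NodeList locs = doc.getElementsByTagNameNS("*", "loc");
            List<String> urls = new ArrayList<>();
            for (int i = 0; i < locs.getLength(); i++) {
                urls.add(locs.item(i).getTextContent().trim());
            }
            return urls;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not parse sitemap XML", e);
        }
    }
}
```

Each URL returned by extractLocs is then requested over HTTP and its response code is inspected.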
- Some invalid URLs return a 404 response, so I can find the problem.
- But in some cases an exception occurs and an error page is shown. So the URL is not valid, but it does not return a 404 response.
The response code is 200.
So there is no way for me to tell whether such a URL is valid or not.
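
One idea I am considering, sketched below, is to look at the response body as well as the status code and treat a 200 response as invalid when the body contains text that only appears on the error page. This is only a rough sketch under assumptions: the marker phrases in ERROR_MARKERS are hypothetical and would have to be replaced with whatever the real error page prints, the class and method names are made up, and the sketch uses the JDK's java.net.http client (Java 11+) rather than the Apache Commons library so that it is self-contained.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: names and marker phrases are illustrative only.
public class SoftErrorCheck {

    // Assumed phrases that appear on the site's error page (lowercase).
    // These are placeholders, not taken from any real page.
    static final List<String> ERROR_MARKERS =
            List.of("an error has occurred", "exception", "page not found");

    /** Returns true only for a 200 response whose body has no error marker. */
    public static boolean looksValid(int status, String body) {
        if (status != 200) {
            return false; // 404, 500, and unfollowed redirects all count as invalid here
        }
        String lower = body.toLowerCase(Locale.ROOT);
        for (String marker : ERROR_MARKERS) {
            if (lower.contains(marker)) {
                return false; // 200 status, but the body looks like the error page
            }
        }
        return true;
    }

    /** Fetches the URL with a GET request and applies looksValid to the result. */
    public static boolean check(String url) throws IOException, InterruptedException {
        // The default client does not follow redirects.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return looksValid(response.statusCode(), response.body());
    }
}
```

The obvious weakness is that a legitimate page that happens to contain one of the marker phrases would be flagged as invalid, so the markers need to be specific to the error page.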
I am not sure, but I have heard that webmaster tools do the same kind of checking, so there must be something that can help identify the valid URLs.
Any help on this is appreciated.
Thanks in advance.
Leena

