My SE Bots Activity Details
What would you say about this activity? Something more to disable in robots.txt? What? Thanks a lot! Marcel...
I block most of that via robots.txt.
There's nothing to be gained by having the search engines crawling register.php, for instance. You want to keep them directed to your forum content.
And which of them do you leave unblocked exactly?
It's a bit of different strokes for different folks, but from your list I would block -
printthread.php
search.php
newreply.php
private.php
attachment.php
album.php
register.php
poll.php
newthread.php
editpost.php
inlinemod.php
arcade.php
clientscript/
report.php
subscription.php
customavatars/
profile.php
sendmessage.php
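As a rough sketch of how a list like this turns into robots.txt rules, the Python helper below builds the Disallow lines. The function name `build_disallow_rules` and the `forum_dir` parameter are made up for illustration; `forum_dir` assumes the forum may be installed in a subdirectory, in which case every rule needs that prefix.

```python
def build_disallow_rules(names, forum_dir="/"):
    """Return robots.txt text that blocks each script or directory for all bots.

    names     -- script or directory names, e.g. "printthread.php" or "clientscript/"
    forum_dir -- hypothetical install path of the forum, e.g. "/" or "/forum/"
    """
    # Normalise the prefix so it always starts and ends with exactly one slash.
    prefix = "/" + forum_dir.strip("/")
    if prefix != "/":
        prefix += "/"
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}{name}" for name in names]
    return "\n".join(lines)


# A few entries from the list above; extend as needed.
blocked = ["printthread.php", "search.php", "newreply.php", "clientscript/"]
print(build_disallow_rules(blocked))
```

Calling it with a second argument such as `"/forum/"` (a placeholder path) prefixes every rule with that directory.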
Briansol suggests rewriting everything you want indexed with no extensions and then blocking all .php scripts. This is on my list of things to do, as I think it makes excellent sense.
It gets recommended a lot, but his Ultimate Guide is loaded with great advice - Briansol's Ultimate Guide to vBSEO
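To illustrate the "block all .php scripts" idea, here is a small Python sketch of how a wildcard rule such as `Disallow: /*.php` would be matched. It assumes the crawler honors `*` and `$` in paths, which the major search engines do but which Python's own `urllib.robotparser` does not, so the matching is written out by hand. The function names and the sample paths are invented for the example.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern using `*` and `$` into a regex."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # `*` matches any run of characters; everything else is literal.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url_path, disallow_rules):
    """True if any Disallow pattern matches the start of the URL path."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


rules = ["/*.php"]

# A script URL is blocked, an extensionless rewritten URL is not.
print(is_blocked("/printthread.php?t=123", rules))        # True
print(is_blocked("/general-discussion/123-some-thread/", rules))  # False
```

The point of the approach is that one rule covers every script, so the rewritten, extensionless URLs are the only ones left for the bots to crawl.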
What I don't understand: why is there still bot activity on scripts like printthread and others, when I disallowed them a long time ago?
This is my robots.txt:
# robots.txt for Symptome, Ursachen von Krankheiten - Forum, Hilfe, Tipps zu Gesundheit
# Allow access to all files
User-agent: *
Disallow: /images/
Disallow: /faq.php
Disallow: /attachment.php
Disallow: /avatar.php
Disallow: /cron.php
Disallow: /editpost.php
Disallow: /calendar.php
Disallow: /member.php
Disallow: /memberlist.php
Disallow: /printthread.php
Disallow: /sendmessage.php
Disallow: /register.php
Disallow: /sendtofriend.php
Disallow: /login.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /search.php
Disallow: /subscription.php
Disallow: /threadrate.php
Disallow: /private.php
Disallow: /misc.php
Disallow: /poll.php
Disallow: /showpost.php
Disallow: /showtext.php
Disallow: /profile.php
Disallow: /usercp.php
Disallow: /showgroups.php
Disallow: /cmps_index.php
Disallow: ../wikired/
Disallow: /sitemap/
Disallow: /archive/
sitemap: http://www.symptome.ch/vbboard/sitemap_index.xml.gz
If I enter my site, I'm directed to Symptome, Ursachen von Krankheiten - Forum, Hilfe, Tipps zu Gesundheit automatically.
But the above robots.txt is placed within Symptome, Ursachen von Krankheiten - Forum, Hilfe, Tipps zu Gesundheit
Could it be that the paths are wrong?
Yes, you need to specify the vbboard directory, because paths in robots.txt are matched from the site root:
Disallow: /vbboard/images/
Disallow: /vbboard/faq.php
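To see why the original rules had no effect, here is a quick check using Python's standard `urllib.robotparser`. It is only a sketch: both rule sets are shortened to two lines each, and the `parser_for` helper is made up for the example. Nothing is fetched from the live site.

```python
from urllib.robotparser import RobotFileParser


def parser_for(rules):
    """Build a parser from robots.txt lines without fetching anything."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser


# Shortened version of the original file: paths lack the /vbboard/ prefix.
original = parser_for([
    "User-agent: *",
    "Disallow: /images/",
    "Disallow: /printthread.php",
])

# Same rules with the forum directory included.
corrected = parser_for([
    "User-agent: *",
    "Disallow: /vbboard/images/",
    "Disallow: /vbboard/printthread.php",
])

url = "http://www.symptome.ch/vbboard/printthread.php"

print(original.can_fetch("Googlebot", url))   # True: no rule matches, bots may crawl
print(corrected.can_fetch("Googlebot", url))  # False: the rule matches, bots are blocked
```

A rule only applies when the URL path starts with the rule's path, so `/printthread.php` never matches `/vbboard/printthread.php`.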
Thanks a lot!
So my robots.txt did pretty much nothing. Good to know.
A few days later, the SE bot activity still looks the same.
Why? Does it take that long for the bots to pick up my changed robots.txt?