Do I need a robots.txt?
I'm just wondering, will it help or hurt my site? Do I really need it?
It'll help reduce duplicate content from my understanding.
is duplicate content a bad thing?
Yes, very much so.
ok so when i go to create one, what is the very best thing to put in the txt file itself so that i can just c/p what the masters have put in theirs? :P
i found a post from the dude that made like 1400 in a day or whatever and put this in my txt. now just waiting for google to update its 404 message.
User-agent: *
Disallow: /admincp/
Disallow: /cgi-bin/
Disallow: /clientscript/
Disallow: /includes/
Disallow: /install/
Disallow: /modcp/
Disallow: /subscription.php
Disallow: /payments.php
Disallow: /profile.php
Disallow: /faq.php
Disallow: /calendar.php
Disallow: /search.php
Disallow: /private.php
Disallow: /online.php
Disallow: /sendmessage.php
Disallow: /sendmessage.php?do=
Disallow: /showgroups.php
Disallow: /reputation.php
Disallow: /report.php
Disallow: /threadrate.php
Disallow: /postings.php
Disallow: /newthread.php
Disallow: /newreply.php
Disallow: /register.php
Disallow: /login.php
Disallow: /faq.php
Disallow: /image.php
Disallow: /cron.php
Disallow: /joinrequests.php
Disallow: /printthread.php
Disallow: /showpost.php
Disallow: /archive/
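A quick way to sanity-check a list like this before uploading it is to run a few URLs through Python's standard-library robots.txt parser. This is only a rough sketch: the rules are a shortened copy of the list above, and the thread/post URLs and the `ExampleBot` agent name are made up for illustration. Real crawlers may interpret some rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above; a real file would list every line.
RULES = """\
User-agent: *
Disallow: /admincp/
Disallow: /search.php
Disallow: /printthread.php
Disallow: /showpost.php
Disallow: /archive/
"""


def is_allowed(rules: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical URLs: a full thread, a single post, and an archive page.
    for path in ("/showthread.php?t=123", "/showpost.php?p=456", "/archive/index.php"):
        print(path, "allowed" if is_allowed(RULES, path) else "blocked")
```

With these rules the full-thread URL should stay crawlable while the single-post and archive URLs are blocked, which is the point of the showpost.php and /archive/ lines.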
IMO the main purpose of robots.txt is to keep spiders from crawling content that doesn't matter, leaving them more time for the content that does. It's about crawl efficiency more than anything, but keeping them out of showpost.php and /archive/ also reduces duplicate content: those URLs repeat posts that already appear on the full thread pages, so blocking them means essentially only whole threads get indexed, which strengthens the pages you do have in the index.
I don't think any of this would have much impact on the AdSense spider though. It's going to hit new pages and index them regardless of what other pages it's trying to look at, because it's called by the ad script as a user loads that page, I think.
Your mileage may vary.
-Josh
I'm thinking of copying seangworld's list in my own robots.txt file. Anyone here not think that's a good idea? I don't know much about this kind of thing so I'm looking for advice.
The one I found is different:
User-agent: *
Disallow: /forum/admincp/
Disallow: /forum/clientscript/
Disallow: /forum/cpstyles/
Disallow: /forum/customavatars/
Disallow: /forum/customprofilepics/
Disallow: /forum/images/
Disallow: /forum/modcp/
Disallow: /forum/ajax.php
Disallow: /forum/attachment.php
Disallow: /forum/calendar.php
Disallow: /forum/cron.php
Disallow: /forum/editpost.php
Disallow: /forum/global.php
Disallow: /forum/image.php
Disallow: /forum/inlinemod.php
Disallow: /forum/joinrequests.php
Disallow: /forum/login.php
Disallow: /forum/misc.php
Disallow: /forum/moderator.php
Disallow: /forum/newattachment.php
Disallow: /forum/newreply.php
Disallow: /forum/newthread.php
Disallow: /forum/online.php
Disallow: /forum/poll.php
Disallow: /forum/postings.php
Disallow: /forum/printthread.php
Disallow: /forum/private.php
Disallow: /forum/profile.php
Disallow: /forum/register.php
Disallow: /forum/report.php
Disallow: /forum/reputation.php
Disallow: /forum/search.php
Disallow: /forum/sendmessage.php
Disallow: /forum/subscription.php
Disallow: /forum/threadrate.php
Disallow: /forum/usercp.php
Disallow: /forum/usernote.php
Can we have a definitive robots.txt from someone high up here please? They all seem to be different.
robots.txt ONLY works in the root.
if your forum lives in a subdirectory like /forum, you should use the 2nd version, where every path starts with that directory (/forum/)
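To see why the prefix matters, here is a small sketch (assumptions: the forum sits under a hypothetical `/forum` directory, `prefix_rules` and `blocked` are helper names invented for this example, and `ExampleBot` is a made-up crawler name). Rules are matched against the start of the URL path, so a root-style rule like `/search.php` never matches `/forum/search.php`.

```python
from urllib.robotparser import RobotFileParser


def prefix_rules(rules: str, subdir: str) -> str:
    """Prepend `subdir` (e.g. "forum") to every non-empty Disallow/Allow path."""
    subdir = "/" + subdir.strip("/")
    out = []
    for line in rules.splitlines():
        field, sep, value = line.partition(":")
        is_rule = field.strip().lower() in ("disallow", "allow")
        if sep and is_rule and value.strip():
            out.append(f"{field.strip()}: {subdir}{value.strip()}")
        else:
            # User-agent lines, blank lines and comments pass through untouched.
            out.append(line)
    return "\n".join(out)


def blocked(rules: str, path: str) -> bool:
    """Return True if a generic crawler may NOT fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch("ExampleBot", path)


# Root-style rules, as in the first list posted in this thread (shortened).
ROOT_STYLE = "User-agent: *\nDisallow: /search.php\nDisallow: /admincp/"

if __name__ == "__main__":
    url = "/forum/search.php"
    print("root-style rules block it:", blocked(ROOT_STYLE, url))
    print("prefixed rules block it:  ", blocked(prefix_rules(ROOT_STYLE, "forum"), url))
```

Either way the file itself still has to sit at the site root (`/robots.txt`); only the paths inside it change.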
grr, that makes sense. i believe i used the first one.
correcting this now...
i took out 2 things from it tho: the poll.php and profile.php lines.