vBulletin SEO Forums


SE bots seem to ignore robots.txt exclusion of newreply.php


  #1  
Old 04-06-2008, 01:36 AM
711
Junior Member
Big Board Administrator
 
Real Name: 711
Join Date: Nov 2007
Posts: 9
SE bots seem to ignore robots.txt exclusion of newreply.php

I have installed the Track Guest Visits modification, which shows guest and SE bot activity on a forum.

While browsing the spider activity page, I noticed lots of requests to newreply.php, even though I have that page disallowed in my robots.txt.

Here is a screenshot of what I mean:

[Screenshot of the spider activity page; image not preserved]

Notice all the:

Called with DO = 'newreply'

Here is my current robots.txt content:

HTML Code:
User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/attachment.php
Disallow: /forums/avatar.php
Disallow: /forums/calendar.php
Disallow: /forums/cgi-bin/
Disallow: /forums/clientscript/
Disallow: /forums/cron.php
Disallow: /forums/editpost.php
Disallow: /forums/faq.php
Disallow: /forums/image.php
Disallow: /forums/images/
Disallow: /forums/includes/
Disallow: /forums/install/
Disallow: /forums/ispy.php
Disallow: /forums/joinrequests.php
Disallow: /forums/login.php
Disallow: /forums/member.php
Disallow: /forums/member2.php
Disallow: /forums/misc.php
Disallow: /forums/modcp/
Disallow: /forums/moderator.php
Disallow: /forums/newreply.php
Disallow: /forums/newthread.php
Disallow: /forums/online.php
Disallow: /forums/payments.php
Disallow: /forums/poll.php
Disallow: /forums/postings.php
Disallow: /forums/printthread.php
Disallow: /forums/private.php
Disallow: /forums/private2.php
Disallow: /forums/profile.php
Disallow: /forums/register.php
Disallow: /forums/reputation.php
Disallow: /forums/search.php
Disallow: /forums/sendmessage.php
Disallow: /forums/sendmessage.php?do=
Disallow: /forums/sendtofriend.php
Disallow: /forums/showgroups.php
Disallow: /forums/showpost.php
Disallow: /forums/sitemap/
Disallow: /forums/spy.php
Disallow: /forums/subscription.php
Disallow: /forums/tags/
Disallow: /forums/threadrate.php
Disallow: /forums/upload.php
Disallow: /forums/usercp.php
Disallow: /forums/weeklystats.php
Disallow: /forums/statistics.php
Disallow: /forums/stats.php
Disallow: /forums/infraction.php
Disallow: /forums/ajax.php
Disallow: /forums/arcade.php
 
User-agent: Slurp
Disallow: /gallery/
Crawl-delay: 90
Sitemap: http://www.entropiaforum.com/forums/sitemap_index.xml.gz
Any ideas or suggestions on how to prevent the spiders from visiting newreply.php (or any other unwanted pages) would be appreciated, thanks!
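
One thing worth checking against the file above: under the robots exclusion convention, a crawler that finds a group naming it specifically follows that group and not the User-agent: * group. As posted, the Slurp group only disallows /gallery/, so Yahoo's Slurp would not be bound by the newreply.php rule. The sketch below runs Python's standard urllib.robotparser over a trimmed copy of the rules to show the difference. The request URL and its query string are assumptions for illustration, and Python's parser only approximates how any given search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the robots.txt posted above (only the lines relevant here).
ROBOTS_TXT = """\
User-agent: *
Disallow: /forums/newreply.php
Disallow: /forums/newthread.php

User-agent: Slurp
Disallow: /gallery/
Crawl-delay: 90
"""

# Hypothetical request of the kind a spider might make; the query string is assumed.
URL = "http://www.entropiaforum.com/forums/newreply.php?do=newreply"


def is_allowed(user_agent: str, url: str = URL) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Googlebot has no group of its own, so it falls back to the * group.
    # Slurp matches its own group, which does not mention newreply.php.
    for agent in ("Googlebot", "Slurp"):
        print(agent, "allowed:", is_allowed(agent))
```

If the hits in the spider activity page turn out to come from Slurp, repeating the relevant Disallow lines inside the Slurp group would be the first thing to try. If they come from Googlebot, the rules as written should already cover newreply.php, so the cause would have to be something else.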

Last edited by 711; 04-06-2008 at 06:49 AM. Reason: Fixed image
  #2  
Old 04-06-2008, 05:47 PM
briansol
Senior Member
vBSEO Pre-Release Team, Design for SEO, Big Board Administrator
 
Real Name: Brian
Join Date: Apr 2006
Location: Central CT, USA
Posts: 4,116
the robots file is merely a suggestion to the SE. It won't 100% stop anything.
  #3  
Old 04-06-2008, 06:30 PM
711
Junior Member
Big Board Administrator
 
Real Name: 711
Join Date: Nov 2007
Posts: 9
Quote:
Originally Posted by briansol
the robots file is merely a suggestion to the SE. It won't 100% stop anything.
True, though I thought Googlebot tended to be more "well-behaved", and usually respected the robots.txt suggestions?
All times are GMT -4.