Another Robots.txt question...

  1. #1
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times

    Another Robots.txt question...

    When I ran my robots.txt file through an optimizer, it came back as follows:

    Line 4 User-agent: Slurp
    You specified both the generic user-agent "*" and specific user-agents for this block of code; this could be misinterpreted.
    Line 5 Crawl-delay: 60
    Line 6
    Line 7 # Disallow directory
    Line 8 User-agent: *
    You specified both the generic user-agent "*" and specific user-agents for this block of code; this could be misinterpreted.
    Does this mean I need to take either User-agent: Slurp or User-agent: * out and not have both?

    Does it really hurt to have both? I only ask because I saw it blessed by vBSEO staff here in this forum on more than one occasion.

    Thanks!

  2. #2
    Senior Member
    Real Name
    Ceri May
    Join Date
    Jul 2009
    Location
    United Kingdom
    Posts
    1,726
    Liked
    15 times
    Blog Entries
    1
    Hey Sonnie,

    What are the full contents of your robots.txt? The messages make little sense to me without something to reference them against.

    Ceri

  3. #3
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times
    Thanks Ceri... here is our full robots.txt file...

    # For domain: Home Theater Forum - Home Theater Systems - HomeTheaterShack
    # All robots will spider the domain

    User-agent: Slurp

    Crawl-delay: 60

    # Disallow directory

    User-agent: *

    Disallow: /attachment.php
    Disallow: /attachments/
    Disallow: /cgi-bin/
    Disallow: /deniedaccess.html
    Disallow: /display.html
    Disallow: /forums/admincp/
    Disallow: /forums/ajax.php
    Disallow: /forums/articlebot/
    Disallow: /forums/clientscript/
    Disallow: /forums/cpstyles/
    Disallow: /forums/customavatars/
    Disallow: /forums/customprofilepics/
    Disallow: /forums/calendar.php
    Disallow: /forums/cron.php
    Disallow: /forums/editpost.php
    Disallow: /forums/global.inc.php
    Disallow: /forums/includes/
    Disallow: /forums/inlinemod.php
    Disallow: /forums/install/
    Disallow: /forums/ipinfo.php
    Disallow: /forums/joinrequests.php
    Disallow: /forums/login.php
    Disallow: /forums/markers.xml
    Disallow: /forums/member.php
    Disallow: /forums/memberlist.php
    Disallow: /forums/misc.php
    Disallow: /forums/modcp/
    Disallow: /forums/moderator.php
    Disallow: /forums/moderatorapplication.php
    Disallow: /forums/modules/
    Disallow: /forums/newattachment.php
    Disallow: /forums/newreply.php
    Disallow: /forums/newthread.php
    Disallow: /forums/online.php
    Disallow: /forums/payment_gateway.php
    Disallow: /forums/payments.php
    Disallow: /forums/poll.php
    Disallow: /forums/postings.php
    Disallow: /forums/private.php
    Disallow: /forums/process.php
    Disallow: /forums/profile.php
    Disallow: /forums/register.php
    Disallow: /forums/report.php
    Disallow: /forums/reputation.php
    Disallow: /forums/search.php
    Disallow: /forums/sendmessage.php
    Disallow: /forums/showgroups.php
    Disallow: /forums/subscription.php
    Disallow: /forums/test.php
    Disallow: /forums/threadrate.php
    Disallow: /forums/usercp.php
    Disallow: /forums/usernote.php
    Disallow: /forums/vbgooglemapme.php
    Disallow: /forums/vbseocp.php
    Disallow: /ModeratorOptions.pdf
    Disallow: /newrsanalog.cal
    Disallow: /newrsdigital.cal
    Disallow: /oldrsanalog.cal
    Disallow: /oldindex.html
    Disallow: /openads/
    Disallow: /phpform/
    Disallow: /roomeq/oldindex.html
    Disallow: /ssm.js
    Disallow: /ssmItems.js
    Disallow: /subscriptions/
    Disallow: /test.html

  4. #4
    vBSEO Staff Brian Cummiskey's Avatar
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,782
    Liked
    648 times
    Blog Entries
    2
    There is nothing wrong with that file, except that you reference your modcp and admincp directories, which can be a security issue. But on the robots side of things, it's completely valid.
    Brian Cummiskey / Crawlability Inc.
    vBSEO 3.6.0 GOLD Released!
    Unveiling the NEW vBSEO Sitemap Generator 3.0. - available NOW for vBSEO Customers!


  5. #5
    Senior Member
    Real Name
    Dhillon
    Join Date
    Apr 2006
    Posts
    341
    Liked
    1 times
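    With your robots.txt, Slurp will be able to spider all the files and directories blocked for other robots, because a robot that finds a block naming it follows only that block and skips the 'User-agent: *' rules.
    Either copy all the 'User-agent: *' rules into the Slurp block as well, or just remove 'User-agent: Slurp' and keep the Crawl-delay (moved under 'User-agent: *' so it still belongs to a block).
    Note that the Crawl-delay directive is ignored by Googlebot.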
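
To see the effect described above, here is a minimal sketch using Python's standard `urllib.robotparser`. It is only an illustration: the two robots.txt strings are trimmed-down versions of the file in this thread (one Disallow line instead of the full list, and no blank line between `User-agent: Slurp` and `Crawl-delay`, since this parser treats a blank line as the end of a block), and `OtherBot` is a made-up crawler name. Real crawlers may parse the file differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser


def load(text):
    """Parse robots.txt content from a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Trimmed-down version of the file from this thread (assumption: one rule only).
ORIGINAL = """\
User-agent: Slurp
Crawl-delay: 60

User-agent: *
Disallow: /forums/search.php
"""

# Suggested fix: repeat the generic rules inside the Slurp block.
FIXED = """\
User-agent: Slurp
Crawl-delay: 60
Disallow: /forums/search.php

User-agent: *
Disallow: /forums/search.php
"""

PATH = "/forums/search.php"
original = load(ORIGINAL)
fixed = load(FIXED)

# "OtherBot" is a hypothetical crawler that has no block of its own.
for label, parser in (("original", original), ("fixed", fixed)):
    print(label, "Slurp allowed:", parser.can_fetch("Slurp", PATH))
    print(label, "OtherBot allowed:", parser.can_fetch("OtherBot", PATH))
    print(label, "Slurp crawl delay:", parser.crawl_delay("Slurp"))
```

With this parser, the first file lets Slurp fetch the path while OtherBot is blocked, and the second file blocks both while keeping the 60-second delay for Slurp.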

  6. #6
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times
    Quote Originally Posted by Brian Cummiskey View Post
    There is nothing wrong with that file, except that you reference your modcp and admincp directories, which can be a security issue. But on the robots side of things, it's completely valid.
    So I should remove admincp and modcp? Again... I copied it from here and it was previously given blessings from here, but it could have easily been overlooked.

    Quote Originally Posted by Notorious View Post
    With your robots.txt, Slurp will be able to spider all the files and directories blocked for other robots.
    Either copy all the 'User-agent: *' rules for Slurp as well, or just remove 'User-agent: Slurp' and keep the Crawl-delay.
    That directive is ignored by Googlebot.
    Thanks!

  7. #7
    vBSEO Staff Brian Cummiskey's Avatar
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,782
    Liked
    648 times
    Blog Entries
    2
    Yeah, you should never include private directories. You just told would-be hackers where your panel is.
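
As a rough illustration of why this matters: robots.txt is a public file, so anyone can fetch it and read the Disallow lines. The sketch below (plain Python, no libraries) picks out entries that look like control panels. The `exposed_paths` helper and the `SENSITIVE_HINTS` keywords are made up for this example and are not part of vBulletin or vBSEO.

```python
# Hypothetical keywords that hint at an admin or moderator panel.
SENSITIVE_HINTS = ("admincp", "modcp")


def exposed_paths(robots_txt, hints=SENSITIVE_HINTS):
    """Return Disallow paths that contain any of the hint substrings."""
    found = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            path = value.strip()
            if any(hint in path.lower() for hint in hints):
                found.append(path)
    return found


# A few lines taken from the file posted earlier in this thread.
SAMPLE = """\
User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/ajax.php
Disallow: /forums/modcp/
"""

print(exposed_paths(SAMPLE))
```

Anything this turns up is a path you may prefer to leave out of robots.txt and protect some other way.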


  8. #8
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times
    I noticed you had them in yours when I searched for a few examples.

  9. #9
    vBSEO Staff Brian Cummiskey's Avatar
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,782
    Liked
    648 times
    Blog Entries
    2
    Are you sure?
    Last edited by Brian Cummiskey; 02-03-2010 at 04:07 PM.


  10. #10
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times
    You are right... I was looking at one from briansol. Sorry about that.

  11. #11
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times
    What is ia_archiver?

    And have you ever heard of sojourner?

  12. #12
    vBSEO Staff Brian Cummiskey's Avatar
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,782
    Liked
    648 times
    Blog Entries
    2
    ia_archiver = archive.org bot.

    No idea what sojourner is.


  13. #13
    Senior Member Sonnie's Avatar
    Real Name
    Sonnie
    Join Date
    May 2006
    Location
    L.A. (Lower Alabama)
    Posts
    305
    Liked
    2 times
    What is the purpose of ia_archiver... why add that?

    sojourner constantly crawls my forum.

  14. #14
    vBSEO Staff Brian Cummiskey's Avatar
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,782
    Liked
    648 times
    Blog Entries
    2
    archive.org takes snapshots of your site. It's neat to look back on where you've been.
    CRAWLABILITY.com
    vBulletin Search Engine Optimization Forums

    They tend not to cache a lot of images, linked CSS, etc.

