How is my robots.txt?

  1. #1
    libertylounge (Senior Member)
    Real Name: Ken | Join Date: Aug 2006 | Posts: 439 | Liked: 0 times

    Code:
    $ cat robots.txt
    User-agent: *
    Disallow: /forums/ajax.php
    Disallow: /forums/attachment.php
    Disallow: /forums/calendar.php
    Disallow: /forums/cron.php
    Disallow: /forums/editpost.php
    Disallow: /forums/global.php
    Disallow: /forums/image.php
    Disallow: /forums/inlinemod.php
    Disallow: /forums/joinrequests.php
    Disallow: /forums/login.php
    Disallow: /forums/member.php
    Disallow: /forums/misc.php
    Disallow: /forums/moderator.php
    Disallow: /forums/newattachment.php
    Disallow: /forums/newreply.php
    Disallow: /forums/newthread.php
    Disallow: /forums/online.php
    Disallow: /forums/poll.php
    Disallow: /forums/postings.php
    Disallow: /forums/printthread.php
    Disallow: /forums/private.php
    Disallow: /forums/profile.php
    Disallow: /forums/register.php
    Disallow: /forums/report.php
    Disallow: /forums/reputation.php
    Disallow: /forums/search.php
    Disallow: /forums/sendmessage.php
    Disallow: /forums/showgroups.php
    Disallow: /forums/spiders.php
    Disallow: /forums/subscription.php
    Disallow: /forums/threadrate.php
    Disallow: /forums/usercp.php
    Disallow: /forums/usernote.php
    Disallow: /forums/admincp/
    Disallow: /forums/images/
    Disallow: /forums/modcp/
    I think that excludes most of the useless pages and focuses the spiders' crawling on pages with content. Does it look right?
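
One way to check what a rule set like the one above actually blocks is to feed it to a parser. The following is a minimal sketch using Python's standard `urllib.robotparser`. The trimmed `RULES` string and the `is_allowed` helper are illustrative assumptions, not part of the original post, and real search engine crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the posted rules (assumption: the full list behaves the same way).
RULES = """\
User-agent: *
Disallow: /forums/member.php
Disallow: /forums/search.php
Disallow: /forums/admincp/
"""

def is_allowed(path, rules=RULES, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)

if __name__ == "__main__":
    for path in (
        "/forums/showthread.php?t=1",   # content page, not listed
        "/forums/member.php?u=1",       # matches the member.php prefix
        "/forums/admincp/index.php",    # inside a disallowed directory
    ):
        print(path, is_allowed(path))
```

Disallow rules are prefix matches, so `Disallow: /forums/member.php` also covers `/forums/member.php?u=1`, while thread pages that are not listed stay crawlable.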

  2. #2
    briansol (Senior Member)
    Real Name: Brian | Join Date: Apr 2006 | Location: Central CT, USA | Posts: 6,981 | Liked: 8 times
    Looks very similar to mine:

    User-agent: *
    Disallow: /ajax.php
    Disallow: /attachment.php
    Disallow: /calendar.php
    Disallow: /cron.php
    Disallow: /editpost.php
    Disallow: /global.php
    Disallow: /image.php
    Disallow: /inlinemod.php
    Disallow: /joinrequests.php
    Disallow: /login.php
    Disallow: /member.php
    Disallow: /misc.php
    Disallow: /moderator.php
    Disallow: /newattachment.php
    Disallow: /newreply.php
    Disallow: /newthread.php
    Disallow: /online.php
    Disallow: /poll.php
    Disallow: /postings.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /profile.php
    Disallow: /register.php
    Disallow: /report.php
    Disallow: /reputation.php
    Disallow: /search.php
    Disallow: /sendmessage.php
    Disallow: /showpost.php
    Disallow: /showgroups.php
    Disallow: /spiders.php
    Disallow: /subscription.php
    Disallow: /threadrate.php
    Disallow: /usercp.php
    Disallow: /usernote.php
    Disallow: /admincp/
    Disallow: /cgi-bin/
    Disallow: /includes/
    Disallow: /install/
    Disallow: /ioncube/
    Disallow: /mint/
    Disallow: /modcp/

  3. #3
    majordude (Senior Member)
    Real Name: majordude | Join Date: Aug 2006 | Posts: 182 | Liked: 0 times
    Why do you guys exclude all those things?

    All you really want to do is stop the robots from listing your "private" directories (admincp, install, images, etc.).

    If you have the Google Toolbar turned on when you access these areas, they will eventually get spidered.

    Look at all the site.com/wp-admin stuff (WordPress log-in directory) listed in Google.

    Of course, if you really do have a hidden directory, once you put it into robots.txt, anyone can see what you are trying to hide.
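
To illustrate the point that robots.txt is publicly readable, here is a minimal Python sketch that lists every path a robots.txt file asks crawlers to avoid. The `disallowed_paths` helper and the `SAMPLE` text are hypothetical names made up for this example, and the parsing is deliberately simplified.

```python
def disallowed_paths(robots_txt):
    """Collect the value of every non-empty Disallow line in robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments and whitespace
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths

# Hypothetical sample; any visitor can fetch the real file from the site root.
SAMPLE = """\
User-agent: *
Disallow: /admincp/
Disallow: /install/
Disallow: /modcp/
"""

if __name__ == "__main__":
    print(disallowed_paths(SAMPLE))  # ['/admincp/', '/install/', '/modcp/']
```

Anything that must stay private therefore needs real access control rather than a robots.txt entry.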

  4. #4
    Ace Shattock (vBSEO Staff)
    Real Name: Ace Shattock | Join Date: Jul 2005 | Location: Auckland, New Zealand | Posts: 3,998 | Liked: 11 times
    That's true, and some (bad) spiders deliberately do the opposite of what robots.txt says.

    In those cases, it would probably be more effective to block their IPs in .htaccess.
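
As a rough sketch of the .htaccess approach, the Python helper below renders deny rules from a list of addresses. It assumes Apache 2.2-style `Order`/`Allow`/`Deny` directives (on Apache 2.4 these need `mod_access_compat`, or the newer `Require not ip` form instead). The `htaccess_deny_rules` name and the documentation-range addresses are made up for illustration.

```python
import ipaddress

def htaccess_deny_rules(addresses):
    """Render Apache 2.2-style access rules that block the given IPs or CIDR ranges."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for addr in addresses:
        # ip_network validates the input and raises ValueError on garbage.
        network = ipaddress.ip_network(addr, strict=False)
        if network.num_addresses == 1:
            lines.append(f"Deny from {network.network_address}")
        else:
            lines.append(f"Deny from {network}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Placeholder addresses from the reserved documentation ranges.
    print(htaccess_deny_rules(["192.0.2.10", "198.51.100.0/24"]))
```

With `Order Allow,Deny`, the `Deny` lines override the blanket `Allow from all`, so listed addresses are refused by the server regardless of what robots.txt says.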

  5. #5
    Senior Member
    Real Name: Keith Cohen | Join Date: Jul 2005 | Location: Raleigh, NC USA | Posts: 6,147 | Liked: 12 times
    All my private directories are password protected anyway, so no chance of them getting spidered.

  6. #6
    majordude (Senior Member)
    Real Name: majordude | Join Date: Aug 2006 | Posts: 182 | Liked: 0 times
    Good point, Keith.

  7. #7
    libertylounge (Senior Member)
    Real Name: Ken | Join Date: Aug 2006 | Posts: 439 | Liked: 0 times
    Well, because most of those links either have duplicate content or no content that I want indexed.

    It also saves bandwidth (and thus CPU and memory), so the site won't be sluggish.

    The most important pages for indexing are showpost.php, showthread.php, forumdisplay.php, and index.php. Member profiles (for me, anyway) are a secondary concern.
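
A quick way to confirm that the pages named above stay crawlable is to audit them against a rule set. This is a minimal sketch with Python's standard `urllib.robotparser`. The `blocked_key_pages` helper is a hypothetical name, and `RULES` is a short excerpt in the style of the list in post #2, which includes `Disallow: /showpost.php`.

```python
from urllib.robotparser import RobotFileParser

# Pages named in this post as the ones that matter for indexing.
KEY_PAGES = ["/showpost.php", "/showthread.php", "/forumdisplay.php", "/index.php"]

def blocked_key_pages(rules, pages=KEY_PAGES, agent="*"):
    """Return the pages from `pages` that the robots.txt text forbids for `agent`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [page for page in pages if not parser.can_fetch(agent, page)]

# Excerpt only (assumption: forum installed at the site root, as in post #2).
RULES = """\
User-agent: *
Disallow: /search.php
Disallow: /showpost.php
Disallow: /showgroups.php
"""

if __name__ == "__main__":
    print(blocked_key_pages(RULES))  # ['/showpost.php']
```

An empty result would mean none of the key pages are excluded. Here the excerpt blocks showpost.php, which matters only if single-post pages are meant to be indexed.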
