
Tutorial how to make a good robots.txt file

Forum: General Discussion, part of the vBSEO SEO Plugin category.

  1. #1
    Member
    Real Name
    Bram
    Join Date
    Apr 2008
    Posts
    73
    Liked
    2 times

    Tutorial how to make a good robots.txt file

    I have done some digging on this forum to find a proper robots.txt file to use for my forum (root installation at www.mysite.com). Although some threads describe parts of the file, I can't seem to find a nice copy/paste setup.

    Can anybody help me out?

  2. #2
    vBSEO Staff
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,789
    Liked
    657 times
    Blog Entries
    2
    There's no template available, as everybody's site, sections, and rewrite rules are different and thus need a different setup.

    Generally, you want to block all public pages that you don't want bots to visit.
    Note: NEVER list hidden or private directories, such as admincp, modcp, cpanel, etc, as this is a level 1 PCI violation and also tells would-be hackers where your areas are located. Sometimes, that's half the battle.

    Pages like search.php, usercp.php, login.php, etc. should be blocked, as you don't want to waste the bot's time on pages it will never index. If you've rewritten those, you should block both the PHP version and the vBSEO-rewritten version.
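
    As a rough illustration of blocking both forms, the sketch below emits one Disallow line per path. The rewritten paths (/search/, /usercp/, /login/) and the helper name disallow_lines are made up for the example; substitute whatever your own rewrite rules actually produce.

```python
def disallow_lines(paths):
    """Return one 'Disallow:' line per path, forcing a leading slash."""
    lines = []
    for path in paths:
        if not path.startswith("/"):
            path = "/" + path
        lines.append(f"Disallow: {path}")
    return lines


# Original PHP scripts named in the post.
php_pages = ["search.php", "usercp.php", "login.php"]
# Hypothetical rewritten equivalents; yours depend on your rewrite rules.
rewritten_pages = ["/search/", "/usercp/", "/login/"]

rules = ["User-agent: *"] + disallow_lines(php_pages + rewritten_pages)
print("\n".join(rules))
```

    The output can be pasted into the "User-agent: *" section of a robots.txt file.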

    I would suggest starting with this basic template:

    This assumes your site is in root (robots.txt ONLY works in root, so the paths in your file must be written relative to root):

    Code:
    # Allow Archive.org to save snapshots of everything
    User-agent: ia_archiver
    Allow: /
    
    # Tame yahoo... it tends to eat a ton of resources without a delay
    User-agent: Slurp
    Crawl-delay: 60
    
    
    # List individual pages and files here that all bots should ignore, as well as groups of extensions.
    # Paths must start with a slash. If you rewrite everything to .html, you can disallow /*.php, but note that any custom pages without a custom rewrite rule (CRR) will be blocked.
    
    User-agent: *
    Disallow: /*.js
    Disallow: /search.php
    Disallow: /includes/
    Disallow: /install/
    Disallow: /customavatars/
    
    
    #Finally, list the path to your sitemap:
    Sitemap: http://yourdomain.com/sitemap_index.xml.gz
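
    One way to sanity-check a file like this before uploading it is Python's standard urllib.robotparser module. The sketch below is only an approximation: that parser does plain prefix matching and does not understand wildcard patterns, so the .js rule is left out, and real crawlers may interpret the file differently. The bot names and test URLs are just examples.

```python
from urllib.robotparser import RobotFileParser

# The template from the post, minus the .js wildcard rule, which this
# parser would treat as a literal path rather than a pattern.
ROBOTS_TXT = """\
User-agent: ia_archiver
Allow: /

User-agent: Slurp
Crawl-delay: 60

User-agent: *
Disallow: /search.php
Disallow: /includes/
Disallow: /install/
Disallow: /customavatars/

Sitemap: http://yourdomain.com/sitemap_index.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://yourdomain.com"  # placeholder domain from the template

# A bot with no group of its own falls back to the "*" group.
generic_search = parser.can_fetch("Googlebot", base + "/search.php")
generic_index = parser.can_fetch("Googlebot", base + "/index.php")

# A bot with its own group is checked against that group only.
slurp_search = parser.can_fetch("Slurp", base + "/search.php")
slurp_delay = parser.crawl_delay("Slurp")

sitemaps = parser.site_maps()  # requires Python 3.8 or later

print("generic bot may fetch /search.php:", generic_search)
print("generic bot may fetch /index.php:", generic_index)
print("Slurp may fetch /search.php:", slurp_search)
print("Slurp crawl delay:", slurp_delay)
print("sitemaps:", sitemaps)
```

    Under this parser, a bot that matches its own User-agent group (Slurp here) is not checked against the "*" group at all. If you want such a bot to skip search.php as well, repeating the Disallow lines inside its group is the safe choice.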

  3. #3
    Member
    Real Name
    Bram
    Join Date
    Apr 2008
    Posts
    73
    Liked
    2 times
    Thanks again for the super fast service!!

  4. #4
    Member
    Real Name
    .
    Join Date
    Jan 2010
    Posts
    30
    Liked
    0 times
    Thanks Brian great job.

  5. #5
    Junior Member
    Join Date
    Nov 2005
    Posts
    23
    Liked
    0 times
    Thank you Brian

  6. #6
    Member
    Real Name
    Kyle
    Join Date
    Dec 2006
    Posts
    64
    Liked
    0 times
    What if your sitemap isn't called sitemap_index.xml.gz?

    I realise this is possibly just an example you gave, but ours appears as follows:

    [Attachment: sitemap_root..png]

    What should the link to our sitemap be in the robots.txt file? Should it be:

    http://anywebsite.org/sitemap/vbulle...p_index.xml.gz

  7. #7
    vBSEO Staff
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,789
    Liked
    657 times
    Blog Entries
    2
    If you use vBSEO, you SHOULDN'T be using vB's built-in sitemap generator. It will NOT pick up vBSEO URLs. With vB4, you must install Sitemap Generator 2.6 beta.

  8. #8
    Member
    Real Name
    Kyle
    Join Date
    Dec 2006
    Posts
    64
    Liked
    0 times
    I do use the vBSEO sitemap generator, hence the issues we had yesterday with submitting the sitemap and it crashing. We have a few server issues at the moment and are getting 500 errors, but no one at vB has replied.

  9. #9
    vBSEO Staff
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,789
    Liked
    657 times
    Blog Entries
    2
    All of those files are from vb's built in sitemap generator. You should turn it off and never use it, and delete those from your webmasters area. All that will do is hurt you.

  10. #10
    Member
    Real Name
    Kyle
    Join Date
    Dec 2006
    Posts
    64
    Liked
    0 times
    OK, the vB one was ON. I have turned that sitemap generator off.

    I have changed the link in the robots.txt file accordingly.
    Note: in vbseo it asks for forum home page, which defaults to /forum

    However, this isn't the forum home page; it defaults to content if left at /forum.
    And if we set the forum home page in vBSEO to /forum/forum.php, the site crashes.

    Also, should we or should we not have an htaccess file in the domain root as well as one in the forum root?

    There seem to be conflicting posts everywhere regarding this.

    Ste

  11. #11
    Member
    Real Name
    Kyle
    Join Date
    Dec 2006
    Posts
    64
    Liked
    0 times
    Quote: Originally Posted by Brian Cummiskey
    All of those files are from vb's built in sitemap generator. You should turn it off and never use it, and delete those from your webmasters area. All that will do is hurt you.
    By webmasters area, do you mean on the server?

  12. #12
    Senior Member
    Real Name
    Michael Biddle
    Join Date
    Jan 2007
    Location
    Southern California
    Posts
    7,097
    Liked
    5 times
    Webmasters Area meaning your Google Webmaster Tools.
    The Forum Hosting - Forum Hosting from the Forum Experts

  13. #13
    vBSEO Staff
    Real Name
    Brian Cummiskey
    Join Date
    Jul 2009
    Location
    btwn NYC and Boston
    Posts
    12,789
    Liked
    657 times
    Blog Entries
    2
    Quote: Originally Posted by EliteNeo.com
    Note: in vbseo it asks for forum home page, which defaults to /forum
    I'm not sure what you're asking here... where are you getting 'asked' this?


    If your vB install is in /forums, your htaccess file should be in /forums as well.
    The only time you would want to move it is if you moved your CMS to root, as outlined here: vBSEO 3.5 Release Candidate 1 is here!

  14. #14
    Junior Member
    Real Name
    Martin
    Join Date
    May 2010
    Posts
    12
    Liked
    0 times
    Quote: Originally Posted by Brian Cummiskey
    NEVER list hidden or private directories, such as admincp, modcp, cpanel, etc, as this is a level 1 PCI violation and also tells would-be hackers where your areas are located. Sometimes, that's half the battle.
    What is a "level 1 PCI violation" and why is it important to us?

    Per your recommendations above, then, is a robots.txt like this a bad idea?

    Code:
    User-agent: *
    Disallow: /forum/admincp/
    Disallow: /forum/clientscript/
    Disallow: /forum/cpstyles/
    Disallow: /forum/customavatars/
    Disallow: /forum/customprofilepics/
    Disallow: /forum/images/
    Disallow: /forum/modcp/
    Disallow: /forum/ajax.php
    Disallow: /forum/attachment.php
    Disallow: /forum/calendar.php
    Disallow: /forum/cron.php
    Disallow: /forum/editpost.php
    Disallow: /forum/global.php
    Disallow: /forum/image.php
    Disallow: /forum/inlinemod.php
    Disallow: /forum/joinrequests.php
    Disallow: /forum/login.php
    Disallow: /forum/member.php
    Disallow: /forum/memberlist.php
    Disallow: /forum/misc.php
    Disallow: /forum/moderator.php
    Disallow: /forum/newattachment.php
    Disallow: /forum/newreply.php
    Disallow: /forum/newthread.php
    Disallow: /forum/online.php
    Disallow: /forum/poll.php
    Disallow: /forum/postings.php
    Disallow: /forum/printthread.php
    Disallow: /forum/private.php
    Disallow: /forum/profile.php
    Disallow: /forum/register.php
    Disallow: /forum/report.php
    Disallow: /forum/reputation.php
    Disallow: /forum/search.php
    Disallow: /forum/sendmessage.php
    Disallow: /forum/showgroups.php
    Disallow: /forum/subscription.php
    Disallow: /forum/threadrate.php
    Disallow: /forum/usercp.php
    Disallow: /forum/usernote.php
    Disallow: /forms/
    Disallow: /images/
    Disallow: /legal/
    Disallow: /css/
    Disallow: /common/
    What portions of the above file would you retain?
    What would you discard and why?

    Thanks,

    -Martin

  15. #15
    Senior Member
    Real Name
    Michael Biddle
    Join Date
    Jan 2007
    Location
    Southern California
    Posts
    7,097
    Liked
    5 times
    You should remove admincp and modcp.
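
    For a long list like the one above, the cleanup can be scripted. The sketch below drops any Disallow line whose path has a segment equal to a control-panel directory name. The function name strip_sensitive and the SENSITIVE tuple are assumptions made for the example, and only a few lines of the posted file are reproduced.

```python
# Directory names that should not be advertised in robots.txt
# (taken from the examples given earlier in the thread).
SENSITIVE = ("admincp", "modcp", "cpanel")


def strip_sensitive(robots_txt, names=SENSITIVE):
    """Return robots_txt without Disallow lines that name a sensitive directory."""
    kept = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        is_disallow = field.strip().lower() == "disallow"
        path_parts = value.strip().strip("/").split("/")
        if is_disallow and any(name in path_parts for name in names):
            continue  # skip the line that would reveal a private area
        kept.append(line)
    return "\n".join(kept)


sample = """\
User-agent: *
Disallow: /forum/admincp/
Disallow: /forum/clientscript/
Disallow: /forum/modcp/
Disallow: /forum/search.php
"""

cleaned = strip_sensitive(sample)
print(cleaned)
```

    Removing the lines only stops the file from advertising those directories; the directories themselves still need to be protected on the server.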
