Results 151 to 165 of 215

Briansol's Ultimate Guide to vBSEO

  1. #151
    JWL
    Senior Member JWL's Avatar
    Real Name
    John
    Join Date
    Sep 2008
    Location
    North Bay Area
    Posts
    104
    Liked
    0 times
    Quote Originally Posted by briansol View Post
    robots.txt ONLY works in site.com/robots.txt

    so, wherever that happens to be in your local file system is where you need to put it, usually

    /home/sitename/public_html/robots.txt

    it is not a global file; it's per DOMAIN.
    True, it's a one-shot file. What you're saying is, for example, as I showed:
    /html_public/ (aka root) robots.txt
    /html_public/domain.com1/root
    /html_public/domain.com2/root
    /html_public/domain.com3/root

    A robots file in /html_public/ (aka root) will not work or be usable for
    /html_public/domain.com1/root
    /html_public/domain.com2/root
    /html_public/domain.com3/root

    Instead they will be like this:
    /html_public/robots.txt
    /html_public/domain.com1/robots.txt
    /html_public/domain.com2/robots.txt
    /html_public/domain.com3/robots.txt
    Each of those .com domains will have its own robots file, pertinent to that domain's files and directories/folders.

    Is that correct?
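
A minimal Python sketch of the per-domain rule being discussed here: a crawler works out the robots.txt address from the scheme and host of the URL it is about to fetch, so every domain needs its own file at its own web root. The function name `robots_url` and the example hosts are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location that applies to page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical hosts: two domains served from one hosting account.
print(robots_url("http://domain1.example/forums/showthread.php?t=1"))
print(robots_url("http://domain2.example/index.html"))
```

Two domains give two different robots.txt addresses, which is why a single file in a shared parent folder cannot cover both.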

  2. #152
    Member
    Real Name
    Nathan
    Join Date
    Mar 2009
    Posts
    73
    Liked
    0 times
    Could you please explain what each main part of your robots.txt file does and why it may help with SEO? Thank you so much for doing that, Brian.

    If I put in my robots.txt file:
    http://www.yoursite.com/forums/sitemap_index.xml.gz

    Will the robots crawl my site's homepage and main site pages besides what is in the forum sitemap?

    Do you also suggest I add these?
    Code:
    User-agent: *
    Disallow: /attachment.php
    Disallow: /avatar.php
    Disallow: /editpost.php
    Disallow: /misc.php
    Disallow: /moderator.php
    Disallow: /newreply.php
    Disallow: /newthread.php
    Disallow: /online.php
    Disallow: /poll.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /report.php
    Disallow: /sendtofriend.php
    Disallow: /threadrate.php
    Disallow: /usercp.php
    Disallow: /admin/
    Disallow: /modcp/
    Disallow: /sendmessage.php
    Disallow: /register.php
    Disallow: /subscription.php
    Last edited by breakpoint; 04-09-2009 at 05:24 PM.

  3. #153
    Member MaRµ's Avatar
    Real Name
    Miss Mariel xD
    Join Date
    Aug 2008
    Posts
    66
    Liked
    0 times
    Brian... a question... my robots.txt is:

    User-Agent: *
    Allow: /.
    Allow: /
    User-agent: *
    Disallow: /showthread.php?do=post_thanks*
    And yours mixed with mine:

    # Allow Archiver
    User-agent: ia_archiver
    Allow: /

    User-agent: Slurp
    Crawl-delay: 60

    User-agent: *
    Allow: /.
    Allow: /
    Disallow: *.php
    Disallow: *.js
    Disallow: *.jsp
    Disallow: *.cfm
    Disallow: *.asp
    Disallow: *.html
    Disallow: *.htm
    Disallow: *.aspx
    Disallow: *.cgi
    Disallow: /includes/
    Disallow: /install/
    Disallow: /customavatars/
    Disallow: /archive/
    Disallow: /sitemap/
    Disallow: /showthread.php?do=post_thanks*
    Sitemap: http://www.altoforo.com/sitemap_index.xml.gz
    Do you think it is all right? What do you recommend? I don't understand "Crawl-delay: 60"... I want fast and good crawling.

  4. #154
    Senior Member briansol's Avatar
    Real Name
    Brian
    Join Date
    Apr 2006
    Location
    Central CT, USA
    Posts
    6,981
    Liked
    8 times
    yahoo (slurp) tends to send WAY too many bots at once, 200+ sometimes. if you're under heavy load already, it can bring down your server. the delay makes the bot wait 60 seconds before going to the next page.

    since you aren't using my xml file, you shouldn't use my robots file either.

    ie, your thread urls end in .html, e.g. your "Hola a todos!" ("Hello everyone!") thread.

    you are blocking search engines by using this robots file. you should remove at least the *.html rule immediately.

    my settings file uses NO extensions, thus i block ALL extensions.
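
To put a number on the Crawl-delay question, here is a small Python sketch using the standard library's `urllib.robotparser`. It reads the delay for Slurp and works out the most pages one crawler could fetch per day if it honours that delay. The arithmetic assumes a single crawler instance that waits the full delay between requests; real crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 60
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("Slurp")          # seconds between requests
max_fetches_per_day = 24 * 60 * 60 // delay  # upper bound for one polite crawler

print(delay, max_fetches_per_day)
```

With a 60 second delay the ceiling is 1440 fetches a day, so a large forum that wants faster crawling and has the server capacity might pick a smaller value.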

  5. #155
    Member MaRµ's Avatar
    Real Name
    Miss Mariel xD
    Join Date
    Aug 2008
    Posts
    66
    Liked
    0 times
    Thank you so much, Brian! I understand you very well.

    Well... I am afraid of changing the XML because I don't want to do something wrong, and I'm such a newbie... site:www.altoforo.com - Buscar con Google (Search with Google) = take a look...

    I just can't keep on reading all the tutorials and making changes, and I don't know where I should stop.

  6. #156
    Senior Member briansol's Avatar
    Real Name
    Brian
    Join Date
    Apr 2006
    Location
    Central CT, USA
    Posts
    6,981
    Liked
    8 times
    Quote Originally Posted by breakpoint View Post
    Could you please explain what each main part of your robots.txt file does and why it may help with SEO? Thank you so much for doing that, Brian.
    My robots file is designed to go with my settings XML file. If you aren't using my settings file, you shouldn't be using my robots file either, as it will block things that should not be blocked.

    In my settings file, i rewrite EVERYTHING i want indexed with no extensions: /page/, not /page.html.

    Thus, if i don't rewrite it, i want to block it. So, i disallow all extensions (php, html) and, to confuse would-be hackers, i also list other languages that i don't even use (jsp, asp, etc).

    If I put in my robots.txt file:
    http://www.yoursite.com/forums/sitemap_index.xml.gz


    Will the robots crawl my site's homepage and main site pages besides what is in the forum sitemap?
    robots.txt ONLY works in root (site.com/robots.txt). All paths should reflect that.

    if you have a root sitemap, you can include that as well at the bottom. If you don't, then you shouldn't put anything there. Listing a sitemap won't block bots from accessing your homepage.

    Do you also suggest I add these
    Code:
    User-agent: *
    Disallow: /attachment.php
    Disallow: /avatar.php
    Disallow: /editpost.php
    Disallow: /misc.php
    Disallow: /moderator.php
    Disallow: /newreply.php
    Disallow: /newthread.php
    Disallow: /online.php
    Disallow: /poll.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /report.php
    Disallow: /sendtofriend.php
    Disallow: /threadrate.php
    Disallow: /usercp.php
    Disallow: /admin/
    Disallow: /modcp/
    Disallow: /sendmessage.php
    Disallow: /register.php
    Disallow: /subscription.php
    NEVER EVER EVER include private directories in a robots file (modcp, admincp). You just told would-be hackers where your secure areas are.

    *.php already covers all the rest of those urls, so they are redundant.
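
The "block every extension" idea can be sketched in a few lines of Python. The matcher below is an illustration only. It assumes the semantics that wildcard-aware crawlers commonly use (`*` matches any run of characters, a trailing `$` anchors the end, anything else is a prefix match), and it applies a pattern with no leading slash literally, which the thread does not confirm any particular crawler does. The function name `rule_matches` and the sample paths are hypothetical.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path (assumed semantics)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# Extensionless rewritten URLs pass; anything with an extension is caught.
print(rule_matches("*.html", "/forum/hola-a-todos/"))      # rewritten URL
print(rule_matches("*.html", "/forum/hola-a-todos.html"))  # URL with extension
print(rule_matches("*.php", "/newreply.php"))              # covered by *.php
```

The last call is the redundancy point: once `*.php` is disallowed, a separate line for `/newreply.php` adds nothing. The second call shows why the same file blocks a forum whose thread URLs end in .html.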

  7. #157
    Senior Member briansol's Avatar
    Real Name
    Brian
    Join Date
    Apr 2006
    Location
    Central CT, USA
    Posts
    6,981
    Liked
    8 times
    Quote Originally Posted by MaRµ View Post
    Thank you so much, Brian! I understand you very well.

    Well... I am afraid of changing the XML because I don't want to do something wrong, and I'm such a newbie... site:www.altoforo.com - Buscar con Google (Search with Google) = take a look...

    I just can't keep on reading all the tutorials and making changes, and I don't know where I should stop.
    rightfully so.

    since you are using extensions in your rewrites, you should just disallow your print pages and non-important pages.

    There's no need to allow anything, as allow is default.


    Code:
    User-agent: *
    Disallow: /*-print/
    Disallow: /login.php
    Disallow: /member.php
    Disallow: /memberlist.php
    Disallow: /newthread.php
    Disallow: /newreply.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /profile.php
    Disallow: /register.php
    Disallow: /search.php
    Disallow: /sendmessage.php
    should be enough, along with the crawl delay for yahoo, and allowing ia_archiver (archive.org) to hit everything... even the blocked pages, so there's a free cache of your site in the future.
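
A rough way to check how those three pieces combine is to feed a trimmed-down file to Python's `urllib.robotparser`. The file below is a shortened stand-in for the rules above, not a recommended configuration: the wildcard line is left out because the standard-library parser does not implement wildcards, and `SomeBot` is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: ia_archiver
Allow: /

User-agent: Slurp
Crawl-delay: 60

User-agent: *
Disallow: /login.php
Disallow: /search.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# ia_archiver has its own group, so it may fetch even the blocked pages.
print(parser.can_fetch("ia_archiver", "/login.php"))
# Any other crawler falls back to the '*' group.
print(parser.can_fetch("SomeBot", "/login.php"))
print(parser.can_fetch("SomeBot", "/some-thread.html"))
print(parser.crawl_delay("Slurp"))
```

One thing this parser makes visible: a crawler that matches its own group is not subject to the `*` group, so `parser.can_fetch("Slurp", "/login.php")` also returns True here. If Slurp should skip those pages too, the Disallow lines would need to be repeated in its group.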

  8. #158
    Member MaRµ's Avatar
    Real Name
    Miss Mariel xD
    Join Date
    Aug 2008
    Posts
    66
    Liked
    0 times
    All right then! I'm making this change right now.

    www.altoforo.com/robots.txt


    Thank you!

  9. #159
    Member
    Real Name
    Nathan
    Join Date
    Mar 2009
    Posts
    73
    Liked
    0 times
    If I have my forum sitemap listed in my robots.txt, will the robots still crawl my homepage or will they go directly to what's listed in the sitemap?

  10. #160
    Senior Member Shadab's Avatar
    Real Name
    Shadab
    Join Date
    Oct 2007
    Location
    Bhopal
    Posts
    821
    Liked
    0 times
    Blog Entries
    12
    Quote Originally Posted by breakpoint View Post
    If I have my forum sitemap listed in my robots.txt, will the robots still crawl my homepage or will they go directly to what's listed in the sitemap?
    Yep they will crawl it.

    Not including a page in a sitemap won't stop the search engines from crawling it.
    (unless you explicitly Disallow a page in robots.txt)
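
The same point as a small Python sketch with `urllib.robotparser`: a Sitemap line only advertises URLs, while Disallow lines are what restrict crawling. The host is the placeholder from the question, `SomeBot` is a made-up crawler name, and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /search.php

Sitemap: http://www.yoursite.com/forums/sitemap_index.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The homepage is not in the forum sitemap, but nothing disallows it.
homepage_allowed = parser.can_fetch("SomeBot", "http://www.yoursite.com/")
# Only an explicit Disallow blocks a page.
search_allowed = parser.can_fetch("SomeBot", "http://www.yoursite.com/search.php")

print(homepage_allowed, search_allowed)
print(parser.site_maps())
```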

  11. #161
    Junior Member
    Real Name
    Peter Foti
    Join Date
    Dec 2008
    Posts
    24
    Liked
    0 times
    OK guys, I'm just a bit confused here. I have my sitemap settings set to only forum display and show thread. Now, my robots.txt at the moment is this:

    User-Agent: *
    Allow: /

    Disallow: /sitemap/
    Disallow: /archive/

    Sitemap: randomchatter.org/sitemap_index.xml.gz

    I use the extensions, so I can't use Brian's robots.txt file, and I'm basically looking
    for someone to send me via PM or post the exact robots file that I could use.

    I see bits and pieces here and there but I get confused and whatnot.

    Also, most of my Google search results are tag pages, because I have the auto tag feature enabled.

    Should I disable that, or should I block the tags from being indexed?

    Thanks for the help.

  12. #162
    JWL
    Senior Member JWL's Avatar
    Real Name
    John
    Join Date
    Sep 2008
    Location
    North Bay Area
    Posts
    104
    Liked
    0 times
    The robots file is specific to that site. I did the same as you and got confused. Simply put, what you want to do is deny all agents access to specific areas of your folder hierarchy. I also have my sitemap there; I just did not add it.

    I have used this as my robots file:

    PHP Code:
    User-agent: *
    Disallow: /*-print/
    Disallow: /login.php
    Disallow: /member.php
    Disallow: /memberlist.php
    Disallow: /newthread.php
    Disallow: /newreply.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /profile.php
    Disallow: /register.php
    Disallow: /search.php
    Disallow: /sendmessage.php
    # ########################
    # added 03/15/2009
    # ########################
    Disallow: /CHAT/getxml.php
    Disallow: /cron.php
    Disallow: /clientscript/
    Disallow: *.js
    Disallow: *.jsp
    Disallow: *.cfm
    Disallow: *.asp
    Disallow: *.aspx
    Disallow: *.cgi
    Disallow: /includes/
    Disallow: /install/
    Disallow: /customavatars/

    User-agent: msnbot
    Crawl-Delay: 10

    User-agent: Slurp
    Crawl-Delay: 10 
    Last edited by JWL; 04-13-2009 at 07:08 PM.
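
Hand-edited robots files are easy to break with a dropped colon, for example `Disallow/*-print/` in place of `Disallow: /*-print/`, and such a line is not a valid `field: value` record. A small Python sketch that flags such lines before the file is uploaded. The function name `malformed_lines` is made up, and the check is deliberately simple.

```python
def malformed_lines(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for lines that are not 'field: value'."""
    problems = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank and comment-only lines are fine
        field, sep, _value = line.partition(":")
        if not sep or not field.strip():
            problems.append((number, raw))
    return problems


sample = "User-agent: *\nDisallow/*-print/\nDisallow: /login.php\n"
print(malformed_lines(sample))
```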

  13. #163
    Senior Member briansol's Avatar
    Real Name
    Brian
    Join Date
    Apr 2006
    Location
    Central CT, USA
    Posts
    6,981
    Liked
    8 times
    again, NEVER include your admin areas. you should remove

    Disallow: /vbseocp.php
    Disallow: /admincp/
    Disallow: /modcp/

  14. #164
    JWL
    Senior Member JWL's Avatar
    Real Name
    John
    Join Date
    Sep 2008
    Location
    North Bay Area
    Posts
    104
    Liked
    0 times
    Quote Originally Posted by briansol View Post
    again, NEVER include your admin areas. you should remove

    Disallow: /vbseocp.php
    Disallow: /admincp/
    Disallow: /modcp/
    simple question

    why?

  15. #165
    Senior Member briansol's Avatar
    Real Name
    Brian
    Join Date
    Apr 2006
    Location
    Central CT, USA
    Posts
    6,981
    Liked
    8 times
    because you just told hackers where they are.
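
To make the reasoning concrete: robots.txt is a public file that anyone can request, so every Disallow line is readable by visitors as well as crawlers. A minimal Python sketch, with a made-up function name `disallowed_paths`, that pulls the listed paths out of a file's text the way a curious visitor could.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """List every path named in a Disallow line."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


sample = """\
User-agent: *
Disallow: /vbseocp.php
Disallow: /admincp/
Disallow: /modcp/
"""
print(disallowed_paths(sample))
```

The output is a ready-made list of the control panel locations, which is the exposure being warned about.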
