
Follow-up critique please


  1. #16
    Senior Member
    Real Name
    Garry
    Join Date
    Jul 2009
    Posts
    156
    Liked
    0 times
    Thanks Martyn. Yes, I include the old URLs in the sitemap, but Googlebot is currently getting a 404 on anything html. That'll be my fault, because I got jittery and removed the disallow html line just before the bots picked up robots.txt. It's back in now though. So can I safely disallow php, Martyn?

    Marco, the only php URLs in my sitemap are forumdisplay and showthread. When I had php disallowed my bot activity went down to almost zero.
    Last edited by Norkus; 07-12-2009 at 07:29 PM.

  2. #17
    Senior Member
    Real Name
    Marco Mamdouh
    Join Date
    May 2008
    Location
    Egypt
    Posts
    2,627
    Liked
    3 times
    Blog Entries
    1
    Quote: Originally Posted by Norkus
    Thanks Martyn. Yes, I include the old urls in the sitemap but googlebot is currently getting 404 on anything html. I assume that is normal as I now don't have any html links? (I put the disallow html back into my robots.txt so when the bots pick up the revised robots.txt later today all should be back to normal).

    Marco, the only php in my sitemap are forum display and show thread. When I had php disallowed my bot activity went down to almost zero.
    Can you give an example of one of these 404 pages?

  3. #18
    Senior Member
    Real Name
    Garry
    Join Date
    Jul 2009
    Posts
    156
    Liked
    0 times
    Here's one a bot has just looked at Marco:

    http://www.hj-research.com/forum/554-post9.html

    Everything was rewritten to the new format 4 days ago though, so I'm expecting html errors until the bots pick up the revised robots.txt in about 10 hours. It now contains the disallow html line again (it was included until I removed it today).

    My concern is the php thing. My sitemap only contains forumdisplay and showthread, but after I added the disallow php line to robots.txt yesterday, the bots looked at almost nothing, following 4 days of rising activity. This worried me, so I removed the disallow php line.

    Edit: shouldn't any html links I posted on other forums prior to the rewrite be redirected to the new url?
    Last edited by Norkus; 07-12-2009 at 07:56 PM.

  4. #19
    Senior Member
    Real Name
    Marco Mamdouh
    Join Date
    May 2008
    Location
    Egypt
    Posts
    2,627
    Liked
    3 times
    Blog Entries
    1
    That URL is from a showpost page.

    Please make sure this option in the vBSEO sitemap settings is set to no:
    include showposts pages ----> no

    Then remove all .gz files located in /data and re-generate your sitemap manually.
    After that, Google should not see any 404 pages again.

    If the option is already set to no, you should still remove all .gz files located in /data and then re-generate your sitemap manually.
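The cleanup step described above could be sketched roughly as follows. This is only an illustration: the function name `remove_gz_files` and the `dry_run` flag are made up for this example, and the caller has to supply the real path of the sitemap generator's data directory, which the thread does not spell out.

```python
from pathlib import Path
from typing import List


def remove_gz_files(data_dir: str, dry_run: bool = True) -> List[str]:
    """Return the names of the *.gz files directly inside data_dir.

    Hypothetical helper: data_dir must be the real location of the
    sitemap generator's data folder (an assumption, not given in the thread).
    Files are only deleted when dry_run is False.
    """
    removed = []
    for gz in sorted(Path(data_dir).glob("*.gz")):
        if gz.is_file():
            removed.append(gz.name)
            if not dry_run:
                gz.unlink()  # delete the stale compressed sitemap file
    return removed
```

Running it once with the default `dry_run=True` lists what would be deleted, which is a cheap check before removing anything. The sitemap still has to be re-generated manually afterwards.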

  5. #20
    Senior Member
    Real Name
    Garry
    Join Date
    Jul 2009
    Posts
    156
    Liked
    0 times
    It's already set to no, Marco. The only things I have in my sitemap are forumdisplay and showthread. I've regenerated my sitemap probably 10 times today, so Google definitely has the correct sitemap. My robots.txt currently allows html, but once the bots pick up the revised file later, html will be disallowed.
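One way to double-check the claim that a sitemap only lists forumdisplay and showthread URLs is to count the entries by script name. The sketch below assumes the sitemap follows the standard sitemaps.org XML format and has already been decompressed into a string; `count_url_kinds` and the example.com URLs are hypothetical and not taken from the thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit

# Namespace used by the sitemaps.org protocol for <urlset>, <url> and <loc>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def count_url_kinds(sitemap_xml: str) -> Counter:
    """Count sitemap entries per .php script name; everything else is 'other'."""
    root = ET.fromstring(sitemap_xml)
    kinds = Counter()
    for loc in root.iter(SITEMAP_NS + "loc"):
        path = urlsplit((loc.text or "").strip()).path
        name = path.rsplit("/", 1)[-1]
        kinds[name if name.endswith(".php") else "other"] += 1
    return kinds


# Hypothetical sample data, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/forum/showthread.php?t=1</loc></url>
  <url><loc>http://www.example.com/forum/showthread.php?t=2</loc></url>
  <url><loc>http://www.example.com/forum/forumdisplay.php?f=3</loc></url>
  <url><loc>http://www.example.com/forum/9-post1.html</loc></url>
</urlset>"""

print(count_url_kinds(SAMPLE))
```

Any count under `other` would point at entries, such as showpost-style html URLs, that were not expected to be in the sitemap.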

  6. #21
    Senior Member
    Real Name
    Marco Mamdouh
    Join Date
    May 2008
    Location
    Egypt
    Posts
    2,627
    Liked
    3 times
    Blog Entries
    1
    You should remove all .gz files located in /data and then re-generate your sitemap manually.

  7. #22
    Senior Member
    Real Name
    Garry
    Join Date
    Jul 2009
    Posts
    156
    Liked
    0 times
    I edited my reply while you were writing.

    I've regenerated my sitemap probably 10 times today, so Google definitely has the correct sitemap. My robots.txt currently allows html, but once the bots pick up the revised file later, html will be disallowed. Is the fact that I have html allowed at the moment causing the problem? Are the html links the bot is trying to look at just old links it remembers from previous visits, which no longer exist because the URLs have been rewritten in the new format? Why aren't the old URLs being redirected to the new format URLs, or is this normal after a rewrite?

  8. #23
    Senior Member
    Real Name
    Marco Mamdouh
    Join Date
    May 2008
    Location
    Egypt
    Posts
    2,627
    Liked
    3 times
    Blog Entries
    1
    Glad to hear that.

    We replace showpost pages with permalinks, so Google sees the old showpost URLs as a problem. Now that showpost pages are no longer included in your sitemap, the problem will disappear.
    I think everything in your forum is working well now.

  9. #24
    Senior Member
    Real Name
    Martyn Day
    Join Date
    Dec 2005
    Location
    Kent - UK
    Posts
    650
    Liked
    0 times
    Blog Entries
    1
    Google won't recognize a new sitemap instantly; it might take a while.

    You need to be clearer about the new and old formats. How exactly did they change? Once you set the formats, they shouldn't be altered. I can't remember if it's a 301 redirect or something else you need...


  10. #25
    Senior Member
    Real Name
    Garry
    Join Date
    Jul 2009
    Posts
    156
    Liked
    0 times
    Thanks guys. I still don't know whether adding a line to robots.txt blocking ALL php is okay, so I've gone with the robots.txt below for the next 24 hours.

    # Allow Archiver
    User-agent: ia_archiver
    Allow: /


    User-agent: *
    Disallow: /forum/*.html
    Disallow: /forum/*.htm
    Disallow: /forum/*.js
    Disallow: /forum/*.jsp
    Disallow: /forum/*.cfm
    Disallow: /forum/*.asp
    Disallow: /forum/*.aspx
    Disallow: /forum/*.cgi
    Disallow: /forum/images/
    Disallow: /forum/includes/
    Disallow: /forum/install/
    Disallow: /forum/customavatars/
    Disallow: /forum/archive/
    Disallow: /forum/attachments/
    Disallow: /forum/ajax.php
    Disallow: /forum/attachment.php
    Disallow: /forum/admincp/
    Disallow: /forum/calendar.php
    Disallow: /forum/cron.php
    Disallow: /forum/editpost.php
    Disallow: /forum/global.php
    Disallow: /forum/image.php
    Disallow: /forum/inlinemod.php
    Disallow: /forum/joinrequests.php
    Disallow: /forum/login.php
    Disallow: /forum/member.php
    Disallow: /forum/memberlist.php
    Disallow: /forum/misc.php
    Disallow: /forum/moderator.php
    Disallow: /forum/newattachment.php
    Disallow: /forum/newreply.php
    Disallow: /forum/newthread.php
    Disallow: /forum/online.php
    Disallow: /forum/poll.php
    Disallow: /forum/postings.php
    Disallow: /forum/printthread.php
    Disallow: /forum/profile.php
    Disallow: /forum/register.php
    Disallow: /forum/report.php
    Disallow: /forum/reputation.php
    Disallow: /forum/search.php
    Disallow: /forum/sendmessage.php
    Disallow: /forum/showgroups.php
    Disallow: /forum/subscription.php
    Disallow: /forum/threadrate.php
    Disallow: /forum/usercp.php
    Disallow: /forum/usernote.php

    Sitemap: http://www.hj-research.com/forum/sitemap_index.xml.gz
    Last edited by Norkus; 07-13-2009 at 09:17 AM.
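To see which URLs a rule set like the one above would block, a small matcher can help. The sketch below is not any crawler's actual implementation: it assumes the wildcard semantics documented by the major search engines (`*` matches any run of characters, a trailing `$` anchors the end), returns the first matching rule rather than the most specific one, and ignores Allow lines. `rule_to_regex` and `blocking_rule` are made-up names, and the showthread URL is a hypothetical example. Python's standard `urllib.robotparser` is not used because it does not interpret `*` as a wildcard.

```python
import re
from typing import List, Optional
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule into a regex matched from the path start."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces, then join them with "match anything" for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def blocking_rule(url: str, disallow_rules: List[str]) -> Optional[str]:
    """Return the first Disallow rule that matches the URL's path, or None."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    for rule in disallow_rules:
        if rule and rule_to_regex(rule).match(path):
            return rule
    return None


# A few of the Disallow lines from the robots.txt above.
DISALLOW = ["/forum/*.html", "/forum/*.htm", "/forum/member.php", "/forum/search.php"]

# The 404 URL quoted earlier in the thread.
print(blocking_rule("http://www.hj-research.com/forum/554-post9.html", DISALLOW))
# Hypothetical showthread URL, for illustration only.
print(blocking_rule("http://www.example.com/forum/showthread.php?t=554", DISALLOW))
```

Under these assumptions the first call reports `/forum/*.html` and the second reports `None`. It also means that any rewritten URL under /forum/ that ends in .html would match the first rule, so it is worth checking whether the new URL format uses that extension.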

  11. #26
    Senior Member
    Real Name
    Garry
    Join Date
    Jul 2009
    Posts
    156
    Liked
    0 times
    Well, the bots picked up the new robots.txt and bang... they all disappeared and haven't been back since.

    Can I pay someone to have a proper look at this?
    Last edited by Norkus; 07-13-2009 at 10:46 AM.

