Sitemap generating a lot more urls

Forum: Troubleshooting, part of the vBSEO Google/Yahoo Sitemap category

  1. #1
    Junior Member
    Join Date
    Nov 2005
    Posts
    24
    Liked
    0 times

    Sitemap generating a lot more urls

    I have a problem with the vBSEO sitemap generator.

    These are my forum stats:
    Members: 3,402
    Threads: 3,151
    Posts: 15,883

    So I'd guess there shouldn't be more than roughly 24,000 URLs. But this is what I get in the sitemap report:
    Date 2006-04-22 07:00
    Processing time 743.82 s
    Total URLs 46,655 (+2)
    Forumdisplay URLs 494 (-)
    Showthread URLs 23,208 (-)
    ShowPost URLs 15,844 (-)
    Archive URLs 267 (-)
    Member Profile URLs 6,776 (+2)
    Poll Results URLs 64 (-)

    There is a huge difference in the showthread URLs, and the member profile URLs are almost double the number of members.

    I had a look at the XML file, and I guess the non-SEOed URLs are being added along with the SEOed URLs.

    Any idea?
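
A quick way to put numbers on that suspicion is to divide each report count by the matching forum statistic. The sketch below uses only the figures quoted in this post; the dictionary and variable names are made up for illustration.

```python
# Forum statistics and sitemap report figures as quoted in the post above.
forum_stats = {"members": 3402, "threads": 3151, "posts": 15883}
report = {
    "forumdisplay": 494,
    "showthread": 23208,
    "showpost": 15844,
    "archive": 267,
    "member_profile": 6776,
    "poll_results": 64,
}


def urls_per_item(url_count: int, item_count: int) -> float:
    """Return how many sitemap URLs were emitted per underlying forum item."""
    return url_count / item_count


profile_ratio = urls_per_item(report["member_profile"], forum_stats["members"])
thread_ratio = urls_per_item(report["showthread"], forum_stats["threads"])
post_ratio = urls_per_item(report["showpost"], forum_stats["posts"])

print(f"profile URLs per member: {profile_ratio:.2f}")
print(f"showthread URLs per thread: {thread_ratio:.2f}")
print(f"showpost URLs per post: {post_ratio:.2f}")
```

A profile ratio close to 2 would be consistent with every profile being listed twice. The showthread ratio is harder to read, since multi-page threads presumably add URLs of their own, so it does not prove duplication by itself.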

  2. #2
    vBSEO Staff
    Real Name
    Oleg Ignatiuk
    Join Date
    Jun 2005
    Location
    Belarus
    Posts
    25,818
    Liked
    192 times
    Hello,

    if you are using Sitemap Generator v1.7, there is an option to include the original vB URLs along with the SEOed URLs in the sitemap.
    Oleg Ignatiuk / Crawlability Inc.
    Security bulletin - Patch Level for all supported versions released

    Unveiling the NEW vBSEO Sitemap Generator 3.0. - available NOW for vBSEO Customers!
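
To see whether original vB URLs really are in the generated file, one option is to scan the sitemap XML and split the `<loc>` entries into original and rewritten URLs. This is only a sketch: the regular expression assumes the stock vBulletin script names (showthread.php, member.php and so on), the sample document and example.com URLs are invented, and `split_urls` is a hypothetical helper, not part of vBSEO.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: original vBulletin URLs call one of these scripts with a query string.
ORIGINAL_URL = re.compile(r"/(showthread|showpost|member|forumdisplay|poll)\.php\?")

# Invented sample; a real sitemap would be read from the generated file instead.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/forum/showthread.php?t=42</loc></url>
  <url><loc>http://www.example.com/forum/some-forum/42-some-thread.html</loc></url>
  <url><loc>http://www.example.com/forum/member.php?u=7</loc></url>
  <url><loc>http://www.example.com/forum/members/some-user.html</loc></url>
</urlset>"""


def split_urls(sitemap_xml: str):
    """Return (original, rewritten) URL lists found in the <loc> elements."""
    root = ET.fromstring(sitemap_xml)
    # Match on the tag suffix so the check works regardless of the XML namespace.
    locs = [el.text.strip() for el in root.iter() if el.tag.endswith("loc") and el.text]
    original = [url for url in locs if ORIGINAL_URL.search(url)]
    rewritten = [url for url in locs if not ORIGINAL_URL.search(url)]
    return original, rewritten


original, rewritten = split_urls(SAMPLE_SITEMAP)
print(len(original), "original URLs,", len(rewritten), "rewritten URLs")
```

If the first count is far from zero on a real sitemap, the "include old URLs" option is a likely cause.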


  3. #3
    Junior Member
    Join Date
    Nov 2005
    Posts
    24
    Liked
    0 times
    Ohh, got it!

    Thanks for the info, Oleg.

  4. #4
    Junior Member
    Join Date
    Jan 2006
    Posts
    23
    Liked
    0 times
    Hello all,

    I have a problem with "member profile" pages.

    Practically all of my pages that Google or Yahoo take into account are "member profile" pages.

    Is it possible to avoid this?

    Thank you

    Monique

  5. #5
    vBSEO Staff
    Real Name
    Oleg Ignatiuk
    Join Date
    Jun 2005
    Location
    Belarus
    Posts
    25,818
    Liked
    192 times
    Hello Monique,

    do you mean you want to exclude member profile pages from indexing by search engines? (you can do that using robots.txt)

  6. #6
    Junior Member
    Join Date
    Jan 2006
    Posts
    23
    Liked
    0 times
    Quote Originally Posted by Oleg Ignatiuk
    Hello Monique,

    do you mean you want to exclude member profile pages from indexing by search engines?
    Yes

    Quote Originally Posted by Oleg Ignatiuk
    Hello Monique,

    (you can do that using robots.txt)
    OK
    You also said:

    Quote Originally Posted by Oleg Ignatiuk
    Hello,

    your robots.txt entries are correct.
    By the way, there is a new robots.txt tool available in the Google sitemaps account.
    My question:

    Do I have to use the Google robots.txt?

    Can I use something like this:

    # Disallow directory
    User-agent: *
    Disallow: /admincp/
    Disallow: /ajax.php
    Disallow: /articlebot/
    Disallow: /attachment.php
    Disallow: /attachments/
    Disallow: /clientscript/
    Disallow: /cpstyles/
    Disallow: /customavatars/
    Disallow: /customprofilepics/
    Disallow: /calendar.php
    Disallow: /cgi-bin/
    Disallow: /cron.php
    Disallow: /editpost.php
    Disallow: /external.php
    Disallow: /frm_attach
    Disallow: /images/
    Disallow: /includes/
    Disallow: /inlinemod.php
    Disallow: /install/
    Disallow: /joinrequests.php
    Disallow: /login.php
    Disallow: /member.php?
    Disallow: /memberlist.php
    Disallow: /misc.php
    Disallow: /modcp/
    Disallow: /moderator.php
    Disallow: /modules/
    Disallow: /newattachment.php
    Disallow: /newreply.php
    Disallow: /newrs.cal
    Disallow: /newthread.php
    Disallow: /oldindex.html
    Disallow: /online.php
    Disallow: /poll.php
    Disallow: /postings.php
    Disallow: /printthread.php
    Disallow: /private.php
    Disallow: /profile.php
    Disallow: /register.php
    Disallow: /report.php
    Disallow: /reputation.php
    Disallow: /search.php
    Disallow: /sendmessage.php
    Disallow: /showgroups.php
    Disallow: /ssm.js

    Someone in this forum said that I have to put this robots.txt in the root. Where is that?

    There are a lot of parameters (in vBulletin, in vBSEO, in Google Sitemaps) and they leave me completely lost. I also have difficulty writing good English (I am French-speaking).

    Thanks

    Monique

  7. #7
    vBSEO Staff
    Real Name
    Oleg Ignatiuk
    Join Date
    Jun 2005
    Location
    Belarus
    Posts
    25,818
    Liked
    192 times
    Hello Monique,

    these robots.txt entries are correct, but only when vBulletin resides in the domain root. In your case, you should add /forum to the start of the path on every line, like this:
    Code:
    User-agent: *
    Disallow: /forum/admincp/
    Disallow: /forum/ajax.php
    Disallow: /forum/articlebot/
    ... etc
    and put this robots.txt in the site's root.
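
Adding the prefix by hand to a long list of lines is error-prone, so a small script can do it instead. This is a rough sketch, not an official tool: `prefix_robots_rules` is a made-up helper name, and `/forum` is the install directory discussed in this thread.

```python
def prefix_robots_rules(robots_txt: str, prefix: str = "/forum") -> str:
    """Prepend `prefix` to the path of every Allow/Disallow line."""
    result = []
    for line in robots_txt.splitlines():
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() in ("allow", "disallow"):
            path = value.strip()
            # Leave empty rules and already-prefixed paths untouched.
            if path.startswith("/") and not path.startswith(prefix + "/"):
                line = f"{field.strip()}: {prefix}{path}"
        result.append(line)
    return "\n".join(result)


root_rules = "\n".join([
    "# Disallow directory",
    "User-agent: *",
    "Disallow: /admincp/",
    "Disallow: /ajax.php",
    "Disallow: /articlebot/",
])
print(prefix_robots_rules(root_rules))
```

Comment lines and the User-agent line pass through unchanged; only the rule paths are rewritten.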

  8. #8
    Junior Member
    Join Date
    Jan 2006
    Posts
    23
    Liked
    0 times
    Quote Originally Posted by Oleg Ignatiuk
    Hello Monique,

    Code:
    User-agent: *
    Disallow: /forum/admincp/
    Disallow: /forum/ajax.php
    Disallow: /forum/articlebot/
    ... etc
    and put this robots.txt in the sites root.
    Thank You Oleg

    So I put my robots.txt (as above, with /forum/admincp/, etc.) in www and not in the forum directory, is that right?

    http://www.cameravideo.net/robots.txt

    Is it OK?

    Thank you

    Monique

  9. #9
    vBSEO Staff
    Real Name
    Oleg Ignatiuk
    Join Date
    Jun 2005
    Location
    Belarus
    Posts
    25,818
    Liked
    192 times
    Yes, exactly

    If you want to exclude your member profile pages from indexing, you should add one more line to robots.txt:
    Code:
    Disallow: /forum/members/
    However, you may want to keep them accessible to search engines; sometimes users include useful information in their profile fields.
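
Before relying on the new rule, it can be checked locally with Python's standard `urllib.robotparser` module. The sketch below is only an illustration: the rule set is a shortened version of the one in this thread, the example.com URLs are invented, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Shortened rule set based on the entries discussed in this thread.
RULES = """\
User-agent: *
Disallow: /forum/admincp/
Disallow: /forum/members/
"""


def is_allowed(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Invented URLs: one member profile page and one thread page.
profile_url = "http://www.example.com/forum/members/some-user.html"
thread_url = "http://www.example.com/forum/showthread.php?t=1"

print("profile allowed:", is_allowed(RULES, profile_url))
print("thread allowed:", is_allowed(RULES, thread_url))
```

This only tests how the rules are parsed. Search engines still have to re-crawl the site before profile pages drop out of their indexes.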

  10. #10
    Member
    Real Name
    Rik Brown
    Join Date
    Apr 2006
    Location
    St. Louis, Missouri, USA
    Posts
    30
    Liked
    0 times
    Quote Originally Posted by Oleg Ignatiuk
    Hello,

    if you are using a Sitemap generator v1.7, there is an option to include original vBulletin URLs along with SEOed URLs in the sitemap.
    This is happening to me with v1.7, but I'm not sure which option to change. Could you please tell me the option name?

    Thanks. -- Rik

  11. #11
    vBSEO Staff
    Real Name
    Oleg Ignatiuk
    Join Date
    Jun 2005
    Location
    Belarus
    Posts
    25,818
    Liked
    192 times
    The option's title is: "If you have vBSEO, include old urls?" (2nd option in the list)

  12. #12
    Member
    Real Name
    Rik Brown
    Join Date
    Apr 2006
    Location
    St. Louis, Missouri, USA
    Posts
    30
    Liked
    0 times
    Quote Originally Posted by Oleg Ignatiuk
    The option's title is: "If you have vBSEO, include old urls?" (2nd option in the list)
    Oleg:

    I've got that option set to "No."

    FYI, I'm getting over 850,000 URLs in the Google sitemap but I only have 680,000 forum pages. Here are my other "include" settings:

    Include Show Post Pages: Yes
    Include Member Profile Pages: No
    Include Archive Pages: No
    Include Show Thread Pages: Yes
    Include Forum Display Pages: Yes
    Include Poll Results Pages: No

    Could one of the above settings be the culprit?

    Thanks. -- Rik

  13. #13
    vBSEO Staff
    Real Name
    Oleg Ignatiuk
    Join Date
    Jun 2005
    Location
    Belarus
    Posts
    25,818
    Liked
    192 times
    You can find exact numbers per "page type" in the generator reports.

  14. #14
    Member
    Real Name
    Rik Brown
    Join Date
    Apr 2006
    Location
    St. Louis, Missouri, USA
    Posts
    30
    Liked
    0 times
    Quote Originally Posted by Oleg Ignatiuk
    You can find exact numbers per "page type" in the generator reports.
    Oleg: Will check there. Thanks. -- Rik
