2006 SES Conference - Duplicate Content Session


Old 03-07-2006, 05:04 PM
Real Name: Mike Simonds

After attending the Search Engine Strategies conference last week at the Hilton in Manhattan, New York City, I decided to share what I took away from the sessions I attended.

The following notes are from the duplicate content session. I will attach some PowerPoint slides for members here to download and view:
  • Discussion of hosting the same content on multiple domains. Nothing new that we haven't already experienced or known. (Kennedy 3)
  • Watch out for dynamic URLs that serve the same information. (http://www.superpages.com/yellowpages/C-Art+Galleries+%26+Dealers/S-IL/T-Bloomington/ serves the same content as http://www.superpages.com/yellowpages/S-IL/T-Bloomington/C-Art+Galleries+%26+Dealers) I believe Parid fixed these issues last year. (Kennedy 11)
  • If your content has been dropped from a search engine, you can fill out a reinclusion request. (Kennedy 15-17)
  • Yahoo
    • Search engines are looking for unique content. They are removing headers, footers, and side navigation when indexing.
    • Press releases are not considered duplicate spam because of their linkage properties.
    • Search engines look at host name resolution; multiple host names per IP address.
    • When indexing, web pages are broken into word sets (shingles). Rearranging those shingles into a different order doesn't add any benefit; the search engine still sees that all the same shingles are there.
    • If you must have duplicate content, use a robots meta tag (noindex) to keep the secondary copies out of the index.
    • One panelist (Yahoo?) mentioned that print-friendly pages are a problem since they duplicate the original page. Matt Cutts said not to worry about print-friendly pages.
    • Track paths through cookies, not URLs. This seems most appropriate for labs, since they do a lot of tracking through URLs, which ends up duplicating content.
    • Make sure to link to directory pages consistently. These three links have different URLs (although I can't imagine that search engines are really having trouble distinguishing between the first two):
      • directory/
      • directory
      • directory/index.cgi
    • Don't use session IDs.
    • When redesigning a site, make sure that only one version of the site (old or new) is being indexed.
  • Matt Cutts
    • People are more worried than they should be. Google knows mistakes happen and isn't looking to punish anyone for innocent mistakes. An example was given of someone who was asking about duplicate content issues. When asked how many domains their content was on, they sheepishly replied 2500 domains. This is the type of person/site Google wants to go after (my 2c: at first, anyway). Search engines are aware there are good and bad reasons for showing duplicate content. (I remember Yahoo nodding at this point.)
    • Google will soon be rolling out a new infrastructure that will specifically deal with multiple domains.
    • Use Google Sitemaps, which lets you test robots.txt. You can see exactly how Googlebot will crawl, which pages it gets stuck on, and which pages end up with duplicate content.
    • Search Engines are getting smarter about understanding JavaScript. (My 2c: My feeling when this was being talked about was that Google probably has a bot that understands JavaScript. They can, or soon will, figure out what the script is doing. Yahoo was nodding with a grin on his face as well. However, I expect they will only use these tools for internal analysis.)
  • Question: "How can a search engine determine what the 'real page' is?"
    • Yahoo: By using shingling techniques and algorithms.
    • Google: By using algorithms that look at how often a site has duplicate content from other sources: "How much you copy from versus how much others are copying from you."
  • Yahoo has a possible proposal for webmasters to "noindex" only portions of pages.
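
To make the shingle point above concrete, here is a minimal sketch of textbook word shingling with a Jaccard overlap score. This is my own illustration, not anything the panelists showed, and it is certainly not the engines' real code; the shingle size of 3 and the sample sentences are assumptions I picked for the demo.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical sample texts: the second is the first with its sentences swapped.
original = ("search engines look for unique content. "
            "they strip headers and footers when indexing.")
reordered = ("they strip headers and footers when indexing. "
             "search engines look for unique content.")
unrelated = "session ids in urls create many addresses for one page."

sim_reordered = jaccard(shingles(original), shingles(reordered))
sim_unrelated = jaccard(shingles(original), shingles(unrelated))

print(f"reordered copy: {sim_reordered:.2f}")
print(f"unrelated text: {sim_unrelated:.2f}")
```

Swapping the two sentences only breaks the shingles that straddle the seam, so most of the set survives and the reordered copy still scores as a near-duplicate.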
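
The advice above about consistent directory links and avoiding session IDs boils down to picking one canonical form for each URL. Below is a rough sketch of that idea using Python's standard urllib.parse. The session parameter names, the index file names, and the rule that a last path segment without a dot is a directory are all assumptions of mine for illustration; a real site would need its own rules.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names; adjust to whatever your own software actually uses.
SESSION_PARAMS = {"s", "sid", "sessionid", "phpsessid"}
INDEX_FILES = {"index.cgi", "index.html", "index.php"}

def canonical_url(url):
    """Collapse common duplicate-URL variants into one canonical form."""
    parts = urlsplit(url)
    path = parts.path
    head, _, last = path.rpartition("/")
    if last in INDEX_FILES:
        # directory/index.cgi -> directory/
        path = head + "/"
    elif last and "." not in last:
        # directory -> directory/ (assumes extensionless segments are directories)
        path = path + "/"
    # Drop session parameters and sort the rest so parameter order doesn't matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in SESSION_PARAMS
    )
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path or "/", urlencode(query), "")
    )

variants = [
    "http://example.com/directory/",
    "http://example.com/directory",
    "http://Example.com/directory/index.cgi?sid=abc123",
]
canonical = {canonical_url(u) for u in variants}
print(canonical)
```

All three variants collapse to a single URL here. Note that this does not handle reordered path segments like the superpages.com example; that kind of fix is specific to how a site builds its paths.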
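
On testing robots.txt: besides Google's own tool, you can sanity-check a rule set locally before publishing it. This is a small sketch using Python's standard urllib.robotparser; the rules and the example.com URLs are made up for the demo, and the local parser only approximates how a real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep a secondary /print/ copy of each article out of crawls.
rules = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

article_ok = parser.can_fetch("Googlebot", "http://example.com/articles/article-1")
print_ok = parser.can_fetch("Googlebot", "http://example.com/print/article-1")

print("article crawlable:", article_ok)
print("print copy crawlable:", print_ok)
```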

You can download the PPT at Rantchaos


Thanks
Mike
Sportsrant.com