At the recent Search Engine Strategies conference in Chicago, there were many questions about duplicate content. We recognize that there are many subtleties and weak spots (our own mistakes) in handling such content, so I would like to clarify a few points:
Why does Google care about content originality?
Our users usually want to see a variety of sites (or articles) in response to a query. Imagine the frustration of typing a query and seeing 10 identical articles from different sites on the first page of search results. Webmasters also still complain to us when a page like example.com/contentredir?value=shorty-george=en ranks higher than example.com/en/shorty-george.htm.
What does Google do with duplicate content?
When indexing and ranking sites, we try to choose pages with original information. This filtering means that if an article on your site exists in two versions, “normal” and “print”, and neither is blocked via robots.txt or a noindex tag, only one version of the article will be kept in the index. In rare cases, when we see that duplicate content is on a site in order to manipulate search results, we may exclude the site from the search results. However, we prefer to filter rather than remove sites with duplicate content from the results. So in most cases, the worst that can happen to your site is a lower position in the search results.
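As a hedged illustration of the robots.txt option mentioned above, the following Python sketch uses the standard library's urllib.robotparser to check that a print copy is blocked while the normal copy stays crawlable. The /print/ directory and the print URL are assumptions made for this example, not paths from the original article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /print/ directory is an assumption for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


normal_url = "http://example.com/en/shorty-george.htm"
print_url = "http://example.com/print/shorty-george.htm"  # assumed print copy

print("normal copy crawlable:", is_crawlable(ROBOTS_TXT, normal_url))
print("print copy crawlable:", is_crawlable(ROBOTS_TXT, print_url))
```

A check like this only confirms what the rules in the file say; it does not tell you what is currently in the index.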
What is the best way to handle duplicate text?
- Instead of leaving our robot to choose among the copies of a text (regular, “print”, and so on), block the extra copies in your robots.txt file.
- Use a 301 redirect (for example, via .htaccess) if you have changed the structure of the site.
- Link consistently: pick one form of each URL and do not mix /page/, /page and /page/index.htm.
- Use top-level domains rather than subdomains, and remember that the user's country is actively used in ranking (for example, Russian-speaking users will be shown .ru domains first).
- Syndicate via RSS carefully: always make sure that sites importing your articles link back to your site in EVERY article.
- If your site is linked to as both “site.ru” and “www.site.ru”, indicate WHICH version of the site should be indexed.
- Minimize repeated blocks of text across pages. For example, if the top or bottom of every article carries several sentences spelling out copying prohibitions and the like, the best solution is to move that text to a separate page and link to it from every article.
- Avoid publishing stub pages, such as alphabetical or by-country indexes where clicking a link shows the user an empty template (for example, you have no articles under the letter I, but there is still a link to “I”). Users do not like such tricks, and we work for users.
- Get to know your CMS well: try to find out all the ways in which it duplicates content (for example, print versions, mobile versions, and so on).
- “Don't worry, be happy.” Don't worry too much about content being duplicated or stolen from your site; as a rule, Google copes with such thieves without problems. If a scraper really has become a problem, go to www.google.com/dmca.html and file a request to have the copy removed from the index.
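Several of the tips above (301 redirects, consistent links, one preferred host) come down to mapping every duplicate form of a URL to a single preferred form. The Python sketch below shows one possible version of that mapping. It is not how Google or an .htaccess rule works internally; the preferred host www.site.ru, the index file names, and the rule that extensionless paths get a trailing slash are all assumptions chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: "www.site.ru" is the preferred host and
# these file names are served as the directory index.
HOST_ALIASES = {"site.ru": "www.site.ru"}
INDEX_NAMES = ("index.htm", "index.html")


def canonical_url(url: str) -> str:
    """Map common duplicate forms of a URL to one preferred form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat the bare domain and the www host as the same site.
    host = HOST_ALIASES.get(host, host)
    path = parts.path or "/"
    # /page/index.htm -> /page/
    for name in INDEX_NAMES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # /page -> /page/ when the last segment has no file extension.
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment and "." not in last_segment:
        path += "/"
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_target(url: str):
    """Return the URL to 301-redirect to, or None if `url` is already canonical."""
    target = canonical_url(url)
    return None if target == url else target


for candidate in (
    "http://site.ru/page",
    "http://www.site.ru/page/index.htm",
    "http://www.site.ru/page/",
):
    print(candidate, "->", redirect_target(candidate))
```

The same function can be reused when generating internal links, so that the site itself never links to a form it would have to redirect.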
Original article: http://googlewebmastercentral.blogspot.com/2006/12/deftly-dealing-with-duplicate-content.html
Russian translation of the article: http://blog.seotrade.ru/?p=12