By request, I have started translating the article "Google PageRank: What Do We Know About It?". So far this is only what I got through yesterday evening. If anyone wants the rest, write to me and I will translate and post it. Proofreading and error reports are welcome, because I have not really done translation before. :)
Google PageRank: What do we know about it?
Everyone uses it, but almost no one knows how it actually works. Google PageRank is probably one of the most important algorithms ever developed for the web. With billions of existing pages and millions of new ones appearing every day, producing search results is much more complicated than you might imagine. PageRank is one of hundreds of factors Google takes into account to pick the best results for a search query and keep searching simple and effective. But how is it actually calculated? How does Google PageRank work, what affects it, and what doesn't? And what do we really know about PageRank?
This article sticks to the bare facts.
For several weeks we researched the topic intensively and collected many facts and assumptions about PageRank that seem close to reality. In addition, we have compiled research articles related to search results, such as proposals for improving them (for example, topic-sensitive PageRank). You will read about the mathematical side of PageRank, as well as about 16 useful tools for working with PageRank that you can use to analyze and track your web projects.
In short: how does it work?
PageRank is one of the many methods that Google uses to determine the relevance or importance of a page.
Google interprets a link from page A to page B as a “vote” by A for B. It looks at more than the sheer number of votes: more than a hundred other aspects of the page casting the vote are analyzed as well.
PageRank is based on incoming links, but not only on their number: their relevance and quality also matter.
PR(A) = (1 - d) + d (PR(t1)/C(t1) + … + PR(tn)/C(tn)). This is the formula used to calculate PageRank.
Not all links have the same “weight” when it comes to PR.
If you have a page with PR = 8 and a single link from it to another page, that page receives a certain boost to its PR. But if your page carries 100 links, each of them passes on only a hundredth of that boost.
Broken incoming links do not affect PR.
The age of the site, the relevance of backlinks, and how long those backlinks have existed are taken into account when calculating popularity, but not when calculating PageRank.
Content is not considered when calculating PR.
PageRank is not determined for the entire site at once, but for each page separately.
Every link to your site counts toward the result, except links from banned sites, which are left out of the calculation.
PageRank is not a whole number from 0 to 10. It is a floating-point number. Also, the initial PR value is slightly greater than 0.
Each successive PageRank level is progressively harder to reach. We believe it is calculated on a logarithmic scale.
Google recalculates the PR of each page every few months.
Google tries to find pages that are both relevant and “respected”.
In short: impact on Google PageRank
Frequent content updates do not automatically improve page rank.
A high PageRank does not mean a high position in the search results.
Being listed in DMOZ and Yahoo! does not automatically improve PageRank.
Hosting a site on a .edu or .gov domain does not automatically improve PageRank.
Sub-directories do not necessarily have a lower page rank than root directories.
Links from Wikipedia do not automatically improve PageRank (but pages that reuse its content can improve PR).
Links with the nofollow attribute do not contribute to PageRank.
Effective internal linking affects PageRank.
Links from relevant, high-ranking sites carry more weight in the calculation.
A link's anchor text is often far more important than whether the link sits on a high-PR page.
Links to and from high-quality, relevant sites matter for PR.
Several links from one page to the same target count for as much as a single link from that page to that target.
A site can be banned for linking to banned sites.
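Two of the points above (nofollow links do not count, and repeated links to the same target from one page count once) can be illustrated with a small sketch. This is only an illustration using Python's standard html.parser module, not a description of how Google actually processes pages; the class name FollowedLinkCollector, the function followed_targets and the example.com URLs are made up for this example.

```python
from html.parser import HTMLParser


class FollowedLinkCollector(HTMLParser):
    """Collects distinct link targets, skipping rel="nofollow" links."""

    def __init__(self):
        super().__init__()
        # A set keeps each target once, however often the page repeats it.
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel may hold several space-separated values, or be missing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_values:
            self.targets.add(href)


def followed_targets(html):
    """Return the set of link targets that would be allowed to 'vote'."""
    collector = FollowedLinkCollector()
    collector.feed(html)
    return collector.targets


sample_page = """
<a href="http://example.com/a">first</a>
<a href="http://example.com/a">same target again</a>
<a href="http://example.com/b" rel="nofollow">not followed</a>
<a href="http://example.com/c">third</a>
"""

print(sorted(followed_targets(sample_page)))
# ['http://example.com/a', 'http://example.com/c']
```

Out of the four links on the sample page, only two distinct targets remain: the duplicate collapses into one entry and the nofollow link is dropped.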
1.1. Why PageRank?
“PageRank is [only] one of the methods Google uses to determine page relevance or importance.” [ PageRank Explained Correctly ]
“Google uses many ranking factors. Of these, the PageRank algorithm may be the best known. PageRank evaluates two things: 1. how many links point to a page from other sites; 2. the quality of those sites. Links from five or six high-quality sites (such as cnn.com, nytimes.com) will count for more than twice as many links from lesser-known sites.” [ Google Librarian Central ]
“PageRank is only a rough estimate of the quality of a web page and in no way a measure of its topical relevance. Topical relevance depends on the content of the links and on factors such as the match between content and keywords, the title, etc.” [ PageRank: An Essay ]
1.2. How does it work?
No one is completely sure. “No one really knows how Google calculates PR at the moment.” [ Google PageRank Explained ]
PR(A) = (1 - d) + d (PR(t1)/C(t1) + … + PR(tn)/C(tn)). “This formula shows how PageRank is calculated. Here, 't1 - tn' are the pages that link to page A, 'C' is the number of outgoing links on a page, and 'd' is the damping factor, usually 0.85.”
“We can write it more simply: PageRank = 0.15 + 0.85 * (a “share” of the PageRank of every page that links to this one). The “share” equals the PR of the linking page divided by the number of links going out from it. A page “votes” with an amount of PageRank that is a little less than its own PageRank value (its own value * 0.85). This amount is split among all the pages it links to.” [ Google's Page Rank ]
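To make the formula more tangible, here is a minimal sketch that applies it repeatedly to a tiny made-up link graph until the values settle. It implements only the published formula with d = 0.85; the function name pagerank, the starting value of 1.0 for every page, the fixed number of iterations and the three-page graph are assumptions chosen for the example, and nothing here claims to reproduce what Google runs in practice.

```python
def pagerank(links, d=0.85, iterations=100):
    """Apply PR(A) = (1 - d) + d * sum(PR(t) / C(t)) repeatedly.

    links maps each page to the list of distinct pages it links to.
    """
    # Every page that appears as a source or as a target gets a rank.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Assumed starting point: every page begins with a rank of 1.0.
    ranks = {page: 1.0 for page in pages}

    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each linking page contributes its rank divided by the
            # number of its outgoing links (its "share").
            shares = sum(
                ranks[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * shares
        ranks = new_ranks
    return ranks


# Hypothetical graph: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
for page in sorted(ranks):
    print(page, round(ranks[page], 3))
```

In this small graph page C ends up with the highest value because it collects votes from both A and B, while B, which only receives half of A's vote, ends up lowest. Note that a page's own value is computed only from the pages linking to it; casting votes does not subtract anything from it.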
“The essence of the Google PageRank algorithm is that a page distributes its own PR among its outbound links. If you have a page with PR = 8 and one link to another page, that page receives the full “weight” of your PR. But if you have not one but a hundred links, each link carries an equal part of the “weight” of your PR (in other words, 1/100 of it).” [ The Importance of PageRank ]
“It follows that a link from a page with PR = 4 and five links is worth more than a link from a page with PR = 8 and a hundred links. The PageRank of a page that links to yours is important, but so is the number of links on that page. The more links a page has, the less PR “weight” each of them carries.” [ Google’s Page Rank ]
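The comparison in the quote is easy to check with plain arithmetic. The sketch below treats the PR values as ordinary numbers, exactly as the quote does, and ignores the damping factor; the helper name link_share is made up for this example.

```python
def link_share(page_rank, outgoing_links):
    """Weight carried by one link: the page's PR split evenly among its links."""
    return page_rank / outgoing_links


modest_page = link_share(4, 5)      # PR = 4, five outgoing links
popular_page = link_share(8, 100)   # PR = 8, a hundred outgoing links

print(modest_page, popular_page)    # 0.8 0.08
```

Under these assumptions a link from the PR = 4 page carries ten times the weight of a link from the PR = 8 page.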
“PageRank [..] uses the link structure as an indicator of each individual page's value. Google interprets a link from page A to page B as a “vote” by page A for page B. Google looks at much more than the number of such “votes”, or links, that a page receives; it also analyzes the page that casts the “vote”. Votes from “important” pages count for much more and help other pages increase their “weight”.” [ Google: Technology ]
“Not all links “weigh” the same when it comes to PR. “Important” pages add more to your PR than “less important” ones (according to Google, of course). [...] The PR a page can pass on is divided by the number of outgoing links on the “voting” page. So a page with PR = 4 and one link can pass on more weight than a page with PR = 5 and hundreds of outgoing links. A typical example is the famous Million Dollar Homepage (milliondollarhomepage). The PR of this page is 7, but with hundreds of links on it, it passes very little weight to each page it links to.” [ Google PageRank Explained ]
Each successive PR level is progressively harder to reach. “PageRank is calculated on a logarithmic scale, much like the Richter scale used to measure earthquakes, so there is real mathematics behind the PageRank number. It takes one step to get from PR = 0 to PR = 1, a few more steps from 1 to 3, many more to reach 4, even more to reach 5, and so on.” [ Google Page Rank FAQ ]
“PageRank does not rank entire sites; it is determined for each page separately. Furthermore, the PageRank of page A is defined recursively by the PageRank of the pages that link to page A.” [ The Page Rank algorithm ]
“Google combines PageRank with sophisticated text search technology to find both important and relevant pages for the user. Google analyzes all the content details of a page (and the contents of the pages that link to this one) in order to achieve the best possible search results. ”[ What Is Google PageRank? ]
“Google recalculates the PR of each page once every few months (a PR update). After the update is complete, all pages receive a new PR from Google, which they keep until the next update. New sites have a rank of 0 until an update takes place and a PR level is assigned to them.” [ Google PageRank Explained ]
The PageRank value is not just a number from 0 to 10; it is a floating-point number. “It is more accurate to think of PR as a floating-point number. Of course, our internal PR calculations have far more precision than the 0-10 values shown in the toolbar.” [ Matt Cutts ]
“We are confident that the curve resembles an exponential one, where each new level is harder to reach than the previous one. I have personally run several studies on this topic, and the results pointed to an exponential base of 4. So PR = 6 is 4 times harder to reach than PR = 5. [..] The difference between the top of PR = 6 and the bottom of PR = 6 can be hundreds or thousands of links.” [ Top 10 Google Myths Revealed ]
“PageRank is assumed to be calculated on a logarithmic scale. This means that the gap between PR = 4 and PR = 5 is roughly 5 to 10 times larger than the gap between PR = 3 and PR = 4. So there are probably 100 times more pages with PR = 2 than with PR = 4. It also means that if you reach a PageRank of 6 or higher, you are in the top 0.1% of all sites.” [ Importance of Google PageRank ]
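Here is a rough sketch of what such a logarithmic (exponential) scale could look like. The real mapping from Google's internal value to the 0-10 toolbar number is not public, so everything in this block is an assumption made for illustration: the function names toolbar_level and effort_ratio, the idea that each level starts at the next power of the base, and the bases themselves (4 and 10 are simply the figures suggested in the quotes above).

```python
def toolbar_level(raw_score, base, max_level=10):
    """Map an internal score to a 0-10 level, one level per power of base."""
    level = 0
    threshold = base
    # Climb one level each time the score reaches the next power of base.
    while level < max_level and raw_score >= threshold:
        level += 1
        threshold *= base
    return level


def effort_ratio(from_level, to_level, base):
    """How many times larger the score must be to move between two levels."""
    return base ** (to_level - from_level)


print(toolbar_level(15, base=4))    # 1  (15 is still below 4 * 4 = 16)
print(toolbar_level(16, base=4))    # 2
print(effort_ratio(5, 6, base=4))   # 4
print(effort_ratio(2, 4, base=10))  # 100
```

With a base of 4, going from level 5 to level 6 takes a score four times as large, which matches the “4 times harder” estimate; with a base of 10, two levels correspond to a factor of 100, which is the same proportion as the “100 times more pages” estimate.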
“PageRank is based on incoming links, but not just on their number. Instead, your PageRank depends on the “weight” of the inbound links. To find the “weight” of an incoming link, divide the PR of the page carrying the link by the total number of links on that page. It is quite possible to reach a PR of 6 or 7 with a small number of incoming links that carry enough weight.” [ Top 10 Google Myths Revealed ]
“Google tries to find pages that are both authoritative and relevant. If two pages have roughly the same level of authority and the same amount of information matching the search query, the page that more reputable sites link to is chosen. Even so, we often rank pages with fewer links or a lower PR higher in the search results if other factors show that the page is more relevant. For example, a page devoted entirely to the Civil War will be much more useful than an article that mentions it in passing, even if that article is on a site as authoritative as Time.com.” [ Google Librarian Central ]
Links do not hand your PR over to anyone; they cast a “vote”. “When a page “votes” with its PageRank value for other pages, its own PR value does not decrease. A page's own PR is not given away and cannot be used up by “voting”. There is no transfer of PR either. There is only a “vote” that depends on the PageRank of each page.” [ Page Rank Explained ]
“From the paper “The Anatomy of a Large-Scale Hypertextual Web Search Engine” we know that the PageRank of a page is a number produced by a recursive algorithm in which the page receives PR from every page that links to it.” [ Google PageRank ]
Googlebot does not analyze the site instantly. “In most cases, two monthly updates are required so that all links to your website are found, counted and shown.” [ Google FAQ ]