# The Learning Center

AKA... The Blog

The order in which a search engine returns results is important, as the choices closest to the top of the list are the ones most likely to be clicked on. But what determines *how* these results are sorted? An obvious answer would be relevancy to the searched terms, but that determines which results are displayed in the first place, not the order they’re shown in. Another metric is needed to determine the importance of search results, and the Google search engine uses what they call Google PageRank.

As of this writing, there are ~**105 million registered domain names** and ~**183 million websites** on the internet. Google claims to have indexed more than **2 trillion unique URLs** since its founding. With staggering numbers like these, how do you measure how important any one site is? Google uses its PageRank metric to determine the importance of specific sites. They describe it this way:

> PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important”.

Google PageRank is a logarithmic scale which runs from zero to ten. A rank of zero means a site is not linked to by anybody, or is being penalized by Google for deliberately manipulating the system (for example, using link farms to artificially inflate its rank). A rank of ten is the highest possible position, and a site with this rank is usually vital to the internet’s function in some manner. While the specific number of sites with **PageRank 10** varies over time, there are usually no more than twenty sites worldwide which rank this highly.

The calculation of a website’s PageRank can be expressed this way:

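The PageRank value for a page **u** is dependent on the **PageRank values** for each page **v** out of the set $B_u$ (this set contains all pages linking to page **u**), divided by the number $L(v)$ of links from page **v**:

$$PR(u) = \sum_{v \in B_u} \frac{PR(v)}{L(v)}$$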

That’s going to be a bit too wordy for many people, so let’s give a simple example. To calculate the rank for a page, all of its inbound links are taken into account, both from within the site and from other sites. But the ranking “power” of an inbound link is the linking page’s own PageRank, divided by the number of links on that page. The more links on a given page, the less value a link from that page will give you. So, given a set of four webpages, **A**, **B**, **C**, and **D**, where B, C, and D each link to A among an otherwise arbitrary set of links, the PageRank of A can be written like this:

$$PR(A) = \frac{PR(B)}{L(B)} + \frac{PR(C)}{L(C)} + \frac{PR(D)}{L(D)}$$

Here $L(X)$ is the total number of outbound links on page X.

Now, if we define a specific set of links:

- A is linked to by B, C, and D
- B is linked to by D
- C is linked to by B and D

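Then B has two outbound links (to A and C), C has one (to A), and D has three (to A, B, and C), so the PageRank of A with these specific links works out like this:

$$PR(A) = \frac{PR(B)}{2} + \frac{PR(C)}{1} + \frac{PR(D)}{3}$$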
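As a rough sketch of that arithmetic in Python: the snippet below inverts the link list above to count each page's outbound links, then applies the simplified formula once for page A. The starting rank of 0.25 per page (an equal share across four pages) is an assumption made for illustration, and the names `inbound`, `outbound_count`, and `single_pass` are hypothetical.

```python
# Inbound links copied from the list above: page -> pages that link to it.
# D has no inbound links in this example.
inbound = {
    "A": ["B", "C", "D"],
    "B": ["D"],
    "C": ["B", "D"],
    "D": [],
}

# Count each page's outbound links by inverting the inbound map.
outbound_count = {page: 0 for page in inbound}
for sources in inbound.values():
    for source in sources:
        outbound_count[source] += 1

# Assumption (not from the article): every page starts with an equal share.
rank = {page: 1 / len(inbound) for page in inbound}


def single_pass(page):
    """Apply the simplified formula once: sum of PR(v) / L(v) over inbound pages v."""
    return sum(rank[v] / outbound_count[v] for v in inbound[page])


pr_a = single_pass("A")
print(outbound_count)   # {'A': 0, 'B': 2, 'C': 1, 'D': 3}
print(round(pr_a, 3))   # 0.458
```

With equal starting ranks, most of A's value comes from C, whose single outbound link passes its whole rank along, while D's vote is split three ways.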

As you can see, the PageRank of any given page depends on how many inbound links it has and on the rank of the pages those links come from. Google has not disclosed exactly how the scale is measured, but it appears that PageRank 1 has a modest requirement of 18 links, and each higher rank requires about 5.5 times as many links as the rank below it. (This metric works both ways, making a link from a PageRank 4 site roughly thirty times more valuable than a link from a PageRank 2 site, since 5.5 × 5.5 ≈ 30.) Becoming a PageRank 10 site could theoretically require roughly 83 million inbound links (18 × 5.5⁹), assuming all of those links are from PageRank 1 sites.
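To make the scale estimate concrete, here is a small sketch that tabulates the links needed at each rank. It takes the two figures quoted above (18 links for PageRank 1, a 5.5× step per rank) at face value; they are outside estimates rather than anything Google has published, and the function name `links_needed` is hypothetical.

```python
BASE_LINKS = 18   # estimated links needed for PageRank 1 (figure from the text)
GROWTH = 5.5      # estimated multiplier from one rank to the next (figure from the text)


def links_needed(rank):
    """Estimated inbound links from PageRank 1 sites needed to reach `rank`."""
    return BASE_LINKS * GROWTH ** (rank - 1)


# Print the estimate for every rank from 1 to 10.
for rank in range(1, 11):
    print(f"PageRank {rank:2d}: {links_needed(rank):>14,.0f} links")

# Relative value of a link two ranks apart (e.g. PageRank 4 vs. PageRank 2).
ratio = links_needed(4) / links_needed(2)
print(ratio)  # 30.25
```

Under these assumptions the PageRank 10 row comes out just under 83 million links, which is why a two-rank gap is worth about thirty times as much.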

What isn’t immediately obvious here is that this equation cannot give an accurate answer if it is applied only once.

Suppose we have two pages, A and B, which link to each other and have no other inbound links.

**Calculating page A’s rank properly requires knowing page B’s rank.**

**Calculating page B’s rank properly requires knowing page A’s rank.**

Circular references such as these mean it isn’t possible to determine a page’s PageRank value with any accuracy on a single pass. Instead, every page starts with an initial guess and the equation is applied to all pages over and over. Each time the equation is repeated, the accuracy improves; after ~50 passes you have a result which cannot be further improved in a meaningful way. This is exactly what Google does during every update, which is why updates take so long.
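The repeated passes can be sketched in Python as below, using the four-page example from earlier. Two things in this sketch are assumptions that go beyond the simplified formula in this article: a damping factor of 0.85, which appears in the commonly published form of the algorithm, and sharing the rank of pages with no outbound links (page A here) evenly among all pages so that rank is not lost between passes. The function name `pagerank` and its parameters are hypothetical, and this is an illustration rather than Google's actual implementation.

```python
def pagerank(outbound, damping=0.85, passes=50):
    """Repeat the PageRank update `passes` times, starting from equal ranks.

    `outbound` maps each page to the list of pages it links to.
    """
    pages = list(outbound)
    n = len(pages)
    rank = {page: 1 / n for page in pages}  # initial guess: equal shares

    for _ in range(passes):
        # Assumption: rank held by pages with no outbound links is spread evenly.
        dangling = sum(rank[page] for page in pages if not outbound[page])
        new_rank = {page: (1 - damping) / n + damping * dangling / n for page in pages}

        # Each page passes its rank, divided by its link count, to its targets.
        for page in pages:
            for target in outbound[page]:
                new_rank[target] += damping * rank[page] / len(outbound[page])

        rank = new_rank
    return rank


# Outbound links implied by the example: B -> A, C; C -> A; D -> A, B, C.
links = {"A": [], "B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]}

after_1 = pagerank(links, passes=1)
after_50 = pagerank(links, passes=50)
after_51 = pagerank(links, passes=51)

# How much one extra pass still changes the result after 50 passes.
largest_change = max(abs(after_51[p] - after_50[p]) for p in links)
print(after_1)
print(after_50)
print(largest_change)
```

Comparing `after_1` with `after_50` shows how far a single pass is from the settled values, while `largest_change` shows that a 51st pass barely moves them.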

Google PageRank is a factor in ranking well in the Google search engine, but it is not the only one. Remember that many other elements contribute to better search rankings.