Background and History
What is the PageRank algorithm?
PageRank is an algorithm used to rank web pages in search engine results. It was developed in 1996 by Larry Page and Sergey Brin, who went on to found Google, as part of a Stanford University research project. PageRank assigns each web page a numerical weight based on the hyperlinks that point to it from other pages: a link from a page that is itself heavily linked counts for more than a link from an obscure page.
Why was PageRank invented?
While Larry Page and Sergey Brin were working on their research project for a "new type of search engine", Brin realized that the search engine would need some way of ranking and ordering web page results. Drawing on the eigenvalue problem, a mathematical formulation that had long been used to model ranking problems, and on an earlier search engine called RankDex, Page and Brin developed the PageRank algorithm for Google's search engine.
How is the algorithm currently used?
Today, Google still uses PageRank alongside other algorithms. It was Google's first algorithm for ordering search results, and it remains the best known.
Why did we decide to write an article about it?
As Computer Science students, we decided to write an article about PageRank mainly because we had both heard of the algorithm but didn't really know how it worked. Thinking about it more, we realized that many other Computer Science students could benefit from learning about PageRank, especially since it is the backbone of Google Search, a service many of us use every day.
Algorithm Description
Inputs and outputs
The algorithm takes in a set of web pages and outputs a rating for each page. The rating represents the page's importance, which is often read as a rough signal of how reliable or trustworthy the site is. To do this, the algorithm first needs to convert the web pages into a format that a computer can work with.
The graph network representation
The algorithm represents the web as a directed graph, where web pages are nodes and each hyperlink is a directed edge from the linking page to the page it links to. The example below has three web pages that link to one another. The corresponding graph is shown beside it.
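One common way to store such a graph in code is an adjacency list: a mapping from each page to the pages it links to. The sketch below is only an illustration. The page names "A", "B", and "C", their links, and the helper `incoming_links` are made up for this example and are not taken from the figure.

```python
# Hypothetical three-page web (not the pages from the figure):
# each key is a page, and its list holds the pages it links to.
links = {
    "A": ["B", "C"],  # page A links to B and C
    "B": ["C"],       # page B links to C
    "C": ["A"],       # page C links back to A
}


def incoming_links(graph):
    """Invert the graph: for each page, list the pages that link to it."""
    incoming = {page: [] for page in graph}
    for source, targets in graph.items():
        for target in targets:
            incoming[target].append(source)
    return incoming


if __name__ == "__main__":
    for page, sources in incoming_links(links).items():
        print(page, "is linked from", sources)
```

Storing outgoing links matches how pages are crawled, since a page's own HTML lists the links it contains. The inverted view is useful because a page's rating depends on the pages that link to it.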
Creating the rating
With the graph created, the algorithm then uses it to rank the pages by importance. It does this by simulating people traveling around the graph, clicking links from page to page. The pages where more people end up are considered more important and are ranked higher. The nodes in the graphs below are labeled with the percentage of traffic they would receive, which is also their ranking.
Example 1
Example 2
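The traffic percentages described above can be approximated with a short iterative calculation. The following is a minimal sketch, not Google's implementation. It assumes a damping factor of 0.85 (the chance that a visitor follows a link instead of jumping to a random page), which is a commonly cited value that this article does not specify. The function name `pagerank`, the fixed iteration count, and the three-page graph are also assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Estimate each page's share of traffic by repeated redistribution.

    graph maps each page to the list of pages it links to; every linked
    page is assumed to appear as a key. damping=0.85 is an assumed value.
    """
    pages = list(graph)
    n = len(pages)
    # Start with visitors spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Visitors who jump to a random page instead of clicking a link.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in graph.items():
            if targets:
                # Split this page's visitors evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links sends its visitors everywhere evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page web, not one of the examples pictured above.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

if __name__ == "__main__":
    for page, score in sorted(pagerank(links).items()):
        print(f"{page}: {score:.1%}")
```

In this made-up graph, page C receives links from both A and B, so it ends up with the largest share of traffic, while B, which is linked only from A, ends up with the smallest.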
Interactive Graph
Click in the area to create a node. Drag from one node to another to create a connection. The size and percentage of each node are updated live by the PageRank algorithm!