Suppose there are a thousand users of the internet – all of them high energy particle physicists. Each user has their own homepage. Each homepage contains some text, and some links to the homepages of their friends.
You can imagine this as a directed graph. Each node is a user with a homepage. And each arrow is a hyperlink between homepages.
This information can also be represented by an “adjacency matrix“. The entry in row 2 and column 4 of the matrix is the number of links going from the home page of user 2 to the homepage of user 4.
The “astute reader” will observe that if if we multiply the above matrix with itself five times, then the entry in row 2 and column 4 of the result will be the number of ways you can get from vertex 2 to vertex 4 in five steps. Magic.
Isn’t that clever? Now you are an expert on matrix multiplication. Go and make yourself a cup of tea before reading the rest of this article.
What we need to imagine now is that a user starts at their own homepage. Then randomly clicks on a link to go to another page. Then randomly clicks on a link to go to another page. Then clicks on a link to go to another page. Etc….
The number in the row 2 and column 4 of the “transition matrix” tells us the probability that our random “web surfer” will click on the link from node 2 to node 4. Check it on the example.
We let this “random walk” go on for a long time, and are interested in what proportion of time our random “web surfer” spends on each page.
To be more precise, we are interested in the “probability” that our random “web surfer” will be on any given page – after being giver a sufficient amount of time to wander around. This is known as the Stationary Distribution.