Google Search algorithm documents have leaked. Here's what experts are saying.

An internal Google document leak reveals the secrets to ranking in Google Search.

The key to online success is usually dependent on one major factor above all others: Your website's ranking on Google Search.

For decades now, an entire industry – Search Engine Optimization or "SEO" – has revolved around attempting to crack the code that moves a given page up the ranks for various keyword search queries on Google.

This week, that "code," or more specifically the secrets behind Google's search engine algorithm, has leaked.

"In the last quarter century, no leak of this magnitude or detail has ever been reported from Google’s search division," said SparkToro CEO Rand Fishkin, a longtime influential figure in the SEO industry.

Fishkin has worked in the industry for years and founded the well-established SEO company Moz. Fishkin's long SEO history is likely why an unnamed person chose to send him Google’s internal "Content API Warehouse" document. This 2,500-page document details a slew of previously unknown or unconfirmed knowledge about how Google decides to rank websites on its search engine.

After receiving the leak, Fishkin and a number of other SEO and digital marketing leaders went to work to verify the document. After examining the pages, they believed the leak to be legitimate. Google would not confirm the legitimacy of the leak outright at first. However, Fishkin shared that a Google employee contacted him to ask that he change how he characterized some of the details in his breakdown of the document.

Late Wednesday, Google provided confirmation that the document was indeed legitimate in an email to The Verge.

There is a lot of technical information in the document, which appears to be aimed more at developers and technical SEO professionals than at the layperson or even SEO professionals who specialize in content creation. However, there are some extremely interesting details that everyone can take away from this leak.

Google evidently uses Chrome to rank pages

This is particularly of interest as Google has previously denied using Chrome to rank websites.

According to the documents parsed by experts like Fishkin, it appears that Google tracks how many clicks a webpage receives from users in its web browser, Chrome, in order to choose which pages of a website to include in that site's sitelinks, the internal links shown beneath a site's main search result.

So, while it doesn't seem that Google uses this info to decide where to rank an entire site outright, analysts have surmised that the company does use Chrome activity in order to decide which internal pages to show in search under the website's homepage.

Google tags "small personal" sites for some reason, it seems

SEO expert Mike King of iPullRank flagged this one, and it's brought about more questions than answers.

According to analysis of Google's internal document, the company has a specific flag it attaches to "small personal websites." It's unclear how Google determines what a "small" or "personal" website is, nor is there any information as to why Google is marking websites with this tag. Is this to help promote them in search? To demote them in the rankings? 

Its purpose is a mystery at this time.

Clicks matter a lot

This is another issue that SEO experts have long speculated about, which Google has denied over the years. And, once again, it looks like the experts were right.

It turns out that Google relies on user clicks for search rankings much more than was previously known.

NavBoost is a Google ranking factor that leans heavily on click data to improve search results. According to King, we now know that NavBoost has a "specific module entirely focused on click signals." One major factor that determines a website's ranking for a search query is short clicks versus long clicks, meaning how long a user stays on a page after clicking on the link from a Google search.
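To make the short click versus long click idea concrete, here is a minimal Python sketch. It is an illustration only, not Google's method: the 30-second cutoff, the `Click` record, and both function names are assumptions made up for this example, and the leaked document as reported does not say where Google draws the line between a short and a long click.

```python
from dataclasses import dataclass

# Hypothetical cutoff chosen purely for illustration. The leak, as reported,
# does not state what separates a "short" click from a "long" one.
LONG_CLICK_SECONDS = 30.0


@dataclass
class Click:
    url: str              # page the user clicked in the search results
    dwell_seconds: float  # time spent on the page before leaving it


def classify_click(click: Click) -> str:
    """Label a click 'long' or 'short' using the assumed cutoff."""
    return "long" if click.dwell_seconds >= LONG_CLICK_SECONDS else "short"


def long_click_share(clicks: list[Click]) -> dict[str, float]:
    """Return, for each URL, the fraction of its clicks that were long."""
    totals: dict[str, int] = {}
    longs: dict[str, int] = {}
    for click in clicks:
        totals[click.url] = totals.get(click.url, 0) + 1
        if classify_click(click) == "long":
            longs[click.url] = longs.get(click.url, 0) + 1
    return {url: longs.get(url, 0) / count for url, count in totals.items()}
```

In this toy model, a page where most visitors stay past the cutoff ends up with a share close to 1.0, while a page that visitors quickly abandon ends up close to 0.0. How Google actually weighs such a signal is not known from the reporting.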

Exact match domains can be bad for search ranking

If you've ever come across a domain name with multiple keywords and dashes, like used-cars-for-sale.net for example, at least part of the reason was likely SEO. There has been a long-held belief among domain investors and the digital marketing community that Google rewarded exact match domain names.

It turns out that this isn't always true. In fact, an exact match domain can hurt your rankings.

Around a decade ago, Google did share that exact match domain names would no longer be held in high regard as a tool for earning rankings, despite being favored by the algorithm at one time. However, thanks to this leak, we now have evidence that there is a mechanism to actively demote these websites in Google Search. It turns out that Google views many of these types of domains in the same light as keyword stuffing practices. The algorithm views this type of URL as potential spam.

Topic whitelists

According to analysis of the documents, Google has whitelists for certain topics. This means that websites that appear in Google Search for these types of search queries need to be manually approved and don't appear based on the normal algorithmically ranked search factors.

Some of the topics aren't too surprising. Websites containing content related to COVID information and political queries, specifically around election information, are whitelisted.

However, there is a whitelist for travel websites as well. It's unclear exactly what this whitelist is for. SEO experts have suggested that this could be related to travel sites appearing in specific Google travel tabs and widgets.

Google "lied"

Fishkin, King, and other SEO experts have been able to confirm and debunk quite a few SEO theories thanks to this leaked document. And it's now clear to them that Google hasn't been entirely truthful about how its search algorithm has worked over the years.

"'Lied' is harsh, but it’s the only accurate word to use here," King wrote in his own breakdown of the Google Content API Warehouse document.

"While I don’t necessarily fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the marketing, tech, and journalism worlds who have presented reproducible discoveries," he said.

As industry experts continue to pore through this massive document, we may soon find out some more interesting details hidden in Google's search algorithm.

A Google representative declined Mashable's request for comment.

UPDATE: May. 30, 2024, 10:52 a.m. EDT Google has since confirmed the legitimacy of the leaked document. This piece has been updated to reflect this information.
