Introduction to Google SMITH Update

In October 2020, Google published a research paper on a new algorithm named SMITH, which stands for ‘Siamese Multi-Depth Transformer-based Hierarchical’. The algorithm is designed to understand long-form document or web page content. 

The SMITH algorithm analyses larger units of content on the page, such as sentences, paragraphs, and even the whole document. 

Now, you may think, isn’t this what Google already does when ranking pages in search results? Then what differentiates this algorithm from other similar algorithms? 

Let’s take a deeper look at the algorithm below. 

Why was it needed at all? Why is BERT often talked about along with the SMITH algorithm? What’s the difference between the two? And what can we do to be on the safer side?

Before we look at these questions, it’s important to know the role of Natural Language Processing for search engines. 

Use of Natural Language Processing (NLP) for page ranking

Natural Language Processing uses Artificial Intelligence to understand the context and tone of content. NLP takes cues from the context of the document, and this interpretation is used to deliver the search results that are most relevant to search queries. 

The use of NLP has become even more important as voice search is increasingly adopted worldwide. Search engines rely heavily on NLP to understand the intent of the words used in the content.

What is the BERT algorithm? 

Bidirectional Encoder Representations from Transformers, often known as the BERT algorithm, was introduced by Google in 2018 and rolled out in Google Search in 2019. The goal of using the BERT algorithm is to understand the search intent of users for long queries. 

The term ‘Bidirectional’ in the BERT algorithm describes how it assesses the words in the document. The algorithm interprets the words that come both ‘before’ and ‘after’ a specific keyword. Let’s understand this using the two sentences below. 

  1. ‘The human body is made up of millions of cells’
  2. ‘The prisoners were locked in cells’

To understand the search intent of users, the BERT algorithm will interpret both sentences for the keyword ‘cells’. 

For sentence 1, the algorithm will understand the context of the keyword ‘cells’ through surrounding words like ‘human body’. 

For sentence 2, the algorithm will understand the meaning of ‘cells’ through other words in the sentence like ‘prisoners’. 

Therefore, the BERT algorithm will understand the search intent of users by interpreting the words that surround a keyword in the document. 
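To make the idea concrete, here is a small toy sketch in Python. It is not how BERT works internally (BERT learns context from data with a transformer network); the function name `guess_sense` and the hand-written cue lists are assumptions made up for this illustration. It only shows the principle of reading the words on both sides of a keyword to decide its meaning.

```python
# Toy illustration only: the cue lists below are hand-written assumptions,
# whereas BERT learns context from large amounts of text.
SENSE_CUES = {
    "biology": {"human", "body", "millions"},
    "prison": {"prisoners", "locked"},
}


def guess_sense(sentence, keyword="cells"):
    """Guess the sense of `keyword` from the words before and after it."""
    words = sentence.lower().replace(".", "").split()
    idx = words.index(keyword)
    # 'Bidirectional': collect context from both sides of the keyword.
    context = set(words[:idx]) | set(words[idx + 1:])
    # Score each sense by how many of its cue words appear in the context.
    scores = {sense: len(cues & context) for sense, cues in SENSE_CUES.items()}
    return max(scores, key=scores.get)


print(guess_sense("The human body is made up of millions of cells"))
print(guess_sense("The prisoners were locked in cells"))
```

For the two example sentences, the sketch picks the ‘biology’ sense for the first and the ‘prison’ sense for the second, because those are the senses whose cue words appear around the keyword.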

BERT and SMITH algorithm

The BERT algorithm shows some limitations when it comes to long-form content on a page. 

It is better suited to short passages and sentences. 

To resolve this problem, the SMITH algorithm is designed to assess long-form content. It can interpret sentence after sentence, paragraph after paragraph, and the whole document.  
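One way to picture this is to imagine a long document being cut into smaller blocks of whole sentences, with each block read on its own before the blocks are combined into a view of the whole document. The Python sketch below shows only the splitting step, under that assumption. The function name `split_into_blocks` and the word budget are hypothetical, and a real model would count tokens rather than words.

```python
import re


def split_into_blocks(document, max_words_per_block=12):
    """Greedily pack whole sentences into blocks of limited size."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    blocks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new block when adding this sentence would exceed the budget.
        if current and count + n_words > max_words_per_block:
            blocks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        blocks.append(" ".join(current))
    return blocks


doc = "One two three. Four five six. Seven eight nine ten."
for block in split_into_blocks(doc, max_words_per_block=6):
    print(block)
```

A sentence is never cut in half here: a sentence longer than the budget simply gets a block to itself. That is a simplification chosen for this sketch.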

Though the BERT and SMITH algorithms share the same intent but differ in their capabilities, the SMITH algorithm still relies on the BERT algorithm and is not a separate algorithm in action. 

Speculations over Google SMITH update implementation

Google released a research paper on the SMITH algorithm in 2020. The paper discusses the advantages of the SMITH algorithm and how it outperforms BERT on long-form document matching. Here is an excerpt from the paper:

“Our experimental results on several benchmark datasets for long-form document matching show that our proposed SMITH model outperforms the previous state-of-the-art models including hierarchical attention, multi-depth attention-based hierarchical recurrent neural network, and BERT. Comparing to BERT based baselines, our model is able to increase maximum input text length from 512 to 2048. We will open source a Wikipedia based benchmark dataset, code and a pre-trained checkpoint to accelerate future research on long-form document matching.”
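To get a feel for what the jump from 512 to 2048 means, here is a rough back-of-the-envelope sketch in Python. The helper `coverage` and the 2048-token page length are hypothetical values chosen for illustration. Tokens are not the same as words, so treat the numbers as an approximation of how much of a long page fits within each limit.

```python
def coverage(doc_tokens, max_tokens):
    """Fraction of a document that fits within a model's input limit."""
    if doc_tokens <= 0:
        raise ValueError("doc_tokens must be positive")
    return min(doc_tokens, max_tokens) / doc_tokens


doc_tokens = 2048  # hypothetical length of a long-form page, in tokens

print(f"512-token limit:  {coverage(doc_tokens, 512):.0%} of the page")
print(f"2048-token limit: {coverage(doc_tokens, 2048):.0%} of the page")
```

For this hypothetical page, a 512-token limit covers a quarter of the content, while a 2048-token limit covers all of it.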

There is no official confirmation that the SMITH algorithm has been implemented in search. Some researchers speculate that the SMITH algorithm might be in its initial stage, or partially implemented alongside the BERT algorithm and still being tested. 

What is in it for you? 

To rank a page in search results, it’s always important to provide valuable information to your users, create brand awareness, and educate your users. 

The key to ranking is to create a good overall user experience online while following the guidelines set by search engines. The overall user experience improves when the content is relevant to your users and answers their queries, concerns, and expectations from your brand, and when it is easy to move between pages within the website. Though search engines take several other parameters into account for ranking, as long as you maintain and provide a user-friendly environment for your users, your website has a good chance of ranking at the top.