SEO SEM Plan for 2007

March 24th, 2007

The main job of search engines is to keep spam and useless content from showing up at the top of their SERPs (search engine results pages). They do this by employing advanced phrase-matching techniques. Google uses a system called Latent Semantic Analysis (LSA) to determine relevance. Latent Semantic Indexing (LSI) is the process Google then uses to order search results based on LSA.
 
So how does Google determine a website's relevance? In a nutshell, it checks your website and determines the keywords being used. It then develops an understanding of the theme of your website. Then it compares your keywords with those of other websites that share that theme to determine its relevance. Sound confusing? It is.

LSA uses a term-document matrix, which describes the occurrences of terms in documents. It is a sparse matrix whose rows correspond to documents and whose columns correspond to terms, typically stemmed words that appear in the documents. A typical weighting for the elements of the matrix is tf-idf (term frequency-inverse document frequency): each element is proportional to the number of times a term appears in a document, and rare terms are upweighted to reflect their relative importance.
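To make the weighting concrete, here is a minimal sketch of a tf-idf matrix built in plain Python. The function name `tfidf_matrix`, the sample documents, and the specific formula (raw count times the log of total documents over documents containing the term) are assumptions chosen for illustration; real systems use many variants, and this sketch skips stemming entirely.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocab, matrix): one row per document, one column per term."""
    # Naive tokenization: lowercase and split on whitespace (no stemming).
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # tf * idf: terms found in fewer documents get a larger weight.
        row = [counts[term] * math.log(n_docs / df[term]) for term in vocab]
        matrix.append(row)
    return vocab, matrix

# Hypothetical sample pages, invented for this example.
docs = [
    "cheap flights cheap hotels",
    "cheap hotels paris",
    "paris travel guide",
]
vocab, matrix = tfidf_matrix(docs)
for doc, row in zip(docs, matrix):
    print(doc, [round(weight, 2) for weight in row])
```

In the first page, "flights" appears once and "hotels" appears once, but "flights" gets the higher weight because it occurs in only one of the three pages.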

This matrix is also common to standard semantic models, though it is not necessarily expressed explicitly as a matrix, since the mathematical properties of matrices are not always used.

LSA transforms the occurrence matrix into a relation between the terms and some concepts, and a relation between those concepts and the documents. Thus the terms and documents are now indirectly related through the concepts.
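The idea of relating terms and documents through concepts can be sketched in a few lines. LSA is normally computed with a truncated singular value decomposition that yields several concepts at once; the simplified sketch below finds only the single strongest concept, using power iteration. The function name `top_concept`, the term list, and the counts are all hypothetical, and nothing here is meant to represent how any search engine actually implements this.

```python
import math

def top_concept(matrix, iterations=200):
    """Approximate the strongest concept of a document-by-term matrix.

    Power iteration on A^T A converges to the leading right singular
    vector (one weight per term). Multiplying the matrix by that vector
    gives each document's loading on the same concept.
    """
    n_docs, n_terms = len(matrix), len(matrix[0])
    weights = [1.0] * n_terms
    for _ in range(iterations):
        # Project every document onto the current term weights (A v).
        loadings = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
        # Push the document loadings back onto the terms (A^T (A v)).
        weights = [sum(matrix[d][t] * loadings[d] for d in range(n_docs))
                   for t in range(n_terms)]
        norm = math.sqrt(sum(w * w for w in weights))
        weights = [w / norm for w in weights]
    loadings = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    return weights, loadings

# Hypothetical term counts: rows are documents, columns are terms.
terms = ["seo", "search", "ranking", "recipe", "baking"]
counts = [
    [2, 1, 0, 0, 0],  # page about seo and search
    [0, 1, 2, 0, 0],  # page about search and ranking
    [0, 0, 0, 1, 1],  # page about recipes
]
weights, loadings = top_concept(counts)
print([round(w, 3) for w in weights])
print([round(x, 3) for x in loadings])
```

The first two pages load equally on the same concept even though the second never mentions "seo": they are linked indirectly through the shared term "search". The recipe page loads at essentially zero.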

It's important to understand how Google works because it's the most powerful search engine in the world today, and most other search engines are copying its model.

Here is The Plan that explains all of this and much more in greater detail.
