Long Tail Search

“…a person that started in to carry a cat home by the tail was getting knowledge that was always going to be useful to him…” An aphorism from Uncle Abner in “Tom Sawyer Abroad” by Mark Twain (1835-1910).

I expect you could learn a lot by carrying a cat home by its tail(!), but understanding the long tail of search is of more practical use to the website owner.

First, something about word frequency. Word frequencies in natural languages such as English have the following property: in a given body of text, just a few words are used very often and a large number of words are used only rarely. This property was noted by the Harvard linguist George Zipf, and the precise mathematical relationship is called Zipf’s law.
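For readers who like to see the relationship written down, here is Zipf’s law in its simplest textbook form. The symbols are my own choice for illustration: r is the rank of a word (1 for the most common), f(r) is how often it occurs, and s is an exponent that is close to 1 for typical English text.

```latex
f(r) \propto \frac{1}{r^{s}}, \qquad s \approx 1
```

With s = 1 this says the second most common word appears roughly half as often as the most common one, the third roughly a third as often, and so on. Real text only follows this approximately.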

Here is an example of Zipf’s law in action. I have taken Chapter 10 of “Tom Sawyer Abroad” (in which the above quotation occurs) and analyzed it for word frequency.

Word frequency chart

There are 2426 words in the chapter, of which 624 are unique, and just 11 of these unique words account for 30% of the occurrences. As we said earlier, a few words are used very often and most words are used only rarely.
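If you want to try this kind of count on a text of your own, something like the following Python sketch should do. It is not the tool used for the chart above; the function name `word_frequency_summary` and the very simple tokenizer (lowercase letters and straight apostrophes only) are assumptions made for illustration, so your totals may differ slightly from those produced by another word counter.

```python
import re
from collections import Counter


def word_frequency_summary(text, share=0.30):
    """Count words and report how many of the most frequent ones
    are needed to cover `share` of all word occurrences."""
    # Simplistic tokenizer (an assumption): runs of a-z and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)

    covered = 0
    top_needed = 0
    # most_common() yields (word, count) pairs, most frequent first.
    for _, count in counts.most_common():
        if covered >= share * total:
            break
        covered += count
        top_needed += 1

    return {"total": total, "unique": len(counts), "top_needed": top_needed}


if __name__ == "__main__":
    sample = "the cat and the dog and the bird saw the cat"
    print(word_frequency_summary(sample))
```

To analyze a whole chapter, read the file into a string and pass it in place of `sample`.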

What has this got to do with the long tail of search? Well, the log files of your website record the search terms that visitors used to find your site via the search engines. For a medium-sized site, a week of statistics will contain thousands of different search terms, and their frequencies will obey Zipf’s law. It is here that we will find the long tail of search.

Here is an example:

Search term frequency chart.

Here we are plotting ‘search term frequency’ rather than ‘word frequency’, but the principle is the same. Only 7 terms account for 30% of the searches; the remaining 1219 search terms account for the other 70%. This large number of search terms that are each used only rarely is referred to as the long tail of search.
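To make the split concrete, here is a rough Python sketch of how you might pull search terms out of the referrer URLs in your logs and divide them into a ‘head’ and a ‘tail’. It rests on several assumptions: you have already extracted the referrer URLs from your log lines, the search phrase sits in a query parameter named `q` (this varies between search engines), and the 30% cut-off simply mirrors the figure above. The function names are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def search_term(referrer, param="q"):
    """Return the search phrase from a referrer URL, or None if there is none.
    The parameter name "q" is an assumption; adjust it per search engine."""
    values = parse_qs(urlparse(referrer).query).get(param)
    if not values:
        return None
    # Lowercase and collapse whitespace so near-identical queries are counted together.
    return " ".join(values[0].lower().split())


def split_head_and_tail(referrers, head_share=0.30):
    """Split search terms into the frequent head and the long tail."""
    counts = Counter(t for t in map(search_term, referrers) if t)
    total = sum(counts.values())

    head, tail = [], []
    covered = 0
    for term, count in counts.most_common():
        if covered < head_share * total:
            # Still below the cut-off: this term belongs to the head.
            head.append(term)
            covered += count
        else:
            tail.append(term)
    return head, tail


if __name__ == "__main__":
    refs = ["http://www.example.com/search?q=long+tail"] * 3 + [
        "http://www.example.com/search?q=zipf+law",
        "http://www.example.com/search?q=cat+by+the+tail",
        "http://www.example.com/some/page",  # no search term, ignored
    ]
    head, tail = split_head_and_tail(refs)
    print("head:", head)
    print("tail:", tail)
```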

In practice, when you examine the long tail of search in your server logs, there will be a number of search terms where it is clear that the searcher was seeking something totally different from what your site has to offer. However, there will be many more where the searcher reached your site using a search term that is highly relevant even though it is not frequently used, or one that you might never have imagined anyone using in the first place.

It is also true that the long tail of search will contain more specific search terms, and that these will have a higher conversion rate than more general terms. There is a good reason for this: consumers tend to use one- or two-word queries when they start their basic research, but as they learn more about what they are searching for they end up using longer, more specific phrases. In other words, they start on the left of our search term frequency chart above but move to the right (the long tail).

So how do we use the long tail of search for SEO? The answer is to write new content (pages) using the relevant search terms found in the long tail as key phrases. The advantages of this strategy are twofold. Firstly, your tail will get longer(!) and hence targeted traffic will increase. Secondly, with a longer tail you will be able to mine your logs repeatedly for unusual search terms and produce even more content. If this is done systematically, regularly and with imagination, it will be well worth the investment of time and effort. (June 13, 2006. There is now an easy way to do this with a Long Tail Search Tool).
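As a rough illustration of mining the tail, the Python sketch below picks out candidate key phrases from a table of search term counts. It is only one possible approach, not the Long Tail Search Tool mentioned above. The function name `candidate_key_phrases`, the three-word minimum used as a stand-in for ‘specific’, and the idea of checking terms against your existing page titles are all assumptions for illustration. Deciding whether a term is actually relevant to your site still needs a human eye.

```python
def candidate_key_phrases(term_counts, existing_titles, min_words=3):
    """Pick long-tail search terms that might deserve a new page.

    term_counts     -- dict mapping search term to number of visits
    existing_titles -- titles of pages you have already written
    min_words       -- shortest phrase treated as "specific" (an assumption)
    """
    titles = [title.lower() for title in existing_titles]
    candidates = []
    for term, count in term_counts.items():
        if len(term.split()) < min_words:
            continue  # too general to count as a long-tail phrase
        if any(term in title for title in titles):
            continue  # an existing page title already contains this phrase
        candidates.append((term, count))
    # Most-used terms first; ties in alphabetical order.
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    counts = {
        "long tail": 40,
        "long tail search tool": 3,
        "zipf law word frequency": 5,
        "carry cat by tail": 1,
    }
    pages = ["Long Tail Search Tool review"]
    for term, count in candidate_key_phrases(counts, pages):
        print(count, term)
```

The last term in the sample data is the kind of visitor who wanted something quite different from what the site offers, which is why the output is a list of candidates to review rather than a list of pages to write.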

Long tail search is just one of the many long tails popularized by Chris Anderson’s original article in Wired magazine. If you are interested in long tails outside the realm of practical SEO, Chris’s blog provides a plethora of outstanding reading material.

1 Comment »

  1. miklevin said,

    June 21, 2006 @ 12:52 am

    We agree with this article so wholeheartedly, we wrote an app to help you write for the long tail of search. It’s an enormous time-saver, sparing you from poring over log files. Anyway, it’s called HitTail http://www.hittail.com and was developed by the PR firm that launched Amazon.com, Priceline and others. If you think it’s interesting, we’re always available to chat.
