Archive for News and Comment

Search Marketing Standard Review

It felt rather special getting my copy of the first issue of Search Marketing Standard magazine yesterday. I mean how often do you get to read the inaugural edition of the first magazine in a particular industry?

Producing a successful quarterly glossy magazine for the rapidly changing search marketing industry is never going to be easy and having read it I think it is worth a review.

Search Marketing Standard magazine cover.

There are eight articles starting with an introduction to the basic metrics for measuring your SEO performance by Michael Nguyen. With only five definitions and five paragraphs you are never going to get more than just a flavor of this complex subject and that is what you get here. A more comprehensive introductory read is Bruce Clay’s write up (and associated links) on Web Analytics.

Next comes a very short piece by Tom Dahm under the heading ‘Google Insider’. Tom briefly talks about Google AdWords’ recent introduction (March 8th) of demographic targeting but does not elucidate. For the interesting details you can go to Google’s AdWords Help Center, What is demographic site selection?

This is followed by David Rodnitzky’s ‘15 of the Biggest Myths in Search Marketing Exposed’. It is the longest article by far and should be compulsory reading for those who are about to embark on spending money with search marketing service providers. However, it is a little worrying that at one stage David suggests “…manually submit your site to major search engines like Google, Yahoo and MSN” when no professional SEO has submitted a site for years.

Next up is a review of the PPC campaign management tool ‘Dynamic Bid Maximizer Advance 3.0’. No one claims authorship but there is a passing reference to ‘our review team’. Well, the review team rated the product ‘Excellent’ and found no real faults. The only drawbacks they noted were the necessity to purchase a separate version for Yahoo! Search Marketing, the lack of a toll-free number for support and an extra charge for ROI tracking. Now I don’t know about you but I am naturally suspicious of glowing single product reviews, particularly without a name attached, and would have preferred to see a product comparison between Bid Maximizer and (say) BidRank by people who use the products every day. Of course one thing these desktop solutions do not supply is rule-based bidding, for example setting your ROI or CPA for a specific keyword with the software handling the bidding. Perhaps an idea for a future comparison article?

Following on is Kevin Gold and ‘Managing your PPC Bids: The 4 Most Important Things to Consider’. It includes some good advice for the novice PPC advertiser and it may be short but it’s one of the best pieces in the magazine.

“An interview with Perry Marshall” by Andrey Milyan, the editor-in-chief, is next in line. Perry Marshall, the ‘Google AdWords Guy’, is well known for making money from his book ‘The Definitive Guide to Google AdWords’ at $49 in competition with Google’s AdWords Learning Center, which is free. Now that is an achievement and that is why it’s worth paying attention to what he says in this interview.

Alexander Brabant follows with ‘Targeting the Tail’ in which he manages to talk about the importance of the long tail of search without explaining or even hinting as to how you find it. If you’re interested then read Long Tail Search.

The final piece is ‘Click fraud Alert’ by Boris Mordkovich. Sub-titled ‘Recent Click Fraud Developments’ it consists of a few very short paragraphs of old news and looks more like a ‘filler’ for lost advertising space.

In addition to the eight articles there is a ‘Letter from the Editor’, a table of contents, an index of advertisers and a calendar of events. There are 36 pages including the front and back covers, approximately 50% of which is advertising.

Should you take out a subscription to Search Marketing Standard? Most definitely, not only because you will be able to read about search marketing in the bath but more importantly for the ROI in ideas and information you will receive for your money.

You can subscribe to Search Marketing Standard for $10 for one year or $20 outside of the US.

The Google Sandbox

The Google Sandbox was mentioned here briefly in a comment appended to the Directory Links post. What I said was “…you may want to take into account a current theory relating to the Google sandbox. The theory goes that it is possible to avoid the sandbox by acquiring highly trusted links. The jury is still out on this theory (which is probably oversimplified) but at the moment it seems to fit most professional SEOs’ experiences”. Additional relevant information relating to this theory has recently been provided by Google’s Matt Cutts in an interview (Part 1, Part 2) by Mike Grehan.

Here is a transcript of that part of the interview that deals with the sandbox:

Mike: One of the major issues that has been bandied around in the major forums online and bloggers talking about the speculation of something called the sandbox. Now I have my own opinions on whether there is a sandbox or there isn’t a sandbox but Matt if you just give us the official line because you obviously heard the term so what does it mean to Google?

Matt: Yeah it’s kind of interesting, I think Toolman over on the WebmasterWorld forums might have been one of the guys or one of those folks who sort of coined the term sandbox. And I think what he was originally talking about was the notion that people wouldn’t always be able to rank for the search terms that they want as quickly as they would want. Some people have said ok is this something that applies to newer sites and essentially the way to think about it essentially in my mind is around 2003 Google switched to a new method of updating its index, before that we had monthly Google dances and so we would only update the index once a month. It’s kind of interesting because when we switched over at that time people were like oh Update Fritz is horrible I hate Update Fritz. And you know there were people from Google saying things like look the competition will certainly be saying, will be releasing, new features and so we have to keep releasing and innovating new features ourselves.

Mike: I’ll just switch my Blackberry off (laughter).

Matt: No worries! By the end of that summer, by the end of 2003 we had essentially an index that was continuously changing and by the end of that summer all the infrastructure had been worked out and people really liked it and now if you were to go to SEOs and say would you like to wait one month for the index to be updated at all I don’t think anybody would really want to go back to the bad old days of having a monthly index. And so as a result new data is always being folded into the index, it’s not like there is one pivotal moment each month where people can say ah! this is, you know, the change. And in fact even at different datacenters we have different binaries, different algorithms, different types of data always being tested. I think a lot of what’s perceived as the sandbox is artifacts where in our indexing some data might take longer to be computed than other data.

Mike: I guess it really does have to work in tiers in the index I mean some sites are larger more prominent to changing more often, news sites those kind of things and there are those sites that may have great information but rarely change.

Matt: Well certainly there are other sites. The way I think about it is Googlebot goes out and tries to sample the web and we try to sample the web in the most efficient way that we know how and so certain sites have higher PageRank or change very often or are very important just for general index quality and so those sites often get crawled with a daily crawl or a very fresh crawl. But we also try to make sure that our crawl is very effective across the entire index. So it’s kind of interesting because there was this survey that people said well, do you even think that there is a sandbox and SEOs who are the experts who should be the ones to reach consensus were actually split down the middle. So I think that goes to show you have certainly seen reports online where people say well my site is two weeks old and it’s already showing up in the rankings.

Mike: I heard this classic quote from somebody recently that said we are never quite sure what the sandbox was but whatever it is it’s changed and it’s something different now, make sense of that.

Matt: Right, even getting you know five or ten SEOs to agree on a common definition of what exactly such a sandbox would be, would be really challenging.

Mike: But fundamentally I think if you go to the core issue where people are talking about it may take nine months before you get into the index and can rank if you have something which is extremely relevant Google doesn’t want to wait nine months before it can actually bring some great information to this service and you want it there as soon as it’s available.

Matt: Well we do want it there as soon as it’s available and you know some things like our news crawl and blog search can find stuff within minutes of stuff being live. So there is always a trade off, you know, how much do you trust certain pages, how much do you rank certain pages and the best advice I can give is don’t worry or over think or try to strategize thinking too hard about is or isn’t there a sandbox. Just make a great site with great content with a normal reason why lots of people would want to link to you and visit your site and a compelling reason why people would really value your site and that’s going to lead to you capturing the mind of the blogosphere anyway and that is really the best way to let search engines find out about you.

This tells us what we already knew, that the sandbox is an artifact (an unintentional effect) and not a separate component of the Google algorithm. It also tells us that if you are sandboxed ‘trust’ was involved in some way. If trust means links from (and to) ‘trusted’ sites then it would appear that our theory is on the right track.

Best Search Engine?

The concept of a best search engine is a nebulous one to say the least. Google (40%), Yahoo (30%) and MSN (15%) account for 85% of all searches but does that mean Google is the best? Not according to Intralink, a web marketing research, search engine optimization and consulting firm based in Cincinnati, Ohio.

Intralink did not initially set out to provide a search engine relevancy report; rather it was a natural progression using a core set of data that they had already accumulated as part of their keyword research on behalf of clients.

When conducting keyword research they look for holes in the search results. This involves identifying less competitive search phrases that will provide targeted traffic to their client’s website. Knowing what terms to target, where the holes are and what kind of site currently fills the hole enables them to construct a marketing and SEO program for their clients at reduced cost but one that is still effective.

(Intermezzo: I can see some advantages to this approach but personally I prefer not to exclude the highly competitive search terms from the start. I favor a well constructed site, themed perhaps, followed by regular analysis of the log files for the long tail search terms and adding content targeting these terms. Repeating these steps periodically makes this an iterative process which in my opinion is difficult to better over the longer term.)

Using a wide range of hundreds of search terms including local searches, informational searches and ‘how to’ searches Intralink scored the search engines on five criteria:

    Relevancy: This of course can be quite a subjective measure however they devised a scoring mechanism based on exact matches, with the number of matches and the position on the page(s) influencing the score.
    Freshness of content: By searching for specific current events they were able to score the search engines based on how up to date they were. Also by searching for long running current events they could score the search engines on the most recent relevant entry. Combining the scores provided an overall value for ‘freshness’.
    Difficult Search: A search was defined as difficult when at least three of the search engines did not provide a relevant result on the first page. Scoring was based on how well the search engines performed with these specific difficult searches.
    Failure rate: Scoring was based on how many times the search engine was unable to produce relevant results in the first three pages for a difficult search.
    Non-organic or extra features: A number of subjective elements such as performance in local searches and the availability of maps were combined to produce an overall score for this criterion.

The raw scores for the five criteria were weighted and then combined using a proprietary formula to produce a composite score. The results were as follows:
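Intralink's formula is proprietary, so what follows is only a sketch of how weighted raw scores could be combined into a single composite figure. The weights, the 0-100 scale and the example scores are all invented for illustration; none of them come from the report.

```python
# Hypothetical weights for the five criteria. Intralink's real weights
# and formula are proprietary, so these values are assumptions.
WEIGHTS = {
    "relevancy": 0.40,
    "freshness": 0.20,
    "difficult_search": 0.15,
    "failure_rate": 0.15,
    "extra_features": 0.10,
}


def composite_score(raw_scores, weights=WEIGHTS):
    """Combine raw criterion scores into one weighted average.

    Assumes every raw score is on the same scale (say 0-100) and that
    a higher score is always better, so a failure-rate score must
    already be inverted before it is passed in.
    """
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * raw_scores[name] for name in weights)
    return weighted_sum / total_weight


# Made-up raw scores for an imaginary search engine.
example = {
    "relevancy": 80,
    "freshness": 70,
    "difficult_search": 60,
    "failure_rate": 90,
    "extra_features": 50,
}
print(round(composite_score(example), 2))
```

Dividing by the total weight means the weights do not have to add up to exactly one, which makes it easier to experiment with different weightings.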

Results of the best search engine report

Overall MSN comes out on top, but the original report notes that on the Relevancy scores alone both MSN and Yahoo performed better than Google.

Intralink are promising to provide a report periodically but based on the same basic set of searches (though of course searches based on current events will have to be updated). Observing the trend in these later reports may prove even more interesting than this snapshot!

My thanks to Eric Gurr, President/CEO of Intralink, for permission to reproduce the data and chart.

URLs (Update)

If you have read through the Tutorial you will have seen the article on Search Engine Friendly Urls and in the News and Comment section you may also have seen the article on Keywords in Urls. Here I am going to add some additional information and conclude with some practical advice.

From the results of the experiment we conducted in Keywords in Urls we know the following:

  • Google indexes on keywords in hyphenated urls but not on keywords in underscored or conjoined urls.
  • Yahoo indexes on keywords in hyphenated and underscored urls but not keywords in conjoined urls.
  • MSN indexes on some keywords in hyphenated, underscored and conjoined urls but the exact circumstances in which it does so are at the moment unclear.
  • We also know there is research that shows url depth (e.g. slash-count) and url length (e.g. number of characters) can be used to improve some types of web search results. In addition a recent paper out of the Microsoft Research Web Search & Mining Group shows that the location (identification) of search terms (keywords) in the url can also be used to improve search results. The paper is not yet available to read on the web but for reference it is “Exploring URL Hit Priors for Web Search”, Ruihua Song, Guomao Xin, Shuming Shi, Ji-Rong Wen and Wei-Ying Ma, Advances in Information Retrieval: 28th European Conference on IR Research, 2006. When and if it becomes available I will add a note here. (May 30, 2006. Dr. Wei-Ying Ma has kindly sent me a .pdf file of his paper which you can read here, URL Hit Priors.) I personally do not know if depth priors and hit priors (as they are called) are actually used by the three main search engines in their algorithms at the moment. However, it would not surprise me if they were or, if not now, if they will be some time in the near future.
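To make the terms concrete, here is a minimal sketch of how you might measure url depth and url length for your own pages, and see which words a hyphen-only split would pick out of a url. The function names and the example.com urls are my own inventions, and the hyphen split is plain string handling; it is not a description of how any search engine parses urls.

```python
from urllib.parse import urlparse


def url_depth(url):
    """Count the slashes in the path part of a url (the slash-count)."""
    return urlparse(url).path.count("/")


def url_length(url):
    """Number of characters in the whole url."""
    return len(url)


def hyphen_keywords(url):
    """Split the last path segment into words on hyphens only."""
    last_segment = urlparse(url).path.rstrip("/").split("/")[-1]
    # Drop a file extension such as ".html" if one is present.
    stem = last_segment.rsplit(".", 1)[0]
    return [word for word in stem.split("-") if word]


# Hypothetical urls used purely for illustration.
shallow = "http://www.example.com/long-tail-search.html"
deep = "http://www.example.com/articles/seo/keywords_in_urls.html"

for url in (shallow, deep):
    print(url_depth(url), url_length(url), hyphen_keywords(url))
```

The underscored url comes back as a single unsplit token, which is a handy way to spot urls on your own site that are not hyphen-separated.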

So from this and the previous articles the practical advice is that when constructing your urls:

  • Keep the urls short.
  • Put important pages in the root directory (immediately after the first slash).
  • Incorporate your keywords in the urls.
  • Use hyphens to separate keywords, never an underscore or space.

This advice put into practice can be seen on this site :)
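If your pages are generated by a script, the advice above can be applied automatically. The following is only a sketch: the function names slugify and root_url and the example.com domain are hypothetical, and the pattern keeps only unaccented letters and digits, so titles in other alphabets would need different handling.

```python
import re


def slugify(title):
    """Turn a page title into a short, hyphen-separated url slug."""
    # Lowercase, then replace every run of characters that are not
    # a-z or 0-9 (spaces, underscores, punctuation) with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def root_url(domain, title):
    """Place the page directly under the root directory."""
    return f"http://{domain}/{slugify(title)}"


print(root_url("www.example.com", "Long Tail Search"))
print(slugify("Keywords_in URLs!"))
```

Keeping the slug to the few words you actually want to be found for also takes care of the advice to keep urls short.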

Long Tail Search

“….a person that started in to carry a cat home by the tail was getting knowledge that was always going to be useful to him….” An aphorism from Uncle Abner in “Tom Sawyer Abroad” by Mark Twain, 1835-1910.

I expect you could learn a lot by carrying a cat home by its tail(!) but of more practical use to the website owner is understanding the long tail of search.

First something about word frequency. Word frequencies in natural languages such as English demonstrate the following property: in a given body of text there will be just a few words that are used very often and a large number of words that are used not so often. This property was noted by the Harvard linguistics professor George Zipf and the precise mathematical relationship is called Zipf’s law.

Here is an example of Zipf’s law in action. I have taken Chapter 10 of “Tom Sawyer Abroad” (in which the above quotation occurs) and analyzed it for word frequency.

Word frequency chart

There are 2426 words in the chapter of which 624 are unique and 11 of these unique words account for 30% of the occurrences. As we said earlier just a few words are used very often and most of the words are used not so often.
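If you want to try this on a text of your own, a word count along these lines will do. This is a minimal sketch and not the tool I used for the chart: the word pattern is a simplifying assumption (lowercase letters and straight apostrophes only), so the totals it gives will depend on how you choose to define a word. In its usual form Zipf's law says that a word's frequency is roughly inversely proportional to its rank, so the last column (rank times count) should stay in the same ballpark for the top words of a long text.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercase word occurs in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def rank_table(counts, top=10):
    """List (rank, word, count, rank * count) for the most common words."""
    return [
        (rank, word, count, rank * count)
        for rank, (word, count) in enumerate(counts.most_common(top), start=1)
    ]


# A tiny made-up sample; a whole chapter gives a far better picture.
sample = "the cat and the dog and the bird"
counts = word_frequencies(sample)
print(sum(counts.values()), "words,", len(counts), "unique")
for row in rank_table(counts, top=3):
    print(row)
```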

What has this got to do with the long tail of search? Well if you examine the log files of your website there will be a record of the search terms that visitors used to find your site via the search engines. If this log file was for a medium-sized site and you looked at a week of statistics, the number of different search terms will run into thousands and obey Zipf’s law. It is here that we will find the long tail of search.

Here is an example:

Search term frequency chart.

Here we are not plotting ‘word frequency’ but ‘search term frequency’; however, it is the same principle. Only 7 terms account for 30% of the searches. The remaining 1219 search terms account for 70% of the searches. The large number of search terms that are not used so often are referred to as the long tail of search.
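The same split can be worked out for your own search terms. The sketch below assumes you already have a count of visits per search term; the function name, the 30% cut-off parameter and the sample terms are made up for illustration.

```python
from collections import Counter


def split_head_and_tail(term_counts, head_percent=30):
    """Split search terms into a head and a long tail.

    The head is the smallest set of top-ranked terms that covers at
    least head_percent of all searches; everything else is the tail.
    Returns (head, tail, share of searches that fall in the tail).
    """
    total = sum(term_counts.values())
    if total == 0:
        return [], [], 0.0
    ranked = term_counts.most_common()
    running = 0
    head = []
    for term, count in ranked:
        # Stop as soon as the head already covers the requested share.
        if running * 100 >= head_percent * total:
            break
        head.append(term)
        running += count
    tail = [term for term, _ in ranked[len(head):]]
    return head, tail, (total - running) / total


# Made-up counts for illustration only.
terms = Counter({
    "seo": 3,
    "seo tutorial": 2,
    "keywords in hyphenated urls": 1,
    "zipf law search terms": 1,
    "google sandbox trusted links": 1,
    "url depth priors": 1,
    "long tail conversion rate": 1,
})
head, tail, tail_share = split_head_and_tail(terms)
print(len(head), "head terms,", len(tail), "tail terms")
print(f"tail share of searches: {tail_share:.0%}")
```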

In practice when you examine the long tail of search in your server logs there will be a number of search terms where it is clear that the searcher was seeking something totally different to that which your site has to offer. However there will be many more where the searcher has visited your site having used a search term that is highly relevant even if it is not frequently used, or one that you might have never conceived of as being used in the first place.

It is also true that the long tail of search will contain more specific search terms and that these will have a higher conversion rate than more general terms. There is a good reason for this because consumers tend to use one or two word queries when they start their basic research but as they learn more about what they are searching for they end up using specific and longer phrases. In other words they will start on the left of our search term frequency chart above but move to the right (the long tail).

So how do we use the long tail of search for SEO? The answer is to write new content (pages) using the relevant search terms found in the long tail as key phrases. The advantages of this strategy are twofold. Firstly your tail will get longer(!) and hence targeted traffic will increase. Secondly with a longer tail you will be able to repeatedly mine your logs for unusual search terms and produce even more content. If this is done systematically, regularly and with imagination the investment of time and effort will be well worthwhile. (June 13, 2006. There is now an easy way to do this with a Long Tail Search Tool).
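For anyone who wants to mine their logs by hand, here is a rough sketch of the first step, pulling the search phrase out of each referrer url. It assumes you have already extracted the referrer field from your log lines (log formats vary) and that the phrase is carried in the "q" parameter for Google and MSN and the "p" parameter for Yahoo; check a few of your own referrers before relying on that. The function names and sample referrers are invented.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse

# Assumed query-string parameter holding the search phrase for each
# engine. Verify these against the referrers in your own logs.
SEARCH_PARAMS = {"google": "q", "yahoo": "p", "msn": "q"}


def search_term(referrer):
    """Return the search phrase in a referrer url, or None."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for engine, param in SEARCH_PARAMS.items():
        if engine in host:
            values = parse_qs(parsed.query).get(param)
            if values:
                return values[0].strip().lower()
    return None


def count_search_terms(referrers):
    """Count visits per search term, ignoring non-search referrers."""
    terms = (search_term(referrer) for referrer in referrers)
    return Counter(term for term in terms if term)


# Made-up referrers for illustration.
sample_referrers = [
    "http://www.google.com/search?q=long+tail+search",
    "http://search.yahoo.com/search?p=keywords+in+urls",
    "http://www.google.com/search?q=Long+Tail+Search&hl=en",
    "http://www.example.com/some-page.html",
]
term_counts = count_search_terms(sample_referrers)
print(term_counts.most_common())
# Terms seen only once are candidates for new long tail content.
print([term for term, n in term_counts.items() if n == 1])
```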

Long tail search is just one instance of many long tails arising from Chris Anderson’s original article in Wired magazine. If you are interested in long tails outside the realm of practical SEO Chris’s blog provides a plethora of outstanding reading material.
