The Last Post

This will be my last post on the blog :(

Search Engine Optimization is no longer a standalone subject that can be studied or written about in isolation. The advent of social media, podcasting, corporate blogs, video and the myriad other routes to successful branding means that a more holistic approach is required. I have noticed a few trends and intend to take advantage of them for my clients but not blog about them.

Thank you to all those who have emailed me over the past few years. I know I provided a much-needed source of accurate information, and thank you for your plaudits.

Good luck to all of you and remember: test, test and test again.

- Michael Duz


Web Analytics: An Hour A Day by Avinash Kaushik

I don’t normally herald the addition of a new book to the Essential Reading section of the sidebar here on SEO Blog, but this is an exception. Web Analytics: An Hour A Day by Avinash Kaushik is quite the best book I have read for anyone involved in Internet marketing and small businesses. Don’t be put off by the title: the book explains the many complex metrics in simple terms and provides very specific guidelines with a step-by-step approach. Avinash Kaushik was Director of Web Research & Analytics for Intuit but left earlier this year to become an independent consultant. His first assignment is working with Google as an Analytics Evangelist, and he talks more about that role on his excellent blog Occam’s Razor. Avinash is donating proceeds from the sale of Web Analytics to charity, so you will not be the only good cause to benefit when you buy this book!


QR Codes

I was with a Japanese corporate client recently and as usual over dinner I asked them what was new on their cell phone (technically not personally!). I say as usual because I have found that if you ask that question in Japan you will always learn something interesting. This case was no exception but first some background.

It is worth noting that the Japanese are addicted to their cell phones, with over 100 million people (nearly 80% of the population) using the widely available high-speed 3G systems. Many of these users have been taking advantage of QR (Quick Response) codes, which are two-dimensional barcodes. QR codes can be read by any mobile device with a camera and the appropriate reader software. QR codes appear all over Japan on billboards, in print, on websites and in store windows. Even the Japanese government uses them, with the immigration service stamping QR codes on passports detailing the visa status. I first saw them at a trade show in Tokyo where every stand seemed to have one.

QR codes can store up to 7089 numeric characters, 4296 alphanumeric characters or 1817 characters of Japanese (kanji script). That compares with 20-30 ASCII characters (depending on the standard) for a conventional one-dimensional barcode.
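As a rough check on those figures, here is a minimal Python sketch that derives them from the way QR codes pack characters into bits. The constants are my reading of the QR specification (ISO/IEC 18004) for the largest symbol, version 40 at the lowest error correction level, and are not taken from the sources linked in this post; the function names are purely illustrative.

```python
# Assumed figures for a version 40 symbol at error correction level L.
DATA_CODEWORDS_40L = 2956            # 8-bit data codewords available
TOTAL_BITS = DATA_CODEWORDS_40L * 8  # 23648 bits of data capacity
MODE_INDICATOR_BITS = 4              # every segment starts with a 4-bit mode flag


def payload_bits(count_field_bits):
    """Bits left for characters after the mode flag and character-count field."""
    return TOTAL_BITS - MODE_INDICATOR_BITS - count_field_bits


def numeric_capacity(bits):
    # Three digits are packed into 10 bits; a leftover pair needs 7, a single digit 4.
    groups, rest = divmod(bits, 10)
    extra = 2 if rest >= 7 else 1 if rest >= 4 else 0
    return groups * 3 + extra


def alphanumeric_capacity(bits):
    # Two characters are packed into 11 bits; a leftover single character needs 6.
    pairs, rest = divmod(bits, 11)
    return pairs * 2 + (1 if rest >= 6 else 0)


def kanji_capacity(bits):
    # Each kanji character takes 13 bits.
    return bits // 13


# Character-count field widths for versions 27 to 40: 14, 13 and 12 bits.
numeric = numeric_capacity(payload_bits(14))
alphanumeric = alphanumeric_capacity(payload_bits(13))
kanji = kanji_capacity(payload_bits(12))

print(numeric, alphanumeric, kanji)  # 7089 4296 1817
```

The maximums only apply to the largest, least error-tolerant symbol. The small codes you see on product labels hold far less.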

If you want to read more about QR codes, Nokia has a simple explanation with some useful links, and here is a short article on a potential development from NTT called Audio Barcodes.

Back to my dinner conversation. My client, who is obviously a conscientious consumer, told me that her local supermarket has started to use QR code labels on some fresh produce. She shops with the QR reader software enabled on her cell phone, takes a picture of the label and is then connected to a site with all the supplier’s details.

The labels look like this:

QR code label

You can see the QR code in the bottom right hand corner and the supplier’s details, in this case for a lettuce, look like this:

Lettuce grower's details

My Japanese is not very good but this has all sorts of interesting information: exactly where it was grown, when the seeds were sown, when the lettuce was harvested, the fertilizer used, the insecticide used, the bactericide used, the herbicide used and lots more. I have to say I was impressed!

She also told me that she uses QR codes to put useful RSS feeds on to her cell phone. It transpired that this will work on any RSS feed, not just those especially for mobile devices, because the software adapts the content automatically. I have generated a QR code for the feed on this site:

QR code for SEO Blog feed

This is what you would see on the cell phone:

Cell phone image data

If you would like to generate QR codes for your own RSS feeds you can do so on the Kaywa site.

So when can we expect to see QR codes used widely in the US? Not any time soon would be my guess. As anyone who has visited the major cell phone trade shows like CTIA Wireless will know, the latest and best cell phones on display are labeled ‘Not available in the US’. The three different transmission modes used in the US, CDMA (Sprint & Verizon), GSM (AT&T & T-Mobile) and iDEN (Nextel), make it far more attractive for manufacturers to launch new products and functionality in the European and Asian markets, where they just use GSM.

Europeans are just beginning to see QR codes but if you have Japanese or Asian customers it’s definitely worth knowing about QR codes and the many ways they are used for marketing.

September 28, 2007 update.

A reader has emailed me about a new store in the fashionable Rue de Turbigo, Paris, France. Called Denim Code, it sells designer clothes with QR codes attached. This publicity photo shows a QR code on a pair of jeans, but do Parisian males need an excuse to take a picture of a lady’s bottom? What message will they receive on their cell phone? I bet those of you with a marketing brain are already thinking of hundreds of novel ideas!


Semantic Markup

In the Design and content guidelines sub-section of Google’s Webmaster Guidelines there is a paragraph that reads “… write pages that clearly and accurately describe your content”. The only way I know of achieving this objective easily is to use semantic markup. (I will give some further reading references on semantic markup at the end of this post.)

Semantic markup means using HTML elements that are appropriate to ‘content meaning’ rather than ‘content presentation’. A simple example might be using <em> for emphasis in some cases rather than <i> for italic, because <i> only tells the browser what to do and does not explain what the content represents.

Apart from helping Google :) there are other advantages in separating content from presentation and using semantic markup. For example, the code is much easier to maintain, and other user agents such as audio screen reader software can interpret it correctly.

However, semantic markup is not a technical necessity, and this is probably the reason why only a tiny minority of web designers even know what it is, let alone use it when coding web pages.

From my perspective as an SEO, whenever I have to make site-wide changes where non-semantic (presentational) markup has been used I curse the web designer :) For example, if I want to change list items to uppercase and the designer has coded them as <p> - list item one<br> - list item two<br> - list item three</p> then I have to make the changes on every page. If they had used <ul> <li>list item one</li> <li>list item two</li> <li>list item three</li> </ul> then I could do the same thing with one line of CSS, ul {text-transform: uppercase;}, and not have to touch any of the markup.

I also want to shoot the designer whenever I see a missed opportunity to take advantage of Google’s recommendation to describe content, simply because the designer fails to use the appropriate semantic html.

Web designers being shot for not using semantic markup 1940's style comic cartoon

If your web designer has provided a W3C validated site you might like to try the W3C’s Semantic Extractor tool, which provides a basic outline of some of the semantics in your HTML markup. It simply examines your markup and provides a summary of important data, i.e. generic metadata, related resources, defined terms, abbreviations and acronyms, and an outline of the document. Prior to this post I parsed the home page of this site and this was the output:

Generic metadata and related resources.

Semantic extractor output - generic metadata and related resources

Abbreviations and acronyms.

Semantic extractor output - abbreviations and acronyms

Outline of the document.

Semantic extractor output - outline of the document
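To give a feel for what a tool like this does, here is a minimal Python sketch that pulls a heading outline and a list of abbreviations out of a page using only the standard library. It is not how the W3C tool is implemented (I have not seen its code). The class name OutlineExtractor and the sample page are made up for illustration.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
ABBRS = {"abbr", "acronym"}


class OutlineExtractor(HTMLParser):
    """Hypothetical helper: collects headings and abbr/acronym expansions."""

    def __init__(self):
        super().__init__()
        self.outline = []         # list of (heading level, heading text)
        self.abbreviations = {}   # short form -> expansion from the title attribute
        self._heading = None      # text chunks of the heading being read, if any
        self._abbr = None         # text chunks of the abbreviation being read, if any
        self._abbr_title = None

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._heading = []
        elif tag in ABBRS:
            self._abbr = []
            self._abbr_title = dict(attrs).get("title")

    def handle_data(self, data):
        if self._heading is not None:
            self._heading.append(data)
        if self._abbr is not None:
            self._abbr.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._heading is not None:
            self.outline.append((int(tag[1]), "".join(self._heading).strip()))
            self._heading = None
        elif tag in ABBRS and self._abbr is not None:
            if self._abbr_title:
                self.abbreviations["".join(self._abbr).strip()] = self._abbr_title
            self._abbr = None


# Invented sample page: two semantic structures and one presentational "list".
sample = """
<h1>SEO Blog</h1>
<p>Notes on <abbr title="Search Engine Optimization">SEO</abbr>.</p>
<h2>QR Codes</h2>
<h2>Semantic Markup</h2>
<p> - list item one<br> - list item two</p>
"""

parser = OutlineExtractor()
parser.feed(sample)
for level, text in parser.outline:
    print("  " * (level - 1) + text)
print(parser.abbreviations)
```

Notice that the fake list built from <p> and <br> contributes nothing to the result. A program reading the page has no way of knowing it was meant to be a list.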

So what’s the bottom line? If you believe that Google uses (or will use in the near future) any form of semantic analysis of web pages then before you write the copy for your new web pages ensure that your designer uses semantic markup.

Further reading on semantic markup:


What do search engine spammers look like?

You may think that search engine spammers look pretty much the same as anyone else and that is probably true, unless of course you are a spam detection algorithm.

At last week’s ACM SIGIR conference in the Netherlands an interesting paper was presented with the title “Know your Neighbors: Web Spam Detection using the Web Topology”.

Essentially this describes a spam detection system that uses the link structure of web pages and their content to identify spam. Or as the abstract puts it, “In this paper we present a spam detection system that uses the topology of the Web graph by exploiting the link dependencies among the Web pages, and the content of the pages themselves.”

The following impressive diagram appears in the paper:

Hostgraph of a section of the web

This is a graphical depiction (for a very small part of the web) of domains with a connection of over 100 links between them. Black nodes are spam and white nodes are non-spam.

Most of the spammers are clustered together in the upper-right of the center portion and here is a magnified view of that section:

Magnified section of hostgraph


What spammers look like.

The other domains are either in spam clusters or non-spam clusters. Here is a typical spam cluster and it shows what spammers, who indulge in nepotistic linking, may look like to a spam detection algorithm.

Of course this is only one line of research into spam detection but you don’t need to be clairvoyant to know that the major search engines have been including similar components in their ranking algorithms for some time. Good search engine optimizers avoid unnatural linking patterns and all site owners are well advised to do the same.
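To make the ‘know your neighbors’ idea concrete, here is a toy Python sketch. It is emphatically not the method from the paper, which combines link and content features in a trained classifier. It only illustrates the intuition that a host linked mostly to known spam hosts deserves a closer look. The hostnames, link counts and function names are all invented, and the threshold of 100 links is simply borrowed from the diagram described above.

```python
from collections import defaultdict

LINK_THRESHOLD = 100  # keep only host pairs joined by more than 100 links


def build_host_graph(link_counts):
    """Build an undirected host graph from {(host_a, host_b): number_of_links}."""
    graph = defaultdict(set)
    for (a, b), count in link_counts.items():
        if count > LINK_THRESHOLD:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def spam_neighbor_ratio(graph, labels, host):
    """Fraction of a host's labeled neighbors that are known spam (None if none are labeled)."""
    known = [labels[n] for n in graph.get(host, ()) if n in labels]
    return sum(known) / len(known) if known else None


# Invented data: True means the host is already labeled as spam.
links = {
    ("cheap-pills.example", "casino-win.example"): 450,
    ("cheap-pills.example", "unknown.example"): 300,
    ("casino-win.example", "unknown.example"): 220,
    ("news.example", "unknown.example"): 12,   # below the threshold, so ignored
    ("news.example", "library.example"): 140,
}
labels = {
    "cheap-pills.example": True,
    "casino-win.example": True,
    "news.example": False,
    "library.example": False,
}

graph = build_host_graph(links)
ratio = spam_neighbor_ratio(graph, labels, "unknown.example")
print(ratio)  # 1.0, every heavily linked neighbor is known spam
```

A real system would treat a score like this as one signal among many rather than a verdict, but it shows why nepotistic linking stands out so clearly in the diagram.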

You can read the full paper here: Know your Neighbors: Web Spam Detection using the Web Topology, Carlos Castillo, Debora Donato, Aristides Gionis, Vanessa Murdock and Fabrizio Silvestri, proceedings of SIGIR, ACM Press, July 2007, Amsterdam, Netherlands, 423-430.

There is also a very good lecture by Carlos Castillo that gives an insight into various techniques of spam detection. It was recorded at the workshop The Future of Web Search on May 19, 2006, organized by Yahoo! Research Barcelona and the Web Research Group of the Department of Technology, Universitat Pompeu Fabra. You can see the lecture here: Using Rank Propagation and Probabilistic Counting for Link-based Spam Detection.

