Search Engine Friendly URLs

It is important to have search engine friendly URLs if you want your pages spidered and indexed by the search engines, but what does having a search engine friendly URL actually mean? Let’s take a look at what the three major search engines say about URLs:

Google has three things to say on the subject in its Webmaster Guidelines:

1. If you decide to use dynamic pages (i.e., the URL contains a “?” character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.

2. Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.

3. Don’t use “&id=” as a parameter in your URLs, as we don’t include these pages in our index.

Yahoo!, in its Search Indexing FAQ, says:

Do you index dynamically generated pages (e.g., .asp, .shtml, PHP, “?”, etc.)?

Yahoo! does index dynamic pages, but for page discovery, our crawler mostly follows static links. We recommend you avoid using dynamically generated links except in directories that are not intended to be crawled/indexed (e.g., those should have a /robots.txt exclusion).

MSN’s Guidelines for successful indexing say:

Keep your URLs simple and static. Complicated or frequently changed URLs are difficult to use as link destinations. For example, the URL www.example.com/mypage is easier for MSNBot to crawl and for people to type than a long URL with multiple extensions.

The message is clear: static URLs are better than dynamic ones, but if you have a dynamic site the URLs must be as simple as possible, with only one or two query string parameters and no session IDs.

A URL that might look like this:

http://www.yoursite.com/main.php?category=books&subject=history

should preferably look like this:

http://www.yoursite.com/books-history.htm

How you achieve this depends on whether you are starting out with a new site or have an established site with existing complex URLs.

If it is a new site then search engine friendly URLs must be built into the design criteria. How this is done depends on the programming language: for example, if you plan to use PHP you might make use of the PATH_INFO variable, or if you use ASP.NET you could modify the Global.asax file.
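
For the PHP route, a minimal sketch might look like the following, assuming a single front controller and requests of the form /index.php/books/history (the parameter names are illustrative, not part of any standard):

    <?php
    // Read the friendly part of the URL from PATH_INFO, e.g. "/books/history".
    $pathInfo = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
    $segments = array_values(array_filter(explode('/', $pathInfo)));

    // Fall back to sensible defaults when segments are missing.
    $category = isset($segments[0]) ? $segments[0] : 'home';
    $subject  = isset($segments[1]) ? $segments[1] : '';

    // Dispatch exactly as you would with ?category=...&subject=...
    echo htmlspecialchars("Category: $category, Subject: $subject");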

If you plan to use a content management system (CMS), make sure that it generates search engine friendly URLs out of the box. The Content Management Comparison Tool has a check box for ‘Friendly URLs’ if you are researching CMS tools.

A completely different approach (not approved of by geeks, but worth considering if you are designing your own site as a non-professional) is to create static HTML pages from a database or spreadsheet, but not in real time. WebMerge, for example, works with any database or spreadsheet that can export in tabular format, such as FileMaker Pro, Microsoft Access, and AppleWorks. Using HTML template pages, WebMerge makes a new HTML page from the data in each record of the exported file. It can also create index pages with links to the other pages, and the generated pages can be hosted without the need for a database.
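
If you would rather roll your own than buy a tool, the same idea can be sketched in a few lines of PHP (this is not how WebMerge itself works; the file names, column order and {{...}} placeholders are hypothetical):

    <?php
    // Offline generation of static pages from a tabular export.
    $template = file_get_contents('template.html');  // contains {{TITLE}} and {{BODY}}
    $lines    = file('export.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $rows     = array_map('str_getcsv', $lines);
    array_shift($rows);                               // skip the header row

    foreach ($rows as $row) {
        list($category, $subject, $title, $body) = $row;
        $page = str_replace(array('{{TITLE}}', '{{BODY}}'), array($title, $body), $template);
        // Produces e.g. books-history.htm, matching the friendly URL style above.
        file_put_contents("$category-$subject.htm", $page);
    }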

If it is an existing site then problematic URLs can be converted to simple URLs in real time. If you are on an Apache server you can use mod_rewrite to rewrite requested URLs on the fly. This requires knowledge of regular expressions, which can be rather daunting if you are not a programmer; fortunately there is an abundance of mod_rewrite expertise at RentACoder if you get stuck. If you are on Internet Information Server (IIS) you can use something like ISAPI_Rewrite, which also requires knowledge of regular expressions.
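
On Apache, a single rule in an .htaccess file covers the example URLs used earlier (a minimal sketch, assuming mod_rewrite is enabled and that every friendly URL follows the category-subject.htm pattern):

    RewriteEngine On
    # Map /books-history.htm (what visitors and spiders see)
    # to /main.php?category=books&subject=history (what the site actually runs).
    RewriteRule ^([a-z0-9]+)-([a-z0-9]+)\.htm$ main.php?category=$1&subject=$2 [L,QSA]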

Whatever your solution, you should try to incorporate your keywords in the URLs, and only ever use hyphens as word separators, never an underscore or a space.
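
A small helper makes it easy to stick to that rule when building URLs from page titles (a hypothetical function, not something the tools above provide):

    <?php
    // Turn a title such as "History Books & Maps" into "history-books-maps".
    function make_slug($title) {
        $slug = strtolower($title);
        $slug = preg_replace('/[^a-z0-9]+/', '-', $slug); // anything non-alphanumeric becomes a hyphen
        return trim($slug, '-');                          // no leading or trailing hyphens
    }

    echo make_slug('History Books & Maps'); // prints history-books-maps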

Additional reading in a more recent post - URLs (Update)

5 Comments

  1. weikelbob said,

    December 13, 2005 @ 2:09 am

    What should be done if you already have a hundred incoming links or so on a page with a URL like boise-idaho-web-design.php? Is it important enough to start over and name it boise-idaho-web-design.html, which I assume would be starting over? Or is it mainly getting indexed in the first place that is the problem?

  2. duz said,

    December 13, 2005 @ 10:18 am

    If the URL is yourdomain.com/boise-idaho-web-design.php, then that is perfectly acceptable as far as the search engines are concerned.

    Is that what you mean?

  3. weikelbob said,

    December 28, 2005 @ 4:31 pm

    Yes, that’s what I mean, thank you.

  4. kirtan said,

    September 23, 2006 @ 2:14 pm

    Do JavaScript URLs get indexed? Some directories require a reciprocal link and they placed my link using JavaScript. Will it get indexed and credited as an inbound link to my site?

  5. duz said,

    September 23, 2006 @ 4:43 pm

    Unfortunately this is a common deceit by some directory owners. Google may see the link but not pass PageRank so avoid directories with JavaScript links.
