Keith Cirkel

Software Cyber Shepherd

Schema.org - the new generation of SEO

SEO has always been a bit of a black-hat science. Hiring agencies to "do your SEO" is a minefield, and you may just end up with a two-bit operation that asks for money because it generated some XML using a free sitemap generator, or worse, gives you actively bad advice such as "use meta keywords" or "duplicate this paragraph on every page of your website". It's become difficult to tell what is actually good SEO advice and what is bad.

Luckily Google, Microsoft and Yahoo! have joined forces to create an approved form of Search Engine Optimisation: Schema.org, a new generation of microdata tags which help optimise your pages by telling search engines exactly what these pages are, what data is inside them and how that data is related.

How to implement Schema.org

Schema.org tags use a simple set of HTML attributes (itemscope, itemtype, itemprop) in combination to describe "items" in your page. An item can be anything from a WebPage to a Product Offering to a Comedy Night to a TV Series and its Director. You simply pepper the attributes around your HTML and Google et al will pick up this rich microdata for use in indexing and displaying search results.

A base "item" looks something like this:
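As a minimal sketch, assuming the http://schema.org/Thing type URL and some placeholder content (this is illustrative, not lifted from a real page):

```html
<!-- itemscope opens a new item; itemtype names its type with a Schema.org URL -->
<div itemscope itemtype="http://schema.org/Thing">
  <!-- Placeholder content: nothing in here is marked up yet -->
  <p>Some content describing the thing.</p>
</div>
```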

itemscope marks the <div> as the wrapping container for a new item, and the complementary itemtype attribute declares what type the item is (in this case "Thing"). Any tag inside this wrapping tag can then have pieces of information identified with itemprop, like this fuller example:
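An illustrative sketch of a Person profile. The type URL, the choice of tags and the example.com address are assumptions for illustration; the job title is borrowed from this site's tagline:

```html
<div itemscope itemtype="http://schema.org/Person">
  <!-- The full name wraps the given and family names, so all three are captured -->
  <h1 itemprop="name"><span itemprop="givenName">Keith</span> <span itemprop="familyName">Cirkel</span></h1>
  <p itemprop="jobTitle">Software Cyber Shepherd</p>
  <!-- For links, the property value is the href, not the link text (placeholder URL) -->
  <a itemprop="url" href="https://example.com/">My website</a>
</div>
```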

Here we can see that the itemprop attribute is used to associate parts of the Person profile with parts of our markup - any indexer that understands this markup will get meaningful data about me from this block of code, such as my full name, my given and family names, my job title and my website address. Cool, right?

It goes much deeper too. For example, some profiles link to other profiles: a Person can work for an Organization or be related to other People, each of which will have its own itemscope and itemtype as well as being related to the original scope.
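A rough sketch of how that nesting might look, assuming Person's worksFor property and a made-up company name:

```html
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Keith Cirkel</span> works for
  <!-- worksFor is a property of the Person; its value is a whole new item -->
  <span itemprop="worksFor" itemscope itemtype="http://schema.org/Organization">
    <!-- This name belongs to the Organization, not the Person ("Example Corp" is hypothetical) -->
    <span itemprop="name">Example Corp</span>
  </span>
</div>
```

An itemprop inside a nested itemscope always describes the nearest enclosing item, which is why both name properties can coexist without clashing.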

As a more complex example, let's say you're a freelancer who's just built a site for a new Movie called "Sherlock Redemption" (a tale of the famous Sherlock Holmes having to solve cases in Nazi Germany) directed by M. Night Shyamalan (spoiler alert: the twist of the movie is that Sherlock finds out he's a Nazi). The home page for this website might be marked up like so:
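A sketch of what that plain markup could look like. Apart from written-by, the class names are hypothetical, as are the actor names, the trailer file name and the year:

```html
<div class="movie">
  <h1>Sherlock Redemption</h1>
  <h2 class="written-by">Written and directed by M. Night Shyamalan</h2>
  <!-- Placeholder cast: these names are made up for illustration -->
  <ul class="cast">
    <li>Actor One</li>
    <li>Actor Two</li>
    <li>Actor Three</li>
  </ul>
  <!-- Hypothetical trailer file -->
  <video class="trailer" src="trailer.mp4" controls></video>
  <!-- Hypothetical copyright year -->
  <p class="copyright">&copy; 2014 Universal Studios</p>
</div>
```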

This markup is full of rich information about the film: we have the movie title, the director, three actors, a trailer, the production studio and the copyright year. This is all perfect data to wrap in Schema.org tags, where you might end up with something like this:
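Here is one way it could come out, as a sketch rather than a definitive listing. The property names (name, director, author, actor, trailer, contentUrl, copyrightYear, productionCompany) are Schema.org's; the actor names, trailer file name and year are hypothetical placeholders:

```html
<div class="movie" itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Sherlock Redemption</h1>
  <!-- One Person item fills two properties of the Movie at once -->
  <h2 class="written-by" itemprop="director author" itemscope itemtype="http://schema.org/Person">
    Written and directed by <span itemprop="name">M. Night Shyamalan</span>
  </h2>
  <!-- Each cast member is its own Person item, attached via the actor property -->
  <ul class="cast">
    <li itemprop="actor" itemscope itemtype="http://schema.org/Person"><span itemprop="name">Actor One</span></li>
    <li itemprop="actor" itemscope itemtype="http://schema.org/Person"><span itemprop="name">Actor Two</span></li>
    <li itemprop="actor" itemscope itemtype="http://schema.org/Person"><span itemprop="name">Actor Three</span></li>
  </ul>
  <!-- The trailer is a VideoObject item; contentUrl takes its value from src -->
  <div class="trailer" itemprop="trailer" itemscope itemtype="http://schema.org/VideoObject">
    <video itemprop="contentUrl" src="trailer.mp4" controls></video>
  </div>
  <p class="copyright">
    &copy; <span itemprop="copyrightYear">2014</span>
    <span itemprop="productionCompany" itemscope itemtype="http://schema.org/Organization">
      <span itemprop="name">Universal Studios</span>
    </span>
  </p>
</div>
```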

What we've done here is give search engines some intricate metadata about the movie in question. Search engines can determine that Universal Studios made this Movie, and that Shyamalan was the director and writer, as well as lots more. You can see that some parts of this markup define multiple itemprops as space-separated names, for example <h2 class="written-by" itemprop="director author" itemscope itemtype="">. This lets you associate one item with another in multiple ways: in this example the director and author are the same person, and the markup can reflect that.

Google's Structured Data Testing Tool lets you test markup or URLs and see the data scraped from them, which makes it a great tool for experimenting with and developing your site's awesome metadata.

That's all cool, but so far we haven't done anything really cool with this - how do we know search engines are even picking this up?

The cool uses for Schema.org

Today, Google uses Schema.org in some nifty ways which enhance search results. Here are just a few examples that have been spotted on real websites:

Aggregate Ratings

If you have an itemscope that is a Movie, or perhaps a Restaurant (as well as many others) and it includes an AggregateRating itemscope then Google will show a little star rating just like this:

Google Search Result for Her (film), featuring the Aggregate Rating in the search result: "4/5 stars, rating 8.4/10 - 57865 votes"

That little star rating does not come from IMDb getting special treatment on Google. Google gives the same treatment to Rotten Tomatoes and Metacritic, as well as restaurant review sites such as TopTable - they all use AggregateRating, and that's what causes Google to assign the star rating widget to them.
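As a sketch of the kind of markup that could sit behind a result like that (this is not IMDb's actual markup; the numbers are the ones from the search result above, and the property names are Schema.org's aggregateRating, ratingValue, bestRating and ratingCount):

```html
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Her</h1>
  <!-- The rating is a nested item attached to the Movie via aggregateRating -->
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">8.4</span>/<span itemprop="bestRating">10</span>
    from <span itemprop="ratingCount">57865</span> votes
  </div>
</div>
```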


Offers

If you have an itemscope of Event, Product, CreativeWork or MediaObject (or any of their subtypes) that has an itemprop of offers (which relates to an Offer), then Google will show the price of that offer in the search results:

Google Search Result for Half Life (game) on Metacritic, featuring the Price in the result: "$9.99"

You can see this in lots of places: Metacritic makes good use of it, as do the Google Play and iTunes App Stores.
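A minimal sketch of an item with an offer, assuming a plain Product type and a US dollar price (the real sites may well use a more specific type and different tags):

```html
<div itemscope itemtype="http://schema.org/Product">
  <h1 itemprop="name">Half Life</h1>
  <!-- offers links the Product to a nested Offer item -->
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <!-- meta carries the currency code without showing it on the page -->
    <meta itemprop="priceCurrency" content="USD">
    $<span itemprop="price">9.99</span>
  </div>
</div>
```

Keeping the currency symbol outside the price property leaves the price value as a bare number, with the currency stated separately in priceCurrency.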

Software Operating System

If you have a SoftwareApplication itemscope with an operatingSystem itemprop then Google can show the compatible Operating Systems in the results.

Google Search Result for Badland (game) on iTunes, featuring the Operating System compatibility in the search result: "iOS"

The iTunes and Google Play stores are two examples that show the operatingSystem. This seems ripe for the taking by other software houses, but not many have picked up on it.
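Something along these lines would do it, as an illustrative sketch only (not the stores' actual markup):

```html
<div itemscope itemtype="http://schema.org/SoftwareApplication">
  <h1 itemprop="name">Badland</h1>
  <!-- operatingSystem is plain text naming the supported platform -->
  <p>Requires <span itemprop="operatingSystem">iOS</span></p>
</div>
```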

Recipe Cooking Time & Calories

This is an interesting one: if you have a Recipe itemscope and it features cookTime, prepTime or totalTime properties, then they'll be shown in the search results. Similarly, if it has a NutritionInformation itemscope attached to the nutrition itemprop, then Google can show calorific info in the result.

Google Search Result for Yorkshire Puddings on BBC Good Food, featuring the recipe cooking time: "25 mins" and the calories: "199 cal"

Lots of recipe sites' results show this, BBC Good Food among them.
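A sketch of recipe markup that carries both pieces of data, using Schema.org's cookTime property name and the values from the search result above (this is not BBC Good Food's actual markup, and treating the 25 minutes as cookTime is an assumption):

```html
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Yorkshire Puddings</h1>
  <!-- Durations are machine-readable ISO 8601 values in the datetime attribute -->
  <p>Cook: <time itemprop="cookTime" datetime="PT25M">25 mins</time></p>
  <!-- Nutrition data is a nested item attached via the nutrition property -->
  <div itemprop="nutrition" itemscope itemtype="http://schema.org/NutritionInformation">
    <span itemprop="calories">199 calories</span>
  </div>
</div>
```

prepTime and totalTime can be marked up the same way as cookTime, each with its own ISO 8601 duration.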

Summary

Schema.org is an easy-to-implement tool that allows search engines to extract rich data and provide useful clues to would-be customers about your product or service. It's still early days on this - the standard is only a few years old and not every part of it is displayed in useful ways to users, but that really shouldn't stop you from going ahead and implementing it on your website. In fact, the cool stuff is only really just beginning - imagine search engines which will soon use only Schema.org data to search for things.

What are your thoughts on Schema.org or rich data in general? Think I've missed something? Got some cool related tools to show me? I'm @keithamus on Twitter, so tweet me or whatever.