Microformats, A Primer – Simple Semantics for the Web

In a recent write-up about the “Semantic Web”, I ended my thoughts with the following:
“A fully realized Semantic Web will be quite amazing indeed, but it is going to take a long time to get to the point where the technology regularly intersects with our daily lives.

It is going to take a long time to annotate the world’s information and then to capture personal information in the right way in order to really make it work the way it is supposed to.

We are a few years away before we really start to see real traction in terms of Semantic Web technology.”

Introducing A Different Kind of Meaning

So, we already know that the term “semantic” stands for “meaning of.” We also know that to achieve a “true” semantic Web, a great deal of work is going to need to get done.

This not only includes an outpouring of time and money, but will also require education, training and changes in operations in hundreds of thousands of systems worldwide.

Ouch!

What stinks about this is two-fold:

  1. If those behind the W3C’s master theory of a Semantic Web have their way, this huge investment in business and technology change will be inevitable and someone is going to have to pay for it.
  2. The whole concept of a Semantic Web, and the user experiences it will enable, is WAY TOO COOL to wait around for!


Enter Microformats

True Semantic Web technology would enable computers to exchange/share, read and understand the meaning of data, and provide a mechanism for developers to create applications that provide for truly “next generation” user experiences. The concept of The Semantic Web is mostly about machine-to-machine data exchange and “behind the scenes” indexing, searching, storing and sharing.

Microformats, however, allow information that is actually intended for and consumed by human users to also be understood by software applications, similar to this Semantic Web concept.

Microformats emerged due to necessity.

For a quick peek back at the roots of Microformats, let’s think back about 10 years, to when the “browser wars” were full-on and Web developers finally began to think of Web site user interfaces as they should be.

This ideological approach to the Web is what we essentially have today: A “markup” applied to underlying data to give it a specific look and feel or visual design. Back then, Cascading Style Sheets were the next big thing, and XML was blessed by the W3C as something official.

Both CSS and XML addressed the desire to separate underlying data from display or use of that data. Necessity is indeed the mother of invention, and the necessity for applications that share meaningful data caused the invention of Microformats, which can be looked at as a unique merger of XML and CSS.

Here + Now = Good

Perhaps the reason that a semantic concept like Microformats has become “real” at this stage in the game stems from the fact that implementing Microformats doesn’t require a savvy Web developer to learn anything that they don’t already know.

  • Microformats are not vaporware.
  • They are simple to implement.
  • They are based on familiar, standards-based technologies.

Microformats are the end result of tagging existing Web content with CSS-style class names that describe the content’s metadata. This approach uses only simple XHTML and HTML classes and attributes.

Tagging content in this manner allows information that was created and published online (and intended for end-users) to just as easily be understood by software applications.

Since the inception of the World Wide Web, it has been possible to load, scan and parse HTML documents.

We call this “screen scraping”, and in reality, it is pretty clunky and never really a perfect solution. Programmers can create software applications that follow URLs, download page content, read through that content, and either store it or act upon it.

Even if a program successfully scrapes content from a Web page, the content itself doesn’t really have any independent meaning to it. A screen-scraping application just knows that text is text and knows where to get it, and where to store it, and maybe what to do with specific things that it finds. It’s a dumb technology really. (stupid screen scrapers!)
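
To make that concrete, here is a minimal sketch of a scraper in Python, using only the standard library. The page fragment and the class name TextScraper are made up for this illustration. Notice that what comes back is just a pile of strings: nothing tells the program which piece is a date, a place, or a title.

```python
from html.parser import HTMLParser


class TextScraper(HTMLParser):
    """Collects every run of visible text, with no idea what any of it means."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical page fragment, invented for this example.
page = "<p><a href='http://example.com/'>example.com</a> Team lunch: 20080414, Chicago</p>"

scraper = TextScraper()
scraper.feed(page)
scraper.close()
print(scraper.chunks)
```

To do anything useful with those strings, the programmer has to hard-code guesses about where things sit on the page, and those guesses break as soon as the page layout changes.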

The traditional means of creating Web pages so that they display nicely in Web browsers doesn’t do much of anything to help software understand their content (or context of that content). There is simply no meaning to the data.

Microformats are intended to change all of that. They do so by enabling the attachment of semantics (semantic data) to online content as we currently know it (or think of it).

By using Microformats, data can be indexed & stored, searched and cross-referenced allowing information from many places about many things to be reused and recycled.
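
As a rough sketch of what that cross-referencing could look like once the data has been pulled out, the Python below indexes a few records by city and date. The records, site names and field names are all hypothetical, invented only to show the idea.

```python
from collections import defaultdict

# Hypothetical records: the kind of fields a program could pull from
# Microformat-tagged pages on three unrelated sites.
records = [
    {"source": "blog.example", "summary": "Write Blog Post", "date": "20080414", "city": "Chicago"},
    {"source": "events.example", "summary": "Web Standards Meetup", "date": "20080414", "city": "Chicago"},
    {"source": "venue.example", "summary": "Design Workshop", "date": "20080502", "city": "Boston"},
]

# Index every record by (city, date) so entries from different sites line up.
index = defaultdict(list)
for record in records:
    index[(record["city"], record["date"])].append(record)

# Cross-reference: everything happening in Chicago on 14 April 2008.
matches = index[("Chicago", "20080414")]
print([m["source"] for m in matches])
```

The point is that none of this is possible with raw scraped text; it only works because each value arrives already labeled.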

Using Microformats

Microformats started as a grass-roots effort, and have been defined based on common needs of those involved in the development community.

Because of this, it is no surprise that the most widespread uses of Microformats today mirror the most pervasive types of online applications and the most common types of data found in those applications.

As a result, the Microformats that we currently see implemented include those that describe:

  • Event Listings
  • Atom Feeds
  • Contact Information
  • Addresses
  • Geographic Information
  • Content Reviews
  • Resumes / CVs
  • Social Networks
  • Lists and Outlines
  • Currency
  • Species (Living Things)
  • Measurements

Mozilla’s Firefox 3 Web browser has a native implementation of Microformat handling and implements a global Microformats object and associated API that give developers an easy way to find and consume Microformats.

The out-of-the-box Microformat support in Firefox includes:

  • Addresses (adr – street or mailing address).
  • Geography (geo – geographical locations: latitude & longitude)
  • Human Contact Information (hCard – contact information for a person)
  • Events/Calendar (hCalendar – calendar appointment entries)

Firefox 3 also allows you to extend things a bit and add tags to other Microformats using one named… “tag”.

It isn’t just the Open Source community that is embracing Microformats. Community rumblings have led to almost-certain speculation that Microsoft will offer native support for Microformats within Internet Explorer 8.0 and other future software application releases.

This isn’t meant to be a technical article, so I am not going to get into the specifics of implementing Microformats. That said, a simple example will add a bit of context and make it easier to understand how Microformats can be implemented.

How About A Date?

Sounds good to me hot stuff!

Let’s use the example of a calendar appointment.

Let’s say I made an item in my calendar that reminded me that I needed to write this blog posting today. If we were to look at the underlying structure of that appointment in my calendar, it might look something like this:


BEGIN:VCALENDAR
PRODID:-//myCalendar//EN
VERSION:3.0
BEGIN:VEVENT
URL:http://davemeeker.com/
DTSTART:20080414
DTEND:20080414
SUMMARY:Write Blog Post About Microformats
LOCATION:My Office\, Chicago\, IL
END:VEVENT
END:VCALENDAR

Now, let’s look at HTML that could be displayed in a Web browser that represents the exact same information:

davemeeker.com Write Blog Post About Microformats: 20080414- 20080414,
at My Office, Chicago, IL

Finally, let’s look at the hCalendar Microformat markup behind that, which not only displays as above in the user interface, but contains the same computer-readable data as the Mac Calendar appointment file:

<div class="vevent"><a class="url" href="http://davemeeker.com/">http://davemeeker.com/</a>
<span class="summary">Write Blog Post About Microformats</span>:<span class="dtstart">
20080414</span>-<span class="dtend">20080414</span>,at <span class="location">
My Office, Chicago, IL</span></div>
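
For the curious, here is a hedged sketch of how a program might read that markup back out. It is written in Python with only the standard library, and it is not how Firefox or any particular Microformats parser works. The names HCalendarSketch, parse_vevent and to_icalendar_lines are invented for this example, and it only understands the handful of class names used above.

```python
from html.parser import HTMLParser

# Class names this sketch looks for (a small subset of hCalendar properties).
FIELDS = ("summary", "dtstart", "dtend", "location")


class HCalendarSketch(HTMLParser):
    """Pulls a few hCalendar properties out of a single vevent block."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._stack = []  # one entry per open element: a field name or None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        # The url property lives in the link's href, not in its text.
        if "url" in classes and attr_map.get("href"):
            self.event["url"] = attr_map["href"]
        self._stack.append(next((c for c in classes if c in FIELDS), None))

    def handle_endtag(self, tag):
        # Assumes well-formed markup with no void elements such as <br>.
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        field = self._stack[-1] if self._stack else None
        if field:
            self.event[field] = self.event.get(field, "") + data


def parse_vevent(markup):
    parser = HCalendarSketch()
    parser.feed(markup)
    parser.close()
    # Collapse whitespace left over from line breaks in the markup.
    return {key: " ".join(value.split()) for key, value in parser.event.items()}


def to_icalendar_lines(event):
    """Maps the extracted fields onto iCalendar-style property lines."""
    lines = ["BEGIN:VEVENT"]
    for key in ("url", "dtstart", "dtend", "summary", "location"):
        if key not in event:
            continue
        value = event[key]
        if key in ("summary", "location"):
            # iCalendar text values escape commas with a backslash
            # (this sketch ignores the other escaping rules).
            value = value.replace(",", "\\,")
        lines.append(f"{key.upper()}:{value}")
    lines.append("END:VEVENT")
    return lines


# The same event markup as in the example, written with plain quotes.
markup = """
<div class="vevent"><a class="url" href="http://davemeeker.com/">http://davemeeker.com/</a>
<span class="summary">Write Blog Post About Microformats</span>:<span class="dtstart">
20080414</span>-<span class="dtend">20080414</span>,at <span class="location">
My Office, Chicago, IL</span></div>
"""

event = parse_vevent(markup)
print(event)
print("\n".join(to_icalendar_lines(event)))
```

The class names do all the work here: the program never has to guess which bit of text is the date and which is the location, because the markup says so.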

This example is about as simple as it gets, but I hope that it opens your mind to the future possibilities of what Microformats, or a technology like Microformats, could bring to the publishing and consumption of web-based data.


Where does this go?

If Microformats were the holy grail of online data, then the chatter about the Semantic Web and other competing theories would be almost silent. There is, however, a loud argument that XML by itself makes more sense than Microformats, and that ultimately “Web” data will be radically different when new technologies related to the Semantic Web become more widely understood (and used).

These additional technologies include XML & RDF along with associated schemas, as well as OWL (Web Ontology Language), SPARQL (a query language) and business rules driven by RIF (Rule Interchange Format).

The adoption of Microformats by companies like Microsoft and Mozilla is encouraging, but as we’ve seen before… just because they put the functionality in the browser doesn’t mean it is going to be the next big thing. (Remember “channels” in IE 4 or “push content?” Ouch.)

It is ultimately the hordes of Web application developers and users who will decide whether or not Microformats secure their spot as a reliable and widely used bit of technology.
Users are thirsty for the additional functionality that Microformats could enable in online experiences, but developers have to bite the bullet first and implement Microformats prior to user adoption.

In order to do so, technology leaders, software developers, information architects and user experience professionals need to educate business owners and client stake-holders about the inevitable approach of the Semantic Web, the benefits of structuring data in this fashion, and how Microformats can be immediately leveraged to improve their customer/user experiences.

Improved user experiences mean happier, more loyal, and engaged customers, clients, partners and employees… And that’s a fact Jack.

For more information about Microformats, the Semantic Web and groovy things like that, check out the following:

  • Microformats.org – The “official” site of the Microformat community.
  • spacenamespace – An interesting site about annotating space with metadata, building semantic models of places, and exchanging geospatial data in RDF.
  • Magpie – a plugin for web browsers and application development framework for emerging Semantic Web tools.
  • OTN Semantic Web Beta from Oracle – A proof-of-concept Web application that demonstrates the use of RDF-based technology as the basis for a rich user experience that relies on dynamic relational navigation.
  • Alex Faaborg @ Mozilla – Alex has a great 4-part series on Microformats, UI issues, and implementation in Firefox 3.