Before moving on to the different ways of adding richer meaning to your pages, let’s pause to ask the question, “Why bother with semantics at all?” Is something intrinsically wrong with marking up a page using mostly div elements, as in the following code block?
<div class="first">This is the heading.</div>
<div class="main"><b>This is the first sentence.</b><br>This is the second sentence.</div>
Divya Manian addressed this in a polemical article, “Our Pointless Pursuit of Semantic Value,” in which she argues that putting too much emphasis on semantic markup is a waste of time for most people:
Mark-up structures content, but your choice of tags matters a lot less than we’ve been taught. …
I would say, however, that there are two good reasons for using correct semantic elements. The first and most prosaic is that you’re working to a de facto standard and writing code with good maintainability. You know that if you use semantic elements, your colleagues or eventual successor will be able to work on your code without having to learn your naming scheme. And the reverse is also true: If you take over someone else’s code, you’ll know exactly what’s going on in it if the author has coded to standards.
A more recondite reason is that using semantic elements gives your content increased aboutness. Simply put, aboutness is a measure of how clearly content conveys its meaning: the more explicit the signals describing what something is about, the greater its aboutness.
As a simple illustration of that principle, imagine you have a web page that contains W.H. Auden’s poem “Funeral Blues”:
He was my North, my South, my East and West,
My working week and my Sunday rest,
My noon, my midnight, my talk, my song;
I thought that love would last forever: I was wrong.
Although we know that the poem is about death, the word itself doesn’t appear in the poem. How could a search engine that indexed the page know what it was about and return it in the search results for that topic? One signal the search engine looks at is the text of the links to that page: a link with the text “read more” provides no context, whereas a link with the text “W.H. Auden’s poem about death” provides some aboutness.
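To make the difference concrete, here is a minimal sketch of the two kinds of link. The filename funeral-blues.html is a hypothetical placeholder of mine, not a real address; only the link text matters for the comparison.

```html
<!-- Hypothetical target page: funeral-blues.html is an assumed filename. -->

<!-- The link text says nothing about the page it points to. -->
<a href="funeral-blues.html">read more</a>

<!-- The link text describes the subject of the page it points to. -->
<a href="funeral-blues.html">W.H. Auden’s poem about death</a>
```

Both links go to the same page, but only the second tells a machine (or a person skimming a list of links) what to expect at the other end.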
Using correct semantic elements provides the same benefit. If all of the content on your page is marked up with divs, the content has no context; if you mark up your page semantically, you give the content context:
<h1>This is the heading.</h1>
<p>This is the first sentence.</p>
<p>This is the second sentence.</p>
Now you clearly know which text is the heading and what the main body content is. You’ve given the content some aboutness.
As well as using semantic elements correctly to mark up your content, you can increase the meaning of your documents for machines rather than users in a number of ways; the result is commonly known as structured data. You can use existing attributes and elements in defined patterns (microformats) or extend HTML with new attributes (RDFa and microdata), and I’ll introduce them all briefly right now.