HTML5 is an iteration of HTML4.01 with some new features, a few deprecated or removed features, and some modified behaviors of existing features. Its aim is to standardize the many common hacks and design patterns that developers have used throughout the years and to expand in order to meet the demands of the modern Web, which is as much (if not more) about applications as it is about documents; indeed, the original proposal for what became HTML5 was called Web Applications 1.0.
New features in HTML5 include elements that structure documents in ways that convey meaning and improve accessibility; I cover these in Chapter 2. HTML5 also has a whole range of new form functionality and UI controls that make it easier to build applications, which I cover in Chapter 8. And HTML5 includes what many people still associate with it: native video that plays without plug-ins, which is covered in Chapter 9.
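As a rough preview of those three areas, the sketch below combines a few of the new structural elements, a form control with built-in validation, and native video. The file name movie.mp4, the /subscribe URL, and all of the text content are placeholders invented for illustration; each feature is covered properly in its own chapter.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>HTML5 feature preview</title>
</head>
<body>
  <!-- Structural elements describe the role of each region -->
  <header>
    <h1>Example site</h1>
    <nav><a href="/">Home</a></nav>
  </header>
  <article>
    <h2>Sign up</h2>
    <!-- New input type and validation attribute; /subscribe is a placeholder -->
    <form action="/subscribe" method="post">
      <label>Email <input type="email" name="email" required></label>
      <button>Subscribe</button>
    </form>
    <!-- Native video; movie.mp4 is a placeholder file -->
    <video src="movie.mp4" controls>
      Your browser does not support native video.
    </video>
  </article>
  <footer><p>Example footer</p></footer>
</body>
</html>
```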
Two main groups are working on HTML5, and their roles and responsibilities are broadly as follows. The WHATWG (you don’t need to know what that acronym means) is a consortium of browser makers and “interested parties.” Through its main spec editor, Ian Hickson, it creates a “living spec” of HTML: basically a versionless specification that constantly incorporates new features and updates existing ones. The W3C (World Wide Web Consortium), the Web’s standards body, takes snapshots of this spec to create numbered versions, ensuring compatibility of implementation by the browser vendors.
The situation is, in fact, a bit more complex than that and plenty of political wrangling is going on, but that’s of interest only to standards wonks and shouldn’t make any practical difference to you.
The W3C has proposed, although not confirmed as I write this, that HTML5 (the W3C snapshot) be brought to Recommendation status—that is, “done”—by 2014, with HTML5.1 to follow in 2016. HTML5 would also be broken into separate modules rather than a single monolithic spec, so work can progress on different aspects without delaying the whole. These dates don’t really matter to you, however; all you need to know is when HTML5 is in browsers and ready to use.
As someone with basic working knowledge of HTML, you’re familiar with fundamental page markup. But things have changed a little bit in HTML5—not much, but enough to mention. The following code block shows the basic template that I’ll use for all of the examples in this book (you can also see this in the example file template.html):
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title></title>
</head>
<body></body>
</html>
Most of it should be familiar to you, but I will discuss two points of interest. First is the Doctype. This is a remnant from the days when you had to tell the browser which type of document you were writing: strict HTML, transitional HTML, XHTML1.1, and so on. In HTML5, that’s no longer necessary—there is only one flavor of HTML—so the Doctype declaration really isn’t needed any more. In theory, that is.
Modern browsers tend to have three rendering modes: quirks mode emulates the nonstandard rendering of Internet Explorer 5, which is required for compatibility with legacy pages on the Web; standards mode is for modern, standards-compliant behavior; and almost standards mode is standards mode with a few quirks.
To know which mode to use, the browser looks to the Doctype. You always want to use standards mode, so the Doctype in HTML5 is the shortest possible that triggers standards mode:
<!DOCTYPE html>
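One way to check which mode a page ended up in is the document.compatMode property, which browsers expose to scripts. The page below is a minimal sketch (the id "mode" is an arbitrary name chosen for this example). With the HTML5 Doctype in place it should report CSS1Compat, which covers both standards mode and almost standards mode; if you delete the first line it should report BackCompat, meaning quirks mode.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Rendering mode check</title>
</head>
<body>
  <p>Rendering mode: <span id="mode"></span></p>
  <script>
    // "CSS1Compat" = standards (or almost standards) mode,
    // "BackCompat" = quirks mode
    document.getElementById('mode').textContent = document.compatMode;
  </script>
</body>
</html>
```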
The second point of interest, and the only other change to the standard HTML5 template, is the meta tag, which declares the character encoding used to interpret the text on the page. UTF-8, which can represent every Unicode character, is the default used across the Web, so this is what you’ll use in most cases. The meta tag uses the charset attribute:
<meta charset="utf-8">
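For comparison, the sketch below shows the longer form in which this declaration was usually written in HTML4.01 and XHTML documents (inside a comment), followed by the HTML5 form in context. The title and paragraph text are made up for this example. The declaration sits ahead of the title so that the browser knows the encoding before it meets any non-ASCII characters.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Older, longer equivalent, still understood by browsers:
       <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> -->
  <meta charset="utf-8">
  <title>Café menu</title>
</head>
<body>
  <p>Crème brûlée: €6</p>
</body>
</html>
```

Keep in mind that the tag only describes the file; the file itself still has to be saved as UTF-8 by your text editor for the accented characters to display correctly.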
That’s really it. If a client ever asks you to “make their website HTML5,” you can update those two tags and charge them a fortune for it. (Please don’t; that was just a joke.)
I could have included plenty of other options, which I’ve left out for the sake of clarity and simplicity. The popular HTML5 Boilerplate website provides a comprehensive template, so look through the documentation to see what the template does—but please keep in mind it should be a starting point, not used verbatim.
In addition to the changes to the core template, HTML5 has one or two new best practices that you should consider implementing. HTML5 has been written to accommodate the many different ways developers write code, so these shouldn’t be considered hard-and-fast rules, but in my opinion, they’ll make your code easier to write and maintain.
The first best practice is that you are no longer required to use the type attribute when calling the most common external resources. Using HTML4.01 or XHTML, you had to declare a type for each link, script, or style tag:
<link href="foo.css" rel="stylesheet" type="text/css">
<script src="foo.js" type="text/javascript"></script>
But when working on the Web, CSS and JavaScript are the de facto default resource types used with these tags, so writing them out every time is a little redundant. Therefore, you can now drop them, making your code a little cleaner while still being understood perfectly well by the browser:
<link href="foo.css" rel="stylesheet">
<script src="foo.js"></script>
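The same applies to the style tag and to inline scripts. Here is a minimal sketch, with the CSS rule, the id "greeting", and the script body all invented for illustration:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>No type attributes</title>
  <!-- Treated as CSS without type="text/css" -->
  <style>
    p { color: navy; }
  </style>
</head>
<body>
  <p id="greeting"></p>
  <!-- Treated as JavaScript without type="text/javascript" -->
  <script>
    document.getElementById('greeting').textContent =
      'Hello from a script with no type attribute';
  </script>
</body>
</html>
```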
The only time you need to use the type attribute is when you’re not using default CSS or JavaScript; for example, some releases of Firefox have experimental implementations of recent versions of JavaScript, and for safety’s sake they require that you include a flag on the type attribute if you want to use them:
<script src="foo.js" type="application/javascript;version=1.8"></script>
HTML5 is also very forgiving of syntax. Whether you prefer to use all lowercase characters, quote your attribute values, or close empty elements, HTML5 is happy to parse and understand your markup. That being the case, all of these are equivalent:
<img src=foo.png>
<img src=foo.png />
<IMG SRC="foo.png"/>
Attribute values require quotation marks when they contain spaces, as in a list of class names, or certain other special characters (quotation marks, =, <, >, or `).
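To make the quoting rule concrete, here is a short sketch; the file name, class names, and text are placeholders. The first two elements are fine without quotation marks, while the last two need them because their values contain spaces.

```html
<!-- Single values with no spaces or special characters: quotes optional -->
<img src=foo.png alt=Logo>
<p class=intro>Unquoted single class name.</p>

<!-- Values containing spaces: quotes required -->
<p class="intro lead">Two class names in one attribute.</p>
<img src="foo.png" alt="A red fox asleep in the snow">
```

If the quotation marks were left off the third element, the browser would read class=intro lead as a class of intro followed by a separate, meaningless attribute named lead.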
Some attributes, known as Boolean attributes, have only true or false values: the presence of the attribute means true and its absence means false, so you don’t need to supply a value. The exception is an XML-like syntax where values are required, in which case you use the name of the attribute itself as the value. This means both of these are the same:
<input type="checkbox" checked>
<input type="checkbox" checked="checked">
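One consequence worth spelling out is that, for a Boolean attribute, it is the presence of the attribute that counts, not the text of its value. In this sketch (the labels are invented for illustration), the first three checkboxes all render checked, including the one whose value reads “false”; only omitting the attribute leaves the box unchecked.

```html
<label><input type="checkbox" checked> Bare attribute: checked</label>
<label><input type="checkbox" checked="checked"> XML-style value: checked</label>
<!-- Not valid HTML, but browsers still treat the box as checked -->
<label><input type="checkbox" checked="false"> Value "false": still checked</label>
<label><input type="checkbox"> Attribute omitted: unchecked</label>
```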
My own preference is to use all lowercase, all quoted, but not to close empty elements:
<img src="foo.png">
This is the style I use throughout the book, as I find it neater and easier to work with, and the text editor I use has syntax highlighting, which makes looking through the code nice and clear. You can use whichever system you want, but be consistent to help with maintainability.