“Everything changes and nothing stands still.”
Heraclitus
Whether you are in charge of designing, implementing, or just maintaining a web API, at some point you start to think about how to handle change over time. The common word used in the API world is versioning. Versioning is so deeply ingrained in API design that it is considered part of the common practice (discussed in “Web API Common Practice”) for APIs. Many API designers include versioning information (e.g., numbers in the URL, HTTP headers, or response body) without much thought about the assumptions behind that practice.
And there are lots of assumptions behind the common practices of versioning APIs. Chief among them is the assumption that any change that is made to the API should be explicitly noted in the API itself (usually in the URL). Another common assumption is that all changes are breaking changes—that failing to explicitly signal changes means someone, somewhere will experience a fatal error. One more assumption worth mentioning here is that it is not possible to make meaningful changes to the functionality of web APIs unless you make a breaking change.
Finally, there is enough uncertainty associated with the life cycle of web APIs that many API developers decide to simply hedge their bets by adding versioning information in the API design just in case it might be needed at some future point. This is a kind of Pascal’s Wager for API design.
Blaise Pascal, noted seventeenth-century philosopher, mathematician, and physicist, is credited with creating Pascal’s Wager. A simplified version of his argument is that when faced with a decision that cannot be made using logic alone, “a Game is being played… where heads or tails will turn up.” In his case, he was illustrating that, since we cannot know whether God exists, we should bet that he does, since that is the bet with the better outcome.
Although his argument was more nuanced—and others have made similar observations about the nature of uncertainty—Pascal’s Wager has become a meme that essentially states “When in doubt, hedge your bets.”
OK, with this background, let’s look into the assumptions behind the versioning argument and at some evidence of how web-related technology handles change over time.
The idea that handling change over time on the Web means employing the explicit versioning technique flies in the face of some very important evidence about how things have worked on the Internet (not just the Web) over the last several decades. We’ll take some time here to look at just a few examples.
The examples we’ll explore here are (1) the foundational transport-level protocols (TCP/IP), (2) the most common application-level protocol (HTTP), and (3) the most common markup language for the WWW (HTML). They have each undergone important modification over the years and all without causing any major breakage for existing implementations.
Each of our examples is unique, but they all have a common design element that can help us understand how to handle change over time in our own web applications.
The TCP/IP protocol is an essential part of how the Internet works today. In fact, TCP/IP is actually two related protocols: the Transmission Control Protocol (TCP) and the Internet Protocol (IP). Together they are sometimes referred to as the Internet Protocol Suite. Computer scientist Alan Kay called TCP/IP “a kind of universal DNA of [the Internet].” Kay also has pointed out that the Internet has “never been stopped since it was first turned on in September 1969.” That means that this set of protocols has been working 24 hours a day, seven days a week for over forty years without a “restart.” That’s pretty amazing.
Kay’s point is that the source code for TCP/IP has been changed and improved over all these years without the need to shut down the Internet. There are a number of reasons for this and one of the key elements is that it was designed for this very situation—to be able to change over time without shutting down. Regarding this feature of the Internet Protocol Suite, Kay has been quoted as saying that the two people credited with designing TCP (Bob Kahn) and IP (Vint Cerf) “knew what they were doing.”
One bit of evidence of this knowing what to do can be found in a short section of the TCP specification: section 2.10 with the title The Robustness Principle. The full text of this section is:
“TCP implementations will follow a general principle of robustness: be conservative in what you do, be liberal in what you accept from others.”
RFC793
The authors of this specification understood that it is important to design a system where the odds are tilted in favor of completing message delivery successfully. To do that, implementors are told to be careful to craft valid messages to send. Implementors are also encouraged to do their best to accept incoming messages—even if they are not quite perfect in their format and delivery. When both things are happening in a system, the odds of messages being accepted and processed improve. TCP/IP works, in some part, because this principle is baked into the specification.
The Robustness Principle is often referred to as Postel’s Law because Jon Postel was the editor for the RFC that described the TCP protocol.
One way to implement Postel’s Law when building hypermedia-style client applications is to pass service responses through a routine that converts the incoming message into an internal representation (usually an object graph). This conversion process should be implemented in a way that allows successful processing even when there are flaws in the response such as missing default values that the converter can fill in or simple structural errors, such as missing closing tags, etc. Also, when forming outbound requests—especially requests that will send an HTTP body (e.g., POST, PUT, and PATCH)—it is a good idea to run the composed body through a strict validation routine that will fix formatting errors in any outbound messages.
Here is a bit of pseudo-code that illustrates how you can implement Postel’s Law in a client application:
// handling incoming messages
httpResponse = getResponse(url);
internalObjectModel = permissiveConverter(httpResponse.body);
...

// handling outgoing messages
httpRequest.body = strictValidator(internalObjectModel);
sendRequest(httpRequest);
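To make that pseudo-code a bit more concrete, here is a minimal JavaScript sketch of the two routines. The permissiveConverter and strictValidator names, the default values, and the links/data/actions layout are all illustrative assumptions, not part of any particular library:

// a permissive converter: accept imperfect responses and fill in sensible defaults
function permissiveConverter(body) {
  var message;
  try {
    message = JSON.parse(body || "{}");
  }
  catch (ex) {
    message = {}; // an unparseable body becomes an empty message instead of a fatal error
  }
  // supply defaults for elements the client expects but the response may omit
  message.links = message.links || [];
  message.data = message.data || [];
  message.actions = message.actions || [];
  return message;
}

// a strict validator: only send bodies we know are well formed
function strictValidator(model) {
  var body = {
    "links" : model.links || [],
    "data" : model.data || [],
    "actions" : model.actions || []
  };
  return JSON.stringify(body); // always emit valid, predictable JSON
}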
So TCP teaches us to apply the Robustness Principle to our API implementations. When we do that, we have an improved likelihood that messages sent between parties will be accepted and processed.
The HTTP protocol has been around a long time. It was running in its earliest form in 1990 at CERN labs and has gone through several significant changes in the last 25 years. Some of the earliest editions of the HTTP specification made specific reference to the need for what was then called “client tolerance” in order to make sure that client applications would continue to function even when the responses from web servers were not strictly valid. These were called “deviant servers” in a special note linked to the 1992 draft of the HTTP specs.
A key principle used in the early HTTP specifications is the MUST IGNORE directive. In its basic form, it states that any element of a response that the receiver does not understand must be ignored without halting the further processing of that response.
The final HTTP 1.0 documentation (RFC1945) has several places where this principle is documented. For example, in the section on HTTP headers, it reads:
Unrecognized header fields should be ignored by the recipient and forwarded by proxies.
Note that in this quote, the MUST IGNORE principle is extended to include instructions for proxies to forward the unrecognized headers to the next party. Not only should the receiver not reject messages with unknown headers but, in the case of proxy servers, those unknown headers are to be included in any forwarded messages. The HTTP 1.0 specification (RFC1945) contains eight separate examples of the MUST IGNORE principle. The HTTP 1.1 specification (RFC2616) has more than 30 examples.
Throughout this section of the chapter, I use the phrase MUST IGNORE when referring to the principle in the HTTP specification documents. This name is also used by David Orchard in his blog article “Versioning XML Vocabularies.” While the HTTP specs use the word ignore many times, not all uses are prefaced by must. In fact, some references to the ignore directive are qualified by the words may or should. The name MUST IGNORE, however, is commonly used for the general principle of ignoring what you don’t understand without halting processing.
Supporting the MUST IGNORE principle for web clients means that incoming messages are not rejected when they contain elements that are not understood by the client application. The easiest way to achieve that is to code the clients to simply look for and process the elements in the message that they know.
For example, a client may be coded to know that every incoming message contains three root elements: links, data, and actions. In a JSON-based media type, the response body of this kind of message might look like this:
{
"links" : [...],
"data" : [...],
"actions" : [...]
}
Some pseudo-code to process these messages might look like this:
WITH message DO
  PROCESS message.links
  PROCESS message.data
  PROCESS message.actions
END
However, that same client application might receive a response body that contains an additional root-level element named extensions:
{
"links" : [...],
"data" : [...],
"actions" : [...],
"extensions" : [...]
}
In a client that honors the MUST IGNORE principle (as implemented in the preceding examples), this will not be a problem because the client will simply ignore the unknown element and continue to process the message as if the extensions element does not exist. This is an example of MUST IGNORE at work.
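In JavaScript, one way to honor MUST IGNORE is to drive processing from a list of elements the client already understands and simply skip anything else. This is a minimal sketch under that assumption; the three handler functions are hypothetical stand-ins for the client’s real rendering code:

// hypothetical handlers for the root elements this client understands
function processLinks(links) { /* render navigation links */ }
function processData(data) { /* render data elements */ }
function processActions(actions) { /* render forms and actions */ }

// process only the elements this client knows about; skip everything else
function processMessage(message) {
  var knownHandlers = {
    "links" : processLinks,
    "data" : processData,
    "actions" : processActions
  };
  for (var name in knownHandlers) {
    if (message[name]) {
      knownHandlers[name](message[name]);
    }
  }
  // unrecognized root elements (e.g., "extensions") are simply ignored
}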
So HTTP’s MUST IGNORE principle shows us that clients should be able to safely process a message even when it contains portions they do not understand. This is similar to Postel’s Law from the TCP specification. Both rules are based on the assumption that some percentage of incoming messages will contain elements that the receiver has not been programmed to understand. When that happens, the processing should not simply stop. Instead, processing should continue on as if the unrecognized element had never appeared at all.
HTML is another example of a design and implementation approach that accounts for change over time. Like HTTP, the HTML media type has been around since the early 1990s. And, like HTTP, HTML has undergone quite a few changes over that time—from Tim Berners-Lee’s initial “HTML Tags” document in 1990, later known as HTML 1.0, on up to the current HTML5. And those many changes have been guided by the principle of backward compatibility. Every attempt has been made to only make changes to the media type design that will not cause HTML browsers to halt or crash when attempting to process an HTML document.
The earliest known HTML document is from November 1990 and is still available on the Web today. It was crafted two weeks before Tim Berners-Lee and his CERN colleague Robert Cailliau attended ECHT ’90—the European HyperText Convention—in Paris. The entire HTML document looks like this:
<title>Hypertext Links</title>
<h1>Links and Anchors</h1>
A link is the connection between one piece of
<a href=WhatIs.html>hypertext</a> and another.
The fact that this page still renders in browsers today—25 years later—is a great example of how both message design (HTML) and client implementation principles (web browsers) can combine to support successful web interactions over decades.
From almost the very beginning, HTML was designed with both Postel’s Law and HTTP’s MUST IGNORE in mind. Berners-Lee makes this clear in one of the early working documents for HTML:
“The HTML parser will ignore tags which it does not understand, and will ignore attributes which it does not understand…”
What’s interesting here is that this type of guidance shows that the message designers (those defining HTML) are also providing specific guidance to client implementors (those coding HTML parsers). This principle of advising implementors on how to use and process incoming messages is an important feature of Internet standards in general—so important that there is an IETF document (RFC2119) that establishes just how specifications should pass this advice to implementors. This document defines a set of special words for giving directive advice. They are MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL. Other standards bodies have adopted a similar approach to passing on guidance to implementors.
So, after reviewing lessons from TCP, HTTP, and HTML, we can come up with some general guidance for designing and maintaining APIs that need to support change over time.
One more way to think about the act of versioning is that it is really creating a new fork of your app or service. When you create a new version, you are creating a related but likely noncompatible copy of your app—one that now has its own lifeline. And each time you do this forking of your app, you get yet another noncompatible copy. If you’re not careful, you’ll populate the universe with lots of similar apps that you need to maintain and you’ll need to help developers keep track of which copy they are using and make sure they don’t mistakenly try to get these non-compatible copies to talk to each other.
Committing to a backward-compatible strategy for adding functionality and fixing bugs gets rid of the whole job of naming (versioning) and tracking non-compatible copies of your app or service.
Dealing with change over time is best approached with a set of principles—a kind of change aesthetic. There is no one single action to take or design feature to include (or exclude). Another key thing to keep in mind is that change will always occur. You can certainly use energy and effort to stave off change (e.g., “I know you want that feature in the API but we are not going to be able to make that change this year.”). You can even work around change with a hack (e.g., “Well, we don’t support that but you can get the same effect if you first write a temp record and then filter based on the change-date of the temp file.”). There are other ways to avoid facing change, but almost all long-lived and oft-used APIs will experience pressure to change, and designing-in the ability to handle select types of change can reduce the stress, cost, and danger of changing APIs over time.
So, for those who are no longer attempting to stave off inevitable changes to your APIs, here is some general guidance for web API designers, service providers, and consumer clients. I’ve used all these options in the past and found them helpful in many cases.
For those in charge of designing APIs and message formats, it is important to understand enough about the general problem area to get a sense of what types of changes are likely to be needed over time. It is also important to design interfaces that lead implementors down the “happy path” of creating both API services and API clients that are able to handle common changes over time.
Taking into account the likely changes over time is tricky but important. We saw this kind of thinking when we reviewed the way HTML is documented and designed (see “HTML’s Backward Compatibility”). Paul Clements, one of the authors of the book Software Architecture in Practice (Addison-Wesley Professional) claims that those who work in software architecture have a responsibility to deal with change as a fundamental aspect of their design:
The best software architecture ‘knows’ what changes often and makes that easy.
With this in mind, here are three valuable principles for those tasked with designing web APIs.
Over time, object models are bound to change—and these models are likely to change often for new services. Trying to get all your service consumers to learn and track all your object model changes is not a good idea. And, even if you wanted all API consumers to keep up with your team’s model changes, that means your feature velocity will be limited to that of the slowest API consumer in your ecosystem. This can be especially problematic when your API consumers are customers and not fellow colleagues within the same company.
Instead of exposing object models in your APIs, promise standard message formats (e.g., HTML, Atom, HAL, Cj, Siren, etc.). These formats don’t require consumers to understand your service’s internal object model. That means you’re free to modify your internal model without breaking your promise to API consumers. This also means providers will need to handle the task of translating internal domain data into external message formats, but we covered that already in Chapter 3, The Representor Pattern.
Well-designed formats should allow API designers to safely introduce semantic changes (e.g., data carried within the message model), and well-implemented API consumers will be able to parse/render these content changes without the need for code updates. These same formats might support structural changes to messages (e.g., format extensions) in order to safely introduce changes that can be ignored by clients that do not understand them.
Your API should not bake static URLs into the design. URLs are likely to change over time, especially in cases where your initial service is running in a test bed or small online community to start. Tricking API consumers into baking explicit URLs into their source code increases the likelihood that their code will become obsolete, and that forces consumers into unwanted recode-test-deploy cycles if and when your URLs change.
Instead, your API design should promise to support a named operation (shoppingCartCheckOut, computeTax, findCustomer) instead of promising exact addresses for those operations (e.g., http://api.example.org/findCustomer). Documenting (and promising) operations by name is a much more stable and maintainable design feature.
If you want new operations to be ignored by existing clients, make it part of the structure of the message (e.g., <findCustomer … />). However, when you want the operation to be automatically parsed and/or rendered, favor formats that allow you to include the operation identifiers as part of the message’s semantic content (e.g., <operation name="findCustomer" … />). Good candidates for semantic identifiers are properties such as id, name, class, and rel.
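Here is a small hypothetical sketch of what this buys the client: the operation is located by its promised name (here exposed as a rel value) instead of by a hard-coded address. The response shape and the findCustomer identifier are illustrative, not a published API:

// a response that identifies operations by name rather than by fixed address
var response = {
  "links" : [
    { "rel" : "findCustomer", "href" : "http://api.example.org/q1w2e3r4" },
    { "rel" : "computeTax",   "href" : "http://api.example.org/t5y6u7i8" }
  ]
};

// resolve an operation by its stable name; the URL behind it is free to change over time
function resolve(message, name) {
  var found = null;
  (message.links || []).forEach(function (link) {
    if (link.rel === name) {
      found = link.href;
    }
  });
  return found;
}

var customerUrl = resolve(response, "findCustomer"); // no URL baked into the client code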
The notion of canonical models has been around a long time—especially in large enterprise IT shops. The hope is that, with enough hard work, a single grand model of the company’s business domain will be completely defined and properly described so that everyone (from the business analyst on through to the front-line developer) will have a complete picture of the entire company’s domain data. But this never works out.
The two things conspiring against canonical models are (1) scope and (2) time. As the scope of the problem grows (e.g., the company expands, product offerings increase, etc.), the model becomes unwieldy. And as time goes on, even simple models experience modifications that complicate a single-model view of the world. The good news is there is another way to solve this problem: vocabularies.
Once you move to promising formats instead of object models (see “Promise media types, not objects”), the work of providing shared understanding of your API’s domain data and actions needs to be kept somewhere else. A great place for this is in a shared vocabulary. Eric Evans refers to this using the name ubiquitous language—a common, rigorous language shared between domain experts and system implementors. By focusing on a shared vocabulary, designers can constantly probe domain experts for clarification, and developers can implement features and share data with a high degree of confidence.
Eric Evans’s book Domain-Driven Design (Addison-Wesley Professional) offers a number of valuable lessons in scoping problem domains. While the book was written primarily for those who favor OOP code and XML-based messages over local networks, there is still quite a bit of value in his DDD approach to building up a shared language for a domain and marking the boundaries of components (services) using what he calls bounded context.
Another important reason to rely on vocabularies is that you can define consistent binding rules between the vocabulary terms and the output formats used in your API. For example, you might document that data element names in the vocabulary will always appear in the class property of HTML responses and the name property of Collection+JSON responses, and so forth. This also helps API providers and consumers write general-use code that will work even when new vocabulary terms are added over time.
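As a hypothetical sketch of how a consumer might exploit such a binding rule, the same lookup routine can pull a vocabulary term (invoice-status here, an invented example) out of an HTML response via the class property and out of a Collection+JSON response via the name property:

// HTML binding rule: vocabulary terms appear in the class attribute
function fromHtml(document, term) {
  var node = document.querySelector("." + term);
  return node ? node.textContent : undefined;
}

// Collection+JSON binding rule: vocabulary terms appear in the name property
function fromCj(collection, term) {
  var items = collection.items || [];
  for (var i = 0; i < items.length; i++) {
    var data = items[i].data || [];
    for (var j = 0; j < data.length; j++) {
      if (data[j].name === term) {
        return data[j].value;
      }
    }
  }
  return undefined;
}

// e.g., fromHtml(dom, "invoice-status") and fromCj(body.collection, "invoice-status")
// both return the same domain value, whatever the response format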
So, when designing APIs you should:
Promise media types, not objects
Document link identifiers, not addresses
Publish vocabularies, not models
Like API designers, service implementors have a responsibility to create a software implementation that can account for change over time in an effective way. This means making sure local service changes can be made without any unnecessary complexity or instability. It also means you need to ensure that changes made to the service over time are not likely to break existing client implementations—even implementations that the service knows nothing about!
Maintaining backward compatibility is the primary principle for service implementors when supporting change over time. Essentially, this constrains the types of changes a service can make to those which will not invalidate existing implementations. We saw this principle in play, for example, when reviewing HTTP’s design principles (see “HTTP’s Must Ignore”).
With this in mind, here are three principles I’ve used to help support change over time while reducing the likelihood of breaking existing implementations.
One of the most important aspects of maintaining a backward-compatible service implementation is don’t take things away such as operations or data elements. In API projects I work on, I make this an explicit promise. Once API consumers know that you will not take things away, the value of your API will go up. That’s because API consumers will be assured that, even when you add new data elements and actions to the API, the existing API consumer code will still be valid.
One big reason to make this promise is that it allows API consumer teams to move at their own pace. They don’t need to stop current sprints or feature work in order to deal with potentially breaking changes from some service API they are using in their application. That also means the services don’t need to wait for the slowest API consumer team before they introduce new elements and actions in the API. This loose-coupling in the API update process can result in overall faster development processes since it reduces potential blocking scenarios.
So, what does this backward-compatibility promise look like? Here’s an approach I learned from Jason Rudolph at GitHub a few years ago, an example of what they call evolutionary design for APIs. He says:
When people are building on top of our API, we’re really asking them to trust us with the time they’re investing in building their applications. And to earn that trust, we can’t make changes [to the API] that would cause their code to break.
Here’s an example of evolutionary design in action. They supported an API response that returned the current status of an account’s request rate limit. It looked like this:
*** REQUEST ***
GET rate_limit
Accept: application/vnd.github+json
...
*** RESPONSE ***
200 OK HTTP/1.1
Content-Type: application/vnd.github+json
...
{
  "rate" : {
    "limit" : 5000,
    "remaining" : 4992,
    "reset" : 1379363338
  }
}
Over time, the GitHub team learned that this response was more coarse-grained than was needed. It turns out they wanted to separate search-related rate limits from typical core API calls. So the new design would look like this:
*** REQUEST ***
GET rate_limit
Accept: application/vnd.github+json
...
*** RESPONSE ***
200 OK HTTP/1.1
Content-Type: application/vnd.github+json
...
{
  "resources" : {
    "core" : {
      "limit" : 5000,
      "remaining" : 4992,
      "reset" : 1379363338
    },
    "search" : {
      "limit" : 20,
      "remaining" : 19,
      "reset" : 1379361617
    }
  }
}
Now, they had a dilemma on their hands. How could they make this important change to the interface without breaking existing implementations? Their solution was, I think, quite smart. Rather than changing the response body, they extended it. The new response for the rate_limit request now looks like this:
*** REQUEST ***
GET rate_limit
Accept: application/vnd.github+json
...
*** RESPONSE ***
200 OK HTTP/1.1
Content-Type: application/vnd.github+json
...
{
  "rate" : {
    "limit" : 5000,
    "remaining" : 4992,
    "reset" : 1379363338
  },
  "resources" : {
    "core" : {
      "limit" : 5000,
      "remaining" : 4992,
      "reset" : 1379363338
    },
    "search" : {
      "limit" : 20,
      "remaining" : 19,
      "reset" : 1379361617
    }
  }
}
Notice that GitHub applied a structural change to the response. The original structure ("rate") and the new structure ("resources") both appear in the response message. This results in a change that can be safely ignored by clients that don’t understand the new, detailed structural element ("resources"). This is just one example of implementing backward compatibility by not taking things away. The same general approach can be used for links and forms in a response, too.
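From the client’s point of view, a response extended this way remains easy to handle. Here is a small hypothetical sketch of a consumer routine that prefers the newer, fine-grained resources section when it is present and falls back to the original rate section when it is not:

// read the core rate-limit details from either the new or the old structure
function readCoreRateLimit(response) {
  if (response.resources && response.resources.core) {
    return response.resources.core; // newer, fine-grained structure
  }
  if (response.rate) {
    return response.rate; // original structure still works
  }
  return { "limit" : 0, "remaining" : 0, "reset" : 0 }; // defensive default
}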
Another important backward-compatibility principle for service providers is don’t change the meaning of things. That means that, once you publish a link or form with an identifier that tells API consumers what is returned (e.g., <a href="…" rel="users" /> returns a list of users), you should not later use that same identifier to return something completely different (e.g., <a href="…" rel="users" /> later only returns a list of inactive users). Consistency of what the link identifier and/or data element represents is very important for maintaining backward compatibility over time.
In cases where you want to represent some new functionality to the API, it is much better to make a semantic change by adding the new feature. And you should do this without removing any existing functionality. To use the preceding example, if you want to add the ability to return a list of inactive users, it is better to introduce an additional link (and identifier) while maintaining the existing one:
*** REQUEST ***
GET /user-actions HTTP/1.1
Accept: application/vnd.hal+json
...
*** RESPONSE ***
200 OK HTTP/1.1
Content-Type: application/vnd.hal+json
...
{
"_links" : {
"users" : {"href" : "/user-list"},
"inactive" : {"href" : "/user-list?status=inactive"}
}
}
...
In cases where the preceding response is used to create a human-driven UI, both the links will appear on the screen and the person running the app can decide which link to select. In the case of a service-only interface (e.g., some middleware that is tasked with collecting a list of users and processing it in some unique way), the added semantic information (e.g., the inactive link) will not be “known” to existing apps and will be safely ignored. In both cases, this maintains backward compatibility and does not break existing implementations.
Another important change over time principle for service implementors is to make sure all new things are optional. This especially applies to new arguments (e.g., filters or update values)—they cannot be treated as required elements. Also, any new functionality or workflow steps (e.g., you introduce a new workflow step between login and checkout) cannot be required in order to complete the process.
One example of this is similar to the GitHub case just described (see “Don’t take things away”). It is possible that, over time, you’ll find that some new filters are needed when making requests for large lists of data. You might even want to introduce a default page-size to limit the load time of a resource and speed up responsiveness in your API. Here’s how a filter form looks before the introduction of the page-size argument:
*** REQUEST ***
GET /search-form HTTP/1.1
Accept: application/vnd.collection+json
...
*** RESPONSE ***
200 OK HTTP/1.1
Content-Type: application/vnd.collection+json
...
{
  "collection" : {
    "queries" : [
      {
        "rel" : "search",
        "href" : "/search-results",
        "prompt" : "Search Form",
        "data" : [
          {
            "name" : "filter",
            "value" : "",
            "prompt" : "Filter",
            "required" : true
          }
        ]
      }
    ]
  }
}
And here is the same response after introducing the page-size argument:
*** REQUEST ***
GET /search-form HTTP/1.1
Accept: application/vnd.collection+json
...
*** RESPONSE ***
200 OK HTTP/1.1
Content-Type: application/vnd.collection+json
...
{
  "collection" : {
    "queries" : [
      {
        "rel" : "search",
        "href" : "/search-results",
        "prompt" : "Search Form",
        "data" : [
          {
            "name" : "filter",
            "value" : "",
            "prompt" : "Filter",
            "required" : true
          },
          {
            "name" : "page-size",
            "value" : "all",
            "prompt" : "Page Size",
            "required" : false
          }
        ]
      }
    ]
  }
}
In the updated rendition, you can see that the new argument (page-size) was explicitly marked optional ("required" : false). You can also see that a default value was provided ("value" : "all"). This may seem a bit counterintuitive. The update was introduced in order to limit the number of records sent in responses. So why set the default value to "all"? It is set to "all" because that was the initial promise in the first rendition of the API. We can’t change the results of this request now to only include some of the records. This also follows the don’t change the meaning of things principle.
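This is also why older consumers keep working: a client written before page-size existed submits only the filter field it knows about, and the service applies the default of "all" on its side. A newer client can even drive its request directly off the data array so future optional arguments are picked up without a code change. A minimal hypothetical sketch of that approach:

// build a query string from whatever fields the service describes,
// using the supplied defaults for anything the user did not fill in
function buildQuery(query, userInput) {
  var parts = [];
  (query.data || []).forEach(function (field) {
    var value = (userInput && userInput[field.name] !== undefined)
      ? userInput[field.name]
      : (field.value || "");
    parts.push(encodeURIComponent(field.name) + "=" + encodeURIComponent(value));
  });
  return query.href + "?" + parts.join("&");
}

// e.g., buildQuery(collection.queries[0], { "filter" : "smith" })
// returns "/search-results?filter=smith&page-size=all"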
So, as service implementors, you can go a long way toward maintaining backward compatibility by supporting these three principles:
Don’t take things away
Don’t change the meaning of things
Make all new things optional
Those on the consuming end of APIs also have some responsibility to support change over time. We need to make sure we’re prepared for the backward-compatible features employed by API designers and service implementors. But we don’t need to wait for designers and providers to make changes to their own work before creating stable API consumer apps. We can adopt some of our own principles for creating robust, resilient API clients. Finally, we also need to help API designers and service providers understand the challenges of creating adaptable API consumers by encouraging them to adopt the kinds of principles described here when they create APIs.
The first thing API consumers can do is adopt a coding strategy that protects the app from cases where expected data elements and/or actions are missing in a response. This can be accomplished when you code defensively. You can think of this as honoring Postel’s Law (see “TCP/IP’s Robustness Principle”) by being “liberal in what you accept from others.” There are a couple of very simple ways to do this.
For example, when I write client code to process a response, I almost always include code that first checks for the existence of an element before attempting to parse it. Here’s some client code that you’ll likely find in the examples associated with this book:
// handle title
function title() {
var elm;
if(hasTitle(global.cj.collection)===true) {
elm = domHelper.find("title");
elm.innerText = global.cj.collection.title;
elm = domHelper.tags("title");
elm[0].innerText = global.cj.collection.title;
}
}
You can see that I first check whether the collection object has a title property (the hasTitle(global.cj.collection) call). If so, I can continue processing it.
Here’s another example where I supply local default values for cases where the service response is missing expected elements (the || fallbacks for prompt, name, value, required, and readOnly) and check for the existence of a property before using it (the if(args.pattern) test):
function input(args) {
var p, lbl, inp;
p = domHelper.node("p");
p.className = "inline field";
lbl = domHelper.node("label");
inp = domHelper.node("input");
lbl.className = "data";
lbl.innerHTML = args.prompt||"";
inp.name = args.name||"";
inp.className = "value "+ (args.className||"");
inp.value = (args.value||"").toString();
inp.required = (args.required||false);
inp.readOnly = (args.readOnly||false);
if(args.pattern) {
inp.pattern = args.pattern;
}
domHelper.push(lbl,p);
domHelper.push(inp,p);
return p;
}
There are other examples of coding defensively that I won’t include here. The main idea is to make sure that client applications can continue functioning even when any given response is missing expected elements. When you do this, even most unexpected changes will not cause your API consumer to crash.
Another important principle for building resilient API consumer apps is to code to the media type. Essentially, this is using the same approach that was discussed in Chapter 3, The Representor Pattern, except that this time, instead of focusing on creating a pattern for converting internal domain data into a standard message format (via a Message Translator), the opposite is the goal for API consumers: convert a standardized message format into a useful internal domain model. By doing this, you can go a long way toward protecting your client application from both semantic and structural changes in the service responses.
For all the client examples I implement in this book, the media type messages (HTML, HAL, Cj, and Siren) are converted into the same internal domain model: the HTML Document Object Model (DOM). The DOM is a consistent model, and writing client-side JavaScript for it is the way most browser-based API clients work.
Here is a short code snippet that shows how I convert Siren entities into HTML DOM objects for rendering in the browser:
// entities
function entities() {
var elm, coll;
var segment, a, table, tr;
elm = domHelper.find("entities");
domHelper.clear(elm);
if(global.siren.entities) {
coll = global.siren.entities;
for(var item of coll) {
segment = domHelper.node("div");
segment.className = "ui segment";
a = domHelper.anchor({
href:item.href,
rel:item.rel.join(" "),
className:item.class.join(" "),
text:item.title||item.href});
a.onclick = httpGet;
domHelper.push(a, segment);
table = domHelper.node("table");
table.className = "ui very basic collapsing celled table";
for(var prop in item) {
if(prop!=="href" &&
prop!=="class" &&
prop!=="type" &&
prop!=="rel") {
tr = domHelper.data_row({
className:"item "+item.class.join(" "),
text:prop+" ",
value:item[prop]+" "
});
domHelper.push(tr,table);
}
}
domHelper.push(table, segment, elm);
}
}
}
It might be a bit tough to see how the HTML DOM is utilized in this example since I use a helper class (the domHelper object) to access most of the DOM functions. But you can see that, for each Siren entity, I create an HTML div tag (domHelper.node("div")). I then create an HTML anchor tag (domHelper.anchor(...)) for each item. I set up an HTML <table> to hold the Siren entity’s properties (domHelper.node("table")) and add a new table row (<tr>) for each one (domHelper.data_row(...)). Finally, after completing all the rows in the table, I add the results to the HTML page for visible display (domHelper.push(table, segment, elm)).
This works because all the implementation examples in this book are intended for common HTML browsers. For cases where the target clients are mobile devices or native desktop applications, I need to work out another strategy. One way to handle this is to create reverse representors for each platform. In other words, create a custom Format-to-Domain handler for iOS, Android, and Windows Mobile, etc. Then the same for Linux, Mac, and Windows desktops, and so forth. This can get tedious, though. That’s why using the browser DOM is still appealing and why some mobile apps rely on tools like Apache Cordova, Mono, Appcelerator, and other cross-platform development environments.
As of this writing, there are a number of efforts to build representor libraries that focus on the client—the reverse of the example I outlined in Chapter 3, The Representor Pattern. The team at Apiary are working on the Hyperdrive project. The Hypermedia Project is a Microsoft.NET-specific effort. And Joshua Kalis has started a project (to which I am a contributor) called Rosetta. Finally, the Yaks project is an independent OSS effort to create a framework that includes the representor pattern to support plug-ins for new formats. There may be more projects by the time you read this book, too.
Once you start building clients that code to the media type, you’ll find that you still need to know domain-specific details that appear in responses. Things like:
Does this response contain the list of users I asked for?
How do I find all the inactive customers?
Which of these invoice records are overdue?
Is there a way for me to find all the products that are no longer in stock in the warehouse?
All these questions are domain-specific and are not tied to any single response format like HAL, Cj, or Siren. One of the reasons the HTML browser has been so powerful is that the browser source code doesn’t need to know anything about accounting in order to host an accounting application. That’s because the user driving the browser knows that stuff. The browser is just the agent of the human user. For many API client cases, there is a human user available to interpret and act upon the domain-specific information in API responses. However, there are cases where the API client is not acting as a direct user agent. Instead, it is just a middleware component or utility app tasked with some job by itself (e.g., find all the overdue invoices). In these cases, the client app needs to have enough domain information to complete its job. And that’s where API vocabularies come in.
There are a handful of projects focused on documenting and sharing domain-specific vocabularies over the WWW. One of the best-known examples of this is the Schema.org project (pronounced schema dot org). Schema.org contains lists of common terms for all sorts of domains. Large web companies like Google and Microsoft use Schema.org vocabularies to drive parts of their system.
Along with Schema.org, there are other vocabulary efforts such as the IANA Link Relations registry, the microformats group, and the Dublin Core Metadata Initiative, or DCMI. A few colleagues and I have also been working on an Internet draft for Application-Level Profile Semantics, or ALPS for short.
I won’t have time to go into vocabularies in this book and encourage you to check out these and other similar efforts in order to learn more about how they can be used in your client-side apps.
So what does this all look like? How can you use vocabularies to enable API clients to act on their own safely? Basically, you need to “teach” the API consumer to perform tasks based on some baked-in domain knowledge. For example, I might want to create an API consumer that uses one service to find overdue invoices and then pass that information off to another service for further processing. This means the API consumer needs to “know” about invoices and what it means to be “overdue.” If the API I am using has published a vocabulary, I can look there for the data and action element identifiers I need to perform my work.
As just one example, here’s what that published vocabulary might look like as expressed in a simplified ALPS XML document:
<alps>
  <doc>Invoice Management Vocabulary</doc>
  <link rel="invoice-mgmt" href="api.example.org/profile/invoice-mgmt" />

  <!-- data elements -->
  <descriptor id="invoice-href" />
  <descriptor id="invoice-number" />
  <descriptor id="invoice-status">
    <doc>Valid values are: "active", "closed", "overdue"</doc>
  </descriptor>

  <!-- actions -->
  <descriptor id="invoice-list" type="safe" />
  <descriptor id="invoice-detail" type="safe" />
  <descriptor id="invoice-search" type="safe">
    <descriptor href="#invoice-status" />
  </descriptor>
  <descriptor id="write-invoice" type="unsafe">
    <descriptor href="#invoice-href" />
    <descriptor href="#invoice-number" />
    <descriptor href="#invoice-status" />
  </descriptor>
</alps>
The ALPS specification is just one profile style for capturing and expressing vocabularies. You can learn more about the ALPS specification by visiting alps.io. Two others worth exploring are XMDP (XHTML Metadata Profiles) and Dublin Core’s Application Profiles (DCAP).
Now when I build my client application, I know that I can “teach” that app to understand how to deal with an invoice record (invoice-number and invoice-status) and know how to search for overdue invoices (use invoice-search with the invoice-status value set to "overdue"). All I need is the starting address for the service and the ability to recognize and execute the search for overdue invoices. The pseudo-code for that example might look like this:
:: DECLARE ::
search-link = "invoice-search"
search-status = "overdue"
write-invoice = "write-invoice"
invoice-mgmt = "api.example.org/profile/invoice-mgmt"
search-href = "http://api.example.org/invoice-mgmt"
search-accept = "application/vnd.siren+json"
write-href = "http://third-party.example.org/write-invoices"
write-accept = "application/vnd.hal+json"

:: EXECUTE ::
response = REQUEST(search-href AS search-accept)
IF(response.vocabulary IS invoice-mgmt) THEN
  FOR-EACH(link IN response)
    IF(link IS search-link) THEN
      invoices = REQUEST(search-link AS search-accept WITH search-status)
      FOR-EACH(invoice IN invoices)
        REQUEST(write-href AS write-accept FOR write-invoice WITH EACH invoice)
      END-FOR
    END-IF
  END-FOR
END-IF
:: EXIT ::
Although this is only pseudo-code, you can see the app has been loaded with domain-specific information (the :: DECLARE :: block). Then, after the initial request is made, the response is checked to see if it promises to use the invoice-mgmt vocabulary (the IF(response.vocabulary IS invoice-mgmt) test). If that check passes, the app searches all the links in the response to find the search-link and, if found, executes a search for all invoices with the status of overdue (the REQUEST(search-link …) call). Finally, if any invoices are returned in that search, they are sent to a new service using the write-invoice action (the REQUEST(write-href …) call).
Something to note here is that the defensive coding is on display (the if statements) and the code has only initial URLs memorized—the remaining URLs come from the responses themselves.
Leveraging vocabularies for your API means you can focus on the important aspects (the data elements and actions) and not worry about plumbing details such as URL matching, memorizing the exact location of a data element within a document, etc.
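One simple way to avoid memorizing where a vocabulary term lives in a document is to search the parsed response for it by name. This is a minimal hypothetical sketch (the findByName helper is an assumption, not part of any format library):

// find a value anywhere in a parsed response by its vocabulary name,
// without memorizing where in the document that element appears
function findByName(node, name) {
  if (node === null || typeof node !== "object") {
    return undefined;
  }
  if (node[name] !== undefined) {
    return node[name];
  }
  for (var key in node) {
    var found = findByName(node[key], name);
    if (found !== undefined) {
      return found;
    }
  }
  return undefined;
}

// e.g., findByName(responseBody, "invoice-status") keeps working even if the
// service later nests that element somewhere new in the response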
The last client implementation principle I’ll cover here is to react to link relations for workflow. This means that when working to solve a multistep problem, focus on selected link relation values instead of writing client apps that memorize a fixed set of steps. This is important because memorizing a fixed set of steps is a kind of tight-binding of the client to a fixed sequence of events that may not actually happen at runtime due to transient context issues (e.g., part of the service is down for maintenance, the logged-in user no longer has rights to one of the steps, etc.). Or over time, new steps might be introduced or the order of events might change within the service. These are all reasons not to bake multistep details into your client app.
You might recognize this principle of not memorizing steps as the difference between path-finders and map-makers from Chapter 5, The Challenge of Reusable Client Apps.
Instead, since the service you are using has also followed the API principles of document link identifiers, publish vocabularies, and don’t take things away, you can implement a client that is trained to look for the proper identifiers and use vocabulary information to know which data elements need to be passed for each operation. Now, even if the links are moved within a response (or even moved to a different response) your client will still be able to accomplish your goal well into the future.
One way to approach this react to links principle is to isolate all the important actions the client will need to take and simply implement them as standalone, stateless operations. Once that is done, you can write a single routine that (1) makes a request, and (2) inspects that request for one of the known actions, and when found, executes the recognized action.
Following is an example of a Twitter-like quote-bot I created for my 2011 book Building Hypermedia APIs:
/* these are the things this bot can do */
function processResponse(ajax) {
  var doc = ajax.responseXML;
  if(ajax.status===200) {
    switch(context.status) {
      case 'start':
        findUsersAllLink(doc);
        break;
      case 'get-users-all':
        findMyUserName(doc);
        break;
      case 'get-register-link':
        findRegisterLink(doc);
        break;
      case 'get-register-form':
        findRegisterForm(doc);
        break;
      case 'post-user':
        postUser(doc);
        break;
      case 'get-message-post-link':
        findMessagePostForm(doc);
        break;
      case 'post-message':
        postMessage(doc);
        break;
      case 'completed':
        handleCompleted(doc);
        break;
      default:
        alert('unknown status: ['+context.status+']');
        return;
    }
  }
  else {
    alert(ajax.status);
  }
}
In the preceding example, this routine constantly monitors the app’s current internal context.status and, as it changes from one state to another, the app knows just what to be looking for within the current response and/or what action to take in order to advance to the next step in the effort to reach the final goal. For this bot, the goal is to post inspirational quotes to the available social media feed. This bot also knows that it might need to authenticate in order to access the feed, or possibly even create a new user account. Notice the use of the JavaScript switch…case structure here. There is no notion of execution order written into the code—just a set of possible states and related operations to attempt to execute.
Writing clients in this way allows you to create middleware components that can accomplish a set goal without forcing that client to memorize a particular order of events. That means even when the order of things changes over time—as long as the changes are made in a backward-compatible way—this client will still be able to complete its assigned tasks.
So, some valuable principles for implementing clients that support change over time include:
Code defensively
Code to the media type
Leverage API vocabularies
React to link relations for workflow
This chapter focused on dealing with the challenge of change over time for web APIs. We looked at examples of planning for and handling change over the decades in three key web-related fields: TCP/IP and Postel’s Law, HTTP and the MUST IGNORE principle, and the backward-compatibility pledge that underpins the design of HTML. We then looked at some general principles we can use when designing APIs, implementing API services, and building client apps that consume those APIs.
The key message of this chapter is that change is inevitable and the way to deal with it is to plan ahead and adopt the point of view that not all changes require you to break the interface. Finally, we learned that successful organizations adopt a change aesthetic—a collection of related principles that help guide API design, inform service implementors, and encourage API consumers to all work toward maintaining backward compatibility.
Blaise Pascal’s Wager has more to do with the nature of uncertainty and probability theory than anything else. A decent place to start reading about his Wager is the Wikipedia entry.
Alan Kay’s 2011 talk on Programming and Scaling contains a commentary on how TCP/IP has been updated and improved over the years without ever having to “stop” the Internet.
TCP/IP is documented in two key IETF documents: RFC793 (TCP) and RFC791 (IP).
The Client tolerance of bad servers note can be viewed in the W3C’s HTTP protocol archive pages.
The IETF specification document for RFC1945 contains eight separate examples of the MUST IGNORE principle. The HTTP 1.1 specification (RFC2616) has more than 30 examples.
Dave Orchard’s 2003 blog post “Versioning XML Vocabularies” does a good job of illustrating a number of valuable “Must Ignore” patterns.
Tim Berners-Lee’s Markup archive from 1992 is a great source for those looking into the earliest days of HTML.
The 2119 Words can be found in IETF’s RFC2119.
The book Software Architecture in Practice was written by Len Bass, Paul Clements, and Rick Kazman.
I learned about GitHub’s approach to managing backward compatibility from a Yandex 2013 talk by Jason Rudolph on “API Design at GitHub.” As of this writing, the video and slides are still available online.
The Schema.org effort includes the website, a W3C community site, a GitHub repository, and an online discussion group.
The book Building Hypermedia APIs (O’Reilly) is a kind of companion book to this one. That book focuses on API design with some server-side implementation details.
The Dublin Core Application Profile (DCAP) spec “includes guidance for metadata creators and clear specifications for metadata developers.” You can read more about it here.
In 2003, Tantek Çelik defined the XHTML Meta Data Profile (XMDP). It supports defining document profiles that are both human- and machine-readable. The specification can be found online.