Every programmer is an API designer at one time or another. Maybe you don’t have any immediate plans to write the next popular JavaScript library. But when you program on a platform for long enough, you build up a repertoire of solutions to common problems, and sooner or later you start to develop reusable utilities and components. Even if you don’t release these as independent libraries, developing your skills as a library writer can help you write better components.
Designing libraries is a tricky business and is as much art as science. It’s also incredibly important. APIs are a programmer’s basic vocabulary. A well-designed API enables your users (who probably include you!) to express their programs clearly, concisely, and unambiguously.
There are few decisions that affect API consumers more pervasively than the conventions you use for names and function signatures. These conventions have enormous influence: They establish the basic vocabulary and idioms of the applications that use them. Users of your library have to learn to read and write using these idioms, and it’s your job to make that learning process as easy as possible. Inconsistency makes it harder to remember which conventions apply in which situations, which leads to more time spent consulting your library’s documentation and less time spent getting real work done.
One of the key conventions is argument order. User interface libraries, for instance, usually have functions that accept multiple measurements such as width and height. Do your users a favor and make sure these always come in the same order. And it’s worth choosing an order that matches other libraries—nearly all libraries accept width first, then height:
var widget = new Widget(320, 240); // width: 320, height: 240
Unless you have a really strong reason for needing to vary from universal practice, stick with what’s familiar. If your library is meant for the web, remember that web developers routinely deal with multiple languages (HTML, CSS, and JavaScript... at a minimum). Don’t make their lives even harder by needlessly varying from conventions they are likely to use in their normal workflow. For example, whenever CSS accepts parameters describing the four sides of a rectangle, it requires them in clockwise order starting from the top (top, right, bottom, left). So when writing a library with an analogous API, stick to this order. Your users will thank you. Or maybe they won’t even notice—so much the better! But you can be sure they will notice if you deviate from standard convention.
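To illustrate matching the CSS order, here is a minimal sketch. The setPadding helper and the shape of the box object are assumptions invented for this example, not part of any real library; the only point is that the four sides arrive in the same clockwise order that CSS uses.

```javascript
// Hypothetical helper: the name setPadding and the `box` object shape are
// assumptions made for illustration only.
function setPadding(box, top, right, bottom, left) {
    // Same clockwise order as CSS: top, right, bottom, left.
    box.padding = { top: top, right: right, bottom: bottom, left: left };
    return box;
}

var box = setPadding({}, 10, 20, 30, 40);
box.padding.top;  // 10
box.padding.left; // 40
```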
If your API uses options objects (see Item 55), you can avoid the dependence on argument order. For standard options such as width/height measurements, you should pick a naming convention and adhere to it religiously. If one of your function signatures looks for width and height options and another looks for w and h, your users are in for a lifetime of constantly checking your documentation to remember which is used where. Similarly, if your Widget class has methods for setting properties, make sure you use the same naming convention for these update methods. There’s no good reason for one class to have a setWidth method and another class to do the same thing with a method called width.
Every good library needs thorough documentation, but a great library treats its documentation as training wheels. Once your users get accustomed to your library’s conventions, they should be able to do common tasks without ever checking the documentation. Consistent conventions can even help users guess what properties or methods are available without looking them up at all, or discover them at the console and guess their behavior from the names.
• Use consistent conventions for variable names and function signatures.
• Don’t deviate from conventions your users are likely to encounter in other parts of their development platform.
The undefined value is special: Whenever JavaScript has no specific value to provide it just produces undefined. Unassigned variables start out with the value undefined:
var x;
x; // undefined
Accessing nonexistent properties from objects produces undefined:
var obj = {};
obj.x; // undefined
Returning without a value or falling off the end of a function body produces the return value undefined:
function f() {
return;
}
function g() { }
f(); // undefined
g(); // undefined
Function parameters that are not provided with actual arguments have the value undefined:
function f(x) {
return x;
}
f(); // undefined
In each of these situations, the undefined value indicates that the operation did not result in a specific value. Of course, there’s something a little paradoxical about a value that means “no value.” But every operation has to produce something, so JavaScript uses undefined to fill the void (so to speak).
Treating undefined as the absence of any specific value is a convention established by the language. Using it for other purposes is a risky proposition. For example, a library of user interface elements might support a highlight method for changing the background color of an element:
element.highlight(); // use the default color
element.highlight("yellow"); // use a custom color
What if we wanted to provide a way to request a random color? We could use undefined as a special value for that purpose:
element.highlight(undefined); // use a random color
But this would be at odds with undefined’s usual meaning. This makes it easy to get the wrong behavior when getting the value from another source, particularly one that might not have a value to provide. For example, a program might be using a configuration object with an optional color preference:
var config = JSON.parse(preferences);
// ...
element.highlight(config.highlightColor); // may be random
If the preferences do not specify a color, the programmer will most likely expect to get the default, just as if no value were provided. But by repurposing undefined, we actually caused this code to generate a random color. A better API might use a special color name for the random case:
element.highlight("random");
Sometimes it’s not possible for an API to choose a special string value that’s distinguishable from the normal set of string values accepted by the function. In these cases, there are special values other than undefined, such as null or true. But these tend not to lead to very readable code:
element.highlight(null);
For someone who is reading the code and may not have your library committed to memory, this code is rather opaque. In fact, a first guess might be that it removes highlighting. A more explicit and descriptive option is to represent the random case as an object with a random property (see Item 55 for more on options objects):
element.highlight({ random: true });
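One way such a highlight method might be implemented is sketched below. Swatch, DEFAULT_COLOR, PALETTE, and randomColor are all hypothetical names chosen for this example. The sketch keeps undefined for its usual meaning (no value, so use the default) and reserves the options object for the random case.

```javascript
// Hypothetical sketch: Swatch, DEFAULT_COLOR, PALETTE, and randomColor are
// assumptions for illustration, not a real library's API.
var DEFAULT_COLOR = "gold";
var PALETTE = ["red", "green", "blue"];

function randomColor() {
    return PALETTE[Math.floor(Math.random() * PALETTE.length)];
}

function Swatch() {
    this.background = null;
}

Swatch.prototype.highlight = function(color) {
    if (color === undefined) {
        // No value supplied, or the caller's source had none: use the default.
        this.background = DEFAULT_COLOR;
    } else if (color && color.random === true) {
        // Explicit, self-describing request for a random color.
        this.background = randomColor();
    } else {
        this.background = String(color);
    }
    return this;
};

var config = {}; // no highlightColor preference
var s = new Swatch();
s.highlight(config.highlightColor).background; // "gold"
s.highlight("yellow").background;              // "yellow"
```

With this arrangement, a missing configuration value quietly falls back to the default instead of triggering the random behavior.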
Another place to watch out for undefined is in the implementation of optional arguments. In theory, the arguments object (see Item 51) makes it possible to detect whether an argument was passed, but in practice, testing for undefined leads to more robust APIs. For example, a web server might take an optional host name:
var s1 = new Server(80, "example.com");
var s2 = new Server(80); // defaults to "localhost"
The Server constructor could be implemented by testing arguments.length:
function Server(port, hostname) {
if (arguments.length < 2) {
hostname = "localhost";
}
hostname = String(hostname);
// ...
}
But this has a similar problem to the element.highlight method above. If a program provides an explicit argument by requesting a value from another source such as a configuration object, it might produce undefined:
var s3 = new Server(80, config.hostname);
If there’s no hostname preference specified by config, the natural behavior is to use the default "localhost". But the above implementation ends up with the host name "undefined". It’s better to test for undefined, which could be produced by leaving off the argument or by providing an argument expression that turns out to be undefined:
function Server(port, hostname) {
if (hostname === undefined) {
hostname = "localhost";
}
hostname = String(hostname);
// ...
}
A reasonable alternative is to test whether hostname is truthy (see Item 3). Logical operators make this convenient:
function Server(port, hostname) {
hostname = String(hostname || "localhost");
// ...
}
This version uses the logical OR operator (||), which returns the first argument if it is a truthy value and otherwise returns its second argument. So, if hostname is undefined or an empty string, the expression (hostname || "localhost") evaluates to "localhost". As such, this is technically testing for more than undefined—it will treat all falsy values the same as undefined. This is probably acceptable for Server since an empty string is not a valid host name. So, if you are happy with a looser API that coerces all falsy values to a default value, truthiness testing is a concise way to implement parameter default values.
But beware: Truthiness is not always a safe test. If a function should accept the empty string as a legal value, a truthy test will override the empty string and replace it with the default value. Similarly, a function that accepts a number should not use a truthy test if it allows 0 (or NaN, although it’s less common) as an acceptable value. For example, a function for creating a user interface element might allow an element to have a width or height of 0, but provide a different default value:
var c1 = new Element(0, 0); // width: 0, height: 0
var c2 = new Element(); // width: 320, height: 240
An implementation that uses truthiness would be buggy:
function Element(width, height) {
this.width = width || 320; // wrong test
this.height = height || 240; // wrong test
// ...
}
var c1 = new Element(0, 0);
c1.width; // 320
c1.height; // 240
Instead, we have to resort to the more verbose test for undefined:
function Element(width, height) {
this.width = width === undefined ? 320 : width;
this.height = height === undefined ? 240 : height;
// ...
}
var c1 = new Element(0, 0);
c1.width; // 0
c1.height; // 0
var c2 = new Element();
c2.width; // 320
c2.height; // 240
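If the verbose test appears in many places, it can be factored into a small helper. The following is only a sketch: defaultTo and Box are hypothetical names introduced here, not part of the examples above.

```javascript
// Hypothetical helper: returns fallback only when value is exactly
// undefined, so 0, NaN, and "" are preserved.
function defaultTo(value, fallback) {
    return value === undefined ? fallback : value;
}

// Hypothetical constructor used only to exercise the helper.
function Box(width, height) {
    this.width = defaultTo(width, 320);
    this.height = defaultTo(height, 240);
}

var b1 = new Box(0, 0);
b1.width;  // 0
b1.height; // 0
var b2 = new Box();
b2.width;  // 320
b2.height; // 240
```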
• Avoid using undefined to represent anything other than the absence of a specific value.
• Use descriptive string values or objects with named boolean properties, rather than undefined or null, to represent application-specific flags.
• Test for undefined instead of checking arguments.length to provide parameter default values.
• Never use truthiness tests for parameter default values that should allow 0, NaN, or the empty string as valid arguments.
Keeping consistent conventions for argument order, as Item 53 suggests, is important for helping programmers remember what each argument in a function call means. This works to a point. But it simply doesn’t scale beyond a few arguments. Try making sense of a function call such as the following:
var alert = new Alert(100, 75, 300, 200,
"Error", message,
"blue", "white", "black",
"error", true);
We’ve all seen APIs like this. It’s often the result of argument creep, where a function starts out simple, but over time, as the library expands in functionality, the signature acquires more and more arguments.
Fortunately, JavaScript provides a simple, lightweight idiom that works well for larger function signatures: the options object. An options object is a single argument that provides additional argument data through its named properties. The object literal form makes this especially pleasant to read and write:
var alert = new Alert({
x: 100, y: 75,
width: 300, height: 200,
title: "Error", message: message,
titleColor: "blue", bgColor: "white", textColor: "black",
icon: "error", modal: true
});
This API is a little more verbose, but noticeably easier to read. Each argument becomes self-documenting: There’s no need for a comment explaining its role, since its property name explains it perfectly. This is especially helpful for boolean parameters such as modal: Someone reading a call to new Alert might be able to infer the purpose of a string argument from its contents, but a naked true or false is not particularly informative.
Another benefit of options objects is that any of the arguments can be optional, and a caller can provide any subset of the optional arguments. With ordinary arguments (sometimes called positional arguments, since they are distinguished not by name but by their position in the argument list), optional arguments can often introduce ambiguities. For example, if we want both the position and the size of an Alert object to be optional, then it’s not clear how to interpret a call such as this:
var alert = new Alert(app,
150, 150,
"Error", message,
"blue", "white", "black",
"error", true);
Are the first two numbers meant to specify the x and y or width and height arguments? With an options object, there’s no question:
var alert = new Alert({
parent: app,
width: 150, height: 100,
title: "Error", message: message,
titleColor: "blue", bgColor: "white", textColor: "black",
icon: "error", modal: true
});
Traditionally, options objects consist exclusively of optional arguments, so it’s even possible to omit the object entirely:
var alert = new Alert(); // use all default parameter values
If there are one or two required arguments, it’s better to keep them separate from the options object:
var alert = new Alert(app, message, {
width: 150, height: 100,
title: "Error",
titleColor: "blue", bgColor: "white", textColor: "black",
icon: "error", modal: true
});
Implementing a function that accepts an options object takes a little more work. Here is a thorough implementation:
function Alert(parent, message, opts) {
opts = opts || {}; // default to an empty options object
this.width = opts.width === undefined ? 320 : opts.width;
this.height = opts.height === undefined
? 240
: opts.height;
this.x = opts.x === undefined
? (parent.width / 2) - (this.width / 2)
: opts.x;
this.y = opts.y === undefined
? (parent.height / 2) - (this.height / 2)
: opts.y;
this.title = opts.title || "Alert";
this.titleColor = opts.titleColor || "gray";
this.bgColor = opts.bgColor || "white";
this.textColor = opts.textColor || "black";
this.icon = opts.icon || "info";
this.modal = !!opts.modal;
this.message = message;
}
The implementation starts by providing a default empty options object, using the || operator (see Item 54). The numeric arguments test for undefined as Item 54 advises, since 0 is a valid value but not the default. For the string parameters, we use logical OR under the assumption that an empty string is not a valid value and should be replaced by a default value. The modal parameter coerces its argument to a boolean with a double negation pattern (!!).
This code is a little more verbose than it would be with positional arguments. Now, it’s worth paying the price within the library if it makes users’ lives easier. But we can make our own life easier with a useful abstraction: an object extension or merging function. Many JavaScript libraries and frameworks come with an extend function, which takes a target object and a source object and copies the properties of the latter object into the former. One of the most useful applications of this utility is for abstracting out the logic of merging default values and user-provided values for options objects. With the help of extend, the Alert function looks quite a bit cleaner:
function Alert(parent, message, opts) {
opts = extend({
width: 320,
height: 240
}, opts);
opts = extend({
x: (parent.width / 2) - (opts.width / 2),
y: (parent.height / 2) - (opts.height / 2),
title: "Alert",
titleColor: "gray",
bgColor: "white",
textColor: "black",
icon: "info",
modal: false
}, opts);
this.width = opts.width;
this.height = opts.height;
this.x = opts.x;
this.y = opts.y;
this.title = opts.title;
this.titleColor = opts.titleColor;
this.bgColor = opts.bgColor;
this.textColor = opts.textColor;
this.icon = opts.icon;
this.modal = opts.modal;
}
This avoids constantly reimplementing the logic of checking for the presence of each argument. Notice how we use two calls to extend, since the default values for x and y depend on first computing the values of width and height.
We can clean this up even further if all we want to do with the options is copy them into this:
function Alert(parent, message, opts) {
opts = extend({
width: 320,
height: 240
}, opts);
opts = extend({
x: (parent.width / 2) - (opts.width / 2),
y: (parent.height / 2) - (opts.height / 2),
title: "Alert",
titleColor: "gray",
bgColor: "white",
textColor: "black",
icon: "info",
modal: false
}, opts);
extend(this, opts);
}
Different frameworks provide different variations of extend, but typically the implementation works by enumerating the properties of the source object and copying them into the target whenever they are not undefined:
function extend(target, source) {
if (source) {
for (var key in source) {
var val = source[key];
if (typeof val !== "undefined") {
target[key] = val;
}
}
}
return target;
}
Notice that there are small differences between the original version of Alert and the implementation using extend. For one, our conditional logic in the first version avoids even computing the default values if they aren’t needed. As long as computing the defaults has no side effects such as modifying the user interface or sending a network request—which is usually the case—this isn’t really a problem. Another difference is in the logic for determining whether a value was provided. In our first version, we treat an empty string the same as undefined for the various string arguments. But it’s more consistent to treat only undefined as a missing argument; using the || operator was more expedient but a less uniform policy for providing default parameter values. Uniformity is a good goal in library design, because it leads to better predictability for consumers of the API.
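The difference between the two policies can be seen in a small sketch. The extend function is repeated from above so that the example is self-contained; the userOpts object and the variable names are assumptions chosen for illustration.

```javascript
// Same extend as above, repeated so the example runs on its own.
function extend(target, source) {
    if (source) {
        for (var key in source) {
            var val = source[key];
            if (typeof val !== "undefined") {
                target[key] = val;
            }
        }
    }
    return target;
}

// Hypothetical user-provided options.
var userOpts = { title: "", icon: undefined };

// Truthiness test: the empty title is replaced by the default.
var viaOr = userOpts.title || "Alert"; // "Alert"

// extend: only undefined counts as missing, so the empty title survives
// and the undefined icon falls back to the default.
var merged = extend({ title: "Alert", icon: "info" }, userOpts);
merged.title; // ""
merged.icon;  // "info"
```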
• Use options objects to make APIs more readable and memorable.
• The arguments provided by an options object should all be treated as optional.
• Use an extend utility function to abstract out the logic of extracting values from options objects.
APIs are sometimes classified as either stateful or stateless. A stateless API provides functions or methods whose behavior depends only on their inputs, not on the changing state of the program. The methods of a string are stateless: The string’s contents cannot be modified, and the methods depend only on the contents of the string and the arguments passed to the method. No matter what else is going on in a program, the expression "foo".toUpperCase() will always produce "FOO". The methods of a Date object, by contrast, are stateful: Calling toString on the same Date object can produce different results based on whether the Date’s properties have been modified by its various set methods.
While state is sometimes essential, stateless APIs tend to be easier to learn and use, more self-documenting, and less error-prone. A famous stateful API is the web’s Canvas library, which provides user interface elements with methods for drawing shapes and images onto their surface. A program can draw text onto a canvas using the fillText method:
c.fillText("hello, world!", 75, 25);
This method provides a string to draw and a position in the canvas. But it doesn’t specify other attributes of the drawn text such as its color, transparency, or text style. All of these attributes are specified separately by changing the internal state of the canvas:
c.fillStyle = "blue";
c.font = "24pt serif";
c.textAlign = "center";
c.fillText("hello, world!", 75, 25);
A less stateful version of the API might instead look like this:
c.fillText("hello, world!", 75, 25, {
fillStyle: "blue",
font: "24pt serif",
textAlign: "center"
});
Why might the latter be preferable? First of all, it’s much less fragile. The stateful API requires modifying the internal state of a canvas in order to do anything custom, and this causes one drawing operation to affect another one, even if they have nothing to do with each other. For example, the default fill style is black. But you can only count on getting the default value if you know that no one has changed the defaults already. If you want to do a drawing operation that uses the default color after changing it, you have to specify the default explicitly:
c.fillText("text 1", 0, 0); // default color
c.fillStyle = "blue";
c.fillText("text 2", 0, 30); // blue
c.fillStyle = "black";
c.fillText("text 3", 0, 60); // back in black
Compare this to a stateless API, which would automatically enable the reuse of default values:
c.fillText("text 1", 0, 0); // default color
c.fillText("text 2", 0, 30, { fillStyle: "blue" }); // blue
c.fillText("text 3", 0, 60); // default color
Notice also how each statement becomes more readable: To understand what any individual call to fillText does, you don’t have to understand all the modifications that precede it. In fact, the canvas might even be modified in some completely separate part of the program. This can easily lead to bugs, where one piece of code written somewhere else changes the state of the canvas:
c.fillStyle = "blue";
drawMyImage(c); // did drawMyImage change c?
c.fillText("hello, world!", 75, 25);
To understand what happens in the last line, we have to know what modifications drawMyImage might make to the canvas. A stateless API leads to more modular code, which avoids bugs based on surprising interactions between different parts of your code, while simultaneously making the code easier to read.
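When you are stuck with a stateful API, a thin wrapper can sometimes recover stateless behavior. The sketch below assumes only the save, restore, and fillText methods that the canvas 2D context provides; the wrapper name fillTextWith and its opts parameter are hypothetical.

```javascript
// Hypothetical wrapper: fillTextWith is an invented name. It applies the
// given options only for the duration of a single drawing call.
function fillTextWith(c, text, x, y, opts) {
    c.save(); // remember the current drawing state
    try {
        for (var key in opts) {
            if (opts[key] !== undefined) {
                c[key] = opts[key]; // temporary change, e.g. fillStyle
            }
        }
        c.fillText(text, x, y);
    } finally {
        c.restore(); // undo the temporary changes
    }
}
```

A call such as fillTextWith(c, "text 2", 0, 30, { fillStyle: "blue" }) then leaves the fill style of c as it was before the call, so later drawing operations are unaffected.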
Stateful APIs are also more difficult to learn. Reading the documentation for fillText, you can’t tell what aspects of the state of a canvas affect the drawing. Even if some of them are easy to guess, it’s hard for a nonexpert to know whether they’ve correctly initialized all of the necessary state. It’s of course possible to provide an exhaustive list in the documentation of fillText. And when you do need a stateful API, you should definitely document the state dependencies carefully. But a stateless API eliminates these implicit dependencies altogether, so they don’t need the extra documentation in the first place.
Another benefit of stateless APIs is conciseness. A stateful API tends to lead to a proliferation of additional statements just to set the internal state of an object before calling its methods. Consider a parser for the popular “INI” configuration file format. For example, a simple INI file might look like this:
[Host]
address=127.0.0.1
name=localhost
[Connection]
timeout=10000
One approach to an API for this kind of data would be to provide a setSection method for selecting a section before looking up configuration parameters with a get method:
var ini = INI.parse(src);
ini.setSection("Host");
var addr = ini.get("address");
var hostname = ini.get("name");
ini.setSection("Connection");
var timeout = ini.get("timeout");
var server = new Server(addr, hostname, timeout);
But with a stateless API, it’s not necessary to create extra variables like addr and hostname to save the extracted data before updating the section:
var ini = INI.parse(src);
var server = new Server(ini.Host.address,
ini.Host.name,
ini.Connection.timeout);
Notice how once we make the section explicit we can simply represent the ini object as a dictionary, and each section as a dictionary, making the API even simpler. (See Chapter 5 to learn more about dictionary objects.)
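A parser that produces this dictionary-of-dictionaries shape can be quite small. The following is a rough sketch under simplifying assumptions: parseINI is a hypothetical name, and it handles only [section] headers and key=value lines, with no comments, quoting, or type conversion.

```javascript
// Hypothetical minimal parser: handles only [section] headers and
// key=value lines. Anything else (blank lines, stray text) is ignored.
function parseINI(src) {
    var result = Object.create(null); // dictionary of sections
    var section = null;
    src.split(/\r?\n/).forEach(function(line) {
        line = line.trim();
        var header = /^\[(.+)\]$/.exec(line);
        if (header) {
            section = result[header[1]] = Object.create(null);
            return;
        }
        var eq = line.indexOf("=");
        if (section && eq > 0) {
            section[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
        }
    });
    return result;
}

var parsed = parseINI(
    "[Host]\naddress=127.0.0.1\nname=localhost\n[Connection]\ntimeout=10000");
parsed.Host.name;          // "localhost"
parsed.Connection.timeout; // "10000" (a string; no type conversion is done)
```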
• Prefer stateless APIs where possible.
• When providing stateful APIs, document the relevant state that each operation depends on.
Imagine a library for creating wikis: web sites containing content that users can interactively create, delete, and modify. Many wikis feature simple, text-based markup languages for creating content. These markup languages typically provide a subset of the available features of HTML, but with a simpler and more legible source format. For example, text might be formatted by surrounding it with asterisks for bold, underscores for underlining, and forward slashes for italics. Users can enter text such as this:
This sentence contains a *bold phrase* within it.
This sentence contains an _underlined phrase_ within it.
This sentence contains an /italicized phrase/ within it.
The site would then display the content to wiki readers as:
This sentence contains a bold phrase within it.
This sentence contains an underlined phrase within it.
This sentence contains an italicized phrase within it.
A flexible wiki library might provide application writers with a choice of markup languages, since many different popular formats have emerged over the years.
To make this work, we need to separate the functionality of extracting the contents of user-created markup source text from the rest of the wiki functionality, such as account management, revision history, and content storage. The rest of the application should interact with the extraction functionality through an interface with a well-documented set of properties and methods. By programming strictly to the interface’s documented API and ignoring the implementation details of those methods, the rest of the application can function correctly regardless of which source format an application chooses to use.
Let’s look a little more closely at what kind of interface is needed for wiki content extraction. The library must be able to extract metadata such as page title and author and to format page contents as HTML for displaying to wiki readers. We can represent each page in the wiki as an object that provides access to this data through page methods such as getTitle, getAuthor, and toHTML.
Next, the library needs to provide a way to create an application with a custom wiki formatter, as well as some built-in formatters for popular markup formats. For example, an application writer might wish to use the MediaWiki format (the format used by Wikipedia):
var app = new Wiki(Wiki.formats.MEDIAWIKI);
The library would store this formatter function internally in the Wiki instance object:
function Wiki(format) {
this.format = format;
}
Whenever a reader wants to view a page, the application retrieves its source and renders an HTML page using the internal formatter:
Wiki.prototype.displayPage = function(source) {
var page = this.format(source);
var title = page.getTitle();
var author = page.getAuthor();
var output = page.toHTML();
// ...
};
How would a formatter such as Wiki.formats.MEDIAWIKI be implemented? Programmers familiar with class-based programming might be inclined to create a base Page class that represents the user-created content and implement each different format as a subclass of Page. The MediaWiki format would be implemented with a class MWPage that extends Page, and MEDIAWIKI would be a “factory function” that returns an instance of MWPage:
function MWPage(source) {
Page.call(this, source); // call the super-constructor
// ...
}
// MWPage extends Page
MWPage.prototype = Object.create(Page.prototype);
MWPage.prototype.getTitle = /* ... */;
MWPage.prototype.getAuthor = /* ... */;
MWPage.prototype.toHTML = /* ... */;
Wiki.formats.MEDIAWIKI = function(source) {
return new MWPage(source);
};
(See Chapter 4 for more about implementing class hierarchies with constructors and prototypes.) But what practical purpose does the base Page class serve? Since MWPage needs its own implementation of the methods required by the wiki application—getTitle, getAuthor, and toHTML—there’s not necessarily any useful implementation code to inherit. Notice, too, that the displayPage method above does not care about the inheritance hierarchy of the page object; it only requires the relevant methods in order to work. So implementations of wiki formats are free to implement those methods however they like.
Where many object-oriented languages encourage structuring your programs around classes and inheritance, JavaScript tends not to stand on ceremony. It is often perfectly sufficient to provide an implementation for an interface like the MediaWiki page format with a simple object literal:
Wiki.formats.MEDIAWIKI = function(source) {
// extract contents from source
// ...
return {
getTitle: function() { /* ... */ },
getAuthor: function() { /* ... */ },
toHTML: function() { /* ... */ }
};
};
What’s more, inheritance sometimes causes more problems than it solves. This becomes evident when several different wiki formats share nonoverlapping sets of functionality: There may not be any inheritance hierarchy that makes sense. For example, imagine three formats:
Format A: *bold*, [Link], /italics/
Format B: **bold**, [[Link]], *italics*
Format C: **bold**, [Link], *italics*
We would like to implement individual pieces of functionality for recognizing each different kind of input, but the mixing and matching of functionality just doesn’t map to any clear hierarchical relationship between A, B, and C (I welcome you to try it!). The right thing to do is to implement separate functions for each kind of input matching—single asterisks, double asterisks, slashes, brackets, and so on—and mix and match functionality as needed for each format.
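As a rough sketch of this mix-and-match approach, each kind of input matching can be a small function from string to string, and each format a list of those functions. All names below are hypothetical, and the regular expressions are deliberately naive (no escaping, no nesting), so treat this as an illustration of the structure and not as a real formatter.

```javascript
// Build a matcher from a pattern and a replacement (hypothetical helper).
function replacer(pattern, replacement) {
    return function(text) {
        return text.replace(pattern, replacement);
    };
}

var singleStarBold    = replacer(/\*([^*]+)\*/g, "<b>$1</b>");
var doubleStarBold    = replacer(/\*\*([^*]+)\*\*/g, "<b>$1</b>");
var singleStarItalics = replacer(/\*([^*]+)\*/g, "<i>$1</i>");
var slashItalics      = replacer(/\/([^\/]+)\//g, "<i>$1</i>");
var singleBracketLink = replacer(/\[([^\[\]]+)\]/g, "<a href=\"$1\">$1</a>");
var doubleBracketLink = replacer(/\[\[([^\[\]]+)\]\]/g, "<a href=\"$1\">$1</a>");

// Apply the given matchers from left to right.
function pipeline() {
    var steps = Array.prototype.slice.call(arguments);
    return function(text) {
        return steps.reduce(function(acc, step) { return step(acc); }, text);
    };
}

// Order matters: slashes are handled before any HTML is generated, and
// double asterisks before single ones.
var formatA = pipeline(slashItalics, singleStarBold, singleBracketLink);
var formatB = pipeline(doubleStarBold, singleStarItalics, doubleBracketLink);
var formatC = pipeline(doubleStarBold, singleStarItalics, singleBracketLink);

formatA("*bold* and /italics/");   // "<b>bold</b> and <i>italics</i>"
formatB("**bold** and *italics*"); // "<b>bold</b> and <i>italics</i>"
formatC("[Home]");                 // '<a href="Home">Home</a>'
```

Formats B and C share two of their three matchers, and A shares one with C, yet none of them needs to inherit from another.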
Notice that by eliminating the Page superclass, we don’t have to replace it with anything. This is where JavaScript’s dynamic typing really shines. Anyone who wishes to implement a new custom format can do so without needing to “register” it somewhere. The displayPage method works with any JavaScript object whatsoever, so long as it has the proper structure: the expected getTitle, getAuthor, and toHTML methods, each with the expected behavior.
This kind of interface is sometimes known as structural typing or duck typing: Any object will do so long as it has the expected structure (if it looks like a duck, swims like a duck, and quacks like a duck...). It’s an elegant programming pattern and especially lightweight in dynamic languages such as JavaScript, since it doesn’t require you to write anything explicit. A function that calls methods on an object will work on any object that implements the same interface. Of course, you should list out the expectations of an object interface in your API documentation. This way, implementers know what properties and methods are required, and what your libraries or applications expect of their behavior.
Another benefit of the flexibility of structural typing is for unit testing. Our wiki library probably expects to be plugged into an HTTP server object that implements the networking functionality of the wiki. If we want to test the interaction sequences of the wiki without actually connecting to the network, we can implement a mock object that pretends to behave like a live HTTP server but follows a prescribed script instead of touching the network. This provides a repeatable interaction with a fake server, instead of relying on the unpredictable behavior of the network, making it possible to test the behavior of components that interact with the server.
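A mock of this kind might look like the sketch below. The server interface is an assumption made for the example: the code under test is taken to need only a get method that accepts a path and a callback. MockServer and loadPageSource are hypothetical names.

```javascript
// Hypothetical code under test: it requires only that `server` has a
// get(path, callback) method, where callback receives (error, body).
function loadPageSource(server, title, callback) {
    server.get("/pages/" + encodeURIComponent(title), callback);
}

// A mock that follows a prescribed script instead of touching the network.
function MockServer(script) {
    this.script = script; // maps paths to canned responses
    this.requests = [];   // records every path requested
}

MockServer.prototype.get = function(path, callback) {
    this.requests.push(path);
    if (Object.prototype.hasOwnProperty.call(this.script, path)) {
        callback(null, this.script[path]);
    } else {
        callback(new Error("404: " + path));
    }
};

var mock = new MockServer({ "/pages/Home": "Welcome to *the wiki*." });
var received;
loadPageSource(mock, "Home", function(err, body) { received = body; });
received;      // "Welcome to *the wiki*."
mock.requests; // ["/pages/Home"]
```

Because the mock records its requests, a test can check both what the wiki code received and what it asked for.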
• Use structural typing (also known as duck typing) for flexible object interfaces.
• Avoid inheritance when structural interfaces are more flexible and lightweight.
• Use mock objects, that is, alternative implementations of interfaces that provide repeatable behavior, for unit testing.
Consider two different class APIs. The first is for bit vectors: ordered collections of bits.
var bits = new BitVector();
bits.enable(4);
bits.enable([1, 3, 8, 17]);
bits.bitAt(4); // 1
bits.bitAt(8); // 1
bits.bitAt(9); // 0
Notice that the enable method is overloaded: You can pass it either an index or an array of indices.
The second class API is for string sets: unordered collections of strings.
var set = new StringSet();
set.add("Hamlet");
set.add(["Rosencrantz", "Guildenstern"]);
set.add({ "Ophelia": 1, "Polonius": 1, "Horatio": 1 });
set.contains("Polonius"); // true
set.contains("Guildenstern"); // true
set.contains("Falstaff"); // false
Similar to the enable method of bit vectors, the add method is also overloaded, but in addition to strings and arrays of strings, it also accepts a dictionary object.
To implement BitVector.prototype.enable, we can avoid the question of how to determine whether an object is an array by testing the other case first:
BitVector.prototype.enable = function(x) {
    if (typeof x === "number") {
        this.enableBit(x);
    } else { // assume x is array-like
        for (var i = 0, n = x.length; i < n; i++) {
            this.enableBit(x[i]);
        }
    }
};
No problem. What about StringSet.prototype.add? Now we seem to need to distinguish between arrays and objects. But that question doesn’t even make sense—JavaScript arrays are objects! What we really want to do is separate out array objects from nonarray objects.
Making this distinction is at odds with JavaScript’s flexible notion of “array-like” objects (see Item 51). Any object can be treated as an array as long as it obeys the right interface. And there’s no clear way to test an object to see whether it’s intended to satisfy an interface. We might try to guess that an object that has a length property is intended to be an array, but this is no guarantee; what if we happen to use a dictionary object that has the key "length" in it?
dimensions.add({
    "length": 1, // implies array-like?
    "height": 1,
    "width": 1
});
Using imprecise heuristics to determine an object’s interface is a recipe for misunderstanding and misuse. Guessing whether an object implements a structural type is sometimes known as duck testing (after the “duck types” described in Item 57), and it’s bad practice. Since objects are not tagged with explicit information to indicate the structural types they implement, there’s no reliable, programmatic way to detect this information.
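To see how such a guess goes wrong, here is a minimal sketch of a duck test. The looksArrayLike helper is a hypothetical name introduced only for this illustration:

```javascript
// A naive duck test (hypothetical helper): guess that anything with a
// numeric length property is meant to be array-like.
function looksArrayLike(x) {
    return typeof x === "object" && x !== null &&
           typeof x.length === "number";
}

var list = ["length", "height", "width"];
var dict = { "length": 1, "height": 1, "width": 1 };

var listResult = looksArrayLike(list); // true
var dictResult = looksArrayLike(dict); // true, although dict was meant as a dictionary
```

The heuristic cannot tell the two apart, so an API that relied on it would silently treat the dictionary as a one-element array.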
Overloading two types means there must be a way to distinguish the cases. And it’s not possible to detect that a value implements a structural interface. This leads to the following rule:
APIs should never overload structural types with other overlapping types.
For StringSet, the answer is not to use the structural “array-like” interface in the first place. We should instead choose a type that carries a well-defined “tag” indicating that the user truly intends it to be an array. An obvious but imperfect choice is to use the instanceof operator to test whether an object inherits from Array.prototype:
StringSet.prototype.add = function(x) {
    if (typeof x === "string") {
        this.addString(x);
    } else if (x instanceof Array) { // too restrictive
        x.forEach(function(s) {
            this.addString(s);
        }, this);
    } else {
        for (var key in x) {
            this.addString(key);
        }
    }
};
After all, we know for sure that anytime an object is an instance of Array, it behaves like an array. But this time it turns out that this is too fine a distinction. In environments where there can be multiple global objects, there may be multiple copies of the standard Array constructor and prototype object. This happens in the browser, where each frame gets a separate copy of the standard library. When communicating values between frames, an array from one frame will not inherit from the Array.prototype of another frame.
For this reason, ES5 introduced the Array.isArray function, which tests whether a value is an array, regardless of prototype inheritance. In ECMAScript standards-ese, this function tests whether the value of the internal [[Class]] property of the object is "Array". When you need to test whether an object is a true array, not just an array-like object, Array.isArray is more reliable than instanceof.
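The difference can be observed outside the browser as well. The following sketch assumes a Node.js environment, where the built-in vm module can evaluate code against a separate global object, playing a role similar to a second frame:

```javascript
// Node.js sketch: vm.runInNewContext evaluates code against a separate
// global object, much as a browser frame has its own copy of Array.
var vm = require("vm");

var foreign = vm.runInNewContext("[1, 2, 3]");

var viaInstanceof = foreign instanceof Array; // false: different Array.prototype
var viaIsArray = Array.isArray(foreign);      // true: still a true array
```

The foreign array behaves like any other array, but it inherits from the other context’s Array.prototype, so only Array.isArray recognizes it.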
This leads to a more robust implementation of the add method:
StringSet.prototype.add = function(x) {
    if (typeof x === "string") {
        this.addString(x);
    } else if (Array.isArray(x)) { // tests for true arrays
        x.forEach(function(s) {
            this.addString(s);
        }, this);
    } else {
        for (var key in x) {
            this.addString(key);
        }
    }
};
In environments that don’t support ES5, you can use the standard Object.prototype.toString method to test whether an object is an array:
var toString = Object.prototype.toString;
function isArray(x) {
    return toString.call(x) === "[object Array]";
}
The Object.prototype.toString function uses the internal [[Class]] property of an object to create its result string, so it too is a more reliable method than instanceof for testing whether an object is an array.
Notice that this version of add has different behavior that affects consumers of the API. The array version of the overloaded API does not accept arbitrary array-like objects. You can’t, for example, pass an arguments object and expect it to be treated as an array:
function MyClass() {
    this.keys = new StringSet();
    // ...
}
MyClass.prototype.update = function() {
    this.keys.add(arguments); // treated as a dictionary
};
Instead, the correct way to use add is to convert the object to a true array, using the idiom described in Item 51:
MyClass.prototype.update = function() {
    this.keys.add([].slice.call(arguments));
};
Callers need to do this conversion whenever they want to pass an array-like object to an API that expects a true array. For this reason, it’s necessary to document which of the two types your API accepts. In the examples above, the enable method accepts numbers and array-like objects, whereas the add method accepts strings, true arrays, and (nonarray) objects.
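The following sketch makes the difference concrete. The storage strategy for StringSet (a prototype-less object used as a dictionary) and the addAll and addAllWrong helpers are assumptions made for this example, not a definitive implementation:

```javascript
// Minimal StringSet sketch; storing elements as keys of a prototype-less
// object is an assumption made for this example.
function StringSet() {
    this.elements = Object.create(null);
}
StringSet.prototype.addString = function(s) {
    this.elements[s] = true;
};
StringSet.prototype.contains = function(s) {
    return s in this.elements;
};
StringSet.prototype.add = function(x) {
    if (typeof x === "string") {
        this.addString(x);
    } else if (Array.isArray(x)) {
        x.forEach(function(s) {
            this.addString(s);
        }, this);
    } else {
        for (var key in x) {
            this.addString(key);
        }
    }
};

// Hypothetical callers: the first converts arguments to a true array,
// the second passes the arguments object directly.
function addAll(set) {
    set.add([].slice.call(arguments, 1)); // skip the set itself
}
function addAllWrong(set) {
    set.add(arguments); // treated as a dictionary: keys "0", "1", ...
}

var good = new StringSet();
addAll(good, "Hamlet", "Ophelia");
var goodHasHamlet = good.contains("Hamlet"); // true

var bad = new StringSet();
addAllWrong(bad, "Hamlet", "Ophelia");
var badHasHamlet = bad.contains("Hamlet");   // false
var badHasIndex = bad.contains("1");         // true: the index became an element
```

No error is raised in the second case. The set simply ends up holding the argument indices, which is why the accepted types need to be documented.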
• Never overload structural types with other overlapping types.
• When overloading a structural type with other types, test for the other types first.
• Accept true arrays instead of array-like objects when overloading with other object types.
• Document whether your API accepts true arrays or array-like values.
• Use ES5’s Array.isArray to test for true arrays.
JavaScript is notoriously lax about types (see Item 3). Many of the standard operators and libraries automatically coerce their arguments to the expected type rather than throwing exceptions for unexpected inputs. Without additional logic, building off of these built-in operations inherits their coercing behavior:
function square(x) {
    return x * x;
}
square("3"); // 9
Coercions can certainly be convenient. But as Item 3 points out, they can also cause trouble, hiding errors and leading to erratic and hard-to-diagnose behavior.
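A few more calls to the same square function illustrate how quietly unexpected inputs slip through. The inputs are chosen only for illustration:

```javascript
function square(x) {
    return x * x;
}

var fromString = square("3");   // 9
var fromArray = square([3]);    // 9: the array is coerced to "3", then to 3
var fromNull = square(null);    // 0: null is coerced to 0
var fromText = square("abc");   // NaN: no exception, just a silent NaN
```

None of these calls throws, so a mistaken argument may only surface much later, far from the call that caused it.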
Coercions are especially confusing when working with overloaded function signatures, like the enable method of the bit vector class of Item 58. The method uses its argument’s type to determine its behavior. The signature would become harder to understand if enable attempted to coerce its argument to an expected type. Which type should it choose? Coercing to a number completely breaks the overloading:
BitVector.prototype.enable = function(x) {
    x = Number(x);
    if (typeof x === "number") { // always true
        this.enableBit(x);
    } else { // never executed
        for (var i = 0, n = x.length; i < n; i++) {
            this.enableBit(x[i]);
        }
    }
};
As a general rule, it’s wise to avoid coercing arguments whose type is used to determine an overloaded function’s behavior. Coercions make it harder to tell which variant you will end up with. Imagine trying to make sense of this use:
bits.enable("100"); // number or array-like?
This use of enable is ambiguous: The caller could plausibly have intended the argument to be treated as a number or as an array of bit values. But our method was not designed for strings, so there’s no way to know. It’s likely an indication that the caller didn’t understand the API. In fact, if we wanted to be a little more careful in our API, we could enforce that only numbers and objects are accepted:
BitVector.prototype.enable = function(x) {
    if (typeof x === "number") {
        this.enableBit(x);
    } else if (typeof x === "object" && x) {
        for (var i = 0, n = x.length; i < n; i++) {
            this.enableBit(x[i]);
        }
    } else {
        throw new TypeError("expected number or array-like");
    }
};
This last version of enable is an example of a more cautious style known as defensive programming, which attempts to defend against potential errors with additional checks. In general, it’s not possible to defend against all possible bugs. For example, we could also check to ensure that if x is an object it also has a length property, but this wouldn’t protect against, say, an accidental use of a String object. And JavaScript provides only very rudimentary tools for implementing these checks, such as the typeof operator, but it’s possible to write utility functions to guard function signatures more concisely. For example, we could guard the BitVector constructor with a single up-front check:
function BitVector(x) {
    uint32.or(arrayLike).guard(x);
    // ...
}
To make this work, we can build a utility library of guard objects with the help of a shared prototype object that implements the guard method:
var guard = {
    guard: function(x) {
        if (!this.test(x)) {
            throw new TypeError("expected " + this);
        }
    }
};
Each guard object then implements its own test method and string description for error messages:
var uint32 = Object.create(guard);
uint32.test = function(x) {
    return typeof x === "number" && x === (x >>> 0);
};
uint32.toString = function() {
    return "uint32";
};
The uint32 guard uses a trick of JavaScript’s bitwise operators to perform a conversion to an unsigned 32-bit integer. The unsigned right shift operator converts its first argument to an unsigned 32-bit integer before performing a bitwise shift (see Item 2). Shifting by zero bits then has no effect on the integer value. So uint32.test effectively compares a number to the result of converting it to an unsigned 32-bit integer.
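To make the effect of the comparison concrete, the sketch below applies the same test to a handful of values. The standalone isUint32 function is a hypothetical name used only so the example runs on its own:

```javascript
// Same test as uint32.test, written as a standalone function
// (hypothetical name) so the example is self-contained.
function isUint32(x) {
    return typeof x === "number" && x === (x >>> 0);
}

var checks = [
    isUint32(17),         // true
    isUint32(4294967295), // true: largest unsigned 32-bit integer
    isUint32(4294967296), // false: the shift wraps it around to 0
    isUint32(-1),         // false: -1 >>> 0 is 4294967295
    isUint32(3.5),        // false: 3.5 >>> 0 is 3
    isUint32(NaN),        // false: NaN >>> 0 is 0
    isUint32("17")        // false: not a number at all
];
```

Any value that changes under the conversion, or is not a number to begin with, fails the test.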
Next we can implement the arrayLike guard object:
var arrayLike = Object.create(guard);
arrayLike.test = function(x) {
    return typeof x === "object" && x && uint32.test(x.length);
};
arrayLike.toString = function() {
    return "array-like object";
};
Notice that we have taken defensive programming one step further here, ensuring that an array-like object has an unsigned integer length property.
Lastly, we can implement “chaining” methods (see Item 60), such as or, as prototype methods:
guard.or = function(other) {
    var result = Object.create(guard);
    var self = this;
    result.test = function(x) {
        return self.test(x) || other.test(x);
    };
    var description = this + " or " + other;
    result.toString = function() {
        return description;
    };
    return result;
};
This method combines the receiver guard object (the object bound to this) with a second guard object (the other parameter), producing a new guard object whose test and toString methods combine the two input objects’ methods. Notice that we use a local self variable to save a reference to this (see Items 25 and 37) for use inside the resultant guard object’s test method.
These tests can help catch bugs earlier when they crop up, which makes them significantly easier to diagnose. Nevertheless, they can clutter a codebase and potentially affect application performance. Whether to use defensive programming is a question of cost (the number of extra tests you have to write and execute) versus benefit (the number of bugs you catch earlier, saving development and debugging time).
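Putting the pieces together, the following sketch repeats the guard objects from this Item and exercises a combined guard on a few inputs. The describeFailure helper and the indexOrIndices variable are hypothetical names added for the demonstration:

```javascript
var guard = {
    guard: function(x) {
        if (!this.test(x)) {
            throw new TypeError("expected " + this);
        }
    }
};
guard.or = function(other) {
    var result = Object.create(guard);
    var self = this;
    result.test = function(x) {
        return self.test(x) || other.test(x);
    };
    var description = this + " or " + other;
    result.toString = function() {
        return description;
    };
    return result;
};

var uint32 = Object.create(guard);
uint32.test = function(x) {
    return typeof x === "number" && x === (x >>> 0);
};
uint32.toString = function() {
    return "uint32";
};

var arrayLike = Object.create(guard);
arrayLike.test = function(x) {
    return typeof x === "object" && x && uint32.test(x.length);
};
arrayLike.toString = function() {
    return "array-like object";
};

// Hypothetical helper: run a guard and report the outcome as a string.
function describeFailure(g, x) {
    try {
        g.guard(x);
        return "ok";
    } catch (e) {
        return e.message;
    }
}

var indexOrIndices = uint32.or(arrayLike);
var r1 = describeFailure(indexOrIndices, 4);         // "ok"
var r2 = describeFailure(indexOrIndices, [1, 3, 8]); // "ok"
var r3 = describeFailure(indexOrIndices, "100");     // "expected uint32 or array-like object"
var r4 = describeFailure(indexOrIndices, -1);        // "expected uint32 or array-like object"
```

The combined guard rejects the ambiguous string "100" up front, and its error message is assembled from the descriptions of the two guards it combines.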
• Avoid mixing coercions with overloading.
• Consider defensively guarding against unexpected inputs.
Part of the power of stateless APIs (see Item 56) is their flexibility for building compound operations out of smaller ones. A great example is the replace method of strings. Since the result is itself a string, we can perform multiple replacements by repeatedly calling replace on the result of the previous method call. A common usage of this pattern is for replacing special characters of a string before inserting it into HTML:
function escapeBasicHTML(str) {
    return str.replace(/&/g, "&amp;")
              .replace(/</g, "&lt;")
              .replace(/>/g, "&gt;")
              .replace(/"/g, "&quot;")
              .replace(/'/g, "&apos;");
}
The first call to replace returns a string with all instances of the special character "&" replaced with the HTML escape sequence "&amp;"; the second call then replaces any instances of "<" with the escape sequence "&lt;", and so on. This style of repeated method calls is known as method chaining. It’s not necessary to write in this style, but it’s much more concise than saving each intermediate result to an intermediate variable:
function escapeBasicHTML(str1) {
    var str2 = str1.replace(/&/g, "&amp;");
    var str3 = str2.replace(/</g, "&lt;");
    var str4 = str3.replace(/>/g, "&gt;");
    var str5 = str4.replace(/"/g, "&quot;");
    var str6 = str5.replace(/'/g, "&apos;");
    return str6;
}
Eliminating the temporary variables makes it clearer to readers of the code that the intermediate results are only important as a step along the way to the final result.
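As a quick check of the chained version, the sketch below runs it on a sample string. It also shows why the order of the steps matters, using escapeWrongOrder, a hypothetical variant written only to demonstrate the problem:

```javascript
function escapeBasicHTML(str) {
    return str.replace(/&/g, "&amp;")
              .replace(/</g, "&lt;")
              .replace(/>/g, "&gt;")
              .replace(/"/g, "&quot;")
              .replace(/'/g, "&apos;");
}

var escaped = escapeBasicHTML('<a href="/cats">Tom & Jerry</a>');
// escaped: "&lt;a href=&quot;/cats&quot;&gt;Tom &amp; Jerry&lt;/a&gt;"

// Hypothetical variant that escapes "&" last, so it re-escapes the
// ampersands produced by the earlier step.
function escapeWrongOrder(str) {
    return str.replace(/</g, "&lt;")
              .replace(/&/g, "&amp;");
}
var broken = escapeWrongOrder("<");
// broken: "&amp;lt;"
```

Each step in a chain sees the output of the one before it, so the ampersand has to be replaced first.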
Method chaining can be used whenever an API produces objects of some interface (see Item 57) with methods that produce more objects, often of the same interface. The array iteration methods described in Items 50 and 51 are another great example of a “chainable” API:
var users = records.map(function(record) {
                    return record.username;
                })
                .filter(function(username) {
                    return !!username;
                })
                .map(function(username) {
                    return username.toLowerCase();
                });
This chained operation takes an array of objects representing user records, extracts the username property of each record, filters out any empty usernames, and finally converts the usernames to lowercase strings.
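With some sample data, the effect of each step is easy to follow. The records array below is made up for the illustration:

```javascript
// Sample data (hypothetical) covering the interesting cases.
var records = [
    { username: "Alice" },
    { username: "" },       // empty username: filtered out
    { id: 7 },              // no username property at all: filtered out
    { username: "BOB" }
];

var users = records.map(function(record) {
                    return record.username;
                })
                .filter(function(username) {
                    return !!username;
                })
                .map(function(username) {
                    return username.toLowerCase();
                });
// users: ["alice", "bob"]
```

The original records array is left unchanged. Each method returns a new array for the next step to work on.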
This style is so flexible and expressive for consumers of an API that it’s worth designing your API to support it. Often, in stateless APIs, “chainability” falls out as a natural consequence: If your API does not modify an object, it has to return a new object. As a result, you get an API whose methods all produce more objects with similar sets of methods.
Method chaining is also useful to support in a stateful setting. The trick here is for methods that update an object to return this instead of undefined. This makes it possible to perform multiple updates on the same object via a sequence of chained method calls:
element.setBackgroundColor("yellow")
       .setColor("red")
       .setFontWeight("bold");
Method chaining for stateful APIs is sometimes known as the fluent style. (The term was coined by programmers simulating Smalltalk’s “method cascades,” a built-in syntax for calling multiple methods on a single object.) If the update methods do not return this, then the user of the API has to repeat the name of the object each time. If the object is simply named by a variable, this doesn’t make much difference. But when combining stateless methods that retrieve objects with update methods, method chaining can make for very concise and readable code. The front-end library jQuery popularized this approach with a set of (stateless) methods for “querying” a web page for user interface elements and a set of (stateful) methods for updating those elements:
$("#notification")                  // find notification element
    .html("Server not responding.") // set notification message
    .removeClass("info")            // remove one set of styling
    .addClass("error");             // add more styling
Since the stateful calls to the html, removeClass, and addClass methods support the fluent style by returning the same object, we don’t even have to create a temporary variable for the result of the query performed by the jQuery function ($). Of course, if users find this style too terse, they can always introduce a variable to name the result of the query:
var element = $("#notification");
element.html("Server not responding.");
element.removeClass("info");
element.addClass("error");
But by supporting method chaining, the API allows programmers to decide for themselves which style they prefer. If the methods returned undefined, users would be forced to write in the more verbose style.
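On the implementation side, supporting the fluent style costs one extra line per update method. The sketch below assumes a hypothetical StyledElement constructor. The method names follow the earlier example, but the implementation is only an illustration:

```javascript
// Hypothetical element wrapper that records style settings in an object.
function StyledElement() {
    this.style = {};
}
StyledElement.prototype.setBackgroundColor = function(color) {
    this.style.backgroundColor = color;
    return this; // returning this is what makes chaining possible
};
StyledElement.prototype.setColor = function(color) {
    this.style.color = color;
    return this;
};
StyledElement.prototype.setFontWeight = function(weight) {
    this.style.fontWeight = weight;
    return this;
};

var element = new StyledElement();
var result = element.setBackgroundColor("yellow")
                    .setColor("red")
                    .setFontWeight("bold");
// result === element, and element.style holds all three settings
```

Callers who prefer one statement per update can still write that style, since ignoring the return value is harmless.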
• Use method chaining to combine stateless operations.
• Support method chaining by designing stateless methods that produce new objects.
• Support method chaining in stateful methods by returning this.