Thinking in terms of API-driven and documentation-driven design yields more usable modules than designing without either. You might argue that internals are not that important: “as long as the interface holds, we can put anything we want in the mix!” But a usable interface is only one side of the equation; it will do little to keep the maintainability of your applications in check. Properly designed module internals help keep our code readable and its intent clear. In this chapter, we’ll discuss what it takes to write modules with scalability in mind but without getting too far ahead of our current requirements. We’ll explore the CRUST constraints in more depth, and finally elaborate on how to prune modules as they become larger and more complex over time.
Small, single-purpose functions are the lifeblood of clean module design. Purpose-built functions scale well because they introduce little organizational complexity into the module they belong to, even when that module grows to 500 lines of code. Small functions are not necessarily less powerful than large functions, but their power lies in composition.
Suppose that instead of implementing a single function with 100 lines of code, we break it up into three or more smaller functions. We might later be able to reuse one of those smaller functions somewhere else in our module, or it might prove a useful addition to its public interface.
In this chapter, we’ll discuss design considerations aimed at reducing complexity at the module level. While most of the concerns we’ll discuss here have an effect on the way we write functions, it is in the next chapter where we’ll be specifically devoting our time to the development of simple functions.
Cleanly composed functions are at the heart of effective module design. Functions are the fundamental unit of our code. We could get away with writing the smallest possible number of functions required, the ones that are invoked by consumers or need to be passed for other interfaces to consume, but that wouldn’t get us much in the way of maintainability.
We could rely solely on intuition to decide what deserves to be its own function and what is better left inlined as part of a larger body of code, but this might leave us with inconsistencies that depend on our frame of mind, as well as how each member of a team perceives functions are to be sliced. As we’ll see in the next chapter, pairing a few rules of thumb with our own intuition is an effective way of keeping functions simple, limiting their scope.
At the module level, we must implement features with the API surface in mind. When we plan out new functionality, we have to consider whether the abstraction is right for our consumers, how it might evolve and scale over time, and how narrowly or broadly it can support the use cases of its consumers.
When considering whether the abstraction is right, suppose we have a function that’s a draggable object factory for DOM elements. Draggable objects can be moved around and then dropped in a container, but consumers often have to impose different limitations on the conditions under which the object can be moved, some of which are outlined in the following list:
Draggable elements must have a parent with a draggable-list class.
Draggable elements mustn’t have a draggable-frozen class.
Dragging must initiate from a child with a drag-handle class.
Elements may be dropped into containers with a draggable-dropzone class.
Elements may be dropped into containers with at most six children.
Elements may not be dropped into the container they’re being dragged from.
Elements must be sortable in the container they’re dragged from, but they can’t be dropped into other containers.
We’ve now spent quite a bit of time thinking about use cases for a drag-and-drop library, so we’re well equipped to come up with an API that will satisfy most or maybe even every one of these use cases, without dramatically broadening our API surface.
Consider, in contrast, the situation if we were to go off and implement a way of checking off each use case in isolation without taking into account similar use cases, or cases that might arise but are not an immediate need. We would end up with seven ways of introducing specific restrictions on how elements are dragged and dropped. Since we’ve designed their interfaces in isolation, each of these solutions is likely to be at least slightly different from the rest. Maybe they’re similar enough that each of them is an option flag, but consumers still can’t help but wonder why we have seven flags for such similar use cases, and they can’t shake the feeling that we’ve designed the interface poorly. But there wasn’t much in the way of design; we’ve mostly tacked requirement upon requirement onto our API surface as each came along, never daring to look at the road ahead and envision how the API might evolve in the future. If we had designed the interfaces with scalability in mind, we might’ve grouped many similar use cases under the same feature, and would’ve avoided an unnecessarily large API surface in the process.
Now let’s go back to the case where we do spend some time thinking ahead and create a collection of similar requirements and use cases. We should be able to find a common denominator that’s suitable for most use cases. We’ll know when we have the right abstraction because it’ll cater to every requirement we have, and a few we didn’t even have to fulfill but that the abstraction satisfies anyhow. In the case of draggable elements, once we’ve taken all the requirements into account, we might choose to define a few options that impose restrictions based on a few CSS selectors. Alternatively, we might introduce a callback whereby the user can determine whether an element can be dragged and another whereby they can determine whether the element can be dropped. These choices also depend on how heavily the API is going to be used, how flexible we want it to be, and how frequently we intend to make changes to it.
Sometimes we won’t have the opportunity to think ahead. We might not be able to foresee all possible use cases. Our forecasts may fail us, or requirements may change, pulling the rug from under our feet. Granted, this never is the ideal situation to find ourselves in, but we certainly wouldn’t be better off if we hadn’t paid attention to the use cases for our module in aggregate. On the other hand, extra requirements may fit within the bounds of an abstracted solution, provided the new use case is similar enough to what we expected when designing the abstraction.
Abstractions aren’t free, but they can shield portions of code from complexity. Naturally, we could boldly claim that an elegant interface such as fn => fn() solves all problems in computing; the consumer needs to provide only the right fn callback. The reality is, we wouldn’t be doing anything but offloading the problem onto the consumers, at the cost of implementing the right solution themselves while still consuming our API in the process.
When we’re weighing whether to offer an interface like CSS selectors or callbacks, we’re deciding how much we want to abstract, and how much we want to leave up to the consumer. When we choose to let the user provide CSS selectors, we keep the interface short, but the use cases will be limited as well. Consumers won’t be able, for example, to decide dynamically whether the element is draggable, beyond what a CSS selector can offer. When we choose to let users provide callbacks, we make it harder for them to use our interface, since they now have to provide bits and pieces of the implementation themselves. However, that expense buys them great flexibility in deciding what is draggable and what is not.
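To make the trade-off concrete, here is a minimal sketch of both approaches. The function names (selectorRules, callbackRules) and option names are hypothetical, invented for illustration, and elements are assumed to be anything exposing a DOM-like matches(selector) method.

```javascript
// Hypothetical sketch: function and option names are illustrative only.
// "Elements" are anything exposing a DOM-like matches(selector) method.

// Option 1: restrictions expressed as CSS selectors.
// Short to configure, but limited to what a selector can express.
function selectorRules({ handle, frozen, dropzone }) {
  return {
    // origin is the element where the drag gesture started
    canDrag: (el, origin) => origin.matches(handle) && !el.matches(frozen),
    canDrop: (el, target) => target.matches(dropzone)
  };
}

// Option 2: restrictions expressed as callbacks.
// The consumer writes more code, but can decide dynamically.
function callbackRules({ canDrag = () => true, canDrop = () => true } = {}) {
  return { canDrag, canDrop };
}

// A restriction that depends on runtime state is easy to state as a callback.
const rules = callbackRules({
  canDrop: (el, target) => target.children.length <= 6
});
```

The selector version keeps the consumer's code to a handful of strings, while the callback version hands over the decision entirely, which is exactly the expense-for-flexibility exchange described above.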
Like most things in program design, API design is a constant trade-off between simplicity and flexibility. For each particular case, it is our responsibility to decide how flexible we want the interface to be, but at the expense of simplicity. We can also decide how simple we want an interface to be, but at the expense of flexibility. Going back to jQuery, it’s interesting to note how it always favors simplicity, by allowing you to provide as little information as needed for most of its API methods. Meanwhile, it avoids sacrificing flexibility by offering countless overloads for each of its API methods. The complexity lies in its implementation, balancing arguments by figuring out whether they’re a NodeList, a DOM element, an array, a function, a selector, or something else (not to mention optional parameters) before even starting to fulfill the consumer’s goal when making an API call. Consumers observe some of the complexity at the seams when sifting through documentation and finding out about all the ways of accomplishing the same goals. And yet, despite all of jQuery’s internal complexity, code that consumes the jQuery API manages to stay ravishingly simple.
Before we go off and start pondering the best ways of abstracting a feature that we need to implement so that it caters to every single requirement that might come in the future, it’s necessary to take a step back and consider simpler alternatives. A simple implementation means we pay smaller up-front costs, but it doesn’t necessarily mean that new requirements will result in breaking changes.
Interfaces don’t need to cater to every conceivable use case from the outset. As we’ve analyzed in Chapter 2, sometimes we may get away with first implementing a solution for the simplest or most common use case, and then adding an options parameter through which newer use cases can be configured. As we get to more-advanced use cases, we can make decisions as outlined in the previous section, choosing which use cases deserve to be grouped under an abstraction and which are too narrow for an abstraction to be worthwhile.
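As a rough illustration of growing an interface through an options parameter, consider the following sketch. The truncate function and its length and suffix options are hypothetical; the point is only that calls written against the original single-argument form keep working once options are introduced.

```javascript
// Hypothetical example. The first version only supported truncate(text),
// always cutting at 80 characters. Adding a defaulted options parameter
// lets newer use cases be configured without breaking existing callers.
function truncate(text, { length = 80, suffix = '...' } = {}) {
  if (text.length <= length) {
    return text;
  }
  // Reserve room for the suffix so the result never exceeds `length`.
  return text.slice(0, length - suffix.length) + suffix;
}

truncate('hello world');                // original call style still works
truncate('hello world', { length: 8 }); // newer use case, opt-in
```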
Similarly, the interface could start off supporting only one way of receiving its inputs, and as use cases evolve, we might bake polymorphism into the mix, accepting multiple input types in the same parameter position. Grandiose thinking may lead us to believe that, in order to be great, our interfaces must be able to handle every input type and be highly configurable with dozens of configuration options. This might well be true for the most advanced users of our interface, but if we don’t take the time to let the interface evolve and mature as needed, we might code our interface into a corner that can then be repaired only by writing a different component from the ground up with a better thought-out interface, and later replacing references to the old component with the new one.
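Polymorphism of this kind usually boils down to a normalization step at the top of the function. The sketch below assumes a hypothetical toElements helper that accepts a selector string, an array of elements, or a single element in the same parameter position; a real library would likely need to handle more cases, such as array-like collections.

```javascript
// Hypothetical helper: normalizes three input shapes into an array.
// `root` is assumed to expose a DOM-like querySelectorAll method.
function toElements(input, root = globalThis.document) {
  if (typeof input === 'string') {
    // A selector: look the elements up.
    return Array.from(root.querySelectorAll(input));
  }
  if (Array.isArray(input)) {
    // Already a list of elements.
    return input;
  }
  // Anything else is treated as a single element.
  return [input];
}
```

Every additional branch here is implementation complexity we take on so that consumers don't have to, which is why it pays to add these branches only as real use cases call for them.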
A larger interface is rarely better than a smaller interface that accomplishes the job consumers need it to fulfill. Elegance is of the essence here: if we want our interface to remain small but predict that consumers will eventually need to hook into different pieces of our component’s internal behavior so that they can react accordingly, we’re better off waiting until this requirement materializes than building a solution for a problem we don’t yet have.
Not only will we be focusing development hours on functionality that’s needed today, but we’ll also avoid creating complexity that can be dispensed with for the time being. It might be argued that the ability to react to internal events of a library won’t introduce a lot of complexity. Imagine, however, that the requirement never materializes. We’d have burdened our component with increased complexity to satisfy functionality we never needed. Worse yet, say the requirement changes between the moment we’ve implemented a solution and the time it’s actually needed. We’d now have functionality we never needed, which clashes with different functionality that we do need.
Suppose we don’t need hooks only to react to events, but we need those hooks to be able to transform internal state. How would the event hooks’ interface change? Chances are, someone might’ve found a use for the event listeners we implemented earlier, and so we cannot dispose of them with ease. We might be forced to change the event listener API to support internal state transformations, which would result in a cringe-worthy interface that’s bound to frustrate implementors and consumers alike.
Falling into the trap of implementing features that consumers don’t yet need might be easy at first, but it’ll cost us dearly in terms of complexity, maintainability, and wasted developer hours. The best code is no code at all. This means fewer bugs, less time spent writing code, less time writing documentation, and less time fielding support requests. Latch onto that mentality and strive to keep functionality to exactly the absolute minimum that’s required.
It’s important to note that abstractions should evolve naturally, rather than have them force an implementation style upon us. When we’re unsure about whether to bundle a few use cases with an abstraction, the best option is often to wait and see whether more use cases would fall into the abstraction we’re considering. If we wait, and the abstraction holds true for more and more use cases, we can go ahead and implement the abstraction. If the abstraction doesn’t hold, we can be thankful we won’t have to bend the abstraction to fit the new use cases, often breaking the abstraction or causing more grief than the abstraction had originally set out to avoid on our behalf.
In a similar fashion to that of the preceding section, we should first wait until use cases emerge and then reconsider an abstraction when its benefits become clear. While developing unneeded functionality is little more than a waste of time, leveraging the wrong abstractions will kill or, at best, cripple our component’s interface. Although good abstractions are a powerful tool that can reduce the complexity and volume of code we write, subjecting consumers to inappropriate abstractions might increase the amount of code they need to write and will forcibly increase complexity by having users bend to the will of the abstraction, causing frustration and eventual abandonment of the poorly abstracted component.
HTTP libraries are a great example of how the right abstraction for an interface depends entirely on the use cases its consumer has in mind. Plain GET calls can be serviced with callbacks or promises, but streaming requires an event-driven interface that allows the consumer to act as soon as the stream has portions of data ready for consumption. A typical GET request could be serviced by an event-driven interface as well, allowing the implementor to abstract every use case under an event-driven model. To the consumer, this model would feel a bit convoluted for the simplest case, however. Even when we’ve grouped every use case under a convenient abstraction, the consumer shouldn’t have to settle for get('/cats').on('data', gotCats) when their use case doesn’t involve streaming. They could be using a simpler get('/cats', gotCats) interface instead, which wouldn’t need to handle error events separately, either, instead relying on the Node.js convention whereby the first argument passed to callbacks is an error or null when everything goes smoothly.
An HTTP library that’s primarily focused on streaming might go for the event-driven model in all cases because convenience methods such as a callback-based interface could be implemented on top of this minimal interface. This is acceptable; we’re focusing on the use case at hand and keeping our API surface as small as possible, while still allowing our library to be wrapped for higher-level consumption. If our library was primarily focused on the experience of leveraging its interface, we might go for the callback- or promise-based approach. When that library then has to support streaming, it might incorporate an event-driven interface. At this point, we’d have to decide whether to expose that kind of interface solely for streaming purposes, or whether it’d be available for commonplace scenarios as well. On the one hand, exposing it solely for the streaming use case keeps the API surface small. On the other, exposing it for every use case results in a more flexible and consistent API, which might be what consumers expect.
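The following sketch shows how a callback-based convenience method might be layered on top of an event-driven interface. The withCallback and streamingGet names are hypothetical, and streamingGet is a stand-in that fakes a response rather than performing a real HTTP request; the only real API used is Node.js's EventEmitter.

```javascript
const { EventEmitter } = require('events');

// Hypothetical wrapper: `request` is any function that returns an emitter
// firing 'data', 'end', and 'error' events.
function withCallback(request) {
  return (url, done) => {
    const chunks = [];
    let finished = false;
    // Make sure `done` is invoked at most once.
    const finish = (err, result) => {
      if (finished) return;
      finished = true;
      done(err, result);
    };
    request(url)
      .on('data', chunk => chunks.push(chunk))
      .on('error', err => finish(err))
      // Node.js convention: the first argument is an error, or null.
      .on('end', () => finish(null, chunks.join('')));
  };
}

// Stand-in for a streaming HTTP client; it fakes a two-chunk response.
function streamingGet(url) {
  const emitter = new EventEmitter();
  process.nextTick(() => {
    emitter.emit('data', '[{"name":');
    emitter.emit('data', '"Tom"}]');
    emitter.emit('end');
  });
  return emitter;
}

const get = withCallback(streamingGet);
get('/cats', (err, body) => console.log(err, body));
```

Consumers who need streaming keep using the event-driven interface directly, while everyone else gets the simpler get('/cats', gotCats) form without the library having to grow a second implementation.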
Context is of the utmost relevance here. When we’re developing an interface for an open source or otherwise broadly available library, we might need to listen to a variety of folks who’ll be weighing in on how the API should be designed. Depending on our audience, they may prefer a smaller API surface or a flexible interface. Over time, broadly available libraries tend to favor flexibility over simplicity as the number of users grows, and with them, the number of use cases the library needs to support. When the component is being developed in the context of our day jobs, we might not need to cater to a broad audience. It may well be that we ourselves are the only ones who will be consuming the API, or maybe our team. It might be that we belong to a UI platform team that serves the entire company, which would put us in a situation akin to the open source case, though.
In any case, when we’re uncertain whether our interface needs to expose certain surface areas, it’s highly recommended that we don’t expose any of it until we are indeed certain. Keeping API surfaces as small as possible reduces the odds of presenting the consumer with multiple ways of accomplishing the same task. Offering multiple ways is often undesirable, given that users will become confused and come knocking to ask which one is the best solution. There are a few answers. When the best solution is always the same, the other offerings probably don’t belong in our public interface. When the best solution depends on the use case, we should be on the lookout for better abstractions that encapsulate those similar use cases under a single solution. If the use cases are different enough, the solutions offered by the interface should be different as well, in which case consumers shouldn’t be faced with uncertainty: our interface would offer only a single solution for that particular use case.
You might have heard the “Move Fast and Break Things” mantra from Facebook. It’s dangerous to take this mantra literally in terms of software development, which should be neither hurried nor frequently broken, let alone on purpose. The mantra is meant to be interpreted as an invitation to experiment; the things we should be breaking are assumptions about how an application architecture should be laid out, how users behave, what advertisers want, and so on. Moving fast means quickly hashing out prototypes to test our newfound assumptions, seizing upon new markets in a timely fashion, keeping engineering from slowing to a crawl as teams and requirements grow in size and complexity, and constantly iterating on our products or codebases.
Taken literally, moving fast and breaking things is a dreadful way to go about software development. Any organization worth its salt would never encourage engineers to write code faster at the expense of product quality. Code should exist mostly because it has to, in order for the products it makes up to exist. The less complex the code we write, provided the product remains the same, the better.
The code that makes up a product should be covered by tests, minimizing the risk of bugs making their way to production. When we take “Move Fast and Break Things” literally, we are tempted to think testing is optional, since it slows us down and we need to move fast. A product that’s not covered by tests will be, ironically, unable to move fast when bugs inevitably arise and wind down engineering speed.
A better mantra might be one that can be taken literally, such as “Move Deliberately and Experiment.” This mantra carries the same sentiment as the Facebook mantra, but its true meaning isn’t meant to be decoded or interpreted. Experimentation is a key aspect of software design and development. We should constantly try out and validate new ideas, verifying whether they pose better solutions than the status quo. We could interpret “Move Fast and Break Things” as “A/B test early and A/B test often,” and “Move Deliberately and Experiment” can convey this meaning as well.
To move deliberately is to move with cause. Engineering tempo will rarely be guided by the development team’s desire to move faster, but is most often instead bound by release cycles and the complexity of requirements needed to meet those releases. Of course, everyone wants engineering to move fast where possible, but interface design shouldn’t be hurried, regardless of whether the interface we’re dealing with is an architecture, a layer, a component, or a function. Internals aren’t as crucial to get right: as long as the interface holds, the internals can be improved later for performance or readability gains. This is not to advocate sloppily developed internals, but rather to encourage respectfully and deliberately thought-out interface design.
We’re getting closer to function internals, which will be discussed at length in Chapter 4. Before we do so, we need to address a few more concerns on the component level. This section explores how to keep components simple by following the CRUST principle outlined in Chapter 2.
The DRY principle (Don’t Repeat Yourself) is one of the best regarded principles in software development, and rightly so. It prompts us to write a loop when we could write a hundred print statements. It makes us create reusable functions so that we don’t end up having to maintain several instances of the same piece of code. It also questions the need for slight permutations of what’s virtually the same piece of code repeated over and over across our codebases.
When taken to the extreme, though, DRY is harmful and hinders development. Our mission to find the right abstractions will be cut short if we are ever vigilant in our quest to suppress any and all repetition. When it comes to finding abstractions, it’s almost always best to pause and reflect on whether we ought to force DRY at this moment, or should wait a while and see whether a better pattern emerges.
Being too quick to follow DRY may result in selecting the wrong abstraction. This mistake can cost us time if we realize it early enough, and cause even more damage the longer we let an undesirable abstraction loose.
In a similar fashion, blindly following DRY for even the smallest bit of code is bound to make our code harder to follow or read. Merging two sides of a regular expression that was optimized for readability (a rare sight in the world of regular expressions) will almost certainly make it harder to read and correctly infer its purpose. Is following DRY truly worthwhile in cases like this?
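As an illustration, consider a hypothetical pattern that accepts dates written with either dashes or slashes. Both versions below accept the same inputs; the pattern and variable names are invented for this example.

```javascript
// Readable: each accepted format is spelled out in full, at the cost
// of repeating the digit groups on both sides of the alternation.
const readable = /^(\d{4}-\d{2}-\d{2}|\d{4}\/\d{2}\/\d{2})$/;

// DRY: the two sides are merged. The separator is captured once and
// back-referenced with \1 so that mixed separators are still rejected.
const merged = /^\d{4}([-\/])\d{2}\1\d{2}$/;

readable.test('2018-09-01'); // true
merged.test('2018/09/01');   // true
merged.test('2018-09/01');   // false
```

The merged pattern is shorter, but a reader now has to know about capture groups and back-references to work out what it accepts, whereas the repetitive version can be read at a glance.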
The whole point of DRY is to write concise code, improving readability in turn. When the more concise piece of code results in a program that’s harder to read than what we had, DRY was probably a bad idea, a solution to a problem we didn’t yet have (not in this particular piece of code, not yet anyway). To stay sane, it’s necessary to take software development advice with a grain of salt, as we’ll discuss in Section 3.3.4: Applying Context.
Most often, DRY is the correct approach. But in some cases, DRY might not be appropriate, such as when it yields trivial gains at the expense of readability or when it hinders our ability to find better abstractions. We can always come back to our piece of code and sculpt pieces away, making it more DRY. This is typically easier than trying to decouple bits of code we’ve mistakenly made DRY, which is why sometimes it’s best to wait before we commit to this principle.
We’ve discussed interface design at great length, but we haven’t touched on deciding when to split a module into smaller pieces. In modern application architectures, having certain modules may be required by conventional practices. For instance, a web application made up of multiple views may require that each view is its own component. This limitation shouldn’t, however, stop us from breaking the internal implementation of the view into several smaller components. These smaller components might be reused in other views or components, tested on their own, and better isolated than they might have otherwise been if they were tightly coupled to their parent view.
Even when the smaller component isn’t being reused anywhere else, and perhaps not even tested on its own, moving it to a different file is still worthwhile. Why? Because we’re removing the complexity that makes up the child component from its parent virtually for free. We’re paying only a cheap indirect cost, as the child component is now referenced as a dependency of its parent instead of being inlined. When we split the internals of a large component into several children, we’re chopping up its internal complexity and ending up with several simple components. The complexity didn’t dissipate; it’s subtly hidden away in the interrelationships between these child components and their parent. But that’s now the biggest concern in the parent module, whereas each of the smaller modules doesn’t need to know much about these relationships.
Chopping up internals doesn’t work only for view components and their children. That said, view components are a great example that might help us visualize the way complexity can remain flat across a component system, regardless of how deep we go, instead of being contained in a large component with little structure and a high level of complexity or coupling. This is akin to looking at the universe on a macroscopic level and then taking a closer look, until we get to the atomic level, and then beyond. Each layer has its own complexities and intricacies waiting to be discovered, but the complexity is spread across the layers rather than clustered on any one particular layer. The spread reduces the amount of complexity we have to observe and deal with on any given layer.
Speaking of layers, it is at this stage of the design process that you might want to consider defining different layers for your application. You might be accustomed to having models, views, and controllers in MVC applications, or actions, reducers, and selectors in Redux applications. Maybe you should think of implementing a service layer where all the business logic occurs, or perhaps a persistence layer where all the caching and persistent storage takes place.
When we’re not dealing with modules that we ought to shape in a certain way (like views), but modules that can be composed any which way we choose (like services), we should consider whether new features belong in an existing module or in an entirely new module. When we have a module that wraps a Markdown parsing library, adding functionality such as support for emoji expansions, and want an API that can take the resulting HTML and strip out certain tags and attributes, should we add that functionality to the Markdown module or put it in a separate module?
On the one hand, having it in the Markdown module would save us the trouble of importing both modules when we want the sanitization functionality. On the other hand, in quite a few cases we might have HTML that didn’t come from Markdown parsing but that we still want to sanitize. A solution that’s often effective in these cases is putting the HTML sanitization functionality into its own module, but consuming it in the Markdown module for convenience. This way, consumers of the Markdown module always get sanitized output, and those who want to sanitize a piece of HTML directly can do so as well. We could always make sanitization opt-in (or better yet, opt-out) for the Markdown module, if the feature isn’t always what’s needed by consumers of that interface.
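A minimal sketch of that arrangement follows. Everything in it is hypothetical: the sanitize function is a deliberately naive stand-in and not a secure sanitizer, and the markdown function merely imitates a wrapper around a real parsing library. In a real codebase each function would live in its own file, with the Markdown module importing the sanitizer.

```javascript
// sanitize.js (hypothetical). A naive stand-in that only strips
// <script> elements. Do NOT use this as an actual sanitizer.
function sanitize(html) {
  return html.replace(/<script[\s\S]*?<\/script>/gi, '');
}

// markdown.js (hypothetical). Imitates a wrapper around a Markdown
// parser that adds emoji expansion on top.
const emoji = { ':smile:': '😄' };

function markdown(input, { sanitized = true } = {}) {
  const html = input
    .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
    .replace(/:\w+:/g, name => emoji[name] || name);
  // Sanitization is on by default; consumers may opt out.
  return sanitized ? sanitize(html) : html;
}

markdown('**hi** :smile:');                       // sanitized by default
markdown('**hi** :smile:', { sanitized: false }); // opt out
sanitize('<p>ok</p><script>bad()</script>');      // direct use on any HTML
```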
It can be tempting to create a utilities.js module where we deposit all of our functionality that doesn’t belong anywhere else. When we move on to a new project, we tend to want some of this functionality once again, so we might copy the relevant parts over to the new project. Here we’d be breaking the DRY principle, because instead of reusing the same bits of code, we’re creating a new module that’s a duplicate of what we had. Worse yet, over time we’ll eventually modify the utilities.js component in one of the projects, and at that point the two copies would no longer contain the same functionality.
The low-hanging fruit here would be to create a lib directory instead of a single utilities.js module, and to place each independent piece of functionality into its own module. Naturally, some of these pieces of functionality will depend on other utility functions, but we’ll be better off importing those bits from another module than keeping everything in the same file. Each small file clearly indicates utility as well as the other bits it relies on, and can be tested and documented individually. More importantly, when the utility grows in scope, file size, and complexity, it will remain manageable because we’ve isolated it early. In contrast, if we kept everything in the same file but then one of the utilities grew considerably, we’d have to pull the functionality into a different module. At that point, our code might be coupled with other utilities in subtle ways that might make the migration to a multimodule architecture a bit harder than it should be.
Were we to truly embrace a modular architecture, we might go the extra mile after promoting each utility to its own module. We could start by identifying utility modules we’d like to reuse—for example, a function used to generate slugs such as this-is-a-slug based on an arbitrary string that might have spaces, accents, punctuation, and symbols, besides alphanumeric characters. Then we could move the module to its own directory, along with documentation and tests, register any dependencies in package.json, and publish it to an npm registry. In doing so, we’d be honoring DRY across projects. When we update the slugging package while working on our latest project, older projects would also benefit from new functionality and bug fixes.
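For reference, a slugging utility of the kind described might look like the following sketch. The slugify name and its exact rules are assumptions made for illustration; a published package would likely handle more edge cases, such as characters that don't decompose into an ASCII base letter.

```javascript
// Hypothetical slugging utility.
function slugify(text) {
  return text
    .normalize('NFD')                // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')     // turn each run of other characters into one dash
    .replace(/^-+|-+$/g, '');        // trim leading and trailing dashes
}

slugify('This is a Slug!'); // 'this-is-a-slug'
```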
This approach can be taken as far as we consider necessary: as long as we’d benefit from a piece of functionality being reusable across our projects, we can make it so, adding tests and documentation along the way. Note that hypermodularity offers diminishing returns; the more we take modularity to the extreme, the more time we’ll have to spend on documentation and testing. If we intend to release each line of code we develop as its own well-documented and well-tested package, we’ll be spending quite some time on tasks not directly related to developing features or fixing bugs. As always, use your own judgment to decide how far to take modular structures.
When a piece of code is not complex and rather small, it’s usually not worth creating a module for. That code might be better kept in a function on the module where it’s consumed, or inlined every time. Such short pieces of code tend to change and branch out, often necessitating slightly different implementations in different portions of our codebase. Because the amount of code is so small, it’s hardly worth our time to figure out a way to generalize the snippet of code for all or even most use cases. Chances are we’d end up with something more complex than if we just inlined the functionality to begin with.
When a piece of code is complex enough to warrant its own module, that doesn’t immediately make creating a package for it worthwhile. External modules often involve a little bit more maintenance work, in exchange for being reusable across codebases and offering a cleaner interface that’s properly documented. Take into consideration the amount of time you’ll have to spend extricating the module and writing documentation, and whether that’s worth the effort. Extricating the module will be challenging if it has dependencies on other parts of the codebase it belongs to, since those would have to be extricated as well. Writing documentation is typically not something we do for every module of a codebase. However, we have to document modules when they’re their own package, since we can’t expect other potential consumers to effectively decide whether they’ll be using a package without having read exactly what it does or how to use it.
When we’re designing the internals of a module, keeping our priorities in order is key: the goal is to do what consumers of this module need. That goal has several aspects to it, so let’s visit them in order of importance.
First off, we need to design the right interface. A complicated interface will frustrate and drive off consumers, making our module irrelevant or, at best, a pain to work with. Having an elegant or fast implementation will be of little help if our reluctant consumers have trouble leveraging the interface in front of them. A programming interface is so much more than beautiful packaging making up for a mediocre present. For consumers, the interface should be all there is. Having a simple, concise, and intuitive interface will, in turn, drive down complexity in code written by consumers. Thus, the number one step toward our goal is to find the best possible interface that caters to the needs and wants of its consumers.
Second, we need to develop something that works precisely as advertised and documented. An elegant and fast implementation that doesn’t do what it’s supposed to is no good to our consumers. Promising the right interface is great, but it needs to be backed up by an implementation that can deliver on the promises we make through the interface. Only then can consumers trust the code we write.
Third, the implementation should be as simple as possible. The simpler our code, the easier it will be for us to introduce changes to it without having to rewrite the existing implementation. Note that simple doesn’t necessarily mean terse. For example, a simple implementation might indulge in long but descriptive variable names and a few comments explaining why code is written the way it is. Besides making changes easier, simple code is easier to follow when debugging errors, when new developers interact with the piece of software, or when the original implementors return to it after a long time away. Implementation simplicity comes third, after a proper interface that works as expected.
Fourth, the internals should be as performant as possible. Granted, some measure of performance is codified in producing something that works well, because something that’s too slow to be considered reliable would be unacceptable to consumers. Beyond that, performance falls to fourth place in our list of desirable traits. Performance is a feature, to be treated as such, and we should favor simplicity and readability over speed. There are some exceptions, where performance is of the utmost importance, even at the cost of producing suboptimal interfaces and code that’s not all that easy to read. But in these cases, we should at least strive to heavily comment the relevant pieces of code so that it’s abundantly clear why the code had to be written the way it was.
Flexibility, other than that afforded by writing simple code and providing an appropriate interface, has no place in satisfying the needs of our consumers. Trying to anticipate needs is more often than not going to result in more complexity, code, and time spent, with hardly anything to show for it in terms of improving the consumer’s experience.
Much like modern web development, module design is never truly done. In this section, we’ll visit a few topics that’ll get you thinking about the long half-life of components, and how to design and build our components so that they don’t cause us much trouble after we’ve finished actively developing them.
While working on software development, we’ll invariably need to spend time analyzing the root cause of subtle bugs that seem impossible to hunt down. Only after spending invaluable time will we figure out that the bug was caused by a small difference in program state that we had taken for granted. That small difference snowballed through our application’s logic flow and into the serious issue we just had to hunt down.
We can’t prevent this from happening over and over—not entirely. Unexpected bugs will always find their way to the surface. Maybe we don’t control a piece of software that interacts with our own code in an unexpected way, which works well until it doesn’t anymore because of a problem in the data. Maybe the problem is merely a validation function that isn’t working the way it’s supposed to, allowing data to flow through the system in a shape that it shouldn’t; but by the time it causes an error, we’ll have to spend quite some time figuring out that, indeed, the culprit is a bug in our validation function, triggered by a malformed kind of input that was undertested. Since the bug is completely unrelated to the error’s stack trace information, we might spend a few hours hunting down and identifying the issue.
What we can do is mitigate the risk of bugs by writing more predictable code or improving test coverage. We can also become more proficient at debugging.
In the predictable code arena, we must be sure to handle every expected error. When it comes to error handling, we typically will bubble the error up the stack and handle it at the top, by logging it to an analytics tracker, to standard output, or to a database. When using a function call that we know might throw (for example, JSON.parse on user input) we should wrap it with try/catch and handle the error, again bubbling it up to the consumer if our inability to proceed with the function logic is final. If we’re dealing with conventional callbacks that have an error argument, let’s handle the error in a guard clause. Whenever we have a promise chain, make sure to add a .catch reaction to the end of the chain that handles any errors occurring in the chain. In the case of async functions, we could use try/catch or, alternatively, we can add a .catch reaction to the result of invoking the async function. While leveraging streams or other conventional event-based interfaces, make sure to bind an error event handler. Proper error handling should all but eliminate the chance of expected errors crippling our software. Simple code is predictable. Thus, following the suggestions in Chapter 4 will aid us in reducing the odds of encountering unexpected errors as well.
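As a minimal sketch of the patterns just described, the following snippet wraps JSON.parse in try/catch, uses a guard clause in a conventional callback, ends a promise chain with a .catch reaction, and uses try/catch in an async function. The names parseSettings, readSettings, loadSettings, and loadSettingsAsync, along with the loadText, fetchText, and report parameters, are hypothetical and exist only for illustration. Falling back to an empty object is one possible policy, not a recommendation.

```javascript
// Hypothetical helper: JSON.parse may throw on malformed user input.
function parseSettings(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    // We can't proceed, so bubble a descriptive error up to the consumer.
    throw new Error(`Invalid settings JSON: ${err.message}`);
  }
}

// Conventional callback: handle the error argument in a guard clause.
function readSettings(loadText, done) {
  loadText((err, text) => {
    if (err) {
      done(err);
      return;
    }
    let settings;
    try {
      settings = parseSettings(text);
    } catch (parseErr) {
      done(parseErr);
      return;
    }
    done(null, settings);
  });
}

// Promise chain: a trailing .catch handles errors from any earlier step.
function loadSettings(fetchText, report) {
  return fetchText()
    .then(parseSettings)
    .catch(err => {
      report(err);
      return {}; // assumed fallback: empty settings
    });
}

// Async function: try/catch plays the same role as the .catch reaction.
async function loadSettingsAsync(fetchText, report) {
  try {
    return parseSettings(await fetchText());
  } catch (err) {
    report(err);
    return {};
  }
}
```

In each case the error is either handled where it occurs or passed along to whoever is in a position to handle it, so a malformed input never surfaces later as an unrelated failure.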
Test coverage can help detect unexpected errors. If we have simple and predictable code, it’s harder for unexpected errors to seep through the seams. Tests can further bridge the gap by enlarging the corpus of expected errors. When we add tests, preventable errors are codified by test cases and fixtures. When tests are comprehensive enough, we might run into unexpected errors in testing and fix them. Since we’ve already codified those errors in a test case, they can’t happen again (a test regression) without our test suite failing.
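To make the idea of codifying a fixed bug concrete, here is a small sketch. It assumes a hypothetical validator, isValidQuantity, whose earlier version (shown as legacyIsValidQuantity) let fractional quantities through. The case table and the failedCases helper are likewise illustrative rather than part of any particular test framework.

```javascript
// Hypothetical earlier implementation: accepts any positive number,
// including fractional quantities such as 2.5.
function legacyIsValidQuantity(value) {
  return typeof value === 'number' && value > 0;
}

// Fixed implementation: Number.isInteger also rejects NaN, Infinity,
// and non-number values.
function isValidQuantity(value) {
  return Number.isInteger(value) && value > 0;
}

// Each case documents an expectation; the third one codifies the bug.
const quantityCases = [
  { input: 3, expected: true, why: 'ordinary quantity' },
  { input: 0, expected: false, why: 'zero items is not an order' },
  { input: 2.5, expected: false, why: 'regression: fractions once slipped through' },
  { input: '3', expected: false, why: 'strings must be converted by the caller' }
];

// Returns the cases a validator gets wrong; an empty array means success.
function failedCases(validate, cases) {
  return cases.filter(({ input, expected }) => validate(input) !== expected);
}
```

A test runner would then assert that failedCases(isValidQuantity, quantityCases) is empty. If someone later reverts to the legacy check, the fractional case fails and the regression is caught before it ships.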
Regardless of how determined we are to develop simple, predictable, and thoroughly tested programs, we’re still bound to run into bugs we hadn’t expected. Tests exist mostly to prevent regressions, preventing us from running once again into bugs we’ve already fixed; and to prevent expected mistakes, errors we think might arise if we were to tweak our code in incorrect ways. Tests can do little to prognosticate and prevent software bugs from happening, however.
This brings us to the inevitability of debugging. Using step-through debugging, inspecting application state as we step through the code leading to a bug, is a useful tool, but it will not help us debug our code any faster than we can diagnose exactly what is going on.
To become truly effective debuggers, we must understand how the software we depend on works internally. If we don’t understand the internals of something, we’re effectively dealing with a black box in which anything can happen from our perspective. This adventure is left as an exercise for you, who are better equipped to determine how to obtain a higher understanding of the way your dependencies truly work. Reading the documentation might suffice, but this is rarely the case. Perhaps you should opt to download the source code from GitHub and give it a read. Maybe you’re more of a hands-on kind of person and prefer to try your hand at making your own knock-off of a library you depend on, in order to understand how it works. Regardless of the path you take, the next time you run into an unexpected error related to a dependency that you’re more intimately familiar with, you’ll have an easier time identifying the root cause, since you’ll be aware of the limitations and common pitfalls of what was previously mostly a black box to you. Documentation can take us only so far in understanding how something works under the hood, which is what’s required when tracking down unexpected errors.
It is true: in the hard times of tracking down and fixing an unexpected error, documentation often plays a diminished role. Documentation is, however, often fundamental when trying to understand how a piece of code works, and this shouldn’t be underestimated. Public interface documentation complements readable code. It is useful not only as a guide for consumers to draw from for usage examples and advanced configuration options that may aid them when coming up with their own designs, but also for implementors as a reference of exactly what consumers are promised and, hence, ultimately expect.
In this section, we’re talking about documentation in its broadest possible sense. We’ve discussed public interface documentation, but tests and code comments are also documentation in their own way. Even variable or function names should be considered a kind of documentation. Tests act as programmatic documentation for the kinds of inputs and outputs we expect from our public interfaces. In the case of integration tests, they describe the minimum acceptable behavior of our application, such as allowing users to log in providing an email and a password. Code comments serve as documentation for implementors to understand why code looks the way it does, indicate areas of improvement, and often refer the reader to links offering further details on a bug fix that might not look all that elegant at first sight. Descriptive variable names can, cumulatively, save the reader considerable time when explicit names like products are preferred over vague and ambiguous names like data. The same applies to function names: we should prefer names like aggregateSessionsPerDay over something shorter but unclear such as getStats.
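The contrast between the names mentioned above can be sketched as follows. The function bodies are assumptions made for illustration: the text only suggests the names, and this sketch further assumes each session carries a startedAt property holding an ISO 8601 timestamp in UTC.

```javascript
// Vague: the reader can't tell what data holds or which stats come back.
function getStats(data) {
  const result = {};
  for (const item of data) {
    const key = item.startedAt.slice(0, 10);
    result[key] = (result[key] || 0) + 1;
  }
  return result;
}

// Descriptive: the same logic, but the names document inputs and output.
// Assumes session.startedAt is an ISO 8601 UTC string, so its first ten
// characters form the calendar day (YYYY-MM-DD).
function aggregateSessionsPerDay(sessions) {
  const sessionCountByDay = {};
  for (const session of sessions) {
    const day = session.startedAt.slice(0, 10);
    sessionCountByDay[day] = (sessionCountByDay[day] || 0) + 1;
  }
  return sessionCountByDay;
}
```

Both functions behave identically, but only the second can be understood at a call site without opening its implementation.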
Getting into the habit of treating every bit of code and the structure around it (formal documentation, tests, comments) as documentation itself is only logical. Those who will be reading our code in the future—developers looking to further their understanding of how the code works, and implementors doing the same in order to extend or repair a portion of functionality—rely on our ability to convey a concise message about the way the interface and its internals work.
Why would we not, then, strive to take advantage of every variable, property, and function name; every component name; every test case; and every bit of formal documentation to explain precisely what our programs do, how they do it, and why we opted for certain trade-offs?
In this sense, we should consider documentation to be the art of taking every possible opportunity to clearly and deliberately express the intent and reasoning of all the aspects of our modules.
I don’t mean to say we should flood consumers and implementors alike until they drown in a tumultuous stream of never-ending documentation. On the contrary, only by being deliberate in our messaging can we strike the right balance and describe the public interface in formal documentation, describe notable usage examples in our test cases, and explain abnormalities in comments.
Following a holistic approach to documentation, through which we’re aware of who might be reading what, and what should be directed to whom, should result in easy-to-follow prose that’s not ambiguous as to usage or best practices, nor fragmented, nor repetitive. Interface documentation should be limited to how the interface works and is rarely the place to discuss design choices, which can be relegated to architecture or design documentation, and later linked in relevant places. Code comments are great for explaining why, or linking to a bug fixed in their vicinity, but they aren’t usually the best place to discuss why an interface looks the way it does. This is better left to architecture documentation or our issue tracker of choice. Dead code should definitely not be kept around in comment blocks, as it does nothing but confuse the reader, and is better kept in feature branches or Git stashes, but off the trunk of source control.
Tom Preston-Werner wrote about the notion of README-driven development as a way of designing an interface by first describing it in terms of how it would be used. This is generally more effective than test-driven development (TDD), where we’ll often find ourselves rewriting the same bits of code over and over before we realize we wanted to produce a different API to begin with. The way README-driven design is supposed to work is self-descriptive; we begin by creating a README file and writing our interface’s documentation. We can start with the most common use cases, inputs, and desired outputs, as described in Section 2.1.2: API First, and grow our interface from there. Doing this in a README file instead of a module leaves us a little more detached from an eventual implementation, but the essence is the same. The largest difference is that, much like TDD, we’d be committing to writing a README file over and over before we settle for a desirable API. Regardless, both API-first and README-driven design offer significant advantages over diving straight into an implementation.
A popular description of CSS as an “append-only language” implies that after a piece of CSS code has been added, it can’t be removed; doing so could inadvertently break our designs because of the way the cascade works. JavaScript doesn’t make it quite that hard to remove code, but it’s indeed a highly dynamic language, and removing code with the certainty that nothing will break remains a bit of a challenge as well.
Naturally, modifying a module’s internal implementation is easier than changing its public API, as the effects of doing so would be limited to the module’s internals. Internal changes that don’t affect the API are typically not observable from the outside. The exception to that rule occurs when consumers monkey-patch our interface, sometimes becoming able to observe some of our internals.2 In this case, however, the consumer should be aware of the brittleness of monkey-patching a module they do not control, and that they do so assuming the risk of breakage.
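A small sketch may help show how a monkey-patch can end up observing internals. The createCache factory and its get, set, and getOrCompute methods are hypothetical, invented here purely for illustration.

```javascript
// Hypothetical module with a public interface of get, set, and getOrCompute.
function createCache() {
  const store = new Map();
  const cache = {
    get(key) {
      return store.get(key);
    },
    set(key, value) {
      store.set(key, value);
    },
    getOrCompute(key, compute) {
      if (!store.has(key)) {
        // Internal detail: this happens to go through the public set method.
        cache.set(key, compute());
      }
      return store.get(key);
    }
  };
  return cache;
}

// A consumer monkey-patches set from the outside to record written keys.
const cache = createCache();
const observedKeys = [];
const originalSet = cache.set;
cache.set = function patchedSet(key, value) {
  observedKeys.push(key);
  return originalSet.call(this, key, value);
};

// The patch now also observes writes made internally by getOrCompute.
cache.getOrCompute('answer', () => 42);
```

After the last call, observedKeys contains 'answer' even though the consumer never called set directly. If the implementor later changes getOrCompute to write to store directly, which is a purely internal change, the patch silently stops seeing those writes. That is the kind of breakage the consumer accepts when patching a module they do not control.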
In Section 3.1.2: Design for Today we observed that the best code is no code at all, and this has implications when it comes to removing code as well. Code we never write is code we don’t need to worry about deleting. The less code there is, the less code we need to maintain, the fewer potential bugs we have yet to uncover, and the less code we need to read, test, and deliver over mobile networks to speed-hungry humans.
As portions of our programs become stale and unused, it is best to remove them entirely instead of postponing their inevitable fate. Any code we desire to keep around for reference or the possibility of reinstating it in the future can be safely preserved by source control software without the necessity of keeping it around in our codebase. Avoiding commented-out code and removing unused code as soon as possible will keep our codebase cleaner and easier to follow. When there’s dead code, a developer might be uncertain as to whether it is actually in use somewhere else, and reluctant to remove it. As time passes, the theory of broken windows comes into full effect, and we’ll soon have a codebase riddled with unused code whose purpose nobody knows, and no clear idea of how the codebase became so unmanageable.
Reusability plays a role in code removal. As more components depend on a module, it becomes less likely that we’ll be able to trivially remove the heavily depended-on piece of code. When a module has no connections to other modules, it can be removed from the codebase, but might still serve a purpose as its own standalone package.
Software development advice is often written in absolute terms, rarely considering context. When you bend a rule to fit your situation, you’re not necessarily disagreeing with the advice; you might just have applied a different context to the same problem. The advisor may have missed that context or might have avoided it because it was inconvenient.
However convincing an eloquent piece of advice or tool might seem, always apply your own critical thinking and context first. What might work for large companies at an incredible scale, under a great load, and with their own unique set of problems, might not be suitable for your personal blogging project. What might seem like a sensible idea for a weekend hack might not be the best use of a mid-size startup’s time.
Whenever you’re analyzing whether a dependency, tool, or piece of advice fits your needs, always start by reading available resources and considering whether the problem being solved is one you indeed need to solve. Avoid falling into the trap of leveraging advice or tools merely because they became popular or are being hailed by a large actor.
Never overcommit to that which you’re not certain fits your needs, but always experiment. It is by keeping an open mind that we can capture new knowledge, improve our understanding of the world, and innovate. This is aided by critical thinking and hindered by rushing to the newest technology without firsthand experimentation. In any case, rules are meant to be bent and broken.
Let’s move to the next chapter, where we’ll decipher the art of writing less-complex functions.
1 In A/B testing, a form of user testing, a small portion of users are presented with a different experience than that used for the general user base. We then track engagement among the two groups, and if the engagement is higher for the users with the new experience, then we might go ahead and present that to our entire user base. It is an effective way of reducing risk when we want to modify our user experience, by testing our assumptions in small experiments before we introduce changes to the majority of our users.
2 Monkey-patching is the intentional modification of the public interface of a component from the outside in order to add, remove, or change its functionality. Monkey-patching can be helpful when we want to change the behavior of a component that we don’t control, such as a library or dependency. Patching is error-prone because we might be affecting other consumers of this API who are unaware of our patches. The API itself or its internals may also change, breaking the assumptions made about them in our patch. Although it’s generally best avoided, sometimes it’s the only choice at hand.