Chapter 2. Agile Enablers

Much of this book is written to help security catch up in an Agile world. We have worked in organizations that are successfully delivering with Agile methodologies, but we also work with companies that are still getting to grips with Agile and DevOps.

Many of the security practices in this book will work regardless of whether you are doing Agile development, and no matter how effectively your organization has embraced Agile. However, there are some important precursor behaviors and practices that enable teams to get maximum value from Agile development, as well as from the security techniques that we outline in this book.

All these enabling techniques, tools, and patterns are common in high-functioning Agile organizations. In this chapter, we will give an overview of each technique and how it builds on the others to enhance Agile development and delivery. You’ll find more information on these subjects further on in the book.

Build Pipeline

The first, and probably the most important of these enabling techniques from a development perspective, is the concept of a build pipeline. A build pipeline is an automated, reliable, and repeatable way of producing consistent deployable artifacts.

The key feature of a build pipeline is that whenever the source code is changed, it is possible to initiate a build process that reliably and repeatably produces the same result.

Some companies invest in repeatable builds to the point where the same build, run on different machines at different times, will produce exactly the same binary output; many organizations simply instantiate a dedicated build machine or machines that can be used reliably.
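To make this concrete, here is a minimal sketch of how you might check that a build is repeatable: build the same source twice and compare checksums of the artifacts. The make dist command and the artifact name are hypothetical placeholders, not a reference to any particular build tool.

    import hashlib
    import subprocess
    import tempfile
    from pathlib import Path

    def build_artifact(source_dir: Path, output_dir: Path) -> Path:
        # Run the project's build step in a clean output directory.
        # "make dist" is a stand-in for your real build command.
        subprocess.run(["make", "dist", f"DESTDIR={output_dir}"],
                       cwd=source_dir, check=True)
        return output_dir / "app.tar.gz"  # hypothetical artifact name

    def sha256(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    source = Path(".")
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        first = sha256(build_artifact(source, Path(a)))
        second = sha256(build_artifact(source, Path(b)))
        # If the two checksums differ, something nondeterministic (timestamps,
        # unpinned dependencies, machine state) has leaked into the artifact.
        print("repeatable" if first == second else "NOT repeatable")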

Repeatable builds are important because they give the team confidence that all code changes have integrity. We know what it is like to work without build pipelines, where developers create release builds on their own desktops, and mistakes such as forgetting to integrate a coworker’s changes frequently cause regression bugs in the system.

If you want to move faster and deploy more often, you must be absolutely confident that you are building the entire project correctly every time.

The build pipeline also acts as a single consistent location for gateway reviews. In many pre-Agile companies, gateway reviews are conducted by installing the software and manually testing it. Once you have a build pipeline, it becomes much easier to automate those processes, using computers to do the checking for you.

Another benefit of build pipelines is that you can go back in time and check out older versions of the product and build them reliably, meaning that you can test a specific version of the system that might exhibit known issues and check patches against it.

Automating and standardizing your build pipeline reduces the risk and cost of making changes to the system, including security patches and upgrades. This means that you can close your window of exposure to vulnerabilities much faster.

However, just because you can compile and build the system fast and repeatedly doesn’t mean it will work reliably. For that you will need to use automated testing.

Automated Testing

Testing is an important part of most software quality assurance programs. It is also a major source of cost, delay, and waste in many traditional programs.

Test scripts take time to design and write, and more time to run against your systems. Many organizations need days or weeks of testing time, and still more time to fix and retest the bugs found, before they can finally release.

When testing takes weeks of work, it is impossible to release code into test any faster than the tests take to execute. This means that code changes tend to get batched up, making the releases bigger and more complicated, which necessitates even more testing and even longer test times, in a negative spiral.

However, much of the testing done by typical user acceptance testers following checklists or scripts adds little value and can be (and should be) automated.

Automated testing generally follows the test pyramid, where most tests are low level, cheap, fast to execute, simple to automate, and easy to change. This reduces the team’s dependence on end-to-end acceptance tests, which are expensive to set up, slow to run, hard to automate, and even harder to maintain.

As we’ll see in Chapter 11, modern development teams can take advantage of a variety of automated testing tools and techniques: unit testing, test-driven development (TDD), behavior-driven development (BDD), integration testing through service virtualization, and full-on user acceptance testing. Automated test frameworks, many of which are open source, allow your organization to capture directly into code the rules for how the system should behave.

A typical system might execute tens of thousands of automated unit and functional tests in a matter of seconds, and perform more complex integration and acceptance testing in only a few minutes.

Each type of test achieves a different level of confidence:

Unit tests

These tests use white-box testing techniques to ensure that code modules work as the author intended. They require no running system, and generally check that given inputs produce the expected outputs or side effects. Good unit tests will also test boundary conditions and known error conditions (see the sketch after this list).

Functional tests

These tests exercise whole suites of functions. They often don’t require a running system, but they do require some setup, tying together many of the code modules. In a system composed of many subsystems, they check each subsystem on its own, ensuring that it works as expected. These tests try to model real-life scenarios, using known test data and the common actions that system users will perform.

Integration tests

These tests are the first step in standing up an entire system. They check that all the connecting configuration works, and that the subsystems communicate with each other properly. In many organizations, integration testing is performed only on internal services, so external systems are stubbed with fake versions that behave in consistent ways; the sketch after this list shows a simple stub. This makes testing more repeatable.

System testing

These tests stand up a fully integrated system, with external integrations and accounts. They ensure that the whole system runs as expected, and that core functions or features work correctly from end to end.
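To make the first and third of these concrete, here is a minimal sketch using Python’s built-in unittest module. The discount_price function and the payment gateway are hypothetical examples, not from any real system; note how the integration test stubs the external service with a fake that behaves consistently, as described above.

    import unittest
    from unittest import mock

    def discount_price(price: float, percent: float) -> float:
        # Hypothetical module under test.
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        return round(price * (1 - percent / 100), 2)

    class DiscountUnitTests(unittest.TestCase):
        # White-box unit tests: known inputs to expected outputs,
        # plus boundary and error conditions.
        def test_typical_discount(self):
            self.assertEqual(discount_price(100.0, 25), 75.0)

        def test_boundary_conditions(self):
            self.assertEqual(discount_price(100.0, 0), 100.0)
            self.assertEqual(discount_price(100.0, 100), 0.0)

        def test_known_error_condition(self):
            with self.assertRaises(ValueError):
                discount_price(100.0, 150)

    class CheckoutIntegrationTest(unittest.TestCase):
        # The external payment gateway is stubbed with a fake that behaves
        # consistently, so the test never depends on a third-party system.
        def test_checkout_charges_discounted_amount(self):
            gateway = mock.Mock()
            gateway.charge.return_value = {"status": "ok"}
            result = gateway.charge(discount_price(100.0, 25))
            gateway.charge.assert_called_once_with(75.0)
            self.assertEqual(result["status"], "ok")

    if __name__ == "__main__":
        unittest.main()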

Automation gets harder the further down this list you go, but here are the benefits of testing this way:

Speed

Automated tests (especially unit tests) can be executed often without needing user interfaces or slow network calls. They can also be parallelized, so that thousands of tests can be run in mere seconds.

Consistency

Manual testers, even when following checklists, may miss a test or perform tests incorrectly or inconsistently. Automated tests always perform the same actions in the same way each time. This dramatically reduces variability in testing, cutting down the false positives (and more important, the false negatives) that are possible in manual testing.

Repeatability

Automated tests, since they are fast and consistent, can be relied on by developers each time they make changes. Some Agile techniques even prescribe writing a test first, which will fail, and then implementing the function to make the test pass (sketched after this list). This helps prevent regressions, and in the case of test-driven development, helps to define the outward behavior as the primary thing under test.

Auditability

Automated tests have to be coded. This code can be kept in version control along with the system under test, and goes through the same change control mechanisms. This means that you can track a change in system behavior by looking at the history of the tests to see what changed, and what the reason for the change was.
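As a sketch of the test-first rhythm mentioned under Repeatability, here is a hypothetical slugify helper developed test-first: the test is written before the function exists and fails, and the function is then implemented to make it pass.

    import re
    import unittest

    class SlugifyTest(unittest.TestCase):
        # Step 1: write this test first. It fails ("red") because slugify
        # does not exist yet, proving the test actually tests something.
        def test_lowercases_and_hyphenates(self):
            self.assertEqual(slugify("Agile Enablers!"), "agile-enablers")

    # Step 2: write just enough code to make the test pass ("green").
    def slugify(title: str) -> str:
        return "-".join(re.findall(r"[a-z0-9]+", title.lower()))

    # Step 3: refactor freely, rerunning the test after every change.
    if __name__ == "__main__":
        unittest.main()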

These properties together give a high level of confidence that the system does what its implementers intended (although not necessarily what was asked for or what the users wanted, which is why it is so important to get software into production quickly and get real feedback). Furthermore, they give a level of confidence that whatever changes have been made to the code have not had an unforeseen effect on other parts of the system.

Automated testing is not a replacement for other quality assurance practices, but it does massively increase the confidence of the team to move fast and make changes to the system. It also allows any manual testing and reviews to focus on the high-value acceptance criteria.

Furthermore, the tests, if well written and maintained, are valuable documentation for what the system is intended to do.

Naturally, automated testing combines well with a build pipeline to ensure that every build has been fully tested, automatically, as a result of being built. However, to really get the benefits of these two techniques, you’ll want to tie them together to get continuous integration.

Automated Security Testing

It’s common and easy to assume that you can automate all of your security testing using the same techniques and processes.

While you can—and should—automate security testing in your build pipelines (and we’ll explain how to do this in Chapter 12), it’s nowhere near as easy as the testing outlined here.

While there are good tools for security testing that can be run as part of the build, most security tools are hard to use effectively, difficult to automate, and tend to run significantly slower than other testing tools.

We recommend against starting your test automation efforts with security tests, unless you have already had success automating functional tests and know how to use your security tools well.

Continuous Integration

Once we have a build pipeline that ensures all artifacts are created consistently, and an automated testing capability that provides basic quality checks, we can combine those two systems. This is most commonly called continuous integration (CI), but there’s a bit more to this practice than just that.

The key to continuous integration is the word “continuous.” The idea of a CI system is that it constantly monitors the state of the code repository, and if there has been a change, automatically triggers a build of the artifact and then tests it.

In some organizations, the building and testing of an artifact can be done in seconds, while in larger systems or more complex build-test pipelines, it can take several minutes. As times get longer, teams tend to separate out tests and steps and run them in parallel to maintain fast feedback loops.
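The heart of a CI system can be sketched in a few lines. The build and test commands below are hypothetical placeholders for your own tooling, and the branch name is an assumption; real CI servers typically react to push notifications rather than polling:

    import subprocess
    import time

    def latest_commit() -> str:
        # Ask the repository for its current head; assumes a git checkout.
        out = subprocess.run(["git", "rev-parse", "HEAD"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()

    def build_and_test() -> bool:
        # Placeholder build and test commands; substitute your own.
        if subprocess.run(["make", "build"]).returncode != 0:
            return False
        return subprocess.run(["make", "test"]).returncode == 0

    last_seen = None
    while True:
        # "Continuous": watch the repository and react to every change.
        subprocess.run(["git", "fetch", "origin"], check=True)
        subprocess.run(["git", "reset", "--hard", "origin/main"], check=True)
        head = latest_commit()
        if head != last_seen:
            ok = build_and_test()
            # Feedback goes straight back to the developer who committed.
            print(f"{head[:8]}: {'PASSED' if ok else 'FAILED'}")
            last_seen = head
        time.sleep(30)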

If the tests and checks all pass, the output of continuous integration is an artifact that could be deployed to your servers after each code commit by a developer. This gives almost instantaneous feedback to developers that they haven’t made a mistake or broken anybody else’s work.

This also provides the capability for the team to maintain a healthy, ready-to-deploy artifact at all times, meaning that emergency patches or security responses can be applied easily and quickly.

However, when you release the artifact, the environment you release it to needs to be consistent and working, which leads us to infrastructure as code.

Infrastructure as Code

While the application or product can be built and tested on a regular basis, it has been far less common for the system’s infrastructure to go through this process—until now.

Traditionally, the infrastructure of a system is purchased months in advance, and is relatively fixed. However, the advent of cloud computing and programmable configuration management means that it is now possible, and even common, to manage your infrastructure in code repositories.

There are many different ways of doing this, but the common pattern is that you maintain a code repository that defines the desired state of the system. This will include information on operating systems, hostnames, network definitions, firewall rules, installed application sets, and so forth. This code can be executed at any time to put the system into the desired state, and the configuration management system will make the necessary changes to your infrastructure to ensure that this happens.

This means that making a change to a system, whether opening a firewall rule or updating a software version of a piece of infrastructure, will look like a code change. It will be coded, stored in a code repository (which provides change management and tracking), and reliably and repeatably rolled out.
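What that desired state looks like varies by tool, but the shape is broadly similar everywhere. The fragment below is a hypothetical, tool-neutral example expressed as Python data; in practice it would be written in the language of your configuration management tool, such as Puppet, Chef, Ansible, or Terraform.

    # desired_state.py -- lives in version control, so a firewall change
    # is a code change, reviewed and tracked like any other. All names
    # and versions here are hypothetical.
    DESIRED_STATE = {
        "hostname": "web-01.example.internal",
        "os_release": "ubuntu-22.04",
        "packages": ["nginx=1.24.*", "openssl=3.0.*"],  # pinned versions
        "firewall_rules": [
            {"port": 443, "proto": "tcp", "action": "allow"},
            {"port": 22, "proto": "tcp", "action": "allow",
             "source": "10.0.0.0/8"},
        ],
        "services": {"nginx": "running"},
    }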

This code is versioned, reviewed, and tested in the same way that your application code is. This gives the same levels of confidence in infrastructure changes that you have over your application changes.

Most configuration management systems regularly inspect the system and infrastructure, and if they notice any differences, are able to either warn or proactively set the system back to the desired state.
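A minimal sketch of that inspection loop, assuming hypothetical check_actual_state and apply functions that would wrap your real tooling:

    import time
    from desired_state import DESIRED_STATE  # the spec shown earlier

    def check_actual_state() -> dict:
        # Hypothetical probe of the live system; a real agent would inspect
        # installed packages, firewall rules, service status, and so on.
        return dict(DESIRED_STATE)  # placeholder: pretend nothing has drifted

    def apply(key: str, desired) -> None:
        # Hypothetical remediation step that puts one setting back in line.
        print(f"converging {key!r}")

    while True:
        actual = check_actual_state()
        for key, desired in DESIRED_STATE.items():
            if actual.get(key) != desired:
                # Either warn about the drift, or proactively converge.
                print(f"drift detected in {key!r}; restoring desired state")
                apply(key, desired)
        time.sleep(1800)  # e.g., every 30 minutes, as the sidebar below notes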

Using this approach, you can audit your runtime environment by analyzing the code repository rather than having to manually scan and assess your infrastructure. It also gives confidence of repeatability between environments. How often have you known software to work in the development environment but fail in production because somebody had manually made a change in development and forgotten to promote that change through into production?

By sharing much of the infrastructure code between production and development, we can track and maintain the smallest possible gap between the two environments and ensure that this doesn’t happen.

Configuration Management Does Not Replace Security Monitoring!

While configuration management does an excellent job of keeping the operating environment in a consistent and desired state, it is not intended to monitor or alert on changes to the environment that may be associated with the actions of an adversary or an ongoing attack.

Configuration management tools check the actual state of the environment against the desired state on a periodic basis (e.g., every 30 minutes). This leaves a window of exposure for an adversary to operate in, where configurations could be changed, capitalized upon, and reverted, all without the configuration management system noticing the changes.

Security monitoring/alerting and configuration management systems are built to solve different problems, and it’s important to not confuse the two.

It is of course possible, and desirable for high-performing teams, to apply build pipelines, automated testing, and continuous integration to the infrastructure itself, ensuring that you have high confidence that your infrastructure changes will work as intended.

Once you know you have consistent and stable infrastructure to deploy to, you need to ensure that the act of releasing the software is repeatable, which leads to release management.

Release Management

A common issue in projects is that the deployment and release processes for promoting code into production can fill a small book, with long lists of individual steps and checks that must be carried out in a precise order to ensure that the release is smooth.

These runbooks are often the last thing to be updated and so contain errors or omissions; and because they are executed rarely, time spent improving them is not a priority.

To make releases less painful and error prone, Agile teams try to release more often. Procedures that are regularly practiced and executed tend to be well maintained and accurate. They are also obvious candidates to be automated, making deployment and release processes even more consistent, reliable, and efficient.

Releasing small changes more often reduces operational risks as well as security risks. As we’ll explain more in this book, small changes are easier to understand, review, and test, reducing the chance of serious security mistakes getting into production.

These processes should be followed in all environments to ensure that they work reliably, and if automated, can be hooked into the continuous integration system. If this is done, we can move toward a continuous delivery or continuous deployment approach, where a change committed to the code repository can pass through the build pipeline and its automated testing stages and be automatically deployed, possibly even into production.

Understanding Continuous Delivery

Continuous delivery and continuous deployment are subtly different.

Continuous delivery ensures that changes are always ready to be deployed to production by automating and auditing build, test, packaging, and deployment steps so that they are executed consistently for every change.

In continuous deployment, changes automatically run through the same build and test stages, and are automatically and immediately promoted to production if all the steps pass. This is how organizations like Amazon and Netflix achieve high rates of change.
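The difference comes down to a single decision point at the end of the pipeline. In the sketch below, the stage functions are hypothetical placeholders for your own pipeline steps:

    import datetime

    # Placeholders for real pipeline steps; each returns True on success.
    def build_stage(change_id): return True
    def test_stage(change_id): return True
    def package_stage(change_id): return True
    def deploy_to_production(change_id): print(f"deploying {change_id}")
    def await_human_approval(change_id): print(f"{change_id} awaiting sign-off")

    def run_pipeline(change_id: str, auto_promote: bool) -> None:
        # The same standardized, audited stages run for every change.
        for stage in (build_stage, test_stage, package_stage):
            if not stage(change_id):
                raise RuntimeError(f"{change_id} failed at {stage.__name__}")

        # Auditability: record what was judged releasable, and when.
        print(f"{datetime.datetime.now().isoformat()} "
              f"{change_id} ready for production")

        if auto_promote:
            deploy_to_production(change_id)   # continuous deployment
        else:
            await_human_approval(change_id)   # continuous delivery

    run_pipeline("change-1234", auto_promote=False)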

If you want to understand the hows and whys of continuous delivery, and get into the details of how to set up your continuous delivery pipeline properly, you need to read Dave Farley and Jez Humble’s book, Continuous Delivery (Addison-Wesley).

One of us worked on a team that deployed changes more than a hundred times each day, where the time between changing code and seeing it in production was under 30 seconds.

However, this is an extreme example from an experienced team that had been working this way for years. Most teams we come into contact with are content to reach turnaround times of under 30 minutes, deploying 1 to 5 times a day, or even as few as 2 to 3 times a month.

Even if you don’t go all the way to continuously deploying each change to production, by automating the release process you take out human mistakes, and you gain repeatability, consistency, speed, and auditability.

This gives you confidence that deploying a new release of software won’t cause issues in production, because the build is tested, and the release process is tested, and all the steps have been exercised and proven to work.

Furthermore, built-in auditability means you can see exactly who decided to release something and what was contained in that change, meaning that should an error occur, it is much easier to identify and fix.

It’s also much more reliable in an emergency situation. If you urgently need to patch a software security bug, which would you feel more confident about: a patch that had to bypass much of your manual testing and be deployed by someone who hasn’t done that in a number of months, or a patch that has been built and tested the same as all your other software and deployed by the same script that does tens of deploys a day?

Treating a security fix as no different from any other code change is huge in terms of being able to get fixes rapidly applied and deployed, and automation is key to being able to make that mental step forward.

But now that we can release easily and often, we need to ensure that teams don’t interfere with each other’s work. For that, we need visible tracking.

Visible Tracking

Given this automated pipeline or pathway to production, it becomes critical to know what is going to go down that path, and for teams to not interfere with each other’s work.

Despite all of this testing and automation, there are always possible errors, often caused by dependencies in work units. One piece of work might be reliant on another piece of work being done by another team. In these more complex cases, it’s possible that work can be integrated out of order and make its way to production before the supporting work is in place.

Almost every Agile methodology places a high priority on team communication, and the most common mechanism for this is big, visible tracking of work. This might be Post-it notes or index cards on a wall, or a Kanban board, or it might be an electronic story tracker; but whatever it is, there are common requirements:

Visible

Everybody on the team and related teams should be able to see at a glance what is being worked on and what is in the pathway to production.

Up-to-date and complete

For this information to be useful and reliable, it must be complete and current. Everything about the project—the story backlog, bugs, vulnerabilities, work in progress, schedule milestones and velocity, cycle time, risks, and the current status of the build pipeline—should be available in one place and updated in real time.

Simple

This is not a system to track all the detailed requirements for each piece of work. Each item should be a placeholder that represents the piece of work, showing only the essentials: what it is, who owns it, and what state it is in.
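Whether the tracker is cards on a wall or software, each item carries roughly the same minimal fields. A hypothetical sketch of such a placeholder:

    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class State(Enum):
        BACKLOG = "backlog"
        IN_PROGRESS = "in progress"
        IN_REVIEW = "in review"
        DONE = "done"

    @dataclass
    class WorkItem:
        # A placeholder for a piece of work, not its full requirements.
        title: str                        # what it is
        owner: str                        # who owns it right now
        state: State                      # where it sits on the path to production
        blocked_by: Optional[str] = None  # cross-team dependency, if any

    board = [
        WorkItem("Rotate signing keys", "dana", State.IN_PROGRESS),
        WorkItem("Upgrade TLS library", "sam", State.BACKLOG,
                 blocked_by="Rotate signing keys"),
    ]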

Of course, having the ability to see what people are working on is no use if the work itself isn’t valuable, which brings us to centralized feedback.

Centralized Feedback

Finally, if you have an efficient pipeline to production, and are able to automatically test that your changes haven’t broken your product, you need some way to monitor the effectiveness of the changes you make. You need to be able to monitor the system, and in particular how it is working, to understand the effect of your changes.

This isn’t like system monitoring, where you check whether the machines are working. It is instead value chain monitoring: metrics that are important to the team, to the users of the system, and to the business, e.g., checking conversion rate of browsers to buyers, dwell time, or clickthrough rates.

The reason for this is that highly effective Agile teams are constantly changing their product in response to feedback. However, to optimize that cycle time, the organization needs to know what feedback to collect, and more specifically, whether the work actually delivered any value.

Knowing that a team made 10, 100, or 1,000 changes is pointless unless you can tie that work back to meaningful outcomes for the organization.

Indicators vary a lot by context and service, but common examples might include value for money, revenue per transaction, conversion rates, dwell time, or mean time to activation. These values should be monitored and displayed on visible dashboards that enable the team to see historical and current values.
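As a sketch of what this kind of monitoring might look like in code, the following compares a hypothetical conversion rate before and after a release; the event data and the alert threshold are illustrative assumptions:

    def conversion_rate(events: list) -> float:
        # Fraction of browsing sessions that ended in a purchase.
        sessions = sum(1 for e in events if e["type"] == "session_start")
        purchases = sum(1 for e in events if e["type"] == "purchase")
        return purchases / sessions if sessions else 0.0

    # Hypothetical event streams from before and after a release.
    before = [{"type": "session_start"}] * 1000 + [{"type": "purchase"}] * 25
    after = [{"type": "session_start"}] * 1000 + [{"type": "purchase"}] * 31

    baseline, current = conversion_rate(before), conversion_rate(after)
    print(f"conversion: {baseline:.1%} -> {current:.1%}")
    # Shown on the team dashboard, this ties the release to a business
    # outcome: did the change actually move the metric we care about?
    if current < baseline * 0.9:
        print("ALERT: conversion dropped more than 10% after the release")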

Knowing whether your software actually delivers business value, and makes a visible difference to business metrics, helps you to understand that the only good code is deployed code.

The Only Good Code Is Deployed Code

Software engineering and Agile development are not useful in and of themselves. They are only valuable if they help your company achieve its aims, whether that be profit or behavioral change in your users.

A line of code that isn’t in production is not only entirely valueless to the organization, but also a net liability, since it slows down development and adds complexity. Both have a negative effect on the security of the overall system.

Agile practices help us shorten the pathway to production, by recognizing that quick turnaround of code is the best way to get value from the code that we write.

This of course all comes together when you consider security in your Agile process. Any security processes that slow down the path to production, without significant business gains, are a net liability for the organization and encourage value-driven teams to route around them.

Security is critically involved in working out what the Definition of Done is for an Agile team, ensuring that the team has correctly taken security considerations into account. But security is only one voice in the discussion, responsible for making sure that the team is aware of risks, and enabling the business to make informed decisions about these risks.

We hope that the rest of this book will help you understand how and where security can fit into this flow, and give you ideas for doing it well in your organization.

Operating Safely and at Speed

What happens if you can’t follow all of these practices?

There are some environments where regulations prevent releasing changes to production without legal sign-off, which your lawyers won’t agree to do multiple times per day or even every week. Some systems hold highly confidential data which the developers are not expected or perhaps not even allowed to have access to, which puts constraints on the roles that they can play in supporting and running the system. Or you might be working on legacy enterprise systems that cannot be changed to support continuous delivery or continuous deployment.

None of these techniques are fundamentally required to be Agile, and you don’t need to follow all of them to take advantage of the ideas in this book. But if you aren’t following most of these practices to some extent, you need to understand that you will be missing some levels of assurance and safety needed to operate at speed.

You can still move fast without this high level of confidence, but you are taking on unnecessary risks in the short term, such as releasing software with critical bugs or vulnerabilities, and almost certainly building up technical debt and operational risks over the longer term. You will also lose out on some important advantages, such as being able to minimize your time for resolving problems, and closing your window of security exposure by taking the human element out of the loop as much as possible.

It’s also important to understand that you probably can’t implement all these practices at once in a team that is already established in a way of working, and you probably shouldn’t even try. There are many books that will help you to adopt Agile and explain how to deal with the cultural and organizational changes that are required, but we recommend that you work with the team to help it understand these ideas and practices, how they work, and why they are valuable, and implement them iteratively, continuously reviewing and improving as you go forward.

The techniques described in this chapter build on each other to create fast cycle times and fast feedback loops:

  1. By standardizing and automating your build pipeline, you establish a consistent foundation for the other practices.

  2. Test automation ensures that each build is correct.

  3. Continuous integration automatically builds and tests each change to provide immediate feedback to developers as they make changes.

  4. Continuous delivery extends continuous integration to packaging and deployment, which in turn requires that these steps are also standardized and automated.

  5. Infrastructure as code applies the same engineering practices and workflows to making infrastructure configuration changes.

  6. To close the feedback loop, you need metrics and monitoring at all stages, from development to production, and from production back to development.

As you continue to implement and improve these practices, your team will be able to move faster and with increasing confidence. These practices also provide a control framework that you can leverage for standardizing and automating security and compliance, which is what we will explore in the rest of this book.