Here are some of the weaknesses in our current approach:
- Lack of consistency: Most enterprise-level applications are developed by a team. It is likely that each team member will use a different operating system, or otherwise configure their machine differently from the others. This means that each team member's local environment will differ from everyone else's, and by extension, from the production servers'. Therefore, even if all our tests pass locally, there's no guarantee they will pass in production.
- Lack of independence: When several services depend on a shared library, they must all use the same version of that library. If one service needs a newer version, every other service is forced to upgrade (and be re-tested) along with it.
- Time-consuming and error-prone: Every time we want a new environment (staging/production), or the same environment in multiple locations, we need to manually provision a new VPS instance and repeat the same steps to configure users, set up firewalls, and install the necessary packages. This produces two problems:
- Time-consuming: Manual setup can take anywhere from minutes to hours.
- Error-prone: Humans are prone to errors. Even if we have carried out the same steps hundreds of times, a few mistakes will creep in somewhere.
- Furthermore, this problem scales with the complexity of the application and deployment process. It may be manageable for small applications, but for larger applications composed of dozens of microservices, it quickly becomes chaotic.
- Risky deployment: Because server configuration, software updates, and the building and running of our application can only happen at deployment time, there's more risk of things going wrong when we deploy.
- Difficult to maintain: Managing a server/environment does not stop after the application has been deployed. There will be software updates, and the application itself will be updated. When that happens, you'd have to log in to each server and apply the update by hand, which is, again, time-consuming and error-prone.
- Downtime: Deploying our application on a single server means there's a single point of failure (SPOF). If we need to update our application and restart it, the application will be unavailable for the duration. Therefore, applications deployed this way cannot guarantee high availability or reliability.
- Lack of version control: With our application code, if a bug is introduced and somehow slips through our tests and gets deployed to production, we can simply roll back to the last known-good version. The same principle should apply to our environment. If we change our server configuration or upgrade a dependency and it breaks our application, there's no quick and easy way to revert those changes. The worst case is when we indiscriminately upgrade multiple packages without first noting down their previous versions; then we won't even know how to revert the changes!
- Inefficient distribution of resources: Our API, frontend client, and Jenkins CI are each deployed on their own VPS, running their own operating system and controlling their own isolated pool of resources. First of all, running each service on its own server can get expensive quickly. Right now we only have three components, but a substantial application may have dozens to hundreds of individual services. Furthermore, it's likely that each service is not utilizing the full capacity of its server. It is important to have a buffer for times of higher load, but we should minimize unused/idle resources as much as possible:
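To make the repetition described above concrete, the manual provisioning routine for each new VPS might look something like the following sketch. Every specific value here (the `deploy` user name, the open ports, the package list) is an illustrative assumption, not our actual configuration:

```shell
#!/usr/bin/env bash
# A sketch of the steps repeated, by hand, on every new VPS instance
# (Debian/Ubuntu assumed). All names, ports, and packages are
# illustrative placeholders.

set -euo pipefail

# 1. Create a non-root deploy user
adduser --disabled-password --gecos "" deploy
usermod -aG sudo deploy

# 2. Configure the firewall
ufw allow OpenSSH
ufw allow 80/tcp
ufw allow 443/tcp
ufw --force enable

# 3. Install the packages our application needs
apt-get update
apt-get install -y nginx

# 4. ...plus SSH hardening, TLS certificates, log rotation, monitoring,
#    and deploying the application itself -- all repeated per server.
```

Forgetting a single step, or running the steps in a different order on different machines, leaves servers subtly different from one another, which is exactly the consistency and error-proneness problem outlined above.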
