Monoliths and microservices are just web services. Many teams are moving to microservices, but they are not all moving for the right reasons.
Contrary to what some people will tell you, microservices are not always the right decision. Sometimes, it’s more beneficial to stick with the battle-tested monoliths you already have.
But I am not here to talk you out of moving to microservices. I’ve actually helped lead two large-scale migrations to microservices in the last five years, and in both projects one thing was clear: we would have benefited from a few more upfront conversations about why we were moving. The following are the conversations I wish I had had with my engineering, operations, and leadership teams before we embarked on the move.
Are we clear on the differences between monoliths and microservices?
The monolith
In a monolithic environment, development teams (as part of their normal development habits) collaborate on a single code repository. If new functionality is needed, engineers add the new code to the existing repository even if the functionality is categorically different from the code already in place. When deployed, everything runs in a single environment/process.
As an example, let’s say your team is in the business of selling mobile devices. Your team’s web service already has an entire structure to handle the behavior and storage of user information. Now you want to add the structure needed to support the devices those users can own.
Users and devices have vastly different properties, but in the context of this project, the engineers have intimate knowledge of how the two should interact. When the code for these objects lives in the same codebase and the objects come to depend directly on each other, the result is called ‘tight coupling’. This increases the risk of every update, because a change to the device behavior may affect the user behavior in unexpected ways. Avoiding this tight coupling is one reason a team may decide to move to microservices.
Microservices
In a microservices environment, the development team practices ‘loose coupling’. Using the example above, user and device would each have their own repository and would be deployed to their own environments (on VMs or containers). Separating these web services into different repositories allows them to be maintained and operated independently of each other.
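As a rough illustration of the difference, the sketch below contrasts the two styles in Python. Every name in it (`TightUserService`, `DeviceClient`, `model_name`, and so on) is hypothetical and chosen only for this example. In a real microservices setup, the device client would make a network call rather than read from a dictionary.

```python
from __future__ import annotations

from typing import Protocol


class TightUserService:
    """Tightly coupled: reads the device storage layout directly."""

    def __init__(self, device_rows: dict[str, list[dict]]):
        self.device_rows = device_rows  # raw device records shared with device code

    def device_names(self, user_id: str) -> list[str]:
        # Breaks as soon as the device code renames the "model_name" field.
        return [row["model_name"] for row in self.device_rows.get(user_id, [])]


class DeviceClient(Protocol):
    """The only thing the user code knows about devices (hypothetical contract)."""

    def list_device_names(self, user_id: str) -> list[str]: ...


class InMemoryDeviceClient:
    """Stand-in for a client that would call the device service over the network."""

    def __init__(self, names_by_user: dict[str, list[str]]):
        self._names_by_user = names_by_user

    def list_device_names(self, user_id: str) -> list[str]:
        return list(self._names_by_user.get(user_id, []))


class LooseUserService:
    """Loosely coupled: depends on the contract, not on device internals."""

    def __init__(self, devices: DeviceClient):
        self._devices = devices

    def device_names(self, user_id: str) -> list[str]:
        return self._devices.list_device_names(user_id)
```

The device team can change how it stores devices without touching `LooseUserService`, as long as `list_device_names` keeps its shape.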
Now that we have a very high-level understanding of the differences between monoliths and microservices, we can discuss the many tradeoffs between the two approaches.
Would we benefit from splitting up our development teams?
One possible advantage of moving from monoliths to microservices would be the ability to split up large development teams into multiple teams who each own an individual microservice. This would allow teams to work on features independently from each other.
On one project, we had half of our developers creating the microservices that support the general buy flow while the rest of the developers concentrated on authorization, credit, and fraud detection. While credit checks were certainly part of the buy flow, it was not critical for these teams to contribute to the same codebase.
Understand that this act of splitting does not completely free each team from closely communicating with the other. If these microservices communicate frequently, any changes (particularly to the interface of the microservice) can still cause headaches for your teammates who are in charge of other services.
The larger advantage comes with the idea that (again, using the previous example) you can start to develop experts in the user functionality without those engineers needing to also be experts in the device functionality.
Note: when the microservices are deployed to production, you often need to create an orchestration layer to marshal the communication between services. This is usually handled with an API Gateway or Service Mesh.
Can we take advantage of elasticity?
While splitting up the dev teams may be an advantage during the design of a system, there are also advantages to using microservices from an operational standpoint. When these microservices are running in production, you may be able to benefit from elasticity. Being elastic means you can easily increase or decrease the number of running instances of a microservice as traffic rises or falls.
Elasticity works very well when your microservices are written to take advantage of it and when you are deploying your microservices to the public cloud. However, if you are deploying to your own internal environment, your infrastructure may not be configured to handle elastic services.
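To make the idea concrete, here is a minimal sketch of a proportional scaling rule. The function name, the per-instance traffic target, and the instance limits are all assumptions made for this example; real autoscalers add smoothing, cooldown periods, and other safeguards on top of something like this.

```python
import math


def desired_instances(requests_per_second: float,
                      target_rps_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Return how many identical instances are needed for the current traffic."""
    if target_rps_per_instance <= 0:
        raise ValueError("target_rps_per_instance must be positive")
    # Round up so that no instance is pushed past its target load.
    needed = math.ceil(requests_per_second / target_rps_per_instance)
    # Clamp to the limits the platform (or the budget) allows.
    return max(min_instances, min(max_instances, needed))
```

With a target of 100 requests per second per instance, 450 requests per second calls for 5 instances, and a quiet period falls back to the minimum of 1.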
What does it mean to ‘limit the blast radius’?
Now that we have all of these microservices happily and harmoniously living amongst each other behind the API Gateway, what happens when something goes wrong? The good news is, if implemented correctly, microservices help minimize the impact of failure.
What if your customer wants to buy a phone, but the microservice that is used to suggest an accessory for the phone is failing? Is that important enough to stop them from buying the phone? Probably not.
With microservices, the error will be limited to that microservice while the purchase microservice continues to function normally. In the monolithic world, an error like this would likely cause a global failure in your application. This smaller impact is referred to as ‘limiting the blast radius’.
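The phone purchase scenario can be sketched as follows. This is only an illustration: `suggest_accessories` is a hypothetical stand-in for a call to the accessory service and is hard-coded to fail, and the response shape is invented for the example.

```python
from typing import Callable


def suggest_accessories(phone_id: str) -> list[str]:
    """Hypothetical call to the accessory service; in this sketch it always fails."""
    raise ConnectionError("accessory service unavailable")


def safe_suggestions(phone_id: str,
                     fetch: Callable[[str], list[str]] = suggest_accessories) -> list[str]:
    """Contain the failure: a broken suggestion service yields no suggestions."""
    try:
        return fetch(phone_id)
    except (ConnectionError, TimeoutError):
        return []


def purchase_page(phone_id: str,
                  fetch: Callable[[str], list[str]] = suggest_accessories) -> dict:
    """Build the purchase page; buying stays possible even without suggestions."""
    return {
        "phone_id": phone_id,
        "can_buy": True,
        "accessories": safe_suggestions(phone_id, fetch),
    }
```

The containment does not come for free: the calling service has to be written to expect the failure and degrade gracefully.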
How will this change affect network traffic and log file sizes?
It is very important not to underestimate the impact that a move to microservices will have on your operations teams. Microservices are considered to be ‘chatty’: what used to be a simple in-process function call in a monolith now requires a trip across the network (often over HTTPS). With this increased communication comes a significant increase in both network traffic and logs.
If your operations team is not expecting these changes, they may start receiving frequent new alerts caused by unusual spikes in traffic and growing log sizes. Be a good teammate and involve your operations team early.
Do we have expertise in distributed tracing?
Distributed tracing is a method of monitoring services (particularly microservices) in distributed environments by following a request as it passes from one service to the next. It is less necessary with the classic monolith structure, since all processing between objects happens locally. Because microservices typically do not share the same process, they need a centralized monitoring framework to aggregate logs and track down issues in production.
If you choose to move to microservices, it is critical that your team understands the importance of distributed tracing and builds in this capability from the very beginning.
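The core mechanism is simple enough to sketch: each request gets an ID at the edge, and every service passes that ID along and includes it in whatever it logs. The example below is a toy version under stated assumptions. The `X-Trace-Id` header name, the two service functions, and the in-memory `LOG` list are all hypothetical; a real system would use a tracing library and a standard header format instead.

```python
from __future__ import annotations

import uuid

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name for this sketch

LOG: list[dict] = []  # stand-in for a centralized log store


def log(service: str, trace_id: str, message: str) -> None:
    LOG.append({"service": service, "trace_id": trace_id, "message": message})


def device_service(headers: dict[str, str]) -> str:
    trace_id = headers[TRACE_HEADER]
    log("device", trace_id, "looked up device")
    return "Phone A"


def user_service(headers: dict[str, str]) -> str:
    # Reuse the caller's trace ID, or start a new trace at the edge.
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    log("user", trace_id, "handling request")
    # Pass the same ID downstream so the log entries can be joined later.
    device = device_service({TRACE_HEADER: trace_id})
    log("user", trace_id, f"responding with {device}")
    return trace_id


def trace(trace_id: str) -> list[str]:
    """Reassemble one request's path across services from the shared log."""
    return [f'{e["service"]}: {e["message"]}' for e in LOG if e["trace_id"] == trace_id]
```

Calling `trace` with the ID returned by `user_service({})` yields the entries from both services in the order they happened, which is exactly the view that is missing when each service only has its own log file.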
How do we approach a large structural change like this?
By now, it is probably clear to you that the decision to move to microservices should not be taken lightly. After weighing all of the pros and cons, if you do decide to move forward with the transformation, where should you begin?
I would strongly discourage a ‘big bang’ transformation where you pick one date to deploy the entire new structure at once. Instead, the most common approach is the strangler pattern, which migrates a system by incrementally replacing pieces of the old functionality with new services.
Keep your monolith in place while you create a microservice for one piece of the functionality. Then use something like NGINX to route traffic for just that microservice and ensure that it works as expected. Repeat this until you have completely replaced all of the monolith’s functionality with microservices. Then remove the monolith.
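The routing decision at the heart of this approach can be sketched in a few lines. In practice a reverse proxy would make this decision, not application code, and the hostnames and path prefixes below are hypothetical placeholders.

```python
from __future__ import annotations

MONOLITH = "http://monolith.internal"  # hypothetical address of the existing monolith

# Path prefixes already carved out of the monolith (hypothetical).
MIGRATED_PREFIXES = {"/devices": "http://device-service.internal"}


def route(path: str, migrated: dict[str, str] | None = None) -> str:
    """Return the backend that should serve this request path."""
    table = MIGRATED_PREFIXES if migrated is None else migrated
    # Check longer prefixes first so the most specific match wins.
    for prefix in sorted(table, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return table[prefix]
    # Anything not yet migrated still goes to the monolith.
    return MONOLITH
```

Each migration step adds one entry to the table. When the fallback to `MONOLITH` no longer receives traffic, the monolith can be retired.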
How will we handle version control?
With the classic monolith approach, we were lucky that our release version almost always matched the current version of our code. So, if the code we had running in production was Version 1.5.7, then we could say that our service, in general, was also Version 1.5.7.
This is very different in a microservices environment. Each microservice has its own version that increments independently from others. So, the actual current version of the service is simply a label that describes all of the microservice versions that make up the service at that point in time. This makes it very difficult to go back in time to a previous version. Your team should become very familiar with the nuances of orchestrating all of these services.
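One way to keep this manageable is to record, for every release label, exactly which microservice versions it contains. The sketch below assumes such a record exists; the release labels, service names, and version numbers are invented for illustration.

```python
from __future__ import annotations

# Hypothetical release manifest: release label -> version of each microservice.
RELEASES = {
    "2024.1": {"user": "1.5.7", "device": "2.0.1", "purchase": "0.9.4"},
    "2024.2": {"user": "1.6.0", "device": "2.0.1", "purchase": "1.0.0"},
}


def rollback_plan(current: str, target: str,
                  releases: dict[str, dict[str, str]] = RELEASES
                  ) -> dict[str, tuple[str | None, str | None]]:
    """List the services whose version differs between two releases.

    Each value is (version now, version to return to); None means the
    service does not exist in that release.
    """
    cur, tgt = releases[current], releases[target]
    names = sorted(set(cur) | set(tgt))
    return {n: (cur.get(n), tgt.get(n)) for n in names if cur.get(n) != tgt.get(n)}
```

Here, going from "2024.2" back to "2024.1" means redeploying the user and purchase services while leaving the device service alone. The sketch says nothing about whether the older versions are still compatible with current data, which is usually the harder part of a rollback.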
Are we prepared to take on a change of this magnitude?
Going from monoliths to microservices is rarely an easy task. There will be a steep learning curve for your engineering teams, and it is not always easy to spot when your team is experiencing burnout. You should not ignore the human impact of this decision. If you are already fighting unreasonable deadlines, or if the benefits to your organization do not outweigh the level of effort, you should consider keeping your existing architecture.
In summary
To help summarize this article, here is a list of things to consider when deciding if you should move to microservices.
- Have a reason to move to microservices (team size, scaling, resiliency, etc.);
- Have a plan for monitoring, telemetry, and distributed tracing;
- Move incrementally using the strangler pattern rather than all at once;
- Prepare for a lot of complexity in relation to versioning;
- Staff appropriately and monitor employee health.
Remember, microservices can bring many benefits to your organization; just be sure not to underestimate the time, effort, and money involved.