In conversation with Kostis from Codefresh last month, one thing he said really stood out to me:
“Kubernetes and containers have made the operations side of the equation far simpler and more consistent. However, the developer experience has not really caught up. Developers are still trying to figure out the ideal software development lifecycle. They need to have environments at their disposal to be able to develop fast, test their code fast, and collaborate with other developers that may be working on other microservices that they could depend on.”
There’s no doubt that microservices have drastically improved the operational excellence of large software teams. When was the last time that a service you relied on was down for more than a few hours? The establishment of clear contracts between microservices helps developers work confidently, with simple tests showing whether a change still meets the contract.
But microservices pose challenges for the developer experience as your microservice architecture grows in scale. It is difficult to test how your code works in a complex microservice architecture, and collaborating on larger or more complex features can be made doubly difficult in these environments.
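To make the idea of a contract test concrete, here is a minimal sketch in Python. It assumes a hypothetical user-records service whose consumers expect three fields; the contract, field names, and sample responses are all invented for illustration and do not come from any particular contract-testing tool.

```python
# Hypothetical contract: the fields and types a consumer (say, a profile page)
# expects from a user-records service. All names here are illustrative.
USER_CONTRACT = {
    "id": int,
    "name": str,
    "email": str,
}


def contract_violations(response: dict, contract: dict) -> list:
    """Return human-readable contract violations (an empty list if none)."""
    violations = []
    for field, expected_type in contract.items():
        if field not in response:
            violations.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            violations.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    return violations


# A response that adds a new field still meets the contract...
ok_response = {
    "id": 7,
    "name": "Ada",
    "email": "ada@example.com",
    "birthday": "1990-12-10",
}
# ...while one that renames an expected field breaks it.
broken_response = {"id": 7, "full_name": "Ada", "email": "ada@example.com"}

print(contract_violations(ok_response, USER_CONTRACT))      # []
print(contract_violations(broken_response, USER_CONTRACT))  # ['missing field: name']
```

A check like this tells one team whether its own change keeps the agreed shape. It says nothing about how the change behaves once other services are involved, which is the harder problem.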
Before microservices, developers could have the entire application (front end, middleware, database) on their local workstation, making it easy to develop and test. There were of course limitations to this approach: architecture differences meant that some things ‘worked on my machine’ and didn’t work in production, but the whole system could at least be reproduced on a local workstation.
After microservices, with the splitting of the monolith into potentially hundreds of microservices, it becomes impossible to have all services running on a local workstation. This means you’ll have to devise a different solution to preview how your code will work with dozens of other microservices.
Several strategies exist for creating preview environments. They can be broadly defined as:
💡 I recently heard an anecdote about a large enterprise team: earlier than most teams, they ran into the need for some kind of sandbox for previewing code. However, as these sandboxes were needed frequently, they began to gobble resources. An entire team was therefore dedicated to working as ‘sandbox killers’, detecting unused or unneeded sandboxes so they could be scaled down.
All of these solutions, even with the drawbacks listed, will work at smaller scales, but they will all pose significant challenges to collaboration between engineers.
What are the situations where microservice teams need to collaborate? Shouldn’t each small team just worry about their microservice and fulfilling the contracts between them and the others? Picture a simple feature like collecting user birthdays and adding a ‘happy birthday’ message to users on their profile page. Which microservice team can deliver this feature? No microservice team can deliver a feature like this. From user signup, to user records handling, to the profile display, multiple microservices will be involved. This is true of most features. Simple requests lead to more and more requirements for teams to work together to make sure every aspect works.
The solutions listed in the section above prove problematic for collaboration between teams. The issue, broadly, is isolation between developers and the separation of operational aspects from developer control.
Isolation Between Developers: Using namespaces in Kubernetes provides isolation between different developers, but it also means that collaboration requires more coordination and understanding of the shared environment.
Operational Divisions: The provisioning of namespaces might be handled by a platform or DevOps team, separating the operational aspects from development. While this can be an advantage, it also means that developers must rely on other teams for certain aspects of their environment, potentially slowing down the development process.
Let’s talk through our process of pushing out a ‘user birthday’ feature with the tools we have available:
The concept of shift-left testing for microservices emphasizes the importance of testing early and often in the development lifecycle. By shifting testing to the left, developers can catch issues sooner, reducing the time and effort required to fix them. This approach aligns well with the dynamic and distributed nature of microservices development, where collaboration and communication between team members are essential.
The transition from a monolithic architecture to a microservice architecture introduces new complexities in the development environment. While solutions like Docker Compose and Kubernetes namespaces can mitigate some challenges, they also introduce new collaboration difficulties. The need to manage multiple services, ensure consistency between environments, and coordinate with other teams can make collaboration more challenging in a microservice architecture.
How can two teams collaborate on features with extensive interdependencies? Let’s go over the requirements:
In essence we require something that easily lets us define zones where we can test and experiment with new code, a ‘sandbox’ that can safely interact with the rest of our cluster.
To enable on-demand sandboxes that can connect with the shared cluster when necessary, many tools create a sort of ‘tunnel’ between a local service and the cluster, and this can seem like a solution. It struggles, however, once we think about collaboration. How would this ‘tunnel’ allow a request to hit multiple test services running on disparate local workstations? That architecture does not support it well.
Rather, we need something that isolates the requests of our sandbox and can control routing across the entire cluster. For this, a service mesh such as Istio or Linkerd, which controls communication between services, is a crucial missing piece.
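The routing idea can be sketched as a toy model in Python. This is not Istio’s or Linkerd’s API; the header name `x-sandbox-id`, the service names, and the version labels are all hypothetical. The sketch only shows the principle: a request tagged with a sandbox identifier is sent to the test versions of the services under change, and to the shared baseline everywhere else.

```python
# Toy model of header-based request routing. This is NOT Istio's or Linkerd's
# API; the header name, service names, and versions are all hypothetical.
SANDBOX_HEADER = "x-sandbox-id"

# Baseline (shared, stable) version of every service in the cluster.
BASELINE = {
    "signup": "signup:stable",
    "user-records": "user-records:stable",
    "profile": "profile:stable",
}

# Each sandbox overrides only the services that are under test.
SANDBOXES = {
    "birthday-feature": {
        "signup": "signup:birthday-dev",    # one team's change
        "profile": "profile:birthday-dev",  # another team's change
    },
}


def resolve(service: str, headers: dict) -> str:
    """Pick the workload that should serve `service` for this request."""
    overrides = SANDBOXES.get(headers.get(SANDBOX_HEADER, ""), {})
    return overrides.get(service, BASELINE[service])


def trace(call_chain: list, headers: dict) -> list:
    """Resolve every hop of a request, propagating the same headers."""
    return [resolve(service, headers) for service in call_chain]


chain = ["signup", "user-records", "profile"]

# Ordinary traffic only ever touches the baseline.
print(trace(chain, {}))
# Tagged traffic reaches both teams' test versions in a single request.
print(trace(chain, {SANDBOX_HEADER: "birthday-feature"}))
```

In this model, two teams can test the ‘user birthday’ feature together without copying the whole cluster: only the changed services run as extra workloads, and untagged traffic is unaffected. The sketch assumes every service forwards the sandbox header to the services it calls, which in a real cluster needs its own mechanism.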
This piece is more about defining the problem and the hazards along the way than a fixed solution to issues of collaboration with microservice architectures. The interdependent world of microservices has made local replication difficult, and existing stop-gaps for smaller microservices clusters stop working beyond a certain scale. The result has been a much slower path between first writing code and sending it to production, with added friction as changes conflict with each other. When we imagine features that rely on multiple changes across teams, it’s possible that developer velocity can slow to zero.
However, the same technologies that brought about this impasse also offer solutions: in a modern containerized and orchestrated cluster of microservices, tools like a service mesh can make it possible to collaborate fluidly with multiple test sandboxes.