The staging environment is now an obsolete bottleneck for development. This article argues we must kill staging entirely and adopt a production-fidelity validation system, powered by Kubernetes, to gain the speed of the local loop combined with the quality of the outer loop.
Image by Markus Spiske from Unsplash.
Learn how testing in production with the right guardrails can eliminate bottlenecks, reduce costs and help your team ship more reliable code faster.
Staging has always been a necessary evil. New approaches to isolation and on-demand sandboxes have finally made it just plain evil.
For decades, the staging environment has been a fixture of software development. And for just as long, it has been hated by developers everywhere. It’s the proverbial traffic jam that every developer is forced to sit in just to get their work validated.
Yet staging is no longer necessary; its time has come and gone. Newer isolation methods enable developers to test safely in live environments, providing fast, high-fidelity feedback that is impossible to achieve in a staging environment.
It is time to kill your staging environment.
Staging environments make sense, in theory. We must test code in a production-like environment before we ship to production. Anything else would be madness.
But the cure has become its own disease. In any organization with more than a handful of microservices, the staging environment inevitably becomes a wasteland of developer pain and burned cash.
We have accepted this broken workflow for 20 years. We believed it was the only way.
Staging exists because of the assumption that testing must be isolated at the environment level. To test a new version of the payment service, you must deploy it to an environment that also contains a cart service, user service and auth service.
This assumption is obsolete.
The new model is request-level isolation. Instead of cloning an entire environment, you spin up only the service you’re changing. This model is enabled by Kubernetes-native platforms that provide on-demand sandboxes for every request.
Here is how it works:
With this approach, you get high-fidelity testing (real dependencies, real network policies) without the downsides of shared environments (no collisions, no queues, dramatically lower cost).
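To make request-level isolation concrete, here is a minimal sketch of the routing decision, not any particular platform's implementation. It assumes each test request carries a routing key in a header, and that a router sends the request to a sandboxed version of a service only when that key matches a registered sandbox. The header name `x-sandbox-id`, the `Router` class, the sandbox ID and the service addresses are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping

# Hypothetical header used to tag test traffic with a sandbox ID.
ROUTING_HEADER = "x-sandbox-id"


@dataclass
class Router:
    """Picks a destination per request instead of per environment."""

    # Shared baseline services: service name -> address.
    baseline: Dict[str, str]
    # Sandbox overrides: sandbox ID -> {service name -> address}.
    sandboxes: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def register_sandbox(self, sandbox_id: str, service: str, address: str) -> None:
        """Register the one service a sandbox replaces."""
        self.sandboxes.setdefault(sandbox_id, {})[service] = address

    def resolve(self, service: str, headers: Mapping[str, str]) -> str:
        """Return the sandboxed address if this request is tagged for a
        sandbox that overrides `service`; otherwise return the baseline."""
        sandbox_id = headers.get(ROUTING_HEADER)
        overrides = self.sandboxes.get(sandbox_id, {})
        return overrides.get(service, self.baseline[service])


if __name__ == "__main__":
    router = Router(
        baseline={"payment": "payment.prod:8080", "cart": "cart.prod:8080"}
    )
    # Only the changed service is deployed; everything else stays shared.
    router.register_sandbox("pr-123", "payment", "payment-pr-123.sandbox:8080")

    tagged = {ROUTING_HEADER: "pr-123"}
    print(router.resolve("payment", tagged))  # sandboxed payment service
    print(router.resolve("cart", tagged))     # shared baseline cart service
    print(router.resolve("payment", {}))      # untagged traffic is unaffected
```

In this sketch, untagged traffic never reaches a sandbox, and a tagged request falls back to the shared baseline for every service the sandbox does not override. For this to work across a chain of services, each one would also need to forward the routing header on its outbound calls. That is an assumption of this sketch rather than something the code enforces.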
Request-level isolation can be integrated into a traditional local > staging > production deployment flow to improve it, eliminating contention for shared environments and long waits on the CI pipeline. But its real power lies in removing the need for staging altogether by enabling testing in production.
Testing in production sounds dangerous. It isn’t, provided you have the right guardrails.
Testing in production is the logical evolution of the movement to shift testing left. But making such a foundational change to your CI/CD pipeline naturally brings up some critical questions:
Addressing these questions and making the shift to testing in production will inevitably involve some upfront engineering investment. However, the return on that investment is not just the savings from eliminating the direct infrastructure costs of your staging environment. It’s also a better developer experience, faster product delivery and fewer opportunities lost to competitors that can iterate and ship faster than you.
Eliminating staging may sound like a pipe dream, but several prominent cloud native teams like DoorDash and Uber have already made the shift left to testing in production. Driven by their highly complex microservice stacks and a need to get better testing fidelity, they are also realizing huge infrastructure cost savings.
Teams like these, which are deprecating their staging environments in favor of testing in production, represent a broader trend: the rejection of approximation in favor of reality. Staging environments are artifacts of an era when duplicating infrastructure was a harder problem to solve than coordinating humans around shared resources.
That era is ending.
The future isn’t about building better approximations of production or optimizing your CI pipeline. It’s about adopting an entirely new paradigm. The teams taking this step aren’t just moving faster and cutting costs. They’re also shipping more reliable code.
It’s time to kill your staging environment.