When you work in the digital sphere, ideas spread quickly. If something works and really works, engineers and IT specialists across the world will at least be tangentially aware of it soon enough. This was certainly the case with Site Reliability Engineering (SRE), one of the most successful management methods to come out of Google, as well as DevOps, a cultural approach that aims to free pipelines from restrictive siloed thinking.
At this point, SRE and DevOps have been co-existing for nearly a decade. There is no shortage of companies that utilize both frameworks simultaneously, yet there remains a popular discourse that the two are competitors that cover the exact same ground.
Admittedly, there is a great deal of overlap between DevOps and SRE. Both improve the relationship between development and operations teams, and each helps to optimize pipelines in its own way. However, there are also significant differences worth taking into account before deciding whether to use one, both, or neither in a software pipeline.
What is the difference between site reliability engineers and DevOps engineers?
Short answer? Site reliability engineers follow a set of practices and performance metrics designed to optimize the reliability of services. DevOps, on the other hand, is a cultural mindset that points teams toward what to focus on in general terms, without a prescriptive list of what to do. ‘DevOps engineer’ is also a general term that can apply to anyone in a DevOps team regardless of their function, skills, or background.
Indeed, the main difference between the two is how exactly they go about accomplishing their goals. For example, site reliability engineers define fixed targets and measures that establish how a service should be functioning, including Service Level Agreements (SLAs), Service Level Objectives (SLOs), and Service Level Indicators (SLIs).
SRE pipelines are managed in a way that makes it clear that if these standards are not met, new feature development must pause until reliability is restored. At the same time, site reliability engineers continually analyze and repair code to prevent potential flaws from ever becoming issues. Once the desired level of reliability is achieved, the team can then focus on making tangible service improvements or building additional features.
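To make the relationship between an SLI, an SLO, and the "development must stop" rule more concrete, here is a minimal sketch in Python. It assumes an availability-style SLI measured as the share of successful requests. The `ServiceWindow` class, the function names, and the request counts are hypothetical illustrations and do not come from any particular SRE toolkit.

```python
from dataclasses import dataclass


@dataclass
class ServiceWindow:
    """Request counts for one measurement window (hypothetical data shape)."""
    total_requests: int
    good_requests: int


def availability_sli(window: ServiceWindow) -> float:
    """SLI: fraction of requests served successfully in the window."""
    if window.total_requests == 0:
        return 1.0  # no traffic in the window, so nothing failed
    return window.good_requests / window.total_requests


def releases_allowed(window: ServiceWindow, slo_target: float) -> bool:
    """Allow feature releases only while the measured SLI meets the SLO."""
    return availability_sli(window) >= slo_target


if __name__ == "__main__":
    # Illustrative numbers only: one million requests, 800 of them failed.
    window = ServiceWindow(total_requests=1_000_000, good_requests=999_200)
    print(f"SLI: {availability_sli(window):.4%}")
    print("Releases allowed:", releases_allowed(window, slo_target=0.999))
```

In this sketch the SLI is what is measured, the SLO is the target it is compared against, and the release gate is one simple way a team might act on the result. Real policies are usually agreed between teams and can be more nuanced than a single threshold check.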
In short, site reliability engineers play the role of administrators and software engineers, working within a specified management framework that enables them to prioritize the quality of code.
Compared to SRE, DevOps is a lot broader in its approach. It encompasses the entire product lifecycle, including design, development, and operations, with a general focus on automation and continuous improvement.
Most notably, it advocates shared responsibility: making the entire culture responsible for meeting targets, rather than having developers and operations staff responsible for siloed goals. This creates an added incentive for collaboration and sharing ideas on how to improve processes throughout the pipeline.
However, the DevOps engineer role is not as cross-functional as the site reliability engineer role. ‘DevOps engineer’ is very much an umbrella term for anyone working within a DevOps pipeline, while a ‘site reliability engineer’ will often have a distinct set of skills spanning both administration and software engineering. As such, when a DevOps engineer runs into a problem outside their own area, they will pass it on to the most relevant colleague or team. In contrast, a site reliability engineer will usually try to fix the problem on their own.
That said, there are a great many similarities between DevOps and SRE. Both emphasize automation, for example, for the sake of optimizing speed and reliability. Each approach also suggests applying a developer’s mindset to operations tasks, as well as continuous tracking, continuous improvement, and so on.
The most important thing to remember is that in areas where SRE converges with DevOps, it is generally a lot clearer about how to get things done. Elements like the Error Budget, SLA, SLO, and SLI do not exist as standards within DevOps cultures. In fact, very little does, and it isn’t uncommon for a DevOps culture to borrow elements from other approaches, including SRE.
A key example of this is the attitude SRE and DevOps engineers have toward failure. Both approaches generally see failure as inevitable: something for teams to learn from. However, SRE has a far more codified approach to preparing for and accepting failure. Elements like SLIs and SLOs not only help prevent failure but also reduce its impact in terms of cost, missed targets, and so on.
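The error budget is where this codified acceptance of failure shows up as plain arithmetic: whatever the SLO does not require is the amount of unreliability a team may "spend". The sketch below assumes an availability SLO measured over a 30-day window. The function names and the window length are illustrative assumptions, not a standard API.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime the SLO tolerates over the window, in minutes."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    if budget == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # A 99.9% SLO over 30 days tolerates roughly 43.2 minutes of downtime.
    print(f"Budget: {error_budget_minutes(0.999):.1f} minutes")
    # Hypothetical incident total of 21.6 minutes: about half the budget is left.
    print(f"Remaining: {budget_remaining(0.999, 21.6):.0%}")
```

Framing failure as a budget is what lets a team treat an outage as an expected cost, not a surprise: while budget remains, releases can continue, and once it is spent, the focus shifts back to reliability.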
SRE vs. DevOps – Can they work well together?
Despite the differences, it’s safe to say that SRE and DevOps can absolutely work well together. The pillars of each approach are very similar, including the sharing of insight, tools, and technology, as well as bridging the gap between operations and development teams.
In ‘SRE vs. DevOps: Competing Standards or Close Friends’, Google itself states: “DevOps and SRE are not two competing methods for software development and operations, but rather close friends designed to break down organizational barriers to deliver better software faster.”
In practice, they are not so different at all. DevOps cultures can even take forms that mirror SRE environments more closely, depending on the requirements of the organization adopting them. Integrating DevOps and SRE within an organization is quite simple, provided you have experienced engineers capable of establishing a healthy relationship between the two.
Good e-Learning is an award-winning online training provider and a Trusted Education Partner for the DevOps Institute. We work with subject matter experts to deliver courses that not only help students become certified but also give them a clear working knowledge they can begin applying in their daily roles.