Learn to fully optimize your IT culture with this SRE Foundation (SREF) course from Good e-Learning!
Site Reliability Engineers fulfill many of the roles associated with Operations, supporting Development teams and driving productivity and efficiency. This course also demonstrates how site reliability engineers make systems more stable, predictable, and scalable while also tracking essential metrics to enable continuous improvement. Kickstart your SRE training today!
This module provides an introduction to the course, explaining its rationale, introducing the subject matter and providing an overview of the ‘SRE Foundation’ syllabus.
This module also provides students with a toolkit:
Table of contents
The first module of this SRE online course introduces students to the discipline of SRE and compares it with DevOps. It also offers an introduction to the principles and practices of site reliability engineering.
What is Site Reliability Engineering?
SRE & DevOps: What is the Difference?
SRE Principles & Practices
This module looks at service levels, service level objectives (SLOs), error budgets, and error budget policies.
Service Level Objectives (SLOs)
Error Budget Policies
This module looks at the ‘Toil’ concept, why it is a problem, and how it can be managed.
What is Toil?
Why is Toil Bad?
Doing Something About Toil
This module introduces service level indicators (SLIs), monitoring, and observability.
Service Level Indicators (SLIs)
This module looks at automation, defining it in terms of DevOps and SRE. It also introduces different types of automation, as well as a number of automation tools.
Hierarchy of Automation Types
This module examines the principle of learning from failure, and how it can be used for anti-fragility and chaos engineering.
Why Learn from Failure
Benefits of Anti-Fragility
Shifting the Organizational Balance
This module introduces how site reliability engineering is managed at an organizational level, as well as how it can be implemented.
Why Organizations Embrace SRE
Patterns for SRE Adoption
Sustainable Incident Response
SRE & Scale
This module looks at how SRE can incorporate frameworks such as ITIL, Agile, and IT4IT. It also examines emerging trends that will define the future of SRE, including ‘customer reliability engineering’.
SRE & Other Frameworks
This SRE course is designed to fully prepare students to sit the official SRE Foundation (SREF) examination. This includes providing official practice exams to help students test themselves and get used to examination conditions.
This course comes with mock exams to help students prepare for the real thing, as well as a FREE exam voucher. (T&Cs apply)
Before booking your exam, it is a good idea to check that your device meets the technical requirements. Please visit the DevOps Institute website for more information and guidance.
When you are ready to use your free exam voucher, simply contact [email protected]. Exam voucher requests are typically processed within 2 working days, but please allow up to 5. Students must request their exam voucher within the course access period, which starts from the date of purchase. For more information, please visit our Support & FAQs page.
Site Reliability Engineering (SRE) is the process of continuously testing the ‘reliability’ of a new product in development. This enables developers to better understand and adapt to the needs of operations teams.
There are several elements to SRE, including:
A ‘Service Level Agreement (SLA)’ is outlined to define how reliable the system has to be for end-users
An ‘Error Budget’ is established to show how much unreliability can be tolerated before feature releases must stop
Site reliability engineers make themselves available to help with development team workloads and vice versa
Site reliability engineers actively find and repair problems during the development stage
Developers take on Operations tasks if necessary
Site reliability engineers create automation wherever possible for the sake of efficiency and reliability
A ‘site reliability engineer’ is an automation/coding specialist whose job is to find and solve problems within Development and Operations.
An SRE team can not only make a DevOps pipeline more reliable, but also far more efficient and scalable. It can also free Development and Operations team members to focus on improving services elsewhere, boosting the quality of releases. Incorporating SRE will also further improve existing DevOps cultures by encouraging greater communication, clarity, and understanding between teams.
Finally, site reliability engineers are specialists in considering and conveying concerns in relation to the wider organization and can extract metrics that can prove extremely valuable for other departments.
DevOps and SRE work extremely well together. This is largely because both are designed with automation, inter-team collaboration, and communication in mind, as well as boosting efficiency and reliability within IT pipelines. The SRE Foundation qualification even comes from the DevOps Institute.
There are no prerequisites for taking this course. However, it can be helpful to have pre-existing knowledge of SRE, as well as DevOps.
SRE was originally developed by Google. Its purpose is to quantify the relationship between Development and Operations teams, ensuring that code is created efficiently, reliably, and with operational factors in mind. This is particularly valuable in organizations where IT departments and teams have become siloed from one another.
SRE is ideal for organizations that rely on developing and releasing code. It works particularly well in DevOps environments and is a popular choice with DevOps engineers and DevOps Leaders. Given the growing popularity of SRE, a qualified and experienced practitioner will often find it easier to take the next step in their career.
This Good e-Learning Site Reliability Engineering Foundation course takes a unique approach to introducing principles and methods that allow engineers, developers, architects, and all teams to focus on work that drives value within their disciplines.