From the course: Monitoring and Observability with Datadog


Defining SLIs, SLOs, and error budgets with Datadog


- The concepts of SLIs and SLOs are fast becoming a core part of SRE and of how we measure system uptime and availability. Service level indicators, or SLIs, are key metrics that measure system performance. While services often have many metrics, SLIs are supposed to be the most critical and the most customer-impacting. The latency or error rate of a service directly impacts how a user experiences your system, and can be part of the SLIs that contribute to how you measure your system's availability. Service level objectives, or SLOs, are the availability targets for a service. You might be familiar with service level agreements, or SLAs, such as the ones AWS publishes for its services, which are often quoted in nines: nine nines, for example, would mean a service is available 99.9999999 percent of the time. SLOs are the internal version of SLAs, and are usually set to a higher number internally. This holds engineering teams to a higher standard but gives room for error so that the SLA can still be met.…
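To make the relationship between an SLI, an SLO, and the resulting error budget concrete, here is a minimal Python sketch of the arithmetic. It is not Datadog's API or the course's own code: the function names, the 30-day window, and the request counts are all assumptions chosen for illustration.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of unavailability an availability SLO allows over a window.

    Hypothetical helper for illustration; the 30-day default is an assumption.
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def availability_sli(good_events: int, total_events: int) -> float:
    """Availability SLI as a percentage: good events divided by all events."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return 100 * good_events / total_events


def budget_consumed(good_events: int, total_events: int, slo_percent: float) -> float:
    """Fraction of the error budget used so far (1.0 means fully spent)."""
    bad_events = total_events - good_events
    allowed_bad = total_events * (100 - slo_percent) / 100
    if allowed_bad == 0:
        # A 100% SLO leaves no budget: any failure at all exhausts it.
        return float("inf") if bad_events else 0.0
    return bad_events / allowed_bad


if __name__ == "__main__":
    # Assumed targets: an internal SLO set stricter than the external SLA.
    sla, slo = 99.9, 99.95
    print(f"SLA budget: {error_budget_minutes(sla):.1f} min per 30 days")
    print(f"SLO budget: {error_budget_minutes(slo):.1f} min per 30 days")

    # Assumed traffic: 1,000,000 requests, 300 of which failed.
    good, total = 999_700, 1_000_000
    print(f"SLI: {availability_sli(good, total):.3f}%")
    print(f"SLO budget consumed: {budget_consumed(good, total, slo):.0%}")
```

With these assumed numbers, a 99.9 percent SLA allows roughly 43.2 minutes of downtime in 30 days, while the stricter 99.95 percent SLO allows about 21.6. The gap between the two is the room for error that lets a team miss its internal target and still meet the SLA.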
