If you read about modern Site Reliability Engineering (SRE) or enterprise customer contracts, you will frequently run into three acronyms: SLA, SLO, and SLI.
While they sound similar, they represent entirely different concepts—ranging from legal obligations to technical metrics.
For SaaS founders, understanding these terms is crucial to closing enterprise clients and setting realistic performance targets. Here is a simple, non-boring guide.
1. SLI (Service Level Indicator)
- What it is: The actual raw metric you measure to see if a system is behaving.
- Plain English: How are we doing right now?
- Example: The percentage of HTTP requests that return a success status code (2xx/3xx) over a 5-minute window. E.g., "99.8% of requests succeeded."
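As a rough illustration of the example above, a success-rate SLI can be computed from the status codes seen in one measurement window. This is a minimal sketch: the function name `success_rate_sli` and the sample data are made up for illustration, and a real system would usually read these counts from its monitoring stack rather than from a Python list.

```python
from typing import Iterable


def success_rate_sli(status_codes: Iterable[int]) -> float:
    """Return the percentage of requests that ended with a 2xx or 3xx status."""
    codes = list(status_codes)
    if not codes:
        # An empty window has no meaningful success rate.
        raise ValueError("no requests in the measurement window")
    good = sum(1 for code in codes if 200 <= code < 400)
    return 100.0 * good / len(codes)


# Hypothetical 5-minute window: 998 successful responses and 2 server errors.
window = [200] * 990 + [302] * 8 + [500] * 2
print(f"{success_rate_sli(window):.1f}% of requests succeeded")
```

With this sample window, 998 of 1,000 requests count as good, which matches the "99.8% of requests succeeded" example.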
2. SLO (Service Level Objective)
- What it is: The internal target or goal you set for your SLI.
- Plain English: What level of quality do we want to maintain?
- Example: "Our monthly request success rate must be at least 99.9%."
Setting an SLO is a trade-off. A 100% target is unattainable in practice, and chasing it would block your team from deploying new code, since every deploy carries some risk. Aim for a target that keeps users happy without paralyzing development.
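To make the trade-off concrete, it can help to translate a percentage target into the amount of failure it tolerates. The sketch below assumes a time-based reading of the target (minutes of downtime) and a 30-day month. Both are simplifications chosen for illustration, and `allowed_downtime_minutes` is a hypothetical helper name.

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of downtime a given target tolerates over the window."""
    if not 0.0 <= slo_percent <= 100.0:
        raise ValueError("slo_percent must be between 0 and 100")
    total_minutes = window_days * 24 * 60  # 43,200 minutes in a 30-day window
    return total_minutes * (100.0 - slo_percent) / 100.0


# Compare a few candidate targets, including the impossible 100%.
for slo in (99.0, 99.9, 99.99, 100.0):
    minutes = allowed_downtime_minutes(slo)
    print(f"{slo}% target -> {minutes:.1f} minutes of tolerated downtime")
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes per month for incidents and risky deploys, while a 100% target leaves none at all.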
3. SLA (Service Level Agreement)
- What it is: The legal commitment you make to your customers, including the consequences if you fail to meet it.
- Plain English: What happens if we break our promise?
- Example: "If our monthly uptime falls below 99.5%, we will refund 10% of your subscription fee."
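The example clause above can be expressed as a small calculation. This is only a sketch of that one sample clause: the 99.5% threshold and 10% refund come from the example, while the function name `sla_credit` and the fee amounts are hypothetical. Real contracts often use tiered credits and exclusions that are not modeled here.

```python
def sla_credit(
    monthly_uptime_percent: float,
    monthly_fee: float,
    threshold_percent: float = 99.5,  # from the sample clause
    credit_rate: float = 0.10,        # 10% refund, from the sample clause
) -> float:
    """Return the refund owed for the month under the sample SLA clause."""
    if monthly_uptime_percent < threshold_percent:
        return round(monthly_fee * credit_rate, 2)
    return 0.0


# Hypothetical customer paying 500.00 per month.
print(sla_credit(99.2, 500.0))  # uptime below the threshold, credit applies
print(sla_credit(99.7, 500.0))  # uptime above the threshold, no credit
```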
Key Advice for Startups
- SLA ≠ SLO: Keep your internal target (SLO) stricter than your contractual promise (SLA). If your SLA is 99.0%, set your SLO to 99.9%. This gives you a safety margin to detect and resolve issues before penalties apply.
- Start Internal: Define your SLIs and SLOs first. Do not offer a legal SLA to customers until you have at least six months of production telemetry to prove you can hit those goals.
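
One way to picture the safety margin between the two thresholds is to classify a measured SLI against both. The sketch below uses the 99.9% SLO and 99.0% SLA figures from the advice above. The function name `classify` and the status labels are invented for illustration and are not standard terminology.

```python
def classify(
    sli_percent: float,
    slo_percent: float = 99.9,  # internal target
    sla_percent: float = 99.0,  # contractual promise
) -> str:
    """Place a measured SLI relative to the internal and contractual thresholds."""
    if slo_percent <= sla_percent:
        # The internal target must be stricter than the contract for a margin to exist.
        raise ValueError("SLO should be stricter than SLA")
    if sli_percent >= slo_percent:
        return "healthy"
    if sli_percent >= sla_percent:
        return "slo_missed"  # inside the safety margin: fix it before penalties apply
    return "sla_breached"    # contractual consequences now apply


# Hypothetical monthly measurements.
for measured in (99.95, 99.4, 98.7):
    print(measured, classify(measured))
```

A result of `slo_missed` is the early warning the margin is meant to provide: the team has missed its own goal, but the customer contract is still intact.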
