Improving SLOs with Nobl9 and Google Cloud

Google Site Reliability Engineering (SRE) helped popularize Service Level Objectives (SLOs) by publishing several SRE books. Since then, the SRE approach to understanding the health of modern distributed systems has helped many teams modernize their operations. By defining SLOs tailored to their needs, Google Cloud customers can advance their technology and grow their customer base without sacrificing reliability or burning out their teams chasing false positives. SLOs help teams understand which problems directly harm customer experience, a tremendous insight. However, adopting SLOs can be difficult: reliability requirements change constantly with customer expectations, and the inherent tradeoffs in setting goals are counterintuitive to the uninitiated. Identifying the right metrics and developing processes for using SLOs effectively is not plug-and-play; it also means changing the way you run the software services that support your business.

Could AI technologies offer a better way? In particular, could large language models (LLMs), which have been shown to be useful, simple interfaces to potentially large sets of structured and unstructured data, help? Nobl9’s new LLM-based product, SLOgpt.ai, built using Google Cloud, is designed to answer these questions by giving users expert-level interpretation of signals through a natural-language interface.

What if you could use the power of LLMs to simplify your understanding of SLOs?

Understanding the reliability of a large organization is difficult, even with the right tooling and data at hand. Tracking every product or subsystem can be exhausting. Speaking of tooling, 72% of companies use six or more monitoring and observability tools. That’s a lot! Yet, as the latest DORA report shows, reliability is an important part of software delivery and is required to achieve strong organizational outcomes.

When an organization has multiple teams building and operating many products, each of which might be composed of many different services, the number of SLOs can be staggering.