As LLM applications evolve into multi-agent systems and power complex decision-making workflows, the ability to observe and debug their behavior becomes a core engineering challenge. These systems are dynamic, non-deterministic, and increasingly reliant on external tools and APIs, making traditional monitoring approaches insufficient. At Fiddler, we’ve worked with enterprise and federal teams deploying LLMs at scale, and what we’ve consistently seen is that the absence of effective observability creates blind spots that delay iteration and introduce risk.

In this talk, we will introduce Agentic Observability, a set of techniques and infrastructure for monitoring production LLM systems. We will walk through how we trace agent reasoning and tool usage in structured form, apply Fast Trust Models to evaluate output quality beyond token-level accuracy, and monitor shifts in behavior using statistical and embedding-based methods. We will also share how we enable integration testing for agent workflows by simulating decision paths and validating semantic intent, all while operating under the scale and latency constraints of modern AI stacks.

This work bridges AI science, platform engineering, and real-world GenAI deployment. We will highlight engineering lessons learned from high-scale environments and show how these observability tools are helping teams move faster, catch failures earlier, and build AI systems that can be trusted in production.
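As a rough illustration of what tracing agent reasoning and tool usage "in structured form" can look like, the sketch below records each agent step as a JSON span keyed by a trace ID. The class names, fields, and emit mechanism are illustrative assumptions for this example only, not Fiddler's actual schema or API.

```python
# Minimal sketch of structured agent tracing (illustrative; not Fiddler's implementation).
# Each reasoning step or tool call becomes a span that can be stored, queried, and
# monitored for drift or failures later.
import json
import time
import uuid
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class AgentSpan:
    trace_id: str                      # groups all steps of one agent run
    name: str                          # e.g. "plan", "tool:search", "final_answer"
    attributes: dict = field(default_factory=dict)
    started_at: float = field(default_factory=time.time)
    ended_at: Optional[float] = None

    def end(self, **attrs) -> None:
        # Attach outputs/status and close the span.
        self.attributes.update(attrs)
        self.ended_at = time.time()

def emit(span: AgentSpan) -> None:
    # A real system would ship this to a tracing/observability backend;
    # here we just write one JSON line per span.
    print(json.dumps(asdict(span)))

# Usage: wrap each tool invocation in a span.
trace_id = str(uuid.uuid4())
span = AgentSpan(trace_id=trace_id, name="tool:search",
                 attributes={"query": "quarterly revenue"})
result = "ACME revenue grew 12% YoY"   # stand-in for the real tool call
span.end(result_preview=result[:100], status="ok")
emit(span)
```

Structured spans like these are also what downstream checks (for example, embedding-based drift detection over recorded outputs) would operate on, though those components are beyond this sketch.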