BEYOND THE BUZZ: EVOLUTION OF AIOPS TO IMPROVE RELIABILITY AT SCALE

Nitin Gupta

Madhura Parikh

TOPIC: Data, Systems and Networking

@SCALE SERIES: Systems @Scale

TYPE: video

YEAR: 2023

Operating globally distributed services at Meta scale is no easy feat. As systems grow more complex and telemetry data keeps growing, engineers are left looking for a needle in a haystack during high-pressure critical incidents. How can we use automation to help engineers accelerate root cause analysis and incident mitigation? In this talk, we will demystify the industry buzz around AIOps. You will learn about our multi-year journey of embracing AIOps at Meta and leave with a blueprint for improving the reliability of your systems!