BEYOND THE BUZZ: EVOLUTION OF AIOPS TO IMPROVE RELIABILITY AT SCALE

Operating globally distributed services at Meta's scale is no easy feat. As systems grow more complex and telemetry data keeps growing, engineers are left looking for a needle in a haystack during high-pressure critical incidents. How can we use automation to help engineers accelerate root cause analysis and incident mitigation? In this talk, we will demystify the industry buzz around AIOps. You will learn about our multi-year journey of embracing AIOps at Meta and leave with a blueprint for improving the reliability of your systems!

