With billions of active users, Meta’s incident response process is critical to maintaining our reliability commitments. In this talk, we explore the challenges created by the complexity and scale of our operations, and how we are using AI to revolutionize responder onboarding and root cause analysis. Join us to learn about our journey, the lessons we’ve learned, and our vision for the future of incident response.