DECEMBER 07, 2022

Chaos Engineering at Scale

At Expedia Group, we are building our on-road experience, which includes a common runtime compute platform with a target scale of more than 15,000 applications running on Kubernetes across a fleet of tens of thousands of nodes.

In this talk, we will present our chaos engineering platform, a part of our platform on-road experience, which aims to let thousands of engineers run chaos experiments. We will touch upon the importance of a great developer experience, scaling the platform through integrations with continuous delivery mechanisms, and operational aspects such as monitoring and runbooks. We will also share what we learned from promoting the platform through GameDays, bite-size videos, and success stories. Finally, we will demonstrate our recent work on closing the feedback loop between reliability best practices and tools through our reliability hub.
