Scribe: Improving Reliability One 9 at a Time

Scribe is at the heart of data transport at Meta. Everything from revenue-critical data to important monitoring datasets flows through the system, which sets an extremely high reliability bar. In this talk, we will take you through our journey of adding the 5th 9 to our SLAs. We’ll talk about how we detect incidents before our customers do without overloading our on-calls, and how we hardened the system against regional and dependency outages.
