Improving Reliability Through Data-Driven Engineering & Culture Changes

Over the past two years, we have used a data-driven strategy to improve system reliability across Monetization. The first step was to quantify the impact of reliability work by defining a longitudinal metric that measures how reliability failures affect the business. For the Advertising business at Meta, the negative impact of SEVs (system outages) can be characterized as advertiser value lost: when our systems cannot deliver optimal value for advertisers, the company loses revenue in both the short and the long term. Teams across the company then set clear, data-driven goals to reduce the negative impact on advertisers of SEVs affecting their systems. These goals were based on data mining of past SEVs, which helped identify hot spots that could be addressed through the four levers of reliability: prevention, detection, mitigation, and culture.
