Systems @Scale Fall 2021

Power Loss Siren: Making Facebook Resilient to Power Outages

Power outages cause the majority of unplanned server downtime in Facebook data centers. During an outage, thousands of servers can go offline simultaneously for several hours, which can degrade services. All Facebook data center racks are equipped with batteries that can provide backup power for a few minutes after an outage. Power Loss Siren (PLS) is a rack-level, low-latency, distributed system that detects power outages and alerts on them. PLS leverages the existing in-rack batteries to notify services about impending power loss so that they can mitigate its impact. Typical mitigations include promoting remote database secondaries when primaries are losing power, routing requests away from affected hosts, and flushing memory to disk. PLS also simplifies physical infrastructure management, because critical services no longer require additional power source redundancy.
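The notification flow described above can be pictured with a minimal sketch. The abstract does not describe the PLS interface, so every name here (`PowerLossEvent`, `PowerLossNotifier`, `battery_budget_s`, and the three example handlers) is hypothetical and chosen for illustration only. The sketch assumes services register mitigation callbacks per rack, and that callbacks run only while the assumed battery budget has not run out.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PowerLossEvent:
    """Hypothetical alert emitted when a rack switches to battery power."""
    rack_id: str
    detected_at: float        # seconds on a monotonic clock
    battery_budget_s: float   # assumed backup time available on the rack


Handler = Callable[[PowerLossEvent], str]


@dataclass
class PowerLossNotifier:
    """Illustrative dispatcher: rack_id -> mitigation callbacks."""
    handlers: Dict[str, List[Handler]] = field(default_factory=dict)

    def subscribe(self, rack_id: str, handler: Handler) -> None:
        self.handlers.setdefault(rack_id, []).append(handler)

    def notify(self, event: PowerLossEvent,
               now: Callable[[], float] = time.monotonic) -> List[str]:
        """Run mitigations in registration order until the budget is spent."""
        deadline = event.detected_at + event.battery_budget_s
        results: List[str] = []
        for handler in self.handlers.get(event.rack_id, []):
            if now() >= deadline:
                break  # battery assumed exhausted; stop dispatching
            results.append(handler(event))
        return results


# Example mitigations mirroring the ones listed in the abstract.
def promote_remote_secondary(event: PowerLossEvent) -> str:
    return f"promote secondary for primaries on {event.rack_id}"


def drain_traffic(event: PowerLossEvent) -> str:
    return f"route requests away from {event.rack_id}"


def flush_memory_to_disk(event: PowerLossEvent) -> str:
    return f"flush memory to disk on {event.rack_id}"
```

A service would call `subscribe` once per rack it runs on, and the detector would call `notify` when the rack falls back to battery. Registration order doubles as priority here, which is a simplification: a real system would likely run mitigations concurrently and rank them by importance.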

