Meta's Wide Area Network (WAN) connects many datacenter (DC) regions and hundreds of Points of Presence (POPs). This WAN capacity is shared by several services with high network demand. The network must be built for peak demand and must also account for failure scenarios to reduce the impact on Meta products. However, building a resilient network that is over-provisioned for the peak demand of every service is practically infeasible at our current growth rates because of fiber sourcing, deployment constraints, and the costs involved.
This talk presents Meta’s production traffic classification and WAN Entitlement solution, which our services currently use to share the network safely and efficiently. The Network Entitlement framework aims to provide a simple, stable, and operations-friendly abstraction of the network for sharing the backbone. The framework has two key parts: (1) a hose-based entitlement granting system that establishes an agile contract while achieving network efficiency and meeting long-term SLO guarantees, and (2) a flexible, large-scale, distributed host-based traffic admission system that enforces the contract on production traffic.
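To make the two parts concrete, here is a minimal Python sketch of the general hose model and of a host-side conformance decision. It is an illustration under stated assumptions, not Meta's implementation: the names `HoseEntitlement`, `conforms_to_hose`, and `conforming_fraction` are hypothetical, entitlements are assumed to be per-region aggregate ingress and egress quotas in Gbps, and the admission rule (mark traffic as conforming in proportion to the entitled share) is one plausible policy chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoseEntitlement:
    """Hypothetical per-region aggregate quota, in Gbps.

    In a hose model the contract is expressed per endpoint (total in,
    total out) rather than per source-destination pair.
    """
    egress_gbps: float
    ingress_gbps: float


NO_QUOTA = HoseEntitlement(egress_gbps=0.0, ingress_gbps=0.0)


def conforms_to_hose(demand, entitlements):
    """Return True if a traffic matrix fits within the hose entitlements.

    demand: dict mapping (src_region, dst_region) -> Gbps.
    entitlements: dict mapping region -> HoseEntitlement.
    Regions without an entitlement are assumed to have zero quota.
    """
    egress = {}
    ingress = {}
    for (src, dst), gbps in demand.items():
        egress[src] = egress.get(src, 0.0) + gbps
        ingress[dst] = ingress.get(dst, 0.0) + gbps

    for region, total in egress.items():
        if total > entitlements.get(region, NO_QUOTA).egress_gbps:
            return False
    for region, total in ingress.items():
        if total > entitlements.get(region, NO_QUOTA).ingress_gbps:
            return False
    return True


def conforming_fraction(offered_gbps, entitled_gbps):
    """Fraction of a service's traffic a host would treat as conforming.

    Assumed policy: everything conforms while the service is within its
    entitlement; above it, only the entitled share does.
    """
    if offered_gbps <= 0:
        return 1.0
    return min(1.0, entitled_gbps / offered_gbps)


if __name__ == "__main__":
    quotas = {r: HoseEntitlement(10.0, 10.0) for r in ("A", "B", "C")}
    within = {("A", "B"): 6.0, ("A", "C"): 4.0}
    over = {("A", "B"): 6.0, ("A", "C"): 5.0}
    print(conforms_to_hose(within, quotas))
    print(conforms_to_hose(over, quotas))
    print(conforming_fraction(20.0, 10.0))
```

The sketch shows why a hose contract can be called agile: a service may shift traffic between destinations freely, and the check only fails when a region's aggregate ingress or egress exceeds its quota.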