Improving L4 Routing Consistency at Meta

We work on a layer 4 load balancer called Shiv. Shiv routes packets to backends using a consistent hash of the packet's 5-tuple (source IP, destination IP, source port, destination port, and protocol). Shiv's objective is to route all packets for a connection (which share the same 5-tuple) to the same backend for the duration of the connection. Failing to do so breaks connections and causes user impact (for example, stalled videos).

While consistent hashing is quite resilient to changes, adding or removing a large number of backends causes remappings, which result in broken connections. To protect against such changes, Shiv maintains a cache that maps 5-tuples to backends. The logic Shiv uses to route a packet can be summarized as follows (and is sketched in code at the end of this post): if the packet's 5-tuple is in the cache, route it to the backend indicated by the cache; otherwise, compute the consistent hash of the 5-tuple to obtain the destination backend, route the packet there, and place the (5-tuple, backend) entry in the cache.

Shiv works well under the following conditions:
– In steady state, when the arrangement of Shivs and backends is unchanged.
– When the arrangement of Shivs changes. Packets for a connection may land on a different Shiv host than earlier packets, but both Shiv hosts use the same consistent hash function and therefore pick the same backend.
– When the arrangement of backends changes. Packets for a connection continue to land on the same Shiv host, which uses its cache to route them to the same backend as before.

However, when the arrangement of both Shivs and backends changes, a nontrivial number of misroutings occur, because the following sequence of events can happen:
– Packets for a connection C arrive at a Shiv host A, which picks a backend X.
– A large topology change occurs on the Shivs and backends.
– Packets for connection C now land at a Shiv host B != A, which picks a backend Y != X because the hash ring has changed.

We have implemented two solutions to this problem, which we will talk about:
– Embedding "Server ID" hints into packets, which enable Shiv to route packets to a specific server without having to perform a consistent hash.
– Sharing the 5-tuple-to-backend cache among all Shivs in a cluster, so that they make consistent routing decisions in the face of hash ring changes.
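To make the cache-then-hash decision concrete, here is a minimal sketch of the per-packet routing logic described above. The class and function names (ShivRouter, FiveTuple, _hash64) are our own illustrative choices, not Shiv's actual implementation, and the single-point hash ring stands in for whatever consistent hash Shiv really uses:

```python
import bisect
import hashlib
from typing import Dict, List, NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def _hash64(value: str) -> int:
    """Map a string to a 64-bit point on the hash ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class ShivRouter:
    """Per-packet decision: cache lookup first, consistent hash second."""

    def __init__(self, backends: List[str]):
        # A minimal hash ring: each backend owns the arc ending at its hash value.
        # A production ring would place many virtual nodes per backend.
        self._ring = sorted((_hash64(b), b) for b in backends)
        self.cache: Dict[FiveTuple, str] = {}  # 5-tuple -> backend

    def _consistent_hash(self, flow: FiveTuple) -> str:
        point = _hash64(repr(flow))
        keys = [h for h, _ in self._ring]
        idx = bisect.bisect_right(keys, point) % len(self._ring)
        return self._ring[idx][1]

    def route(self, flow: FiveTuple) -> str:
        # 1. If the 5-tuple is already in the cache, reuse that backend.
        if flow in self.cache:
            return self.cache[flow]
        # 2. Otherwise pick a backend via the consistent hash, route there,
        #    and remember the choice for later packets of the same connection.
        backend = self._consistent_hash(flow)
        self.cache[flow] = backend
        return backend
```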
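The first solution, server ID hints, can be sketched on top of the same class. How the IDs are assigned and encoded into packets is not covered here; the sketch simply assumes the hint has already been extracted from the packet and is available as an integer, with a hypothetical server-ID-to-backend mapping:

```python
from typing import Dict, List, Optional, Tuple


class HintAwareRouter(ShivRouter):
    """If a packet carries a server ID hint that names a known backend,
    route on the hint and skip the consistent hash and cache entirely."""

    def __init__(self, backends: List[str], server_ids: List[Tuple[int, str]]):
        super().__init__(backends)
        # Hypothetical server-ID -> backend mapping; assigning these IDs and
        # embedding them into packets is outside the scope of this sketch.
        self._by_server_id: Dict[int, str] = dict(server_ids)

    def route_packet(self, flow: FiveTuple, server_id_hint: Optional[int]) -> str:
        if server_id_hint is not None and server_id_hint in self._by_server_id:
            return self._by_server_id[server_id_hint]
        # No usable hint: fall back to the cache + consistent hash path.
        return self.route(flow)
```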
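The second solution, the shared cache, is about where the 5-tuple-to-backend mapping lives rather than how a single host decides. The mechanism for sharing the cache across hosts is not described here; in the toy example below a single in-process dict stands in for it, which is enough to show why a connection that moves to a different Shiv host still reaches the backend chosen by the first host:

```python
shiv_backends = ["backend-1", "backend-2", "backend-3"]
shared_cache: Dict[FiveTuple, str] = {}

# Two Shiv hosts in the same cluster, both reading and writing one cache.
shiv_a = ShivRouter(shiv_backends)
shiv_a.cache = shared_cache
shiv_b = ShivRouter(shiv_backends + ["backend-4"])  # its hash ring has changed
shiv_b.cache = shared_cache

flow = FiveTuple("192.0.2.1", "198.51.100.7", 40000, 443, 6)
first_choice = shiv_a.route(flow)          # host A picks and caches a backend
assert shiv_b.route(flow) == first_choice  # host B reuses A's cached decision
```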
