The Meta Thrift Journey

This post is for engineers who manage large-scale information systems serving millions of people. Operating such systems often introduces complex, unprecedented engineering challenges.

Thrift is a framework consisting of Codegen, Serialization, and RPC (remote procedure call) for service communication. Here’s a diagram that describes Thrift in relation to a typical client-server stack.

We initially introduced Thrift to the public in 2007 as part of Apache Incubator, along with an original whitepaper. We reintroduced Facebook Thrift as an open-source project in 2012 to meet Meta’s new infrastructure requirements. The reintroduction included improving asynchronous workload performance and per-request features. 

Since then, Thrift has been used at scale, becoming the standard communication framework across Meta’s infrastructure. Here are some of the benefits we’ve seen from using Thrift at scale:

  1. Code generation, leading to language interoperability: Thrift provides a way to express the data model, server, and client interfaces in a consistent manner across different programming languages. This enables applications written in different languages to communicate with one another.
  2. Performant serialization: Thrift expects a well-typed data model, so the shape of the data structure being sent over the wire is known in advance. Consequently, serialization tends to be performant.
  3. Pluggable transport: Thrift has been designed in a modular fashion. This enables us to use different networking protocols. Depending on the workloads and usage, at Meta we use TCP, HTTP, and others. 

Currently, Thrift supports O(100B) QPS. We’ve invested significantly in Thrift to be able to support this scale. In this blog post, we’ll focus on a few efforts that have contributed to achieving overall efficiency.

Operating Thrift at Scale

The first step towards ensuring Thrift could operate at scale was to unify myriad systems to use Thrift over the course of a few years.

In this post, we discuss fast-path optimization, one of the initiatives to improve deserialization performance and unblock more latency-aware services from leveraging Thrift out of the box.

Below is an example of fast-path optimization. Consider a data structure, Record, whose first field is a 32-bit integer and whose second field is a string.

A struct is serialized as a sequence of fields. Each field contains the field header followed by the actual content. The following BNF summarizes the encoding:

For the binary serialization protocol, each field of a struct is encoded with the following layout:

Type Code    Field ID    Value
8 bits       16 bits     Variable

Each field consists of an 8-bit type code, a 16-bit field ID, and then the serialized value.
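As a rough illustration of this layout, the sketch below hand-encodes a Record with the standard library only. The field names (id, name), the field IDs 1 and 2, and the helper function names are assumptions made for the example. The type codes (8 for i32, 11 for string, 0 for the stop marker) and the 4-byte length prefix on strings follow the Thrift binary protocol. This is not the code Thrift generates.

```python
import struct

# Type codes used by the Thrift binary protocol.
T_STOP, T_I32, T_STRING = 0, 8, 11


def encode_field_header(type_code: int, field_id: int) -> bytes:
    # 8-bit type code followed by a 16-bit big-endian field ID.
    return struct.pack(">bh", type_code, field_id)


def encode_i32(value: int) -> bytes:
    # 32-bit big-endian signed integer.
    return struct.pack(">i", value)


def encode_string(value: str) -> bytes:
    # 32-bit big-endian byte length, then the UTF-8 bytes.
    data = value.encode("utf-8")
    return struct.pack(">i", len(data)) + data


def encode_record(id_value: int, name: str) -> bytes:
    """Serialize a hypothetical Record {1: i32 id; 2: string name}."""
    return (
        encode_field_header(T_I32, 1) + encode_i32(id_value)
        + encode_field_header(T_STRING, 2) + encode_string(name)
        + bytes([T_STOP])  # a single stop byte terminates the struct
    )


encoded = encode_record(42, "hi")
print(encoded.hex(" "))
```

Each field contributes three header bytes before its value, so the two fields here account for six header bytes in total.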

Users can freely add new fields, remove existing fields, or reorder fields. Parsers handle fields that are missing, unknown, or out of order. The parsers are able to deserialize while taking all possible paths into consideration (for example, unknown field ID, trivial types, recursively going over compound types, and so on).

The simplified flowchart of the generic parser looks like this:
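The generic parsing loop can also be sketched in code. The following is a minimal approximation, not Thrift's actual parser: the RECORD_SCHEMA table, the field names, and the function names are assumptions, and only a handful of type codes are handled. It shows the properties described above: fields may arrive in any order, unknown fields are skipped (recursively for nested structs), and missing fields are simply left unset.

```python
import struct

# Type codes from the Thrift binary protocol.
T_STOP, T_BOOL, T_I32, T_I64, T_STRING, T_STRUCT = 0, 2, 8, 10, 11, 12
FIXED_SIZES = {T_BOOL: 1, T_I32: 4, T_I64: 8}

# Hypothetical schema for Record: field ID -> (field name, expected type code).
RECORD_SCHEMA = {1: ("id", T_I32), 2: ("name", T_STRING)}


def read_value(buf: bytes, pos: int, type_code: int):
    """Decode one i32 or string value; return (value, next position)."""
    if type_code == T_I32:
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if type_code == T_STRING:
        (length,) = struct.unpack_from(">i", buf, pos)
        start = pos + 4
        return buf[start:start + length].decode("utf-8"), start + length
    raise ValueError(f"cannot read type code {type_code}")


def skip_value(buf: bytes, pos: int, type_code: int) -> int:
    """Skip over a value the reader does not know; return the next position."""
    if type_code in FIXED_SIZES:
        return pos + FIXED_SIZES[type_code]
    if type_code == T_STRING:
        (length,) = struct.unpack_from(">i", buf, pos)
        return pos + 4 + length
    if type_code == T_STRUCT:
        # Recurse through the nested struct's fields until its stop byte.
        while buf[pos] != T_STOP:
            nested_type = buf[pos]
            pos = skip_value(buf, pos + 3, nested_type)  # +3 skips the header
        return pos + 1
    raise ValueError(f"cannot skip type code {type_code}")


def parse_generic(buf: bytes, schema: dict) -> dict:
    """Loop over fields in wire order, keeping known ones and skipping the rest."""
    result, pos = {}, 0
    while buf[pos] != T_STOP:
        type_code = buf[pos]
        (field_id,) = struct.unpack_from(">h", buf, pos + 1)
        pos += 3
        known = schema.get(field_id)
        if known is not None and known[1] == type_code:
            value, pos = read_value(buf, pos, type_code)
            result[known[0]] = value
        else:
            pos = skip_value(buf, pos, type_code)
    return result
```

For instance, a payload that carries an unknown i64 field first, then the string field, then the integer field still parses into the same Record contents. The cost is a loop with a lookup and several branches per field.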

We’ve observed that in most common cases, the schema of Record is the same between the server and client. This is the common case that can be optimized via a fast path.

For the struct Record, we can generate specialized code to parse it: We check whether each field on the wire has the expected field type, and if so, we parse each field as the corresponding type; otherwise, we fall back to the generic parser. This is faster because the whole process is predictable and there is no loop, which is good for branch prediction and constant folding. The overall simplified flowchart looks like this:
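The control flow of such a specialized parser might look like the sketch below. It assumes a hypothetical Record {1: i32 id; 2: string name}. The function names and the compact generic fallback (which handles only i32 and string values) are stand-ins written for this example, not Thrift's generated code. Python will not show the branch-prediction or constant-folding benefits, so the sketch illustrates only the shape of the logic: compare the expected headers at fixed offsets, decode straight through, and fall back otherwise.

```python
import struct

T_STOP, T_I32, T_STRING = 0, 8, 11

# Expected header bytes (type code + field ID) for the hypothetical Record.
ID_HEADER = struct.pack(">bh", T_I32, 1)
NAME_HEADER = struct.pack(">bh", T_STRING, 2)


def parse_record_generic(buf: bytes) -> dict:
    """Compact stand-in for the generic parser (i32 and string values only)."""
    names = {(T_I32, 1): "id", (T_STRING, 2): "name"}
    result, pos = {}, 0
    while buf[pos] != T_STOP:
        type_code = buf[pos]
        (field_id,) = struct.unpack_from(">h", buf, pos + 1)
        pos += 3
        if type_code == T_I32:
            value, pos = struct.unpack_from(">i", buf, pos)[0], pos + 4
        elif type_code == T_STRING:
            (length,) = struct.unpack_from(">i", buf, pos)
            start = pos + 4
            value, pos = buf[start:start + length].decode("utf-8"), start + length
        else:
            raise ValueError(f"unsupported type code {type_code}")
        key = names.get((type_code, field_id))
        if key is not None:
            result[key] = value
    return result


def parse_record_fast(buf: bytes) -> dict:
    """Fast path: straight-line decoding when the wire layout matches Record."""
    # Smallest matching payload: 3 + 4 + 3 + 4 + 0 + 1 = 15 bytes.
    if len(buf) >= 15 and buf[0:3] == ID_HEADER and buf[7:10] == NAME_HEADER:
        (id_value,) = struct.unpack_from(">i", buf, 3)
        (length,) = struct.unpack_from(">i", buf, 10)
        end = 14 + length
        # Only accept if the struct ends right after the string.
        if length >= 0 and end < len(buf) and buf[end] == T_STOP:
            return {"id": id_value, "name": buf[14:end].decode("utf-8")}
    # Anything unexpected (reordered, missing, or extra fields) goes generic.
    return parse_record_generic(buf)
```

A payload written by a peer with the same schema takes the first branch and never enters a loop. A reordered payload produces the same result through the fallback.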

Based on the optimization above, and on many other optimizations in this vein, we improved the overall performance and enabled the usage of Thrift across most services at Meta, and we now support O(100B) QPS.

Thrift Delta Representation 

As we brought more workloads onto Thrift, an important use case emerged around providing a minimal wire representation for services that were starting to be network constrained. 

The proposed approach is to update Thrift to provide a delta representation over the wire. Similar concepts can be found in CRUD and HTTP applications. SQL has an UPDATE statement that allows the user to update an entire row or a specific column in the row, and HTTP has the PATCH request method, which allows partial modification of resources. Thrift Delta Representation automatically generates an additional Thrift schema that can represent generic mutation semantics of the user-provided schema. The schema generation is crucial here, as it allows Thrift Delta Representation to be sent over the wire and be serialized or deserialized with the Thrift serialization framework.

The following diagram illustrates the schema generated for Thrift Delta Representation.

  • A struct Foo is annotated with @patch.GeneratePatch to generate a struct FooPatch, where each field consists of a desired operation.
  • The assign, clear, patchPrior, ensure, and patchAfter operations are supported for structured data.
  • assign completely replaces the existing Foo.
  • clear resets the value to its intrinsic default.
  • patchPrior and patchAfter provide partial modification for each field in a struct, where patchPrior only updates a field when the field is not absent.
  • ensure sets the value of a field if the field is absent.
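To make these semantics concrete, here is a small model of how such a patch might be applied. It is a sketch under several assumptions and does not use the Thrift API: StructPatch and apply_patch are hypothetical names, a struct is modeled as a dict in which a missing key means an absent field, clearing is modeled as dropping every field, per-field patches are plain functions rather than generated patch structs, and the operations are assumed to run in the order listed above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StructPatch:
    """Hypothetical stand-in for a generated FooPatch; not the Thrift API."""
    assign: Optional[dict] = None                     # full replacement value
    clear: bool = False                               # reset before patching
    patch_prior: dict = field(default_factory=dict)   # name -> function(old) -> new
    ensure: dict = field(default_factory=dict)        # name -> value used if absent
    patch_after: dict = field(default_factory=dict)   # name -> function(old) -> new


def apply_patch(patch: StructPatch, value: dict) -> dict:
    """Return a patched copy of value; the input dict is left untouched."""
    # assign replaces the whole struct, so nothing else applies.
    if patch.assign is not None:
        return dict(patch.assign)
    # clear is modeled here as dropping every field.
    result = {} if patch.clear else dict(value)
    # patchPrior only touches fields that are already present.
    for name, update in patch.patch_prior.items():
        if name in result:
            result[name] = update(result[name])
    # ensure fills in fields that are still absent.
    for name, default in patch.ensure.items():
        result.setdefault(name, default)
    # patchAfter runs last, so it also sees fields that ensure just set.
    for name, update in patch.patch_after.items():
        if name in result:
            result[name] = update(result[name])
    return result


foo = {"count": 1}
patch = StructPatch(
    patch_prior={"count": lambda v: v + 1, "label": lambda s: s + "!"},
    ensure={"label": "new"},
    patch_after={"label": str.upper},
)
patched = apply_patch(patch, foo)
```

In this model, the patchPrior entry for label is skipped because label is absent at that point, while the patchAfter entry applies because ensure has set the field by then.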

Thrift Delta Representation generates the schema illustrated above so that a user can simply create a single method to partially modify any Thrift value, including primitives, containers, and structs. For example, a user can create a single method as shown below. 

Thrift Delta Representation strives for simplicity of usage, as the complex generated schema can be hard to use. The new method directly accepts FooPatch as the input. As FooPatch can support the generic representation of any mutation of Foo, the single method is sufficient. With the helper class, we provide an easy-to-use interface and validation to populate Thrift Delta Representation. For example, patch.patch<FieldId{1}> patches a field with Field ID 1. As the helper class also provides type safety validation, it will reject any other types when a user attempts to assign them. To consume the patch, a user can simply use the apply method, such as patch.apply(foo_), which effectively performs all operations specified in the Thrift Delta Representation. 

We are actively working on leveraging Thrift’s delta representation for several services and platforms at Meta and hope to share more wins soon. Please check out our GitHub repository for the latest updates!
