Presto is an open source, high-performance, distributed relational database system designed to make SQL analytics over big data fast and easy at Facebook. It provides rich SQL language capabilities that let data engineers, data scientists, and business analysts quickly and interactively process terabytes to petabytes of data. Presto is widely used at Facebook for interactive analytics, with thousands of active users.
Facebook also uses Presto to accelerate a massive batch pipeline workload in its Hive Warehouse and to support custom analytics workloads with low-latency and high-throughput requirements. As an open source project, Presto has been adopted externally by many companies, including Comcast, LinkedIn, Netflix, and Walmart. In addition, Presto is offered as a managed service by multiple vendors, including Amazon, Qubole, and Starburst Data. In this talk, Vaughn Washington, Software Engineering Manager at Facebook, outlines a selection of use cases that Presto supports at Facebook, describes its architecture, and discusses several features that enable it to support these use cases.