Boston Data @Scale – Presto: Pursuit of Performance
Andrii Rosa, Software Engineer at Facebook, and Matt Fuller, VP of Engineering at Starburst, discuss Presto, an open-source distributed SQL engine.
Presto is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Motivated by increasingly complex SQL queries, engineers at Facebook and Starburst have recently focused on cost-based query optimization.
In the first part of this talk, Andrii and Matt present the design and implementation of the Presto cost-based optimizer (CBO), which uses connector-provided statistics to estimate selectivity and choose efficient query plans. In the second part, they discuss a new mechanism in Presto that computes statistics seamlessly and efficiently, making all Presto-generated data ready for the CBO without any extra manual steps.
Finally, they discuss future work on enhancing the CBO and statistics collection in Presto.