Andrii Rosa, Software Engineer at Facebook, and Matt Fuller, VP of Engineering at Starburst, discuss Presto, an open-source distributed SQL engine.
Presto is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Inspired by increasingly complex SQL queries, engineers at Facebook and Starburst have recently focused on cost-based query optimization.
In the first part of this talk, Andrii and Matt present the design and implementation of the Presto cost-based optimizer (CBO) to support connector-provided statistics, estimate selectivity, and choose efficient query plans. In the second part of the talk, they discuss a new mechanism in Presto that computes statistics seamlessly and efficiently, making all Presto-generated data ready for the CBO without any extra manual steps.
Finally, they discuss their future work on enhancing the CBO and statistics collection in Presto.