Apache Impala is a highly popular open source SQL interface built for large-scale data warehouses. Impala has been deployed in production at more than 800 enterprise customers as part of Cloudera Enterprise, managing warehouses up to 40 PB in size. Hadoop Distributed File System (HDFS), cloud object stores, and scalable columnar storage engines make it cheap and easy to store large volumes of data in one place rather than spread across many silos. This data attracts queries, and, soon enough, contention for resources arises between different queries, workloads, and organizations. Without resource management policies and enforcement, critical queries can’t run and users can’t interactively query the data. In this talk, Tim Armstrong, Software Engineer at Cloudera, discusses the challenges in making resource management work at scale for SQL analytics and how his team is tackling them in Apache Impala.
- WATCH NOW
- 2024 EVENTS
- PAST EVENTS
- 2023
- 2022
- February
- RTC @Scale 2022
- March
- Systems @Scale Spring 2022
- April
- Product @Scale Spring 2022
- May
- Data @Scale Spring 2022
- June
- Systems @Scale Summer 2022
- Networking @Scale Summer 2022
- August
- Reliability @Scale Summer 2022
- September
- AI @Scale 2022
- November
- Networking @Scale Fall 2022
- Video @Scale Fall 2022
- December
- Systems @Scale Winter 2022
- 2021
- 2020
- 2019
- 2018
- 2017
- 2016
- 2015
- Blog & Video Archive
- Speaker Submissions