In 2013, eight years ago, Amazon Web Services revolutionized the data warehousing industry by launching Amazon Redshift, the first fully managed, petabyte-scale cloud data warehouse solution. Amazon Redshift made it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools. This launch was a significant leap from the traditional on-premise data warehousing solutions which were expensive, rigid (not elastic), and needed a lot of tribal knowledge to perform. Unsurprisingly, customers embraced Amazon Redshift and it went on to become the fastest growing service in AWS. Today, tens of thousands of customers use Amazon Redshift in AWS’s global infrastructure of 25 launched Regions and 81 Availability Zones (AZs) to process Exabytes of data daily.
The success of Amazon Redshift inspired a lot of innovation in the analytics industry which in turn has benefited consumers. In the last few years, the use cases for Amazon Redshift have evolved and in response, Amazon Redshift has delivered a series of innovations that continue to delight customers. In this talk, we take a peek under the hood of Amazon Redshift, and give an overview of its architecture. We focus on the core of the system and explain how Amazon Redshift maintains its differentiating industry-leading performance and scalability. We then talk about Amazon Redshift’s autonomics. In particular, we present how Redshift continuously monitors the system and uses machine learning to improve its performance and operational health without the need of dedicated administration resources. Finally, we discuss how Amazon Redshift extends beyond traditional data warehousing workloads, but integrating with the broad AWS ecosystem making Amazon Redshift a one-stop solution for analytics.