Lessons from Building a Large-scale, Multi-cloud Data Platform at Databricks

The cloud is becoming one of the most attractive ways for enterprises to store, analyze, and get value from their data, but building and operating a data platform in the cloud poses a number of new challenges compared with traditional on-premises data systems. I will explain some of these challenges based on my experience at Databricks, a startup that provides a data analytics platform as a service on AWS, Azure, and Google Cloud. Databricks manages millions of VMs per day to run data engineering and machine learning workloads for thousands of customers, using Apache Spark, TensorFlow, Python, and other software.
