September 02, 2016

Unifying big data workloads in Apache Spark

Topic: Systems and Networking

Jay Ayres

TripAdvisor

TYPE: Videos

YEAR: 2016

In contrast to previous big data systems, Apache Spark was designed to offer a unified engine across diverse workloads, such as SQL, streaming, and batch analytics. While this approach may seem counterintuitive, it has some key benefits. Most importantly, applications can combine workloads in ways that are not possible with specialized engines, and users benefit from a uniform management environment. The talk covers how a unified engine enabled new types of Spark applications (such as interactive queries over streams) and how Databricks designed Spark’s APIs to enable efficient composition. It also sketches the newest unified API in Spark, Structured Streaming, which lets the engine run batch SQL or DataFrame computations incrementally over a stream of data.
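The idea of running a batch computation incrementally over a stream can be illustrated with a small, self-contained sketch. This is plain Python, not Spark code and not Spark's API: the `Event` record shape, `batch_totals`, and `IncrementalTotals` are hypothetical names chosen for illustration. It only shows the general principle that the same aggregation can be applied to each arriving micro-batch and merged into running state, giving the same answer as a single pass over the full data.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical record shape for this sketch: (key, value)
Event = Tuple[str, int]


def batch_totals(events: Iterable[Event]) -> Counter:
    """Batch query: sum the values per key over a complete input."""
    totals: Counter = Counter()
    for key, value in events:
        totals[key] += value
    return totals


class IncrementalTotals:
    """Same query run incrementally: keep state, fold in each micro-batch."""

    def __init__(self) -> None:
        self.state: Counter = Counter()

    def update(self, micro_batch: Iterable[Event]) -> Counter:
        # Reuse the batch logic on the new data only, then merge the
        # partial result into the running state (Counter.update adds counts).
        self.state.update(batch_totals(micro_batch))
        return Counter(self.state)  # return a copy of the current result


if __name__ == "__main__":
    events = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

    incremental = IncrementalTotals()
    incremental.update(events[:2])            # first micro-batch arrives
    latest = incremental.update(events[2:])   # second micro-batch arrives

    # The incremental result matches the one-shot batch result.
    print(latest == batch_totals(events))
```

In this toy version the query logic is written once (`batch_totals`) and reused for both modes. That mirrors, in a very simplified way, the appeal of a unified API described in the abstract; a real engine would also have to handle concerns such as fault tolerance and late data, which this sketch ignores.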
