@Scale 2018: Distributed AI with Ray
Over the past decade, the bulk synchronous processing (BSP) model proved highly effective for processing large amounts of data. Today, however, we are witnessing the emergence of a new class of applications — AI workloads. These applications exhibit new requirements, such as nested parallelism and highly heterogeneous computations.
In this talk, Ion Stoica, Professor at UC Berkeley, discusses how his team developed Ray, a distributed system that provides both task-parallel and actor abstractions. Ray is highly scalable, employing an in-memory storage system and a distributed scheduler. Ion discusses some design decisions as well as early experiences using Ray to implement a variety of applications.