@Scale 2019: Multinode: Natural language understanding at scale
This session offers an in-depth look at multinode training for complex NLU models such as BERT. Sharan describes the challenges of tuning for speed and accuracy at the scale needed to bring training times down from weeks to minutes. Drawing on real-world experience running models on as many as 1,500 GPUs with reduced-precision techniques, he explores the impact of different optimizers, strategies to reduce communication time, and improvements to per-GPU performance.
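To make the ingredients concrete, here is a minimal sketch of one way to combine the pieces the session covers: multi-GPU data parallelism, reduced-precision (FP16) compute with loss scaling, and gradient all-reduce overlapped with the backward pass to hide communication time. It is an illustrative example using PyTorch's DistributedDataParallel and automatic mixed precision, not the speaker's actual code; the stand-in linear model, batch shapes, and AdamW optimizer are assumptions (large-batch BERT runs often swap in a layer-wise adaptive optimizer such as LAMB).

```python
# Illustrative sketch (not the talk's actual code): one training loop per GPU,
# combining data parallelism with reduced-precision compute.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # One process per GPU; rank and world size are supplied by the launcher
    # (e.g. torchrun), which also sets the LOCAL_RANK environment variable.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Stand-in model; in practice this would be BERT or a similar NLU network.
    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # AdamW as a placeholder; very large batch sizes typically call for
    # layer-wise adaptive optimizers such as LAMB.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    scaler = torch.cuda.amp.GradScaler()  # loss scaling for FP16 training

    for step in range(10):
        inputs = torch.randn(32, 1024, device=local_rank)  # dummy batch
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():  # run the forward pass in FP16 where safe
            loss = model(inputs).pow(2).mean()
        # DDP all-reduces gradients across GPUs during backward, overlapping
        # communication with computation to reduce communication time.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with one process per GPU across nodes (for example via `torchrun --nnodes=N --nproc_per_node=8 train.py`), this pattern scales the effective batch size with the GPU count, which is exactly what makes optimizer choice and communication overlap the tuning levers the session discusses.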