Performance @Scale 2020: Scaling machine learning on graphs

Networks (graphs) of people’s social and content interactions are a rich source of data for machine learning algorithms. Traditional machine learning algorithms do not naturally take graph-structured data as input, so unsupervised methods such as graph embeddings are used to turn graph data into features that can be used for machine learning tasks. However, modern interaction graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, a scale that exceeds the capacity of typical embedding systems. In this talk, I will describe the techniques that PyTorch-BigGraph (PBG) uses to scale graph embedding methods to graphs of this size. I will also discuss new work on applying the PBG philosophy to achieve further scaling on GPUs, and how we are combining graph embeddings with graph neural network models on these extremely large graphs.
