EVENT AGENDA
Event times below are displayed in PT.
Presentation information coming soon!
This talk discusses the diversity, volume, and freshness of data required for GenAI, as well as the need to extract and prepare data differently based on its type, including interleaved data and multi-step trajectories for learning agentic behaviors. The talk also presents some of the investments we have made to improve researcher productivity.
Large-scale training requires substantial investment across the infrastructure stack. In this talk, we delve into some of the data center, network, and software investments that enabled the development of our Llama 3 models.
Presentation information coming soon!
In recent years, we've entered an AI summer, characterized by soaring investments, insatiable demand for compute power, and widespread enthusiasm for AI-driven technologies such as ChatGPT, GitHub Copilot, and MidJourney. As we stand on the brink of the next wave of AI advancements—featuring AI agents, co-pilots, and AI-powered process automation—the success of these advances hinges on developing safe, efficient, and highly capable AI components. In this talk, we will explore the next wave of AI and how open innovation in models, datasets, libraries, and research serves as a critical cornerstone for this progress. By leveraging open innovation, we can provide the foundation necessary to achieve these ambitious goals and propel the next wave of AI forward.
In this talk, we will go through the PyTorch advancements for Large Language Models (LLMs), developments that enhance every aspect of the LLM lifecycle. This includes our newest features and tools to enable large-scale training, memory-efficient fine-tuning, and on-device LLM capabilities.
In this talk, we will discuss fine-tuning and deploying LLMs for local inference. First, we will discuss the importance of memory-efficient fine-tuning and a couple of common architectural and algorithmic techniques that enable fine-tuning on consumer-grade hardware. The second half of the talk will cover the challenges in deploying such large models on-device and some of the techniques, such as quantization, that make deployment possible.
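To make the quantization idea mentioned above concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, the general kind of technique used to shrink model weights for on-device deployment. The function names and the pure-Python formulation are illustrative only and do not represent any specific library's API or the speakers' implementation.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a single per-tensor scale (symmetric scheme)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Clamp to the int8 range after rounding to the nearest step.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)
```

Storing each weight as one byte plus a shared scale cuts memory roughly 4x versus float32, at the cost of a small reconstruction error bounded by half the quantization step.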
MTIA is Meta's in-house ML accelerator program, and its second-generation chip is now serving in data centers. This talk describes the co-design process behind building custom silicon, the PyTorch software ecosystem, and model architectures for Meta's key applications.
We show how MTIA achieves the performance, efficiency, and developer experience to successfully launch models into production. We highlight several co-design examples where we utilize special silicon features to accelerate our models. Finally, we describe future directions for MTIA.
This talk introduces the next-generation MTIA accelerator, now landed in silicon. We cover Meta-specific optimizations that accelerate Meta workloads, the performance gains over software and GPU solutions, and the future silicon roadmap.
Presentation information coming soon!
Details coming soon!