MULTI-TENANCY FOR AI INFERENCE @ META SCALE

With the increasingly diverse landscape of AI workloads, building efficient and reliable infrastructure is challenging, especially as powerful and expensive AI accelerators emerge. We identify multi-tenancy as a key strategy for addressing this challenge. By understanding the characteristics of AI workloads and the hardware that supports them, we can optimize workload colocation to achieve significant infrastructure cost savings.
