As the industry pivots toward an “AI Native” paradigm, the bottleneck for innovation has shifted from algorithmic design to the underlying infrastructure’s ability to handle unprecedented scale and complexity. This session explores how Google TPU (Tensor Processing Unit) infrastructure serves as the catalyst for this transformation, specifically within the domains of large-scale Recommender Systems, MoEs, LLMs, and the emerging era of Autonomous Agents. We will delve into the architectural innovations of the latest TPU generations, demonstrating how their purpose-built design facilitates the massive throughput required for real-time recommendation engines and the high-speed inference necessary for agentic orchestration.