TOPIC: Data, Systems and Networking

Networking @Scale Fall 2022

NOVEMBER 14, 2022 @ 10:00 AM - NOVEMBER 15, 2022 @ 12:30 PM PT
Designed for engineers that build and manage large-scale networks. Networking solutions are critical for building applications and services that serve billions of people around the world. Building and operating such large-scale networks often present complex engineering challenges to solve.
RSVPS CLOSED
AGENDA SPEAKERS

ABOUT EVENT

Networking @Scale is a technical conference for engineers that build and manage large-scale networks.

Networking solutions are critical for building applications and services that serve billions of people around the world. Building and operating such large-scale networks often present complex engineering challenges to solve. The Networking @Scale community focuses on bringing people together to discuss these challenges and collaborate on the development of new solutions.

Joining us are speakers from Alibaba, Apple, AWS, Google, Meta and Microsoft. The Fall Networking @Scale will cover all aspects of networking innovation (L2-L7) such as load balancing and prioritization of traffic, monitoring and diagnosis of problems in the network with new instrumentation, testing strategies for new devices and protocols, achieving design, deployment and operational efficiencies at scale for optical networks and much more. We are asking speakers to share their experience in Improving the Network performance, and experience across these areas for large-scale networks.

This event will take place virtually and span two days, featuring pre-recorded presentations and LIVE Q&A sessions.

EVENT AGENDA

Event times below are displayed in PT.

November 14

November 15

10:00 AM - 10:05 AM
Rajiv Krishnamurthy will give Opening Remarks
SPEAKER Rajiv Kirshnamurthy,Meta
10:00 AM - 12:35 PM
Tanuja Ingale will introduce each presentation
SPEAKER Tanuja Ingale,Meta
10:05 AM - 10:35 AM
Alibaba Predictable Network

This presentation will introduce Alibaba Predictable Network. With the emerging AI/ML new workload, data center network is evolving rapidly. A large-scale high-performance network becomes crucial to support a large-scale AI cluster. During the large-scale deployment of the RDMA network in the past 5 years, Alibaba cloud came out with the new Host-Network Fusion architecture, which can provide predictable network performance at scale.

SPEAKER Dennis Cai,Alibaba
10:35 AM - 10:55 AM
Developing, Deploying, Operating In-House Network Switches at a Massive Scale

FBOSS is Meta’s software stack for controlling/managing network switches deployed in Meta’s datacenters. FBOSS is one of the largest services in Meta (in terms of the number of instances deployed) and powers Meta’s Network. The talk presents our experience designing, developing and operating FBOSS: an in-house software built to manage and support a set of features required for data center switches of a large scale Internet content provider. We present the key ideas underpinning the FBOSS model viz.: switch-as-a-server and deploy-early-and-iterate. These ideas have enabled us to build a stable and scalable network. We demonstrate these ideas with concrete examples: Switch Abstraction Interface (SAI) defines a vendor-independent API for programming the forwarding ASIC. We developed a new SAI based implementation for FBOSS. The new FBOSS implementation was deployed at a massive scale to a brownfield deployment, and was also leveraged to onboard a new switch vendor into the Meta infrastructure. We present this effort as a case study to further illustrate the FBOSS design, development and operations model. In particular, the case study shares our experience and lessons learned while: -Developing the new FBOSS implementation to be functionally equivalent to the existing implementation for all production use cases, -Methodology for Third party vendor ASIC/SDK integration and functionality validation, -FBOSS push infrastructure and qualification pipeline, Maintaining and upgrading a mix of old and new implementation during gradual and phased brownfield deployment, -Tooling enhancements to detect/mitigate/triage/repair issues discovered while rolling out the new implementation to the brownfield deployment as well as onboarding new switch vendor to production, -Rolling out patches to mitigate/fix production issues without causing traffic disruption.

SPEAKER Shrikrishna Khare,Meta
SPEAKER Srikrishna Gopu,Meta
10:55 AM - 11:15 AM
Transforming Google's Datacenter Network Through Optical Circuit Switching and Software-Defined Networking

We present a decade of evolution and production experience with Google’s Jupiter datacenter network. In this period, Jupiter has delivered 5x higher speed and capacity, 30% reduction in capex, 41% reduction in power, incremental deployment and technology refresh all while serving live production traffic. A key enabler for these improvements is evolving Jupiter from a Clos to a direct-connect topology among the machine aggregation blocks. Critical architectural changes for this include: A datacenter interconnection layer employing Micro-Electro-Mechanical Systems (MEMS) based Optical Circuit Switches (OCSes) to enable dynamic topology reconfiguration, centralized Software-Defined Networking (SDN) control for traffic engineering, and automated network operations for incremental capacity delivery and topology engineering.

SPEAKER Rui Wang,Google
11:15 AM - 11:35 AM
Live Q&A

Featuring Rui Miao - Alibaba, Shrikrishna Khare, Srikrishna Gopu, Rui Wang Moderated by Luca Niccolini

SPEAKER Shrikrishna Khare,Meta
SPEAKER Srikrishna Gopu,Meta
SPEAKER Rui Wang,Google
SPEAKER Luca Niccolini,Meta
11:35 AM - 12:00 PM
Wiring the Planet: Scaling Meta's Global Optical Network

The first half of the talk will highlight the expansive global fiber network that is being built and managed by BBE. We will first highlight the submarine fiber optic systems that we are building to connect the globe. We are working on over 25 submarine fiber network builds but will highlight just a few. Bifrost and Echo will be shown which are the first to directly connect the US and Singapore and will support SGA, our first APAC data center. The vast 2Africa project will be discussed, which is both the world’s largest submarine cable network and has the potential to connect the largest number of people, 3 Billion. We have established a series of firsts in the submarine fiber world and are driving new innovations which are transforming the industry. The connection of our submarine networks to our terrestrial backbone will be shown and how this network interconnects our data centers. Next, we describe how we design and build the hierarchies of the optical transport layer built on top of those fiber paths. We have to model and inter-operate multiple hardware types, from multiple vendors, across multiple technology generations, to achieve maximum utilization of our fiber. The scale of network growth means that we continuously deploy the latest generation of high-performance transponder systems onto our network, but each additional generation imposes an operational (support) tax which can only be mitigated by focusing on automation for both configuration and maintenance (e.g. software upgrades). Finally, we will discuss our approach to provisioning capacity on this network, our solutions for distributed provisioning and monitoring of this global fleet of hardware, and our approach to diagnosing and remediating network failures. This, as well as the design challenge described prior, requires us to maintain a suite of in-house software systems, without which operating at our scale would not be possible.

SPEAKER Stephen Grubb,Meta
SPEAKER Joseph Kakande,Meta
12:00 PM - 12:15 PM
OneWAN Traffic Engineering: Unifying Inter-Datacenter and Internet Backbones

Like other large cloud providers, Microsoft operates two planet-scale wide-area networks: AS8075 WAN for Internet traffic and AS8074 software-defined WAN to carry inter-datacenter traffic. Maintaining two networks has operational challenges and so we consolidated the split-WAN architecture into a unified OneWAN using SDN principles (used in Swan). OneWAN grappled with the scaling challenges posed by network and routing table sizes 10 times larger than Swan. In addition, Internet Traffic Engineering (ITE) is another SDN system that enables performance-aware egress peer selection and inbound traffic engineering between autonomous systems. Together, OneWAN and ITE controllers measure peering traffic and adapt the backbone and the peering edge to dynamic demand spikes and capacity reductions.

SPEAKER Umesh Krishnaswamy,Microsoft
12:15 PM - 12:35 PM
LIVE Q&A

Featuring Stephen Grubb, Joseph Kakande, Somesh Chaturmohta Moderated by Andrey Golovanov, Meta

SPEAKER Stephen Grubb,Meta
SPEAKER Joseph Kakande,Meta
SPEAKER Umesh Krishnaswamy,Microsoft
SPEAKER Somesh Chaturmohta,Microsoft
SPEAKER Andrey Golovanov,Meta
10:00 AM - 10:20 AM
Millisampler: Fine-Grained Network Traffic Analysis

Managing data center networks with low loss requires understanding traffic patterns, especially burstiness of the traffic, at fine time granularity. Yet, monitoring traffic with millisecond granularity fleet wide is challenging. To gain more visibility into our production network, we design millisampler, a BPF based lightweight traffic measurement tool deployed in every server in the entire fleet for continual monitoring that operates at high granularity timescale. Millisampler data allows us to characterize microbursts at millisecond or even microsecond granularity, and simultaneous data collection enables analysis of how synchronized bursts interact in rack buffers. We’ll discuss the design, implementation, and production experience with millisampler, and share some interesting observations we got with the millisampler data.

SPEAKER Yimeng Zhao,Meta
10:20 AM - 10:45 AM
Network SLOs: Knowing When The Network Is The Barrier To Application Quality Of Experience

At Meta, we need to be able to readily determine if network conditions are responsible for instances of poor quality of experience (QoE) such as images loading slowly or video stalling during playback. In response, we’ve developed Network SLOs, which can be thought of as a product’s "minimum network requirements’ for good QoE. If the network between Meta and a user does not meet the product’s SLO requirements, QoE will be degraded. In this talk, we describe our work over the past three years on deriving and operationalizing Network SLOs for Meta’s user facing products. First, we discuss how we measure quality of experience for a handful of products and our approach to quantifying the relevant network conditions. We then discuss how we use a combination of statistical tools to derive Network SLOs, and how we process trillions of measurements each day to evaluate Network SLO compliance. We present case-studies of how Network SLOs have been used to triage regressions in QoE, identify gaps in Meta’s edge network capacity, and surface inefficiencies in how product utilizes the network.

SPEAKER Brandon Schlinker,Meta
SPEAKER Sharad Jaiswal,Meta
10:45 AM - 11:10 AM
Responsiveness Under Working Conditions

Since the early days of the Internet, capacity has been the prime metric to quantify the quality of the Internet access. While capacity was the primary challenge back in those days, we have successfully reached a point where the vast majority of users can easily access sufficient capacity for the majority of the use-cases. But still, Internet experience is often lacking the smoothness that we would expect. Video-conferencing still has frequent issues, video gaming is rarely a smooth experience and web-browsing still suffers from bad page-load times. We present a new metric, called ""Responsiveness under working conditions"", which significantly broadens the scope far beyond traditional capacity and latency measurements. This metric aims at quantifying the network's ability to provide low latency while at the same time providing high capacity. The measurement methodology not only measures the network, but also the end-host networking stack. We will describe how measuring responsiveness will allow to detect deep buffers in the server's networking stack and how it affects the end-user experience. Further, we describe steps that can be taken to reduce those buffers. We will conclude this talk by providing resources and open-source tools to allow everyone to reproduce the same measurement on their infrastructure and tune their networking stack for the benefit of their end-users.

SPEAKER Christoph Paasch,Apple
11:10 AM - 11:30 AM
LIVE Q&A

Featuring Yimeng Zhao, Brandon Schlinker, Sharad Jaiswal. Moderated by Neil Spring

SPEAKER Yimeng Zhao,Meta
SPEAKER Brandon Schlinker,Meta
SPEAKER Sharad Jaiswal,Meta
SPEAKER Neil Spring,Meta
11:30 AM - 11:50 AM
Improving L4 Routing Consistency At Meta

We work on a layer 4 load balancer called Shiv. Shiv routes packets to backends using a consistent hash of the 5-tuple of the packet (namely, the source ip, destination ip, source port, destination port, and protocol). Shiv’s objective is to route packets for a connection (which all have the same 5-tuple) to the same backend for the duration of the connection. If it is unable to do so, this leads to broken connections and user impact (for example, stalled videos). While consistent hashing is quite resilient to changes, when a large number of backends are added or removed, remappings occur, resulting in broken connections. To protect from such changes, Shiv maintains a cache that contains a mapping from 5-tuple to backend. The logic used by Shiv to route packets can be summarized as follows: If the 5-tuple of the packet is in its cache, route it to the backend indicated by the cache. Otherwise, calculate the hash function on the 5-tuple to obtain the destination backend, route the packet to that backend, and place the (5-tuple, backend) entry in the cache. Shiv works well under the following conditions: - In steady state, when the arrangement of Shivs and backends is the same. - When the arrangement of Shivs changes. In this case, packets for a connection may land on a different Shiv host than earlier packets, but both Shiv hosts use the same consistent hash function, and therefore, pick the same backend. - When the arrangement of backends changes. In this case, packets for a connection continue to land on the same Shiv host, which utilizes its cache to route the packet to the same backend as it used to. However, during changes to the arrangement of both Shivs and backends, a nontrivial number of misroutings occur, because the following sequence of events could happen: - Packets for a connection C arrive at a Shiv host A, which picks a backend X - A large topology change occurs on the Shivs and backends. - Packets for connection C now land at Shiv host B != A, which picks a backend Y != X because the hash ring has changed. We have implemented two solutions to this problem, that we will talk about: - Embedding “Server ID” hints into packets, that enable Shiv to route the packets to a specific server without having to perform a consistent hash. - Sharing the 5-tuple to backend cache among all Shivs in a cluster, thereby facilitating consistent decision making among them in the face of hash ring changes.

SPEAKER Aman Sharma,Meta
SPEAKER Andrii Vasylevskyi,Meta
11:50 AM - 12:10 PM
Highly-Available, Global Traffic Management At Scale

AWS Global Accelerator (AGA) leverages Amazon’s global network, anycast routing, and integrated load-balancing techniques to reduce latency and improve availability. In this session, we discuss the architecture, capabilities, and challenges of operating AGA at scale. We will provide insight into how AGA drives path optimization through the integration of customer-defined traffic control and real-time internet performance. We end the session describing future work in this space.

SPEAKER Alan Halachmi,AWS
SPEAKER Akshat Aranya,AWS
12:10 PM - 12:30 PM
Live Q&A

Featuring: Aman Sharma, Andrii Vasylevskyi, Alan Halachmi, Akshat Aranya Moderated by Bharet Parekh

SPEAKER Aman Sharma,Meta
SPEAKER Andrii Vasylevskyi,Meta
SPEAKER Alan Halachmi,AWS
SPEAKER Akshat Aranya,AWS
SPEAKER Bharat Parekh,Meta

SPEAKERS AND MODERATORS

Rajiv is a Software Engineering Director in the Network Infrastructure group at Meta. He supports the Host Networking Team which... read more

Rajiv Kirshnamurthy

Meta

Tanuja Ingale is a Technical Program Manager in the Production Network Infrastructure group at Meta. She supports Network Infrastructure programs... read more

Tanuja Ingale

Meta

Dennis Cai, Head of Network Infrastructure of Alibaba Cloud. His team is responsible for the entire Alibaba Cloud network infrastructure... read more

Dennis Cai

Alibaba

Shrikrishna Khare is a software engineer at Meta Platforms Inc. (formerly Facebook). He works on FBOSS - Meta Platform's switching... read more

Shrikrishna Khare

Meta

Srikrishna Gopu is a software engineer at Meta Platforms Inc. (formerly Facebook). He works on FBOSS - Meta Platform's switching... read more

Srikrishna Gopu

Meta

Rui Wang is a Senior Staff Software Engineer at Google. He has contributed to Google's data center networking infrastructure, with... read more

Rui Wang

Google

Luca is a Software Engineer working on network protocols, improving applications performance at scale. Most recently involved with the implementation... read more

Luca Niccolini

Meta

Stephen Grubb is a Global Optical Architect at Meta. He is part of the BBE optical team, and oversees the... read more

Stephen Grubb

Meta

Joseph Kakande is a Network Engineer at Meta. He is part of the BBE optical team, and works on automation... read more

Joseph Kakande

Meta

Umesh is a Partner Software Engineer in Microsoft Azure Networking. He works on software-defined wide area network traffic engineering and... read more

Umesh Krishnaswamy

Microsoft

Somesh Chaturmohta is a Principal Software Engineering Manager in the Microsoft Global Networking team, responsible for managing the Microsoft Edge... read more

Somesh Chaturmohta

Microsoft

Andrey Golovanov, Network Engineer at Meta. His team is responsible for the development and operations of the global SDN-based backbone... read more

Andrey Golovanov

Meta

Yimeng Zhao is a Research Scientist at Meta on the Network and Transport Analytics team. She is broadly interested in... read more

Yimeng Zhao

Meta

Brandon is a research scientist at Meta, where his work spans all things networking — from the datacenter to user... read more

Brandon Schlinker

Meta

Sharad Jaiswal is an Optimization Engineer in Meta’s Edge/Traffic Infrastructure organization. His work focuses on bridging the understanding when Meta’s... read more

Sharad Jaiswal

Meta

Christoph Paasch has been working on transport layer networking since 2010. Focusing on extensions to TCP, like Multipath TCP or... read more

Christoph Paasch

Apple

Neil is a Research Scientist at Meta, working on tools to measure and improve service performance on the data center... read more

Neil Spring

Meta

Aman has been working as a software engineer in Networking at Meta over the last 4 years, focusing mostly on... read more

Aman Sharma

Meta

Andrii has been working as a software engineer in networking for over a decade. In recent years he’s been focusing... read more

Andrii Vasylevskyi

Meta

Alan Halachmi is the Director of Solutions Architecture for Worldwide Public Sector at Amazon Web Services. Alan leads a global... read more

Alan Halachmi

AWS

Akshat Aranya is a senior software engineer working with AWS Global Accelerator team for the last three years. His area... read more

Akshat Aranya

AWS

Bharat is a Software Engineering Manager in the Traffic Infrastructure group at Meta. He supports the Traffic Protocols Team which... read more

Bharat Parekh

Meta

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy