Large global WANs have unique reliability and capacity delivery requirements. They typically connect to the Internet, which means they use distributed routing protocols. They are typically much sparser and more irregular than large cluster networks, and can have significantly poorer reachability depending on where in the world they are. Yet we depend on these networks to reach our customers. We need to build and maintain them at an extremely high level of reliability while growing their capacity at previously unseen speeds and at lower cost than ever before. These needs are often directly in conflict.
In this talk, Ashok will go over some of his experiences in building and automating Google’s network backbone. He will cover:
-- The perceived and real reliability differences between SDN and on-box routed networks.
-- The importance of network automation and programmatic network management to capacity delivery as well as reliability.
-- The risks introduced by these management paradigms, and how they can be mitigated.
-- The importance of defining and measuring network SLOs, and tracking network health and capacity availability over time against these SLOs.
-- Some of the hard problems in global WAN availability today, such as global routes, BGP and MPLS, and where we could go from here in the search for a truly 6-nines (99.9999% available) network.
Speaker
Ashok Narayanan, Google