EVENT AGENDA
Event times below are displayed in PT.
INTRODUCTION: Head of Engineering and Infrastructure, Facebook - Jay Parikh
DNA DATA STORAGE: Borrowing from nature to build better computers - Luis Ceze
Sergey Doroshenko and Adriana Libório share lessons learned from transforming Facebook's continuous integration system, Sandcastle, into a universal platform.
This session will focus on how the Google Translate team makes its Android app work better for users in emerging markets.
Azure Data Lake Store (ADLS) is a fully managed, elastic, scalable, and secure file system that supports the semantics of the Hadoop distributed file system (HDFS) and the Microsoft Cosmos file system. It is specifically designed and optimized for a broad spectrum of big data analytics that depend on an extremely high degree of parallel reads and writes, as well as colocation of compute and data for high-bandwidth and low-latency access. It brings together key components and features of Cosmos — long used internally at Microsoft as the warehouse for data and analytics — and HDFS, and it serves as a unified file storage solution for analytics on Azure, running both internal and external workloads. Distinguishing aspects of ADLS include its support for multiple storage tiers, exabyte scale, and comprehensive security and data sharing. Raghu Ramakrishnan will cover ADLS architecture, design points, the Cosmos experience, and performance.
Currently, the predominant approach in AI is to use unlimited data to solve narrowly defined problems. To progress toward humanlike intelligence, AI benchmarks will need to be extended to focus more on data efficiency, flexibility of reasoning, and transfer of knowledge between tasks. This talk will detail the challenges and successes in making these ideas operational. At Vicarious, the language of probabilistic graphical models is used as the representational framework. Compared with neural networks, graphical models have several advantages, such as the ability to incorporate prior knowledge, answer arbitrary probabilistic queries, and deal with uncertainty. However, a downside is that inference can be intractable. By incorporating several insights that originally were discovered in neuroscience, engineers at Vicarious were able to create probabilistic models, on which accurate inference can be performed using message-passing algorithms that are similar to the computations in a neural network.
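For a flavor of the inference machinery involved, here is a minimal sketch (not Vicarious's model) of sum-product message passing on a chain-structured graphical model; the binary variables and potentials are illustrative placeholders:

```python
import numpy as np

# Exact marginals on a chain-structured MRF via sum-product message passing.
# The unary and pairwise potentials below are made-up examples.
unary = np.array([[0.7, 0.3],   # phi_1(x1)
                  [0.4, 0.6],   # phi_2(x2)
                  [0.5, 0.5]])  # phi_3(x3)
pairwise = np.array([[0.9, 0.1],
                     [0.1, 0.9]])  # psi(x_i, x_{i+1}): favors agreement

n = unary.shape[0]

# Forward messages: m_fwd[i] summarizes evidence from variables 1..i-1.
m_fwd = [np.ones(2) for _ in range(n)]
for i in range(1, n):
    m_fwd[i] = pairwise.T @ (unary[i - 1] * m_fwd[i - 1])

# Backward messages: m_bwd[i] summarizes evidence from variables i+1..n.
m_bwd = [np.ones(2) for _ in range(n)]
for i in range(n - 2, -1, -1):
    m_bwd[i] = pairwise @ (unary[i + 1] * m_bwd[i + 1])

# Marginal of each variable: product of local potential and incoming messages.
for i in range(n):
    belief = unary[i] * m_fwd[i] * m_bwd[i]
    print(f"P(x{i + 1}) =", belief / belief.sum())
```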
What if you had to build more machine-learned models than there are data scientists in the world? Well, at enterprise companies serving hundreds of thousands of businesses, this is precisely the case. In this talk, Shubha Nabar will walk through the scale challenges of building AI in the enterprise. She also will describe the general-purpose machine learning platform at Salesforce that automatically builds personalized models optimized for every business and every use case.
Facebook's Mark Marchukov will talk about Facebook's work to build a durable and highly available sequential distributed storage system that can handle hardware failures and sustain consistent delivery at exceptionally high ingest rates.
This panel features leaders who have spent decades building successful teams that tackle large-scale technical challenges. Come hear how they’ve worked across disciplines, across oceans, and across their companies to move technology forward — while solving for the inherent difficulties of scaling technical management and managing technical teams at scale. Even as technology plays a greater role in society, this panel will highlight how people remain at the center of the code, infrastructure, and products that reach billions of people.
Product teams usually grow to deal with the increased workload caused by their own success. While growing, they risk losing the qualities that initially made them successful. Microsoft's VS Code team is small, yet it builds a wildly successful open source, cross-platform code editor. In this talk, Kai Maetzel will explain what the team has learned from developing open source projects and working on desktop, SaaS, and mobile applications, and how these findings help the team stay small and nimble, manage explosive growth, and make millions of developers happy.
This talk will explore the future directions for 360 media across photos, video, and AR/VR. Matt Uyttendaele will dive deep on his latest work, applying machine learning models to 360 photos for an enhanced user experience.
Fibers get cut, databases crash, and you've adopted chaos engineering to challenge your production environment as much as possible. But what are you doing to craft the resiliency test suites that minimize the impact of failure on your application as much as possible? How do you debug resiliency problems locally and make sure single points of failure don't creep into the application in the first place? Shopify developed the open source Toxiproxy in 2015 to emulate timeouts, latency, and outages in its development environments. This talk will equip you with tools to start writing resiliency test suites that harden your own applications and supplement other chaos engineering practices.
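As a taste of what such a test setup can look like, here is a hedged Python sketch that drives Toxiproxy's HTTP API (default port 8474) with the requests library; the proxy name, ports, and toxic parameters are illustrative, not taken from the talk:

```python
import requests

API = "http://localhost:8474"  # Toxiproxy's control API

# Route test traffic for Redis through a Toxiproxy proxy.
requests.post(f"{API}/proxies", json={
    "name": "test_redis",
    "listen": "127.0.0.1:21212",   # tests connect here instead of Redis
    "upstream": "127.0.0.1:6379",  # the real Redis
}).raise_for_status()

# Inject 1s of downstream latency, then assert the app degrades gracefully.
requests.post(f"{API}/proxies/test_redis/toxics", json={
    "name": "slow_redis",
    "type": "latency",
    "stream": "downstream",
    "attributes": {"latency": 1000, "jitter": 100},
}).raise_for_status()

# ... exercise the application against 127.0.0.1:21212 and make assertions ...

# Remove the toxic so later tests see a healthy dependency again.
requests.delete(f"{API}/proxies/test_redis/toxics/slow_redis").raise_for_status()
```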
Apache Beam is a unified programming model capable of expressing a wide variety of traditional batch and complex streaming use cases. By neatly separating properties of the data from runtime characteristics, Beam enables users to easily tune requirements around completeness and latency and run the same pipeline across multiple runtime environments. In addition, Beam's model enables cutting-edge optimizations such as dynamic work rebalancing and autoscaling, giving those runtimes the ability to be highly efficient. This talk will cover the basics of Apache Beam, touch on its evolution, and describe the main concepts in its powerful programming model. It will include detailed, concrete examples of how Beam unifies batch and streaming use cases, and show efficient execution in real-world scenarios.
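As a concrete illustration of the model, here is a minimal Beam Python sketch of a windowed count over a bounded source; swapping the input transform for an unbounded source (e.g., Pub/Sub) leaves the rest of the pipeline unchanged. The events and timestamps are made up:

```python
import apache_beam as beam
from apache_beam.transforms import window

with beam.Pipeline() as p:
    (
        p
        | "Read" >> beam.Create([("click", 1), ("view", 4), ("click", 65)])
        # Attach event-time timestamps so windowing reflects when events happened.
        | "Stamp" >> beam.Map(lambda kv: window.TimestampedValue(kv[0], kv[1]))
        # Fixed one-minute event-time windows.
        | "Window" >> beam.WindowInto(window.FixedWindows(60))
        # Count occurrences of each element within each window.
        | "Count" >> beam.combiners.Count.PerElement()
        | "Print" >> beam.Map(print)
    )
```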
Netflix engineers were spending too much time working with infrastructure and not enough time on their media algorithms, so they created Archer, a high-scale distributed computing platform for media processing. It uses Docker, which allows developers to write their code in any language with any OS packages, test it on a laptop, and run it with millions of compute hours. This talk will discuss the Archer platform architecture as well as its implementation and applications, including feature extraction, encode experimentation, and machine learning.
Last year Facebook started rolling out the ability for public figures to go live with a guest. Now Live With is available for all profiles and Pages on iOS, letting you invite a friend into your live video so you can hang out together, or broadcast the conversation to an audience. To make this possible, Facebook's engineers worked to bring real-time interactive communication to broadcast-quality streams. In this talk, Nick Ruff will discuss how they bridged the trade-offs between video streaming technologies to enable real-time multiparty broadcasting.
In the last year, GPUs plus deep learning have gone from a hot topic to large-scale production deployment in major data centers. That's because deep learning works, and the evolution of GPUs has made them a great fit for both training and inference. Neural nets, frameworks, and GPU architectures have also changed significantly in the last year, allowing better solutions to be created more quickly and in more places, moving deep learning from niche applications to the mainstream and enabling its use in real time for more industrial automation and human interaction roles. This talk will cover GPU architecture and framework evolution; scaling training out and up; real-time inference improvements; security, VM isolation, and management; and overall deep learning flow improvements that make development and deployment more DevOps-friendly.
Magda Balazinska talks about PerfEnforce, a system that enables performance-oriented service-level agreements (SLAs) for data analytics. Given a set of tenants with query-level performance SLAs, she addresses how to dynamically assign compute resources to queries from each tenant (query scheduling). She also discusses how to dynamically resize the multi-tenant service to minimize the combined cost of compute resources and SLA violation penalties (resource provisioning).
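To make the provisioning trade-off concrete, here is a toy sketch (not PerfEnforce's actual algorithm) that picks the cluster size minimizing node cost plus SLA penalties; the prices, penalties, and latency model are hypothetical:

```python
NODE_COST_PER_HOUR = 0.50  # $ per node-hour (hypothetical)
SLA_PENALTY = 2.00         # $ refunded per query that misses its SLA (hypothetical)

def predicted_latency(query_work, nodes):
    # Hypothetical model: work divides evenly across nodes.
    return query_work / nodes

def hourly_cost(nodes, queries):
    # queries: list of (estimated_work, sla_seconds) pairs for the next hour.
    misses = sum(1 for work, sla in queries
                 if predicted_latency(work, nodes) > sla)
    return nodes * NODE_COST_PER_HOUR + misses * SLA_PENALTY

queries = [(120, 10), (300, 30), (45, 5)]
best = min(range(4, 65), key=lambda n: hourly_cost(n, queries))
print("cheapest cluster size:", best, "nodes")
```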
Google's codebase includes over 2 billion lines of code, spanning thousands of projects. This talk looks at how Google keeps such a large codebase nimble and evolving despite its size and scale.
Unity builds the tools that enable game developers, artists, designers, and videographers to tell better visual stories. Games made with Unity have reached over 3 billion devices and were installed over 16 billion times. Unity also has powerful tools for design, video animation, special effects, and video rendering. Adam Myhill, who heads cinematics at Unity, will discuss how Unity is democratizing and scaling tools for video that enable the real-time filming and editing of 3D content, watching VR without hardware, and more.
Qualcomm is an at-scale company. It powered the smartphone revolution and connected billions of people. It pioneered 3G and 4G, and now it is leading the way to 5G and a new era of intelligent, connected devices. Mobile is going to be the largest machine learning platform on the planet. Come learn how Qualcomm is making efficient on-device machine learning possible, how Qualcomm and Facebook worked closely to support machine learning in Facebook applications, and what's next for Qualcomm and AI.
DynamoDB is a fully managed NoSQL database service that provides high throughput at low latency with seamless scalability. The service is the backbone for many Internet applications, handling trillions of requests daily. The scale of data that applications have to manage continues to grow rapidly, making it a challenge to manage systems and respond to events in real time. This talk will be a deep dive into the challenges of building the DynamoDB Streams feature, which provides a time-ordered sequence of item changes on DynamoDB tables, and leveraging it with AWS Lambda to reimagine large-scale applications for the cloud.
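For illustration, here is a minimal sketch of a Lambda handler consuming a DynamoDB Streams event; the table attributes are hypothetical, while the event shape (Records, eventName, dynamodb.NewImage) follows AWS's documented stream record format:

```python
def handler(event, context):
    for record in event["Records"]:
        action = record["eventName"]  # INSERT, MODIFY, or REMOVE
        if action == "REMOVE":
            continue  # removed items carry no NewImage
        # Stream images use DynamoDB's typed attribute format, e.g. {"S": "..."}.
        new_image = record["dynamodb"]["NewImage"]
        user_id = new_image["UserId"]["S"]   # hypothetical attribute
        score = int(new_image["Score"]["N"]) # hypothetical attribute
        print(f"{action}: user={user_id} score={score}")
```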
Deep learning is making a great impact across products at Google and in the world at large. As Google pushes the limits of AI and deep learning, research is underway in many areas. With integration into many Google products, this research is improving the lives of billions of people. Open source tools like TensorFlow and open publications put the latest deep learning research at the fingertips of engineers around the world. This talk begins by exploring what has enabled this field to evolve rapidly over the last few years. It also will cover some of the leading research advances and current trends that point to a promising future, and the algorithms that make it possible.
The ability to prefetch data while your app is in the background can decouple the usability of your app from network availability. Moreover, it can minimize cellular data usage and significantly increase perceived speed. This talk walks through the main technical and performance challenges of the implementation. How do you schedule data prefetching in the background? What framework is most appropriate for executing this type of work? What should you be prefetching, and when? Find out here.
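As a toy illustration of the "what and when" questions above, here is a hedged Python sketch of a prefetch policy; the signals, thresholds, and ranking are hypothetical, not Facebook's actual implementation:

```python
def should_prefetch(on_wifi, battery_pct, bytes_used_today, daily_budget):
    # Prefer unmetered networks, protect the battery, and cap data usage.
    return on_wifi and battery_pct > 0.2 and bytes_used_today < daily_budget

def pick_stories(candidates, max_items=10):
    # Prefetch the stories most likely to be read first.
    ranked = sorted(candidates, key=lambda s: s["p_read"], reverse=True)
    return ranked[:max_items]

if should_prefetch(on_wifi=True, battery_pct=0.8,
                   bytes_used_today=3_000_000, daily_budget=10_000_000):
    for story in pick_stories([{"id": 1, "p_read": 0.9},
                               {"id": 2, "p_read": 0.4}]):
        print("prefetch story", story["id"])
```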
Monolith, microservices, or both? Find out how Okta has developed the best of both worlds to solve for the challenge of scaling to handle dynamic traffic volumes. In this session, Kelvin Zhu will walk through how the Okta team manages a monolithic codebase by way of virtual splitting, allowing dialable CI loop speed. He'll also speak to how Okta has rethought testing at scale — ensuring that tests run at speed while also monitoring the quality of tests to help gain visibility into problem tests and get past them.
Facebook can move fast and iterate because of its ability to make data-driven decisions. Data from its stream processing systems provides real-time analytics and insights; the systems are also integrated into various Facebook products, which have to aggregate data from many sources. In this talk, Rajesh Nishtala covers the difficulties of stream processing at scale, the solutions Facebook has created to date, and three case studies on improving time-to-insight with stream processing. The case studies include examples from search product development, accelerating daily pipelines in the data warehouse, and seamless integration with machine learning platforms. Each case study shows how Facebook can deliver value to more teams while continuing to abstract away the details of stream processing. Rajesh concludes by speaking to the future of stream processing.
This updated talk will give the latest details on how Facebook's Release Engineering team ships facebook.com multiple times per day.
BigQuery is best known for being a large-scale query engine, but one of its most important components is a structured storage system. Over the last several years, Google has found that active data management is crucial for providing no-ops scalable storage. This talk goes into the details of how BigQuery managed storage works, why it's hard to get right, and how it helps ensure that queries are always fast.
At Facebook, the mission is to give people the power to build community and bring the world closer together. In this talk, Necip Fazil Ayan will present the most recent work on using deep learning for machine translation and language understanding to unlock meaning across languages to help that mission. He will talk about the challenges of doing machine translation and language understanding at large scale, and will discuss technologies and platforms that have been built to tackle these challenges.
This talk will explore applications of ReDex to improve performance on emerging-market phones.