EVENT AGENDA
Event times below are displayed in PT.
INTRODUCTION: Head of Engineering and Infrastructure, Facebook - Jay Parikh
DNA DATA STORAGE: Borrowing from nature to build better computers - Luis Ceze
Sergey Doroshenko and Adriana Libório share lessons learned from transforming Facebook's continuous integration system, Sandcastle, into a universal platform.
This session will focus on how the Google Translate team makes its Android app work better for users in emerging markets.
Azure Data Lake Store (ADLS) is a fully managed, elastic, scalable, and secure file system that supports the semantics of the Hadoop distributed file system (HDFS) and the Microsoft Cosmos file system. It is specifically designed and optimized for a broad spectrum of big data analytics that depend on an extremely high degree of parallel reads and writes, as well as colocation of compute and data for high-bandwidth and low-latency access. It brings together key components and features of Cosmos — long used internally at Microsoft as the warehouse for data and analytics — and HDFS, and it serves as a unified file storage solution for analytics on Azure, running both internal and external workloads. Distinguishing aspects of ADLS include its support for multiple storage tiers, exabyte scale, and comprehensive security and data sharing. Raghu Ramakrishnan will cover ADLS architecture, design points, the Cosmos experience, and performance.
Currently, the predominant approach in AI is to use unlimited data to solve narrowly defined problems. To progress toward humanlike intelligence, AI benchmarks will need to be extended to focus more on data efficiency, flexibility of reasoning, and transfer of knowledge between tasks. This talk will detail the challenges and successes in making these ideas operational. At Vicarious, the language of probabilistic graphical models is used as the representational framework. Compared with neural networks, graphical models have several advantages, such as the ability to incorporate prior knowledge, answer arbitrary probabilistic queries, and deal with uncertainty. However, a downside is that inference can be intractable. By incorporating several insights that originally were discovered in neuroscience, engineers at Vicarious were able to create probabilistic models, on which accurate inference can be performed using message-passing algorithms that are similar to the computations in a neural network.
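For a flavor of the inference machinery involved, here is a minimal sketch (not Vicarious's model) of sum-product message passing on a chain-structured graphical model; the binary variables and potentials are illustrative placeholders:

```python
import numpy as np

# Exact marginals on a chain-structured MRF via sum-product message passing.
# The unary and pairwise potentials below are made-up examples.
unary = np.array([[0.7, 0.3],   # phi_1(x1)
                  [0.4, 0.6],   # phi_2(x2)
                  [0.5, 0.5]])  # phi_3(x3)
pairwise = np.array([[0.9, 0.1],
                     [0.1, 0.9]])  # psi(x_i, x_{i+1}): favors agreement

n = unary.shape[0]

# Forward messages: m_fwd[i] summarizes evidence from variables 1..i-1.
m_fwd = [np.ones(2) for _ in range(n)]
for i in range(1, n):
    m_fwd[i] = pairwise.T @ (unary[i - 1] * m_fwd[i - 1])

# Backward messages: m_bwd[i] summarizes evidence from variables i+1..n.
m_bwd = [np.ones(2) for _ in range(n)]
for i in range(n - 2, -1, -1):
    m_bwd[i] = pairwise @ (unary[i + 1] * m_bwd[i + 1])

# Marginal of each variable: product of local potential and incoming messages.
for i in range(n):
    belief = unary[i] * m_fwd[i] * m_bwd[i]
    print(f"P(x{i + 1}) =", belief / belief.sum())
```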
What if you had to build more machine-learned models than there are data scientists in the world? Well, at enterprise companies serving hundreds of thousands of businesses, this is precisely the case. In this talk, Shubha Nabar will walk through the scale challenges of building AI in the enterprise. She also will describe the general-purpose machine learning platform at Salesforce that automatically builds personalized models optimized for every business and every use case.
Facebook's Mark Marchukov will talk about Facebook's work to build a durable and highly available sequential distributed storage system that can handle hardware failures and sustain consistent delivery at exceptionally high ingest rates.
This panel features leaders who have spent decades building successful teams that tackle large-scale technical challenges. Come hear how they’ve worked across disciplines, across oceans, and across their companies to move technology forward — while solving for the inherent difficulties of scaling technical management and managing technical teams at scale. Even as technology plays a greater role in society, this panel will highlight how people remain at the center of the code, infrastructure, and products that reach billions of people.
Product teams usually grow to deal with the increased workload caused by their own success. While growing, they risk losing the qualities that initially made them successful. Microsoft's VS Code team is small, yet it builds a wildly successful open source, cross-platform code editor. In this talk, Kai Maetzel will explain what the team has learned from developing open source projects and working on desktop, SaaS, and mobile applications, and how these findings help the team stay small and nimble, manage explosive growth, and make millions of developers happy.
This talk will explore the future directions for 360 media across photos, video, and AR/VR. Matt Uyttendaele will dive deep on his latest work, applying machine learning models to 360 photos for an enhanced user experience.
Fibers get cut, databases crash, and you've adopted chaos engineering to challenge your production environment as much as possible. But what are you doing to craft the resiliency test suites that minimize the impact of failure on your application as much as possible? How do you debug resiliency problems locally and make sure single points of failure don't creep into the application in the first place? Shopify developed the open source Toxiproxy in 2015 to emulate timeouts, latency, and outages in its development environments. This talk will equip you with tools to start writing resiliency test suites that harden your own applications and supplement other chaos engineering practices.
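As a taste of what such a test setup can look like, here is a hedged Python sketch that drives Toxiproxy's HTTP API (default port 8474) with the requests library; the proxy name, ports, and toxic parameters are illustrative, not taken from the talk:

```python
import requests

API = "http://localhost:8474"  # Toxiproxy's control API

# Route test traffic for Redis through a Toxiproxy proxy.
requests.post(f"{API}/proxies", json={
    "name": "test_redis",
    "listen": "127.0.0.1:21212",   # tests connect here instead of Redis
    "upstream": "127.0.0.1:6379",  # the real Redis
}).raise_for_status()

# Inject 1s of downstream latency, then assert the app degrades gracefully.
requests.post(f"{API}/proxies/test_redis/toxics", json={
    "name": "slow_redis",
    "type": "latency",
    "stream": "downstream",
    "attributes": {"latency": 1000, "jitter": 100},
}).raise_for_status()

# ... exercise the application against 127.0.0.1:21212 and make assertions ...

# Remove the toxic so later tests see a healthy dependency again.
requests.delete(f"{API}/proxies/test_redis/toxics/slow_redis").raise_for_status()
```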
Apache Beam is a unified programming model capable of expressing a wide variety of traditional batch and complex streaming use cases. By neatly separating properties of the data from runtime characteristics, Beam enables users to easily tune requirements around completeness and latency and run the same pipeline across multiple runtime environments. In addition, Beam's model enables cutting-edge optimizations such as dynamic work rebalancing and autoscaling, giving those runtimes the ability to be highly efficient. This talk will cover the basics of Apache Beam, touch on its evolution, and describe the main concepts in its powerful programming model. It will include detailed, concrete examples of how Beam unifies batch and streaming use cases, and show efficient execution in real-world scenarios.
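As a concrete illustration of the model, here is a minimal Beam Python sketch of a windowed count over a bounded source; swapping the input transform for an unbounded source (e.g., Pub/Sub) leaves the rest of the pipeline unchanged. The events and timestamps are made up:

```python
import apache_beam as beam
from apache_beam.transforms import window

with beam.Pipeline() as p:
    (
        p
        | "Read" >> beam.Create([("click", 1), ("view", 4), ("click", 65)])
        # Attach event-time timestamps so windowing reflects when events happened.
        | "Stamp" >> beam.Map(lambda kv: window.TimestampedValue(kv[0], kv[1]))
        # Fixed one-minute event-time windows.
        | "Window" >> beam.WindowInto(window.FixedWindows(60))
        # Count occurrences of each element within each window.
        | "Count" >> beam.combiners.Count.PerElement()
        | "Print" >> beam.Map(print)
    )
```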
Netflix engineers were spending too much time working with infrastructure and not enough time on their media algorithms, so they created Archer, a high-scale distributed computing platform for media processing. It uses Docker, which allows developers to write their code in any language with any OS packages, test it on a laptop, and run it with millions of compute hours. This talk will discuss the Archer platform architecture as well as its implementation and applications, including feature extraction, encode experimentation, and machine learning.
Last year Facebook started rolling out the ability for public figures to go live with a guest. Now Live With is available for all profiles and Pages on iOS, letting you invite a friend into your live video so you can hang out together, or broadcast the conversation to an audience. To make this possible, Facebook's engineers worked to bring real-time interactive communication to broadcast-quality streams. In this talk, Nick Ruff will discuss how they bridged the trade-offs between video streaming technologies to enable real-time multiparty broadcasting.
In the last year, GPUs plus deep learning have gone from a hot topic to large-scale production deployment in major data centers. That's because deep learning works, and the evolution of GPUs has made them a great fit for both training and inference. Neural nets, frameworks, and GPU architectures have also changed significantly in the last year, allowing better solutions to be created more quickly and in more places, moving deep learning from niche applications to the mainstream and enabling its use in real time for more industrial automation and human interaction roles. This talk will cover GPU architecture and framework evolution; scaling training out and up; real-time inference improvements; security, VM isolation, and management; and overall deep learning flow improvements that make development and deployment more DevOps-friendly.
Magda Balazinska talks about PerfEnforce, a system that enables performance-oriented service-level agreements (SLAs) for data analytics. Given a set of tenants with query-level performance SLAs, she addresses how to dynamically assign compute resources to queries from each tenant (query scheduling). She also discusses how to dynamically resize the multi-tenant service to minimize the combined cost of compute resources and SLA violation penalties (resource provisioning).
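To make the provisioning trade-off concrete, here is a toy sketch (not PerfEnforce's actual algorithm) that picks the cluster size minimizing node cost plus SLA penalties; the prices, penalties, and latency model are hypothetical:

```python
NODE_COST_PER_HOUR = 0.50  # $ per node-hour (hypothetical)
SLA_PENALTY = 2.00         # $ refunded per query that misses its SLA (hypothetical)

def predicted_latency(query_work, nodes):
    # Hypothetical model: work divides evenly across nodes.
    return query_work / nodes

def hourly_cost(nodes, queries):
    # queries: list of (estimated_work, sla_seconds) pairs for the next hour.
    misses = sum(1 for work, sla in queries
                 if predicted_latency(work, nodes) > sla)
    return nodes * NODE_COST_PER_HOUR + misses * SLA_PENALTY

queries = [(120, 10), (300, 30), (45, 5)]
best = min(range(4, 65), key=lambda n: hourly_cost(n, queries))
print("cheapest cluster size:", best, "nodes")
```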
Google's codebase includes over 2 billion lines of code, spanning thousands of projects. This talk looks at how Google keeps such a large codebase nimble and evolving despite its size and scale.
Unity builds the tools that enable game developers, artists, designers, and videographers to tell better visual stories. Games made with Unity have reached over 3 billion devices and were installed over 16 billion times. Unity also has powerful tools for design, video animation, special effects, and video rendering. Adam Myhill, who heads cinematics at Unity, will discuss how Unity is democratizing and scaling tools for video that enable the real-time filming and editing of 3D content, watching VR without hardware, and more.
Qualcomm is an at-scale company. It powered the smartphone revolution and connected billions of people. It pioneered 3G and 4G, and now it is leading the way to 5G and a new era of intelligent, connected devices. Mobile is going to be the largest machine learning platform on the planet. Come learn how Qualcomm is making efficient on-device machine learning possible, how Qualcomm and Facebook worked closely to support machine learning in Facebook applications, and what's next for Qualcomm and AI.
DynamoDB is a fully managed NoSQL database service that provides high throughput at low latency with seamless scalability. The service is the backbone for many Internet applications, handling trillions of requests daily. The scale of data that applications have to manage continues to grow rapidly, making it a challenge to manage systems and respond to events in real time. This talk will be a deep dive into the challenges of building the DynamoDB Streams feature, which provides a time-ordered sequence of item changes on DynamoDB tables, and leveraging it with AWS Lambda to reimagine large-scale applications for the cloud.
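For illustration, here is a minimal sketch of a Lambda handler consuming a DynamoDB Streams event; the table attributes are hypothetical, while the event shape (Records, eventName, dynamodb.NewImage) follows AWS's documented stream record format:

```python
def handler(event, context):
    for record in event["Records"]:
        action = record["eventName"]  # INSERT, MODIFY, or REMOVE
        if action == "REMOVE":
            continue  # removed items carry no NewImage
        # Stream images use DynamoDB's typed attribute format, e.g. {"S": "..."}.
        new_image = record["dynamodb"]["NewImage"]
        user_id = new_image["UserId"]["S"]   # hypothetical attribute
        score = int(new_image["Score"]["N"]) # hypothetical attribute
        print(f"{action}: user={user_id} score={score}")
```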
Deep learning is making a great impact across products at Google and in the world at large. As Google pushes the limits of AI and deep learning, research is underway in many areas. With integration into many Google products, this research is improving the lives of billions of people. Open source tools like TensorFlow and open publications put the latest deep learning research at the fingertips of engineers around the world. This talk begins by exploring what has enabled this field to evolve rapidly over the last few years. It also will cover some of the leading research advances and current trends that point to a promising future, and the algorithms that make it possible.
The ability to prefetch data while your app is in the background can decouple the usability of your app from network availability. Moreover, it can minimize cellular data usage and significantly increase perceived speed. This talk walks through the main technical and performance challenges of the implementation. How do you schedule data prefetching in the background? What framework is most appropriate for executing this type of work? What should you be prefetching, and when? Find out here.
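As a toy illustration of the "what and when" questions above, here is a hedged Python sketch of a prefetch policy; the signals, thresholds, and ranking are hypothetical, not Facebook's actual implementation:

```python
def should_prefetch(on_wifi, battery_pct, bytes_used_today, daily_budget):
    # Prefer unmetered networks, protect the battery, and cap data usage.
    return on_wifi and battery_pct > 0.2 and bytes_used_today < daily_budget

def pick_stories(candidates, max_items=10):
    # Prefetch the stories most likely to be read first.
    ranked = sorted(candidates, key=lambda s: s["p_read"], reverse=True)
    return ranked[:max_items]

if should_prefetch(on_wifi=True, battery_pct=0.8,
                   bytes_used_today=3_000_000, daily_budget=10_000_000):
    for story in pick_stories([{"id": 1, "p_read": 0.9},
                               {"id": 2, "p_read": 0.4}]):
        print("prefetch story", story["id"])
```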
Monolith, microservices, or both? Find out how Okta has developed the best of both worlds to solve for the challenge of scaling to handle dynamic traffic volumes. In this session, Kelvin Zhu will walk through how the Okta team manages a monolithic codebase by way of virtual splitting, allowing dialable CI loop speed. He'll also speak to how Okta has rethought testing at scale — ensuring that tests run at speed while also monitoring the quality of tests to help gain visibility into problem tests and get past them.
Facebook can move fast and iterate because of its ability to make data-driven decisions. Data from its stream processing systems provides real-time analytics and insights; the systems are also integrated into various Facebook products, which have to aggregate data from many sources. In this talk, Rajesh Nishtala covers the difficulties of stream processing at scale, the solutions Facebook has created to date, and three case studies on improving time-to-insight with stream processing. The case studies include examples from search product development, accelerating daily pipelines in the data warehouse, and seamless integration with machine learning platforms. Each case study shows how Facebook can deliver value to more teams while continuing to abstract away the details of stream processing. Rajesh concludes by speaking to the future of stream processing.
This updated talk will give the latest details on how Facebook's Release Engineering team ships facebook.com multiple times per day.
BigQuery is best known for being a large-scale query engine, but one of its most important components is a structured storage system. Over the last several years, Google has found that active data management is crucial for providing no-ops scalable storage. This talk goes into the details of how BigQuery managed storage works, why it's hard to get right, and how it helps ensure that queries are always fast.
At Facebook, the mission is to give people the power to build community and bring the world closer together. In this talk, Necip Fazil Ayan will present the most recent work on using deep learning for machine translation and language understanding to unlock meaning across languages to help that mission. He will talk about the challenges of doing machine translation and language understanding at large scale, and will discuss technologies and platforms that have been built to tackle these challenges.
This talk will explore applications of ReDex to improve performance on emerging-market phones.