Performance @Scale

Facebook Headquarters 8:30am - 6:00pm


Performance @Scale is an invite-only conference for engineers working on the technical and organizational challenges of high-performance applications and services.

If you have ever wanted to learn best practices from the pros on how to detect performance anomalies, scale your web service, or speed up your mobile apps, then Performance @Scale is the place to be on Thursday, June 20th, 2019! If you have friends or colleagues who may also be interested in attending, feel free to forward them this invitation.

Performance @Scale will be held on Facebook’s campus in Menlo Park, California. Registration and breakfast start at 8:30 a.m. The Women in Technology panel will be held at 9:00 a.m., and talks begin at 10:00 a.m. Stick around after the day-long conference for Happy Hour.


Learn more about @Scale events, and follow us on Facebook for updates.


@Scale brings thousands of engineers together throughout the year to discuss complex engineering challenges and to work on the development of new solutions. We're committed to providing a safe and welcoming environment — one that encourages collaboration and sparks innovation.

Every @Scale event participant has the right to enjoy his or her experience without fear of harassment, discrimination, or condescension. The @Scale code of conduct outlines the behavior that we support and don't support at @Scale events and conferences. We expect participants to follow these rules at all @Scale event venues, online communities, and event-related social activities. These guidelines will keep the @Scale community a safe and enjoyable one for everyone.

Be welcoming. Everyone is welcome at @Scale events, inclusive of (but not limited to) gender, gender identity or expression, sexual orientation, body size, differing abilities, ethnicity, national origin, language, religion, political beliefs, socioeconomic status, age, color and neurodiversity. We have a zero-tolerance policy for discrimination.

Choose your words carefully. Treat one another with respect and in a professional manner. We're here to collaborate. Conflict is not part of the equation.

Know where the line is, and don't cross it. Harassment, threats, or intimidation of any kind will not be tolerated. This includes verbal, physical, sexual (such as sexualized imagery on clothing, presentations, in print, or onscreen), written, or any other form of aggression (whether outright, subtle, or micro). Behavior that is offensive, as determined by @Scale organizers, security staff, or conference management, will not be tolerated. Participants who are asked to stop a behavior or an action are expected to comply immediately or will be asked to leave.

Don't be afraid to call out bad behavior. If you're the target of harmful or offensive behavior, or if you witness someone else being harassed, threatened, or intimidated, don't look away. Tell an @Scale staff member, a security staff member, or a conference organizer immediately. Please notify our event staff, security staff, or conference organizers of any harmful or offensive behavior that you've experienced or witnessed in any form, whether in person or online.

We at @Scale want our events to be safe for everyone, and we have a zero-tolerance policy for violations of our code of conduct. @Scale conference organizers will investigate any allegation of problematic behavior, and we will respond accordingly. We reserve the right to take any follow-up actions we determine are needed. These include being warned, being refused admittance, being ejected from the conference with no refund, and being banned from future @Scale events.

8:30am - 10:00am

Registration and Breakfast

9:00am - 10:00am

Women in Tech Breakfast and Panel Discussion

10:00am - 10:30am

Welcome and Keynote Address

10:30am - 11:05am

Performance Analysis of Facebook AI Workloads on Accelerated Platforms

In this talk, we describe our top-down methodology for uncovering inefficiencies in our production AI workloads, the tools and technologies we’ve built to support performance analysis, and the common pitfalls in optimizing accelerated code. Our tools and techniques are being used by thousands of ML engineers at Facebook on products that serve billions of users.

Kim Hazelwood is an Engineering Manager leading the AI Infra Foundation and AI Infra Research efforts at Facebook, which focus on the hardware and software platform design and efficiency for Facebook's many applied machine learning-based products and services. Prior to Facebook, Kim held positions including a tenured Associate Professor at the University of Virginia, Software Engineer at Google, and Director of Systems Research at Yahoo Labs. She received a PhD in Computer Science from Harvard University in 2004, and is the recipient of an NSF CAREER Award, the Anita Borg Early Career Award, the MIT Technology Review Top 35 Innovators under 35 Award, and the ACM SIGPLAN 10-Year Test of Time Award. She currently serves on the Board of Directors of CRA, MIT SystemsThatLearn, and EPFL EcoCloud. She has authored over 50 conference papers and one book.
11:05am - 11:40am

Scaling ML models on Google's TPUs

Tensor Processing Units are Machine Learning accelerators developed at Google. A TPU v3 Pod offers over 100 PFLOPs of compute, leading to dramatic reductions in training time of Machine Learning models. In this talk, we will explore some of the scalability challenges, often not unique to TPUs, and techniques to address those challenges.

Naveen Kumar is a Software Engineer at Google. He currently leads Performance within Google Brain. Previously, Naveen worked on Google's second generation Tensor Processing Units. Prior to Google, Naveen focused on microprocessor research at Intel Labs. Naveen holds a PhD from the University of Pittsburgh and enjoys outdoor life in the Bay Area.
11:40am - 12:15pm

Scaling Deep Learning Workloads on GPUs

The computational size, complexity and footprint of neural network training has been doubling about every 3.5 months, according to OpenAI. The amount of data used for training has also been increasing, for instance as researchers take advantage of unsupervised training methods, as in BERT. These researchers now require multiple systems for training their models (a trend similar to scientific simulations on HPC systems in the past). This talk will discuss the techniques needed for running deep learning training at scale on GPUs, and state-of-the-art results. The discussion will also review how to deploy, scale, load balance and optimize the inference (or prediction) throughput of the trained network on GPUs, using tools such as TensorRT Inference Server.

Ujval has spent the last 10 years working on software and libraries for deep learning and HPC at NVIDIA. Previously, he co-founded Stream Processors, a fabless-semi startup building programmable processors for signal and image processing. Ujval earned his PhD in EE at Stanford and a BS at Brown University.
12:15pm - 1:15pm


1:15pm - 1:55pm

The Intersection of Data, Performance and Usability

Performance is more than a numbers game. This talk will share how Bing leverages behavioral analytics to identify usability bottlenecks and optimize perceived performance. We will cover a wide range of performance experiments, including good ideas that failed, and summarize the lessons we learned along the way.

Sarvesh leads the Bing performance team at Microsoft and is passionate about solving complex data problems with rich visualizations. Sarvesh holds an M.S. in Computer Science from Columbia University, NY.
1:55pm - 2:25pm

Open-Source Browser Contributions at Facebook

The Web as an application platform is still very much behind native platforms like Android and Windows for performance and richness of integration APIs. This makes it challenging for developers to create sophisticated yet performant Web apps which require a non-trivial amount of client-side JS code. The Browser Engineering team at Facebook finds bottlenecks in browser implementations, contributes code to open-source browsers, prototypes new Web technologies, and advances new API proposals through Web standards committees. This talk will cover our current and future projects for making Web apps as fast and as powerful as native apps, including the new isInputPending() API, the upcoming JS Self-Profiling API, and new ideas for eliminating JavaScript overheads.

Vladan is the tech lead for the Browser Engineering team at Facebook. His technical focus is browser technology, performance, and low-level systems. Previously, he led the Firefox performance team at Mozilla, working on browser startup, responsiveness and performance measurement.
2:25pm - 3:00pm

FlameScope: A Different Way of Looking at Profilers

Even under constant load, the behavior of a system is affected by variance, perturbations, single-threaded execution and other time-based issues, and is never completely uniform. Using profilers to analyze the performance of a system generally involves aggregating events or samples over a period of time, and identifying these small variations in the full profile becomes a needle-in-a-haystack problem. FlameScope solves this by combining a subsecond-offset heatmap, for navigating a profile and visualizing these perturbations, with a flame graph for code-path analysis.

For the past 13 years, Martin's career has revolved around technology and performance engineering, leading major initiatives at Netflix, Expedia and other companies. Currently, as a Performance Architect at Netflix, Martin is responsible for improving the performance of the Netflix service for its 148+ million users, who watch hundreds of millions of hours of movies and TV shows every day. Martin is also a Venture Advisor at monashees+, one of the largest venture capital firms in Brazil, an angel investor and advisor to multiple startups, and an avid open source contributor.
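
As a rough illustration of the subsecond-offset heatmap mentioned in the FlameScope abstract, the sketch below buckets profiler sample timestamps into cells, with one column per whole second and one row per slice of that second. It is not FlameScope's own code: the function name, the `rows` parameter and its default, and the input format (a list of timestamps in seconds) are all assumptions made for this example.

```python
import math
from collections import Counter


def subsecond_offset_heatmap(timestamps, rows=50):
    """Count samples per (second, subsecond bucket) cell.

    timestamps: sample times in seconds (floats), in any order.
    rows: how many buckets each second is split into (illustrative default).
    Returns a dict mapping (column, row) -> sample count, where column is the
    number of whole seconds since the second containing the earliest sample,
    and row is the bucket index within that second.
    """
    if not timestamps:
        return {}
    start = math.floor(min(timestamps))
    cells = Counter()
    for t in timestamps:
        elapsed = t - start
        column = int(elapsed)  # whole seconds since the start
        # Fractional part of the second, scaled to a bucket index.
        row = min(int((elapsed - column) * rows), rows - 1)
        cells[(column, row)] += 1
    return dict(cells)


# Hypothetical sample timestamps, split into 4 buckets per second.
heatmap = subsecond_offset_heatmap([10.0, 10.01, 10.5, 11.99, 11.995], rows=4)
for cell in sorted(heatmap):
    print(cell, heatmap[cell])
```

Cells with unusually high or low counts stand out as short-lived perturbations, and a time range picked from such a grid could then be handed to a flame graph for code-path analysis.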
3:00pm - 3:30pm


3:30pm - 4:05pm

Monitoring Real User Perceived Performance on Native Apps

At LinkedIn, we monitor our client-side performance as experienced by our members (RUM/Real User Monitoring). In this talk, we will share our journey migrating to a new generation of RUM for native apps, challenges faced in building a generic instrumentation framework, tradeoffs made to fit in our mobile architecture, and lessons learnt and best practices for designing new Tier 0 performance metrics for the company.

Ramya Pasumarti is a Staff Software Engineer at LinkedIn with the Performance Engineering team. She works on mobile and server-side performance, focusing on a variety of monitoring, tooling and optimization projects across the stack. She currently leads initiatives to enhance the mobile performance measurement, monitoring and debugging experience for developers.
4:05pm - 4:40pm

Improving iOS Startup Performance with Binary Layout Optimizations

Startup time of an iOS app is an important performance metric for user experience. However, poor ordering of functions in the iOS binary can greatly increase page faults during startup and significantly hurt startup performance. An “order file” can be used to tell the linker how to better order functions in an iOS binary. To generate an order file for iOS apps, we usually use dtrace, but some apps have multiple startup scenarios that we want to optimize for with the order file. The dtrace approach does not scale well and is not easy to automate. In this talk, we describe some more scalable approaches to generating order files.

Manman Ren is a Software Engineer at Facebook. She currently works on iOS app performance. Previously, Manman worked on Apple's compiler team and on bringing IA support to Android at Intel. Manman holds a PhD from Stanford University.
4:40pm - 5:30pm

Networking Happy Hour
