The @Scale Conference 

San Jose Convention Center 8:30am - 6:30pm

Enter Code to Attend

The annual @Scale conference is coming to the San Jose Convention Center on October 16. Registration is now open!

This year’s event features technical deep dives from engineers at a range of companies that operate at scale. Amazon Web Services, Box, Confluent, Cloudflare, Facebook, Google, Lyft, and NVIDIA are scheduled to appear. Attendees will be able to participate in four tracks: AI, Data Infra, Privacy, and Security. We also have our scale system demonstration area back for a fourth year so engineers can interact with the latest tech. Keep an eye on this page. We’ll be sharing the schedule soon!

@Scale is an invite-only event, so if you do not have an invite code, please message us on the @Scale community page.

 

About @Scale:

@Scale is a series of technical conferences for engineers who build or maintain systems that are designed for scale. Building applications and services that scale to millions or even billions of people presents a complex set of engineering challenges, many of them unprecedented. The @Scale community is focused on bringing people together to openly discuss these challenges and collaborate on the development of new solutions.


Code of Conduct

@Scale brings thousands of engineers together throughout the year to discuss complex engineering challenges and to work on the development of new solutions. We're committed to providing a safe and welcoming environment — one that encourages collaboration and sparks innovation.

Every @Scale event participant has the right to enjoy his or her experience without fear of harassment, discrimination, or condescension. The @Scale code of conduct outlines the behavior that we support and don't support at @Scale events and conferences. We expect participants to follow these rules at all @Scale event venues, online communities, and event-related social activities. These guidelines will keep the @Scale community a safe and enjoyable one for everyone.

Be welcoming. Everyone is welcome at @Scale events, inclusive of (but not limited to) gender, gender identity or expression, sexual orientation, body size, differing abilities, ethnicity, national origin, language, religion, political beliefs, socioeconomic status, age, color and neurodiversity. We have a zero-tolerance policy for discrimination.

Choose your words carefully. Treat one another with respect and in a professional manner. We're here to collaborate. Conflict is not part of the equation.

Know where the line is, and don't cross it. Harassment, threats, or intimidation of any kind will not be tolerated. This includes verbal, physical, sexual (such as sexualized imagery on clothing, presentations, in print, or onscreen), written, or any other form of aggression (whether outright, subtle, or micro). Behavior that is offensive, as determined by @Scale organizers, security staff, or conference management, will not be tolerated. Participants who are asked to stop a behavior or an action are expected to comply immediately or will be asked to leave.

Don't be afraid to call out bad behavior. If you're the target of harmful or offensive behavior, or if you witness someone else being harassed, threatened, or intimidated, don't look away. Tell an @Scale staff member, a security staff member, or a conference organizer immediately. Please notify our event staff, security staff, or conference organizers of any harmful or offensive behavior that you've experienced or witnessed in any form, whether in person or online.

We at @Scale want our events to be safe for everyone, and we have a zero-tolerance policy for violations of our code of conduct. @Scale conference organizers will investigate any allegation of problematic behavior, and we will respond accordingly. We reserve the right to take any follow-up actions we determine are needed. These include being warned, being refused admittance, being ejected from the conference with no refund, and being banned from future @Scale events.

Agenda
8:00am - 9:00am

Registration and Breakfast

9:00am - 9:50am

Women's Leadership Breakfast

10:00am - 10:50am

Keynote

10:55am - 12:15pm

Lunch

12:15pm - 1:15pm

Student Program

1:15pm - 1:45pm

Data Infra

Zanzibar: Google’s Consistent, Global Authorization System

Determining whether online users are authorized to access digital objects is central to preserving privacy. This talk presents the design, implementation, and deployment of Zanzibar, a global system for storing and evaluating access control lists. Zanzibar provides a uniform data model and configuration language for expressing a wide range of access control policies from hundreds of client services at Google, including Calendar, Cloud, Drive, Maps, Photos, and YouTube. Its authorization decisions respect causal ordering of user actions and thus provide external consistency amid changes to access control lists and object contents. Zanzibar scales to trillions of access control lists and millions of authorization requests per second to support services used by billions of people. It has maintained 95th-percentile latency of less than 10 milliseconds and availability of greater than 99.999% over 3 years of production use.
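
For readers new to the topic, the following is a minimal sketch of what evaluating an access control list can look like. It is not Zanzibar's implementation. The (object, relation, user) tuple layout loosely follows the form described in the public Zanzibar paper; the `check` function, the sample data, and every name below are hypothetical, and the sketch ignores the consistency and scale concerns that the talk is about.

```python
# Hypothetical sample ACL: each entry is (object, relation, subject).
# A subject is either a user id string or a (object, relation) pair,
# meaning "everyone who has that relation on that object".
ACL_TUPLES = [
    ("doc:readme", "viewer", "user:alice"),
    ("doc:readme", "viewer", ("group:eng", "member")),
    ("group:eng", "member", "user:bob"),
]


def check(tuples, obj, relation, user, _seen=None):
    """Return True if `user` has `relation` on `obj`, following group indirection."""
    seen = set() if _seen is None else _seen
    if (obj, relation) in seen:
        return False  # already explored; also guards against cycles
    seen.add((obj, relation))

    for t_obj, t_rel, subject in tuples:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if subject == user:
            return True  # direct grant
        if isinstance(subject, tuple):
            # Indirect grant: recurse into the referenced object and relation.
            if check(tuples, subject[0], subject[1], user, seen):
                return True
    return False


print(check(ACL_TUPLES, "doc:readme", "viewer", "user:bob"))    # member of group:eng
print(check(ACL_TUPLES, "doc:readme", "viewer", "user:carol"))  # no grant
```

In this toy version every check scans the whole list; a real system would index the tuples and cache results.
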
1:50pm - 2:20pm

Data Infra

6 Technical Challenges Developing a Distributed SQL Database

Developing YugaByte DB was not without its fair share of technical challenges. There were times when we had to go back to the drawing board and even sift through academic research to find a better solution than what we had at hand. In this talk we’ll outline some of the hardest architectural issues we have had to address in our journey of building an open source, cloud native, high-performance distributed SQL database. Topics include architecture, SQL compatibility, distributed transactions, consensus algorithms, atomic clocks, and PostgreSQL code reuse.
2:25pm - 2:55pm

Data Infra

Global Data Management

Building a logging and distribution layer.
3:00pm - 3:30pm

Data Infra

Amazon DynamoDB: Fast and flexible NoSQL database service for any scale

Amazon DynamoDB is a hyperscale, NoSQL database designed for internet-scale applications, such as serverless web apps, mobile backends, and microservices. DynamoDB provides developers with the security, availability, durability, performance, and manageability they need to run mission-critical workloads at extreme scale. In this session, we dive deep into the underpinnings of DynamoDB and how we run a fully managed, nonrelational database service that is used by more than 100,000 customers. We look under the hood of DynamoDB and discuss how features such as DynamoDB Streams, ACID transactions, continuous backups, point-in-time recovery (PITR), and global tables work @scale. We also share some of our key learnings in building a highly durable, highly scalable, and highly available key-value store that you can apply when building your large-scale systems.
4:05pm - 4:35pm

Data Infra

Kafka @Scale: Confluent’s Journey Bringing Event Streaming to the Cloud

As streaming platforms become central to data strategies, companies both small and large are re-thinking their architecture with real-time context at the forefront. What was once a ‘batch’ mindset is quickly being replaced with stream processing as the demands of the business impose more and more real-time requirements on developers and architects. What started at companies like LinkedIn, Facebook, Uber, Netflix and Yelp has made its way to countless others in a variety of sectors. Today, thousands of companies across the globe build their businesses on top of Apache Kafka®. This talk will be a deep dive into the evolution and future of event streaming and the lessons that Confluent learned through its journey to make the platform cloud-native.
1:15pm - 1:45pm

AI

Unique Challenges and Opportunities for Self-Supervised Learning in Autonomous Driving

Autonomous vehicles generate a lot of raw (unlabeled) data every minute. However, only a small fraction of that data can be labeled manually. My talk will focus on how we leverage the unlabeled data for the tasks of perception and prediction in a self-supervised manner. There are a few unique ways to achieve this in AV land: one of them is cross-modal self-supervised learning, where one modality can serve as a learning signal for another modality without the need for labeling (such as using depth from LiDAR to train a monocular depth neural network on images). Another approach that is unique to AVs is to use the outputs from large-scale optimization (such as SLAM, which can only run in non-real time in the cloud) as a learning signal to train neural networks that mimic those outputs but can run in real time on the AV. The talk will also touch upon how we can leverage the Lyft fleet to oversample long-tail events and hence learn the long tail.
2:25pm - 2:55pm

AI

Multi-Node Natural Language Understanding at Scale

This session includes a deep look into the world of multi-node training for complex NLU models like BERT. Sharan will describe the challenges of tuning for speed and accuracy at scales needed to bring training times down from weeks to minutes. Drawing from real world experience running models on up to ~1500 GPUs with reduced precision techniques, he will dive into the impact of different optimizers, strategies to reduce communication time, and improvements to per-GPU performance.
3:00pm - 3:30pm

AI

Cross Product Optimization

Artificial Intelligence (AI) is behind practically every product experience at LinkedIn. From ranking the member’s feed to recommending new jobs, AI is used to fulfill our mission to connect the world’s professionals to make them more productive and successful. While product functionality can be decomposed into separate components, the components are beautifully interconnected, creating interesting questions and challenging AI problems that need to be solved in a sound and practical manner. In this talk, I will provide an overview of lessons learned and approaches we have developed to address these problems, including scaling to large problem sizes, handling multiple conflicting objective functions, efficient model tuning, and our progress toward using AI to optimize the LinkedIn product ecosystem more holistically.
1:15pm - 1:45pm

Security

Leveraging the Type System to Write Secure Applications

How to extend the type system to eliminate entire classes of security vulnerabilities at scale.
1:50pm - 2:20pm

Security

Securing SSH Traffic to 190+ Data Centers

Cloudflare maintains thousands of servers in over 190 points of presence that need to be accessed from multiple offices. We relied on a private network and SSH keys to securely connect to those machines. However, that private network perimeter posed a risk if breached and those keys had to be carefully managed and revoked as needed. To solve those challenges, we built and migrated to a model where we expose those servers to the public internet and authenticate with an identity provider to reach those servers. We deployed a system that leverages ephemeral certificates, based on user identity, so that we could delete our SSH keys as an organization. We are here to share what we learned in the three years that Cloudflare has been building a zero-trust layer on top of its existing network to secure both HTTP and non-HTTP traffic.
3:00pm - 3:30pm

Security

The Call is Coming From Inside the House: Lessons in Securing Internal Apps

Locking down internal apps presents unique and frustrating challenges for appsec teams. Your organization may have dozens if not hundreds of sensitive internal tools, dashboards, control panels, etc., running on heterogeneous technical stacks with varying levels of code quality, technical debt, external dependencies, and maintenance commitments. How do you tackle this problem scalably with limited resources? Come hear a dramatic and humorous tale of internal appsec and the technical and management lessons we learned along the way. Even if your focus is on securing external apps, this talk will be relevant for you. You’ll hear about what worked well for us and what didn’t, including:
  • Finding a useful mental model to organize your roadmap
  • Starting with the basics: authn/z, TLS, etc.
  • Rolling out Content Security Policy
  • Using SameSite cookies as a powerful entry point regulation mechanism
  • Leveraging WAFs for useful detection and response
  • Using internal apps as a training ground for new security engineers
4:05pm - 4:35pm

Security

Streaming, Flexible Log Parsing with Real-Time Applications

Logs from cybersecurity appliances are numerous, are generated from heterogeneous sources, and are frequently victim to poor hygiene and malformed content. Relying on an already understaffed human workforce to constantly write new parsers, triage incorrectly parsed data, and keep up with ever-increasing data volumes is bound to fail. Using RAPIDS, an open-source data science platform, we show how creating a more flexible, neural network approach to log parsing can overcome these obstacles. Presented is an end-to-end workflow that begins with raw logs, applies flexible parsing, and then applies stream analytics (e.g., rolling z-score for anomaly detection) to the near real-time parsing. By keeping the entire workflow on GPUs (either on-premises or in a cloud environment), we demonstrate near real-time parsing and the ability to scale to large volumes of incoming logs.
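
As a rough illustration of the stream analytics step mentioned above, here is a minimal rolling z-score sketch. It is plain CPU Python rather than RAPIDS, and it is not the presenters' workflow; the function names, the window size, the threshold, and the sample event counts are all assumptions made for this example.

```python
from collections import deque
from math import sqrt


def rolling_zscores(values, window=5):
    """Yield (value, z), scoring each value against the previous `window` values.

    z is None until a full window of history is available.
    """
    history = deque(maxlen=window)
    for value in values:
        if len(history) < window:
            z = None
        else:
            mean = sum(history) / window
            variance = sum((x - mean) ** 2 for x in history) / window
            std = sqrt(variance)
            # A constant window has no spread to score against; report 0.0.
            z = (value - mean) / std if std > 0 else 0.0
        yield value, z
        history.append(value)


def find_anomalies(values, window=5, threshold=3.0):
    """Return the indices whose absolute rolling z-score exceeds `threshold`."""
    return [
        index
        for index, (_, z) in enumerate(rolling_zscores(values, window))
        if z is not None and abs(z) > threshold
    ]


# Hypothetical counts of parsed log events per time interval.
events_per_interval = [10, 11, 9, 10, 11, 10, 50, 10]
print(find_anomalies(events_per_interval))
```

Each value is compared only with the values before it, so a spike cannot mask itself by inflating its own window.
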
1:50pm - 2:20pm

Privacy

Firefox Origin Telemetry with Prio

Measuring browsing behavior by site origin can provide actionable insights into the broader web ecosystem in areas such as blocklist efficacy and web compatibility. However, an individual’s browsing history contains deeply personal information that browser vendors should not collect wholesale. In this talk, we discuss how we can precisely measure aggregate page-level statistics using Prio, a privacy-preserving data collection system developed by Stanford researchers and deployed in Firefox. In Prio, a small set of servers verify and aggregate data through the exchange of encrypted shares. As long as one server is honest, there is no way to recover individual data points. We will explore the challenges faced when implementing Prio, both in Firefox and its Data Platform. We will touch on how we have validated our deployment of Prio through two experiments: one which collects known Telemetry data and one which collects new data on the application of Firefox’s blocklists across the web. We will share the results of these experiments and discuss how they’ve informed our future plans.
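
The idea that servers can aggregate shares without any one of them seeing an individual data point can be sketched with additive secret sharing. This is only an illustration of that one idea, under stated assumptions: it omits the share encryption and the validity checks that the abstract says the servers perform, and the modulus, function names, and sample data are hypothetical rather than taken from Prio or Firefox.

```python
import secrets

# Illustrative field size (an assumption): the Mersenne prime 2**61 - 1.
MODULUS = 2**61 - 1


def split_into_shares(value, num_servers):
    """Split `value` into random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(num_servers - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def aggregate(client_values, num_servers=2):
    """Simulate servers that each sum only the shares they receive."""
    server_totals = [0] * num_servers
    for value in client_values:
        shares = split_into_shares(value, num_servers)
        for server, share in enumerate(shares):
            server_totals[server] = (server_totals[server] + share) % MODULUS
    # Only the combined per-server totals reveal the aggregate.
    return sum(server_totals) % MODULUS


# Hypothetical data: 1 means a client saw a blocklist hit, 0 means it did not.
reports = [1, 0, 1, 1, 0]
print(aggregate(reports))
```

Any single share is a uniformly random number, so a value stays hidden unless every server's share of it is combined.
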
2:25pm - 2:55pm

Privacy

DNS Privacy at Scale: Lessons and Challenges

It’s no secret that the use of the domain name system (DNS) reveals a lot of information about what people do online. The use of traditional unencrypted DNS protocols reveals this information to third parties on the network, introducing privacy risks to users as well as enabling country-level censorship. In recent years, Internet protocol designers have sought to retrofit DNS with several new privacy mechanisms to help provide confidentiality to DNS queries. The results of this work include technologies such as DNS-over-TLS, DNS-over-HTTPS and encrypted SNI for TLS. In this talk, we’ll share some of the technical and political challenges around deploying these technologies.
4:05pm - 4:35pm

Privacy

Fairness and Privacy in AI/ML Systems

How do we protect the privacy of users when building large-scale AI-based systems? How do we develop machine learned models and systems taking fairness, accountability, and transparency into account? With the ongoing explosive growth of AI/ML models and systems, these are some of the ethical, legal, and technical challenges encountered by researchers and practitioners alike. In this talk, we will first motivate the need for adopting a "fairness and privacy by design" approach when developing AI/ML models and systems for different consumer and enterprise applications. We will then focus on the application of fairness-aware machine learning and privacy-preserving data mining techniques in practice by presenting case studies spanning different LinkedIn applications (such as fairness-aware talent search ranking, privacy-preserving analytics, and LinkedIn Salary privacy & security design).
5:00pm - 6:00pm

Social Hour
