Video @Scale 2024

November 20-21, 2024

Video @Scale 2024 is a technical conference designed for engineers who develop or manage large-scale video systems serving millions of people. Developing large-scale video systems involves complex, unprecedented engineering challenges. The @Scale community focuses on sharing people's experiences in creating innovative solutions in the video engineering domain.

Video @Scale 2024 will be hosted virtually on November 20 & 21. Joining us are speakers from AWS, Boston University, Captions, Meta, Momento, and Netflix.

Register today and check back for upcoming speaker and agenda announcements!


EVENT AGENDA

Event times below are displayed in PT.

Day 1 - Nov 20

Day 2 - Nov 21

09:00 AM - 09:05 AM
Opening Remarks
Speaker Diana Chen, Meta
09:05 AM - 09:35 AM
Fireside Chat

Revolutionizing Video Creation and Editing: Insights from Engineering Leaders on the Present and Future of Generative Video Models

Hear experts in video generation models together on one panel for the first time, discussing the future of generative AI video models.

Moderator Abhinav Kapoor, Meta
Speaker Peter Vajda, Meta
Speaker Amit Jain, Luma AI
09:35 AM - 09:55 AM
Movie Gen Video: State-of-the-art Video Generation

Image, video, and audio generation are fundamental building blocks for Generative AI research and applications in the real world. In this talk, I'll present Movie Gen, a set of foundational models for video generation, editing, personalization, and audio generation. Movie Gen models are among the world's most advanced media generation models, with state-of-the-art results compared to industry solutions. I'll focus my talk on text-to-video generation and share key insights that enabled this step change in quality. Movie Gen produces HD-quality videos of up to 16 seconds in length and has been used by movie producers in Hollywood.

Speaker Ishan Misra, Meta
Featured Article
Movie Gen: A Cast of Media-Generation Foundation Models
09:55 AM - 10:15 AM
How Captions Enables Anyone To Tell Their Story With Video

At Captions, we believe that anyone can become a video creator, regardless of their experience. This mission presents both product and technical challenges as we bridge the gap between cutting-edge video generation capabilities and a user-friendly experience. In this presentation, we'll take a deep dive into Captions and our underlying technical systems. Lastly, we'll demonstrate how we're evolving these systems to empower users to create videos that resonate globally.

Speaker Corbyn Salisbury, Captions
10:15 AM - 10:20 AM
Break
10:20 AM - 10:40 AM
Meta's AI Translations Inference

In this talk, we will show how we implemented a media processing pipeline to perform media inference (autodub and lipsync) at Meta scale.

We will focus on the challenges we faced from a media processing and scaling point of view, including inference latency and scheduling, voice isolation, media timing and alignment, alternate track delivery, instrumentation, and model evaluation.

Speaker Sravan Rekula, Meta
Speaker Jordi Cenzano, Meta
Speaker Amisha Jaiswal, Meta
10:40 AM - 11:00 AM
Building Netflix's New Video Pipeline

An efficient and flexible video processing pipeline is critical for enabling innovation and for supporting both our streaming service and studio partners, making it essential to Netflix's continued success. Over the past few years, we have been rebuilding this pipeline on our next-generation microservice-based platform. In this talk, we will share our journey and learnings with the community.

Speaker Liwei Guo, Netflix
11:00 AM - 11:20 AM
On-device Video Playback Upsampling

Various video upsampling technologies are applied adaptively to video playback on mobile clients. Playback quality is often constrained when video is streamed to end users, mostly because of limited network bandwidth on the device or low quality in the original content. Applying these advanced upsampling technologies during on-device playback improves the streamed video quality and, at the same time, helps users save cellular/Wi-Fi data by playing video at a lower bitrate.

Speaker Wen Li, Meta
Featured Article
On-Device Video Playback Upsampling
11:20 AM - 11:45 AM
Live Q&A Session
Moderator Diana Chen, Meta
Speaker Ishan Misra, Meta
Speaker Corbyn Salisbury, Captions
Speaker Sravan Rekula, Meta
Speaker Jordi Cenzano, Meta
Speaker Amisha Jaiswal, Meta
Speaker Wen Li, Meta
Speaker Liwei Guo, Netflix

Day 2 - Nov 21

09:00 AM - 09:05 AM
Opening Remarks
Speaker Abhinav Kapoor, Meta
09:05 AM - 09:25 AM
SAM 2: Segment Anything in Images & Videos

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos. We build a data engine, which improves model and data via user interaction, to collect the largest video segmentation dataset to date. Our model is a simple transformer architecture with streaming memory for real-time video processing. SAM 2 trained on our data provides strong performance across a wide range of tasks. In video segmentation, we observe better accuracy, using 3x fewer interactions than prior approaches. In image segmentation, our model is more accurate and 6x faster than the Segment Anything Model (SAM). We believe that our data, model, and insights will serve as a significant milestone for video segmentation and related perception tasks.

Speaker Chay Ryali, Meta
09:25 AM - 09:45 AM
Challenges in generated video evaluation

In the past couple of years, the landscape of image and video generation has transformed dramatically. Despite such phenomenal progress, rigorous and holistic evaluation of generative models continues to lag behind. This is primarily due to the multi-faceted and highly subjective nature of the task: a generated image or video should be evaluated not just on overall visual quality and aesthetics, but also on its alignment to the input prompt, originality, avoidance of stereotypical biases, and several other factors.

In this talk, I’ll give an overview of current metrics, their shortcomings, and the rapid progress in the research community to improve the rigor in evaluation.

Speaker Deepti Ghadiyaram, Boston University
09:45 AM - 10:05 AM
Efficient Segment Anything

EfficientSAM: small but mighty.

Speaker Bilge Soran, Meta
Speaker Yunyang Xiong, Meta
Featured Article
Pioneering AI Efficiency: Efficient Segmentation Models
10:05 AM - 10:10 AM
Break
10:10 AM - 10:30 AM
Metrics Driven Obsession with Viewer Experience

Viewer experience is a complex outcome of many competing dimensions, from encoding quality to network performance of the video pipeline. Metrics like Zero Buffer Rates (ZBR) illuminate the impact of various components in the end-to-end pipeline on the viewer experience. This talk will explore the essential elements of operational excellence in video infrastructure, working backwards from the viewer’s perspective into the video pipeline.

Speaker Khawaja Shams, Momento
10:30 AM - 10:50 AM
Using Performance Signals to Improve Video Recommendations

Presentation information coming soon!

Speaker Zhi Long Tan, Meta
Speaker Wen Zhang, Meta
10:50 AM - 11:10 AM
Rethinking the Video Transcoding Pipeline: How Pre-Filtering Holds the Key to Efficiency and Quality

While video input pre-filtering is not a new concept, it has evolved significantly in recent years. Originally designed to reduce noise and artifacts in low-quality video, pre-filtering techniques have now been adapted for use even on pristine video content. This talk will explore the importance of video input pre-filtering and the specific advantages it can offer in modern video production and delivery workflows.

Key benefits of advanced input pre-filtering include reduced file sizes without sacrificing perceived video quality, as well as mitigation of common video quality issues like softness and deblocking artifacts. However, implementing a production-ready, broadcast-grade pre-filtering solution requires careful consideration of several critical factors.

This talk will dive into the core pillars of the newly released video input pre-filter in AWS Elemental's media processing solutions. It will explain how this advanced pre-filtering technology can help video providers deliver high-quality, efficiently encoded video for a wide range of applications, from over-the-top streaming to broadcast television. Attendees will come away with a deep understanding of the evolving role of pre-filtering in the video technology landscape and practical insights they can apply to their own workflows.

Speaker Ramzi Khsib, AWS
11:10 AM - 11:35 AM
Live Q&A Session
Moderator Abhinav Kapoor, Meta
Speaker Chay Ryali, Meta
Speaker Deepti Ghadiyaram, Boston University
Speaker Bilge Soran, Meta
Speaker Yunyang Xiong, Meta
Speaker Khawaja Shams, Momento
Speaker Zhi Long Tan, Meta
Speaker Wen Zhang, Meta
Speaker Ramzi Khsib, AWS

SPEAKERS AND MODERATORS

Diana Chen
Meta
Engineering Manager at Meta, supporting teams in Video Infrastructure that provides client-side media editing,...

Abhinav Kapoor
Meta
Abhinav is part of the Video Infra leadership team at Meta, focusing on scaling...

Peter Vajda
Meta
Peter Vajda joined Meta in 2014 as a Research Scientist. He currently directs the...

Amit Jain
Luma AI
Amit Jain is the CEO and Founder of Luma AI, which he founded in...

Ishan Misra
Meta
Ishan Misra is a Research Scientist in the GenAI group at Meta where he...

Corbyn Salisbury
Captions
As the first Backend Engineer Manager at Captions, I’m proud to have led the...

Sravan Rekula
Meta
Sravan is Engineering Manager in video infra and supports large scale video ingestion and...

Jordi Cenzano
Meta
Jordi Cenzano is an engineer specializing in broadcast and online media. He is currently...

Amisha Jaiswal
Meta
Amisha is a Software Engineer at Meta, where she currently contributes to the video...

Liwei Guo
Netflix
Liwei Guo is a Staff Software Engineer in the Encoding Technologies team at Netflix....

Wen Li
Meta
Wen Li is a software engineer in the iOS Video Playback team at Meta....

Chay Ryali
Meta
Chay Ryali is a Research Engineer at AI@Meta (FAIR), developing multimodal foundation models with...

Deepti Ghadiyaram
Boston University
Deepti is an Assistant Professor in the Dept. of Computer Science in Boston University....

Bilge Soran
Meta
Bilge Soran holds a PhD from the University of Washington, where she focused on...

Yunyang Xiong
Meta
I am a research scientist at Meta. I have been working on foundation model...

Khawaja Shams
Momento
Khawaja is the CEO @ Momento, won the NASA Early Career Medal for his...

Zhi Long Tan
Meta
Zhi is a Software Engineering Manager at Meta.

Wen Zhang
Meta
Wen began his career as a data scientist in the finance domain, spending 3...

Ramzi Khsib
AWS
Ramzi Khsib is a Principal Software Development Engineer with AWS Elemental's Research & Development...

LATEST NOTES

Video @Scale
11/20/2024
Movie Gen: A Cast of Media-Generation Foundation Models
Humans communicate using a rich variety of digital media—text, images, videos, audio. Movie Gen is a cast of media-generation foundation...
Video @Scale
11/20/2024
On-Device Video Playback Upsampling
In today’s digital age, videos are a major source of entertainment and information. Like many of us, I enjoy swiping...
Video @Scale
11/21/2024
Pioneering AI Efficiency: Efficient Segmentation Models
In the rapidly evolving landscape of artificial intelligence, the quest for models that are both powerful and efficient is paramount....