TOPIC: ANDROID, VIDEO, AND WEB

Video @Scale Fall 2022

NOVEMBER 03, 2022 @ 09:00 AM - NOVEMBER 03, 2022 @ 03:00 PM PT
Designed for engineers that develop or manage large-scale video systems serving millions of people. The development of large-scale video systems includes complex, unprecedented engineering challenges.

ABOUT EVENT

Video @Scale is a technical conference for engineers who build large-scale video systems, where engineers come together to discuss unprecedented engineering challenges, learn about new technology, and collaborate on the development of new solutions.

This year’s Video @Scale will be hosted virtually. Joining us are speakers from Meta, Twitch, Akamai, Caffeine, Brightcove and more. The event will take place on November 3rd, 2022, with talks themed around Interactive, Immersive, and Intelligent video at scale.

EVENT AGENDA

Event times below are displayed in PT.

November 3

09:00 AM - 09:05 AM
Welcome Remarks

Presented by: Abhinav Kapoor & Venus Montes

SPEAKER Abhinav Kapoor,Meta
SPEAKER Venus Montes,Meta
09:05 AM - 09:35 AM
Building Real Time AR Experiences at Scale on Constrained Devices

Augmented Reality is going to change the way humans interact. Various companies have started to build the foundational infrastructure and tools to create the AR ecosystem and AR experiences on mobile devices. But to provide similar or even more computationally intensive immersive AR experiences for thin clients like AR glasses, one needs to take a step back, understand the strict power and thermal limitations, and think of an architecture that offloads compute to a beefier server in a privacy/context-aware, latency-sensitive, and scalable fashion. There are many challenging areas when it comes to shipping camera frames to a server for computation, ranging from operating real-time transport at scale to leveraging GPUs at scale for ML and render operations, for both calling (like Augmented Calling) and non-calling scenarios. Camera frames (RGB and possibly depth) are the prime driving force for Augmented Reality, and being able to process this video data at scale is a necessity for scaling AR experiences in the future. This talk will focus on some of the work Meta has done in this domain and how the industry as a whole needs to come together to solve some of these challenges in order to build the future of high-fidelity, low-latency immersive AR experiences.

SPEAKER Pranav Saxena,Meta
09:35 AM - 09:50 AM
Bringing Interactivity to Videos

Traditionally, viewers consumed video in a passive, ‘lean back’ environment. Video consumption on social media, however, is an interactive, ‘lean forward’ experience with rich engagement between creators and their audience. Creators are looking for more ways to connect, engage and interact with their audience through new video experiences. Unfortunately, existing video specifications don’t provide a standardized mechanism to support a diversity of interactive experiences.

In this talk, we’ll present a generic end-to-end framework for interactive video experiences. Our solution enables creators and broadcasters to simply add interactive components (e.g., ads, stickers, polls, images, chapter markers) into the video timeline and define how the audience can interact with the components. During playback, viewers can interact with the video at the predefined points in the timeline. We will also cover how AR and AI technologies can be applied to interactive components, and discuss the different use cases the framework could power.

SPEAKER Yurong Jiang,Meta
09:50 AM - 10:05 AM
Democratizing AR: Building an AR platform for everyone

Arti.AR is building a cloud-based AR platform for live video. This talk outlines the benefits of adopting AR at scale and the challenges we faced during the platform development.

SPEAKER Ben Hazan,Arti
10:05 AM - 10:25 AM
LIVE Q&A

Featuring Pranav Saxena, Yurong Jiang, & Ben Hazan
Moderated by Venus Montes

SPEAKER Pranav Saxena,Meta
SPEAKER Yurong Jiang,Meta
SPEAKER Ben Hazan,Arti
SPEAKER Venus Montes,Meta
10:25 AM - 10:45 AM
Live Media Over QUIC

Twitch has been working on Warp, a new live streaming protocol utilizing QUIC. This talk outlines the benefits of QUIC and why it will replace TCP. I'll cover some of the emerging approaches for transferring media over QUIC such as Warp, Meta's RUSH, and RTP over QUIC.

SPEAKER Luke Curley,Twitch
10:45 AM - 11:00 AM
Lessons Learned: Low Latency Ingest

Over the past six months, Caffeine has reimplemented its ingest gateway, both to address long-standing historical behaviors, and to provide a platform for future service enhancements. This presentation touches on a number of high-level challenges encountered during this development, and dives deep on one of the more baffling roadblocks we uncovered.

SPEAKER Adam Roach,Caffeine.tv
11:00 AM - 11:15 AM
Delivering Reliable Live Streaming Over Unreliable Backbone Networks

Sometimes we need to deliver a highly reliable live streaming experience over a network that is not designed for that. In our case we implemented a flexible multi-path strategy that allowed us to fix (almost) all problems (buffering and disconnections) caused by unavoidable network events.

SPEAKER Jordi Cenzano,Meta
SPEAKER Thomas Higdon,Meta
11:15 AM - 11:35 AM
LIVE Q&A

Featuring Luke Curley, Adam Roach, Jordi Cenzano, & Thomas Higdon
Moderated by Abhinav Kapoor

SPEAKER Luke Curley,Twitch
SPEAKER Adam Roach,Caffeine.tv
SPEAKER Jordi Cenzano,Meta
SPEAKER Thomas Higdon,Meta
SPEAKER Abhinav Kapoor,Meta
11:35 AM - 12:15 PM
LUNCH BREAK
12:15 PM - 12:30 PM
Gaze-Driven Video Delivery: Science Fiction or Viable Link to the Metaverse?

It is well known that due to the uneven distribution of cones in the human retina, we have sharp vision only in the central (fovea) region. The angular span of this region is tiny, just about 1 degree^2. In comparison, the angular span of a TV set watched from 4x screen heights is over 250 degrees^2. This observation implies that using eye-tracking for video compression offers enormous potential. If the encoder can instantaneously know which spot (1 degree^2 patch) is visible, only information in that spot will need to be encoded and transmitted. Up to 2 orders of magnitude savings in bandwidth may be attainable!

This idea has been known at least since Bernd Girod's paper "Eye movements and coding of video sequences," published in 1988. Many additional works have followed, proposing various implementations of gaze-based video coding systems. Even special classes of compression techniques called foveated video coding or region-of-interest (ROI)-based video coding have appeared, motivated by this application. However, most early attempts to build complete systems based on this idea were unsuccessful. The key reasons were the long network delays observed in the 1990s and 2000s, the years when this idea was studied most extensively. But things have changed since.

In this talk, I will first briefly survey some basic principles (retinal eccentricity, eye movement types, and related statistics) and some key previous studies and results. I will then derive an equation explaining the relationship between network delay and the bandwidth savings that may be achievable by gaze tracking. Then, I will turn to modern-era mobile wireless networks (5G and Wi-Fi 6 / 802.11ax) and discuss the delays currently achievable in direct links to user devices, in device-to-device communication within the same cell (or over the same Wi-Fi access network), and in data transmissions involving 5G core networks.
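
As a rough check of the figures quoted at the start of this abstract, the short Python sketch below estimates the angular area of a screen watched from 4 screen heights and compares it with a 1 degree^2 foveal patch. The 16:9 aspect ratio, the function names, and the approximation of angular area as the product of the horizontal and vertical spans are assumptions made for illustration; they are not taken from the talk, and this is not the delay/bandwidth equation the speaker derives.

```python
import math


def angular_span_deg(extent, distance):
    """Full angle (in degrees) subtended by a flat extent viewed head-on from its center."""
    return math.degrees(2 * math.atan((extent / 2) / distance))


def screen_angular_area_deg2(distance_in_heights=4.0, aspect_ratio=16 / 9):
    """Approximate angular area of a screen, using screen height as the unit of length.

    aspect_ratio=16/9 is an assumption; the abstract only gives the viewing distance.
    """
    vertical = angular_span_deg(1.0, distance_in_heights)
    horizontal = angular_span_deg(aspect_ratio, distance_in_heights)
    # Small-angle approximation: treat the area as the product of the two spans.
    return horizontal * vertical


FOVEA_AREA_DEG2 = 1.0  # figure quoted in the abstract

screen_area = screen_angular_area_deg2()
ratio = screen_area / FOVEA_AREA_DEG2
print(f"screen area: {screen_area:.0f} deg^2")
print(f"screen-to-fovea ratio: about 10^{math.log10(ratio):.1f}")
```

Under these assumptions the screen covers roughly 357 degrees^2, which is consistent with the "over 250 degrees^2" figure and with an upper bound of about two orders of magnitude on the potential savings.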

SPEAKER Yuriy Reznik,Brightcove
12:30 PM - 12:50 PM
Cloud Streaming in Metaverse

Cloud streaming is an important tool for improving the Metaverse. Using cloud streaming, we can quickly extend the reach of the Metaverse to 2D surfaces and to a variety of devices. Cloud streaming can also help 3D environments by enabling massively social, immersive, and rich experiences on lightweight devices that are limited in compute, thermal, and power budgets.

SPEAKER Naizhi Li,Meta
12:50 PM - 01:10 PM
Building a Professional Video Editor on the Cloud, Powered by Machine Learning

In this talk, I'll discuss some of the challenges we faced building Runway, a professional video editor in the browser, focusing on Green Screen, our interactive video segmentation tool, and the general server-side architecture we've developed for low-latency ML inference on video with computer vision models.

SPEAKER Anastasis Germanidis,Runway.ml
01:10 PM - 01:30 PM
LIVE Q&A

Featuring Yuriy Reznik, Naizhi Li, & Anastasis Germanidis
Moderated by Venus Montes

SPEAKER Yuriy Reznik,Brightcove
SPEAKER Naizhi Li,Meta
SPEAKER Anastasis Germanidis,Runway.ml
SPEAKER Venus Montes,Meta
01:30 PM - 01:40 PM
Approach to HDR and Tonemap on Android

At Meta, we invest deeply in ingesting and playing back media at the best quality for our users.

This becomes especially challenging with the advancement in camera capture capabilities of new devices and with products such as Reels that allow users to add special effects on top of these videos.

As the HDR color space evolves on Android, with different OEMs supporting different HDR formats, we at Meta need to read these formats correctly and apply the appropriate tonemap (conversion to SDR) so that such videos do not break on upload and playback.

Video Client Infra has solved the challenging problem of correctly tonemapping HDR videos in different formats on Android devices, at the frame level. This preserves media quality with minimal latency impact and keeps these videos compatible with all the awesome effects loved by our creators.

We also plan for HDR transcode and ingestion, as the HDR format is standardized for all OEMs.

SPEAKER Bhumi Sabarwal,Meta
01:40 PM - 01:50 PM
HDR at Instagram: The iOS Story

We have been working on gracefully supporting HDR within the Instagram iOS app since it was made popular by Apple in October 2020. Follow our journey and the challenges we faced from ingestion through playback as we adopt this format within our non-traditional media stack.

SPEAKER Chris Ellsworth,Meta
01:50 PM - 02:10 PM
Scaling AV1 End-to-End Delivery at Meta

AV1 was the first-generation royalty-free coding standard developed by the Alliance for Open Media, of which Meta is one of the founding members. Since its release in 2018, we have worked closely with the open source community to implement and optimize AV1 software decoders and encoders. Early in 2022, we believed AV1 was ready for delivery at scale for key VOD applications such as Facebook (FB) Reels and Instagram (IG) Reels. Since then, we have started delivering AV1-encoded FB/IG Reels videos to selected iPhone and Android devices. After rollout, we observed engagement wins, playback quality improvements, and bitrate reductions with AV1.

In this talk, we will share our journey on how we enabled AV1 end-to-end from Meta servers to users' mobile screens around the world. First, we will talk about AV1 production, including encoding configuration and ABR algorithms. Further, since the main delivery challenge is on the decoder and client side, we will also talk about what we learned from integrating an AV1 software decoder on both iOS and Android devices, and about the current state. Finally, some ongoing and future work will also be presented.

SPEAKER Ryan Lei,Meta
02:10 PM - 02:30 PM
Content Steering with MPEG DASH

Content distributors routinely use multiple concurrent CDNs to distribute their live and VOD content. For performance, contractual, and failover reasons, there are requirements to switch dynamically between these distribution channels at run time. A new specification being developed by the DASH Industry Forum provides a standardized means for a third-party steering service to switch a player between alternate content sources, both at start-up and dynamically while the stream is underway. This talk investigates the mechanics of this steering workflow, including manifest enhancements, player behavior, local steering for 3GPP eMBMS compatibility, steering the manifest itself, and steering ads separately from primary content. We’ll demo a working steering server and discuss compatibility and interop with HLS Content Steering.
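
To make the steering workflow described above more concrete, here is a minimal Python sketch of the player-side decision: given a priority list returned by a steering service, pick the highest-priority content source that is currently usable. The response keys (`TTL`, `PATHWAY-PRIORITY`) are loosely modeled on the priority list used by HLS Content Steering; the abstract does not give the DASH field names, so these keys, the class, and the function names are all assumptions for illustration rather than the specification's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SteeringResponse:
    ttl_seconds: int             # how long the player waits before asking again
    pathway_priority: List[str]  # content sources, most preferred first


def parse_steering_response(payload: dict) -> SteeringResponse:
    """Build a SteeringResponse from a decoded JSON payload (key names are assumed)."""
    return SteeringResponse(
        ttl_seconds=int(payload["TTL"]),
        pathway_priority=list(payload["PATHWAY-PRIORITY"]),
    )


def choose_pathway(
    response: SteeringResponse,
    available: Set[str],
    current: Optional[str] = None,
) -> Optional[str]:
    """Return the most preferred source that is available.

    Falls back to the current source if the steering list names nothing usable,
    and to None if the current source is not usable either.
    """
    for pathway in response.pathway_priority:
        if pathway in available:
            return pathway
    if current in available:
        return current
    return None


# Example: the steering service prefers "cdn-b", but only "cdn-a" is reachable.
response = parse_steering_response({"TTL": 300, "PATHWAY-PRIORITY": ["cdn-b", "cdn-a"]})
print(choose_pathway(response, available={"cdn-a"}))
```

A real player would re-request the steering response after the TTL expires and apply the same selection both at start-up and mid-stream; that polling loop is omitted here.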

SPEAKER Will Law,Akamai
02:30 PM - 02:50 PM
LIVE Q&A

Featuring Bhumi Sabarwal, Chris Ellsworth, Ryan Lei, & Will Law
Moderated by Abhinav Kapoor

SPEAKER Bhumi Sabarwal,Meta
SPEAKER Chris Ellsworth,Meta
SPEAKER Ryan Lei,Meta
SPEAKER Will Law,Akamai
SPEAKER Abhinav Kapoor,Meta
02:50 PM - 02:55 PM
Closing Remarks

Presented by: Abhinav Kapoor & Venus Montes

SPEAKER Abhinav Kapoor,Meta
SPEAKER Venus Montes,Meta

SPEAKERS AND MODERATORS

Abhinav is part of the Video Infra leadership team at Meta, focusing on scaling live video and building new engagement...

Abhinav Kapoor

Meta

Venus Montes is a Software Engineering Manager at Meta working on Video Infra. Her focus is Live Creation infrastructure and...

Venus Montes

Meta

Pranav Saxena is a Staff software engineer/Technical Lead at Meta Reality Labs driving key efforts for running AR capabilities at...

Pranav Saxena

Meta

Yurong is a software engineer in Video Infra at Meta. He primarily works on the interactive media framework, which aims to...

Yurong Jiang

Meta

Ben Hazan is the VP of R&D at Arti.AR, building a cloud-based platform to create and stream AR easily. For...

Ben Hazan

Arti

Luke is a software engineer at Twitch primarily focused on video distribution. Twitch runs our own bare-metal CDN designed for...

Luke Curley

Twitch

Adam Roach

Caffeine.tv

Jordi Cenzano is an engineer specializing in broadcast and online media. He is currently working on the live pipeline optimization...

Jordi Cenzano

Meta

Thomas Higdon is a software engineer at Meta Platforms, Inc. in Cambridge, MA, USA. At Meta, he develops traffic infrastructure,...

Thomas Higdon

Meta

Yuriy Reznik

Brightcove

Naizhi is a software engineer working in real-time communication, currently focusing on cloud streaming for the metaverse.

Naizhi Li

Meta

Anastasis Germanidis is the co-founder/CTO at Runway, which is building next-generation content creation software with artificial intelligence.

Anastasis Germanidis

Runway.ml

I am an Android Software Engineer at Meta working in Video Client Infra. We work to support Media Composition and Upload...

Bhumi Sabarwal

Meta

Chris is an iOS engineer on the Media Platform team at Instagram. His focus is on providing a delightful media...

Chris Ellsworth

Meta

Ryan Lei is currently working as a video codec specialist and software engineer in the Video Infra team at Meta....

Ryan Lei

Meta

Will Law is Chief Architect within the Edge Technology Group at Akamai and a leading media delivery technologist. Involved with...

Will Law

Akamai
