Humans communicate using a rich variety of digital media—text, images, videos, audio.
Movie Gen is a cast of media-generation foundation models that lets users generate high-quality videos from simple text inputs, personalize or edit them, and add audio. In human evaluations across all of these tasks, Movie Gen establishes a new state of the art compared to existing systems.
Movie Gen builds upon Meta's track record of foundational research in this space, beginning with the Make-A-Scene models that enabled generation of images, audio, video, and 3D animation. A second wave of work, the Llama Image foundation models, enabled higher-quality generation of images and video, as well as image editing. Movie Gen builds upon these advances while enabling higher-quality outputs and finer-grained control.
We have piloted Movie Gen with Hollywood creatives [3], who have found it to be a useful collaborative tool. You can read more about Movie Gen on our blog [2] and in the technical paper [4], and see examples on our website [1].
Resources
[1] https://ai.meta.com/research/movie-gen/
[2] https://ai.meta.com/blog/movie-gen-media-foundation-models-generative-ai-video/
[3] https://ai.meta.com/blog/movie-gen-video-sound-generation-blumhouse/