Google DeepMind unveils a new video model to rival Sora

2024/12/18

TechCrunch Industry News

DeepMind

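DeepMind: Veo 2 is a next-generation video-generating AI model that surpasses OpenAI's Sora in resolution and duration. It can generate 4K-resolution videos more than 2 minutes long. Veo 2 improves on physics simulation, camera control, and image clarity: it models motion, fluid dynamics, and the properties of light more realistically, and it produces clearer, sharper images and video. Although Veo 2 has made notable progress in some areas, challenges remain, such as maintaining consistency and coherence in long videos and generating intricate details and fast motion. DeepMind is working with artists and producers to improve its video generation models and tools, and it is committed to addressing ethical issues with the model, such as deepfakes and copyright. DeepMind acknowledges that Veo 2's training data comes from public videos and considers training on public data to be fair use. It is working with creators and partners toward shared goals and actively gathering feedback to improve its models and tools.

Eli Collins: Veo 2 will become available through Google's Vertex AI developer platform and will be integrated into the Google ecosystem. Google will keep iterating on Veo 2 based on user feedback and will bring its updated capabilities to compelling use cases across the Google ecosystem. Until Veo 2 is generally available, Google's indemnity policy does not apply. To mitigate the risk of deepfakes, DeepMind uses its proprietary watermarking technology, SynthID, to embed invisible markers in the frames Veo 2 generates. Veo 2's training data includes high-quality pairings of videos and descriptions.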

Deep Dive

Key Insights

What is Veo 2 and how does it compare to OpenAI's Sora?

Veo 2 is Google DeepMind's next-generation video-generating AI, capable of creating 2-minute-plus clips in resolutions up to 4K (4096x2160 pixels). Measured by pixel count, that is about 4x the resolution of OpenAI's Sora, and at least 6x the duration: Sora can produce clips of up to 1080p and 20 seconds. However, in Google's experimental tool VideoFX, Veo 2 videos are currently capped at 720p and 8 seconds.
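The resolution and duration multiples quoted above can be checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the episode notes: "1080p" is taken to mean 1920x1080 pixels, "720p" to mean 1280x720 pixels, and "2-minute-plus" is treated as a lower bound of 120 seconds.

```python
from fractions import Fraction

def pixel_count(width: int, height: int) -> int:
    """Total pixels in one frame of the given dimensions."""
    return width * height

# 4K figure as quoted in the notes.
VEO2_4K = (4096, 2160)
# Assumed frame sizes for the "1080p" and "720p" labels.
SORA_1080P = (1920, 1080)
VIDEOFX_720P = (1280, 720)

# Durations in seconds; 120 is an assumed lower bound for "2-minute-plus".
VEO2_SECONDS = 120
SORA_SECONDS = 20
VIDEOFX_SECONDS = 8

# Headline comparison: Veo 2 at its stated maximum versus Sora.
resolution_ratio = Fraction(pixel_count(*VEO2_4K), pixel_count(*SORA_1080P))
duration_ratio = Fraction(VEO2_SECONDS, SORA_SECONDS)

# How far the current VideoFX cap sits below the stated maximum.
videofx_resolution_gap = Fraction(pixel_count(*VEO2_4K), pixel_count(*VIDEOFX_720P))
videofx_duration_gap = Fraction(VEO2_SECONDS, VIDEOFX_SECONDS)

print(f"Veo 2 vs Sora, pixels per frame: {float(resolution_ratio):.2f}x")
print(f"Veo 2 vs Sora, clip length: at least {float(duration_ratio):.0f}x")
print(f"Stated maximum vs VideoFX cap, pixels per frame: {float(videofx_resolution_gap):.1f}x")
print(f"Stated maximum vs VideoFX cap, clip length: at least {float(videofx_duration_gap):.0f}x")
```

Under these assumptions the pixel-count ratio comes out slightly above 4x, which matches the rounded "4x" figure, and the VideoFX cap is well below Sora's limits on both resolution and duration.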

What are the key improvements in Veo 2 compared to its predecessor?

Veo 2 features an improved understanding of physics and camera controls, producing clearer footage with sharper textures, especially in scenes with movement. It can more realistically model motion, fluid dynamics, and properties of light like shadows and reflections. Additionally, it offers enhanced camera positioning and movement for capturing objects and people from different angles.

What are the limitations of Veo 2 in video generation?

Veo 2 struggles with coherence and consistency over long durations, particularly with complex prompts. Character consistency, intricate details, and fast, complex motions remain challenging. The model also exhibits issues like lifeless eyes in animations, physically impossible facades, and blending of pedestrians and backgrounds.

How is DeepMind addressing ethical concerns around Veo 2's training data?

DeepMind uses prompt-level filters to mitigate risks like regurgitation of training data and employs its proprietary watermarking technology, SynthID, to embed invisible markers in Veo 2-generated frames. However, the lab does not offer a mechanism for creators to remove their works from existing training sets, maintaining that training on public data is fair use.

What role do creators play in the development of Veo 2?

DeepMind collaborates with creators like Donald Glover and The Weeknd to understand their creative processes and refine its video generation models. Feedback from these collaborations informed the development of Veo 2, and DeepMind continues to work with trusted testers and creators to improve the model.

What other AI model upgrades did Google DeepMind announce alongside Veo 2?

Google DeepMind announced upgrades to Imagen 3, its commercial image generation model. The new version creates brighter, better-composed images in styles like photorealism, impressionism, and anime. It also follows prompts more faithfully and renders richer details and textures. UI updates to ImageFX include chiplets for key terms in prompts, allowing users to iterate or select auto-generated descriptors.

Chapters

Google DeepMind's Veo 2 boasts higher resolution and longer video generation capabilities compared to OpenAI's Sora, although current implementations have limitations. Future plans include wider availability via Vertex AI and integration into the Google ecosystem.

Veo 2 generates longer videos (2+ minutes) at higher resolution (4K) than Sora.
Currently available in Google's VideoFX tool with limitations on resolution and duration.
Future release on Vertex AI and integration into Google products planned.

Shownotes Transcript

Google DeepMind, Google’s flagship AI research lab, wants to beat OpenAI at the video generation game — and it might just, at least for a little while. On Monday, DeepMind announced Veo 2, a next-gen video-generating AI and the successor to Veo, which powers a growing number of products across Google’s portfolio.

