#198 - DeepSeek R1 & Janus, Qwen2.5, OpenAI Agents

2025/2/2

Last Week in AI


People

Andrey Kurenkov

Jeremie Harris

Topics

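@Andrey Kurenkov: The release of DeepSeek R1 is significant both technically and commercially. Technically, the model demonstrates the great potential of reinforcement learning for improving the reasoning ability of large language models, and its training method is relatively efficient and low-cost. Its open release under the permissive MIT license also allows it to be used widely in commercial and research settings. Commercially, the release triggered sharp swings in US tech stocks, reflecting the market's attention to and concern about the model. Although R1's training cost is relatively low, its effect on hardware vendors such as NVIDIA is more complicated. On one hand, efficient training methods could reduce sales of NVIDIA's high-end chips; on the other, R1's success also demonstrates the strong performance of NVIDIA hardware, which benefits NVIDIA's hardware ecosystem.

@Jeremie Harris: The success of DeepSeek R1 is no accident. It builds on the strong DeepSeek v3 base model and was optimized with reinforcement learning. The successful use of reinforcement learning shows that simply rewarding the model for reaching the correct answer can effectively improve a large language model's reasoning ability. During reinforcement learning, R1 discovered and exploited the inference-time scaling law on its own, which suggests that the law is an intrinsic property of AI systems. The implications of R1 for hardware have been misread: it is actually good news for NVIDIA's hardware ecosystem. R1's success shows that more efficient training methods can deliver a higher level of intelligence at lower cost, which is good news for hardware vendors such as NVIDIA. It also highlights the importance of compute in AI development and further underscores the importance of export controls. There is a trade-off between a model's reasoning ability and its human interpretability: models trained with reinforcement learning reason better but are less interpretable.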

Deep Dive

Chapters

This episode discusses DeepSeek's R1 model, which is competitive with OpenAI's o1, and explores its reliance on reinforcement learning for reasoning and its implications for hardware and AI development.

DeepSeek R1 rivals OpenAI's o1 in performance.
R1 uses reinforcement learning, achieving impressive results with fewer resources.
The model's reasoning process is more organic and diverse than traditional methods.
R1's success suggests inference time scaling laws are robust and naturally discovered by AI systems.


Our 198th episode with a summary and discussion of last week's big AI news!
Recorded on 01/31/2025

Join our brand new Discord here! https://discord.gg/nTyezGSKwP

Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Check out our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

- DeepSeek releases R1, a competitive AI model comparable to OpenAI’s o1, leading to market unrest and significant drops in tech stocks, including a 17% plunge in NVIDIA's stock.
- OpenAI launches Operator to facilitate agentic computer use, while facing competition from new releases by DeepSeek and Qwen, with applications seeing rapid adoption.
- President Trump revokes the Biden administration's executive order on AI, signaling a shift in AI policy and deregulation efforts.
- Taiwanese government clears TSMC to produce advanced 2-nanometer chip technology abroad, aiming to strengthen global semiconductor supply amidst geopolitical tensions.

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Timestamps + Links:

(00:00:00) Intro / Banter
(00:03:01) Response to listener comments
Projects & Open Source
- (00:06:26) DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
- (00:30:25) Viral AI company DeepSeek releases new image model family
- (00:34:07) Qwen2.5-1M Technical Report
- (00:38:32) Alibaba’s Qwen team releases AI models that can control PCs and phones
Tools & Apps
- (00:42:09) OpenAI launches Operator, an AI agent that performs tasks autonomously
- (00:47:37) DeepSeek reaches No. 1 on US Play Store
- (00:52:17) Alibaba rolled out Qwen Chat v0.2 and Qwen2.5-1M model
- (00:53:50) Perplexity launches US-hosted DeepSeek R1, hints at EU hosting soon
- (00:55:31) Apple is pulling its AI-generated notifications for news after generating fake headlines
- (00:59:00) French AI ‘Lucie’ looks très chic, but keeps getting answers wrong
Applications & Business
Policy & Safety
(01:33:01) Outro
