For the full show notes and links visit https://sub.thursdai.news
🔗 Subscribe to our show on Spotify: https://thursdai.news/spotify
🔗 Apple: https://thursdai.news/apple
Ho, ho, holy moly, folks! Alex here, coming to you live from a world where AI updates are dropping faster than Santa down a chimney! 🎅 It's been another absolutely BANANAS week in the AI world, and if you thought last week was wild and that we were due for a break, buckle up, because this one's a freakin' rollercoaster! 🎢
In this episode of ThursdAI, we dive deep into the recent innovations from OpenAI, including their 1-800-CHATGPT phone service and new advancements in voice mode and API functionality. We discuss the latest updates to the o1 model's capabilities, including the reasoning effort setting, and highlight OpenAI's introduction of WebRTC support. Additionally, we explore Google's groundbreaking Veo 2 model, the generative physics engine Genesis, and new developments in open source models like Cohere's Command R 7B. We also provide practical insights on using tools like Weights & Biases for evaluating AI models and share tips on leveraging GitHub Gigi. Tune in for a comprehensive overview of the latest in AI technology and innovation.
00:00 Introduction and OpenAI's 12 Days of Releases
00:48 Advanced Voice Mode and Public Reactions
01:57 Celebrating Tech Innovations
02:24 Exciting New Features in AVMs
03:08 TLDR - ThursdAI December 19
12:58 Voice and Audio Innovations
14:29 AI Art, Diffusion, and 3D
16:51 Breaking News: Google Gemini 2.0
23:10 Meta Apollo 7b Revisited
33:44 Google's Veo 2 and OpenAI's Sora
34:12 Introduction to Veo 2 and Sora
34:59 First Impressions of Veo 2
35:49 Comparing Veo 2 and Sora
37:09 Sora's Unique Features
38:03 Google's MVP Approach
43:07 OpenAI's Latest Releases
44:48 Exploring OpenAI's 1-800-CHATGPT
47:18 OpenAI's Fine-Tuning with DPO
48:15 OpenAI's Mini Dev Day Announcements
49:08 Evaluating OpenAI's o1 Model
54:39 Weights & Biases Evaluation Tool - Weave
01:03:52 ARC-AGI and o1 Performance
01:06:47 Introduction and Technical Issues
01:06:51 Efforts on Desktop Apps
01:07:16 ChatGPT Desktop App Features
01:07:25 Working with Apps and Warp Integration
01:08:38 Programming with ChatGPT in IDEs
01:08:44 Discussion on Warp and Other Tools
01:10:37 GitHub GG Project
01:14:47 OpenAI Announcements and WebRTC
01:24:45 Modern BERT and Smaller Models
01:27:37 Genesis: Generative Physics Engine
01:33:12 Closing Remarks and Holiday Wishes
Here’s a talking podcast host speaking excitedly about his show
TL;DR - Show notes and Links
Open Source LLMs
Meta Apollo 7B – LMM w/ SOTA video understanding (Page, HF)
Microsoft Phi-4 – 14B SLM (Blog, Paper)
Cohere Command R 7B (Blog)
Falcon 3 – series of models (X, HF, web)
IBM updates Granite 3.1 + embedding models (HF, Embedding)
Big CO LLMs + APIs
OpenAI releases new o1 + API access (X)
Microsoft makes Copilot free! (X)
Google – Gemini 2.0 Flash Thinking experimental reasoning model (X, Studio)
This Week's Buzz
W&B Weave Playground now has Trials (and o1 compatibility) (try it)
Alex's evaluation of o1 and Gemini Thinking experimental (X, Colab, Dashboard)
Vision & Video
Google releases Veo 2 – SOTA text2video model, beating Sora by most vibes (X)
HunyuanVideo distilled with FastHunyuan down to 6 steps (HF)
Kling 1.6 (X)
Voice & Audio
OpenAI realtime audio improvements (docs)
11labs new Flash 2.5 model – 75ms generation (X)
Nexa OmniAudio – 2.6B – multimodal local LLM (Blog)
Moonshine Web – real-time speech recognition in the browser (X)
Sony MMAudio – open source video-to-audio model (Blog, Demo)
AI Art & Diffusion & 3D
Genesis – open source generative 3D physics engine (X, Site, Github)
Tools
CerebrasCoder – extremely fast app creation (Try It)
RepoPrompt to chat with o1 Pro (download)
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe