
Weekly AI News Round Up

2024/12/11

The Daily AI Show

People
Beth
Brian
Carl
Jyunmi
Topics
Brian: Gave a detailed recap of the ChatGPT Pro, Sora, and Canvas updates OpenAI released, analyzed what these updates mean for developers and users, and shared expectations for the announcements still to come.
Beth: Joined the discussion of the OpenAI updates and was especially excited about the Canvas update and its integration with custom GPTs, explaining in detail how to enable Canvas in previously built custom GPTs and how the feature improves working efficiency.
Carl: Covered the Canadian government's investment in Cohere's AI data center project and the Alberta government's strategy to attract AI development, analyzed what that investment means for Canada's AI industry, and introduced Kevin O'Leary's proposal to build the world's largest AI data center industrial park along with its potential impact.
Jyunmi: Focused on AI in healthcare and quantum computing: the Deep Health and RadNet study showing that AI-assisted mammogram screening improved breast cancer detection by 21%; YouTube's auto-dubbing feature, its impact on knowledge-focused content, and where the feature may go next; Sakana.ai's NAMs (Neural Attention Memory Models), how they improve Transformer efficiency and performance, and their potential in robotics and in-home elder care; and Google's quantum chip Willow and its breakthrough in computing speed.

Deep Dive

Key Insights

What are the key updates from OpenAI's first four days of announcements?

OpenAI announced ChatGPT Pro at $200 per month, which includes access to a more capable version of the o1 model. They also released Sora: standard paid accounts get 50 generations per month, while Pro users get relaxed limits and resolution up to 1080p. Additionally, Canvas was updated and integrated directly into GPT-4o, making it easier to use, including inside custom GPTs.

How does OpenAI's Canvas update benefit developers and users?

Canvas is now integrated directly into GPT-4o, eliminating the need to manually select it for tasks. Users can enable Canvas in previously built custom GPTs, allowing side-by-side editing and inline use in advanced prompts, which enhances productivity and flexibility.

What significant improvement did AI bring to breast cancer detection in a recent study?

A study by Deep Health and RadNet showed a 21% improvement in breast cancer detection using AI in mammogram screenings, demonstrating the potential of AI to enhance healthcare outcomes.

Why did the Canadian government invest $240 million in Cohere's AI data center?

The Canadian government invested $240 million in Cohere's AI data center as part of a $2 billion sovereign AI compute strategy, aiming to advance national AI infrastructure and economic development.

What is the potential impact of Google's quantum chip Willow?

Google's quantum chip Willow can reduce errors exponentially as it scales up to more qubits, and it performed a benchmark computation in under five minutes that Google says would take today's fastest supercomputers 10 septillion years, a result with major implications for computational science that Google's announcement even tied to the idea of computation occurring across parallel universes.
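To put that figure in perspective, here is a quick back-of-the-envelope check. It assumes the short-scale reading, where 10 septillion is 10^25, and an age of the universe of roughly 13.8 billion years; the numbers are only the ones quoted above, not anything from Google's benchmark itself.

```python
# Back-of-the-envelope comparison of the quoted Willow benchmark figure.
# Assumptions: "10 septillion" = 1e25 years (short scale); universe age ~1.38e10 years.
quoted_supercomputer_runtime_years = 1e25
age_of_universe_years = 1.38e10

ratio = quoted_supercomputer_runtime_years / age_of_universe_years
print(f"Quoted runtime is about {ratio:.1e} times the age of the universe")
# Prints roughly 7.2e14, i.e. hundreds of trillions of universe lifetimes.
```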

How does YouTube's new auto-dubbing feature work, and who benefits from it?

YouTube's auto-dubbing feature converts audio into multiple languages for knowledge-focused content, benefiting creators by automating the process and expanding their reach to global audiences without the need for manual translation.

What is the significance of Sakana.ai's evolved universal transformer memory (NAMs)?

NAMs optimize how transformers store and retrieve information by selectively pruning memory, similar to human memory processes. This enhances efficiency and performance across various tasks, enabling superior results with less memory and cross-domain mastery without additional training.
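Sakana.ai's published models are evolved and considerably more sophisticated, but the core intuition, dropping cached tokens that recent queries barely attend to, can be sketched in a few lines. The function below is a hypothetical toy, not Sakana's implementation: it scores each cached token by the average attention it received and keeps only the top fraction.

```python
import numpy as np

def prune_kv_cache(attention_scores: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Toy sketch of selective memory: retain only the cached tokens that
    recent queries actually attended to.

    attention_scores: (num_recent_queries, num_cached_tokens) attention weights.
    Returns the sorted indices of cached tokens to keep in the KV cache.
    """
    importance = attention_scores.mean(axis=0)      # average attention per cached token
    k = max(1, int(keep_ratio * importance.size))   # how many tokens survive pruning
    return np.sort(np.argsort(importance)[-k:])     # indices of the most-attended tokens

# Example: 4 recent queries attending over 8 cached tokens.
rng = np.random.default_rng(0)
scores = rng.random((4, 8))
print(prune_kv_cache(scores, keep_ratio=0.5))       # indices of the 4 tokens kept
```

The real models learn what to keep rather than applying a fixed average-and-top-k rule like this, which is part of what lets the same memory model transfer across tasks without retraining.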

What challenges do home care robots face in assisting older adults?

Home care robots need to navigate complex environments, handle unexpected situations, and prioritize tasks effectively, much as human caregivers do. They also need acceptance and understanding from the public in order to meet the growing demand for elder care.

How does Google's Gemini 2.0 aim to transform AI agents?

Gemini 2.0 introduces multimodal capabilities, allowing AI agents to process text, images, and audio together. It also features tool use, function calling, and memory, enabling real-time applications like tutoring and spatial computing, which could revolutionize industries like education and robotics.
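The Gemini SDK surface was changing quickly at the time, so rather than quote its API, here is a hypothetical, framework-agnostic sketch of the tool-use loop described above: the model emits a structured function call, the application executes it, and the result is fed back so the model can compose the final answer. Both `model_decide` and `get_weather` are invented stand-ins, not real Gemini calls.

```python
import json

def get_weather(city: str) -> dict:
    """Example tool the agent is allowed to call (stubbed data for illustration)."""
    return {"city": city, "forecast": "sunny", "high_c": 21}

TOOLS = {"get_weather": get_weather}

def model_decide(user_message: str) -> dict:
    """Stand-in for the model: given a tool schema and the user's request,
    a real multimodal model would emit a structured call like this."""
    return {"tool": "get_weather", "arguments": {"city": "Calgary"}}

def run_agent(user_message: str) -> str:
    call = model_decide(user_message)                  # 1. the model picks a tool and arguments
    result = TOOLS[call["tool"]](**call["arguments"])  # 2. the application executes the tool
    # 3. in a real loop, this result goes back to the model to write the reply
    return f"Tool result to feed back to the model: {json.dumps(result)}"

print(run_agent("What's the weather like in Calgary today?"))
```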

What is the significance of Stainless' SDKs for AI platforms?

Stainless' SDKs simplify the development process for platforms like OpenAI, Anthropic, and Meta by providing cohesive software development kits. This reduces the need for developers to navigate extensive documentation, accelerating the creation of AI-powered applications.
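As a rough illustration of what a generated SDK saves you, compare a hand-rolled HTTP request with the same call made through OpenAI's official Python SDK, one of the kits Stainless is known for generating. This is a sketch only: it assumes an `OPENAI_API_KEY` environment variable and uses `gpt-4o-mini` purely as an example model name.

```python
import os
import requests               # raw-HTTP route
from openai import OpenAI     # SDK route (pip install openai)

api_key = os.environ["OPENAI_API_KEY"]
messages = [{"role": "user", "content": "Say hello in one sentence."}]

# Without an SDK: build the headers and JSON body yourself, then dig the
# text out of the raw response dictionary.
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "gpt-4o-mini", "messages": messages},
    timeout=30,
)
print(resp.json()["choices"][0]["message"]["content"])

# With the SDK: a typed client that handles auth, retries, and response parsing.
client = OpenAI(api_key=api_key)
completion = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(completion.choices[0].message.content)
```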

Chapters
OpenAI's twelve days of announcements included the release of ChatGPT Pro, Sora, and updates to the Canvas tool. The updates focus on improved developer tools and user experiences. The team speculates on what the remaining announcements might entail.
  • Release of ChatGPT Pro, Sora, and Canvas updates
  • Focus on developer tools and user experience improvements
  • Speculation on future announcements

Shownotes Transcript


The Canadian government invested $240 million to fund the development of Cohere's multi-billion dollar AI data center. They were able to boost detection by 21%. Think about an agent trying to work past a pop-up. An evolved universal transformer memory.

Aloha, everyone, and welcome to The Daily AI Show. This is our weekly AI news wrap-up because Wednesday is News Day.

So what we do is we gather all the stories that we are most interested in, most excited about, and the news that we think that everyone should know about, and share it with you and each other. Today, I'm joined by Beth, Brian, Carl, and I'm Jyunmi. So before we open it up to the floor, just want to let you know that we appreciate you joining us every weekday morning for our live show, but...

Because of the 12 days of OpenAI, we have been doing some live streams around that as well, getting our initial reactions to all the crazy news that they've released, like o1 and Sora. I so hate myself for missing that one, but I'm sure I'm going to bring it up soon enough on its own show.

And also, you can always keep up to date with all the latest information with our newsletter. Just go to thedailyaishow.com and you can sign up there. Every issue builds upon the previous one and gets better and better each time. So I really suggest everyone join up and sign up for that.

So let's get into it. Does anybody have a pressing piece of news that they really, really want to talk about? Well, I think honestly, we are doing, like you just said, Jyunmi, we are doing the second live show that specifically is about OpenAI and their 12 days of OpenAI. But maybe it's worth just doing, I definitely don't want to lose the rest of the news show, but maybe it's worth just doing like a less than three minute recap of where we've gotten to so far. Because today, December 11th, is day four

of the 12 days of OpenAI. So we have had four announcements so far, starting back with last Thursday. So definitely after our last news show. So, um, I'll try my best to do a recap, but you guys add in what I miss here. Okay. Day one of the 12 days, we got o1 coming out of preview, and the other big, uh,

announcement was they announced ChatGPT Pro. That's the $200 per month. And there was a lot of discussion right off the bat. What do you get with that $200? Would you even qualify in terms of, would you need the better, bigger model of o1 when it literally just came out of preview, but they're saying there's this better one? Well, since then, on another day, we also learned...

about Sora, which we'll get to in a second, that you get more with that Pro. So that was day one. Those are big ones. Day two was sort of a continuation, but it was more about, um,

how developers could use o1. And it was more of a developer type announcement, but still really, really cool. We can dig into that one a little bit more. Day three, coming on Monday of this week, we got the announcement that Sora finally was released. And not only was it announced, it was released. I had access to it within minutes, almost live as Beth and I were doing the reaction that day. And I've since used it several times. You can use it 50 times per month.

That's going to be the limit for most of us. But as I mentioned, with the ChatGPT Pro account, the $200 a month account, it's throttled but sort of unlimited. It just depends on where you're using it, how you're using it. You also can go up to 1080p, where you cannot do that on the 50-per-month tier, the paid accounts that most of us have, like the $20 or $25 ones. So that was day three. And then day four, yesterday,

I missed the live, but I watched it back with Carl and Beth and they announced some updates to Canvas. So I'm really excited about yesterday. I mean, I love the other stuff, too, but this one really hit for me. Not only did they make Canvas available to everybody, I didn't actually realize before I heard that that it wasn't available to everybody. So that was actually news to me. So they did sort of do like they started with a general version.

of what Canvas can do. And I was like, oh, this isn't really news, but I get it that not everybody had it. So that's why they were showing that. What I really liked is that they integrated it directly into 4o, meaning that you don't really have to select it, kind of like DALL·E back in the day. You used to have to say, oh, I want to create an image or I want to have a chat. And you don't have to do that anymore. Now it's just integrated. 4o should be smart enough to pull it in wherever you want. However,

What I immediately did yesterday, and I heard Carl talk about this in the review, is he said, "Oh, Brian's going to want to know about the custom GPTs." You can, in fact, turn it on for previously built custom GPTs, which I did yesterday. It worked immediately. I even integrated it into an advanced prompt where I had seven tasks

buried into that advanced prompt. I could go to each section and tell it independently when I wanted it to use Canvas, which is really helpful, because sometimes those first couple of tasks are gathering, gathering, gathering information. I don't need Canvas for that. But

when I get to the result, the product, the report, the summarization, the transcript recall, whatever it is, that's when you want to open up that side panel and have side-by-side editing. And that is very cool, that you can inline edit and make that part of it. So

I don't know how quick I did it, probably not three minutes, but that's what we had in the first four days. And of course, we have, you know, quick math here, eight more days to hear what else they're going to do, which we'll be doing live shows for, like you said, the rest of this week and all of next week. They skipped the weekend. What did I miss? I kind of forgot what day it was.

No, I don't think you missed anything, but I want to pull up Gwen's comment. Gwen, the reason that you didn't see Canvas in your custom GPTs is because it's not automatically applied to the old custom GPTs, so that nothing breaks. You have to turn it on for your previous ones. Yes.

Go down to where you can check web search and all of those things. You should see a checkbox for Canvas there, and then it will turn on for your GPTs.

Yeah, I just wanted to clarify. You just said what I was like, wait, hold on. You're right, Beth. I just want to make sure it's good because I misunderstood this yesterday too. It appeared that they said initially like in the way they said it, oh, you won't have Canvas for your previously built custom GPTs. No, no, no. Like Beth just said, if you're the editor, if you're the creator, you can go into those custom GPTs and turn that feature on. And also you can turn it on just generally, but like I said, you can also do line item prompting.

to add it into the prompt at certain spots. And I did both of those yesterday and it worked wonderfully. So that was great, that it worked. There weren't hiccups. It was super, super cool to see that happen. And if you're just doing regular prompting, it should just show up when it should show up. I mean, that's the whole idea: if you're asking for a report or a summary,

ChatGPT-4o should now be smart enough to say, hey, you know what? That's a good case for using Canvas, and you should just see it show up on the right side of your screen. Whether it hits or misses, that's to be seen, I think.

Yeah, definitely exciting. Already started using it when I decided to build a custom GPT to help me build Sora prompts. Right? Okay. And I'm really excited for today. We'll see what comes up. Like I said, missing those first few days. Of course, they came out with the big swings that everybody wanted in the first one. So I'm really interested to see on the back half what's going to happen.

DALL·E updates. Right? Let's hope. But I do want to open up with a new story. Again, I always like AI and science. So my first one is a story about Deep Health, which just did a study.

And they're owned by RadNet, which is a big radiology company for reading your charts and things like that.

And they just did a study with almost 750,000 women who underwent mammogram screenings, right? And over this 12-month period, they found that by opting into this AI program, which unfortunately the patients had to cover the cost of, but

When they did, it was basically getting an extra set of eyes on the reports to see if they can pick out, you know, help identify, you know, cancer, breast cancer specifically. And if they did, they were able to boost detection by 21%. Now, okay.

That's huge. That's huge. That is an applied, you know, we've been hearing how AI in the process of science has improved things by taking care of the, you know, the mundane tasks of science, you know, going through thousands of tests and things like that. But to improve detection by 21%,

Admittedly, this is early numbers. This is not 100%. They need to go through and analyze how this really breaks down. But that's what the early review is, that it boosts detection by 21%, which to me is huge. That was one of my more exciting AI and science news because

We're talking about real numbers, real quick, showing in an applied way how AI can just help, right? So does anyone have another story? 'Cause I got like three more loaded up. - I have some, but I'll probably hold off. But here's the thing, I think somebody is gonna say the one that I'm the most excited about, but that's okay. Let's see if anybody else brings it up. - Okay, okay. - Jyunmi, you're my guess on who's gonna bring this one up. Go ahead. - Okay. Would you say Carl?

Oh, so I wanted to take this time. So Jen, this is for you and any and all Canadians. This is purely Canadian news. There are two of them here. One is Cohere, one of the labs that isn't talked about that much. They will get an investment

of 240 million Canadian dollars. The Canadian government invested $240 million to fund the development of Cohere's multi-billion dollar AI data center, and it's part of the $2 billion Canadian Sovereign AI Compute Strategy. So that will be for Cohere. Now, related to that was an announcement by the Alberta government.

So the Alberta government announced a strategy, so it's not money, aimed at making the province the most attractive place to build AI.

It's a strategy, so I don't know what specifically relates to that. But related to that, does everybody here know Kevin O'Leary? You know, Dragon's Den? Bald guy? Yeah, yeah. Yeah, yeah, yeah. So anyway, he proposed a $70 billion AI data center, most people don't even know this, in the municipal district of Greenview, and it would really be the world's largest AI data center industrial park. Yeah.

Now, they actually created a video on this that I actually want to play because it's AI generated. So I'm just going to share this. It's kind of hilarious. And yeah, here we go. Hold on a second. Let's get it up on the screen. Can you see it? Yeah, I can see it. Perfect. So, okay. Here we go. There's a valley, a valley where technology and nature will come together in perfect harmony.

where the future will pulse at every turn, bursting with nature and wildlife, nurturing the present as it embraces the future. The valley where the brightest minds unite, where electrons, machines, and data come together in an algorithmic dance. Wonder. It's not just a name, it's a mission. A data center campus that will sustainably marshal in a new era of innovation. The single largest AI compute data center park on Earth.

Nestled quietly within 7,000 acres of forest land with trails, wildlife and recreation, and beauty. Welcome to Wonder Valley. Yesterday is history. Tomorrow's a mystery. Let's shape the future together. And that is that. I don't know how...

From a purely Canadian perspective, it's exciting because A, no one knows where, I don't even know where the municipal district of Greenview is, but that's fascinating. And it's in North, I think Northwest Alberta. But what's really funny about that is when you see that video, you're like, huh, that is maybe in the two months that we get sun here. No, I'm sorry. It did seem very nice there. I will say that. I was like, what part of Canada is this in? Exactly.

There seemed to be a lot of fall foliage going on. I would be very interested to have like an office and then you have deer right beside. I was like, that doesn't happen. And most of the time it'll probably look like dreary and it's cold, but it's really cool. I think that's kind of exciting that O'Leary proposed that for that area. That's really amazing. And to be the world's biggest AI data center, I was like,

Hey, you know, things are looking up for Canada. I just wanted to showcase that news, because I think that was pretty big. $70 billion, yeah, for a data center up here. So it's pretty amazing. It'd be interesting to see what kind of pushback there is, if any, or, you know, maybe there's more to the Wonder Valley plan and stuff. I don't know. You know, sometimes these things are just well-put-together videos and there's not a whole lot of

oomph behind it yet. It's just a concept to get people excited. But if there's more of a plan there, I'd be curious to see what

what people's pushback might be about having data centers so ingrained with nature, you know, and being surrounded by, like, is that a good thing? Are people going to push back pretty quickly and say, oh, no, the warmth that a data center gives off, yada, yada, yada, or it's going to mess with natural resources? That's where, as I look at that, I go, oh, that looks great. But I am curious what the initial pushback would be on something like that.

And by the way, that's not a Canada problem. I would say that about anywhere in the world, you know, it looks great. Oh, it's surrounded by nature. And then, but what I expect people to be like is, yeah, that's the problem.

What's the infrastructure plan? Because if it is out in the middle of nowhere, they're going to have to dig a ditch for all of the power and communication lines and everything like that. That's not a problem up here, especially where up north we have

like, if you've ever been to the actual oil sands, there are no problems digging ditches, and that is not an issue. And in terms of, like, so, uh, in Calgary, I'm not sure in

the municipal district, but we get the most days of sun per year, so 330. So I'm curious to see; I would imagine solar or wind would probably be, um, part of the power perspective on that, because

Yeah, most people don't know. It's actually very lucrative to build solar and wind power in Alberta. So I would imagine that would be the tout. It would be like, hey, you know, we're building this data center, but it's also environmentally friendly because of, you know, the solar and wind that's powering it. So there's a lot of this. So, Carl, you're saying you're going to get us into the

Daily AI Show solar and power business, right? It's another million dollar idea.

If you all want to move to the municipal district of Greenview, I would be more than happy. Hey, there's going to be the largest data center in the world there. Why wouldn't we move there? I mean, it sounds like it's the right time to buy a house if you ask me. Right? I would like to know where it is, what amenities are around there, and I just want to make sure you are well aware of where you're going to. Oh, yeah.

The way that it made it look based on the plan, that's like two Apple campuses. Exactly, especially when it rises. Oh, right. Fantastic. Yeah. Yeah, it's... So during the winter it brings it down. Perfect, perfect, exactly like that Apple TV show Silo. Sorry, I'm digressing. So let's move on with the news. I got a couple of quick hits, unless, Beth, you got one, uh

No? Okay. All right. So, quick hit, YouTube, their new auto-dubbing feature is now available for knowledge-focused content, and this is coming in from TechCrunch. So, they came up with this auto-dubbing feature that will just dub your content, your audio, into various languages. So far, it's been a very slow beta type of release.

Um, but now they want to apply it to channels that are knowledge focused. Now, the examples that are given in the article are things like, you know, cooking shows and, you know, how to make this stuff. But I, uh,

If YouTube is monitoring all of their content, I would like to make the case for the Daily AI Show, since we're knowledge-focused content, to allow us to try the auto-dubbing feature. Because it is something that we've talked about for quite a while, as part of our post-production process, to add, when can we add in...

subtitles in additional languages or audio in additional languages? And then, you know, what are the feasibilities of doing the lip sync auto dubbing, you know, features that we've seen in things like ElevenLabs and whatnot. So YouTube,

I know you watch us because OpenAI watches us. Brian said the other day, OpenAI watches us. They got the idea about space data centers from us. So, you know. Directly from us. Okay. It's okay. It's okay, guys. You can grab ideas from us anytime you want. It's all right. Okay. All right. I have a question about that, though. Does the audio dubbing that YouTube just released include the lip...

- Lip sync auto dubbing? - No, no, it's not lip sync auto dubbing. That would be an amazing leap. It's basically just converting your audio into other languages, which is a huge advancement in and of itself. I know we're a little spoiled because of things like ElevenLabs going, "Oh yeah, now we'll make it look like you were actually speaking in that other language."

But for now, it's a matter of just converting your audio into another language. - If you share the screen real quick, I'll show it real quick. - Absolutely. - Okay, so this is a video that was originally in French. - Okay, and then if we go over here, we can actually see that there's an audio track option for English. If we switch over. - We will butter the gratin dish and place the potatoes and taste inside.

So it definitely sounds AI, right? Yeah. But it's still pretty cool. It is very cool. The coolest part about it is the auto part. You don't have to process your own audio. You don't have to do any of that. YouTube will do that part of the process, just like they introduced auto captioning. Still not the best version of subtitling or captioning, but...

Just the fact that they would do that on the background as part of their normal process, which I think is a huge benefit to creators, but of course, a bigger benefit to YouTube because that means more of their content can be distributed to more regions and get more attention. - Attention is all you need. - So, quick add to this. If it starts in English, so if it's us and we're talking in English,

It says that eventually, once we're included in all this, it'll be translated into French, German, Hindi, Italian, Spanish, Indonesian, Japanese, and Portuguese. However, if it starts in one of those languages, it will only be auto-dubbed back to English. So if you start with English, you get all of those, 1, 2, 3, 4, 5, 6, 7, 8, different languages automatically, uh,

Or if it's in one of those languages, like the French one we just saw, it will be auto-dubbed back to English. That is a giant swath of the world that you're able to connect with. Let's just say this show. And maybe it's, I mean, we're not going to, I'm not going to know if it's perfect in Hindi. I wouldn't know, you know, but hopefully it's good enough that like when we're having conversations, even if it does sound AI generated Hindi, it's,

I mean, if it gives somebody in that part of the world the opportunity to listen to our show and maybe gain insights from it that they wouldn't typically have, I love this. And obviously, I can't wait for this to be available to everybody. I just think it really opens up the world. There's so much amazing content being created, but oftentimes it is just limited by the original content.

language, and YouTube is making it, as you said, easier for content creators to know, hey, I only have to do this once. I don't have to do the old Mr. Beast version where I create all new channels in Spanish.

it's going to be done for me. And that's, that's a huge lift, you know, for us. Uh, so I love it. I love it. I love the idea that somebody very soon in Japan will be like, I love your show. And I tune in all the time, but they're saying it in Japanese to us. And we're having to translate their comment to say, Oh, they're watching our dubbed version in Japanese. That's super cool. I also want to see the reverse because the, the fact that I can watch, uh,

a show from Japan or a content from Japan and get insight in what is trending over there. What kind of, what's, what's the zeitgeist as you put it, you know, because you can, you can gather a lot from a region based on what is in, you know, what they're talking about.

And that language barrier is an issue. It slows the flow of information. So I want to see the viewpoints and the ideas and things like that from other regions to inform me about what's going on in the world.

I'd rather hear a local creator talking about something that's important to them and hearing it from them than me having to get it filtered through a news agency or from Google or from Perplexity or something like that. So I was really excited about that one.

So Beth, you got something? - Yes. Japan makes me think of our friends at Sakana.ai. - That's the one. - I love it. I love this story. - All right, so Brian pop in. But if we remember Sakana.ai are the people who developed the evolutionary merge, model merge, evolutionary model merge, where you could take two different models

merge them and get greater than the sum of its parts, a new model without having to do the training that was involved in the original two models, right? So they're fascinating. We're the only people I know who are talking about it. So like, hey,

And I would love to do some interviews and it would be very cool for that to be dubbed into Japanese or do the interview in Japanese and for that to be dubbed in English. Like this is an exact use case. So what they revealed or what they released the research paper about is

an evolved universal transformer memory. And basically, what that is, and sort of the rationale or the inspiration that they use for the research that they do, is that it's looking at what our human brains actually do with our memory. We actually don't keep all of the data. We record the data and then we delete

redundant details, right? So we are pruning our own memories as we're learning. So yeah, that's super, like yes, and that makes memory more efficient, that makes memory more powerful. Brian, what am I missing?

No, you pretty much got it. They're calling this a new kind of memory. They call it the NAM. I'm assuming that would be the acronym, but that's Neural Attention Memory Models. And they optimize how transformers store and retrieve information, unlocking unprecedented efficiency and performance. And they say that's got some supercharged results. With NAMs, transformers achieve superior results

across a wide range of language and coding tasks while requiring less memory, and then there's cross-domain mastery. Trained solely on language, NAMs can be applied to vision, reinforcement learning, and other domains without additional training, which is similar to kind of what we were talking about yesterday, and they mentioned it yet again in here. It says fully evolved NAMs

trained on language could be zero-shot transferred to other transformers. That's exactly our conversation yesterday, even across input modalities and task domains. This is way more than we can get into right now on a news show. I guarantee you we will do a show on this, on NAMs, because it's so cool. Again, Sakana coming out with just some honest to God, like we always say it, just some really cool, move-the-needle type

improvements in technology. And it's just super cool. I feel like I've just dipped into this. I haven't even looked at the actual research paper. I'm just reading their blog post, the Sakana AI one, and what they're talking about. But once again, I mean, what they're talking about with the selective memory, they've got an image here. Let's talk about, if you, you know,

LLMs mess up a lot with, you know, Jan is seven years old and his brother Alex is twice his age. How old is Alex? Well, the answer is 14, right? Or: how old is Alex? Alex is seven, because his brother is twice his age. Its selective memory knows to keep the right answer and eliminate the wrong answer. Because what it's saying is, and we all know this, is when you try to hold everything in memory, it confuses the

the heck out of the language models. We just talked about this yesterday with the whole idea of the traffic lights. If you put everything in there, it gets awful confusing to figure out what it should be looking at and what it should. So the answer isn't everything always in memory. The answer is selective pruning of memory, much more like humans do it. We tend to connect and fire the neurons on things that we need to keep for long-term memory, like how to chew and not stab ourselves in the face with our fork, right, Beth? Yeah.

We just talked about it as an inside thing with me and Beth the other day. And then also things that only need to be remembered for a short period of time, you know? So I think it's really, really exciting. And obviously, yes, I'm so glad you brought it up, Beth. My guess was Jyunmi was going to do it, but maybe Jyunmi didn't...

Just beat me to it, basically. Yeah, I've always been impressed with the Sakana folks. Yes, they're small and quiet and then drop these huge movements in the space. I think this right here, these NAMs, is going to be...

That silent background thing that's just gonna enable, you know, huge leaps forward into, like, embodied AI, right? So if your robot no longer has to remember every possible thing that's around them, they look at it more like a human: this has priority, or this is what's important here. That ability to discern is huge, because we talked about it before: for LLMs,

there isn't this curation process, right? And so this is that step towards that curation process. So if we can see this in other applications or this is the basis for doing that kind of curation, then we're going to see some like,

Sci-fi futuristic jumps in. It's like, oh yeah, this is my humanoid home care robot that knows that these pills are important for me, and not this other frivolous thing. The donut's not important. The pills are important to take every day. Those kinds of things. Which leads me into my next science story,

where Chiba University, and we seem to have a Japan streak going on right now, so Chiba University is doing a study on how to empower older adults with home care robots. So according to their estimates, by 2030, one in every six individuals will be over 60 years old.

That's a sixth of the entire population. That's well over a billion people, right? And with that, home care is going to increase. The need for home care is going to increase. And we don't have enough people going into home care or elderly care to fill that gap.

So they're thinking robots are going to fill this, embodied AI is going to fill this role, right? And we've touched upon that here and there. But

But right now, there isn't that acceptance or information, understanding, acceptance of what that's going to look like. So their big study is to find out how they can alleviate that and then create applied solutions for it. So I thought that was interesting news. All right.

Uh, I got one because G brought it up, uh, but it's one I would want to bring up anyway. Um, he was saying, did, I'm sorry, this is Justin, I apologize. Justin was saying, did anyone catch the news on Google's new quantum chip Willow? And, uh, Justin and everybody else, you might enjoy this. So the first thing I saw pop up on YouTube was one of their engineers or somebody. Um, and this, I'll call him a kid 'cause he looked young. Um,

Obviously, he's working on quantum computers. But he did this presentation of it. And I watched all six minutes of this presentation. I was like, I barely understand anything

he just said. And I was like, wow, that was really not over my head, it was just flying past me. I was like, okay, what are we trying to say here? So I had to do some other research on it since then. So Justin, I'm glad you brought it up. But this is just from Google. So it says, and this is them, not me:

I'm delighted to announce Willow, our latest quantum chip, which has state-of-the-art performance across a number of metrics, enabling two major achievements. So this is what I really wanted to share, because this is what I was missing. The first is that Willow can reduce errors exponentially as we scale up using more qubits.

This cracks a key challenge in quantum error correction that the field has pursued for almost 30 years. So that's number one. Second, Willow performed a standard benchmark computation in under five minutes that would have taken one of today's fastest supercomputers 10 septillion years, a number that vastly exceeds the age of the universe. Yeah.

So they built a computer that could solve a problem in five minutes that would take longer than the universe to do with today's fastest supercomputers. So just a small improvement in speed there, just a tiny one. So yeah, there you go. There's the two major things they felt like they cracked. It is a huge announcement. I think it was...

Sort of announced during these 12 days of OpenAI, which, you know, has gotten a lot of the news, but it is a really, really big deal. How it's deployed, who uses it, who this makes the biggest impact for, I'm not quite sure. But very, very interesting technology. As I watched this kid explain it, with pluses and minus signs and the whole thing, I was like, that looks awesome. I don't know what I'm looking at.

Right? And one of the hypotheses about why and how it could do that many calculations is that it was doing calculations in the multiverse, in alternative, in other universes. That was like the last line of the post, too. Yeah. Really? Yeah, it was just like a throwaway. Yeah, so I was watching, was it Nate?

Nate Johnson? Jones. Jones. Sorry, Nate Jones. Sorry about that, Nate. So I was watching, because he was covering this bit of news after Carl dropped it in our back channel. And

Yeah, the throwaway piece of that is Google believes it confirms the existence of the multiverse by this one line, just throwaway line about, oh, we can do this because of the multiverse. I'm just like, wow, just a reality there, Google. You know the thing about timelines? Actually a thing. Like...

I'm surprised he didn't say like, "Who's seen Loki?" -Right. -Exactly. That's what we're talking about. That's what they should name the-- I know they called it Willow, but they should definitely name the first data center, Quantum Data Center Loki, just for us popular IP folks. Or what was that, Back to the Future, where they show the alternative?

Is that Back to the Future? Oh, how it breaks. Yeah. It breaks, right? That's where I remember that from as a kid. No, Willow is obviously very, very cool. It is a big announcement from Google, and it just goes to show just how big Alphabet is, Google's parent company, and how many different pieces and parts and

fingers they have in different parts of technology that this is just one of them. Not to mention, we just mentioned Google earlier, but we called it YouTube, right? They own YouTube too. And like what they just did there. So it's really, really amazing. And it's fun that it's not a straight up Gemini story. Whether that advanced technology is being used on the backend, it's cool to like see other really, really cool advancements here that Google is coming out with. And this is nothing new. They've been doing this for decades now, but

This is a cool one. It'll be interesting to see where it goes. The salty one in our comments says, means the end of our passwords and our passwords will be absolutely useless. Well, listen, the salty one, I've heard that because of other reasons too with passwords and passwords

and how they're done and things like that. So yeah, I think there's going to be a big reckoning here very, very soon on how we secure data and doing it with a hexadecimal 20 digit, you know, password is no longer going to be good enough if we can solve

problems in five minutes that take longer than the age of the universe to do on a supercomputer, which certainly seems like it can do a lot in the multiverse there. So yeah, I have another one if you want, if I can share one more here. Yeah, can I break in? Can I do breaking news? Yeah, do it. Oh, go for it. Yeah. So I think Gemini Flash, Gemini 2.0 just dropped. And I think I really want to show you a

They presented these four demos and it's not a slick demo. It feels like a legit demo.

So they have one on building AI agents. If you all want to watch that. Yeah, we do. I love that. I just said it wasn't about Gemini. You're like breaking news. The multiverse, Brian, the multiverse. Logan, I want to say Fitzpatrick, but that's not it.

I think, or maybe that is it. - I wanted to say Logan Paul, but I'm like, no, it's not. - I was like, no, not Logan Paul. - On one of the OpenAI drops, he just, like, left a tiny comment that said, things change in a flash, right? It's like, oh, okay, here we go. - Yeah, this is them building AI agents with it.

Hey, this is a quick demo of multimodal live streaming in Gemini 2.0, and we are casting a tab into AI Studio. Can you see my screen? Yes, I see your screen. It's showing a document with the title "Demo." It also contains bullet points and a Google Meet window with a person in the video. Is there anything in particular you would like to talk about on the screen? Yeah, can you start the demo by just reading the text I highlighted?

Sure, I can do that. The text you highlighted reads, "The Multimodal Live API lets you build real-time multimodal applications powered by Gemini 2.0 Flash. It features tool use like Google search and code execution, function calling, and more." Would you like me to read the following lines too? No, can you actually just tell me what this word means though?

I can do that. Multimodal refers to the ability to process and understand different types of data, like text, images, and audio, all together. Does that help? Yeah, um, I need to demo the interruptions feature next. Can you tell me a really boring, kind of once upon a time sort of story?

Okay, I can do that. Once upon a time, in a small village nestled beside a quiet forest, there was a very unremarkable rock. This rock was gray and it was round and it cracked their- Sorry, I didn't mean to interrupt you. No problem. I understand that interruptions are part of the demo. What should I do now? Um, so we're gonna demo memory. Can you just summarize everything we've seen or heard so far?

I can do that. So far, we've discussed a demo of the Multimodal Live API, which lets you build real-time multimodal apps, and I defined what multimodal means. I was also asked to tell a very boring once-upon-a-time story, but I was interrupted. Is there anything else? No, I think we're done. Do you want to read this ending card, or should I?

I can read it if you would like. The ending card says, start building with Gemini 2.0 at aistudio.google.com. Is that correct? Yeah, that was great. That's it. There's a bunch of other videos there. On the one I'm really curious about is...

is spatial. They had a really interesting piece there on spatial computing, on spatial visual or whatever. So I want to watch that video. And then I think, though, from what I was reading, the voice is still text to speech. So I don't think it's the same as the real-time API, but it did say that it could recognize how you were saying it. So I'm curious to see how that is, because text to speech is

What does that mean? Is it still the old way, where your speech gets transcribed, then that's processed, and there are three things that have to happen? I'm curious to see what that means. And I think Gemini was at 1206. 1206 has been in the

arena battle there for a bit. And so I'm curious to see what people, the build potential with this is very, very interesting too. It says, I'm reading this blog as you were showing that from, yeah,

I don't know who they are, but that's just the one I pulled up here. It says real-world applications, so this is sort of helpful. Agentic experiences powered by Gemini 2.0 are poised to transform industries. And it says education, AI agents can tutor students using dynamic multimodal

So yeah, really, really interesting. Not surprising at all to see them

you know, announcing this right now during the 12 days of OpenAI. We've seen this before. You know, maybe Google always had this on their docket. Maybe they're strategically aligning this to come out as OpenAI is pushing all these new things like Sora and Canvas and things like that. Pull the wind out of some sails. Yeah, I mean, this is nothing new. And we've seen this before. And I think it's smart. I think it's actually good for everybody. And it's just showing how much

progress is going on as we end this year. I mean, we're just getting hit after hit after hit after hit between Runway and Google and Sakana and OpenAI and Meta getting in there and everything else. Grok 3 is coming, right? Grok 3 is everybody's... Isn't Grok 3 was...

trained using the hundred thousand something. That's the one he talked about, yeah. That was the one, right? The last thing I want to just point out is the coding assistant that they powered. They have Jules, your AI-powered code agent. So

It works within your GitHub workflow, handles bug fixes, time-consuming tasks, whatever it is when you're coding. So that's very interesting in addition to some of these other things that you can do. So I'm curious to see. I definitely want to start seeing if I can integrate some of the custom builds we have with Gemini 2.0 to kind of maybe see some native different multimodal aspects of this. So that would be interesting.

So that circling of multimodal without saying it, just having the cursor go around it and have a conversation about it was fascinating because my understanding is what Gemini's multimodal is, is it divides it up into screenshots.

And if you just do a screenshot, you're just pointing, right? So you have to be processing the sequence of the screenshots to be able to get that. And that was, wow. It was highlighted though. The word was highlighted. So I could see how that would be pulled into a

OK. Look, I'm not-- and I'm not saying this isn't impressive, right? I mean, you know, it's kind of silly, you know, to be like, well, it's not that impressive. Yeah, of course it's impressive. But just to point out, like, it was highlighted. So depending on how many frames are being captured,

And I would imagine, Jimmy, right, this would be eventually like movies where it's at least 30 frames per second. Yeah, they'll be able to capture, like, as soon as they hit 30, well, as soon as they hit 24, because I think right now the top end's like 16 a second. Right, that's motion. That's, yeah. Yeah.

So if you can hit 24, you'll capture the vast majority of the content that's out there. 30, you're talking about 90% or more. And then for the stuff in the last few years, that 60 plus, you've got everything. Like pretty much everything on YouTube is going to be 30 or less. And that's where this connects now too, right? Back to selective memory, the NAMs. Oh, yeah.

And then you go, okay, this starts to make sense. Even if it's capturing 30 frames per second, if it's just as fast pruning 24 frames out of that that don't matter anymore and deleting that out, because it doesn't make sense. Well, there, you could watch my screen, and if I had a true agent and we're working together, there's going to be plenty of times where it's capturing no motion on the screen, because there's a process or thinking going on at that time and it's irrelevant.

If you just capture everything and you have to memorize, you have to keep everything in memory, that's going to slow down the processing speed. You're going to have way more data in there than you need to have. So you can see how these different technologies start to play with each other, because, yes, we can get to a point where hopefully Gemini and others are capturing what's on your screen at 30 frames per second. But it's what do you do with that data, and how do you decide what's most relevant specifically

to the task. Just comparing them to each other and being able to, you know, cherry-pick what's important. Yeah. That's going to be huge. Think about an agent trying to work past the pop-up on a screen. Like, oh, you have an agent that's supposed to go get you a dinner reservation or something. Great.

but it hits a pop-up and it totally gets confused because it's taking up three quarters of the screen. Okay, that's irrelevant to the task. We know that as humans, click X out of that, move on with what we're doing. It's annoying, get rid of it. This video just took up the whole screen. Minimize it. An agent, that's not so easy. So it'll be interesting to see how all this starts to play together because that's what it's going to take for true agentic workflows. They're going to have to get past the things that are blockers. True navigation. Yeah, totally.

Like pop-ups, and then that leads to captchas and things like that. Because as soon as the agent can literally replicate all of your movement and interpret the screen like you would, then it's free to do whatever it really wants at that point. Or whatever you want at that point. Yeah, exactly.
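None of this tells us how Gemini or any particular agent actually samples the screen, but the frame-pruning idea from the conversation above is easy to sketch: capture frames and discard any frame that barely differs from the last one kept. A minimal sketch with NumPy, assuming frames arrive as RGB arrays; the threshold and sizes are made up for illustration.

```python
import numpy as np

def prune_static_frames(frames, threshold: float = 0.02):
    """Keep only frames that differ meaningfully from the last kept frame.

    frames: iterable of HxWx3 uint8 arrays (e.g. screen captures at 30 fps).
    threshold: mean absolute per-pixel difference (0-1 scale) below which a
    frame counts as "no motion" and is pruned.
    """
    kept, last = [], None
    for frame in frames:
        f = frame.astype(np.float32) / 255.0
        if last is None or np.abs(f - last).mean() > threshold:
            kept.append(frame)   # something changed on screen: remember it
            last = f
        # else: near-identical frame (idle screen, spinner), prune it
    return kept

# Example: 30 identical "idle" frames, then one frame where a pop-up appears.
idle = np.zeros((64, 64, 3), dtype=np.uint8)
popup = idle.copy()
popup[20:40, 20:40] = 255
print(len(prune_static_frames([idle] * 30 + [popup])))  # 2 frames kept
```

An idle stretch of screen collapses to a single remembered frame, while the pop-up is enough change to be kept, which is the kind of selective memory the NAMs discussion points at.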

There's also one more thing that they're rolling out today: they're releasing Gemini Deep Research mode

to Advanced users. So I think it's, I think, their answer to o1, and it's their reasoning mode. So I'm not sure, because Sundar Pichai just, like, 17 minutes ago, put it out. So I think if you really want to try it, you've got to head over to AI Studio. There's a really cool spatial demo that I'm

kind of trying to figure out. But if you look at some of these capabilities versus, and this is Flash, Gemini 2.0 Flash, right? So like when... Fastest and smallest model, right? Right, it's like 2.5 cents a million tokens or 25 cents a million tokens. It's better than Gemini 1.5 Pro. So I think, and Dennis saw this just

posted too. It's like, this is the first of our 2.0 models. So get ready for the next level of models, which is, you know, what's crazy is that, you know, ChatGPT 4.5 or whatever is being like, you know, there's heavy, heavy rumors that's coming out in the 12 days. So

This battle is insane. Within this next week or two, Grok, I was like, Meta, where's Llama? What is it? Four? Whatever it is, it's just one-on-one-on-one. Llama drops 3.3. 3.3, but that's the last... They say forthcoming. So now it's like, hey, Anthropic, where is... Whatever you have, but...

Well, I know we're up against time, so I just wanted to wrap up with a couple of quick hits. And the main one is Stainless, this company called Stainless. What they are creating specifically for companies or for APIs for OpenAI, Anthropic and Meta, essentially, they've created a method for developing SDKs or software development kits

to develop using OpenAI, Anthropic and Meta and various others. So that's been one of the gaps, right? Here's an API, go for it guys. Here's our basic development documentation. But there isn't...

There hasn't really been a cohesive SDK that will help developers or guide developers through doing a lot of the low-hanging fruit, mundane tasks, the interconnections and things like that.

And so that forces developers to go through a huge amount of documentation and learning everything about the API access and everything like that. But SDKs allow you to sort of shortcut that and get your development going. And then, you know, you can review and things like that.

And that's one of the things about having all the code helpers and those kinds of things. So this stainless company is coming out with their product where they create SDKs for all of those services. So that I think on the development side is pretty big in terms of getting more people and developers into developing for those platforms, things like that.

A couple other quick ones I think will also be useful for us in the future: Automattic,

which is the parent company and owner of WordPress. They've got WordPress.com, WordPress.org, a bunch of other things, including Tumblr. But they just purchased a company called WPAI. And WPAI is a company that builds AI tools and products for WordPress.

So that is going to be a huge influx of AI-powered tools and AI-powered building tools for the WordPress environment, which is pretty much the largest GUI-based web host, website building environment.

software that's out there. So most websites are either on a WordPress site or one of the other ones, but WordPress has been around for so long. There's been so many integrations. All of your best blogs are run on WordPress for the most part. And so this addition of WPAI to the family, the Automattic family, I've got high hopes that we're going to see

more AI tools just be automatically added to the base WordPress. And that'll be a huge get for creators and web developers and things like that.

And finally, Humane has come out with, or they've announced, their CosmOS, which, if you remember, Humane is supposed to be that layer AI, essentially your orchestrator or conductor AI that will conduct all the other AIs you have in your life.

And their big slick video they came out with essentially says, "Humane is going to live on all of your devices, every single device, and it's going to orchestrate all of the AIs that live on those devices for you. And that will be your interface layer." And that's their big push is to have that AIOS essentially.

All right. So those are my last quick hits. Does anybody have anything else?

No? Okay. So we'll wrap it up there. Thank you, everyone, for joining us. It has been some intense news, great stuff from around the world. Carl's always giving us the latest and greatest, late-breaking. And yeah, again, thank you, everybody, for joining us. Also, remember, we're going to be on in a couple more hours, in a couple hours, because it's day five of the 12 Days of OpenAI. And as always...

Sign up for the newsletter at thedailyaishow.com to always get more in-depth insights and information about what we talk about during the week. And that comes out every Sunday. All right. Thanks, everybody. Bye.

Bye.