
Industry Roundup #1: OpenAI vs Anthropic, Claude Computer Use, NotebookLM

2024/11/22

DataFramed

People
Adel
Richie
Topics
Richie: Anthropic is focused on parameter efficiency; its Haiku model outperforms larger models, in contrast to OpenAI's strategy of pursuing ever-larger models. Anthropic's models perform strongly on the LMSYS leaderboard, competing with models from OpenAI and Google. OpenAI's O1 model combines a transformer model with reinforcement learning and chain of thought, and excels at reasoning and coding, but Anthropic's Claude models are close behind. Model choice depends on the task: for tasks that don't require complex reasoning, smaller models can be more efficient. The intelligence of large language models appears to be plateauing; future differentiation may come from model capabilities and agentic features. The bottleneck in LLM intelligence is that the cost of linear performance gains grows exponentially. Future differentiation may come from architectural innovation, cheaper energy, and product innovation. Beyond the foundation models themselves, the software around the models and the overall product experience will become increasingly important.

Adel: The latest Anthropic models (Claude 3.5, Sonnet, and Opus) deserve a closer look, including how their performance compares to OpenAI's models. LLM naming is unclear, and the relationship between model size and performance is not simply linear. OpenAI's O1 combines a transformer model with reinforcement learning and chain of thought for better results on reasoning and coding, but Anthropic's Claude models are close behind. Chain of thought is the key to O1: it automatically decomposes problems into steps, but for people who can decompose problems themselves, other models may be more efficient. Choose models by task type; for example, for tasks that don't need complex reasoning, GPT-4o may be more efficient than O1. Although O1 is powerful, GPT-4o is faster and more convenient for everyday use. Claude 3.5 is within about 1% of O1 on Python coding problems, even without chain of thought. LLM intelligence seems to be plateauing; future differentiation may come from capabilities and agentic features. Claude's Artifacts feature is a differentiator, letting users visualize or prototype applications and code directly in the browser. There is a first-mover advantage in the LLM space; ChatGPT has far more users than other models. ChatGPT's success relates to its first-mover advantage, good user interface, and Microsoft's go-to-market. Anthropic's partnership with AWS aims to improve its distribution. User perceptions of models are also partly subjective. Anthropic focused early on AI safety, and its marketing leans more toward the B2B market.

Deep Dive

Key Insights

Why are OpenAI and Anthropic taking different approaches to model development?

OpenAI is focusing on larger models like O1 Preview, which combines transformer models with reinforcement learning and chain of thought for advanced reasoning and coding. Anthropic, on the other hand, is refining smaller models like Haiku, which outperform larger models by optimizing performance with fewer parameters.

How do Anthropic's models compare to OpenAI's in terms of performance?

Anthropic's latest models, particularly Claude 3.5, are nearly as capable as OpenAI's O1 in coding and reasoning tasks, with only a 1% performance gap. Both companies' models are among the top performers on the LMSYS leaderboard, alongside Google's Gemini models.

What is the significance of Anthropic's 'computer use' feature?

Anthropic's 'computer use' allows models to interact with a user's computer by taking screenshots and performing tasks like copying data between spreadsheets. This feature has the potential to automate routine office tasks, but it also raises significant security concerns, such as the risk of data breaches or system damage.

What are the potential risks of AI interacting with computers?

The risks include data security breaches, accidental system damage, and malicious use, such as phishing attacks or unauthorized file deletions. These dangers are amplified by the unpredictability of generative AI, which can sometimes perform actions that are not intended.

How does Google's NotebookLM aim to change information interaction?

NotebookLM introduces a novel user interface by generating podcasts from documents, such as meeting notes or research papers. This feature allows users to consume information in an audio format, making it easier to summarize and digest large amounts of text while multitasking, like during workouts or commutes.

What is the future of AI agents in 2025?

2025 is expected to be the year of AI agents, with a likely increase in both generalized and specialized agents. While some agents may dominate the market with broad capabilities, others will likely focus on vertical use cases, such as automating specific industry tasks.

Why is 'boring AI' gaining attention?

Boring AI focuses on automating routine, mundane tasks that are a significant part of many jobs, such as data entry or email management. By reducing human effort in these areas, it allows people to focus on more meaningful work, which is increasingly seen as a valuable application of AI.

What are the implications of AI for RPA (Robotic Process Automation)?

AI-powered agents, like Anthropic's 'computer use,' could replace traditional RPA by offering a more intelligent and flexible interface for automating computer tasks. This could lead to more efficient workflows but also raises concerns about security and job displacement.

What is the potential of audio-based AI in industrial settings?

Audio-based AI can be used to monitor manufacturing lines and diagnose issues in physical objects, such as cars, by analyzing sounds. This has the potential to save companies significant costs by detecting problems early and improving maintenance processes.

Shownotes Transcript


All right, all right, all right. I think we're live. Richie, how are you? Life is good. I'd say I'm very excited for this episode. Usually on DataFramed, there's this sort of Superman Clark Kent situation: you never see us both in the same place. But yeah, it's nice to be here with you. Yeah, it's nice to be here with you as well. So maybe set the stage. What are we doing here? So we're trying a new format today. It's all going to be about news, the latest trends in AI and data, and what we're interested in at the moment.

Yeah, indeed. So when we do the podcast, a lot of our time is actually spent listening. We love listening to our DataFramed guests, but we often don't get to share what we're excited about in this space. So we thought that maybe once every two weeks we'd do this Industry Roundup episode, where we look at the latest trends in the data and AI space, some of the stories that have caught our attention,

and share with you what we think about them. And hey, if you like it, do let us know. If you don't like it, also let us know. We'd love to hear your feedback. So, how do we get started, Richie?

Sure. First story that's been interesting to me is around cutting edge LLMs. Now, the one that piqued my attention this week was around the latest version of Anthropic's Haiku model. So in order to understand the different Anthropic models, you need to know a bit about poetry. So Haiku, very short poem, that's the smallest model.

And then they have Sonnet and Opus, which refer to longer poems and therefore bigger models. Now, this is the latest version of Haiku, and this new architecture actually outperforms the larger models. This is also coming on the back of OpenAI going in the opposite direction with O1 Preview: a very big model that does powerful things. So there's a very different sense of direction between the two companies.

So maybe unpacking that a bit more, you mentioned the latest Anthropic models. For one, they don't make it easy, because they kind of use the same name, Claude 3.5. Maybe not for Haiku, but for the Sonnet and Opus models. Maybe describe what makes these models a bit special, talk to us about the state of performance for these models, how they compare to the OpenAI models, and where they are today.

So the big way of measuring how good a model is, there's something called the LMSYS leaderboard. It ranks models based on users who use them blind and then say whether the answer was good or not. And for the last few months, there's been this sort of three-way tie at the top among the best performing models: some of the OpenAI GPT models, the Anthropic models, and then the Google models with versions of Gemini.
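For a sense of how arena-style leaderboards turn those blind votes into rankings, here's a toy sketch of a classic Elo update. LMSYS actually fits a related Bradley-Terry model over all the votes, so treat this as an illustration of the idea rather than their exact method.

```python
def elo_update(r_a: float, r_b: float, a_wins: bool, k: float = 32) -> tuple:
    """One pairwise Elo update: a user votes blind between two anonymous
    models, and the vote nudges both ratings."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))  # win probability for A
    score_a = 1.0 if a_wins else 0.0
    return r_a + k * (score_a - expected_a), r_b - k * (score_a - expected_a)

# Two models start level at 1000; model A wins one blind comparison.
print(elo_update(1000, 1000, a_wins=True))  # -> (1016.0, 984.0)
```

Run over hundreds of thousands of votes, ratings like these settle into the three-way tie Richie describes.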

With Anthropic, it's really not clear exactly how they've changed their architecture, but it seems like constant refinement, trying to squeeze the most performance out of fewer parameters. And so for those use cases where you don't need a giant model, Haiku seems to be the way to go. Yeah, indeed. And it really shifts the thinking about what a large language model is, because it's not necessarily a small language model, but if great models become smaller as well, it's going to be interesting to see how that evolves.

I've said this on the show before, but 'small large language model' is just a really stupid name. We need something better, like maybe 'medium language model,' something like that. A small language model, not a small large language one. But I agree, the LLM industry doesn't make it easy with the naming. You mentioned something about the OpenAI O1 model that was released, I think in September. That model actually changes the architecture of the traditional GPT transformer. It combines a transformer model...

with reinforcement learning and chain of thought, and then you're able to get a lot better results on reasoning and coding. But if you look at the leaderboard, actually Anthropic with their latest iterations of the Claude models are really close to that O1 model. So how much extra performance do you actually get from just refinement of the current architecture versus adding chain of thought and reinforcement learning?

Chain of thought is the key here, and the idea is that if you have a problem that can be decomposed into steps, it's going to do that for you automatically. So O1 is niche in the sense that it is only really designed for these specific questions which have multiple steps to them. And for my own personal use, if I know how to break down a problem myself, I will just stick to GPT or one of these other small models that are good at completing individual steps.

It's only if I don't know what the process is that I'd want to bother with something like O1, because it does take longer to respond; it's coming up with all this thinking about how to break the problem down into steps itself. So there are very different task types, I think. I don't know how you've made use of the two different kinds of models.

So I like O1. I think it's really good, but I'm not working on such hard use cases involving coding that I need O1 all the time. So I actually end up using just traditional GPT-4o. Traditional, even though everything is cutting edge, but traditional GPT-4o, because it's faster and it's the default setting. I think as a user, you get lulled into using the default setting almost all the time.

Yeah, I haven't had a lot of use cases for O1, though I've seen some incredible examples on Twitter of people building weather apps, dashboards, and tools with O1. What's interesting, though, is that even the Claude 3.5 model, and I think this is not Haiku but Sonnet, is like 1% away from O1 on Python coding questions at the moment. So even from a reasoning and coding capabilities perspective, Claude is not that far off from O1 right now, without chain of thought

per se. Okay, that's really interesting. And I do love the idea that GPT-4o is 'traditional' when it's been around for, what, six months or something? Less than a year, anyway. Yeah, it just shows how fast things are moving. 100%. And maybe my question for you, because it does seem like the intelligence of these models is plateauing.

Not in the sense that we're going to stop seeing gains in intelligence, but we're going to start seeing gains in capabilities instead, especially when these systems become agentic. But I was seeing Marc Andreessen from Andreessen Horowitz today saying that if you look at the pace of change with LLMs and how much smarter they got from GPT-2 to GPT-3 to GPT-3.5 to GPT-4, it does seem like the increments in intelligence, despite much more investment today, are getting smaller and smaller with each release cycle.

So maybe for you, what do you think will be the differentiator between frontier models if intelligence keeps plateauing? Oh yeah, that's the trillion dollar question. And to your point about plateauing: there's this rule of thumb that in order to get a linear increase in performance, you have to have either an exponential increase in the amount of data, or an exponential increase in the amount of training compute, basically just throwing more hardware at it,

or an exponential increase in the amount of inference compute. So you've got these exponential costs for a linear gain in performance, and that's why we're seeing things slow down. Now, there are some cool things you could do around changing architectures, basically doing clever things with the underlying models, which might get around this for a bit.
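As a purely illustrative sketch of that rule of thumb, suppose benchmark score grows with the log of compute. The coefficients below are invented, not a fitted scaling law, but they show the shape of the problem:

```python
import math

a, b = 5.0, -80.0  # invented coefficients: score = a * log10(compute) + b

for compute in [1e21, 1e22, 1e23, 1e24]:  # training FLOPs, in 10x steps
    score = a * math.log10(compute) + b
    print(f"{compute:.0e} FLOPs -> score {score:.0f}")

# Each 10x jump in compute buys the same +5 points:
# linear performance gains, exponential cost.
```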

But yeah, at some point we're going to see some economic limits on how much better the models can get, unless we invent dramatically cheaper power sources. We've already seen a lot of these big companies invest in nuclear power plants for their data centers, and I think some companies are gambling on nuclear fusion being around the corner.

So there are some limits. In terms of what differentiates things, that's really the trillion dollar question; I wish I had an answer. Agents are certainly one thing that's going to make a difference. The other thing is more of a product-side thing: you don't necessarily need the LLM to be better, you need the rest of the software and the whole package to be better. So I think this is where the surrounding industry, beyond just foundation models, is going to become increasingly important.

And I love that last point, because I think that's exactly the answer to the trillion dollar question you just mentioned. The key differentiator for me, for example, between using Claude and ChatGPT is the Artifacts feature.

I love Artifacts from Claude. Everyone who hasn't tried it, I highly recommend you try it. Essentially, what Artifacts lets you do is visualize or prototype applications, code, whatever you're working on, directly in the browser, in the LLM. So if you say, here's a table, create an infographic out of it, it will build it in React or HTML, for example, and create a beautiful infographic or interactive visualization.

So I think, couldn't agree more here that the differentiation will come from product innovation. But what's interesting related to product innovation and product usage is that there seems to be a first mover advantage in the LLM space. Because we've been talking about how...

Anthropic and OpenAI are essentially in a class of their own. You know, Meta and Google are still there as well when it comes to the intelligence of their models. But ChatGPT has like how many more users than Anthropic at this point?

I actually have no idea of the exact market size numbers. Certainly in terms of company size, I think it's maybe 20 times bigger in terms of personnel. And yeah, it's a clear leader; I think it's still got more than 50% market share, although that's dropping, as the other competitors catch up slightly. Yeah, they are catching up, but when I talk to my parents, for example, they tell me about ChatGPT. No one's talking to me about Claude, right? It seems like ChatGPT entered public consciousness much more than any other LLM. Why do you think that is the case?

Yeah, I think it's more of a business question and a marketing question than a technical question. I mean, obviously, there's the first mover advantage that you named there. So obviously, it came out a few months before anything else. They had this great chat interface that no one else had.

And that's been really powerful. Beyond that, I guess the Microsoft connection didn't hurt either, because Microsoft's got an amazing go-to-market machine, so they're bringing it to businesses that way. Yeah, and no surprise as well to see Anthropic partner really hard with AWS to keep up with that distribution machine. But I also think it's about vibes. Like,

You see on Twitter, you know, when a model explodes, people get interested in it really quickly, right? Llama, for example, had that popularity surge because it was the open source alternative to the GPT models.

But then over time you saw even better models than Llama being released. I'll give you an example: Snowflake's Arctic model was, for a while, much better than the Llama model of the time. But its number of downloads, the interest, et cetera, was so much lower, right? So whether in the open source or the closed source space, I think there's a vibes-based intuition that users have around these models that's kind of hard to decipher.

Agreed. And Anthropic started off very differently: the founders were people who broke away from OpenAI, and they were very keen on AI safety. So Anthropic, at least in its early marketing, was all very much 'we are all about safe AI.' And that's a very staunchly B2B message. It's cool to have the grown-up in the room creating AI models, but it's not that sort of wildly popular, everyone-must-use-this B2C message.

Yeah. ChatGPT had that 'let me solve your homework' type of messaging early on that I think gave it differentiation. And speaking of differentiation, this segues into a perfect next story, one of my favorite innovations in the AI space in a while, and quite related to the conversation we're having now: Anthropic's computer use. And

the promises of really automating a lot of the routine tasks we see at work, but also the perils of having agents use computers. We're going to talk about that. So alongside the release of Anthropic's latest set of models, Anthropic also released a new feature, which comes back to that differentiation point, called computer use. In a nutshell, the way it works is that the Anthropic model continuously takes screenshots of your computer to see what's happening on your machine. You provide an input, for example, copy and paste all of these cells from this spreadsheet to this other Excel sheet, and so on. Then, based on the screenshots, it creates a set of actions it needs to take, performs those actions, takes another set of screenshots, and so on, until it accomplishes its task.
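For intuition, here's a minimal sketch of that observe-act loop. Every function here is a hypothetical stub standing in for the real pieces (screen capture, the model call, an OS-level executor); this is not Anthropic's actual API.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str            # e.g. "click", "type", "done"
    payload: str = ""

def take_screenshot() -> bytes:
    return b"<png bytes>"  # stub: a real agent grabs the actual screen

def ask_model(task: str, screenshot: bytes) -> Action:
    # Stub: a real agent sends the task plus screenshot to the model
    # and parses its tool-use response into an Action.
    return Action(kind="done")

def execute(action: Action) -> None:
    print(f"executing {action.kind}: {action.payload}")  # stub: click/type/scroll

def computer_use_loop(task: str, max_steps: int = 20) -> None:
    """Screenshot -> plan -> act, repeated until the model says done."""
    for _ in range(max_steps):
        action = ask_model(task, take_screenshot())
        if action.kind == "done":
            return
        execute(action)
    raise TimeoutError("step budget exhausted")

computer_use_loop("copy cells from sheet A to sheet B")
```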

That essentially puts us in a place where you now have systems that can interact with computers freely. And there's really quite a few angles by which we can approach it. So, Richie, from your perspective, what do you think about computer use? What excites you? What scares you?

I think one thing that's consistent across every company I've worked in and basically everyone I've spoken to is that dealing with IT support is always hard work. It's just one of those fundamentally difficult organizational problems. And so I guess the end goal for this is having AI agents that are going to replace IT support, at least for common tasks.

That's kind of a cool thing. It should save a lot of companies money and probably make things easier. On the other hand, you're right, there are some big dangers here. Like,

Generative AI is known to be slightly unpredictable. It sometimes does very stupid things. And when you do stupid things with admin access on someone's work computer, that can go horribly wrong. So you've got data security challenges. You've got the chance of just simply like bricking someone's computer. That's a few thousand dollars down the drain and a lot of wasted time as well.

So I think a lot of security researchers are concerned about this. But if those problems can be solved or mitigated, then it's quite a promising technology. Indeed. And what's exciting about it for me is this: if you think about the entire world GDP, my ballpark estimate, my world model, and I could be mistaken here, is that at least 10 to 25% of the world's GDP is routine office tasks that people do, generally don't like doing, and wish they could swap for something else. Tasks that involve copying text from one field to another. Think about healthcare data entry workers, for example. So much waste in the healthcare system comes from the time taken to do data entry and from data not being consistent.

And when you think about this set of tasks in the labor market, if you're able to automate them with a free-flowing agent that can understand what's on your computer and perform a set of tasks, that's quite exciting. But just as the world's GDP is massive and a big chunk of it could be tackled by a system like Anthropic's computer use, on the security side, to your point, the map of possible dangers an LLM can now navigate just got a thousand times bigger. You mentioned bricking a computer. That's pretty simple, but there's also, you know, performing phishing on someone's email using personal information from their computer, or something along those lines, right? There are a lot more malicious uses when an LLM has an open-ended computer canvas to work with. So yeah, walk me through the risks that you see.

Actually, yeah, that phishing example is kind of interesting, because once you combine it with deepfakes, you can have an entirely automated scam artist: a pretend person chatting with someone, gaining their confidence to get access to the machine, and then the AI bot goes in and does what it does. So yeah, cybersecurity is a growing problem. I think there's a big difference here between having...

computer use or another similar AI agent to control a computer in a tightly defined corporate setting and something in the wild where it's deliberately being used for malicious purposes.

I couldn't agree more, but I still think there are ways things can go wrong with these types of technologies that we don't even think about. And I wonder what some of the unknown unknowns related to computer use will be. One thing I hadn't thought of before you mentioned it is someone's computer being bricked: delete the System32 folder on a Windows machine and you're done. And could you use these systems to maliciously attack a grid, for example, or a set of computers,

even from a cybersecurity perspective? So there's a lot to be learned here about Anthropic's computer use. Did you happen to read about their security approach with computer use and how they're planning to roll it out securely? I haven't read up on this, so I don't know what's happening with this particular piece of software. But one thing I've heard as a common complaint over the last few months, from people trying to build AI agents,

is that the test surface area is huge. So there are so many different tests you need to write in order to ensure that it performs well. And so having a tool that can do anything with a computer, that's a terrible idea because you cannot possibly test everything. So a sensible implementation, if you want to adopt technology like this, would be...

only let it do a few simple tasks at first, get the tests for that, make sure it's working, and gradually add capabilities. So if you want it to just move windows around for you, open and close things, that's cool; that's a simple enough thing that you can test. If you want it to do file manipulation, you can give it like half a dozen file manipulation commands, you know,

move, copy, maybe don't give it delete in the first version, then get that working and gradually add things one command at a time. That way you're going to have a much safer, more robust experience.
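One way to implement that gradual rollout is a hard allowlist of verbs that the agent's file commands must pass through, grown one command at a time. A minimal sketch, with illustrative command names rather than any particular agent framework's API:

```python
import shutil
from pathlib import Path

ALLOWED = {"copy", "move"}  # note: no "delete" in the first version

def run_file_command(verb: str, src: str, dst: str) -> None:
    if verb not in ALLOWED:
        raise PermissionError(f"command {verb!r} is not enabled yet")
    if verb == "copy":
        shutil.copy2(src, dst)
    elif verb == "move":
        shutil.move(src, dst)

# Tiny demo with a file created on the spot.
Path("demo.txt").write_text("hello")
run_file_command("copy", "demo.txt", "demo_copy.txt")  # allowed
try:
    run_file_command("delete", "demo_copy.txt", "")    # blocked until tested
except PermissionError as err:
    print(err)
```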

Okay, I actually saw someone using computer use to play the first Doom game, the old-school Doom. And it didn't do badly; it was just really slow. But automating playing video games is definitely not what I wanted out of AI, especially in this day and age. And maybe, Richie, given that we've talked about computer use here, what are the implications for RPA, for example, if you can have an intelligent...

agent on your computer that can perform tasks for you, whatever they are? Yeah, that's a good point. So RPA is Robotic Process Automation, which is the stuff we've been talking about: having a tool manipulate your computer. Companies like UiPath are, I think, among the market leaders. This has been around for a decade or so, and it's really only the generative AI layer that's new here.

So some of the issues we talked about around security, like there's a decade of sort of precedent in trying to deal with this stuff. But the generative AI layer basically means you've got a nicer interface to it and it can start doing a little bit of reasoning itself. Yeah. And then looking broader beyond computer use, computer use is a great example of an agent.

And I think 2025 will be the year of AI agents. How should we think about agents today? Because there are probably dozens, if not hundreds, of agents out there in the market. Do you think the agent space will become more generalized in the future, where you have three or four intelligent agents like Anthropic's computer use that dominate the market? Or will you have a lot more verticalization, with agents focused on specific use cases?

Yeah, so it's interesting. Agents get brought up a lot on DataFrame. Almost every guest has said, yep, agents are coming soon. It's the future of AI. And actual implementations, they're mostly very simple or prototype at this stage. So I think there are many different ways the market can play out. So there are lots of companies, lots of startups around. Well, you can create agents easily. So sort of zero code agent creation. There are some where it's more like, yep, we are industry specific. Some where it's just...

well, anyone who can build something and then sell it. I don't know what the best market model is going to be, but you can be sure that there are just going to be dozens and dozens and dozens of startups trying to help you make agents.

and maybe help you sell agents as well. And then when OpenAI releases their own version of agents, these startups will most likely get wiped out, or at least a big percentage of them, because that's what we've been seeing over the past few years. Yeah, absolutely. So at the last OpenAI Dev Day, Sam Altman had a slide saying, lol, we just killed your startup. Yeah.

Yeah, that's a common problem when you're building on top of other technologies from big tech providers is that there's always a risk that you're going to get cannibalized from underneath.

100%. And I think this wraps up the second story, so maybe we can jump into the third one. I'll let you introduce this one, because it's near and dear to our hearts, as we might be out of a job soon. Yeah, sure. So you mentioned the idea of boring AI, and how a lot of particularly white-collar jobs just have a lot of routine admin tasks. And so there have been quite a few announcements from big tech companies around this

recently. So first up is NotebookLM. I'm not sure it's a Jupyter-style notebook, but it's a general web-based notebook product from Google, released back in 2023. A couple of months ago, they released a podcast auto-generation feature, which, I'll be honest, does frighten me a little. Yeah.

I played around with it. It is not yet at DataFramed quality, but AI does get better. My point of view is that the AI hosts are actually too good. They're so good that it feels a bit uncanny: it's absolutely perfect, no ums, no ahs. Their cadence is 100% spot on all of the time. It's quite bizarre.

Yeah, I mean, that's the thing. I feel like one of my strengths as a podcaster is that I do sometimes ask stupid questions. And that way, you know, it's an artisanal product. Yeah, so yeah, walk us through how even podcasts are going to be automated in the future.

Yeah, so I think this is a little bit like the case of music. So you have these sort of top artists that come up with original songs and they're creating high quality music. But then you also have background music in an elevator, in a supermarket, which no one's really listening to. And that's where like AI music has its strength because it can just be background sort of nonsense.

And so I think this Notebook LM podcast generation feature is pretty similar. It's like if you have a document or a set of documents, you're not going to

go and have a human record a podcast all about your weekly meeting. But just with a single click, you can auto-generate something and you have a nice discussion of, well, what happened in this meeting that I missed earlier today? Rather than having to listen to your colleagues drone on for like 20 minutes about coffee, it's just going to pick out the important points and then make it into an entertaining format.

Yeah, well, it depends on the colleague. But I do like NotebookLM quite a lot, and there are two things I like about it specifically. One, I love this concept of boring AI, because a lot of the awe-inspiring use cases we see from AI today generally look amazing but don't really work in real life a lot of the time. Think of self-driving cars: they've made great strides over the past few years, but we're still nowhere near full automation, even though we've been promised it. For 10 years we've been saying 'next year, full automation,' to a certain extent. But with boring AI, so much human drudgery is spent on these boring tasks that make work not so fun, like catching up on emails where 90% of them are not useful.

And this type of AI, which is able to unlock your time and your focus, I do like quite a lot, especially with NotebookLM. If I'm at the gym, I'll listen to an analysis of the latest AI paper or a journal article or something along those lines. I just chuck it into a podcast, a format that's really easy to ingest while I'm at the gym. That's great.

And the second thing I love about NotebookLM is actually the podcast feature; I think that's what made it catch fire. We're going to play a podcast example, and you can check it out right now. One example they talked about is Google's NotebookLM. They've got this feature now where you can auto-generate podcasts from meeting notes. Wait, wait, podcasts from notes? Yeah.

Yeah. Or even long articles, you know, research papers, things like that. That sounds, I don't know, that sounds kind of amazing. How does that even work? Well, it's pretty clever. Basically, you feed the AI your notes or your document, and it uses its understanding of language and context.
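In the same spirit, here's a rough sketch of what a document-to-podcast pipeline could look like. Both helper functions are hypothetical placeholders for an LLM call and a text-to-speech call; this is not Google's implementation.

```python
def summarize_as_dialogue(doc_text: str) -> str:
    # Placeholder: a real version would prompt an LLM to write
    # alternating Host A / Host B lines covering the key points.
    return f"Host A: Today we're covering...\nHost B: Right, the gist: {doc_text[:60]}..."

def synthesize_speech(script: str) -> bytes:
    # Placeholder for a TTS engine; returns fake audio bytes.
    return script.encode("utf-8")

def document_to_podcast(doc_text: str) -> bytes:
    script = summarize_as_dialogue(doc_text)  # LLM turns the doc into a two-host script
    return synthesize_speech(script)          # TTS renders the script as audio

audio = document_to_podcast("Weekly meeting notes: shipped v2, hiring update...")
```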

Going back to it, what I love about the podcast feature is that it's quite a novel user interface for how you interact with information. It breaks the bounds of how you can summarize information in the future. And this is a type of user experience that can only be unlocked with LLMs; you generally can't do that without this technology. So it does point towards a future where the way we interact with software

will fundamentally change because of the underlying technology. And I'm excited to see how the UI and UX of LLM-based products evolve. So maybe, yeah, how do you think about the UI and UX of these LLM-based products evolving in the future? Yeah, it's very cool. And I think the NotebookLM podcast generation feature is novel because it involves audio, whereas with generative AI, text has really been the big story over the last couple of years.

And actually, that relates to a lot of other announcements recently. So Apple Intelligence just added a load of features around summarizing documents and things like that. And it's like, well, that was kind of cool a year ago, but we've had that already. Zoom does meeting summarization; other plugins like Autopilot do that sort of thing. And I feel like soon every piece of software that involves text documents is going to have an auto-summarization feature.

And so that just feels like table stakes now. So the NotebookLM idea, where you've got audio as well, and it's creating something novel on top of what you've written or what's in your document, that's pretty useful. Yeah, and it's going to be interesting to see how the audio space evolves in general in AI. There's a startup, Speechify; I have it on my phone as well. I read a lot of articles, for example, but sometimes I'm commuting, I'm at the gym, and so on.

You can have a celebrity read an article for you. It's definitely going to be interesting to see how these novel ways of interacting with an LLM evolve. As we near wrapping up here, do you think an auto-generated NotebookLM podcast will ever be as good as DataFramed? Well...

That's tricky. I mean, it would require a lot of fine tuning. Ideally, you know, for authenticity, we'd have to deep clone our voices. But certainly in terms of the questions I ask,

I've been pushing hard recently to try and auto-generate more of them. I used to handwrite every single question I asked a guest on DataFramed. But now, I've trained GPT on all my previous questions, and it can write something that's a bit like what I would say anyway.

And I can normally get like half the questions for an episode. And then I go and change those. And then when I actually speak to the guest, I do what I normally do and just ignore most of the questions. So there's several levels of sort of indirection from the actual AI. But it does come up with like quite a good first approximation. So we could get something that's like,

good for DataFramed B-roll with AI in the quite near future. Yeah, if you want us to put NotebookLM snippets into DataFramed, do let us know. We were thinking about that. And maybe, Richie, what other pieces of news are you excited about in the AI space nowadays? Oh, man. So I have to say, since we were talking about audio, we had a few DataFramed guests recently talk about all these audio use cases. And I think that's one area that hasn't had as much hype as text or video,

and particularly in industrial use cases. So I had a guest talking about in manufacturing facilities, just like listening to the sounds coming out of the production line, you can tell when things go wrong. And same with like diagnosing problems with your car. And because I drive a very old car, I'm particularly keen on that one. So just understanding what's going wrong with like physical objects using audio. And that seems like no one's really talking about it, but it's got...

such a huge potential for saving money for manufacturing companies. Yeah, I couldn't agree more. That's actually a great set of use cases. And it goes back to the idea of boring AI. Technically speaking, it's not hard to set up

a recognition system that can recognize a faulty car part, right? You just need relatively good data to fine-tune and train the model. But the value it can unlock for you as the car owner, for the garage, and for the manufacturer is really great, and it comes back to, you know, boring AI is going to be in the ascendance in 2025. Mark our words. We talked about it in our data trends and predictions, I think in 2023; it's still going to be ascending. It's still going to be ascending.
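To make that concrete, here's a minimal sketch of the kind of audio fault detector being described: spectral features plus a simple classifier. Synthetic tones stand in for real recordings, and all the numbers are illustrative.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

SR = 16000  # sample rate in Hz

def features(clip: np.ndarray) -> np.ndarray:
    # Summarize a one-second clip as its mean MFCC vector:
    # cheap, fixed-length spectral features.
    return librosa.feature.mfcc(y=clip, sr=SR, n_mfcc=20).mean(axis=1)

def tone(freq: float, noise: float) -> np.ndarray:
    # Synthetic stand-in for a recording: a hum plus background noise.
    t = np.linspace(0, 1.0, SR, endpoint=False)
    return np.sin(2 * np.pi * freq * t) + noise * np.random.randn(SR)

# Pretend "healthy" machines hum at ~120 Hz; "faulty" ones rattle higher and noisier.
healthy = [tone(120, 0.1) for _ in range(10)]
faulty = [tone(480, 0.5) for _ in range(10)]

X = np.array([features(c) for c in healthy + faulty])
y = np.array([0] * 10 + [1] * 10)  # 0 = healthy, 1 = faulty

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([features(tone(480, 0.5))]))  # -> [1], flagged as faulty
```

A production version would swap the synthetic tones for labeled recordings from the production line or the garage, which is exactly the "relatively good data" requirement mentioned above.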

Richie, as we wrap up our first Industry Roundup episode, any final notes to share? Yeah, so we talked a lot about AI today, and of course we're both very excited about AI. Maybe in future episodes we can also talk about some data things, which we will cover on this podcast. What's data? I think we only talk about AI nowadays. But yeah, it seems that AI has eaten up all of our attention span, so that's all we can cover. But yeah, indeed,

If you enjoyed this episode, I know it's a different format from what you're used to, but if you want to hear more from our Industry Roundups, or maybe different formats, AMAs, for example, do let us know. We'd love your feedback on how we can make the show better. We also included a survey early in the show that you can access in the show notes. Do let us know what you're thinking and how we can make DataFramed better for you. And off to the next one. Absolutely. Pleasure chatting with you, Adel. Likewise.