
#116 - ChatGPT plugins, AI hardware, petition to pause AI, Trump deepfakes

2023/3/31

Last Week in AI

People
Andrey Kurenkov
Jeremy Harris
Topics
Andrey Kurenkov: OpenAI has added plugin support to ChatGPT, letting it access the internet and external services and greatly expanding its capabilities. This brings new opportunities and challenges, and plugin safety needs to be considered carefully.
Andrey Kurenkov: NVIDIA's Hopper GPU (H100) is optimized for Transformer models and significantly improves AI training efficiency, underscoring NVIDIA's lead in AI hardware.
Andrey Kurenkov: Goldman Sachs research suggests generative AI will affect 300 million jobs, with 7% of jobs having at least half of their tasks done by AI. This will have far-reaching effects on the global economy.
Andrey Kurenkov: Agility Robotics' Digit robot is designed to move plastic bins in warehouses, a step toward practical applications for humanoid robots.
Andrey Kurenkov: Runway Gen-2 is the first publicly available text-to-video generator, marking progress in AI video generation.
Andrey Kurenkov: Clearview AI's facial recognition technology has been used by US police nearly a million times, raising privacy and security concerns.
Andrey Kurenkov: The voice recognition system used by Australia's Centrelink can be fooled by AI, highlighting the security risks the technology brings.
Andrey Kurenkov: Advances in AI have made it easier to generate realistic images, increasing the risk of deepfakes.
Andrey Kurenkov: AI-generated photos of Trump being arrested did not widely fool the public, suggesting people remain wary of AI-generated images.
Andrey Kurenkov: The Writers Guild of America (WGA) will allow the use of AI in screenwriting while requiring that writers' credit be preserved.
Jeremy Harris: Giving ChatGPT new tools via plugins creates new capabilities and new risks; this is a stage in the evolution of AI systems worth watching. Adding new tools to an AI system is different from simply scaling up a model and observing its emergent capabilities; it is a 'step change' worth studying.
Jeremy Harris: NVIDIA launched the Hopper GPU (H100), optimized for Transformer models, and is offering cloud services, further accelerating AI development.
Jeremy Harris: OpenAI's Ilya Sutskever believes current AI progress is mostly an engineering problem rather than one of algorithmic breakthroughs.
Jeremy Harris: More than 1,100 notable figures signed an open letter calling on all AI labs to pause for at least six months to assess the risks of AI technology.
Jeremy Harris: A museum in San Francisco focuses on the potential risks of AI, aiming to raise public awareness of AI safety.
Jeremy Harris: Advances in AI have made it easier to generate realistic images, increasing the risk of deepfakes.


Chapters
Discussion on how OpenAI's addition of plugin support to ChatGPT will allow it to interact with external services, potentially leading to new capabilities and risks.

Transcript


Hello and welcome to Skynet Today's Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode, we provide summaries and discussion about some of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for more articles. I am one of your hosts, Andrey Kurenkov. I am currently finishing up my PhD at the Stanford AI Lab.

And I'm your other host, Jeremy Harris. I do a bunch of national security and AI work focusing on AI safety. And I actually have a book coming out pretty soon too. In fact, on April 4th, so probably the time this comes out like a day or two from now.

And it's all about quantum mechanics, physics, the physics of consciousness and how AI ties into that. I know it sounds a little bit loopy, but it is kind of related to some of the stuff we talk about here. So if you want to check that out, it's called Quantum Physics Made Me Do It. Anyway, a little bit of a fun background for today. Well, just from like my time toiling in obscurity in a quantum physics lab at the University of Toronto and then later at a Max Planck Institute.

And yeah, I worked on paradoxes in quantum mechanics. That was sort of like the field, the foundations of quantum mechanics. And that sort of led me to look at basically what it would mean, like how fragile our perception of reality is to small changes in the physical theories that we believe in. So if you take quantum mechanics, you change it just a little bit, like how radically do you have to reimagine reality? That's sort of like what the book is focused on

And it ties into a lot of what we've seen from leading AI labs. You know, Ilya Sutskever famously a few months ago saying, I think there's a chance that like GPT-3 is slightly conscious, that sort of discussion. Anyway, that's something that inevitably gets tackled because, you know, we're talking about the physics of consciousness, which is really poorly understood. Anyway, so that's kind of the background there.

Okay, yeah, sounds interesting. Hopefully I can get a free copy. Maybe, we'll see. But let's go ahead and start discussing AI news. So first up, as usual, we have our applications and business news stories, starting with

OpenAI is massively expanding ChatGPT's capabilities to let it browse the web and more. And basically, the idea is that OpenAI has added plugin support to ChatGPT. So now there can be support for basically adding kind of a glue, I guess, between ChatGPT and things like Gmail or Expedia or...

other things like that so that instead of just generating text, which is what ChatGPT has been doing so far, it can now receive information and send information to these other external services.

Yeah, and it's kind of interesting as we start to ask questions about the capabilities of AI systems and what it even means to assess the capabilities of AI systems. We have ChatGPT, which we understand reasonably well at this point. Hundreds of millions of people have played with it.

But what happens if you take the same system that has capabilities X, Y, and Z, and then you just give it tools to use, right? Like what emergent new capabilities could we discover? It's kind of an interesting question. And I think a second phase now that we're hitting in the evolution of these systems where we get to see not just the capability of the raw system, but the capability of the raw system plus an arbitrary set of, and a growing set of tools.

So I think it's interesting in that sense. This is another angle on the idea of emergent capabilities, where normally we think of emergence as like you scale up an AI system, you scale up a model more and just see what new capabilities it has. But what about just throwing in a new tool? What new opportunities, what new risks could that introduce? It's kind of an interesting step function, step change in the capabilities of these systems. Yeah, I totally agree. I think there's...

Definitely the aspect of just capabilities of intelligence where now you can have things like, you know, looking up facts or, you know, checking the correctness of a math equation or, you know, many more things that relate to just being intelligent. And on another dimension of this, I think it's actually very exciting in terms of the just...

straight up practical capabilities of using this now, I think this will be a big, big deal in terms of now, instead of just getting text output, which can do a lot but ultimately doesn't interface with anything, you can't make it book your flights or write an email for you directly in your Gmail application or a million other things.

Now, with this plugin support, whatever you do, you can just tell ChatGPT to do this, and it can then send it directly over to whatever service you're using. And I think you can look at maybe things like Trello for tracking activities or, I don't know, any software. There's so many things we use every day, and now...

With plugins, any of that you can kind of just combine with ChatGPT.
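To make the "glue" idea a bit more concrete, here is a minimal sketch of the kind of round trip a plugin enables: the model emits a structured request, a thin client forwards it to an external service's API, and the JSON response is fed back into the model's context. The endpoint URL and field names below are made up for illustration; real ChatGPT plugins are described to the model via a manifest and an OpenAPI spec rather than hand-written client code like this.

```python
import json
import requests  # assumes the requests library is installed

# Hypothetical travel-service endpoint standing in for a real plugin API (not a real URL).
FLIGHT_API = "https://api.example-travel.com/v1/search"

def call_flight_plugin(origin: str, destination: str, date: str) -> str:
    """Forward a structured request from the model to an external service and
    return the JSON response as text that can go back into the model's context."""
    resp = requests.get(
        FLIGHT_API,
        params={"origin": origin, "destination": destination, "date": date},
        timeout=10,
    )
    resp.raise_for_status()
    # The raw JSON goes back into the conversation, where the model can
    # summarize it or decide on a follow-up call in its next turn.
    return json.dumps(resp.json(), indent=2)

if __name__ == "__main__":
    print(call_flight_plugin("SFO", "JFK", "2023-04-15"))
```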

Yeah, absolutely. And it also sort of raises questions about whether we've thought out the limits on what our models can do. Famously, one of the limits on large language models is the context window, the maximum amount of text that they're able to read or digest at any given time. In some sense, that's a measure of the maximum complexity of a thought that the model can have to anthropomorphize a little bit.

Now, when we think about plugins, like plugins that allow for external stores of memory potentially, that allow effectively the model to hold more ideas in its head at any given time. I think there's an interesting question about whether interaction with tools here should be thought of as an extension of the model versus...

a part of the model itself. And it's obviously like a fuzzy boundary there, but I think there are really interesting research questions as we start to see how these tools get used and whether they change the risk surface of these systems.
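As a toy illustration of that "external store of memory" idea, here is a minimal sketch of retrieval-style memory: notes get stored outside the model and only the most relevant ones are pasted back into the limited context window. The class and scoring method are invented for illustration; a real retrieval plugin would use learned embeddings and a vector store rather than word overlap.

```python
import numpy as np

class ExternalMemory:
    """Toy external memory: store text snippets and retrieve the most relevant
    ones to paste back into a model's limited context window."""

    def __init__(self):
        self.notes = []

    def store(self, note: str) -> None:
        self.notes.append(note)

    def retrieve(self, query: str, k: int = 3) -> list:
        # Score each note by word overlap with the query (stand-in for embeddings).
        q_words = set(query.lower().split())
        scores = [len(q_words & set(n.lower().split())) for n in self.notes]
        top = np.argsort(scores)[::-1][:k]
        return [self.notes[i] for i in top if scores[i] > 0]

memory = ExternalMemory()
memory.store("The user's preferred airline is United; aisle seats only.")
memory.store("Project deadline moved to April 14.")
print(memory.retrieve("what airline does the user prefer"))
```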

We're already seeing OpenAI kind of give us shades of like, there's a new risk surface emerging here. They're flagging explicitly they need a safety strategy. And right now, their safety strategy with the plugins is that they're going to just prioritize a small number of developers and ChatGPT Plus users. So kind of roll this out at smaller scales, do the standard OpenAI thing of 10%.

Testing, you know, to see how people actually use this in the wild and then do gradual kind of wider rollouts. But anyway, really interesting to kind of see all this. I also think one last quick note on this for me, at least, is this does start to make me think of, and I've mentioned this before, Adept AI, right? This startup that's specialized in making AI tools that can carry out actions on the internet.

Right. Like the canonical example that Adept AI has, or their first tool, is called ACT-1. And like famously, you know, it's able to like book you or do searches for you to find houses that might be a good fit for you, for your preferences or whatever, by using its access to the Internet.

I think there's an interesting question here as to whether that approach stands up to ChatGPT plus plugins, because now we're kind of talking about OpenAI stepping on Adept AI's space here. And it might be a strategic risk for a lot of companies that have just kind of focused on a more narrow approach that involves getting AI to use tools as a kind of special thing. It seems like that's being subsumed now in this more general set of capabilities of the ChatGPTs and the GPT-4s and so on.

Yeah, I agree. I think Adept AI from the demo they released was just directly interfacing with web browsers, so simulating a person typing the URL and clicking on stuff. This is a different approach of just having APIs. You can send a request to another website and receive a response.

And much of what we do by directly using a browser and a keyboard, you can also do with APIs. Fundamentally, a lot of the stuff is just done via API calls, right? You write something on your laptop and it gets sent through a service to some API. So

It's an interesting question, and I'm sure it'll be interesting to see how this plays out. But yeah, I think this is kind of a huge deal, and I'll be curious to see how this gets integrated into various things that are very commonly used. Yeah, absolutely. Yeah.

And then we've got another story here from NVIDIA and maybe more on the hardware side, eh? Yeah. So the story is about how NVIDIA's big AI moment is here. It's from Engadget. And it's kind of covering...

what NVIDIA has been up to with respect to AI. And NVIDIA has been in AI and a big deal in AI for at least a decade. I mean, a lot of why we have this level of AI improvement is due to using GPUs starting in about 2010 to train these large neural networks.

And NVIDIA has been continually pushing the boundaries on the power of GPUs. So they've been scaling up and up and up, and now you can buy GPUs for $10,000. And you can also buy something called DGX, which is basically a mini supercomputer. It's got, I don't know, 8, 12 GPUs or whatever, a lot of memory, and so on.

So yeah, it's expensive, of course, but now they provide a cloud service, which is still not cheap, but it's definitely cheaper than buying one of these supercomputers.

Yeah, and a big kind of sub part of the story is it continues the rollout that we've seen at NVIDIA of their new class of GPUs, their Hopper GPUs, H100s, which is sort of like the next evolution, more geared towards the specific architectures that are making waves in AI right now, these transformer models, the T in GPT.

And so specifically designed to kind of optimize for those sorts of systems and delivering a very significant improvement over the A100, which is kind of like the former system. So it seems like they're kind of starting to mix the H100 and A100 now into the offerings that they're putting forward to the world and the DGX cloud instances that they're now offering at the low, low price of $37,000 for one node. Yeah.

They can either have A100s or H100s in the mix. So that's really cool. They also talked about announcing AI Foundations, which is some cloud software for large language model training. Really, this is, I think, another instance of

not just software chasing the transformer, basically everybody starting to converge on this architecture at an industry level, but also now the hardware. We're seeing more and more large language model, transformer-oriented software, infrastructure, hardware, and so on. So really kind of interesting to see, A, whether that reflects a general belief that, hey, this is just going to continue almost indefinitely,

And B, whether this means that even if other ideas end up being more fruitful in terms of advancing AI faster, we're still locked into some degree into that hardware choice. Frankly, I mean, I'm very bullish on transformers, and I think that they're going to be able to do a disturbing amount. But it's interesting to see the whole industry, the whole stack really from hardware to software now being shaped by innovations that happened in like –

three or four years ago, I guess, or no more than that, with the invention of the transformer in, what was that, 2017 or so? 2017. Yeah. Yeah, I agree. It's kind of interesting because in research for a while, it was basically, you know, we keep inventing new neural networks for various applications and they're all customized and hand built. And now, you know,

More or less, for a lot of things, you can just deploy this standard template and scale it and format the inputs and outputs, and that's it. And yeah, as you say, it's kind of a new paradigm that has been emerging for a few years. And now it looks like maybe it's just going to be, from an engineering perspective, a sort of standardized

part of AI in a sense. Well, exactly. You know, I was listening to an interview that Ilya Sutskever gave. He's essentially

the head of tech, if you will, at OpenAI, like the absolute rock star in deep learning. And yeah, he was basically saying, like, he was asked, what's your day-to-day like? Like, what percentage of your time do you spend on coming up with new ideas versus what percentage of your time do you spend on kind of engineering type stuff?

And he said, actually, the new ideas is a pretty small part. I mean, it's an important part, but it's a pretty small part of my day-to-day, which I think is actually, in a weird way, the kind of the reflection of what we're seeing, the kind of software side of this NVIDIA play. So NVIDIA says, hey, you know what? We're going to make custom hardware to make transformers really easy to build and scale. And what that means for the engineers who are building these systems is that

Well, it's just an engineering problem now. It's just a software problem trying to like scale things up rather than necessarily coming up with new algorithmic breakthroughs. That's still an important part, but it's not the dominant part of these big projects, which, you know, for people like me who think that we kind of have the ingredients

that we need algorithmically and maybe even on the hardware side now to get to something like human level intelligence across the board. It's not exactly heartening, especially since I'm a little bit concerned, to say the least, about what those sorts of systems could do. But interesting to see it turn into the kind of engineering problem where we already have a roadmap, where it seems like we have a recipe that gets us basically as far as we want to go.

Yeah, no, that is really interesting. I think in a sense, it's sort of like we found...

a basis where, yeah, now you don't need to figure out some new idea to use AI. You know, we know the ingredients, you need a data set, you need a model, which can now be more or less a standardized model template. So if you want to build a neural network to do something in industry,

you don't need to think about it too much from a research perspective. You make a data set, you use a model, and it's much more about the engineering to make it performant and so on. And then actually another part of the story is that NVIDIA now has a platform for inferencing, so not training but running things.

And this is, for instance, NVIDIA L4, which is super performant and very efficient and is now available on Google Cloud. And actually a few...

companies are already using it, apparently. So, you know, long story short, NVIDIA is dominant in the AI space in terms of hardware and probably will remain so. And yeah, it's been quite the achievement for them, for sure.

Yeah, right place, right time. And, you know, the idea of inference being more and more important, being more and more built for, if you will, by NVIDIA is also a reflection of, you know, the fact that this stuff is truly entering the mainstream. You know, now companies aren't just treating AI like a research project where all of their resources or by and large, most of their resources go into like training these massive systems.

But they're now thinking like, how can we serve these systems to real users at scale? And I think that's another phase transition. That's another kind of stage in the evolution of AI where people start to go, hey, you know, it matters how much it costs to make a prediction. And we can't ignore that cost anymore. Now we're developing hardware to facilitate that specifically. And so I think just, you know, a whole bunch of ways in which even to date, 2023 has really been truly transformative. Like we're seeing...

evolution stages in the evolution of AI unfold before our eyes. And it's not even April. Yeah. I think there's a quote here. This is all news from the GPU technology conference. And the CEO of NVIDIA had this comment that we are at the iPhone moment for AI. And I think actually that's

Kind of a good metaphor where, you know, the iPhone came out or was announced, and then within a year, within a few years, everyone was using a smartphone. Everyone was plugged into the internet 24-7 or could be plugged in from anywhere. And it just completely was transformative, right? It just, in all aspects of life, almost, smartphones have...

really changed at least to some extent how we live. And now I do think AI is at a point where within a year or two or three, it's just going to be everywhere. The same as with smartphones. Yeah. And maybe like, I think more than smartphones, you know, smartphones were great because they allowed us to, um,

to bring a little pocket of intelligence along with us. It was very dumb intelligence, more like memory storage and communication. But AI, we're talking about the automation of thought, human level thought across basically all domains.

And so, you know, what that iPhone moment looks like for AI, it beggars belief. But I mean, you know, I think it's hard to imagine the fundamental ways in which it'll change society, like seeing what the world looks like when large language models are as widely used as iPhones, like when we can't go more than like,

30 seconds without checking our large language model, so to speak. That world is like, I think it's clear to you and me, it's coming in some form. We have no idea how good these metaphors are, of course, but I just don't know that the average person's quite prepared for that and that society has the institutional fortitude to deal with what that means. But one way or another, we're going to have to find out in a real hurry. Yeah.

Yeah, yeah, it's exciting. And it is a little bit scary. I think, you know, it's just like, we know that all this stuff is happening. And it's we're kind of in the early stages. And within a year or two, right, there's going to be a completely different picture. So it's an interesting time.

Moving on to our lightning round with a few quicker stories. First up, we have the former head of Google China joins ChatGPT frenzy by starting own venture. So this is about...

Kai-Fu Lee, who is a big person in venture capital, was head of Google China, is generally a pretty notable person in AI, has written, I think, one or maybe even several books about AI. And yep, now there's a new venture to do something like ChatGPT, although it's not too clear.

Yeah. And that was kind of the take-home for me from this article: it was sort of, rich Chinese technocrat who knows a lot about AI generally invents new buzzword. So he calls this platform that he wants to build, that he claims will go beyond ChatGPT, AI 2.0. And there just aren't a ton of details. It seems like a kind of platform play of some sort.

But sort of yet to see, I don't know if it was just the article or whatever, but it doesn't seem like there's a ton of information. I think in general, this just forms, as you said, another part of that steady trickle that's becoming a flood of Chinese companies, Baidu, Tencent, Alibaba, and so on that are

moving in this direction generally, that are seeing the large language models being built in the West, in particular OpenAI, just because of the publicity around their model releases. And they're going, hey, me too. And I think that this is going to be very interesting from a proliferation standpoint. We start to think about what these systems can do while we can regulate them within our borders. But what happens when another country with different standards

starts to see its own ecosystem flourish, things get really complicated. And anyway, not super clear where that leads. One of the things that the post does mention too is just this funny story about Baidu. They had their ChatGPT-like model Ernie Bot, and

And they released that initially, and it caused the company stock to tumble just because the demo was kind of bad. It didn't include a live demo that people could play with. They just had some screenshots. And so the stock tumbled. But then, hey, it turns out that the model is actually pretty good now. It's gotten really good reviews from early users. So it's rebounded anyway, up 13% or something recently.

on the whole. So kind of interesting to see how, how sensitive stock prices are to these kinds of announcements, you know, whether it's hype or whether it's not, I mean, people are at some level pricing in major, major value, uh, for these models. And it'll be interesting to see if that pans out in the near term. Yep.

Onto the next story, we have Cerebras Systems releases seven new GPT models trained on CS-2 wafer-scale systems. This is another hardware story where Cerebras Systems is a company that builds kind of specialized

chips that are pretty different from the CPUs and GPUs that are typically used. They're basically these gigantic chips: instead of being relatively small, like a CPU, where maybe two or three inches

square is pretty standard, these are really, really big chips. And yeah, now they have trained GPT models and they said that it took only a few weeks compared to a few months, which would typically be how long it takes. And they have released the weights for a bunch of them. At the top end, it was 13 billion parameters,

which is pretty big. We haven't seen too many models at this scale be released, especially not by a company.

Yeah, yeah. The open source question is always an interesting one, you know, especially like lately we've seen more and more people playing with LLaMA, the Meta model that leaked. That's also kind of very powerful and cutting edge. And it's led to a bunch of these open source versions of ChatGPT essentially, and raised a bunch of questions about whether or not open sourcing is a good idea at this stage with this level of capabilities from a malicious use standpoint.

At the very least. And, you know, we've got all kinds of companies taking positions on that separately. But when it comes to this hardware, like, Andrey, do you have a rough sense, like an intuition, as to what it is that makes these new Cerebras chips, like the Cerebras system, more efficient than a GPU-based strategy? Yeah.

Yeah, I'd have to think about it. It's pretty different, so I don't even fully understand the implications. I think my general sense is that there is less of a need to do transfer of data from memory storage to a GPU. So it's, yeah, just sort of more streamlined in terms of

keeping the data next to a compute, but I might be totally wrong. I would have to think about it. Yeah, I mean, that jives with my understanding so far in that it seems like the big bottleneck right now is just moving data around the chip and people are trying all kinds of optical-based strategies to do that. And I guess Cerebras has their own. Does that sound like the right way to think about this? Yeah, I would say so, I think.

Cerebras is one of these companies that are really building something totally different from standard computing. So even NVIDIA stuff, that's GPUs that you had for 30 years now that are still based on a bunch of chips and memory storage. And this is pretty different from just a basic computing architecture perspective.

So I am quite curious to see if this will make a large impact where, as you said, Transformers seem to be kind of a standard and we get more and more customized hardware for them. And then maybe we wind up just having

fully new architectures that are just specialized for AI. Yeah, it's interesting. I don't know. I take this story as a reminder that no matter how much we think we have a sense of the pace of AI development, there's always the risk that we're going to be blindsided by a wildcard development in hardware that just says, hey, you know what? Like,

that giant transformer model, that GPT-5, turns out to be way cheaper or trainable on way smaller systems or something. Kind of interesting to keep that possibility in mind anyway. Yeah. Yeah, and I think the other thing to note is 13 billion parameters used to be

Kind of not much, but interestingly, now that we understand these things a little better, LLaMA, for instance, has shown that you can get to pretty impressive performance by just training correctly. So this could be a pretty significant starting point for a lot of development of new models.

Next up, we have generative AI set to affect 300 million jobs across major economies. So this is research by Goldman Sachs, and it has a bunch of details, but roughly these 300 million jobs will at least partially be affected by generative AI. So

Roughly two-thirds of jobs in the US are exposed to some degree. For most of that, there would be like less than half of the workload tasks, let's say, but about 7% of jobs will have at least half the job basically be done by AI. And you could imagine that once most of your job is done by AI, you won't

have quite as many people working at those jobs. Yeah. You know, these kinds of articles, first of all,

I'm struck by how conservative some of these estimates look, at least to me. They're talking about like, you know, this could spark a productivity boom that would eventually raise annual global GDP by 7% over a 10-year period. Like, to me, that sounds laughable. Like, I think it's a bit of a ridiculous time estimate, but I'm very ready to be embarrassed by my wrongness on this.

I think when we're talking about AI systems that have capabilities, like just look at GPT-4 and what it can do. And that's on a scaling trajectory that seems to keep going and going. Like we talk about we're automating wholesale 7% of jobs. I just...

And a lot more seems like it will in practice be in the line of fire. I'm very ready to be surprised, but part of what informs this is also just like humans suck at their intuition when it comes to guessing which jobs are going to be affected by this stuff. You know, there was a time like I remember, you know, 20 minutes ago, we were talking about –

Oh, metaphorically, I mean, like way back in the day, people said like, oh, people who work in trades, they're going to be the first ones. We'll have robots that do X, Y, and Z. And then it turned out that, hey, you know what? Those are actually some of the safest jobs, which is part of the conclusions that people have been coming to lately in these sorts of studies.

And it just, it's that old idea of Moravec's paradox, that the tasks that are easy for computers are not always the ones you would expect based on looking at humans. And that sometimes you can be surprised by how easy something turns out to be. I, if I were a betting man,

I would be short this prediction and put a lot more money on a much, much bigger increase in GDP and in workforce disruption. But that's my hot take for the day. Yeah, I think seven does seem pretty conservative. On the other hand, there's been quite a bit of research in the past decade by economists and kind of analysis of the possible impacts of AI on the economy.

One of the things that could be in play is that even in previous industrial revolutions where you had, you know, motors, right, as a foundational technology, there is a bit of a delay in adoption, right? So we have all these things, but to really fully transfer over to AI will not be, you know, overnight.

It's going to take some time. And a lot of GDP is buying goods, right? It's not services. And these hardware things, things like trucking (which actually might be affected, but not by generative AI) and factories, there's quite a bit of the economy that's not exposed to this more kind of purely text-based work. So, yeah.

Oh, sorry. Yeah, no, I was just going to say, I totally agree on that aspect of it. I do think on the rollout side, the rollout timing, one thing that makes this technology qualitatively different from automobiles, motors, and steam engines, things like that, that took a long time to diffuse through the economy and start to create real GDP value is that it is software-based. And so it's not a coincidence, at least in my view, that ChatGPT

became, when it was launched, the number one fastest-growing, fastest-adopted technology in the history of human technology. They reached 100 million users in, what, two months or something? I think that's a reflection of the medium: we're using software now to diffuse this technology, and it's diffusing through platforms like Slack, where you can have a...

like an automated service bot that helps you do your job, that sort of thing. So much, much faster to get the value in the hands of users, which is why I'm sort of skeptical of the diffusion argument for AI specifically. Again, I could be wrong. I mean, what the hell do I know? But anyway, I think it's interesting to consider that we might be in a fundamentally different era from the standpoint of diffusion of economic value with this stuff.

Yeah. No, I think there's many dimensions you could talk about. The other thing is if you just automate jobs, does that actually increase the output of an economy? Right. So I don't know. There's many questions, but there's going to be many of these reports, I'm sure, coming out.

And speaking of automation, the next story is Agility's latest Digit robot prepares for its first job. So Agility Robotics makes these kind of humanoid-esque robots and now has announced the latest version of the Digit robot, which is now specifically designed to move these plastic bins in warehouses. So it's

It's now got a little head so it can better communicate with people, and it's got these weird hands that are basically entirely for carrying bins. And yeah, I mean, the hope is that they can actually deploy this robot. And we don't really have any of these kinds of humanoid robots working anywhere. Yeah, their hope is to deploy it.

Yeah, it's so interesting to see the push into robotics. I think we talked about this briefly in a previous episode, but just the idea that, you know, LLMs had taken, like large language models had taken the air out of the room for people who were trying to apply this stuff to robotics. And kind of cool that, you know, there are definitely companies that are just leaning in. And it does seem lately like...

some interesting progress has been made. Like this is not the, you know, the first time we've seen a major kind of step forward in this space. And I'm kind of curious, like how, you know, how much LLM tech is going to end up benefiting robotics, you know, making it easier for us to give instructions that are followable in a robust way for bots to ultimately build and use like robust world models in their heads. They can use that to kind of logic their way through navigating their environment.

But yeah, anyway, really, really interesting kind of point in that trend. Yeah, I think, you know, long term, you'll need definitely these LLM things to make generalist robots that can do anything. But this is showing how having a specialized robot for, you know, pretty much deployed in one very important application could be a good starting point.

Moving on to research and advancements. First up, we have learning to grow machine learning models. And this is a paper from MIT that pretty much does what the title says. Instead of just having an initial neural net with kind of a set

number of weights that are just trained without being altered, here they propose a method to continually expand the neural net as you train, to make the training more efficient and to hopefully get to the right size. This is not the first time this has been done, but they say this approach is nice for various reasons, because the way the model is grown is itself learned.

And the headline result is that on average, it reduces compute costs for training something by half, by 50%, which is, you know, a lot. Yeah.
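The MIT paper learns its growth operator, but a hand-designed version of the same basic idea, warm-starting a wider layer from a trained smaller one instead of random initialization, can be sketched in a few lines. This is a Net2Net-style illustration under our own simplifying assumptions, not the paper's method:

```python
import numpy as np

def grow_linear_layer(W_small: np.ndarray, new_width: int, noise: float = 1e-3) -> np.ndarray:
    """Expand a trained weight matrix (out_dim x in_dim) to a wider output dimension
    by duplicating existing rows rather than re-initializing from scratch."""
    out_dim, in_dim = W_small.shape
    assert new_width >= out_dim
    # Pick existing units to copy into the extra capacity.
    copy_idx = np.random.randint(0, out_dim, size=new_width - out_dim)
    W_big = np.concatenate([W_small, W_small[copy_idx]], axis=0)
    # A touch of noise breaks symmetry so duplicated units can diverge during training.
    return W_big + noise * np.random.randn(*W_big.shape)

# Example: warm-start an 8-unit layer from a trained 4-unit layer.
W_trained = np.random.randn(4, 16)        # stands in for a trained small layer
W_grown = grow_linear_layer(W_trained, 8)
print(W_grown.shape)                       # (8, 16)
```

A faithful function-preserving expansion would also rescale the next layer's incoming weights for the duplicated units; that bookkeeping is omitted here to keep the sketch short.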

It's true. Yeah. And you get 50% here, 50% there, these things compound pretty fast. It seems like another one of those developments that you could imagine getting folded into the next wave of super scaled AI models. Yeah. It's kind of interesting because I remember when I was first diving into deep learning way back in the day, and I was thinking about the idea of randomly initializing, like picking random values for the parameters in the model to start.

And that seemed like kind of a sloppy, rough strategy. So it's kind of cool to see people take this approach and say, hey, well, maybe we can benefit from what's been learned by a smaller model and then grow another one around it. I'm

I'm curious as well about where, if you compare the training process of a randomly initialized model to this one, where exactly, and maybe I missed these curves, I'm sure they actually, I'm sure they did plot them, but what those curves look like? Do they plateau presumably to the same level in the long run? But how much do you gain? When do you gain the performance advantage during the training process?

Yeah, I think they have quite a few results and experiments. Most of it is on these relatively smaller models like BERT, BERT-small to BERT-base, so nowhere near ChatGPT or GPT-3. And on these, they have some plots that show that basically all throughout training, you end up improving more rapidly.

And they do compare against several other approaches from the last few years that have done similar ideas, which is basically to take a smaller neural net and then sort of interpolate the weights a little bit to initialize your larger neural net. So yeah, I also have not found some details here that I was curious about, like how often do you apply this operator?

Do you apply it like every X steps or how do you know? Is it just part of your training schedule now? It's not too clear. But what is clear is if this is actually general and they say this is a general approach that works for transformers, then if you can reduce the compute cost of training a large model by 40%, 50%,

Yeah, obviously that would have an immediate effect in industry and even in academia as well.

And they did do some more experiments. They have initial results on billion-parameter models, GPT-2 in particular, and they did show that 40% efficiency bump there. So if this actually does scale up to 10 billion, 20 billion parameters, yeah, it's just going to be applied everywhere, I guess.

Yeah, it's really interesting because this is one of those things that fundamentally changes how we think about the process of training a model in a way that, I don't know, just aesthetically seems new. Like I haven't seen people fuck around too much with the random initialization premise. So it's kind of interesting to see that get shaken up.

And a factor of two X is like, I think it's interesting to think about what that means. You know, like will frontier labs respond by like, you know, pocketing those, those profits, or are they going to dump the same budget and double the size of their, their systems or the scale of their systems? My guess is the latter, obviously, but anyway, it gives people a new, a new set of options in terms of in terms of how they want to build and train their models, which is exciting. Yeah.

Yeah. Yeah. I do think scaling up is still not easy. So this would make training faster. But if you're trying to scale up to more weights, you would need a larger supercomputer and so on. But yeah. Yeah. Interesting to see where this goes.

I also kind of wonder about the interpretability angle. If you have a smaller model that you're going to use to seed a larger one, I wonder if this gives you an option to deeply understand that smaller model and do your mechanistic interpretability or whatever you're going to do, understand those neurons. And then as you grow that structure, maybe it makes it marginally easier to understand the bigger structure. That's me pulling stuff out of my ass here, but I expect like

People like Chris Olah and Anthropic generally will probably have an interesting take on that, hopefully in the coming weeks and months.

Yeah, no, I think that's very plausible. OpenAI has shown that you can use smaller models to predict the performance of larger models. So you can, with relatively small compute, kind of understand how a change to the architecture or something like that can then, once you scale it up, what the impacts will be. And so I think you could probably, yeah, extend that to more things.
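As a toy version of that "predict the big run from small runs" workflow, you can fit a power law to losses from small training runs and extrapolate it to a larger compute budget. The numbers below are invented; the point is just the shape of the procedure, not OpenAI's actual methodology.

```python
import numpy as np

# Invented (compute, validation loss) pairs from a few small training runs.
compute = np.array([1e17, 1e18, 1e19, 1e20])
loss = np.array([3.10, 2.65, 2.31, 2.05])

# Fit loss ≈ a * compute^b by linear regression in log-log space (b comes out negative).
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)

def predict_loss(c: float) -> float:
    return float(np.exp(intercept) * c ** slope)

# Extrapolate to a 100x larger compute budget before committing to the big run.
print(f"Predicted loss at 1e22 FLOPs: {predict_loss(1e22):.2f}")
```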

Next story is researchers from UC Berkeley and DeepMind propose SuccessVQA, a reformulation of success detection that uses Flamingo. So the idea here is, for something like reinforcement learning, which is more of a trial and error type of technique, as opposed to just having a data set and learning from it,

you need something to detect success or failure, right? And in general, that's been customized for every single task for reinforcement learning. And it's implemented in a different way for every single thing. And here they propose a kind of general approach

where you have a video of the agent doing its task execution and you use a video question answering model

So here it's Flamingo, which is kind of like GPT-3 for images and text, you could say. And yeah, have this large model predict whether the agent has accomplished some task. Like you ask it, did the robot successfully insert the medium gear? And it says yes or no, and that's your reward.

So, yeah, this is kind of interesting, I think. We haven't been talking about reinforcement learning too much these days, but clearly, I think we'll start seeing reinforcement learning be impacted by these giant models more and more, and this is just one example.
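A rough sketch of what that kind of reward wrapper could look like, assuming you have some video question answering model available behind a `vqa_model(frames, question)` call; that interface and the stub below are placeholders for illustration, not the paper's actual implementation:

```python
from typing import Callable, Sequence

def success_vqa_reward(
    frames: Sequence,                            # frames from the episode's video
    question: str,                               # e.g. "Did the robot insert the medium gear?"
    vqa_model: Callable[[Sequence, str], str],   # assumed interface: (frames, question) -> answer text
) -> float:
    """Turn a general-purpose VQA model's yes/no answer into a binary success reward."""
    answer = vqa_model(frames, question).strip().lower()
    return 1.0 if answer.startswith("yes") else 0.0

# Example with a stub standing in for a Flamingo-style model:
fake_vqa = lambda frames, q: "Yes, the gear is inserted."
print(success_vqa_reward([], "Did the robot insert the medium gear?", fake_vqa))  # 1.0
```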

Yeah, and I think an interesting angle on the safety, the alignment problem, the safety problem generally in AI, because, you know, one of the common problems that you get into if you have a reinforcement learning agent and then you have some like some, you know, success evaluation function, some sort of reward model that goes, OK, you carried out that action. Well, let me evaluate that action and the state of the world and give you a reward based on that.

is that you can get that system to learn to hack that process. If the reward metric is simple enough, it can find dangerously creative strategies that make the reward go up, but that don't look anything like what you would actually want. A famous example of this, OpenAI trained an AI system to play this game called Coast Runners, where basically it was a racing game. You'd race a boat around a circuit.

And the AI learned that it could, through strategically bashing the boat against different obstacles and causing a whole bunch of damage, it could collect more of these turbo points that it was getting rewarded for collecting and thereby do better at playing the game, at least according to the reward function it was given, than it could just by playing the game as intended by running that circuit as fast as possible. And so what I find interesting about this is SuccessVQA.

We're talking about taking a model with a pretty general understanding of the world, just feeding it an image and getting it to look broadly at that landscape and say, hey, generally speaking, yes, the robot succeeded at putting the trophy in the trophy case or something like that.

And so you've got a more robust way of evaluating the performance of these models. And hopefully that starts to decrease by a little bit the chances that your AI, as it's getting optimized in these environments, will learn to hack that reward. It's not foolproof, but it gives you a little bit more robustness in that sense. - Yeah, I think in general, this is pretty much having a binary reward. So did you accomplish what you wanted to by then or not?

And that makes reward hacking a lot harder. There are still cases where in simulation you could, you know, hack the physics and do something crazy. But if you're doing this in the real world, it's going to be pretty hard to think of ways to do something crazy to get the reward.

So yeah, I think it's an interesting paper. It's very much still research where they are fine tuning it for a few specific tasks and running it for a few tasks. It's just one more kind of way to try and do

training with reinforcement learning. And we've seen some similar ideas before with classifiers being used as reward metric, but it does seem like a sort of pretty general approach that can leverage these general purpose systems to apply them to all sorts of domains. And I think exciting for that reason, right? That like it's

For those of us who worry about the alignment problem and, you know, achieving that scalably, solving that scalably, using AI systems as evaluators of the performance of AI systems feels like it's going to be necessary to achieve that, to achieve some kind of alignment. And I think two of the dimensions that they flag in the paper that are especially useful for that, you know, the first is they flag language generalization, this idea that this system, like if you ask people,

about the success of your AI that you're training in different ways, you'll still get a coherent response, right? So if you ask like, did the AI put the trophy in the trophy case or did the AI put away the trophy? Those two questions can be interpreted constructively by the sort of

Flamingo-like model here as having the meaning that they should. And so that does cover one dimension of robustness. And then the other is just the visual robustness, right? This model being able to still understand what it's looking at if it changes its perspective on the scene, and hopefully using its more rounded, general world model to perform more robust evaluations. Yeah.

Yeah. More and more, it seems like for these embodied settings, you sort of have a stack of AI where you'll have the success detector, you'll have a language model for general purpose reasoning, you'll have another general purpose model for doing the low level control and manipulation.

So there's this chain of translation from high level to low level, which is interesting, I guess.

Moving on to our lightning round. First up we have new virtual testing environment breaks the curse of rarity for autonomous vehicle decision-making. So this is pretty much about a new simulated environment that can simulate just the

scary, dangerous situations that autonomous vehicles can get into. And this is important because when you actually want to deploy something, you need to actually do testing to show that your autonomous driving stack is safe. And if you can do most of that testing in simulation, then that makes it much easier to deploy these kinds of autonomous vehicles.

How exactly do they pull this off? The curse of rarity, which is the thing that they're going after here, the fact that it's really rare to have dangerous events happen. Right now, the convention is, yeah, you drive as many miles as you possibly can with your AI system and you hope that it encounters some dangerous situations so that you can train it against those dangers. How do they condense that

experience and somehow preserve these dangerous situations that they can be trained on? What's the strategy? Yeah. So the basic strategy is, instead of collecting just the same sort of data you would by driving around for a million miles, in the simulation you can pretty much condense it to just those cases where

it's dangerous. So you can save a lot of time that way. And I think they are leveraging a lot of existing data on what sort of scary cases there can be. Now, of course, this does have limitations where...

We are talking about kind of weird-ish situations that aren't typical. So I think this is addressing more typical situations like an emergency vehicle driving on the road or another driver running a red light. Things that are not common, but also not that uncommon.
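One generic way to get at that "condense the rare cases" idea is importance sampling over a scenario library: over-sample the dangerous scenarios in simulation, then re-weight the outcomes so the estimated real-world failure rate stays unbiased. The scenario names and numbers below are invented, and this is a simplified illustration rather than the specific method in the work being described.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented scenario library: how often each occurs in the real world, and a
# (normally unknown) true failure rate used here only to fake the simulator.
scenarios = {
    "nominal_highway":   {"p_natural": 0.989, "fail_rate": 0.0001},
    "red_light_runner":  {"p_natural": 0.010, "fail_rate": 0.05},
    "emergency_vehicle": {"p_natural": 0.001, "fail_rate": 0.08},
}
names = list(scenarios)
p = np.array([scenarios[n]["p_natural"] for n in names])   # natural frequencies
q = np.array([0.2, 0.4, 0.4])                              # over-sample the rare, risky cases

def run_episode(name: str) -> float:
    """Stand-in for a full simulated drive; returns 1.0 on failure."""
    return 1.0 if rng.random() < scenarios[name]["fail_rate"] else 0.0

n_sims = 20000
idx = rng.choice(len(names), size=n_sims, p=q)
weights = p[idx] / q[idx]                                   # importance weights
outcomes = np.array([run_episode(names[i]) for i in idx])
estimate = float(np.mean(weights * outcomes))
print(f"Estimated per-trip failure rate: {estimate:.5f}")   # close to sum(p * fail_rate)
```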

And that makes sense to me, I guess. Yeah. And then that's, I think that's what was kind of confusing me about this, because the challenge of self-driving cars is like that long tail of like kind of weird things that

could happen, you know, where, I don't know, all of a sudden, uh, I don't even know. I mean, it's a weird thing that can happen. Somebody in a, in a weird suit, like in a weird costume jumps out in front of your car and your car has to figure out what it is and all that stuff. Um, so, so I guess it's like this fast forwards us past the like kind of more expected weird stuff, but we still have to deal with that long tail of like weird edge cases.

Yeah. Yeah. Apparently self-driving is harder than, what would you call it, general reasoning. It's harder than the LSAT. Yeah. Who knew?

And next up we have scientists are using machine learning to forecast bird migration and identify birds in flight by their calls. So this is a little article that talks about how AI is being used for, generally, the study of birds, or ornithology.

And it's talking about this system BirdCast that uses a bunch of data collected from weather radar.

And yeah, that allows the researchers to have a machine learning model to understand these images and quantify bird migrations, which I imagine in the field of ornithology is kind of important. Yeah. And things like climate change study too, I guess, understanding, you know, how birds are responding to new pressures, sort of an interesting application.

Also just generally another kind of more of these narrow applications of AI that are easy to forget in the context of things like chat GPT, but are creating value in some ways, very clearly human aligned ways. So these sorts of specialist scientific systems are kind of an exciting subclass of this new age of AI that we're entering. Kind of cool to see it used here. Yeah, definitely. I think you can probably recall when...

Protein folding was like all the rage last year or two years ago. DeepMind had AlphaFold. And now that's kind of a long time ago by AI terms. But this is another example where science, the discipline of science is being transformed or at least greatly affected by AI in all sorts of fields. And that's just another example.

kind of metric or way to be impressed by AI where just all of science is being accelerated by AI now, which is, yeah, cool.

And we have new in-home AI tool monitors the health of elderly residents. So there are obviously a lot of older people and we have shortages of people to take care of them. And this new system

It's pretty interesting. It uses low power radar technology and AI here can be used to basically process these readings from the sensors to predict and understand

what these elderly people are doing from those kind of probably noisy and hard to read through radar data. - Yeah, and I think again, one of these very positive applications of AI, we think about our elderly as a society and the resources it takes to kind of keep them safe and engaged.

This is one dimension of AI, including conversational AI that I'm really bullish on and excited for. It's not easy being an old person in our society. And when you're in an old folks home, if you've ever had relatives there, you kind of know it's hard to get them the attention they need. So from a safety standpoint, keeping...

old folks safe from falls and things like that. This sounds really good. Then also, you got the conversational systems to keep people engaged. These are rough solutions. Obviously, you'd love to have humans in the mix too, but as the population ages, I expect

things like this to become more important, especially as people have fewer and fewer kids to take care of them, that sort of thing. And in countries like China, I'd expect this sort of tech to become differentially advantageous just because of the age and the fertility rates.

Yeah, yeah, exactly. It's never going to replace human caretakers, but it will help, because there is a shortage of human caretakers. And in fact, there have already been pilot programs in New York and elsewhere.

There's a company working specifically on having these little Alexa type robots that have a little bit more movement that can interact with the elderly and basically keep them company, but also help them with things like their medication or reminders and so on. So I also think this is something that is kind of a big deal or to be excited about.

And last up we have Runway Gen-2 is the first publicly available text-to-video generator. Maybe we should have a whole section on generative AI at this point. But yeah, text-to-video instead of text-to-image, you can generate short videos. So just like a few seconds, not super high resolution, pretty rough.

But, you know, it's a first step. We've seen only a few really research papers from last year that also have shown this kind of describe a video and you get a few seconds of it. And it's still very early on. But now we have a publicly available tool from Runway where you can try it out and maybe use it for something.

Yeah, just another example of that absolutely relentless effectiveness of scaling and just how it keeps working. I remember talking to somebody about the state of the field in 2015, and we were discussing – basically, it was all classifiers and regressors back then. So you'd feed, I don't know, an AI an image, and then it would tell you what's in it, right? It wouldn't generate an image.

And so he was kind of like, well, I think we could just run this process backwards at some point and go from the category label to the image. And we were talking about what it would take to get there. And we were talking about algorithms. We were not talking about just processing power and scale. And it seems like that's all that was needed. Like to run this thing backwards, you just need, hey, more compute, more processing power, more scale. Interesting also, like

They point out in the article quite a few times this idea that Midjourney was not able to handle photorealistic image generation until fairly recently. And this video generation isn't photorealistic either, so it's easy to look at it and kind of like shake your head and say, ah, well, it's not there. But now we've got way too many examples of how overnight, like I'm sure six months from now, a year from now,

We're going to be looking at really kind of longer videos that are photorealistic and maybe a warning shot for the emergence of that capability, but it's going to be interesting to see what it does to entertainment, what it does to art, and so on. Another big step.

Yeah, yeah, exactly. Videos are a little tricky, so you're not going to be generating movies anytime soon, but I also think that we'll get much more impressive text-to-video models soon.

Yeah. And also, actually, one last quick thing is, this is an interesting way to check how good of a world model your AI has. Like if it's able to create videos of, for example, like two objects interacting with each other in a way that looks like physically possible, then that's a good indication that your model is grounded, that it's learned some kind of physics and

And, you know, we've seen people asking language models, for example, more and more of these questions about like logic problems or problems that involve like, what if I took an object like this and an object like this and I put them together or had an assembly of gears that worked in this way? What would the output be? Kind of interesting to see videos as another way to probe at like what, you know, how robust, how complete are these world models? How grounded is AI's understanding of the real physical world? Definitely. Yeah.

Next up, policy and societal impacts, where of course a lot is going on. So the first story is about how OpenAI co-founder has stated that the company's past approach to openly sharing research was wrong. So this is Ilya Sutskever, who we mentioned before.

And this, I think, is an interview. And the basic statement is because it's so competitive right now, OpenAI is a business. OpenAI originally was called OpenAI because the idea was to be open and to share their outputs as open source and, you know,

be open. There have been jokes for years now, to the point that it's a cliché, that OpenAI is ClosedAI and so on. So this is kind of addressing that and saying, you know, they don't believe in open source anymore.

Yeah. And I think an interesting question as to whether open sourcing was ever the right idea. I got to say, the first time OpenAI was announced with this idea, this kind of vision of a sort of universal open source AGI development, it sounded a little bit nuts, if I'm being honest. Like,

The idea is we're going to make human-level intelligence, and then we're just going to open source it so that anyone can use and develop on it. When you think about the level of harm that you can do – and I'm not talking here about societal harm. I mean physical harm to individuals using systems that could write malware better than human beings, that could design new biopathogens.

It always seemed weird to me that this was being entertained as a good idea. Frankly, I think it's nice to see that that's changing, that we're hearing a different tone from OpenAI recognizing, as Ilya Sutskever says, we were wrong. Flat out, we were wrong. If you believe at some point AI, AGI is going to be extremely unbelievably potent, then it just does not make sense to open source. It is a bad idea.

I fully expect that in a few years, it's going to be completely obvious to everyone that open sourcing AI is just not wise.

The idea of this open sourcing, I mean, I get the kind of utopic vision behind it. Don't get me wrong. I get the concerns over companies monopolizing this tech. It just seems to me that like the one thing that's worse than that is what you get when every person with a grudge or an ax to grind has their own like human level or superhuman AI available to them to use as they please, especially if that's open source and they can tweak it and tinker it. Like

I don't know. To me, that seems... I may be wrong. I think that's a little bit wild. I think it makes a lot of sense, right? At some point, this is very powerful technology. You can misuse it in lots of ways. And it is expensive. On the other hand, to be fair...

up to really a couple of years ago, it's not like these were AGI level models. This was more on the path to AGI as we kept getting better and better,

so that we could all sort of collectively think about alignment and understand this growth and so on. There I could see an argument that open source is best because then we could all sort of understand the technology and do that. And yeah, now maybe GPT-3, or GPT-4 now is a good reason to maybe not open source.

I totally agree. And that seems to be the position that motivates groups like EleutherAI, where, you know, their mission is to open source these models. Their stated mission is to do that specifically so that AI safety people, AI alignment people, can tinker with them. You can agree or disagree with that approach, but at least as a goal, it sounds noble, right?

The founding purpose of OpenAI, at least as Elon Musk articulated it back in the day, was explicitly that he wanted AGI, human-level intelligence, to be

proliferated in the hands of the average person. That's what I found a little bit difficult to reconcile with reality, frankly, and with the way that people would use these things in practice. Not that it's an easy problem. There is this monopolization issue. And at some point as a society, I would argue now, actually, we're going to have to figure out

how we want this game to play out. But anyway, it's an interesting focal point, right? You can kind of see, like, I'm getting worked up about this. Everybody's getting worked up about this in their own ways. There are a lot of different positions on this topic and everyone's kind of trying to navigate it as best they can, but it's extremely complex. And we're finding ourselves having to do our philosophy homework on a deadline.

Yeah, and I think on another dimension, in practice, even if this is open sourced, most people cannot run these models. So open source means you release the weights of a model, but it still requires a supercomputer to actually run it.

So what will actually happen in practice is, if you open source it, then Russia or North Korea or various bad actors like hacking groups can spend the money and build the computing infrastructure to use GPT-4 however they like. So especially given that, I think there's an easy argument to be made that

things as powerful as GPT-4 should be at least not open source, should be kind of shared in a limited way, in a more careful way, if not entirely closed. That's true. And then there's also the angle when you think about like the current need, let's say, for supercomputers to run some of these open source models.

There's the fact that processing power is getting cheaper all the time. We just finished talking about the Cerebras AI platform that's making exponential improvements in the efficiency of processing for these models. And we've got the H100 coming online. The cost of training these models is collapsing. And so if you open source a model that today requires a supercomputer to run,

Two years from now, we may have to revisit that story. And by then, the model's been out there forever. Everybody has it if they want it. And so there's this kind of like treadmilling effect where when we think about the cost of using these models for malicious purposes or whatever, we kind of have to think about the cost not just today, but the cost, you know, one, two and three years from now, which kind of complicates things a little bit.
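
To put rough numbers on the "needs a supercomputer" point, here is a minimal back-of-envelope sketch. The figures are illustrative assumptions only: a GPT-3-scale model of 175 billion parameters stored at 16-bit precision and a 24 GB consumer GPU; GPT-4's actual size has not been disclosed.

```python
# Back-of-envelope: why "just release the weights" still locks out most users today.
# Illustrative assumptions: 175B parameters (GPT-3 scale), fp16 weights,
# and a 24 GB high-end consumer GPU. GPT-4's real size is not public.

params = 175e9           # parameter count, GPT-3 scale
bytes_per_param = 2      # fp16 stores each weight in 2 bytes
consumer_gpu_gb = 24     # memory on a typical high-end consumer card

weights_gb = params * bytes_per_param / 1e9
gpus_needed = weights_gb / consumer_gpu_gb

print(f"Weights alone: ~{weights_gb:.0f} GB")                  # ~350 GB
print(f"Consumer GPUs just to hold them: ~{gpus_needed:.0f}")  # ~15, before activations
```

Quantization and cheaper hardware shrink those numbers every year, which is exactly the treadmill being described: a model that is impractical for most actors to run today may be easy to run a couple of years from now.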

Yeah. And speaking of that, the next story: 1,100-plus notable signatories have signed an open letter asking, quote, all AI labs to immediately pause for at least six months. This open letter was publicly posted, I think, last week, with a lot of famous people signing, like Elon Musk, Steve Wozniak,

and a lot of AI names that are pretty famous. Yeah, and basically it talks about how we should slow down or even pause, because things are moving so fast that it's not safe and we need to catch up so we can at least somewhat constrain the negative impacts.

So yeah, what do you think of this, Jeremy? I thought it was interesting on a lot of different levels, one of which was the list of signatories. So obviously we had Yoshua Bengio up there, a legend in the field, one of the founders of the modern age of deep learning. Elon Musk and Stuart Russell, but also the co-founders of DeepMind and the co-founders of Stability AI, which I thought was especially interesting given the kind of open source stuff that they've been doing.

But besides that, this is a great opportunity, I think, to discuss almost the game theory of advanced AI development today, because

I think to a lot of people, this idea of a pause seems just straight up like a great idea. And I don't know if I agree with it or not. Even though I'm an AI safety hawk, I'm very concerned about where this stuff is going. I think we're going to be hitting dangerously capable systems very soon. And I don't think we remotely have the technical understanding to control these systems. However,

We just talked about today how quickly the price of processing power is collapsing. So OpenAI right now rides the cutting edge along with perhaps DeepMind and maybe a couple of other labs. They're riding the cutting edge of this field by, in part, spending gratuitous amounts of money on processing power.

And it's those budgets really that are creating the moat that allow them to have a little bit of lead time. And that lead time, they plan to invest, hopefully, in aligning their systems, making them safe and controllable before they're then released. And so if you talk about like instituting a six-month pause,

supposedly so that these organizations can all coordinate with each other. I think that coordination is critical, but part of the downside here is that over that time, you're dropping and dropping and dropping the price of compute. You're making it easier and easier for other people to jump in, and burning some of the time that OpenAI would argue it needs on the back end to do all their alignment work to make things safe at the other end.

I actually lean towards thinking this is a pretty reasonable proposal, because in the absence of something like this, I do think that we're moving at an unmanageable rate towards potentially very risky outcomes.

I think something that just needs to be kept in mind is: what is the cost of a pause like this? And also, you know, what happens in China? What happens with hedge funds, which are not referred to in this letter as far as I could tell, but which are absolutely at the intersection? If you look at who has the budget to pull off large-scale AI projects, the economic incentive, and the know-how to do it, that Venn diagram pretty squarely includes certain hedge funds.

And they're not exactly the most transparent organizations in the world. So there's also a risk there, right, from a safety standpoint that they might not care much at all about safety or even be aware necessarily of some of the alignment risk stuff. So I think it's a very multifaceted thing. I don't think any solution here is terribly exciting, but it's certainly something we ought to be thinking about, solutions of this form and maybe in certain senses more comprehensive solutions too.

Yeah, I think there are definitely arguments to be made that the points in this letter are good. To clarify, I guess I misspoke a little bit: the actual statement is that

they call on all AI labs to immediately pause, for at least six months, the training of AI systems more powerful than GPT-4. So this is basically saying don't move further than we have already moved with GPT-4, which in practice only really applies to a few companies, right? So this is almost just

talking to OpenAI. Yeah. And the fact that you've got, you know, Demis Hassabis of DeepMind, who signed on to this, also suggests, hey,

you know, DeepMind is seeing this as something that's a good idea. And that kind of leaves OpenAI as one of the obvious non-signatories, which is another, almost political, dimension to this. I know Sam Altman tweeted cryptically this morning something like "very calm in the eye of the storm," which was a sort of amusing and dark reference to what's happening here. But

I think if nothing else, look, if you're thinking about the seriousness of this issue, first of all, I don't think it's a particularly tenable position

to say that we know with confidence that there's like less than a 10% chance of something very, very bad happening. I don't think it's a tenable position to say that we know with more than 90% confidence that something really, really bad will happen. Anything between 10 and 90% seems reasonable. Anything between 10 and 90%, though, gets you quite quickly to some very significant policy responses that would be required.

And so, proposals that have this shape, this feel, that seem almost ridiculous on their surface, I think are going to have to be entertained at some point surprisingly soon, just given the rate of the development of this technology, whether all you care about is the existential catastrophic accident risk piece,

or the malicious use piece. Those two things, whichever floats your boat, I think they're both going to be really important in the near term. And trying to get our arms around that problem is going to mean getting a better handle on the behavior, the controllability of these systems, and also on our policy apparatus to control and contain and manage this ecosystem of technologies.

Yeah, I mean, definitely in discussing this, you can be cynical and say, well, this is obviously not going to happen, you can't make everyone agree on something. But I do think the statement in itself is worthwhile. There's been a ton of media coverage of it, and so it makes the public more aware of the notion of AI safety. And I think the other aspect here is that

AI alignment, AI safety, AI deployment, AI policy, these are not or have not been sexy areas of AI research. We as a field have been prioritizing getting better and better at tasks and not worrying about making it safe to deploy these models. And we've had research on that for sure, but that's just not been the hot topic yet.

And now maybe, hopefully, with these sorts of letters and with GPT-4, that will change, and we will kickstart a real shift in focus across the field toward caring about AI safety and alignment. Yeah.

Yeah, I certainly hope so. I think I found some of the reaction on Twitter and social media generally a little disappointing. There's a body of people who have come out and pooh-poohed the idea that AI represents a kind of fundamental risk to human existence, in a context where I think we have a lot more research showing that, in fact, this is something to be taken far more seriously

than their position suggests. And, you know, we've even seen Geoff Hinton, one of the founders of deep learning, kind of move in this direction and start to signal that, hey, I think actually this might be a thing. Again, we mentioned Yoshua Bengio is even one of the signatories on this thing. So, you know, there's this aesthetic, right? It's always fun to be the skeptic. It's always fun to be the person who walks into a room and tells everybody, oh, I'm here to debunk

these crazy-sounding ideas, they shouldn't be given any airtime, and so on. And you know what? Most of the time that position ends up being true. And that's part of the reason why it's so easy to defend in the public square. But the reality is we have research on AI and power-seeking, for example. We know that, by all the experiments and all the theory that anyone's ever been able to run and set up, we consistently land on this result that

Yeah, optimal policies, optimal RL policies tend to seek power by default. And we've seen reward hacking. We've seen all kinds of examples of these sorts of dangerous behaviors that do seem like they would generalize if you just built a more scaled, more powerful system. And so I think it's important that people who are on the capability side start to educate themselves more about these risks.

Certainly before passing any commentary on issues like this, because otherwise you get something like the Andrew Ng comment, which, frankly, I just don't think stands up to the kind of scrutiny that it should, especially coming from a specialist, as incredibly knowledgeable a guy as Andrew Ng is. He's just not an AI safety expert, and

unfortunately, that generally does disqualify him from having a complete perspective on this particular issue. That's, again, a Jeremy hot take. I guess today's that day, but I hope I'm wrong in any case. Yeah, I'm still a skeptic, at least as far as the fundamental concern, human extinction, goes, at least with these types of generative models. Until we get to agent-based,

agent-type AI that is actually autonomous, I think it's okay. But what I will say is, even as someone who is not so much on the x-risk side, AI safety is clearly important for many reasons with techniques this powerful. And I think what I don't like about the letter is that

it is easy to ridicule this proposal to pause for at least six months the training of AI systems. I would have personally preferred something that said, for instance, we need to establish a multi-organization framework,

a kind of organization where, before you deploy a model, you can submit it for testing by AI safety experts, who can independently verify and red-team it and so on, to make sure that there is some consensus that it is safe.

And in that case, I think, you know, if someone is trying to make an agent-type AI that can act autonomously, which is the actually scary thing, not GPT-4, which is not autonomous at this point, then having this oversight committee of some sort, or, you know, multi-organization thing, perhaps even as part of

political regulations, that is something concrete that probably needs to be established in some form. And yeah, I think it's kind of a shame that that wasn't brought up in this letter. But regardless, the letter is good insofar as it pushes forward the idea of AI safety.

Yeah, no, I totally agree on that concreteness point. They have some suggestions, but like you say, they're not quite explicit enough. And, you know, on the question of agency and what kinds of models are concerning, it's also the case that you can use GPT-4

to very quickly bootstrap an agent-y kind of system with a quick sort of RL loop on top. So these scaled systems leave us fewer and fewer incremental steps away from those dangerous, risky systems, and the extra things you need to slap on top to make them truly dangerous are getting smaller and smaller and harder to detect.
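
To make that concrete, here is a minimal sketch of the kind of scaffolding being described: an ordinary control loop wrapped around a chat model, not reinforcement learning in any formal sense. The query_model and run_tool functions are hypothetical placeholders standing in for a model API and a tool executor; they are not real library calls.

```python
# Minimal agent-style scaffold: repeatedly ask a chat model for the next action,
# execute it, and feed the result back in. query_model() and run_tool() are
# hypothetical stubs, not a real API.

def query_model(messages: list[dict]) -> str:
    # Placeholder: a real system would call a hosted chat-model API here.
    return "DONE"  # dummy reply so this sketch runs end to end

def run_tool(action: str) -> str:
    # Placeholder: a real system would run a web search, execute code, etc.
    return f"(no-op result for: {action})"

def agent_loop(goal: str, max_steps: int = 10) -> list[dict]:
    messages = [
        {"role": "system", "content": "Propose the next action toward the goal, or reply DONE."},
        {"role": "user", "content": goal},
    ]
    for _ in range(max_steps):
        action = query_model(messages)              # model proposes a step
        if action.strip().upper().startswith("DONE"):
            break
        observation = run_tool(action)              # execute it, observe the result
        messages.append({"role": "assistant", "content": action})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return messages

if __name__ == "__main__":
    print(agent_loop("Summarize today's AI news."))
```

The point of the sketch is just how little extra machinery sits between a passive text model and something that acts in a loop; everything hard lives inside the two stub functions.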

So that's kind of where I have some sympathy for this whole general idea. I think it's a complicated space and I don't blame anyone for trying to reason through it, but I do wish that people would engage with the actual kind of underlying arguments for it rather than playing a more aesthetic kind of card.

Yeah. So it's a pretty short letter. Once again, we will always include links to these stories. So if you want to read it, you can go ahead and check out the description of the podcast.

Moving on to the lightning round. Speaking of AI safety, there is a bit of an overview article at CNBC titled "In San Francisco, Some People Wonder When AI Will Kill Us All." And yeah, this is x-risk, right? Extinction risk. And it's not that weird; many people in AI now do at least somewhat

care about it. And this is covering in particular the Misalignment Museum in San Francisco. We've been mentioning alignment over and over and over: alignment is basically when AI does what we want it to do, and misalignment is when it does stuff we don't want it to do, such as kill all humans. And this is a museum kind of about this whole idea of misalignment.

Isn't it just like the most San Francisco thing you've ever heard of? The premise of it is some misaligned artificial intelligence has wiped out humanity and it basically has left this art exhibit to apologize to current day humans.

And they've got all these ridiculously inside-baseball references to AI alignment. There's the paperclip maximizer, which I think we've talked about before. Anyway, it's this scenario, the details don't matter, but it's a scenario that certain people use to illustrate principles in AI safety. And so they've got a sculpture called the Paperclip Embrace, where they use paperclips to show two people in a hug.

They've apparently got Roombas with brooms attached shuffling around the room, and props from the Matrix movies, and stuff like that. So anyway, very, very inside baseball. But there are things that I think generalize beyond the very in-crowd in San Francisco, and it's designed to broaden the conversation to non-technical people.

I thought it was cute. I thought it was very funny. And boy, is it ever the most San Francisco thing to have an exhibit dedicated to AI existential risk. Yeah, yeah. It's quite cute and humorous, surprisingly.

I think what you can see from outside, next to the title, is "Sorry for killing most of humanity." Again, this is from the perspective of the AI. And yeah, I think it's very, very Bay Area, San Francisco. But even if it's mostly catering to the tech crowd,

having more public awareness of AI safety and these concepts of misalignment is good, and these kinds of museums hopefully will come about more, just so people are aware of what AI is, its history, its potential future, and so on. So kudos to the creator and

curator of this, Audrey Kim. Yeah, yeah. And one thing I like about it, a last quick note, is that the donor was anonymous, which I kind of like because it's less distracting. We're not talking about who funded this thing; we can focus on the art and Audrey Kim's great work here. And the next story is

Clearview AI used nearly 1 million times by US police, it tells the BBC. So Clearview AI, which we discussed quite a bit last year, is a facial recognition tool: given an image of a face, it can give you a name. And yeah, now

it's apparently been used a ton by U.S. police, which has been their primary market to sell to. In fact, the company has been

banned from selling to most U.S. companies due to an ACLU case in Illinois, but it seems that police organizations in the U.S. have been using it more and more, and now Clearview claims that it's been used a million times. Yeah, it's so interesting to see the law try to catch up

to all these developments, and the struggle that the public has in understanding how these things are being used, and our ability or inability to audit these kinds of technologies and how they're being applied.

It's, yeah, not an easy issue at all. Especially when these systems make mistakes, too, because that's something that has also happened: you have people from time to time arrested because of a facial recognition error. But yeah, anyway, one of the darker sides of this for sure. Yeah, so it's good to be aware that,

as AI proliferates, there are a lot of more practical concerns, like surveillance or profiling, that are happening currently and that we need regulation for. Next, the story is that a voice system used by Centrelink to verify identity can be fooled by AI. This voice identification system, used by the Australian government for many people,

can be fooled with AI-generated audio. And you can't just use it alone to access your account, you do also need the customer reference number, but clearly this is a major security concern.

Yeah. And I think we're just going to keep seeing stuff like this. It sort of reminds me of quantum computing, where people worry about quantum computers breaking RSA encryption schemes and all our passwords becoming crackable, at least in principle. It's sort of one of those things, right? Every new development in AI is going to undermine a certain set of fundamental assumptions that are baked into our society's structure.

And this seems like another one of them. Like you said, you still need to know your customer reference number to access these things, so it's not a totally open breach, but you don't need that many of these kinds of events before you start to see some pretty gaping holes in security systems.

Yeah. And we've already seen news of phishing where you just call someone and pretend to be someone else. And yeah, that's going to become a big tactic for scammers. We just know that, and we have to somehow prepare for it. And that's kind of weird.

The next story is also about concerns, specifically "AI can draw hands now. That's bad news for deepfakes." This is a bit of an overview article from the Washington Post about how there's been a lot of progress in AI. Recently we've seen a ton of progress in text-to-image AI, and most recently, some of the trademark

things that AI could not do, it is starting to be able to do. Last week we discussed Midjourney v5, and yeah, this article basically gets into how Midjourney v5 no longer exhibits these predictable flaws, meaning that it will be easier to fool people in all sorts of ways.

Yeah. And I remember a phase when teeth were like this too, you know, you'd see these kind of grotesque smiles on people in AI-generated images, and then that kind of went away. So we're seeing all these things. One of the interesting things about hands, too, is that,

I think I'm getting this from my wife, so I'm pretty sure this is true for most people: I remember her talking about how drawing hands was just this really big challenge when she was learning to draw and learning to paint. And it's just interesting to see that that's also the case for AI. I don't know if I would have expected that. But yeah, an interesting bit of overlap there, and another hurdle crossed for image-generating AI. Yeah, it is kind of interesting.

And in our art and fun things section, to finish up, the next story is actually about deepfakes as well. It's "People Aren't Falling for AI Trump Photos Yet" from The Atlantic, and it's pretty much exactly about what we just talked about. There were some Twitter posts showing

former President Trump being arrested. There were these news stories that he might be arrested, and so someone generated these deepfakes, you could say, or generated photos. And this article covers how people weren't really fooled. I mean, this did kind of go viral with millions of views, but for the most part, people just found it funny. People are skeptical, to the extent that,

yeah, if something seems implausible, they can still step back and say, this might be AI. But it's showing the progress towards the potential of people being fooled by these AI images if they keep improving.

Yeah, it's interesting to look at the photos too because I don't know. They have – maybe I'm just fooling myself, but they have a certain kind of – it's not quite Uncanny Valley, but it feels like they have a vibe, a mildly artistic vibe that like Midjourney is very famous for being biased towards.

And so, yeah, I'm wondering how long that'll be sustained, how long it'll take for these things to get that much more credible-looking. But they're definitely cool pictures. I mean, boy, to be able to do that, I wouldn't have bet on us being here just a year and a half ago.

Yeah, yeah, it's certainly impressive. And I mean, if you look closely, you can still see some flaws, but there really aren't as many flagrant flaws as you would see even just a few months ago. And actually, another story that I thought was kind of funny is "AI-generated images of Pope Francis in a puffer jacket fool the internet." So there was another tweet, about how

the Pope wore this sort of big white jacket against the cold, a puffer jacket, I guess. And this one also circulated widely. And, you know, unlike the Trump one, I think it actually did fool many people, right? Which is kind of funny, because unlike the Trump ones,

it looks kind of stylish and cool, so it was very fun to share and it seemed plausible enough, and it, I guess, got past our skepticism detector for whether this is real or not. And it really does look very real.

Yeah, I will say this straight up fooled me at first when I was looking at it on Twitter. I was like, oh, you know, his PR team should really be trying to make him seem a little bit more classically Catholic, I would assume. But okay, you know, you do you. And yeah, it's really impressive. I think,

you know, I wondered if there was a giveaway when you zoomed in on the cross. He's wearing a cross around his neck, like some big bling thing, and Midjourney seems to think there should be a figure of Jesus there on the cross, but it looks deformed; it doesn't look like there's really a person there. So I don't know, maybe that's a gap. I'm just

picking at things here, but overall, what a really impressive and kind of funny image, too. Yeah, yeah, it's pretty uncanny. Actually, even looking at it now, it's hard to spot an obvious giveaway. So,

yeah, hopefully, I guess we'll just all need to learn to be skeptical about images on Twitter. Yeah. And the last story: WGA would allow artificial intelligence in scriptwriting, as long as writers maintain credit. So the Writers Guild of America has now proposed a policy that allows the use of AI to write scripts,

as long as the authors are actually given credit. So basically it's saying that something like ChatGPT is a tool that a human writer can use, but the script is still ultimately the product of the human author.

Yeah. And it's so interesting. We're seeing all these decisions being made in not just like writing scripts and things like that, but also books. And it seems like people are taking a different approach with books. Like we covered a couple of stories a few weeks ago where by and large, these publishers are saying, hey, we're not going to be taking AI generated content by default.

that may have to change. And it certainly seems like the Writers Guild of America is taking a different stance on this. Kind of cool. I mean, I'm curious what the first AI-generated scripts are going to end up looking like. Maybe we've already watched shows with the first AI-generated dialogue. I don't know. But yeah. Yeah, I would imagine many writers are already using ChatGPT. And one thing to note, to be fair: for actual scripts,

at this point, you can't generate a whole script with GPT-4. You can generate a page or a bit of dialogue, but you would need to have human involvement and editing and so on. So the story might change when we scale up. Yeah, I'm curious about the version of GPT-4 that has a 32,000-token context window. At that point, you're getting into

a third to a half of a book in length. I wonder how coherent that'll be. I guess we haven't had the chance yet to play with that system, but,

yeah. And I mean, again, looking back, we've talked about how a quite famous science fiction magazine closed submissions entirely because it was getting spammed with low-quality ChatGPT stories. And that's for short stories. So for short stories and for visual art, it's much harder to say, you know, don't use AI.

Yes. Because, yeah. And the generation process, too, is actually probably going to be more challenging than for videos or, actually, for images, right? You type in a prompt, you get an image, and you can almost immediately evaluate the image. You can be like, oh, that's a good image, or, that's not what I wanted, and immediately you type a new prompt and get a new image.

But if you're talking about a full script or large amounts of text, that process is just much slower, because it takes you longer to evaluate the output of what you've just generated. So I imagine that might be an interesting challenge, and a way in which text generation ends up being a little different from image generation in terms of the speed at which we get to high-quality outputs. Yeah. And,

I guess also, this is for professional writers, right? People who are in the industry and submitting scripts, and that's a little less problematic. This is more about a policy for how you get paid if you use AI as a tool, and it's sort of saying, well, AI is like a word processor, like Word; it's a tool. But in other contexts, it's much harder to make that argument. Yeah. Yeah.

And with that, we're going to close out another episode of Last Week in AI, another episode of GPT Talk, you could say. Once again, you can check out the newsletter at lastweekin.ai. Please share this podcast with your friends and leave us a review on Apple Podcasts. We really are curious to hear what you think. And just tune in next week.