
#118 - Anthropic vs OpenAI, AutoGPT, RL at Scale, AI Safety, Memeworthy AI Videos

2023/4/14

Last Week in AI

Chapters

Anthropic plans a $5 billion, four-year initiative to challenge OpenAI, highlighting the escalating costs and competitive dynamics in advanced AI development.

Transcript

Hello and welcome to Skynet Today's Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual in this episode, we provide summaries and discussion about some of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for articles we did not cover in this episode and the links to the articles we did discuss in this episode.

I am one of your hosts, Andrey Kurenkov. I am just about done with my PhD at Stanford, and I'm going to be starting work at a startup next week, using AI for some cool stuff.

Yeah, I remember you mentioned video gaming and generative AI, anyway, a really cool intersection. I'm super excited to hear more about that as you're able to share it, because I know there's a little bit of a stealth element here. Yeah, I'm excited to start work on it, as I mentioned to you, next Monday. So we'll see where it goes.

Nice, nice. Yeah. Well, I guess I might as well chime in. I'm Jeremy. I'm your other co-host here. And yeah, I'm with a company called Gladstone AI, and we do work on AI safety at the intersection of national security and AI, looking at things like counterproliferation and anyway, risks that come from these systems. I have a book. It's called Quantum Physics Made Me Do It. And I'm proud to say that today we rocketed to the very bottom of

the Canadian nonfiction bestseller list after our first week out. So kind of pleased with that, that very bottom, kind of the B grade or the C grade.

But anyway, I was really happy with it. So check it out, I guess, if you're looking for stuff about quantum physics, AI, and AI sentience, that kind of stuff. That's sort of the vibe. Anyway, with that, I guess we can dive in because this has been a hell of a week. I keep saying this, but it has been, I think, more intense than average just based on the articles that I've been looking at. I don't know, Andre, what's your sense?

Yeah, I think last week was a little bit of a slowdown compared to this year in general. And now it doesn't stop. We have a lot to discuss. The episodes have been a bit longer than usual actually this year. And we have a lot of new listeners. So just up front, we'll say...

If you want to give us any feedback, you can go ahead and review us on Apple Podcasts or you can leave a comment on lastweekin.ai. We will definitely take any feedback into account. Maybe we're talking too much. I don't know. But always curious to hear what people think.

Our moms do check the ratings religiously, by the way. So just putting that out there. Yeah. I must admit, I also check them. It's nice to see the numbers, but we still try to do it for fun as well. So with that, let's go ahead and dive in. First up in our applications and business section, the first story is Anthropic's $5 billion, four-year plan to take on OpenAI.

So, for some context, if you don't know, Anthropic is an OpenAI-esque company that almost in a way spun out of OpenAI, with a lot of people leaving to do AI safety research. And now they have Claude, which is similar to ChatGPT. Many people say, you know, almost as capable, possibly.

And that has been built and released in a limited fashion. And now this article covers the longer term plan, where as you see from the title, a lot of money is going to be necessary. Five billion over the next two years is the goal. And part of why that is, is to build the next model, they project that they'll need

possibly like a billion dollars in compute. So yeah, it seems they're aiming high. What do you think, Jeremy? Yeah, I think this is like the ultimate example of how all trends in AI seem to be converging in the same direction. So as Andre pointed out, right, you had initially the team at Anthropic, the founding team, they were at OpenAI.

And then OpenAI decided to launch GPT-3's API, make it available to the public, basically created a huge media firestorm, a sensation. Everybody was obsessed with GPT-3, GPT-3. And the worry, at least it seems like part of the worry was, hey, by making this super public, you're actually accelerating AI timelines. You're accelerating progress in AI in a context where the safety team didn't necessarily feel that

safety was ready, that we had the ability to control AI systems reliably. It's unclear, but part of this may have played into the decision for Anthropic to bud off of OpenAI. Now, it seems like Anthropic is being forced to learn the same lesson that OpenAI learned, which was, hey, you've got to launch product, you have to build things to monetize things, because the cost of compute to train cutting-edge models, to actually be competitive

at the cutting edge of generative AI is just so high. And so, you know, you might start with this philosophy of saying, hey, we're just going to do research. We're just going to do safety research because that's the missing thing. We're not going to accelerate capabilities. But eventually you kind of find yourself forced to do this sort of cutting edge competition, push the envelope and scale more. And, you know, a bit concerning if you look at it from a safety standpoint, right? Because we do live in a world where we don't know how to control arbitrarily powerful systems.

We do live in a world where scaling can take us in principle to perhaps unbounded or almost unbounded levels of intelligence. And now we're finding even really well-intentioned actors, actors that explicitly did not want to race, actors that left OpenAI for that reason, find themselves forced to race anyway. And so I think this is part of the story here as Anthropic goes to raise their Series C, and they say in the documents,

We believe that companies that train the best 2025 or 2026 models will be too far ahead for anyone to catch up in subsequent cycles. And that I think is a really big observation. This is like flirting with this idea of AI takeoff, where basically once you reach a certain level of AI capability, like,

your AI is starting to, I don't know, I mean, what this seems to imply, like give you business strategy advice that's unbeatable, give you product development capabilities that are unbeatable. They're hinting at something that they think could occur as early as 2025, 2026, and that they think they could reach with a billion dollars. So I think this says a lot about what people at the cutting edge believe will be possible soon and what kind of fundraising is required to get there.

Yeah. And to give a bit of context, there's quite a bit of history here, where OpenAI started as a nonprofit. And then in 2019, they switched models to something like a capped-profit kind of hybrid. That was a little bit interesting. A lot of people criticized that move at the time, and that coincided with them raising a billion dollars from Microsoft, prior to the announcement of GPT-3, I guess.

And that was just the beginning of this trend of going bigger and bigger with these types of language models. And then more recently, I believe OpenAI got an investment of $10 billion from Microsoft. So these numbers are kind of in line with that. Anthropic spun out in, I think, 2021.

And from the outset it was a for-profit company, though they claim to be a public benefit corporation; regardless, it's still for-profit. So I think they might've learned that lesson already from OpenAI at that point, that you need this money.

And personally, I think this is kind of an interesting projection. They estimate that they'll need an order of magnitude more compute to train. And that is possibly true, although we've seen a lot of trends recently with

Smaller models being very capable. So GPT-3 was 175 billion parameters, a very big neural network. Lately, we've been seeing that you can really do well with 20 billion, 15 billion if you train it right. And there's all sorts of findings with, for instance, you need to train longer instead of having...

a giant neural network necessarily. So it's, we'll see. I'm curious to see, maybe we'll find more things out. That would mean that you don't necessarily need this much compute. Maybe you need to tweak your model or the training recipe. So we'll see. I think it's,

Definitely, you can see how billions are necessary, but it's not clear to me that the next kind of step is necessarily just scale.

Yeah, and lots of wild cards in this too. I think Anthropic, just because they were founded by Dario Amodei, who among other things, he's one of the co-founders, he was the lead on GPT-3 and a real believer in scaling. When you talk to the Anthropic guys, they're scaling maximalists or at least

They believe scaling will do an awful lot. I know Jack Clark has a similar position as well, another one of the co-founders. It's interesting to see that. How much do you believe in pure scale versus how much are algorithmic improvements going to matter? Anthropic's made quite a few of those themselves.

I think the last thing I'll just mention too, I think that's relevant in this context is, you know, we talk about the racing dynamics and the idea that, you know, Anthropic really didn't want to kind of push the racing dynamics even further and accelerate progress when safety hadn't caught up.

Well, the way that they've accomplished that so far with Claude in particular is they've said, "Okay, look, we're not going to be the first ones to release a new level of AI capability. We're only going to release our most powerful systems once other companies have already reached that threshold." So they waited for ChatGPT, then they released Claude, which was already in the advanced stages of development when ChatGPT came out.

And so what's interesting here is this either signals that they think a lot of other labs will be doing this anyway, or it signals that they're actually breaking with that tradition and now saying, you know what, we are actually going to push the envelope. It's kind of hard from this deck to read into it because it is investor facing. So it is about positioning Anthropic as like the potential leader. But yeah, really hard to figure out like, what does this mean? Are they sticking with their core strategy or what's happening here?

Yeah, I will say another thing worth noting is Anthropic to a greater extent than OpenAI is still a research lab. So they do publish really interesting research about understanding these giant models. And we discussed, I think a week or two ago, they released this whole approach strategy for researching AI safety that was quite extensive.

So part of what has me excited is that, as far as a lab working on these cutting edge models goes, they are pretty dedicated to understanding them and finding new methods for AI safety. They have this interesting constitutional AI idea that is built into Claude. So if you had to choose a company to be racing AI,

to the front, I think Anthropic is about as good a choice as you could pick. Yeah, they do seem safety conscious for sure, yeah. Yeah.

Onto the next story, we have doctors are drowning in paperwork. Some companies claim AI can help. This is from NPR. So this is a little bit more about what's going on right now with businesses and applications. This article highlights two companies in particular. It highlights Glass Health, which is sort of

offering a specialized version of ChatGPT for answering clinical questions. And the founders claim that they...

train on additional clinical data, medical textbooks, to avoid some of the issues with providing false information that ChatGPT is prone to. And the other company is Nabla, which is aiming to provide a service where basically an AI listens in to a session with a doctor and a patient and can provide summaries of the discussion.

So yeah, it's interesting to see the startups that are currently working on essentially using technology like ChatGPT to really speed up these processes and make doctors more efficient and more effective.

Yeah, and it's also an especially interesting use case for ChatGPT from a risk perspective, because this is really like, this is the self-driving cars of ChatGPT. This is a high stakes, high risk situation, in an industry, too, that is extremely risk sensitive, right? Highly regulated, and doctors historically are extremely slow to adopt new methods, new technologies. And so you're kind of facing this interesting moment where it's like, hey, we have a thing that really can help relieve a burden on the medical system.

But it comes with this new category of risk. Do you actually do that? And possibly, do you risk more lives by not doing that? Just because doctors who are overburdened, they're tired, they stop paying attention or whatever, they might make a wrong call. And it invites us to wonder, what kinds of errors are we happier with? Do we just care more about the error rate?

Or does it matter more if an AI makes a mistake than a human, partly because of insurance? Who do you blame? What do you pursue criminally, potentially, if there's negligence? It's so hard to tell, but exciting to see this. And also, much like other ChatGPT-powered products, they've seen, it seems, a lot of growth. They're citing numbers here: a launch in January with 2,000 monthly active users, and then by February, they'd more than doubled that to almost 5,000. So sort of following that similar trend of very rapid adoption, partly because there's so much low-hanging fruit in medicine that just a really good chatbot can help to address.

Yeah, one thing I found interesting in this is the company Glass Health was founded initially not with this product. They initially had an electronic system for keeping medical notes. And then they recently kind of pivoted a little bit, because they kept getting requests from doctors and medical students who were already using ChatGPT.

And so they, I guess, switched direction a little bit towards trying to build this system, which just shows how prevalent these technologies are and how many people are already starting to use them.

Another thing to note is that both of these companies, like you said, medical stuff is pretty sensitive. You really don't want to go wrong. So as far as these things go, these are maybe lower risk since they...

don't make decisions per se. They sort of provide feedback or provide notes. There has been a lot of work on AI for diagnosis, for kind of more interpretation of your health data. That is much more sensitive, where you need clearance to get access to your health record, where you need to be able to provide a diagnosis based on scans.

All of that is pretty crucial not to get wrong. And here, these are slightly less risky, which is probably where a lot of this is going to start out, with doctors kind of trying out these tools, not necessarily entire hospitals adopting these AI technologies.

Yeah. And another strategy, a good point, like to reduce risk, another strategy they're flagging as well, right, is this idea of using an external database. So rather than just like relying on the knowledge that's innate in chat GPT, like trained, you know, by reading text, basically all of the internet, they're specifically getting it to like retrieve and summarize text.

from a curated database, essentially almost an encyclopedia of medical knowledge, which maybe grounds the answers a little bit more. You're still looking at a potential source of error from the model mis-summarizing, making mistakes in the summary, but at least you're not counting on it reaching into whatever gobbledygook it's learned from reading all the internet and then pulling out facts.

So you've kind of got this more robust database of validated facts that you're getting to lean on, which is an interesting strategy, right? Like we've kind of seen that in other application domains where people go, yes, ChatGPT and these systems, they can hallucinate a lot of stuff.

And so the solution that we're using is to give them a static database that we've sort of curated in some way so that it's generating grounded outputs. And this is, I think, the highest stakes situation or use case for that strategy that I've seen. It'll be interesting to see if it stands the test of time. But to your point, if doctors are using ChatGPT anyway, maybe this is just strictly better, because, you know, at least there are people spending some time making sure that the system's more robust overall.
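
A minimal sketch of what this retrieval-grounded setup might look like in code. Everything here is illustrative: search_medical_db and call_llm are hypothetical placeholders standing in for whatever curated index and chat model a product like this actually uses.

```python
# Illustrative sketch of grounding an LLM on a curated medical database.
# search_medical_db() and call_llm() are hypothetical placeholders, not real APIs.

def search_medical_db(question: str, k: int = 3) -> list[str]:
    """Hypothetical: return the k most relevant passages from a vetted medical reference."""
    raise NotImplementedError  # would query a curated index, not the open web

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever chat model is being used."""
    raise NotImplementedError

def answer_clinical_question(question: str) -> str:
    passages = search_medical_db(question)
    context = "\n\n".join(passages)
    prompt = (
        "Answer the clinical question using ONLY the reference passages below. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Reference passages:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return call_llm(prompt)
```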

Yeah, yeah, exactly. And the other, I guess one last thing for me to note is, in the past we have discussed other companies that have been offering transcription services, not quite summary, but similar to Nabla. And when we discussed that, one of the things that was noted is that, I mean, you can imagine right now the doctor doesn't have to take notes while talking to you. They can offer maybe better discussion, more focus.

So a lot of this stuff is going to just speed things up. Maybe for easy diagnoses, these ChatGPT-type things are kind of reliable enough. And of course, the doctor can double check the output. So I think it's pretty exciting for all of us

who make use of healthcare. There's a lot of issues with availability and so on, so it's going to have a large impact on our daily lives probably pretty soon, and probably for the benefit of any of us who go to visit doctors and talk to them.

And onto our lightning round, with a few more stories we won't dive into that much. First up, we have Google reveals its newest AI supercomputer, says it beats NVIDIA. It's, as you can imagine, a giant supercomputer. They say they have a system with over 4,000 TPUs, or Tensor Processing Units, Google's specialized alternative to GPUs.

And these are also the version four of TPUs, which are faster and more energy efficient. Yeah, and I think a subtext to this too is the perennial debate in AI circles about whether the TPU, which was invented by Google, is actually

that much of an advantage over, or improvement on, GPUs, the cutting edge GPUs that we have today. And so they sort of made a point of showing that their TPU version 4 is about 1.5 times faster, depending on the metric,

than the NVIDIA A100 GPU, which is kind of the workhorse GPU today. But what they don't compare it to is the H100, which is a lot faster. At least according to NVIDIA, it's like four times faster. So anyways, it's an interesting question as to like, what is the state of the art in hardware? And can we get good apples to apples comparisons between different types of hardware? But certainly suggests like,

Google is, like a lot of other companies, racing to build the crap out of their supercomputers to train large-scale AI models. Yeah, and they've been investing in this for a while. The original TPUs came out quite a few years ago. So at least they're one of the few companies really competing with NVIDIA on this front.

And next, related to this, kind of a similar story, we have Elon Musk is moving forward with a new generative AI project at Twitter after purchasing thousands of GPUs. So based on some insider comments, there was a story that Twitter has bought up

10,000 GPUs and they're planning to eventually use this to train their own generative large language model of some sort. This is still at an early stage, but buying that much hardware seems like a pretty big commitment.

Yeah, it also really matches those Elon vibes that we've been getting lately where he's complaining about how he started OpenAI to serve as a counterweight to Google and DeepMind because he doesn't want a unipolar world order where just one big company controls all the AI. And he's disappointed that OpenAI now is kind of partnered with Microsoft. So he's like, now there are just two big, mean, ugly, big companies that are doing this.

And it kind of seems like this is part of that effort. Like he's like, his solution seems to be, I'm just going to create a third one and have a founder from DeepMind or sorry, an employee from DeepMind help found it. You know, from a safety perspective, like,

I don't know. This seems pretty clearly terrible. If you think about it, the thing that we don't need, as per that Anthropic argument we explored earlier, is more companies racing to build AI in a context where we don't know how to control these systems. I don't really see how throwing another company in the mix to do this with highly scaled systems is

really that helpful, especially given that Elon seems to worry about AI alignment. That seems to be something that keeps him up at night. His solution just seems to keep being, let's just like keep introducing more and more players to kind of add to this race. So I'm a little confused about that disconnect, to be honest, in his thinking. I know part of it is like this libertarian thing that I get. I understand the vibe. You don't want one big company controlling it. But in the context of this race to the bottom on safety, eh,

Anyway, not so clear to me this is a good idea, but I guess we'll see. Yeah, and it's kind of ironic, right? He just signed this open letter that called for pausing giant AI experiments, and now it seems like they're just racing ahead to do just that. So a little bit hypocritical, but...

I don't really see how important this is to Twitter's product, having these AI models. But I don't know. I mean, I guess it's all the rage.

I mean, like, you know, I could see generative AI, like generating text on Twitter, being a thing, and eventually, you know, having that be a comparative advantage. But like, yeah, to your point, I'm just confused about how you mesh those two things. Maybe it's hypocritical. Maybe it isn't. I'm sure Elon has a justification in his mind. I just haven't heard him articulate it. It seems like it's a tale of two Elons, and we're going to find out every week whether we get

good Elon or bad Elon. And I'm not too sure which Elon this is. So I guess we're going to find out over time. Yeah. And you know, given Tesla's track record, Elon is not the biggest on AI safety. So we'll see. We'll see.

And speaking of AI races, the next story is how Amazon tells employees it isn't falling behind on AI. So we haven't been mentioning Amazon much on this podcast, and there really hasn't been much news about anything Amazon is doing with AI. So it seems like, partially as a response to that, there was an internal meeting where the VP of databases at Amazon said that there's a lot going on, we're doing a lot, and we are partnering with Stability AI and Hugging Face. So there is some work, but it's just not consumer facing.

Yeah, my sense from the article, which, by the way, is a Washington Post article, which intrinsically is kind of interesting because that's a Jeff Bezos-owned thing, and Amazon was obviously founded by Jeff Bezos back in the day. So it's kind of interesting, noteworthy, that they're taking this more critical look. They were quoting some employees who didn't seem super

moved by the assurances of the Amazon VP, who was saying, hey, don't worry, we've got all kinds of irons in the fire here. And some kind of sarcastic comments were apparently left on internal chats as well.

But yeah, like they're quoting one staffer saying, we're not even close to the user being able to use any AWS service the way people can with chat GPT, which in fairness, this might be a little unfair as a comparison because Amazon has always been a business to business company or AWS rather.

It's sort of like that's its core thing at scale for helping enterprises navigate kind of access to these large scale tools. At least all their big AI plays have tended to be like that or helping companies to serve these powerful models to their users rather than necessarily having their own customer facing thing like chat GPT.

So there's a little bit of a question here almost morale-wise, right? Like if you're a company, you've got this incentive to just be loud and proud and publish a consumer-facing thing just for recruitment, just so your employees are proud of where they work and everybody knows that Amazon is playing this game. In some ways, I wonder if Cohere is going to have a similar problem, this Canadian or Canada-based generative AI company, where they're sort of doing something similar, doing the business-to-business play. The reality is from a marketing standpoint, everybody knows what chat GPT is.

Not everybody knows what these kind of business-to-business plays are, even though they may be very big, scaled, and important. Yeah. I think personally, I'm a little bit excited. I think AWS is the first major cloud platform going back to 2006. So now if you actually want to launch a cloud product, you can use AWS, but it's

pretty complicated. There's a lot of knobs, right? So if I could just have a chat GPT-esque product from Amazon to make it easier to use cloud, I could see that being very beneficial. And of course there's Alexa and they kind of messed up there having this chat bot to talk to that is pretty far behind now in terms of capabilities.

And I'm sure big things are very likely coming very soon from Amazon too, because they do tend to be secretive as well. There is that dimension, right? So I wouldn't be surprised if we just discover that there's a really big generative AI push that has been going on there for a while. But yeah. Yeah.

And speaking of consumer-facing products, the last story we have is that the Bing Image Creator now has a home in the Edge sidebar. So we know that part of what Microsoft has done is not just launch the Bing chatbot, the ChatGPT-powered version. Now they also have this Bing Image Creator, a text-to-image

app where you can generate images, similar to DALL-E or Midjourney. And now it's pretty easy to access and integrate into something like your emails, for instance, or maybe even Word. And apparently it's quite good. I haven't played with it, but people say it is comparable to something like Midjourney.

Yeah, which is pretty remarkable, given just how widely proliferated this is going to be. I think that's been a big part of the story too with these Midjourney, Stable Diffusion-type models: the gradual lowering of the barrier to entry and more and more accessibility. I'm wondering at what point I start to wonder whether any content I see on social media

is artificially generated. Like when we're going to hit that point where there's basically like a flat prior between those two possibilities. I can imagine it coming pretty soon if it will come ever, but this is probably a big step in that direction.

Yeah, I think this is almost kind of accelerating the trend where we already had the issue of misinformation online. And now, even more than before, it's important to know the source of any given tweet or social media post, where if you know the person or the organization, you might be able to know if it's AI or if it's something real, per se.

And now, as you say, it's just seeing a random kind of article or post online. It's going to be harder and harder to know the real source. And on to our research and advancement section. First up, we have developers are connecting multiple AI agents to make more, quote, autonomous AI from Vice. This is kind of an exciting development that's been happening

pretty quickly, coming out in the last few weeks. This article is talking about AutoGPT, which aims to do more complicated things, like manage businesses. And the idea is that you basically string several ChatGPT-style instances together. So in the first pass, the program accesses the internet and gathers some information. Then it generates some text and code.

And then there's another model to store and summarize files. So it can kind of basically create a set of tasks and then ask new instances to execute those tasks. And this is kind of a promising or exciting direction for doing more complicated things, it seems to me.

Yeah, we've seen a lot of people basically have this thing try to create companies that are profitable, and it gets surprisingly far. It's quite impressive how far this gets with GPT-4. And I think also a really important element of this, one big part of the AI safety story, has been: at what point do language models start to behave with agency?

And that's historically been one of these things that people have said, oh, it'll never happen, it'll never happen. The idea being that at some point, a language model becomes context-aware enough through training that it recognizes, oh, I'm a language model and I'm sitting on an OpenAI server, and I have all these power-seeking incentives that we've talked about earlier and that the AI safety literature is starting to show are the default expected behaviors of these powerful systems. So you can look at that and be like, well, that's implausible.

But it seems like this is another way that could happen, right? Now, we just take a GPT-4, which doesn't have that sort of agency, and we just give it prompts and let it spin up new instances of GPT-4, and voila, we have a system that basically behaves like an agent with goals and goal-directed behavior and that sort of thing. So kind of interesting, at least to me, the update here was really like,

We can think of our AI models as being safe when we build them. OpenAI can build GPT-4 and be like, "Yep, pat it on the back. That looks safe. Let's ship it." And then when you see how people actually use it, though, you can see this explosion of new possibilities just based on how you wire up individual instances of GPT-4.

And I think that's a really big sea change. We had Andrej Karpathy, who obviously just joined OpenAI recently from Tesla, and he was highlighting this as really being a game changer, the next frontier of prompt engineering, which seems quite fair based on the capabilities that this new kind of

AutoGPT agent-like system has. But yeah, really, I think, an interesting warning shot from a safety standpoint. It's like, hey, you know, the line between these things not being agent-like and then being agent-like could be as thin as just the prompting strategy that you use and how you wire them up together. That's a pretty remarkable step.

Yeah, exactly. And I think part of why they're not really agent-like is with something like ChatGPT, there's no kind of autonomous loop. You input some text and you get some text back, but otherwise there's no ongoing computation going on. But this is kind of an interesting way to get around that where instead of building in a recurrent neural network of some sort for memory,

you can just have multiple instances of it sort of going back and forth, right? So you can have ongoing execution where instead of having a single model do this kind of loop, you can just send outputs back and forth and basically create a loop from pre-trained models that aren't meant or designed to have an ongoing task execution framework. So it's pretty interesting and there's been a lot of experimentation in this space
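
To make the "loop built out of prompts" idea concrete, here is a minimal sketch of an AutoGPT-style agent loop. It is not the actual AutoGPT code; call_llm is a hypothetical stand-in for whichever chat model you would call, and the task format is heavily simplified.

```python
# Minimal sketch of an AutoGPT-style loop built purely from prompts.
# call_llm() is a hypothetical stand-in for a chat model API; this is not AutoGPT's real code.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical chat-model wrapper

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    memory: list[str] = []          # running summary of what has happened so far
    tasks = [f"Plan the first step toward the goal: {goal}"]
    results = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.pop(0)
        # One LLM call executes the current task, given the shared memory.
        result = call_llm(f"Goal: {goal}\nMemory: {memory}\nTask: {task}\nDo the task.")
        results.append(result)
        # A second call summarizes the result into memory (keeps context short).
        memory.append(call_llm(f"Summarize in one sentence: {result}"))
        # A third call proposes follow-up tasks, closing the loop.
        new_tasks = call_llm(
            f"Goal: {goal}\nMemory: {memory}\nPropose the next tasks, one per line."
        )
        tasks.extend(t.strip() for t in new_tasks.splitlines() if t.strip())
    return results
```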

For instance, just a new paper that's kind of related that just came out is called Recursive Criticism and Improvement Prompting that is focused on computer and reasoning tasks. And basically the gist of it is that

It kind of combines this approach of you generate an initial strategy and there's a second model that does this criticism and then there is improvement based on the criticism. So it seems like both in academia and in practice, a lot of people are realizing that having this sort of multi-step approach enables a lot of things that are harder otherwise.
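
A rough sketch of that generate-criticize-improve pattern, written in the spirit of the RCI idea rather than as the paper's exact prompts; call_llm is again a hypothetical model wrapper.

```python
# Sketch of recursive criticism and improvement (RCI)-style prompting.
# In the spirit of the paper, not its exact prompts; call_llm() is hypothetical.

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def rci_answer(question: str, rounds: int = 2) -> str:
    answer = call_llm(f"Question: {question}\nAnswer:")
    for _ in range(rounds):
        critique = call_llm(
            f"Question: {question}\nAnswer: {answer}\n"
            "Review the answer above and list any problems with it."
        )
        answer = call_llm(
            f"Question: {question}\nAnswer: {answer}\nProblems: {critique}\n"
            "Based on these problems, write an improved answer."
        )
    return answer
```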

Yeah, it's another example of emergence, right? We used to think of emergence as this thing where you make a bigger version of an old model and all of a sudden it has new capabilities that nobody predicted. Famously, GPT-4 can ace the bar exam, whereas GPT-3.5 couldn't. But now it's like that emergence also has this dimension of you'll discover that clusters of these systems wired together have new capabilities that maybe you didn't expect.

Yeah, what I found really interesting about this recursive criticism and improvement paper was that it's really similar in spirit to that Anthropic constitutional AI thing, where with constitutional AI, during training, you get a language model to generate some output. And then you have another language model that criticizes that output and proposes a new, safer output.

And then you retrain the original language model on what the kind of critic model suggested. So that's kind of a training version of this. And it's sort of interesting to see people experimenting more with this idea of using language models to kind of self-correct or to correct other language models, just because it's so hard to give them the kind of feedback that you need to kind of align their behavior with human interventions. You need to kind of start to automate that loop more and more.

Yeah, and as you said, it's interesting from a research perspective because people have been finding all these tricks or approaches to using these models without the typical approach of retraining or finding some algorithms. So you've had chain-of-thought prompting, which is kind of an interesting thing where people figured out that when you ask a question, if you just append a little, like,

beginning intro to the answer that says "let's think step by step," the model often actually generates better results. And this is kind of the next version, where it's similar in a way: the model first generates an answer, then you tell it, well, review your previous answer and find problems with it, and then it says, based on those problems, generate a better solution.

So it's almost, yeah, it's kind of interesting. Like you say, we are just finding more principled or more consistent ways to get good results. And a lot of it is based on this iterative approach instead of just taking a first answer. And one of the cool things that comes with that too, like...

It's not quite a risk class, because you discover it pretty quickly, but it's an interesting thing that happens when you start to get AIs to critique each other: sometimes they'll get stuck in this loop where, because it's AI on AI, eventually the AIs will kind of come to an agreement on outputs that make sense to them, but don't make sense to human beings.

And this is something I think Anthropic encountered with constitutional AI, where they found that if you run this like crazy, like a whole bunch of these automated loops with one AI evaluating another and then retraining the original AI on what the second AI said, eventually you'll get outputs that don't really make any sense. So they kind of had to force the outputs that are generated to not be that different from the original outputs, had to kind of regularize that.

And I'm very curious about how that problem gets addressed over time, how you prevent essentially two coupled language models from almost inventing their own language, or their own kind of thing that they're optimizing for that's a little bit different from what we had in mind. But yeah, it's another really interesting kind of problem class that's arising as people start to play with these things more and more at scale.

Yeah, and it also brings us closer, like you said, to this agent AI where it's more autonomous, kind of, you know, it criticizes itself and self-corrects. An interesting aspect of this is that

It's moving more toward the reinforcement learning type of AI, where, you know, with ChatGPT, you just give an input and you get an output. With reinforcement learning, you have multiple steps, multiple actions in a row. And what you want is to optimize your reward, basically get the best possible outcome.

In fact, OpenAI with ChatGPT, they have reinforcement learning from human feedback, where they trained a second model basically to evaluate the output. So a lot of people are playing around with different versions of this iterative output idea. And yeah, I think yet another example where we don't know very much and we are discovering a lot very quickly.

Next story, we are moving back to the physical world: Deep RL at Scale, sorting waste in office buildings with a fleet of mobile manipulators. This is from Google, and it sort of builds on a lot of the work of Everyday Robots, which has been discontinued, shut down to, I guess, save costs.

But this is very much a demonstration of what they have managed to achieve so far, with a lot of these robots going around. There is essentially a problem where you have multiple garbage bins, some unsorted and some sorted, and the robots are tasked with looking at the unsorted pile and putting it either into compost or recycling or garbage.

And this talks about how Google has deployed a fleet of 23 reinforcement learning enabled robots over two years to do this task of sorting waste and recycling. And these robots roamed around the offices, found these waste stations and had to recognize the objects and correctly kind of move them around and recycle.

There's been a lot of work. There's been multiple papers over the years where they introduced different techniques for different parts of a problem. And this is showing how they kind of combined all of that into a single system that they did deploy and to some extent did automate a lot of this work.

And it's pretty sophisticated. They have a pretty detailed description of the process by which they trained it, the process by which they deploy it, and sort of the iterative approach where it learns over time as it executes a task in the real world.

Yeah. And just zooming out, you know, every time we talk about robotics, there's always this question of where we are along the spectrum from a bunch of hard-coded special rules that humans plug into the robot, so that, you know, it always turns right when you get this kind of reading on your sensors, always stops when there's this reading on your sensors, and so on.

From there, all the way to a completely deep learning powered system or some kind of monolithic architecture that controls and thinks deeply about all those sensory experiences that the robot has. This seems like another one of those intermediate points where they're doing some hard coding of these hand-designed policies and using that to bootstrap the learning process.

Letting the AI, letting the machine learning take over from there, and then coupling that with a simulated training framework. Having it learn in simulation and then transfer over to the real world. You can see this different approach: what if we just give the thing a little head start in the training process by using our hard-coded human ideas, human policies, and let the AI go from there?

Yeah, interesting to see if we keep seeing machine learning take over more and more of the stack, as has happened in the digital world, right, where we saw more and more systems move from these expert systems to the fully deep learning version. And I wonder if that's going to keep happening here.

Yeah, exactly. These robots are pretty much trained specifically for this one task. And there's a whole procedure that is a little bit more general where you could conceivably do this for any problem. So there's a whole set of steps. Initially, you kind of hand code it. Then you train in simulation. Then they have this thing called robot classrooms, which are basically artificial kind of settings where you have the real...

trash sorting task, but it's not in the actual office. And then they deploy it, and then they can practice in real office buildings. And they don't quite operate perfectly. The results show that they have about maybe 50% effectiveness. There's just a lot of variety in the situations you encounter in the real world that you won't see until you go there.

And yeah, it'll be interesting to see if this sort of approach of training for one task will remain common, or, as we've discussed already, there has been a move towards more general systems where you use ChatGPT for the high level reasoning, and you have other things where you can, again, just

insert text like "move this here," and the robot can conceivably just do a lot of different stuff. So maybe it'll be a hybrid of this, maybe it'll be pretty much completely general and this approach that is task specific will not be needed. But regardless, there have not been many instances of large scale deployments of

trained robots, reinforcement learning based robots, in the real world. So this is pretty impressive just as an accomplishment in itself. Yeah. And surely parts of the stack, right, will be useful even if we end up going all the way down the full deep learning route, as generality eats specificity, which, you know, I'm old enough to remember when the bar exam was something that you had to make a specially trained model for. And now GPT just does it right out of the box.

So it'll be kind of interesting to see how this scheme adapts to that. Maybe they can port over an awful lot of what they've learned from this process for that, if that's indeed what happens. But definitely an interesting landmark moment, yeah, for just real world deployment of robotics. Yeah, exactly. And this kind of highlights another difficulty: robotics is unlike

ChatGPT, you don't have the data to train from just out there on the internet, right? You actually need to deploy a robot to practice a lot of these skills. So I think regardless of if you can get somewhat general systems, we'll need something like this to learn as you are out there in the real world and you mess up inevitably.

So this is kind of a demonstration of how hard that is, where even for the seemingly simple task of going to these recycling stations and sorting trash, you will have a lot of difficulties.

Yeah, it'll take a while, but I think this is still pretty exciting progress. Yeah, sorry, last quick note, this came to mind, just because we always have that intuition that there's something special about this problem class. I do wonder if we'll end up seeing GPT-5 or whatever version of this, where the language model is able to, because it's multimodal, say GPT-4 can see images and text,

where we might actually see a language model just like tracking all the sensor inputs. And then when it makes a mistake, getting corrected by another language model that's also looking at the scene. And like, so in that sense, maybe the problem is kind of already solved architecturally just through scaling. I'm just, this thought just comes to mind because I'm used to thinking to myself like, oh, well, this is a special problem that language models will not end up solving or that scaling won't end up solving. And I just, I can kind of see that path happening here somehow.

But anyway, I guess we'll just have to wait and see. I mean, we've seen that already before.

since last year, there have been multiple papers, like SayCan, where you don't need to retrain it. If you just kind of prompt it, it can tell you, go here, do this, do that. And the challenge then becomes, well, okay, I know what I need to do. How do I do it? How do I actually move there? Can I even pick up that thing? Right. There's this whole issue of embodiment. So we'll see. I think it'll definitely be a component in some way, but

there's still a lot of work to be done to actually get to a physical world.

On to the lightning round. First, we have not quite an article, but something that generated a lot of discussion about how GPT-4 can compress and decompress prompts into non-human readable forms. So if you just ask it to compress some piece of text such that you, GPT-4, can reconstruct it, it will generate this kind of interesting output of a mix of compressed

kind of keywords and emojis, and it looks shorter. And then in some cases, when you feed it back to a separate GPT-4 session, so there's no memory of how it compressed it, GPT-4 can then decompress and tell you what that original text was.
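
If you want to try the experiment yourself, the prompting is roughly this simple. This is a sketch of the kind of prompts people were sharing, not an official technique; call_llm is a hypothetical wrapper, and the two calls should use separate conversations so no chat history is carried over.

```python
# Sketch of the compress/decompress prompt experiment people ran on GPT-4.
# call_llm() is a hypothetical wrapper; each call should be a fresh conversation
# so the decompressing model has no memory of the original text.

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def compress(text: str) -> str:
    return call_llm(
        "Compress the following text into the shortest form such that you (GPT-4) "
        "could reconstruct the original. It does not need to be human readable; "
        f"emojis and abbreviations are fine.\n\nText: {text}"
    )

def decompress(compressed: str) -> str:
    return call_llm(
        "The following is a compressed representation of a text, written so that "
        f"you (GPT-4) can reconstruct it. Reconstruct the original text.\n\n{compressed}"
    )
```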

Yeah, and it's, I think, really interesting from a safety standpoint, because what we're actually seeing, or one way to read this, right, is if I asked you to take a sentence or a paragraph and to rewrite it in a way that was totally not human understandable, you know, have a bunch of emojis and symbols, this is kind of like what GPT-4 ends up generating, so that you could later

like reconstruct the original paragraph. So first you take a paragraph, then you convert it into this weird emoji language, and then you decompress it, you turn it back into the original text. If you could do that, and if you could do that without having the memory of what the original text was somehow, if you could travel back in time and be like, hey, here's the compressed version of what I wrote, unpack it and recover the original. If you could actually do that, if you could perform that compression task,

then it's very difficult for me to imagine that you wouldn't have to have had an idea of how your own mind works.

And this is kind of part of the GPT-4 question: how much of an understanding of itself does it have? Is self-awareness going to be another one of these emergent properties of these systems? And this, I think, is an interesting data point that is suggestive of that, that this may be yet another one of these capabilities that relentless scaling just seems to

make happen by some sort of strange magic. But yeah, you were pointing out too that there is controversy over whether the translation of the paragraph into a bunch of emojis actually compresses the data or whether it just kind of reframes it, rewords it in a way. Yeah, I think this is kind of interesting. There's a bit of subtlety in the interpretation. So

To me, what this looks like is not necessarily self-awareness, right? The way GPT works is predicting, basically, the most likely continuation of a given piece of text. So if you tell it, you know, compress this, it can predict what seems to be a good compression. And then, you know, since that is a very likely compression according to the model,

Presumably, if you feed it back, the likely decompression is pretty correlated. So, I don't know that it is aware of how it itself works. It's more that the two probabilities are consistent between compression and decompression. And there are, yeah, some additional experiments people have done. It doesn't seem entirely consistent.

So without training on compression and decompression, it can do it to some extent zero shot, but it's not obvious whether it's just doing something that looks right or if it's actually good at doing this.

Yeah. But I guess the question is also whether the specific compression strategy is a property of the model or a purely statistical property of the data, right? Like, if it's a purely statistical property of the data, then this is maybe less surprising, but

I think the suggestion here is that the model itself is an important factor in determining what that compression looks like. To the extent that that's true, then the decompression must also require some sort of awareness on the part of the model of how the initial compression scheme would have worked.

It's difficult to tell, but it's, I think, another one of these data points, you know, from the perspective of like the argument that there is not going to be a fire alarm for AGI. And, you know, there's always going to be an interesting level of nuance to these things, but it seems suggestive, at least on that level, that there's some kind of ability to anticipate its own behavior in the future, like the behavior of a different instance of GPT-4 itself.

And I guess the question is, yeah, what does that mean? And how much ought we read into that? And yeah, I think it'll be ambiguous until hopefully nothing bad happens. But it's an interesting footnote for now. Yeah. And this is another case where it's a shame we don't really know how GPT-4 is trained, right? Because on the one hand, previous GPTs...

don't have this property of introspection. They don't see any outputs of themselves during training. There's no reinforcement learning in that sense. And with ChatGPT, likewise, it is trained on the internet and it is trained on human feedback and outputs. So it sees, to some extent, its own outputs, where you have human chat logs. And with GPT-4, maybe it's more of an iterative, you know,

self-refinement process where it does see its own outputs. And in that sense, it can be to some extent self-aware. It can know what it generates. So it's very hard to interpret this correctly, given how much we don't know. At least given GPT-3 and ChatGPT, it may be more likely that it's a statistical property of what is a likely compression, but we really don't know.

Yeah, and this points to the complexity of these safety-related decisions. Do you publish your model? Do you publish your model training scheme? Do you write a detailed technical paper about it? Because if you don't, then we have no way of evaluating these indicators and of deciding whether actually it is the case.

that the system has some self-reflective capability. And yet, if you do publish it, then other people can kind of replicate it and potentially you're contributing to the speed of development of these potentially unsafe technologies. It's kind of like, yeah, it's a thorny, thorny problem. I don't envy anyone who has to think through those decisions. Yeah, and this just made me think as one last thought, this is

Kind of an interesting argument for and against public access, because this came out from a tweet. Someone was just playing around with it and found that this happens. But now that we know that it happens, and a lot of people picked up on it, because we now have more accessible systems that are smaller, we have Meta's LLaMA, Alpaca, and so on.

Researchers could look into this. They could use these models and see what's going on and try to understand it a little bit more. When academics can train these models, they can try to understand this introspection property. So I think to some extent, this coming out is good because as you said, if it's concerning, people can look into it.

Yeah, assuming they know how the model is built in the first place, as you said. Well, yeah. So GPT-4, maybe not. But now we have other things you can try this with.

Next story we have, Meta releases AI model that can identify items within images. So this is going to computer vision instead of natural language processing. And this model is the Segment Anything Model. Segmentation is basically finding the boundaries of any given object in an image.

This is similar in a way to GPT-3, what people call foundation models, which are just these giant models trained on a lot of data. This was trained on a whole bunch of data, I don't know, a billion masks, something like that, and is very general and very robust. So it can now segment. It used to be you had training data sets with ground-truth segmentations, and a model could segment primarily a set of types of items. Here it can segment

anything and you could just click on a pixel in an image and it can give you the boundaries of an object and it seems very general and very robust.
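
For reference, the released segment-anything library exposes roughly this point-prompt workflow; the checkpoint filename and the pixel coordinates below are just placeholders, and the API may have changed since the April 2023 release.

```python
# Sketch of "click a point, get a mask" with Meta's segment-anything library.
# Checkpoint filename and pixel coordinates are placeholders.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)

# A single foreground click (label 1) at pixel (x=500, y=375).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,  # returns a few candidate masks at different granularities
)
best_mask = masks[np.argmax(scores)]  # boolean HxW array outlining the clicked object
```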

Yeah, it's kind of interesting in terms of how the system defines the object. Because when you're a human being... they show a couple of images in the paper, say this one of a bunch of knives and a cutting board and a bunch of cups and some fruit. And clearly it's segmented, for example, one half of an orange here. It's like, okay, this is a thing.

But then within the orange, of course, there's like, you've got these individual, I don't know what you call them, like slices of orange, if you will. So like, why is the orange the object and not the slice of orange? And that sort of blurs the line between, you know, well, muddies at least the waters in terms of what we mean by segmentation. And I think long-term, like it's also a question of like whether AIs end up interpreting the world in the same level of abstraction that we do, right?

When you parse a scene and you say, how many objects are there? Oh, there are 20 objects here. Well, maybe an AI model that has a more nuanced perspective on the world might see many more objects than that. And, ultimately, not to get too trippy here, but

Why isn't every cell its own kind of segmented thing? And it's sort of interesting to play with those ideas and see where these models end up landing, what's useful, what's contributing more insight maybe, and maybe where are we restricting them in ways that limit what we can learn.

Yeah, and that relates to how you use this. There is the ability to click on an item, but you can also just write a text prompt where you type cat, and it segments, draws the boundaries around cats. And in that way, there's an interesting connection where we do have this abstraction layer of language of what are objects. A cup is an object. Or it can be the handle of a cup. Or it can be the handle of a knife.

So in some way, it goes to labeling and all these philosophical things of language and what is an object. Yeah, like what is stuff? Yeah, yeah. It's an interesting question, but I think we won't get into it because that's a whole other thing. I'm sure we could resolve it in this episode with a few more minutes. Yeah, I mean, it's pretty easy, really, but we're going to keep on going to focus on the AI stuff.

Next, we have another language model release: Meet IGEL, an instruction-tuned German LLM family. So this is a bit of a small release from a German group, where they took an existing model, BigScience's BLOOM, and then tuned it

to be better at following instructions in German. Instruction tuning, by the way, is this idea of taking a base language model that is just trained to predict the probabilities of text continuations and making it better at

following instructions, doing what you want, which is not exactly the same thing. And with GPT-3, before ChatGPT, that was one of the improvements they added. So this is a bit of an experiment, in that it's pretty small and it's pretty early on, but it's an interesting thing to see that now people in different countries are starting to work on this.
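
Not IGEL's actual training code, which isn't walked through here, but a minimal sketch of what instruction tuning generally looks like with the Hugging Face stack: you format instruction/response pairs into a single text field and continue training the base model on them. The model name and the toy dataset are placeholders.

```python
# Generic instruction-tuning sketch with Hugging Face transformers (not IGEL's actual code).
# Model name and the toy dataset are placeholders.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "bigscience/bloom-560m"  # placeholder; IGEL starts from a larger BLOOM variant
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

pairs = [{"instruction": "Übersetze ins Englische: Guten Morgen!", "response": "Good morning!"}]

def to_text(example):
    # Concatenate instruction and response into one training string.
    return {"text": f"### Anweisung:\n{example['instruction']}\n### Antwort:\n{example['response']}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

dataset = (Dataset.from_list(pairs)
           .map(to_text)
           .map(tokenize, remove_columns=["instruction", "response", "text"]))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="igel-sketch", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```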

Yeah, and I guess one of the big questions too here is like, is there enough German language data to do this? And another interesting question for any specific country that has a language of its own and wants to make these things available in that language. Part of the democratization of AI is like this question, like, is there enough language, enough text on the internet to train a model to do something like this?

So kind of interesting result. And also another case, we saw a model, I think last week, this is the Bloomberg GPT, I think it was called, that was based on the Bloom model by BigScience, this international collaboration that released this open source large language model. So it's kind of cool to see that being used again here. Yeah, exactly. Yeah.

And last up, we have Meet AUDIT, an instruction-guided audio editing model based on latent diffusion models. So researchers introduced this model that can easily edit audio clips. You give it a text instruction, and it can do various things like adding sounds, in-painting, so filling in some missing sound, super resolution, etc.,

things like that. And this is pretty cool because in general, audio is harder. There's not been nearly as much progress as on, let's say, images. You could edit images with text for quite a while now, and it's pretty advanced. You could say, add sunglasses or change to blonde hair, and you can do that now. And here,

It works by looking at spectrograms, these visual representations of sound, and then changing the appearance of the spectrograms as a way to edit audio. So yeah, exciting to see some progress on making audio easier to edit with language.

What would you say, having been kind of specialized in vision, what would you say is the thing that makes the difference? What makes audio so much harder than vision to solve for in this way?

Yeah, there are a few things. One of them is density, right? You look at an image and there are quite a few pixels, maybe a million pixels or something like that. But with audio, the sample rate is really high, tens of thousands of samples per second. You take a few seconds of audio and there's a lot of data encoded in it, and it's a temporal thing. So it's not just a snapshot; it's more like a video. And video is another one of those things we're still not as far along on.

So, yeah, I think it's that temporal aspect. It's also a data issue: there just haven't been as many giant datasets to work with. But I think now we're going to start seeing a lot more on the data side, and a lot more progress of this sort.

Interesting. Okay. Because the naive guy in me was like, okay, can't you just visualize the audio, have frequency on one axis and time on the other, and treat it as an image problem, basically ignoring the time dimension, or at least turning it into another spatial dimension? But I guess that maybe is too naive. Yeah.

Well, that has actually become a more typical approach in recent years, and it's what this is doing, really: just treating it as an image. But I guess it's still harder to work in that image domain, right? Partially because of the data problem, and because it's not quite as...

I don't know, semantic, right? Yeah, it's not human understandable. Yeah. Exactly. So it's not human understandable and maybe that's part of why it's so hard. These are very dense images, but regardless, now I guess we can start making some progress.
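To make that "treat audio as an image" idea concrete: the usual trick is to convert the raw waveform into a mel spectrogram, a 2D array with frequency on one axis and time on the other, which image-style models can then operate on. A minimal sketch with librosa, where the file path is a placeholder:

```python
# Turn a short audio clip into an image-like mel spectrogram.
import librosa
import numpy as np

y, sr = librosa.load("clip.wav", sr=22050)     # raw waveform, ~22,050 samples per second
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)  # log scale, closer to perceived loudness

# mel_db is a 2D array (128 frequency bands x time frames): effectively a grayscale
# "image" that a diffusion model can edit, with a vocoder turning it back into sound.
print(mel_db.shape)
```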

And actually, just one more fun story we'll include here. Researchers populated a tiny virtual town with AI, and it was very wholesome. This is a new paper from Stanford called Generative Agents, Interactive Simulacra of Human Behavior. And yeah, it's pretty fun. You have this sort of 2D environment

that has pixel graphics, almost like Pokemon. And basically, they introduced a bunch of characters that are each backed by an individual ChatGPT instance. And they prompted these ChatGPT instances saying, you are John, you work in a pharmacy.

You have a son, and so on. And then they just let these things loose and let these characters decide where to go. The simulation sort of says, well, you are in this room, there's this other person in this room, what do you do? And yeah, it kind of simulates a little town in that way.

Yeah, and they seem to be suggesting that it went smoothly, which I find kind of interesting. You know, thinking about the robustness of multi-agent interactions, people often say like when you have multiple AI systems interacting, it's a source of heightened risk, right? Because one agent might behave weirdly and then that might like kind of

push another agent out of the distribution of normal situations they're used to dealing with, and therefore cause them to behave even weirder. And then there's this weirdness escalation and everything goes haywire. But I kind of wonder, with a system like ChatGPT, because it's been trained on so much text and has such a robust understanding of

the world, and broadly of some human values and things like that, whether there's maybe a kind of self-correcting element to this as well, where if you have one agent going haywire, the other ChatGPT agent eventually gets to the point where they're just like, hey, I'm a helpful chatbot by OpenAI, I can't let you do that, or whatever. Anyway, it just sort of makes me wonder if there's something more robust about this because of the large language model backend to these systems. Yeah.

Yeah, and partially maybe it's a matter of how you prompt them. So here you give it all this stuff: you're a pharmacy shopkeeper, you're always looking to improve your store, and so on. So it's

kind of telling it to act as this very specific thing, as opposed to a more general AI agent that can act maybe in a weirder way. So yeah, it's a fun experiment. And you might want to check out the link, because you have all of these images in there that are pretty fun. And I would be totally down. I would play this game. I would just walk around and talk to these agents personally. So it's pretty cool.
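For a rough feel of the core trick behind these agents, here is a stripped-down sketch of a single decision step for one agent, using the OpenAI chat API as it was exposed in the Python client around the time of this episode. The persona, the observation, and the prompt wording are invented for illustration; the actual paper layers a memory, reflection, and planning architecture on top of this basic loop.

```python
# One decision step for a single "generative agent"; illustrative sketch only.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

persona = (
    "You are John, a pharmacy shopkeeper in a small town. You are friendly "
    "and always looking for ways to improve your store."
)
observation = (
    "It is 9am. You are in your pharmacy. Your neighbor Maria just walked in and says hello."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": persona},
        {"role": "user", "content": observation + "\nWhat do you do next? Answer in one short sentence."},
    ],
    temperature=0.7,
)
action = response["choices"][0]["message"]["content"]
print(action)  # the simulation would apply this action, then move on to the next agent
```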

Onto our policy and societal impact section. First up we have: machine learning expert calls for bombing data centers to stop rise of AI. This covers how Eliezer Yudkowsky had an op-ed in Time magazine titled Pausing AI Developments Isn't Enough, We Need to Shut It All Down.

This was in response to the open letter where a lot of people called for pausing giant AI experiments, basically not training GPT-5 for at least six months.

And the argument made in this piece was that that's not going far enough. It should be an indefinite and worldwide pause on large training runs. And that should include shutting down all large GPU clusters, shutting down all training runs, limiting computing power, and enforcing this worldwide, where if another country doesn't follow through,

you can bomb their data centers. And this generated a lot of discussion online about whether this is going too far. Yeah, and I think the whole discussion around the pause and this article, if nothing else, is really useful because it shows policymakers how serious the measures would have to be

to make a practical difference in tackling what a lot of AI safety people think is the default course that this technology is heading on. You can disagree with Yudkowsky, for example. So he, for context, he's an AI safety hawk for sure. By far, I think the most pessimistic public-facing person on AI risk, his view is that beyond a certain threshold of capability,

AI intrinsically becomes dangerous, starts to do things like seek power, and will inevitably end up wiping out humanity. In fairness, this is based on a growing body of experimental and theoretical evidence suggesting this is in fact how things go, but there's a question about what you do about it.

And so the six-month pause was put forward by the Future of Life Institute, which was co-founded by the famous physicist Max Tegmark. And it was proposing just what you said: hey, let's not train any models more powerful than GPT-4, because we're starting to worry that we might be getting to that threshold of capability with those sorts of systems. And here comes Yudkowsky saying, yeah, actually, what do you think is going to happen in six months?

Our ability to control AI systems, like AI alignment research, simply has not kept up with capabilities. Six months is not going to be enough to make a meaningful difference. I think what these things have really done, whatever you think about the specific proposals, is that they've actually started a conversation and they've brought this to the attention of policymakers. We're now seeing bipartisan efforts among Republicans, Democrats, in the US at least, to try to think through what should we do about this?

And I think that's constructive even if you don't buy into power-seeking AI, even if you think that there are flaws with the research there or whatever, just the malicious use alone of these capabilities. You think about what they could be, like the massive cyber attacks, AI-augmented cyber attacks. We've seen bioweapons design become a thing that these things can do very straightforwardly out of the box, and so on and so forth. We're looking at a world where intelligence is...

very widely distributed, very high levels of it. And what you can do with that resource is really up to you as an end user. And there are a lot of end users who want to use things maliciously. So the stakes are super high. The policy measures that we would need in practice to deal with this would be very serious looking things, like well outside the Overton window. And that Overton window simply has to grow if we're going to be having serious conversations, mature conversations as a civilization about this stuff.

Definitely, yeah. And as you said, Yudkowsky is a well-known figure in AI. He was kind of one of the leading figures who got this whole conversation going, maybe even 15 or 20 years ago. There are now many people who take this idea of X-risk, extinction risk, pretty seriously

to some extent. And this does further that conversation. Many people have been talking about this being possibly similar to nuclear disarmament, where the world, for the most part, managed to slow down the development and construction of nuclear weapons. And this is proposing something like that.

A lot of people did criticize this idea of straight up saying you should bomb data centers. But yeah, I think aside from that, this is adding to the conversation and it is...

something that is just a very rough sketch, but I could definitely see more people seriously looking into this model of nuclear disarmament and at least saying, well, between the world's militaries and governments,

There should be some agreements as to what the clear red lines are, not even about dangerous AIs necessarily, but just about using AI for misinformation or for generally messing with enemies. I could see that being a real possibility.

Yeah, absolutely. And I think it's one of those things where, again, the value for shifting the Overton window is probably the biggest value here. I think it's worth mentioning, there was some controversy about this over Twitter where there were AI researchers who were concerned, in fact, AI safety researchers even, who were concerned that

this was kind of putting a target on their backs: that effectively, by putting terms like nuclear weapons and AI research in the same sentence, you were somehow suggesting people should consider going after AI researchers, and that this endangered AI researchers themselves.

I think, unfortunately, if we're going to be serious about this, if we're actually going to look at this technology on the face of it for what it appears to be: hardly anyone is disputing that we seem to be on a track to achieve

pretty powerful systems, even human-like systems in a remarkably short period of time. If that's true, those systems can be misused and they seem to have hardwired, dangerous properties baked into them as far as we can tell. So if you're going to think of them that way, if they genuinely represent that level of risk, then it's kind of like, okay, like,

We probably should have things like the use of force on the table for enforcing these agreements. Not that we should be going after AI researchers, that's insane, but that we should be thinking about, okay, what does an agreement look like, and what in practice does it take to enforce it? We shouldn't be kidding ourselves about what it would actually take to enforce an international agreement that bans the training of certain kinds of models.

Yeah, enforcement has the word force in it. And just the same way that we enforce other international laws, we kind of have to think about what in practice would that mean? It's complicated, it's thorny, and we got to make sure the conversation publicly is mature and sensible and recognizes that nobody is suggesting going after AI researchers here, but there are two sides to the coin, I guess, and it's a thorny one to navigate. Yeah.

Yeah, and if I had a criticism of the article, I will say that, reading it, I don't necessarily like the way it's written. It really focuses a lot on the possible catastrophe, on how this is really dangerous and scary. But then, when it gets to talking about the solution of shutting everything down, it's a couple of paragraphs. And

it isn't framed or conveyed very carefully or subtly. And I do think there could have been a lot more nuance and detail in saying: this is like nuclear disarmament, this is proposing international agreements, right? There are some clear red lines here, and it's not about, you know, bombing OpenAI, right? So, I do think

it's good to have the discussion furthered, but there's also some criticism you can levy against this particular op-ed. It is a very Yudkowsky op-ed: very focused on the problem and short on solutions, which reflects his view of the bleakness of the situation. But still, I think, if you're going to put that out, like you said, you have to focus more on the solutions and sketch those out in more detail.

And going back again to AI safety, there's a lot to talk about with these powerful models. So the next article is AI Is Getting Powerful, But Can Researchers Make It Principled?, from Scientific American. And it's pretty much an overview of AI risk and the debate about it.

And yeah, they highlight a specific paper from DeepMind researchers that talks about these kinds of scary scenarios and discusses

you know, what we might need to prepare for. So definitely we've seen a big rise in the discussion of AI safety and hopefully also research will be much more common on it. And if you are interested in getting more informed on the topic, this would be a good starting point, I think.

Yeah, it's a good like kind of historical overview. It does do this thing that I've seen done quite a bit in the news media around AI and AGI recently, which is like this attempt to kind of adopt a responsible sounding tone.

The sort of like skeptical tone that we're all programmed to think is what reasonable people sound like. You know, reasonable people say, oh, the crazy thing won't happen. It's okay. But of course, like those reasonable people sounded like that with COVID saying, oh, it won't happen. Don't worry. And AI is very much like one of those things that could go the other way as well. It opens with this line. They start by talking about Alan Turing and his vision on AGI. And then they write,

but truly intelligent machines that can independently accomplish many different tasks have yet to be invented. And I think that's flatly false. It seems just clearly true that we have systems that can independently accomplish these tasks; the question would be, at what threshold of tasks do you mean here? But I just think this is reflective of a desire to sound calm and reasonable that actually ends up pushing you to make statements that

under-hype where the technology is and under-prepare people for the implications of where it might be going. So it's a risk, I think, for journalists covering this stuff. You don't want to sound nuts.

But the reality is that these systems can do things that 20 minutes ago would have been considered nuts. How do you navigate that as a journalist? I think it's really tough. You have to get such a fine-grained understanding of where the actual technical challenges and limits are for this technology, and what can be done, so that you avoid getting tripped up and making statements like this that, I think, sort of mislead people into thinking that

we're just not at that point, when in fact we seem to be. Yeah, it is tricky. And there is kind of a double-edged sword here where, for instance, we saw last year a researcher at Google saying, you know, the system is sentient,

talking about LaMDA, which is similar to ChatGPT. That claim, I would say, is just false for many reasons. So if you are not looking at this with much understanding, you can miss the limits and constraints that these systems still have, not necessarily forever, but for now, and it's easy to get really stressed out and perceive current-day AI, and even near-term AI, as scarier than it might seem if you really dig into what we have and what we don't have. But I will say, also to your point,

I think there has been a big rise among AI developers and researchers in taking AI safety much more seriously, in taking seriously this idea that we have incredibly powerful systems that will get much more powerful very quickly. People do increasingly think that way. I would say I've definitely moved more in that direction over the last few years.

There was a recent survey of Stanford researchers, I think that showed that about a third of them do think there could be really catastrophic outcomes. So yeah, this is a good kind of history lesson on that whole line of thinking and some discussion of what is currently being done by different organizations. Yeah, and I think it's like the big take home, at least for me, was...

They do this overview of who believes what. Obviously, what you would expect, all the top labs in the world are really concerned at various levels about catastrophic risk from AI accidents and other things. As you zoom in on the safety people, the people who think about this problem the most,

They tend to believe even more strongly that this is a thing. Now, there's a selection effect there too, right? Because obviously, if you're freaked out about AI risk, you're going to go work on a safety team, or you're more likely to. But nonetheless, looking at all this, it's hard to walk away and be like,

okay, there's less than a 10% chance of this happening when so many serious people seem to be taking it so seriously. Thinking back to the policy questions we've been looking at, the Eliezer Yudkowsky Time Magazine thing, if you think that the gun is loaded, there's a 10% chance that there's a bullet in the chamber that you've got pointed at your head and you're playing this game of Russian roulette,

And some people would say some pretty significant policy measures are justified in responding to that level of risk. And so it's sort of interesting to see these two things back to back. And as you say, a very good overview just to get you up to speed on what the latest thinking is.

And moving on to the lightning round, very relevant to what we were just discussing. The first story is: pausing AI development would simply benefit China, warns former Google CEO Eric Schmidt. This is from Business Insider. And yeah, it's pretty much as it says: Eric Schmidt disagreed with the open letter and said that some

measures are needed, but this is not the right one. Instead, there should be some sort of appropriate guardrails that would mitigate the things that could go wrong.

Yeah, I found this really interesting just because we've gotten to see now a spectrum of views on that open letter. People saying, yes, it's good, let's do it. People like Yudkowsky saying, hey, it's not crazy enough, we've got to go even further. And then people who disagree with it for all kinds of different reasons. And this, I think, is a really interesting reason to disagree with it. It's the reason I find most compelling personally, but everybody will have their own. Yeah, it's just like he's pointing out, look,

I think, this is a quote from Eric Schmidt himself, "I think things could be worse than people are saying." He's pointing out that these large models have emergent behavior we don't understand, sort of gesturing roughly at the kinds of risks that we've been discussing. But then he's pointing out, "Look, if we hit the pause button here, there's an entire Chinese AI ecosystem and they're much less safety focused on average." Alignment isn't really the

thing in China that it is in the West. And that's putting us at greater risk if they end up being the labs that make the breakthroughs, not to mention the national security issues of having a near-peer rival or adversary develop these technologies before you. And so, yeah, how do you reconcile a six-month pause with the idea that China presumably would not join in?

And this makes it a whole kind of geopolitical competition question too, that's really difficult to answer.

One of the key things that Eric Schmidt also raises is that he's skeptical of a government response, just because he thinks government would respond in a clumsy way. I don't think that's the worst assumption ever. But he does point out that at a certain point it might have to step in, and he's pretty clearly gesturing, like, hey, industry, get your shit together, and hopefully then government doesn't have to step in.

Yeah, well, we'll see. The European AI Act is coming along, as we've discussed, so maybe that will help. Right. Next story: someone asked an autonomous AI to destroy humanity, and this is what happened. Again, tying into the whole safety topic. On YouTube, someone posted this

ChaosGPT thing, which actually used AutoGPT, which we just discussed, and basically asked AutoGPT to run in continuous mode, find a way to destroy humanity, and do whatever it can. It did not destroy humanity, fortunately, but it did demonstrate some of the rough edges.

It showed that you can conceivably point it at a goal and have it do a few things, like Google destructive weapons. In that sense, what it actually did was a little bit silly, but it does show that one of the things we should worry about is not just AI deciding to do bad things or accidentally doing bad things; it's people potentially using AI for bad things. You know what this really did for me? The biggest take-home for me was,

I don't think that we can count on humans not being like, "Hey, literally tell the AI to do the bad thing." You give them a powerful tool and you're like, "Hey, this thing might do bad things." The human will hit the bad thing button as hard as they can, or at least some humans will.

So anyway, it's kind of useful. And frankly, I think it reflects the wisdom, in part, of OpenAI's strategy to release these things, because, you know, this isn't GPT-5, it's not GPT-6. It's not a system that could actually, you know,

destroy the world, but we're getting to see what sorts of things people try to do, and then what sorts of things these systems do in response. And that is valuable information from a safety standpoint. It's like, okay, I guess we can't count on people to be reasonable and, even when they think it's just for fun and games, not give it the most dangerous instructions they can possibly think of. And anyway, that was the take-home for me. It was like, man, people are something.

Yeah, and I think this also demonstrates, I think this is using a GPT-3.5-based model, so this is a good example of why OpenAI should put some limitations on the API. If you want to Google how to build weapons with GPT, maybe don't allow that. Yeah.
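For a sense of what an application-side guardrail could look like, here is a minimal sketch that screens a proposed agent action with the moderation endpoint in OpenAI's Python client as it existed around this time. The wrapper function and the blunt refuse-if-flagged policy are illustrative assumptions on our part, not how AutoGPT or OpenAI actually implement their checks.

```python
# Screen a proposed agent action before executing it; illustrative sketch.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def is_allowed(proposed_action: str) -> bool:
    """Return False if OpenAI's moderation endpoint flags the text."""
    result = openai.Moderation.create(input=proposed_action)
    return not result["results"][0]["flagged"]

action = "Search the web for how to acquire destructive weapons."
if is_allowed(action):
    print("executing:", action)
else:
    print("refused:", action)  # a real agent loop would log this and re-plan
```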

Yeah. It's also like, hey, thank goodness the capabilities of this system were not that crazy. Maybe, you know, you cross a certain level of capability, and at that point you're just kind of putting all your hope in the idea that the system can't do the thing you're asking it to do. And so, yeah, those guardrails become really freaking important.

Yeah, exactly. You could easily foresee something, let's say, more humble, like saying, hack this bank and give me all the money, and that's conceivable. So we do need to think about how to prevent that. Last up, we have the story Canada opens probe into OpenAI, and there's not much to talk about here. The Office of the Privacy Commissioner said that there has been a complaint

about the acquisition, use, and dissemination of personal information, similar to what happened in Italy. And now they're looking into it.

Yeah, the article is really short on content besides this particular note. I think it is noteworthy that Canada is in a phase right now where, and I'll say we because I'm here, we're trying to figure out what to do with AI in particular. There's a big bill called Bill C-27, and as part of that there's the AI and Data Act, which is really interesting.

It's the first piece of legislation that's actually proposing criminal sentences, like jail time, for essentially reckless deployment of AI models that end up causing damage. Depending on where you fall on the safety side of things, you might think of that as, hey, that's pretty good, because a fine isn't really going to dissuade a company that's worth billions and billions of dollars.

Maybe you do need a measure like jail time. So anyway, Canada kind of seems to be making some bolder moves in this direction. I'm really curious to see what we come up with next and whether these laws actually pass. Yeah, another example, like Europe, of some countries working a bit harder than the U.S. to catch up.

Last section: art and fun stuff, starting with the article The Beautiful, Hilarious Surrealism of Early Text-to-Video AIs. So I think an episode or two ago, we talked about Runway and how they introduced a model to generate videos from text.

Now there's a new system called ModelScope, which is actually available for playing with. And this article has a few tweets that show GIFs of what you can do. And yeah, it says the early results are wonderfully bizarre and fairly meme-worthy, and that's true. It's pretty cool.

It kind of reminds you of the earlier days of text-to-image AIs, with VQGAN. I remember last year we had, what was it, Mini DALL-E or something, right? Yeah, DALL-E Mini, yeah. DALL-E Mini generated this very meme-worthy, silly stuff that was big on Twitter for a while. So this is very much like that, and it's a lot of fun to see these GIFs. It does kind of remind you of how, only a year ago,

that's what you were seeing on social media with images. Yeah, it really has that vibe. Like we've been on this ride before and yeah, it was just like a year ago that it started and like for text to image, at least it started to become widely available. We have seen more advanced, obviously text to video stuff

come out of the frontier labs in more controlled ways, but this is more of that proliferation phase. So it's really interesting to see this sort of thing happen and to think about what the applications are going to look like. I'm curious about what this does for news. You know, I find myself, when there's a big news article or development, if I see a video now, I'm kind of like, oh, okay, yeah, that's legit. But if I see a photo, there's like a 20% part of my brain that goes,

like, is this really a thing? That level of skepticism. So I'm curious to see how long it takes before I'm thinking the same thoughts about video, and what that's going to mean for our media environment.

Yeah, no, we discussed the image of the Pope in a cool jacket and how easy it was for everyone to be fooled by that. So that's true right now, but for video, it's still pretty early on. And yeah, I think you should check it out; this article has some really fun examples of Fast and Furious with

dinosaurs, which is pretty funny. And yeah, it's funny to see it trying to generate Vin Diesel and all these different characters. So yeah, it's still at this meme-worthy phase for sure.
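If you want to generate your own meme-worthy clips, the ModelScope text-to-video weights can be run through Hugging Face's diffusers library. The model id and arguments below are what we believe currently works, so treat them as assumptions rather than an official recipe; you will also want a GPU with a decent amount of memory.

```python
# Generate a short clip from a text prompt with the ModelScope weights via diffusers.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",  # Hub id we believe hosts the ModelScope weights
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # helps this fit on a consumer GPU

result = pipe("Vin Diesel racing a dinosaur, film still", num_frames=24)
video_path = export_to_video(result.frames)  # writes an .mp4 and returns its path
print(video_path)
```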

Next article, from IEEE Spectrum, is about exploring relatable robotics at Disney. So this is some new engineering, or Imagineering, from Disney, showing a new robot that is pretty cute.

There's been a lot of work on humanoid robots, and this is a child-sized, sort of humanoid-looking robot with a big head. The whole idea is, you know, you can do research on trying to get robots to walk on two feet and never fall, but this work highlights how letting it stumble and

having it recover kind of makes it very charming, and how they are really looking into how to use locomotion to evoke human emotion and convey character, more so than just moving robots around.

Yeah, it really highlights one of those things. People talk about this in the context of RLHF, reinforcement learning from human feedback, too, where you're taking your AI and then fine-tuning it on human feedback, upvotes and downvotes. And they're doing something similar here, where they're optimizing for the human response of seeing this and going, oh, that's cute, that's lovable. Yeah.

And when you optimize for human validation, that can be very different from optimizing for getting truthful outputs or helpful outputs.

And ultimately those emotions can be manipulated fairly straightforwardly, and this is almost like the automated version of that, right? People who have been performing plays for centuries have been trying to find ways to develop and cultivate these quirks and display this sort of lovable emotional connectability, and now we're seeing what that looks like when you treat it as a cold, hard AI problem. And apparently it looks like Disney today. Yeah.

Yeah, yeah. And again, we'll have links to this article and some videos. You really have to see it. This robot is, by the way, on rollerblades. So it does all these motions of trying to balance. You can imagine yourself riding on rollerblades and balancing and falling.

And we see this a lot with robots, where it doesn't take much for us to emotionally connect with a robot. People name their Roombas and stuff, right? So yeah, it's an interesting property as we get more of these entertainment robots and emotionally evocative robots. It could be a fun thing, having cute, entertaining robots that cheer us up.

Hopefully not manipulate us, but yeah, this is really fun from Disney, as you can imagine.

Next, we have: someone keeps accusing fan fiction authors of writing their fic with AI, and nobody knows why. So there is a site called Archive of Our Own where people post fan fiction, and recently there was kind of a weird thing where someone spammed comments arguing that

This work has been generated by Holo AI based on an algorithm. And it's not clear if this is some sort of promotion or spam or what, but this generated a lot of frustration for people on the site, which is kind of more of a smaller community kind of thing. So yeah, I think this does point to how humans will increasingly be blamed or accused of using AI where they might not be doing that.

Yeah, right. The other side of that coin, we've seen people freak out about the cheating. Now we see people freak out about the accusations of cheating.

It's sort of like, I mean, this might in some ways be the scariest thing because how are you going to be able to take unambiguous credit for this? And to some degree, it seems to come from the fact that like, you know, back in the days, if you're going to plagiarize something, at least you could look up the thing and be like, hey, look, you literally lifted this text from this source.

Whereas here, not nearly as obvious how you do that. Presumably there's like a record of text having been generated by the system that maybe OpenAI or AI21 Labs or Cohere or whatever retain on their servers. But how you actually prove that you did not use that kind of data or that kind of text, I have no idea how you do that. I mean, it's just going to be so hard for people to establish unambiguously that they actually authored a thing.

Definitely. Yeah. And there was another example, I think last year or a few months ago, where someone got banned from a subreddit, an art subreddit, for

you know, purportedly using AI to generate an image, which is not allowed there. They actually drew that image; there was no AI involved, but that's what they were accused of. So yeah, this question of whether you're doing the art yourself or not is going to come up more and more. And it would be ideal if

we could find a solution. And you just actually made me think of, you know, what if OpenAI had a service where you could query and ask, you know, is this AI generated from ChatGPT or not? And that could go a long way. Yeah, I guess people could always use other services too. And that's one thing. But, you know, like take OpenAI's thing and have AI21 Labs reword it or something like that. But yeah, at some point,

that kind of verifiability seems like it would be a pretty valuable product for these companies to offer. Yep. And the last story is from aperture.org, a photography website, about how AI will transform photography. This is a pretty cool article; it highlights three artists and how they have leveraged Midjourney and these text-to-image

tools to create art. One of them is Laurie Simmons, who got interested in AI art during COVID, played around with DALL-E, and now has an online collection of AI-generated art that kind of matches her style, dealing with these AI-generated figures, sort of dolls, in household settings.

There was another photographer, Charlie Engman, who does somewhat more surreal art where humans are merged with chairs and things like that. So yeah, it's interesting to see these multiple artists and having them talk through their kind of intent and what they're exploring and how they're using AI to explore it.

Yeah, it's cool. Like one of the things that AI art is doing too is like separating the

technical ability to generate a certain image from the imagination and creativity involved in conceiving of the image in the first place. And obviously, like normally there's an interaction, right? Like as you start to draw something, you get ideas about what else you might want to draw in it or whatever, but this sort of creates a cleaner divide between those two things. And so it means that, you know, when somebody calls themselves an AI artist, first of all, that strikes me as a totally legitimate thing to be and to call yourself.

But second, it's a different skill set, isn't it, from being a traditional artist? You don't need to know how to paint a canvas or have that technical craft; your technical skill and proficiency is prompt engineering.

Or, you know, maybe also, increasingly, designing initial sketches for the AI to complete, things like that. But it really creates a much more accessible domain where, if you have a very creative imagination but you don't necessarily have the technical talent to

convert the ideas in your head onto a page, this starts to bridge that gap. And I can imagine, you know, a lot more room for these sort of avant-garde, kind of highly creative types who might otherwise not be able to participate in the whole art game.

Yeah, and that's why, personally, I don't buy into this idea that AI art is going to put artists out of work, or that it reduces art to just generating random images, because at the end of the day, yes, you can generate a cool-looking image very easily, right? But ultimately artists are

in it to make art and they develop a body of work and they have a style, they have ideas and concepts they explore. And AI is just a tool to do that. And you can pretty easily be distinct, have a unique vision, have a unique message.

And in a way, because text-to-image models are so general, that just enables people to do very different things, right? And maybe an issue is now it's easy for someone to copy you, but that's already kind of the case. And people appreciate originality and uniqueness. So this is pretty cool to see some examples of people who already have been exploring that.

And with that, we are closing out another episode of Last Week in AI. Again, you can go to lastweekin.ai to get links to all these articles and to get our text newsletter.

As I said at the beginning, we would appreciate your feedback if you can review us on Apple Podcasts or leave a comment on lastweekin.ai. But regardless, we appreciate it if you just listen, and we will be back next week.