
#115 - GPT4, Bard, AGI, Alpaca, Anthropic, Midjourney V5

2023/3/26

Last Week in AI

People
Andrey Kurenkov
Jeremie Harris
Topics
Andrey Kurenkov: OpenAI released GPT-4, the successor to GPT-3.5, which can handle both image and text inputs and significantly outperforms previous versions. GPT-4's training used far more compute than GPT-3's, and its performance gains come from the combined effects of data, model size, and compute. Although OpenAI did not disclose GPT-4's internal details, its high scores on exams like the LSAT and SAT and its strong performance across a wide range of tasks suggest its capabilities are at or approaching the level of AGI. GPT-4's scaling-law curve is flattening, hinting that it may be hard to further improve performance through simple scaling alone. In addition, fine-tuning GPT-4 does not reduce its performance; instead, it makes undesirable behavior less likely, which improves its alignment to some extent. However, the fine-tuned version of GPT-4 is less well calibrated than the raw model, pointing to a trade-off between conversational ability and reliability. The irreducible complexity of text may limit how much GPT models can improve, and the laws of physics may also cap how intelligent an AI can become.

Jeremie Harris: GPT-4's main distinction is that it accepts both images and text as input, though its output is still text. GPT-4 outperforms GPT-3.5 and its capabilities are surprising; its scores on exams like the LSAT and SAT are far higher than GPT-3.5's. GPT-4's capabilities were not predictable in advance, reflecting a common theme in AI scaling. OpenAI has kept information about GPT-4's internals confidential, which has sparked debate, with differing views centered on competitive advantage and existential risk. GPT-4's capabilities emerge across many tasks at the same time, which challenges traditional ways of thinking about AGI. GPT-4's capabilities are developing faster than we can reason about them philosophically, mathematically, and statistically. Scaling up GPT-4 improved its alignment, making it more likely to do what we intend. The fine-tuned version of GPT-4 is less well calibrated than the raw model, revealing a tension between conversational ability and reliability. The irreducible complexity of text may limit GPT models' performance gains, and the physical limits on computation may sit far above the human brain's capacity, which heightens concerns about AI risk.


Chapters
GPT-4 introduces the ability to handle both text and images as input, with text output, and shows significant improvements over GPT-3.5 in various tasks.

Transcript


Hello and welcome to Skynet Today's Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual in this episode, we will provide summaries and discussion of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for articles we may not have covered in this episode.

I am one of your hosts, Andrey Kurenkov. I am currently finishing my PhD at Stanford, where I do research on the intersection of machine learning and robotics. Yeah, and I'm your other host, Jeremie Harris. Actually, Andrey had the really good idea that like, hey, maybe we should, you know, introduce ourselves a little bit, since we never actually did that, or at least I didn't. And I know it's been a while since Andrey, you did as well. So yeah, I guess as a reminder, I...

I've done a bunch of work in AI safety right now. I work at a company called Gladstone, which I co-founded. Basically, we work with the US government, national security, military, and international governments on AI safety and AI safety policy. And we do, anyway, collaborations with some leading labs, researchers at DeepMind, OpenAI, Anthropic, that sort of thing. Bit of a rabbit hole, but that's roughly what I am, who I am, and what I'm doing here.

Yeah, we have kind of a nice set of different backgrounds on AI. So it's pretty good for a discussion, I guess. And it looks like, at least based on our last episode, we have quite a few new listeners, I guess, because of all this ChatGPT stuff. So welcome, new listeners. Yeah.

Yeah, I hope not. I hope people are here to listen about AI. So yes, welcome new listeners. We hope you stick around. If you'd like to, we would appreciate any feedback. We do check out our Apple podcast page to see what people are saying. So feel free to let us know what you think could be improved on there.

But that's enough banter. Let's go ahead and dive into the news stories, starting with our applications and business section, with the first story, surprisingly, being GPT-4. So OpenAI announced, and I think

even released, GPT-4, which is a follow-up to GPT-3.5, which has been the basis for ChatGPT, I think, since the beginning.

And it's largely the same type of thing, right? The main distinction is unlike previous GPT iterations, in the input to the model, you can now have both images and text, and the output is still text. So you can now ask GPT questions about images, in addition to just prompting it the same as we've seen with ChatGPT.

They released a pretty sizable paper on this that just showed that GPT-4 is good and better than GPT-3.5 at a whole bunch of stuff. Yeah, and unsurprisingly, this led to a lot of conversations and there's a lot of aspects to touch on here.

So what did you find interesting in this, Jeremy? Yeah, you know, nothing much. Just another, you know, incrementing an integer on the GPT label. Yeah, no, I mean, look, I think that this is, there's so much to talk about here. First of all, totally agree, you know, big thing that distinguishes this obviously is the fact that text and images are now inputs to this thing. Still outputs only text for now, or at least the main variant that was talked about here mostly just does that.

But the interesting thing is, first off, how little information we have about GPT-4. We talk about GPT-3 as this watershed moment where all of a sudden we go, oh my God, scaling works. You can just scale up these systems and get more intelligence.

And we got to talk about the guts of GPT-3 because OpenAI talked about it at the time. They told us how they did this. They told us about the data, the training, all that good stuff. And the only thing they didn't release was the actual model itself. Well, now, consistent with OpenAI's gradual kind of

backward motion on releasing stuff to the world, they're now not revealing any information, not just about the model itself, but about the training data or the training strategy. And so we're all kind of left guessing. And the reason why, which they've been quite open about, is, hey, there's a lot of competition in the space. We're not in a big hurry to just kind of like spill the beans on this one. Understandable.

But certainly a lot of the performance characteristics of this model are groundbreaking and surprisingly so. So you look at GPT-3.5, we're looking at LSAT scores in the bottom 10th percentile. So really crappy LSAT student.

Well, GPT-4 comes in and hits 90th percentile. This thing can pass the LSATs. It can pass the SATs pretty well, 80th percentile basically across the board, and numbers like this, they came out of nowhere. And this is another great example of that sort of like common theme we've seen with AI scaling, where you cannot predict the capabilities of a new system before you build it.

You may know that it's going to be more powerful. That's all the scaling laws tell us. They tell us if I make my system bigger in these specific ways, it'll kind of mathematically be more powerful in this way. But how that math leads to actual capabilities, that's still a complete mystery.

And so we just are handed these new capabilities. Hey, guess what? Before the SATs were an unsolved problem. Now they basically seem solved. And surprising emergence of capabilities is another feature of this landscape that we're just going to have to get used to. Like we don't know what's around the corner.

Yeah, I think there's a lot to touch on, as you said. A lot of conversation on Twitter was about how there's this footnote in the paper of, due to competitive advantage, we are not going to reveal any of the details, as opposed to GPT-3, whose paper is now a few years old.

Back in 2020, we actually had a lot of information about the internal bits. And even for reinforcement learning from human feedback for GPT-3.5 and ChatGPT, we had more kind of insight. Here, they really don't reveal anything aside from its performance, which is...

I could see some people arguing this is not necessarily a bad thing because from like an X-risk perspective, it's better maybe that fewer people can build these incredible models. But on the other hand, you know, it's probably pretty safe. This is still the scaling approach where they're using kind of the same model and they're just using more

and more training. And we actually have at least the standard power law figure in the paper, figure one. And it shows in terms of compute, the performance on just next word prediction. And we do see that they used about, I think a hundred times more compute, or maybe it was even larger than that.

And you got that performance increase on a bunch of downstream tasks. So yeah, it's presumably a mix of more data and more training and maybe model size. We don't really know. But we know it's something like that.
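
As a rough illustration of the kind of curve being described here: a power-law scaling law relates training compute to next-word-prediction loss and flattens toward an irreducible floor. This is a minimal sketch with made-up constants, not the numbers from the GPT-4 paper:

```python
# Toy scaling law of the form L(C) = L_inf + a * C**(-b): loss falls off as a
# power law in training compute and flattens toward an irreducible floor L_inf.
# All constants are invented for illustration; they are not OpenAI's.
L_inf, a, b = 1.7, 3.0, 0.05

def predicted_loss(relative_compute: float) -> float:
    """Predicted next-word-prediction loss at a given multiple of baseline compute."""
    return L_inf + a * relative_compute ** (-b)

for c in [1, 10, 100, 1_000, 10_000]:
    print(f"{c:>6}x compute -> predicted loss {predicted_loss(c):.3f}")
```

The point of curves like this is that the loss is fairly predictable from compute, while the downstream capabilities, LSAT scores and the like, are not.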

And isn't that the equation too, right? Like in the last few years, it just kind of seems like there's a cocktail of data, model size and compute, and it just kind of gives you capabilities. And nobody, you know, we know that that'll keep happening, but sorry to interrupt. I do think you're raising a really important point though, on this question of the X risk, the existential risk angle to this.

Because this is like kind of, maybe this is a conversation, it's kind of worth it for us to get into just because of the light we can shed on this for people who maybe aren't tracking this dimension to the AI ecosystem. But there is sort of this polarity where some people are like, look, we don't want all the AI capabilities to be concentrated in a small number of hands. We don't want OpenAI or Microsoft or Google to be the only companies that can field these models. And people on the other hand saying, well, look, we have research that suggests that, you know,

Powerful AIs are intrinsically dangerous. And at what point do we cross that threshold? How would we know that we've crossed that threshold? And that's a whole kind of sub-part to this whole story too. There's a section in the GPT-4 paper where they actually talk about having run an independent audit where they bring in people from this group, the Alignment Research Center.

It's headed up by Paul Christiano, who used to head up AI safety, AI alignment at OpenAI. And he has since left, and now they're bringing him back in to run this audit to test GPT-4's capability for power-seeking.

When you think about this, it might sound like science fiction. This is mundane reality, not just in AI safety, but increasingly AI capabilities. When you have systems that can plan ahead that much, you got to start to worry about, are they going to realize that, hey, I can't achieve my programmed objectives if I get turned off or if I don't have access to certain resources? Kind of cool to see this, whichever way you fall on this debate; at the very least it doesn't seem like a bad thing that they're inviting in this external auditing and maybe setting that precedent going forward.

Yeah, I agree. And there's other interesting details in terms of the academic side. Even if we don't know the secret sauce of how to train it, there are some interesting conclusions. I think for me, one thing, for instance, is if you do look at that scaling law, and when we say scaling law, we just mean being able to predict the performance as you scale up the size of a model or the amount of training required.

we are starting to flatten out on that curve. It's exponential, inverse exponential, and it's becoming a bit flat. And yeah, I'm not sure if we can go a thousand X bigger in terms of compute, right? This already costs tens of millions to train and takes, from what I know, months

So yeah, I think GPT-4, again, like we've seen with ChatGPT already, is incredible. It can do a crazy amount of stuff. It can do a lot of it better than before. And it can now do vision. And I think another interesting conclusion for me is typically when people have done these image plus text models-- and we've had a few of them already--

One of the issues is actually that having both modalities results in the model being worse. If you just do text versus text plus image, there's a trade-off there. But in the results of GPT-4, they do show text-only and image plus text performance. And there's actually not much of a performance drop for most of these tasks.

Which is, I think for me, quite interesting that they just managed to keep it about the same.

I think I totally agree. And this is it's funny. I haven't heard that many people harp on this quite as hard as you did. And I think that's totally warranted. You know, I remember back in the day, I think the first big example of this kind of question coming up was when DeepMind came out with Gato. And this is their AI model that famously could perform like 450 tasks, at least half as well as a human, something like that.

And people were pointing out that actually that model exhibited what was known at the time as negative transfer. So in other words, the stuff that it would learn from, if you train it specifically to do a small number of tasks, it would do really well, but then it couldn't quite port that knowledge onto other tasks. When you train it to do other tasks as well, it didn't benefit from that focused knowledge. Instead, it got worse at everything.

So you sort of have this idea of a system that isn't quite generalizing as robustly as it should. And a lot of people were looking at that and saying, yeah, there you go. This is an absolute roadblock on the path to AGI. We seem to have this problem where scaling fails because of this transfer problem.

It kind of looks like that argument is joining a long list of other arguments that are finding their way into the cemetery of, "Oh, don't worry, AGI isn't coming anytime soon," arguments where really it looks like yet again, we're seeing, nope, scaling solves that too. Interesting question as to how far it goes. You pointed out the flattening of the scaling curves.

I'm very curious to see what can algorithmic improvements do to keep that curve going? How cheap does processing power get? What about hardware improvements? I think it's fair to say these are all the open questions right now that the entire AI community is looking at as it looks more and more like we've cracked the code on intelligence. Yeah, exactly. I think at least in some sense, and we'll discuss this in a bit actually in the research section,

We definitely should be starting to have a conversation of, is this getting to the point of something like AGI, things like that. And I think some of us in the AI space are a little bit cynical and maybe still are trying to say this is not that big a deal. But

This is a really big deal. A couple of things to note just before we finish up this discussion. Many people have been pointing out that in various kinds of tests of GPT on things like SATs or AP tests or even academic benchmarks, one of the issues is contamination, where in fact the model, just in its training data set, happens to have

maybe the exact questions of the benchmark or the experiment. Because if you're trying for replicability, the questions and the data is out there. And they do have some additional results that showed that at least on some of these tasks, some of the performance can be attributed to having this additional data. Although it's actually not too bad from what they are able to showcase.
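
For a sense of what a contamination check involves, here is a minimal sketch of the n-gram overlap idea; OpenAI only describes their procedure at a high level, so this shows the general shape of such a check, not their exact method:

```python
def looks_contaminated(question: str, training_corpus: str, n: int = 8) -> bool:
    """Crude contamination check: flag a benchmark question if any run of n
    consecutive words from it appears verbatim in the training corpus.
    Real checks are more involved; this only illustrates the idea."""
    words = question.lower().split()
    corpus = training_corpus.lower()
    for i in range(len(words) - n + 1):
        if " ".join(words[i:i + n]) in corpus:
            return True
    return False

# A leaked exam question gets flagged; an unseen one does not.
corpus = "... if 3x + 5 = 20, what is the value of x? ..."
print(looks_contaminated("If 3x + 5 = 20, what is the value of x?", corpus))  # True
print(looks_contaminated("A train leaves Boston at noon traveling west at 60 miles per hour", corpus))  # False
```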

Yeah, and that cuts to the core of another one of these fundamental questions that people love to ask every time we cross a new threshold. It's like, does this machine understand? That word gets thrown around a lot, but is it just memorizing stuff? And then is it just regurgitating effectively in cleverer and cleverer ways, but basically just regurgitating stuff that it's already seen? And

I don't know, interesting questions about whether that ultimately is roughly all that the human brain ever does. But this certainly, you know, this question you raise about are these SAT scores really reflective of GPT-4's genuine capabilities out of the box? Or, you know, did it get to peek at the answer key? Very central to this whole debate too. Yeah. And I think ultimately we still are getting a lot of, you know, challenge in terms of just evaluating this, right? Because...

These benchmarks are just not really capturing the whole picture of once you actually play with it and it can do a million things.

And now you're trying the SAT, but ultimately these benchmarks aren't truly capturing it. It's like an indirect portrayal of how capable or impressive GPT-4 is. And I think that is still, yeah, yet another question we need to think about.

Well, I'll make this my last thought on GPT-4; I feel like we could do a whole episode on GPT-4, I'm sure. Maybe we should. Maybe we should. Or maybe it'll do the episode. But yeah, I mean, I think one of the things this highlights too, you know, as people look at GPT-4 and what it can do

One of the plots that's been circulating around Twitter a fair bit is this plot that shows GPT-3.5, so the precursor to GPT-4, GPT-3.5's capabilities across a whole bunch of different tasks. And it shows how it was good at some and it was bad at others.

And then it overlays the capabilities of GPT-4 and shows that suddenly you go from like basically, you know, 0% or 10% scoring on one task all the way up to like 70% or 90% even on that. And it happens for a bunch of these tasks.

So you sort of have this idea of a cluster of capabilities that all emerge at the same time, and the confusion in the community about what to do with this. Do we look at this and say, for example, "Oh, well, look, it still can't do this very specific task, and that specific task is a task that a human could do really well, and therefore, GPT-4 is not anywhere near human capability."

Like, do we go with that argument or do we say, well, you know, this is a fundamentally different kind of intelligence. It does some things way better than a human, other things way worse. And so we need to kind of change the way we're thinking about things like human brains as benchmarks, almost even though they're crappy benchmarks and they're very fuzzy, may not even be appropriate as points of comparison. And so just like the problems are so deep and I think they reflect just the

speed at which these new capabilities are being developed, we haven't had time to chew the fat here and think philosophically, mathematically, statistically about what we're even building here and how we would measure its capabilities. Yeah. There's just a ton of thinking and things are moving very fast. So that's a challenge. And one final thing on my end to note is one of the other cool things in the paper, aside from

the kind of performance on various things is also in terms of just the alignment angle of how much does it seem like GPT-4 does what we want it to do instead of unexpected bad things. Just scaling up appears to actually

make it better in terms of doing what we actually want. And another interesting thing is once you fine tune it on human feedback, so you kind of align it even more,

There's no performance drop. Actually, on average, it's still just as capable, but now it's less likely to do bad things. And so I think that's another interesting thing to note from just results that is encouraging, I would say. It is. And actually, there is a little chink in the armor there, which is they talked about how well calibrated

is this model. And when they talk about how well calibrated it is, a bit of technical language, basically all this means is if the model says that X is true with 70% certainty, it ought to be right with its prediction about 70% of the time. In other words, how well can the model determine its own internal level of certainty? And they looked at the raw model, GPT-4, basically the system that you get after you train an AI to do text autocomplete on a grotesque amount of data.
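
To make "well calibrated" concrete, here is a minimal sketch of how calibration is commonly measured with a binned expected calibration error. This shows the general idea only and is not OpenAI's exact evaluation:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Group predictions into confidence bins and compare each bin's average stated
    confidence to its actual accuracy. A well-calibrated model that says it is 70%
    sure should be right about 70% of the time, so every bin's gap is near zero."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap  # weight the gap by the bin's share of samples
    return ece

# Toy example of an overconfident model: it says 90% but is right only 60% of the time.
print(expected_calibration_error([0.9] * 5, [1, 1, 1, 0, 0]))  # ~0.3
```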

And then they compared that raw model to the version that was fine-tuned on some human feedback to make it a little bit more kind of conversational.

And what they found was that the fine-tuned version, interestingly, although it's more human-like, actually is worse at being calibrated. It basically gets overconfident or underconfident a lot more often, which is sort of this, I don't know, very interesting thing from the standpoint of alignment, kind of shows you, you know, on the one hand, you want to align this to make it more conversational. That's part of the value we want here.

But it turns out that by doing that, you are actually making the model worse at understanding its own level of confidence. And as a result, we're almost faced with this contradiction. What do you want? Do you want a model that's engaging and conversational? Do you want one that's, let's say, truthful and self-correcting? And those two things, it's not clear if they're necessarily at odds, but they certainly have diverged in this particular context.

Yeah, and there's, again, some interesting questions and research to be done there. In some sense, you would expect it to be less well-calibrated because you're taking the objective of probability prediction, basically what it is being trained on, and now you're messing with it to say, output these things less, even if in the data set they are more frequent.

So in that sense, you would expect the probabilities to shift. And the question is, does that actually seem to affect performance? And in some cases it does, in some cases it doesn't. So yeah, even if OpenAI didn't release the details of a model, the good news is they did release a paper with a lot of results, a lot of details, much shorter than GPT-3 as a paper, but still quite a bit

So yeah, exciting times. And wow, I hope GPT-5 doesn't come out in a month. Well, there are rumors, eh? I don't know if you saw this on Twitter, but JP Morgan has a bit of analysis that they did where they said that they think the GPT-5 training run is currently on. So giddy up. I guess it's just a matter of time. But again, it's...

It'll be very interesting to see if we can push this much further, given the kind of compute limits just from physics. And on that, like one just last thing, I keep saying this, there's so much freaking stuff. But one last thought is like, you know, is there something...

of an irreducible complexity to speech that is, that causes this limit, this kind of flattening. Like, do we get to a point where every once in a while, uh, an idiot podcast guy like me says some random string of letters and numbers that is totally unpredictable. Um,

Or every once in a while, I'll say something weird that really makes no sense. At what point does that or I don't know, I mean, I'm getting kind of hypothetical here, but it certainly is something that people have talked about is this idea of the irreducible entropy of just text. And I might be actually, I might be screwing up whether that applies here. I'm not sure I'd have to look in more detail at the specific scaling curves, but that could be part of it. Yeah, I mean, I think it's definitely true that

At some point, you can't get perfect performance. You can't become like an Oracle that knows exactly what's going to happen. And for me, that's one of the interesting points with what we'll talk about in a bit, where personally, I think there is reason for concern. But at the same time, a lot of people seem to think that we'll get to a godlike AI that is just supremely intelligent and can do things that are way beyond

the capabilities of any human. And I do think

We'll get something very smart potentially, but I do think there might be physical laws that limit the level of intelligence of any being. This is just a pet theory of mine. No, no. I think it's absolutely correct. People have actually looked into this question in some depth. There's been analysis done of what is the maximum amount of compute that you could fit into a certain volume of physical space?

And unfortunately, well, I want to say unfortunately, because obviously I'm the AI risk freak out person. But unfortunately, I would say that quantity is monstrously higher than what goes on in the human brain. So to the extent that compute maps onto performance, to the extent that scaling laws do hold,

I do expect them, by the way, to be more robust as we go multimodal. We start to get AIs as well that do stuff like teaching themselves in the ways that we'll discuss more today. I do expect that trend to continue personally, and that's why I'm quite concerned, just because the physical limit seems so much further from where the human brain is. But who knows? We'll see.

Hopefully. Well, let's take a break from GPT-4. We'll actually be back to it in just a bit. But some other things that happened aside from GPT: Google has opened up access to its ChatGPT rival, Bard. And it's still on a waitlist basis, but people are starting to play around with it. There were a bunch of posts on Twitter with people's first impressions.

And from what I've seen, I haven't played around with it that much. It's been a little bit underwhelming, where it doesn't seem to be quite as smart as ChatGPT and it doesn't have the capabilities of Bing in terms of including links to things that it kind of, you know, cites. It is faster, which is pretty good. ChatGPT is not the fastest, but...

First impressions appear to be this is similar to ChatGPT, not necessarily better, maybe worse, but probably, you know, not too far off is my impression.

Yeah, and I think a fascinating question is going to be like, why is that? Is the reason that Bard has not been scaled as much? Is the reason the quality of the data? Or is the reason, and this is where, again, I put on my worrywart hat here, but is the reason that Google has invested more time and effort into RLHF, into basically alignment and making sure the behavior of this is safe and robust,

And that as a result of that effort, they've kind of lobotomized the system in the same way that we saw Microsoft lobotomize Bing chat shortly after it came out and started like threatening users. Because, you know, if we start to see that, then it kind of, you know, you start to look at this as a bit of a race to the bottom where, hey, you know what, you invested, you know, 10 times more in safety than the other guy. Well, you're going to lose. Right.

And I very much hope that's not the case. This is part of the reason why I really wish OpenAI would release more information specifically about the alignment part of GPT-4. Because we've got to get these leading labs to come out and make it clear to the world they're investing in this stuff, if only to set precedent and if only to encourage others to not skip out on the safety piece. But yeah, it's fascinating. And it does seem to kind of underperform. I agree. It's not quite there yet.

Yeah, and I have seen some comments where some people have been hypothesizing that maybe part of that is because it is much more safe, that its outputs tend to be more bland and uncreative. So it does seem like a reasonable guess. It is worth pointing out that as far as I've seen,

There hasn't been a paper on Bard. Google has not released much of anything from a technical perspective. They have had various papers in the past on systems similar to this. Last year, we had LaMDA, and there was a whole paper, but not so much with this. So we rag on OpenAI and Microsoft, but then again,

It could be the trend that just starts now of people not disclosing this across the rest of the industry.

Yeah, yeah. And I honestly, to your point from, I think, right at the beginning of the podcast, you were talking about, is this a good thing or a bad thing? I think it's super ambiguous. Like, you know, it's as ambiguous as, you know, somebody refusing to leak the details of some like nuclear weapon control, like technology or something that makes sure a bomb doesn't go off prematurely. It's like, yeah, it's good for safety, you know, but it's also good for capabilities. And it's hard to know,

Whether proliferation is what we want here, I think it's not. But there are certainly people who think that it's good and open publication bans are not great for that sort of thing. Yeah, and this might be a place where

Regulations might come in where there would be some level of required transparency as to how much alignment was done, for instance, or just alignment evaluations, which would be pretty interesting. So I could see it heading in that direction.

But who knows? As we keep saying, there's just so much to consider. Well, and as things go dark within these labs and groups, one thing that has been talked about a fair bit in the safety ecosystem is like,

Is there a way to set up a tripwire where you could, for example, go, okay, this latest model that's being built, say, at DeepMind or OpenAI, it just crossed the threshold of capabilities that gets it into power-seeking territory, into existential risk territory. And as a result, we're going to alert people in other labs. We're going to bring people over to complement our alignment efforts on this thing and so on. It's all like variance on the theme, but it just seems...

It's so crazy that we are there, that we're having that kind of conversation, and it doesn't sound insane given the state of things. It doesn't. Yeah, even to an X-risk skeptic like myself, it's definitely getting to a dangerous point where we've seen a lot of cases of people misusing AI already.

Police we've discussed many times, with facial recognition. And as these methods get more powerful, it just means that there could be a larger impact from people misusing them. And now, if they become ever-present, just like smartphones, a lot of things will go wrong. That's for sure.

Well, moving on to a lightning round with just a quick discussion of a few stories to start off. Pretty short thing to comment on: it has been confirmed that the Bing AI runs on OpenAI's GPT-4. So it actually, you can say, has all that capability that we've just discussed.

Yeah, and so big shout out to a very, very, very, very niche internet user called Gwern on the Alignment Forum, no, not the Alignment Forum, LessWrong, I think. So Gwern is, you're not going to know who this guy is unless you know who he is. But quite a while ago, he posted to, sorry, to LessWrong, I keep doing that, saying exactly this: in his assessment, this was GPT-4 running in the background and

There are so many questions that this raises. First of all, Bing Chat was running GPT-4 months before OpenAI said, "Hey, we think GPT-4 is now safe to release because we finished aligning it. We finished making sure that it doesn't behave like an unhinged crazy person." What did Bing Chat do when it was released without all of these extra safety measures? It behaved like an unhinged crazy person.

And so what we seem to have here is a situation where we need to ask, you know, like, what are Microsoft's release standards around safety? Are they indeed materially different and lower than OpenAI's? And if that's the case, what accounts for those differences? And what amount of say do regulators now need to have in terms of deciding, look, we have clearly different organizations with vastly different safety standards? And, you know, the issues that arise when you don't

add that extra layer of alignment that OpenAI did are what we saw with Bing Chat, threatening users, giving people advice on how to make bombs, the works. This is pretty serious stuff. And so I think this retroactive revelation that we had an earlier version, an unaligned version, let's say of GPT-4 running, is I think something that should be getting a lot more attention than it is.

Yeah, and I think just to add to that point that to me what seems interesting is it's one thing to release a chatbot like Bing and there's some unfortunate discussions, but the damage is kind of limited. OpenAI has an API so anyone can build something on top of GPT-4.

And what is their moderation policy on allowing and disallowing various uses? They do have, of course, code of conduct and all this legal stuff. So they can say to various users, you can or you cannot use GPT-4 for this. But again, as it gets adopted in a million different use cases, to me, that would be a concerning case of...

There are many ways to use this for bad purposes, and it may get harder and harder to actually limit people from doing that. Yeah, that's so true. I mean, the malicious use angle is so interesting, isn't it? Because you have this situation where it's not even clear. There's no real way sometimes to know whether someone is prompting the system for malicious purposes. You could imagine prompting it by saying, hey, make me some malware that does this thing.

And like, okay, that's obvious. The word malware is in there and the intent is clear. But what if, you know, you write like, you know, write an email to somebody, you know, to Andre who has a PhD and blah, blah, blah, and recommend to him that he downloads the attached file. And like, is that...

That could be a phishing email, or it could be a totally benign email. And it all depends on what is in the attached file. And so that's the impossible challenge, I think, for a lot of these systems is sometimes you just don't have the information to even judge. So what are these security thresholds even for? How much security do they actually offer? It's an interesting open question.

I guess we can use GPT to tell us whether it is being misused or not. Moving on to the next story, again, building off of all the stuff we're talking about: Microsoft and Google unveil AI tools for businesses. Not too much to comment on, the headline is what it is. Microsoft is integrating AI assistants, which it calls Copilots, into various systems,

things like business chats, documents, and various things. Google is also going to embed it into Gmail and Google Docs so that it can draft things, write various documents for you. Just another showcase of how quickly things are moving. Now, if you're a user of Excel or Word,

You have GPT right there already integrated. Yeah, I think you're so right. It's also this, I think, qualitative step in the direction of an AI-first economy, really, where it used to be that we'd think about these AI systems as being the tool. ChatGPT is the tool. GPT-3 is the tool.

And it was up to secondary companies to kind of figure out how to integrate them and work around them. But now what we're starting to see is actually a pull from the direction of Excel, Microsoft Word, Google Docs, whatever. And these tools are just going to live there with us. And we're not going to have to actually go out of our way to use an AI tool. Everything becomes an AI tool.

And so I think actually a very interesting dimension of this and a whole bunch of new challenges and opportunities are going to arise from this sort of thing, not least of which I think is a user experience challenge.

You think about how do you integrate AI into Google Docs, into Excel? What does it look like to have recommendations that don't piss you off because they're not too frequent, but they're also not too rare? There are a lot of unsolved user experience problems around AI integration that people are going to be figuring out in a real hurry. And it's going to be interesting to see what the new norms are, what the new icons are in our user interfaces and all that stuff. Yeah.

Yeah, exactly. I think it's not hyperbolic to call this as big a deal as the introduction of smartphones, right? That everyone has one. It completely changed many aspects of our lives. And even maybe the introduction of the internet, you could argue. So it's, yeah, it's crazy. Yeah.

And again, building off of this whole theme, because that's what's going on, we have introducing Claude from Anthropic. We discussed Claude a little bit, I think, in the research sections before, but now it's actually being rolled out with some companies like Notion and Quora and DuckDuckGo in closed alpha. So basically, Claude is...

Something like GPT-4. It's built by another company, Anthropic, which was actually started by people from OpenAI largely. And what people have been saying is that it's maybe more reliable, that it doesn't...

seem to do wacky things as much as GPT, which is kind of cool. But so far you cannot play with a chat version of Claude, and we'll see, I guess, if that is going to come up.

Yeah, and I think, so, you know, quick shout out to the anthropic people. One of their policies on this one has been to do fast following. So basically, they're aware of the race dynamics in the AI industry that we've talked about in this episode and others, and they're determined not to contribute to the race dynamics as much as they can. And so their strategy here is to wait until other firms, other labs, release models of comparable capability, and then they say, okay, we'll come out with ours.

And so Claude here represents this next step in the business evolution, the release of Claude, even in this sort of closed beta, represents a commitment to that strategy. And I think it's healthy. I mean, it would be great to see more like this, especially in a context where it does seem like capabilities are outstripping our ability to control these systems. But also interesting, I think, is, yeah, this trade-off between how creative do you want the system to be and how

how risky do you want it to be? How, let's say, safe do you want it to be? So in this case, I guess Anthropic has given people a knob they can tune to sort of tune the dial between levels of creativity, much like, I'm trying to remember, I think, was it GPT-4?

Three? Four? 3.5? One of them. Somebody released a thing recently that has that kind of setting. So I think we're seeing more and more. Oh, it was Bing Chat, I think. Yeah. So more and more ability to kind of customize the model to your own preferences, maybe giving users the option, go with a creative version of this system, but...

at your own risk. Don't complain if it starts to threaten you. That's your own choice in a sense. But anyway, I think it's an interesting dimension to this, to see that setting appear in yet another model release. Yeah. And with respect to race dynamics, I think another interesting thing to note is, if I remember correctly, Google invested 350 million in Anthropic,

these models do require hundreds of millions and a lot of compute. There was a $500 million investment in Anthropic by FTX or the founder of FTX, which I'm not sure how that's going now. So that is another factor to consider that Anthropic is being pressured by Google, presumably to maybe hurry up a little bit.

Yeah, hopefully the investment wasn't done in FTX stablecoins, but I guess, who knows? But yeah, no, it's such a, the funding aspect, right? And this is so interesting. And well, maybe that leads us smoothly to the next story. It looks like we've got a fundraising story about Adept.

Yes. So speaking of hundreds of millions, this company, Adept, has raised $350 million to build AI that learns how to use software for you. It's already raised $65 million before. It has demonstrated a little bit how you can basically ask

a language model, not just to respond to you, but to do something like go look on hotels.com and book me an affordable place in Florida or in Miami. And it can interface with the internet and with tools to do that. And it seems like people are pretty optimistic that they can pull this off.

Yeah, I think this is another one of those interesting business strategy questions too. Like as we're seeing models get scaled more and more, one of the things that seems to be emerging organically is the ability to use tools. And so I personally, I mean, I'm a little concerned for the Adept team. I know they do have a bit of a safety focus, so I got a soft spot for them, but I'm personally concerned for them just because, you know,

there's a chance that they get eaten by generality, by more general systems that can solve what Adept does as a subset of their more general capabilities. So for example, OpenAI just came out with ChatGPT plugins, I think just today or yesterday.

And, you know, this is basically roughly that: it's an AI that can use tools for you. And Toolformer is, anyway, the framework that allows you to do that sort of thing.

I think a really important and fundamental business question for Adept to look at is, have we hit liftoff velocity where we're going to be able to bring in enough money to compete with the big guys, given that their really big systems are starting to chip away at the capability space that once was just Adept AI's special little bubble? We'll see, and maybe I'll be proven wrong. I have been before, but it's an interesting dimension of business risk that exists here.

Yeah, exactly. And this whole dimension of having APIs

to access the external world is a whole other thing. But let's move on to the research and advancement section having spent, I don't know how, 40 minutes on just one portion. We are coming back to GPT-4. Maybe we won't discuss it quite as much. But there was another paper released just a few days ago, I think, by Microsoft titled Sparks of Artificial General Intelligence.

early experiments with GPT-4. This is a bunch of additional experiments on top of what we saw in the GPT-4 paper that really try and stress test it, really put up even more challenging things that aren't just benchmarks. And they demonstrate that it can do a lot of pretty impressive things

in a lot of domains. And they argue that at least in some sense, we should start thinking of this as artificial general intelligence, which is something that at least in the AI world is pretty well known.

AGI, and that has been kind of the dream. And personally, I do think that this is a reasonable stance. And I did find myself pretty impressed even having known GPT-3 and so on. There's a lot of examples of cool things you can do, and some of them were quite surprising to me. Yeah, I totally agree. And the mix with images as well, I mean, some of the

I just want to kind of, as a quick aside, note that this idea of Moravec's paradox keeps coming up. So just basically that tasks that are really easy for humans are often really hard for AI systems and vice versa. And so we often look at an AI that seems to screw something up that would be obvious to us. And we kind of go, oh, cute little stupid AI, nothing to worry about here. And so just like in the example that I'm going to focus on here, I think this is a great

Great example of that. People showed how you could feed GPT-4 an image that contains something humorous, and it could explain the humor behind the image. Something that comes quite naturally to humans, but

For AI, this was a big issue that was preventing other smaller systems from actually working or that would break them. And so one among a large number, a constellation of different new capabilities that this paper is flagging, and I think rightly so, I totally agree. A lot of these things are qualitatively new capabilities, and they do seem to signal a combination of common sense capabilities

I don't want to say self-awareness, it's so loaded, but certainly an awareness of the world. And I would contend a robust world model on top of that. Exactly. That's often been a discussion. It's been argued, you know, whether you do need embodied learning in general.

Here, it's still the case that there is no real-world agent. It's still just being fed a bunch of data. But when you look at some of these experiments, it's pretty surprising. So you can tell it to draw various things. It can draw without access to any sort of image output. It just produces JavaScript code or LaTeX code.

And even with that, it is somehow able to, for instance, draw a unicorn in LaTeX, which is something that is not easy. And it can write JavaScript code to make an image in the style of Kandinsky, which is a very abstract and very geometric drawing.

style. And again, just by producing code, the image looks pretty reasonable and much better than ChatGPT's, as you might expect from GPT-4 having both images and text. So that's another interesting thing where it does seem to have this cross-modal transfer where the text part can understand instructions about images.

And even if you just feed it text and say, draw this using JavaScript, it can output it, and presumably is better able to do it because it is trained on image inputs. So very interesting. And I do think that's also related to a deep question that we talked about on an earlier episode, where OpenAI's head of alignment, Jan Leike, was posting to Twitter saying, hey, we have this mystery that we can take a model, we train it on English text and

and it learns all kinds of stuff about the world that it can express. And then we train it on just like a little bit of another language, and it's able to do all that shit basically, roughly, in that new language. And so he was pointing out like this is actually kind of hard to explain by just the old argument that, oh, this is just repeating, regurgitating statistics, statistical patterns that it's observed in text, because it hasn't had enough statistical patterns to observe in the new language.

Well, if you think of essentially visual stuff as language and text as another kind of language, essentially what we're seeing is a variant on that translation phenomenon that just applies to visual. So somehow, latently, somewhere in this model, there's probably a representation of the real world that can be manifested or translated into many different forms, including visual forms. So I think a really interesting phenomenon for sure.

Yeah. And just to cover one other example here that I found interesting as someone who's done a lot of robotics: in figure 5.8 in the paper, they show how you can have GPT-4 navigate a map interactively. So you just, you know, tell

the model, the layout of a given scene and have it output certain directions that you want to go. And it can map out the environment. So it has this ability to track kind of spatial dimensions, to imagine

this embodied question of what is the layout of the space. And yet that is yet another example of emergent capabilities where it's just trained on word prediction or image understanding, but it is able somehow to do this interactive task of continually exploring a given space and remembering what it looks like and how they connect.
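
As a rough sketch of what this kind of interactive map experiment looks like in code (the map and the `ask_model` helper below are hypothetical placeholders, not the setup from the paper):

```python
# Sketch of interactively exploring a text-described map with a language model.
# The map and ask_model() are invented placeholders, not the paper's actual setup.
MAP = {
    "hallway": {"north": "kitchen", "east": "office"},
    "kitchen": {"south": "hallway"},
    "office": {"west": "hallway"},
}

def ask_model(transcript: str) -> str:
    """Placeholder for a call to the model that returns a direction like 'north'."""
    raise NotImplementedError

def explore(start: str = "hallway", steps: int = 5) -> str:
    room = start
    transcript = f"You are in the {start}."
    for _ in range(steps):
        move = ask_model(transcript + " Which direction do you go?")
        room = MAP.get(room, {}).get(move, room)  # invalid moves leave you in place
        transcript += f" You go {move} and are now in the {room}."
    # The model only ever sees this growing text, yet has to track the layout itself.
    return transcript
```

The point is that the model only sees the running text of moves and observations; reconstructing how the rooms connect is something it has to do from that text alone.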

So, yeah, a lot of examples in this paper that are pretty intriguing, way more than the original GPT-3 paper. And it does make this slightly philosophical case of, maybe this is already general intelligence, and what does it mean as far as societal influence goes?

How do we change our thinking now that we may be getting to general AI? Yeah, a long time coming. The number of things that are going to have to change is staggering, but

I think there's also this question about definitions of AGI that I personally find tiresome in a way. People argue, oh, that's not real AGI because or that's not whatever. It often happens in the context of existential risk safety too.

And, you know, there I think I would just flag: I think it's important to keep in mind sometimes what matters is not what you call a thing, but just what that thing is capable of, specific acts that it can pull off or specific intents that it may develop. And so those concrete things can sometimes be good guiding posts when you want to just sidestep an annoying debate philosophically over whether something is technically an AGI, even though I still think that debate can be interesting. Yeah.

Yeah, I think it's not too much of a big deal to worry about the terminology. I think we all can see that it can do a staggering amount of things. And another thing I do like about this paper is it does have some discussion on the limitations that are still there in terms of what are its weaknesses, what is the kind of basic fundamental

things that are not quite possible with just these autoregressive text models. And it's important, I think, to keep a bit of a grasp on reality as we are all being blown away that there are some basic limitations that even with scale are not that likely to be solved without some tweaks to your approach.

for various reasons that are maybe more technical than you want to get into.

Yeah, that's true. I will say it is also a tune that we've heard sung before, in the sense that if you go back to the GPT-3 era literature, when we were being told what was going to be possible and what was going to prove to be an absolute limit on our ability to do stuff, we talked about this idea of negative transfer before, where, oh, that's going to be the thing that blocks us and this is a fundamental issue. And it turned out, hey, scaling just solved it.

I don't know, I may be very wrong, but looking at some of the arguments in that paper, I sort of thought, yeah, I'm getting shades of this. It feels like, you know, not that scaling is all you need, but it may be much of what you need. And anyway, interesting to see the arguments back and forth. I think a really healthy thing to include, as you say, in a paper like this, because we need to be talking about what theoretically is between us and full on, you know, human level intelligence or beyond. Definitely.

And then jumping on to the next paper, not GPT-4, but still yet another one of these models. This is Alpaca, a strong instruction-following model. And this was met with, I think, a fair amount of excitement in AI circles, especially academic circles, where the basic story here is researchers at Stanford took

an existing model by Meta, and then they fine-tuned it to follow instructions to make it similar in performance to GPT-3 or one of the more advanced versions of GPT-3, and it only cost $600. So from a perspective of making it more possible for academia or other groups to use,

even be anywhere close to these giant models by OpenAI, this is an exciting result.

Yeah, I think it really is. It's so interesting from the standpoint also of its implications for the business side of things. What these guys basically did is, like you say, they started off with this open-source Meta AI model that was really big, called LLaMA. And then they went to GPT-3.5 and they fed it a bunch of prompts and got some outputs. And they used essentially the output of OpenAI's system

to fine-tune the open source model that they had. That process, like you said, was super cheap, but it led to a model that could replicate the performance, the capabilities of the original, more expensive OpenAI model. You might ask, where does this leave things? Essentially, we now live in a world where OpenAI can spend $200 million, build the next big shiny system,

And then a bunch of yahoos from Stanford, I'm sorry, Andre, a bunch of yahoos from Stanford can come in and they can say, hey, we've got a budget of 600 bucks and this like open-source, free-to-access language model lying around. We'll put the two together and we'll get a system that

basically reaches close to parity on a lot of tasks with that $200 million model. And so you can imagine what that now does to the incentives of people building these models at the cutting edge. All of a sudden, maybe it doesn't make sense to spend that much money to maintain a marginal advantage over everybody else if your shit can get copied within weeks and then open sourced.

And so a whole bunch of open questions about this, but the bottom line is just by having input output access to this model, just by being able to see what goes into the OpenAI model and then what comes out, the prompts and then the outputs, you can train your model to mimic its behavior.

and you can do it for really cheap. For proliferation, I think this is a really big game changer, one that people haven't quite absorbed fully. Eliezer Yudkowsky, the famous AI risk writer, who likes to write unnecessarily long posts about safety that probably could be 10% as long, but a brilliant mind. He flagged this on Twitter, highlighted it, and has a great analysis of it. I really recommend checking that out.
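
For a sense of the recipe being described, here is a minimal sketch of the imitation approach. `query_teacher` and `finetune` are hypothetical placeholders standing in for an API call to the stronger model and for an ordinary supervised fine-tuning loop; this is the general shape of the idea rather than Stanford's exact pipeline:

```python
# Sketch of Alpaca-style imitation: harvest (instruction, response) pairs from a
# stronger "teacher" model, then fine-tune an open base model on those pairs.
# query_teacher() and finetune() are hypothetical placeholders, not real APIs.

def query_teacher(instruction: str) -> str:
    """Placeholder for a call to the stronger model's API."""
    raise NotImplementedError

def finetune(base_model: str, dataset: list) -> str:
    """Placeholder for an ordinary supervised fine-tuning loop over the pairs."""
    raise NotImplementedError

def distill(seed_instructions: list) -> str:
    # 1) Build the imitation dataset purely from the teacher's input/output behavior.
    dataset = [(inst, query_teacher(inst)) for inst in seed_instructions]
    # 2) Fine-tune the open base model so it mimics the teacher's responses.
    return finetune(base_model="llama-7b", dataset=dataset)

# In practice the seed set is tens of thousands of generated instructions, e.g.:
# distill(["Explain photosynthesis to a ten-year-old.",
#          "Write a polite email declining a meeting."])
```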

Yeah. It is worth noting that, in my understanding, this really did the fine-tuning part, the reinforcement learning from human feedback or just learning from human feedback. It did start off with a trained model from Meta, LLaMA, that we may have discussed before. So to me, the takeaway is kind of interesting, where if you have one big model that is aligned

and you have a similar model that is not aligned, you can replicate the alignment even without access to its weights or training data. That is deep. I think that, from an alignment perspective, a safety perspective, is cool.

Yeah, and hopefully good news. You're right. I think another dimension of it, too, is like there is a sense in which this shows us that, hey, you know, that Meta AI model that they started with, which nobody ever would have put in the same league as like ChatGPT when it was first released. Well, it turns out it actually could absolutely mimic the performance of ChatGPT just by being fine-tuned on it in this way.

And so this seems to tell us something as well about the latent capabilities of our models, that actually that meta model had a lot of capability that it just wasn't revealing because it just hadn't been fine-tuned quite right. So I think there's like a lot to chew on here, but I got to say, pat on the back to your colleagues at Stanford there.

Yeah, I know. This is exciting. Unlike OpenAI and Google, they do promise to release, they already have released, I believe, the data and the training code, and they hopefully will release the model weights if Meta

kind of signs off on it. So from an academia perspective, we've already touched on so many questions that need to be explored. And if you have a GPT-3 level system, academia now could conceivably actually study these things. And that is good. Yeah.

Yeah. And moving on to the lightning round, first up we have our latest health AI research updates from Google. And this also was met with some excitement. Google has PaLM, which is basically like GPT. And they are reporting here on Med-PaLM, which is kind of a specialized version of it for medical data.

This is an announcement of Med-PaLM 2, which seemingly does really well, you know, at least on medical exam questions.

Yeah, another case of that, I was going to say age-old question. This question is like 20 minutes old, but in AI world, that's ages. This age-old question of what wins? Is it a general system that learns from all kinds of data and then can solve specific problems by leveraging what it's learned from all kinds of data? Or a purpose-built system that's only been trained on medical data?

And that's what we're seeing here. I think a lot of Google's efforts do still involve some of this kind of fine-tuning, heavy fine-tuning, or in some cases, purely training on one particular problem class. And they're seeing results. I mean, certainly, this is Med-PaLM 2, so there is, at least I understand there to be, quite a bit of pre-training on general data too. It's a bit of a balance.

But certainly fine-tuned on medical data and yeah, 60% passing score on US medical licensing style questions, expert doctor level on medical exam questions, 85% score. So these are pretty impressive numbers and up there with what we saw from GPT-4 on the LSAT and so on. And so interesting to see a lot of these human tests now starting to be vanquished by our AI systems.

Hopefully that means the cure for cancer is around the corner. I don't know about you, Andre, but I'm kind of optimistic, at least from a medical perspective, that we're going to be making some big progress in the next few weeks and months.

Hopefully, yeah. I think definitely it will help. Unfortunately, with these physical sciences, you do need to do experiments with stuff, but definitely it is exciting. We've seen a lot of progress on AI for Medicine, and in fact, they announced some partnerships with

organizations like Jacaranda Health, which is a nonprofit focused on improving health outcomes for mothers, a partnership on breast cancer detection, and various other things. So they're trying to go ahead and deploy this sooner rather than later.

Yeah. And like on the optimistic side of AI, right? Like that's really what you want to see is doctors are expensive and people in, you know, low income areas, countries, communities, you know, having more access to automated medicine in this form, I think is, is really exciting, you know, levels the playing field in some really important ways. Yeah. Yeah.

Speaking of more of a science topic, our next story is machine learning takes a starring role in exploring the universe. This is a pretty fun article about how the James Webb Space Telescope produces just an astonishing amount of data. It produces 230 gigabytes of science data every day.

And one of the challenges is trying to process all the data. And if you're trying to do this with simulations often or models, that is expensive to be able to just do the number crunching. And so this goes over how you can use machine learning techniques to

really speed up and require less computational resources to process all of this data very quickly. It's nice to hear. Yeah.

It is. And it's also interesting to see the lens that AI is giving us on the universe. We talked about this a few weeks ago, this idea that to a great extent, when we get AI-generated images or AI-generated composite images of whatever galaxy, we are in some sense not looking at stuff

that our brains would have been able to put together on their own. And so, not to get too philosophical here, but what is the real nature of that image? Is it something we ought to consider to be as legitimate, let's say, as any other? Anyway, AI is definitely allowing us to pull out stuff from huge data sets that our eyes and tiny brains couldn't otherwise handle.

Cool to see. And speaking of that, actually, the next story is a new method to boost the speed of online databases. This is talking about the paper Can Learned Models Replace Hash Functions? If you're a programmer, you might know that hash functions are just a way to get a kind of code for any given

piece of data that lets you key into a database and retrieve that data quickly. There are various details where you can have better or worse hash functions, and this paper found that in some cases, depending on your data distribution, you might get better hashing performance with a learned model.
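To make that idea a bit more concrete for listeners who code, here is a minimal sketch, not the paper's actual implementation, of what a learned hash function can look like: approximate the cumulative distribution of your keys and use it to map each key to a bucket. The class name, the quantile-based model, and the bucket count below are all illustrative assumptions.

```python
# Minimal sketch of a learned hash function (illustrative, not the paper's code):
# approximate the CDF of the key distribution, then map CDF(key) to a bucket index.
import numpy as np

class LearnedHash:
    def __init__(self, sample_keys, num_buckets=256):
        # Fit a crude piecewise approximation of the empirical CDF
        # from a sample of keys drawn from the expected distribution.
        self.num_buckets = num_buckets
        self.quantiles = np.quantile(np.asarray(sample_keys, dtype=float),
                                     np.linspace(0.0, 1.0, 101))

    def __call__(self, key):
        # If the model fits the data, CDF(key) is roughly uniform on [0, 1],
        # so scaling by num_buckets spreads keys evenly and reduces collisions.
        cdf = np.searchsorted(self.quantiles, float(key)) / (len(self.quantiles) - 1)
        return min(int(cdf * self.num_buckets), self.num_buckets - 1)

# Usage: skewed keys that a generic hash function knows nothing about.
rng = np.random.default_rng(0)
keys = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)
hasher = LearnedHash(sample_keys=keys[:1_000])
bucket_ids = [hasher(k) for k in keys]
```

The design point here, as the paper argues, is that when you know something about the key distribution, a learned model can beat a generic hash on collisions, and in some reported cases on compute as well.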

Yeah, it's also interesting to look at the efficiency gains that they're reporting. In many cases, they see that this is actually more computationally efficient than perfect hash functions. This is something you see in an increasing number of areas, where it's like, maybe there's a simulation you want to run, for example.

And there's the old, traditional way of doing that, using the laws of physics and then propagating through time. But you could also compress all of that using a neural network and kind of simulate the same thing, whether it's a hash function or, you know, an environment of some kind. And yeah, I mean, AI will just do it. It's another one on that long and lengthening list of things it can now help with.

Yeah, and I think in the past we've discussed that it might seem like everything is GPT now, but in a lot of domains, in many different ways, we're seeing AI come in, and have been seeing it for a while. Yeah. Now, on to the last quick story. We have...

a paper titled Dynamics in Deep Classifiers Trained with the Square Loss: Normalization, Low Rank, Neural Collapse, and Generalization Bounds. Long title. The short story is they are able to do some more theoretical analysis and basically understand why certain things happen when we train classifiers. And I think this speaks to a common

idea from the past decade of deep learning, which is that we have no idea why these neural nets do what they do. And I think that's sort of true, but it is worth noting that there has been a lot of research on the more theoretical side, and papers like this are giving us new observations that are quite interesting.

Yeah, absolutely. I had a conversation with a researcher at U of T a while ago, and we were talking about the definition of an explanation.

And this mathematical way of looking at, "Okay, this is one mathematical way of thinking about what classifiers are doing." That's one level of explanation. And then there are others like, "Can you explain it, say, in plain English to a five-year-old or to somebody who works in AI policy or whatever?" It's just really interesting to see we're starting to make some progress in all of these different directions.

And I think the mathematical one is important because it's upstream of a lot of those other more practical interpretations that then end up actually getting used. And so really good to see more foundational work being done on this stuff. Probably also useful for capabilities, right? Because when we start to figure out, oh, that's what this thing is actually doing, we can find better ways to kind of help it along in that process. So sort of an interesting result.

Yeah. And bringing it back to language models and GPT, Anthropic has been doing some very cool research on understanding way more fundamental behaviors and training dynamics.

And in our next section, Policy and Societal Impacts, the first story is about Anthropic. They published a post titled Core Views on AI Safety: When, Why, What, and How. And yeah, it's a pretty good overview with very concrete kinds of perspectives. What did you take away from this, Jeremy? Yeah.

I think it's actually super important. This is Anthropic, which is, I think, arguably the most safety-minded of the AI organizations that are pumping out large language models at scale.

They're coming out here and basically saying the quiet part out loud. The thing that, frankly, has been taken for granted in the AI safety community for years now, that, hey, we are actually quite plausibly approaching human-level AI. It seems like all of the curves just keep steepening. And they talk about their assessment of the trend. We're seeing tenfold increases every year in compute usage in the largest models.

And they have this little passage that I think is just so great at explaining the relentlessness of the success of the scaling project. So they write, back in 2019, it seemed plausible that multimodality, logical reasoning, and so on might be walls. And by the way, multimodality means one AI that can look at video, images, audio, text, and all that stuff.

So it seemed plausible that multimodality, logical reasoning, speed of learning, transfer learning across tasks, and long-term memory might be walls that would slow down or halt the progress of AI. In the years since, several of these walls, such as multimodality and logical reasoning, have fallen. And they kind of

essentially flag that, hey, we think this is going to just keep happening. And they also flag emerging risks like power seeking and reward hacking. So they're really taking seriously this case for existential, catastrophic risk from AI. Again, this is coming from a Google-backed leading AI lab. And I think one of the key take-homes here is their strategy. They kind of lay out a threefold strategy. One, the optimistic scenario: it's possible that alignment,

in other words, the kind of existential risk side of safety, turns out to be a problem that more or less solves itself for some reason. It just doesn't end up being a risk. And that's great. And in that case, they're kind of like, eh, we're not going to have a big impact.

Their second tier is the intermediate scenario, where it's like, "Oh, it turns out to be really hard to do," in which case they expect their biggest contributions to be on the technical safety side. Then in the pessimistic scenario, I thought this was really interesting, they say that they see their role as proving decisively that AI cannot be made safe, if that's the case, and essentially pulling the alarm, alerting the community and policymakers and so on on an urgent basis.

I think this actually makes a ton of sense. It's very well laid out, sort of tiered by possibility space and reflects what I know of at least the founders of Anthropic when I've spoken to them. It very much seems like the kind of practical, empirical approach that they like to take. And I think as good a strategy as anyone has at this point.

Definitely, yeah. And I tend to agree. I think this is a really nice post. It's very detailed and it's very concrete.

And this is maybe a bit of an issue I've had with AI safety: it has been a little more on the side of discussing potential issues, outcomes, and processes, but hasn't been as active on the research end or on the sort of concrete "here's what we will do now, and here's what we're doing now."

And I think this lays out a very comprehensive sort of program and approach that Anthropic has already been pursuing. And by making it this concrete, I think it will play a part in the discussion of how we proceed safely, now that there is

a very explicit, concrete perspective on this. And it will definitely, I think, influence other groups that are trying to do a similar sort of program.

Absolutely. And yeah, to your point, I think that, yeah, the concreteness thing is so important. Frankly, I think there's been a cultural problem in AI safety, specifically AI catastrophic risk safety for a long time. And it's understandable, right? The kinds of people who started working in this space

in, you know, 2008, 2012 were neurotic worrywarts, and they were onto something, or at least it seems like there's a good chance that they were onto something. They may have been wrong, but there's a good chance. And unfortunately, when you're the kind of person who is that worried about something so far out, so hypothetical,

that takes a psychological toll on you, on your team. Those kinds of people don't tend to found companies that contain optimistic employees. And optimism is what you need as a startup, as an initiative, as a nonprofit, doesn't matter. If you want to make something happen, you need to believe that you can actually do something about it, even if that belief is misguided.

That's ultimately, I think, the problem that's animated AI safety for a long time. A lot of hypothetical discussion by people who feel somewhat psychologically paralyzed for very understandable reasons. And I say this as somebody who shares basically all their concerns.

But this concrete kind of laying out of like, look, here are the possibilities. Here's what we're actually going to do about them. I just, I really like that. I think it's a great frame shift psychologically, culturally, and indeed technically for this whole space. Yeah, I think we're both big fans of Anthropic. Hopefully they'll continue their course as they have been for a while now.

And related to this quite a bit, our next story in policy and societal impacts is that Microsoft just laid off one of its responsible AI teams. So it laid off the ethics and society team within its AI organization. This is one of several groups working on kind of a broad issue of AI safety.

There have been interviews in this article, which is pretty detailed, that do point out that this is potentially worrying, because this particular group was meant to show how to concretely implement and follow rules. There are these principles from the Office of Responsible AI at Microsoft,

and this group was helping people follow those rules. And now that group is no more. And this is kind of adding on to what we saw with Google really having a lot of people leave its responsible AI group or AI ethics group.

Yeah. And as ever, you know, many different takes on this in every which direction. One of which is, well, you know, like, what does this signal about your commitment to, you know, broader safety related things, ethics related things, things that are not just scale it up to blazes and see what happens.

And then the other is like, you know, okay, what specifically was this team doing? And how far can those inferences go? Can we really infer that there's something deeper behind the scenes here? What are the true motivations of the executive in axing this group? I mean, it's

It's so difficult because we don't have that visibility. For stuff like this, it's a lot like the alignment stuff. You really wish you could just see the memos and the emails to be like, okay. It looks bad on the surface, and it may well be that there's a more practical explanation under the hood.

And honestly, I think at this point, the onus is going to have to be more or less on Microsoft to at least make it clear what their reasoning was here and why that trend line doesn't need to be extended to other categories of risk as well.

Yeah. So we shouldn't jump to conclusions. There have been a lot of layoffs and restructuring. Then again, there is precedent. A while back, we discussed quite a bit how Google pushed out the leaders of its ethics research team,

Timnit Gebru and Margaret Mitchell. And that was largely because they really pushed on discussing the potential problems with these large language models, with the "stochastic parrots" paper, as they called it. And here there was a bit of discussion on how there was just a ton of pressure from the very top

on taking these models from OpenAI and putting them out there very quickly. And I think, in a way similar to AI safety, AI ethics, which is somehow a very different community, both of these communities really care about maybe slowing down, maybe taking more time, and being a bit more considerate of the potential impacts of just putting GPT-4 out into the world.

Yeah, and that's a really good point, this sort of polarity but also similarity between the positions of safety and ethics teams. There's also this question, too, and I heard this referenced in the context of Google, that to some degree, when you have ethics teams or safety teams or whatever that slow down model deployment too much,

then there becomes frustration at the technical end, on the capabilities teams who just want to see this stuff out in the world. And you run a greater risk of having kind of like a runaround where people just go, you know what? Yeah, screw it. We're just not going to engage with any of this process.

And so it kind of gives you a little bit of a sense of that negotiation, that very careful negotiation that's happening inside these organizations. Multifaceted, lots of different interests, lots of different worries and concerns. And do we release? Do we go through this process, that process? Again, with that opacity, it's so hard to sit here and judge Microsoft or Google, but certainly these issues are multifaceted and hard to pin down. Yeah.

Yeah, I think it's probably safe to say that AI ethics groups within Microsoft pushed back on rolling out things quite so quickly, and there was friction and so on. We don't know too much about it, but this is just something to be aware of as one of the dynamics at play.

Moving on to our lightning round, a few stories starting out with, again, alignment. We are now going to talk about some high-level thoughts on the DeepMind alignment team's strategy. So DeepMind was also founded with quite a bit of concern, or at least consideration, of artificial general intelligence. And I think many people within it do care about safety.

And this is kind of a presentation that talks about the high-level perspective of DeepMind and what their program, I suppose, is.

Yeah, and I think really interesting in the sense that it represents the third in a series of three fairly significant announcements that I don't think were fully appreciated at the time in the last three or four weeks from leading AI labs, all of which in different ways using different language have called out catastrophic risk from AI, existential risk from AI as being something that they're actually starting to get quite concerned about.

So this is specifically the DeepMind alignment team. I think it's really important to highlight that because DeepMind, more than maybe any other cutting edge lab, consists of distinct teams and people sort of float around. And so what the alignment team thinks may not necessarily reflect what the whole org thinks. And they're very careful to highlight that.

But yeah, they talk about their threat model. What do they think the risk really is? Where does the risk come from when we talk about catastrophic or existential risk from AI? First off, they open with the view that, hey, we think not many more fundamental innovations are needed to reach artificial general intelligence.

So that in itself, I think, is a really big statement. Essentially, that deep learning plus reinforcement learning from human feedback, which we saw first with ChatGPT, at least in the public eye, gets you most of the way there.

When they talk about their main threat model, the main thing that the risk comes from, for them at least, is the idea that, as they put it, a misaligned consequentialist arises and seeks power. Now, consequentialist, you don't need to worry about that word. It's a bit technical in the AI safety literature. But the bottom line is they see misalignment, and power-seeking specifically, as a key manifestation of that misalignment.

This idea that an AI system would realize that, hey, no matter what objective it has, it's never going to be more likely to accomplish that objective if it gets shut off. So it has an incentive to prevent itself from being turned off. And likewise, no matter what its goal is, it's never more likely to achieve it if it has fewer resources or if it's less capable.

So it has an incentive to aggregate resources and to get smarter. And so all these things kind of put it implicitly in conflict with the human control schemes that are built around it. And that seems to be a key vector. The last thing I'll just mention on this, because I think it's so important, is they do flag this idea of inner alignment as specifically being a source of risk here. And so inner alignment is actually something that's fairly newly discovered. I think it's about three years old as a concept or maybe four.

It's this recognition that, hey, maybe the objectives that you train your AI system to accomplish don't turn out to be the things that it actually wants once it's built. The example that people give is evolution in humans. Evolution essentially bred us to be procreation machines.

So I should, if evolution actually did its job on me, I should want to line up at every sperm donor bank I possibly can to just get as much progeny out there as possible. But I'm clearly not doing that. I care about things that look nothing like what natural selection would have wanted. I care about, you know, like being happy, being healthy, thinking, having interesting thoughts, having interesting friends and all that. And so you end up with this agent that

that does not actually reflect the preferences that you would have expected based on the training regime, the training scheme, which was evolution.

And so the idea here, and there are more complex and interesting, mathematically grounded arguments that show this is actually the case or likely to be the case for AI systems, is that you may think you're training an AI to do next-word prediction or whatever, but beyond a certain threshold of capability, it actually develops distinct goals. And those goals are potentially intrinsically unpredictable. So anyway, kind of an interesting technical dive into how DeepMind is thinking about this, or at least their line of thinking.

Yeah, I think it is quite interesting. It is a pretty short presentation, but I think, similar to Anthropic, it is very concrete about the approach they're taking. It cites the papers that DeepMind has been publishing, and it's interestingly a bit of a mix of what OpenAI is doing with human feedback and also the mechanistic interpretability that Anthropic is pursuing. So

On that front, it's kind of nice to see that across different organizations that are very, let's say, competent or scaled up or established, people are doing a lot of work and are thinking a lot about this. Yeah.

On the topic of concerns, we have a story, Meta's powerful AI language model has leaked online. What happens now? And so this is about how Meta released one of its...

language models, LLaMA, and this is similar in a way to GPT-3 and ChatGPT and so on. And they wanted to release it just to researchers, but then it got leaked as a torrent, and now anyone can use this large language model. And

Yeah, I think this has caused some debate about whether it should just always be open, or whether this is actually bad because now we'll have lots of spam and phishing powered by this. Yeah, and the model is competitive, as they put it, with Chinchilla, which was DeepMind's at-the-time groundbreaking system that everyone was like, "Oh my God, it's so powerful. What's going to happen if they release it?" And then PaLM 540B.

really, really scaled, powerful models. And it's just floating out there. You know, this kind of reminds me of that South Park episode where the kid shows up at the bank and he's like, here's a hundred bucks. Can you invest it, take care of it? The guy's like, yeah, yeah, we're just going to go ahead and put that in money market mutual funds and it's gone.

You know, anyway, same idea here. You've got Meta going, hey, we're going to build this big model, we're only going to make it available to select researchers on request, we're going to do this very carefully... and it's gone. So basically, we're in the era where proliferation of these models is just a fact of the matter. And the strategy of, oh, I'm only going to share it with a few people, I just don't know how tenable that is. And organizations are going to have to adjust accordingly.

Yeah, I think this is kind of interesting from the perspective of, as this gets more and more integrated into software, we sort of move away from being a bit informal with how all this stuff works toward something a little more structured. This is starting to enter the territory of software piracy, right? We've had torrents for a while, and we have some techniques for dealing with that.

We've already discussed the potential to watermark the outputs of models for detection. And as usual, it's going to be a race between the scammers and hackers and people who are trying to combat that and prevent any negative outcomes.

Next, on the policy side, we have AI-generated image from text is not human authorship, according to the U.S. Copyright Office, pretty much what the title says. The kind of idea is that

Text-to-image AI is sort of like instructions to a commissioned artist. So it's not your own piece of art. There's a separate artist that created it for you. And only if you do a fair amount of work to modify the output can you actually have copyright over that.

Yeah, I mean, it's so interesting seeing the law desperately try to catch up here. Also interesting how long this perspective can be maintained. At what point do we start to go, okay, maybe not for today's AI, but at some point, are we going to hit a threshold where...

AI clearly has agency or something, and we have to consider it to be something with legal standing. Yeah, I guess I'm sort of curious and somewhat skeptical about the idea of just saying, oh, AI generally...

can't be considered an agent for these purposes. I think ultimately we're going to have to get a lot more nuanced in terms of what we consider to be smart enough for X purposes. Not that I have any solutions for this, but I just think it's an interesting philosophical challenge. Yeah. I would tend to agree that philosophically, legally, this is still a pretty open question.

And to round things out in terms of concerns, we have an article, I am an ER doctor. Here's what I found when I asked ChatGPT to diagnose my patients.

And this author noted that ChatGPT apparently passed the US medical licensing exam, as we've maybe touched on. And this person tried to apply it in a real-world medical situation by feeding it anonymized histories of

real patients that this person saw, and found that in about half of the patients, ChatGPT did not offer any valid diagnoses. Mostly it was sort of not worrying, but in at least one case, it offered an answer that could have been actively harmful had it been followed.

Yeah, it's really interesting because this brings us back to that question we see with things like self-driving cars, where it's like, what threshold of performance do we need before we say, okay, this is worth using? Certainly, there's no question that the optimal threshold isn't zero mistakes of that nature.

But then how many? How many does a human doctor make? What about issues with respect to access to human doctors? How scarce does access have to be before we start considering these sorts of things? This is all in the same vein as the last article. Philosophy on a deadline is what we're being asked to do here, and in many cases with people's lives. And the medical profession is very much on the front line for this sort of thing. Yeah.

And then for our final section, art and fun stuff.

First up, we have Midjourney version 5 is out now. So Midjourney, if you don't know, is one of the very popular text-to-image models. And now we have a new version of it, and it's actually quite a bit of an upgrade. Now the AI can finally draw hands that don't have a monstrous number of fingers, and

it's generally much better at realistic images compared to previous Midjourney versions. So it seems like quite a bit has been tweaked. And in fact, you may need to change how you use it; you may need to modify the text you input to get the best results.

Yeah. And it's so interesting to see the evolution of this image-generating AI just because of the kinds of mistakes it would make. Who knew that counting fingers would turn out to be one of the cripplingly hard things? Or another one that I saw a lot of was just teeth. Have you seen some of those photos of teeth?

I think it's like a bunch of women with a bowl of salad in front of them. And they're smiling, but it looks terrifying because they have two rows of teeth. Something is uncanny valley-ish. It's just so interesting to see the kinds of things that turn out to be hard for these systems that just wouldn't be an issue for human beings. Yeah. And I think this also kind of goes back to...

This is a company, a for-profit company, that has been raising a lot of money, and we don't really know what they're doing internally and why this is different from something like DALL-E, which is a bit of a bummer. But I'm a big fan of Midjourney. I like it quite a bit, and I'm excited to try this out.

Yeah, yeah. Very interesting kind of story of open source success as well, because that's been a big part of their, I was going to say their journey. Now I need another word, but yeah.

Next up, quite related, our story is illustration competitions grapple with generative AI. This is a fun little piece from Axios, where they contacted nine major illustration competitions and only three of them allowed AI submissions. So for instance, the ADC Awards said that it

has a separate discipline with a dedicated jury for AI work, so it's kind of like a separate category. The Society of Publication Designers actually allows submissions created with AI, but also said that they are against art theft and support artists who use AI legally and ethically.

And then another one just had no guidelines or bans on AI submissions, and there's a quote here, "everything is moving so quickly," from the executive director of that competition.

You know what, though? I actually like the solution of having just two tiers. It's an interesting test, if nothing else, of how much humans value art because of its provenance, because of where it came from, who made it, the meaning, the intention behind it, versus how much people just want to look at a pretty, shiny thing. And I think there's value in both. And being able to tease those things apart,

assuming that they can actually make sure that humans are genuinely responsible for the drawing or painting in the human tier, that's going to teach us something about what people like. And, you know, that's a very subjective thing. I'm sure different people will fall into different camps, but I'm just personally curious to see where people gravitate. Relatedly, you know, do you still want to watch a sports game that's played by super-competent robots? And, like...

I don't know. I think the answer is probably not. I think for certain things, you do want to see human beings do them, and maybe art is going to be one of them. Anyway, it opens a whole bunch of questions about where humans see value in each other, in each other's work, and so on. Yeah, I definitely agree. I think just establishing a separate category makes a lot of sense for these visual arts. And

It is an interesting question of how it affects art. And again, there's been quite a bit of discussion over the years. You know, when we had photography, the value of photorealistic paintings went down a lot, and surrealism and impressionism became popular.

It has been getting easier and easier to do photography or filmmaking. So there's quite a bit of a history to this question of the intersection of technology and art. And now this is definitely kind of a tricky new case of it. But I think, yeah, it'll be interesting to see how people value AI art and visual art going forward. Yeah.

And for our last article, really directly touching on that question, the story is "Online storm erupts over AI work in Dutch museum's 'Girl with a pearl earring' display."

So, uh, the Mauritshuis museum in The Hague in the Netherlands. I'm so glad you had to say that. I made my best try. It has been

loaning out the famous painting by Vermeer, Girl with a Pearl Earring. And so they have this display now where they allowed many people to submit their own takes on the famous painting, and there were around 3,500 submissions.

And a few hundred of those were chosen to be displayed in this kind of gallery. And one of those was made with AI. The artist himself kind of very directly said that this was generated with AI and Photoshop. And yeah,

Yeah, people were not happy. Some of them were disappointed that they chose an AI piece, that it should have been by a human. And yeah, I think we've already touched on how there has been a lot of pushback from artists on AI art, and this is just another example of that. Yeah, I mean, I wonder what you think about this, but like the...

just the idea of, what is the purpose of art? Like, if I put together a collection of pixels that I display on your screen, and it's AI-generated, and it creates in you a feeling you've never had, a feeling of great emotion or whatever, does that have no value? And not no value, but does that not count as some form of art?

How do we tease those things apart? The effect that the thing has on us versus where it came from. And before AI, those two were always one thing. You couldn't really separate them. Anyway, deep philosophical waters and it's only March. Yeah. I mean, I think this has been discussed over the years. We've already had a lot of this controversy about who is the artist going back years with GANs.

And the general kind of mainstream opinion from a lot of artists and philosophers is AI is a tool that humans use, the artist is a human. The question does arise, at what point do you no longer have authorship of a certain piece? So if you really put in no work, no creativity, no kind of piece of yourself, no self-expression,

You could argue that's not art and you're not an artist. Here, I would argue that this is a specific exhibit where people submit their own takes on Girl with a Pearl Earring.

And, you know, does it really matter what tools were used to generate it? There were 3,500 submissions; this one clearly had to be somewhat interesting, somewhat different, and conceptually kind of an interesting take on the painting.

Yeah, I mean, I would argue that this can easily be considered an artistic creation, right? Because there was some thought behind it. It wasn't just some random "give me a cyberpunk girl with earrings." This was made with intention, by someone with a particular goal in mind.

Yeah, okay. There might be some arguments for pushback, but in this context, I really think it's maybe not fair. Yeah, I guess they don't explain what the actual prompt was, do they? I don't think so, no. Okay, well, we'll never know then. Yeah. Anyway, yet more conversation around this and it will keep going for a while.

All right. That's it for this episode. A bit longer than usual because GPT-4. But that's it for this week. Thank you so much for listening to this episode. Hopefully for any new listeners, this was as pleasant as the last one.

We will keep trying to do them weekly. Sometimes when we travel, we do take breaks. So yeah, if you liked it, please share it with your friends. Please review us on Apple Podcasts. Please do whatever you can to give us more listens. Except for using my OnlyFans link, which... Yeah, maybe not...

Not anything, but no pressure. We really do enjoy having listeners and hopefully helping people keep up with this insane period of AI. Yeah, so thanks for listening and be sure to tune in next week.