
Insights from OpenAI's AMA: The Next Breakthrough in AI

2024/11/2

AI Chat: ChatGPT & AI News, Artificial Intelligence, OpenAI, Machine Learning

Chapters

OpenAI executives discussed future advancements in AI models, cost reduction, and the introduction of AI agents during their AMA session.
  • OpenAI is reducing API costs to make advanced AI tools more accessible.
  • The company is focusing on improving efficiency and hardware to lower costs further.
  • Regulatory challenges and compute limitations were acknowledged as significant hurdles.

Shownotes Transcript


Sam Altman, the CEO of OpenAI, and a bunch of other top OpenAI executives just held an AMA (ask me anything) over on Reddit, where people asked them questions and they gave the responses.

I want to show you a bunch of the answers, because they gave details on projects like updates on the timeline for Sora, the video model; DALL-E, the image model; and what's coming with GPT. They didn't give exact dates for everything, but they gave you timelines, gave you ideas, and told you when they didn't have timelines. There are tons of new interesting things, price changes. I'm going to break down all of their responses, and the most interesting responses that I saw there, in the podcast today. So let's jump along for the ride.

One thing I want to say before we get started: if you are interested in making money with AI tools, and maybe helping grow or scale your current business, I would love to have you as a member of the AI Hustle School community. Every single week I record an exclusive piece of content with my co-host Jamie, where we break down different AI tools, different ways that we're making money, things we can't share publicly when it comes to products and software and workflows, and exactly what we're doing behind the scenes. It's all in this school community with over two hundred members.

Some of the members come from companies they've started that are worth a hundred million plus, and others are just getting started, so you get a really wide range of perspectives and some great feedback on your projects and what you're doing. We'd love to have you as a member of the AI Hustle School community, and it is currently a thousand dollars a month.

We're going to bump the price up eventually, but if you get in now, you're going to lock in that price and it will never be raised on you. So I'd love to have you as a member of the AI Hustle School community, hear about what you're working on, and share some of the behind-the-scenes stuff.

I don't share that stuff anywhere else. You can check it out at the link in the description. So let's get back to what's going on with this new AMA. I actually got a great recap of this over on X from Kimmonismus, so shout out to them for some of these screenshots and recaps of the interesting stuff.

The first thing I want to break out is that somebody asked them: are you planning to reduce the API cost of advanced voice? This is something that a lot of people are asking about, because it's a little bit expensive, and what you really want to see is the cost of this advanced voice coming down.

The more these things come down in price, the more viable it is for developers to take them and build really exciting new features and tools with them. So yeah, the API cost determines developers' ability to add this into their software. And this is exciting and a big question, right? Because you can imagine what you could do with this advanced voice.

It's like, you're making AI life coaches, you're making fitness coaches, you're making someone you could talk with who can help you fix your car, like an AI mechanic, right? There are so many interesting use cases, but if it's too expensive, it's not viable. And so the question is whether the cost will come down.
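To make the developer angle concrete, here's a minimal sketch of how one of those niche voice or chat bots might be wired up against a chat-style API. The persona prompt, helper name, and model string are illustrative assumptions, not anything from the AMA; the function just assembles a request payload and never calls the network, so you can see how little glue code sits between the API price and a product like an "AI mechanic."

```python
# Hypothetical helper for a persona bot built on a chat-completions-style API.
# Model name and prompts are placeholders; swap in whatever is cheapest for you.

def build_coach_request(persona, user_message, history=None):
    """Assemble a chat-style request payload for a persona bot."""
    messages = [{"role": "system", "content": f"You are a helpful {persona}."}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_message})
    return {
        "model": "gpt-4o-mini",  # assumed cheap model; the AMA says to expect prices to keep falling
        "messages": messages,
    }

request = build_coach_request(
    "AI car mechanic",
    "My brakes squeal when I stop. What should I check?",
)
print(request["messages"][0]["content"])  # → You are a helpful AI car mechanic.
```

The point of the sketch: the application layer is trivial, so per-token price is the main thing standing between these use cases and viability.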

So the CPO of OpenAI, Kevin Weil, said: we've been reducing the cost of the API for two years now. I think GPT-4o mini is like two percent of the cost of the original GPT-3.

Expect this to continue with voice and others. This is a really big bit of alpha and an update: these things are going to continue to get cheaper and cheaper. They're going to make them more efficient, which essentially makes them cheaper.

And the hardware that runs them is going to get more powerful, so the older models get better, and you will probably be paying essentially a very similar cost for the most premium, best models and all the new features. But the older stuff is just rapidly getting cheaper, and a lot of the older stuff, aka what we're using today, is going to be usable in tons of different use cases.
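To put that "about two percent of the cost" line in perspective, here's some back-of-the-envelope math. The dollar figures below are made-up placeholders for the sketch, not official pricing; only the 2% ratio comes from the quote.

```python
# Illustrative cost comparison: what a ~98% per-token price drop does to a
# workload's monthly bill. Prices are assumptions, not real list prices.

GPT3_PRICE_PER_M_TOKENS = 60.00                           # hypothetical $/1M tokens
MINI_PRICE_PER_M_TOKENS = GPT3_PRICE_PER_M_TOKENS * 0.02  # "about 2%" of that

def monthly_cost(tokens_per_day, price_per_m, days=30):
    """Dollars per month for a steady daily token volume."""
    return tokens_per_day * days / 1_000_000 * price_per_m

# A bot handling 5 million tokens a day:
old = monthly_cost(5_000_000, GPT3_PRICE_PER_M_TOKENS)  # 9000.0
new = monthly_cost(5_000_000, MINI_PRICE_PER_M_TOKENS)  # 180.0
print(f"${old:,.0f}/mo -> ${new:,.0f}/mo")
```

Whatever the true absolute numbers are, that ratio is what turns "not viable" use cases into viable ones.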

So if you don't need the most advanced model for a specific use case, the cost is gonna come down a lot. That's exciting. Someone asked: any plan to negotiate with the EU, so EU users will get stuff faster, not dumbed down?

This is a problem we're looking at, like Apple Intelligence not getting rolled out to the EU, and a bunch of features getting blocked there, or at least delayed, just because of all the regulation. Sam Altman said: we will follow EU policy. Obviously, all of us hope for increasingly sensible EU policy.

A strong Europe is important for the world. So I kind of like this approach, right? It's really easy to take a pot shot at the EU and say it's a terrible place.

I was there all summer. I mean, it's a nice place, I like it. It's a bummer that some of their policies slow down a lot of this AI stuff, and perhaps it's overregulated, but that's a whole argument for another time. I like that Sam Altman doesn't try to take a dig at them. He's just saying, you know, a strong Europe is important for the world.

He's saying: come on, guys, we can make this happen, but it's really up to Europe, and they're going to follow their policies. Someone asked Sam Altman, the CEO of OpenAI, for a bold prediction for 2025, and he said: saturate all the benchmarks. Meaning, on all the benchmarks where they compare different AI models to see which one is the best, he wants OpenAI's tools to be at the top of every benchmark, everything. And this isn't always the case. They do pretty well, and they're usually at the top, especially when it comes to a new release; their models are usually at the top.

But that isn't always the case. Sometimes Anthropic comes out and gets ahead, or new AI image models come out and get ahead. So okay, I think he really wants everything that's the best to come from OpenAI. Bold prediction.

We'll see, right? It's a prediction, meaning he hopes that their products in the pipeline, what's coming soon, will do that; it's not exactly the case today. Someone asked: how fast are OpenAI's inference costs reducing, in order to enable chain of thought or multi-layer thought trees?

From a business perspective, we'd like to execute reasoning chains as fast and as cheaply as possible. So really what they're talking about here is that o1-preview is essentially chain of thought. It's like you ask it a question and it runs it through like twenty different questions to make sure the response is the best possible.

But it takes more time to do that. So the OpenAI VP of Engineering said: we expect inference costs to keep going down; if you look at the trend of the last year, it's coming down like 10x. Okay.
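Those two loose figures, a reasoning model burning many times the tokens of a direct answer, and prices falling roughly 10x a year, interact in a simple way. Here's a toy calculation; both constants are assumptions extrapolated from the rough numbers in the quotes above, not measured values.

```python
# If chain of thought costs ~20x the tokens of a direct answer (taking the
# "runs it through like twenty different questions" description literally),
# and prices drop ~10x per year, how long until a reasoning call costs what
# a plain call costs today? Solve 10**years == 20.

import math

REASONING_OVERHEAD = 20  # assumed token multiplier for chain of thought
ANNUAL_PRICE_DROP = 10   # "coming down like 10x" per year

years = math.log(REASONING_OVERHEAD) / math.log(ANNUAL_PRICE_DROP)
print(f"{years:.2f} years")  # → 1.30 years
```

So under those assumptions, the extra compute that chain of thought burns today gets absorbed by the price curve in a bit over a year, which is why the hosts treat the 10x trend as such a big deal.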

So this is exciting. It seems like things are going to get cheaper, faster. That's fantastic and absolutely wild.

A question was asked: what's the next breakthrough in the GPT line of products, and what's the expected timeline? And OpenAI said: we will have better and better models, but I think the thing that will feel like the next giant breakthrough will be agents.

This is fascinating to me. You know, the next thing that they're really putting a focus on, after having achieved better reasoning from the AI models they currently ship, is agents, as the next step that OpenAI is really trying to achieve.

So this is going to be amazing. And I think what's interesting is he's saying we'll have better and better models, meaning they kind of have all the foundation models in the works: they have image, video, audio, text.

And so at this point, those things will get incremental improvements. I don't expect insane jumps, or maybe there will be big jumps, but you know there's going to be a bit of a leveling off; these things are getting pretty smart. They may be about as smart as a human.

And if that's the data they're trained on, we might have a wall there, unless we can figure out some clever ways to get them smarter than humans. But in any case, getting them to be able to autonomously do things, I think, is the next big step, so I'm super excited to see where that goes.

Someone asked for advice for ambitious youngsters who want to contribute to the revolution of AI, and the CPO of OpenAI, Kevin Weil, said: my vote is using it every day. Use it to teach you things and to learn whatever you want to learn: coding, writing, product design, anything. If you can learn faster than others, you can do anything. That's just generally good advice, but there wasn't really anything too exciting there as far as what we're going to get.

Someone asked: why doesn't o1 support image input? And again, Kevin said: we focused on getting o1 out to the world first versus waiting to make it full-featured. Image input is coming in o1, and in general the o1 series of models will be getting things like multimodality, tool use, etc. in the coming months.

So this is great. I mean, this is kind of what we expect, right? And I think everyone would rather get the model first and then have all the features come out later than have to wait an extra three or four months to get all the features. So yeah, this is going to be cool. Someone asked: when will we get more information about GPT-4o image and 3D model generation? To which he replied: soon. And he had a screenshot giving ChatGPT some HTML and saying "render this," and it was able to render what that HTML would look like in a web browser, meaning they'll essentially have code visualization and some really cool stuff. And by the way, that was OpenAI's SVP of Research, Mark Chen.

Someone asked if Sora is being delayed due to the amount of compute time required for inference, or due to safety. Kevin, the CPO of OpenAI, said: need to perfect the model, need to get safety / impersonation / other things right, need to scale compute. Someone replied: so basically we're waiting for a 15x speed boost on inference with the B200s. Which is not what I love to hear, but yeah, if they don't get the compute they need, there are so many different projects going right now.

It's not like they're a company that's just working on Sora. OpenAI has their fingers in so many pies, and when there's such a compute

bottleneck, it really slows down a lot of the other projects, which is a bummer. Because you can imagine if there were four companies the size of OpenAI, and one was working on image, one on audio, one on video, one on text, we would have way faster advancement in all of them.

But because they only have so much compute in one company, they have to prioritize which projects to work on, and it looks right now like Sora, as a model, is essentially getting shafted until they can figure out a couple of the problems and get more compute. Someone asked: when will the full o1 release? Kevin said: soon. Which is obviously not very specific, so people kind of roasted him and said "a date or it didn't happen." But at least we're making progress towards this, which is good. Someone asked: how will o1 influence scaling LLMs? Will you continue scaling LLMs as per scaling laws, or will inference-compute scaling in smaller models, with faster and longer inference, be the main focus? Kevin said: it's not either/or.

It's both: better base models, plus more "strawberry"-style scaled inference-time compute. Okay. Someone asked: what's one thing you wish ChatGPT could do but can't yet? And the OpenAI VP of Engineering said: I'd love for it to understand my personal information better and take action on my behalf. That's kind of interesting, especially around agents. Someone asked: will ChatGPT eventually be able to perform tasks on its own and message you first? Kevin, OpenAI's CPO, said: IMHO this is gonna be a big theme in 2025. And someone said: we heard that about 2024, with a crying emoji.

So at the end of the day, everyone wants to be so optimistic about OpenAI, and they have such incredible tech. But sometimes, when things are advancing so fast, you can see what the next step would be, and you ask: well, why aren't they just taking that next step?

And at the end of the day, the reason is that they have constraints: safety, or compute, or priorities, or money, whatever. And so it just sucks to feel like we want the technology, we know how to build the technology, but it's bottlenecked by something other than the knowledge. That kind of sucks, but hopefully it's going to be a big thing in 2025.

But nothing super specific on that. Someone asked: is the plan to continue to release o-series models from now on, improving on the regular GPT models, both, or a combination of those? Kevin, OpenAI's CPO, said: both, and at some point I expect they'll converge.

This is what I really think is happening here. GPT-4 came out super soon after GPT-3, but both of those had been in the works for a long time before we even got the ChatGPT launch. And so it felt like a massive jump in capabilities when GPT-4 came out.

And you might even remember Elon Musk and a bunch of other people signing the letter saying, like, the government needs to ban, or everyone needs to pause, making any models better than GPT-4 for the time being. And then of course everyone else is trying to compete, so OpenAI doesn't just pull so much further ahead. It feels like they wanted to come out with something that would be GPT-5, which ended up actually just being o1, but it wasn't really as big a difference as from GPT-3 to GPT-4.

So they didn't call it GPT-5. They just called it, you know, o1 or whatever. And eventually, when they have another thing that's a huge step up in how much better it is, they're gonna call it 5.

But they're making updates, they are training new models, and they're putting in the work. It's just hard to wow us like I think they did in the past. And I've heard some of them make comments about things that they plan on being able to do that they would consider GPT-5.

And it's going to be impressive, but it's also ambitious and needs a lot of compute. Someone asked: will we see advanced voice loosen restrictions around musical capabilities, like singing, at some point? Is there a timeline for this? Kevin said: working on it; I want to hear ChatGPT sing too.

That's cool. Again, it's like these capabilities exist; it's just restricted, you know, for copyright reasons.

Now, what's interesting is you have companies like Suno that are already doing this, and what's interesting to me about that is the fact that essentially one of the edges these companies have is a willingness to kind of push past some regulatory lines, maybe copyright, maybe other things. To be fair, OpenAI did it at the beginning when they sucked up the whole internet and trained off of that, and people got mad about that, including The New York Times suing them. So some of these companies are willing to take some lawsuits.

And you can be the first, like Suno, to really come out with a solid model in a category. And once they do that, they're able to take a really big lead, which, to be fair, OpenAI is going to catch up to eventually. But they get a big lead, they get ahead, and that's exactly what a company like ElevenLabs was able to do: they got a big lead on voice.

And OpenAI is doing voice now, but at this point ElevenLabs is really well known for voice. They're doing a really good job, they have a lot of apps and developer tools that are built into a lot of things. And so you can kind of get this edge, you can get a jump start on them, and keep it for a while.

Someone asked: what is the best use case for ChatGPT you've seen in the wild so far, and in what areas do you think it and future versions over the next couple of years could be useful? Sam Altman said: there are a lot of great ones, but the stories of people figuring out the cause of their debilitating diseases and then getting fully cured are really awesome to hear.

Also, the ability of it to be a really good software engineer feels deeply underappreciated, even still. More generally, the ability to help scientists discover new knowledge even faster will be so great. I agree; all of those are fascinating and really useful, so I'm excited about pretty much all of those. Someone asked: do you have any plans to increase the memory ChatGPT can store? Kevin, the CPO, said: do you mean longer context windows? If so, yes. They said: no, I mean the amount of memory and trajectories stored for a single account.

The memory capacity keeps getting full, and I'm forced to select which memories I would like to delete to make space for new memories to be persistent. And someone plus-one'd that. So that's kind of interesting, and I don't think we got a response to it, but it's definitely an issue. Other people I've talked to and consulted have run into the same issue. Someone asked about the release date of ChatGPT-5 or its equivalent, and what its features would be, and Sam Altman said: we have some very good releases coming later this year.
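The questioner's pain point, a fixed-size memory that fills up and forces manual deletion, is a classic eviction problem. Here's a tiny sketch of one standard alternative: a store that automatically drops the least recently used memory when capacity is hit. This is purely illustrative and not how ChatGPT's memory actually works; the class and keys are made up.

```python
# Hypothetical fixed-capacity memory store with least-recently-used eviction,
# instead of making the user hand-pick memories to delete.

from collections import OrderedDict

class MemoryStore:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # insertion/recency-ordered facts

    def remember(self, key, fact):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = fact
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the stalest memory

    def recall(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # recalling keeps a memory fresh
        return self._items.get(key)

mem = MemoryStore(capacity=2)
mem.remember("name", "Jamie")
mem.remember("goal", "learn to code")
mem.remember("city", "Austin")                  # capacity hit: "name" is evicted
print(mem.recall("name"), mem.recall("city"))   # → None Austin
```

The tradeoff the Redditor is complaining about is exactly this: either the user curates memories by hand, or the system silently forgets something, and neither feels great without a bigger capacity.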

Nothing that we're going to call GPT-5 yet, though. Okay, so hold your horses there. Someone said: seriously though, what did Ilya see? And Sam Altman said: the transcendent future. Ilya is an incredible visionary and sees the future more clearly than almost anyone else.

His early ideas, excitement, and vision were critical to so much of what we have done. For example, he was one of the key initial explorers and champions for some of the ideas that eventually became o1. The field is very lucky to have him. So those ideas, what became o1, were essentially chain of thought.

So I see: he came up with the ideas that essentially became chain of thought, which, to be fair, gave OpenAI a big boost and helped them get ahead of some of their competitors. Someone else asked about him leaving, which I don't think was responded to. Someone said: when will you guys give us a new text-to-image model? DALL-E 3 is kind of outdated. Sam Altman said: the next update will be worth the wait, but we don't have a release plan yet. Wow, that sucks.

It always sucks to hear that it's going to be worth the wait because, you know, we want it now. But it is what it is. So much has happened, and this is absolutely fascinating. I'm going to keep you up to date on anything else, any other news coming out of OpenAI, and any leaks. But this AMA, I feel, gave a really deep insight into the timelines on some of their core features and some of their biggest products:

when we're going to be expecting them and when we're going to actually be able to use them. So, absolutely fascinating; I'm excited. And if you're interested in using any of these tools to make money online, again, I would love for you to join the AI Hustle School community; the link's in the description. And I hope that you have an incredible rest of your day.