Ep 404: AI News That Matters - November 18th, 2024

2024/11/18
Everyday AI Podcast – An AI and ChatGPT Podcast

Chapters

Google's Gemini AI model has surpassed OpenAI's GPT-4o on the LM Arena chatbot leaderboard, highlighting Google's advancements in AI. However, access to the top model is restricted to backend developers, creating confusion among users.
  • Google's Gemini AI model is now the top model in the world.
  • Access to the top model is limited to backend developers.
  • OpenAI is expected to release an updated version of GPT-4o that may reclaim the top spot.

Transcript

This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life.

OpenAI is releasing agentic AI. Google now has the world's top AI model, but for how long? And Microsoft is hours away from its biggest AI announcements of the year. That's a lot going on, and that's just the very tip of the iceberg. You can spend hours every single day trying to keep up with what's happening in the world of AI and how it impacts your company and your career.

Or you could tune in to Everyday AI every single Monday as we bring you the AI news that matters. Let's go. My name is Jordan Wilson, and welcome to Everyday AI. This is for you.

This is your daily livestream podcast and free daily newsletter, helping us all not just understand what's happening in the world of AI, but how we can all actually use it to grow our companies and to grow our careers. So if that sounds like you, you are 100 percent in the right place. So if you're brand new here, thank you for tuning in.

If you're tuning in on the podcast, make sure to check out the show notes. We always have a lot of great resources in there for you. One of the best resources is our website.

If you haven't already, please go to youreverydayai.com and sign up for the free daily newsletter. We recap every single day's episode, Monday through Friday, and you can also go in there, click the little Episodes tab, and there are literally more than 400 episodes. It is a free generative AI university.

Whatever you want to learn, marketing, sales, HR, it's all in there. All right. So thank you for tuning in, y'all.

Let's get straight into the AI news that matters for the week of November 18th. And hey, thank you to our livestream audience tuning in as well. Big Bogan, Michael on YouTube, Brian, Rolanda, Fred, everyone else, thanks for tuning in.

Let's get straight into the AI news that matters. Well, AI news that really matters is going to be happening at Microsoft Ignite. So I will be there tomorrow in Chicago.

So if you are going to be at Microsoft Ignite as well, or maybe your company is, make sure you reach out and say what's up. Let's connect and chat AI while we're there. So the keynote kicks off tomorrow, and hey, I'll just tell you this: there are big things that are going to be happening at 7:30 a.m. Central time.

So make sure you tune in. All right, let's get to one of the biggest pieces of AI news this week. Google Gemini is now the top AI model in the world. So Google DeepMind's latest AI model, Gemini-Exp-1114,

so experimental 1114, after November 14th, has now surged to the top of the LM Arena Chatbot Arena leaderboard, surpassing OpenAI's GPT-4o. So this achievement highlights Google's advancements in AI, particularly as the model does really well in math and vision tasks, areas where Gemini models traditionally excel. So the LM Arena Chatbot Arena, formerly that was known as the LMSYS arena. Yeah, a lot of acronyms already, three minutes into the show.

So the LM Chatbot Arena allows AI labs to compare their models in a blind head-to-head format, with users voting on performance without knowing which model they are evaluating. You'll remember the Pepsi versus Coke taste test? That's essentially what the LM Arena is for chatbots. And OpenAI, I'd say for 95 percent of the days that this has been up, OpenAI is usually at the top.
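
As a rough illustration of how blind head-to-head votes could be turned into a ranking, here is a minimal Elo-style sketch in Python. The episode does not say how the arena actually computes its scores, so the Elo formula, the starting rating of 1000, the K-factor of 32, and the model names below are all assumptions made for illustration only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo-model probability that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_vote(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift rating points from the loser to the winner after one blind vote.

    k (the K-factor) is an assumed value; it controls how far one vote moves a rating.
    """
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * surprise
    ratings[loser] -= k * surprise


# Hypothetical models, both starting from the same assumed baseline rating.
ratings = {"model_a": 1000.0, "model_b": 1000.0}

# Each vote is (preferred model, other model); the voter never sees the names.
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]
for winner, loser in votes:
    record_vote(ratings, winner, loser)

# Highest rating first.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Because every vote moves the same number of points from one model to the other, the total stays constant, and a model only climbs by being preferred more often than its current rating predicts.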

But Google, on November 14th, so just a couple of days ago, released a new model, and it is now the top model. Also, right now the top five models in the arena are dominated by OpenAI and Google, with xAI's Grok 2 being the first non-Google or OpenAI model on the leaderboard. So despite its success, Gemini 1114 is not yet available in the public app or on the website, so access right now is limited to those with a Google AI Studio account.

Also, the Chatbot Arena's evaluation is based on human perceptions of performance and output quality rather than strict data-based testing or benchmarks. So how long is this going to last? I'd say not long.

Here's why. We spent a little bit of time over the weekend digging into the Chatbot Arena, and we also saw that OpenAI is actually testing a new model as well. So an "anonymous chatbot" model from OpenAI is being blindly tested right now. So I don't know how long this new Google 1114 is going to be the most powerful model in the world.

But interestingly enough, I also saw Google is testing two follow-up models, so they have their "mystery Gemini 3" and "secret chatbot" named models. I've got to love the model naming for the secret models, like "secret chatbot." Love it, right? It's like a bad spy name. But we'll see. I would assume that OpenAI, in the next, I don't know, week or two, would probably update its GPT-4o model, and it would presumably be the next top model.

So normally (I said the name of a show, right?), normally OpenAI doesn't waste time in being the top model. So it's always slight refreshes, right? It's not like going from, you know, GPT-3.5 to GPT-4, and all of a sudden OpenAI is on top again.

So the trend over the last six months has been these incremental updates, right? So on the OpenAI side, it's actually called GPT-4o latest, even though we don't really say that name a lot.

Also, what's important to note here about Google Gemini 1114: if you go on, you know, Gemini on the front end, you're not getting this model. You're only getting this as a back-end developer.

So this is something that I wish Google would change a little bit, because everyone else, and when I say everyone else, I mean the other two big players here, Anthropic's Claude and OpenAI, when they release a new model, they make that new model publicly available on their chat side, right? So if you go to chatgpt.com or claude.ai, you'll get access to OpenAI's and Anthropic's most powerful models, respectively. Google, not so much.

Google really restricts its most powerful models to just inside of its AI Studio. So, you know, if you're using Google Gemini on the front end as a chatbot, you're generally working with a model that is anywhere from three to eight months old.

So I'm not sure what Google's go-to-market strategy is there. Maybe they really want to make sure that the model is stable before they roll it out as the default model. I'm not a big fan of it, because I think that creates a lot of confusion, right?

Our next piece of AI news: chatbots are outperforming doctors in medical diagnosis studies. So a recent study published in the journal JAMA Network Open reveals that ChatGPT-4 outperformed human physicians in diagnosing medical conditions from cases.

So the study involved 50 doctors, and it showed ChatGPT scoring an average of 90 percent in diagnosing medical cases, while doctors using the chatbot scored 76 percent, and those without it scored 74 percent. So a slight increase there. So this experiment highlights the potential of AI to serve as a valuable tool in medical diagnostics, yet it also underscores the challenges in integrating AI into medical practice.

So Dr. Adam Rodman, who helped design the study, noted that doctors often remained confident in their own diagnoses, even when the chatbot suggested potentially better alternatives. And nothing like second-guessing a chatbot that has access to human history. So the study suggests that many doctors are not fully exploiting AI's capabilities, often using it more like a search engine rather than leveraging its comprehensive diagnostic abilities. So Dr. Jonathan Chen from Stanford, who was a study author, emphasized that only a few doctors realized that they could input an entire case history into the chatbot for more comprehensive answers.

The findings point to a need for better training and understanding of AI tools among medical professionals to fully harness their diagnostic potential. So the study used real patient cases that are part of a set of 105 cases used since the 1990s, ensuring that ChatGPT had no prior training on them, making the results particularly noteworthy. So this is not the first study that pits AI chatbots against doctors.

There was a pretty famous study about a year and a half ago that actually showed ChatGPT was more empathetic than doctors and had better bedside manner in blind tests. So I do think this is going to be one of those studies that commonly gets referred to in the coming months, right, when we talk about ChatGPT's or AI models' capabilities. I don't know, livestream audience, would you want AI as a doctor? I know it's a weird concept, but I do think that is where we're heading.

Just a lot more reliance on large language models in the medical setting. I don't care. I would love to hand a chatbot all of my medical history if it meant that I didn't have to wait like nine months, you know, for a stuffy nose, right? The medical system here in the US

is broken, and I am very excited for AI to disrupt that system. Our next piece of AI news: NVIDIA earnings are coming out on Wednesday, and there are going to be a lot of eyeballs on that report. So NVIDIA is set to report its quarter three earnings on Wednesday, providing a crucial update on the AI market's momentum.

So as the world's largest publicly traded company by market cap, NVIDIA has become a barometer for AI growth due to its significant advances in AI technology and applications. So the company's stock has soared nearly 190 percent year to date, driven by the surge in demand for AI technologies, making it the largest publicly traded company by market cap in the entire world. So technically the most valuable company in the world.

And I told you two years ago, when no one really knew about NVIDIA, I'm like, NVIDIA is going to be the biggest in the world. No one listened to me. Yes, you should start listening.

So investors are eager to see if NVIDIA will exceed expectations and raise its quarter four revenue outlook, projected at 37 billion dollars. So despite past strong performances, NVIDIA's shares actually fell 6 percent after quarter two earnings, possibly due to profit-taking or unmet high investor expectations. Expectations are through the roof, especially when you hear about companies like, as an example, xAI ordering 100,000 GPU chips all at once.

Expectations are pretty high for NVIDIA to deliver, so CEO Jensen Huang may provide updates on the next-generation Blackwell AI chips, which are expected to significantly contribute to future revenues. The demand for Blackwell chips is already outpacing supply, indicating robust future sales potential. So the company's performance is seen as a reflection of the expanding influence of AI across various sectors beyond just technology, impacting industries like healthcare, automotive, and finance. So yeah, I talk about this pretty much all the time.

The reason why we talk about chips a lot, and why we cover GPUs and the political aspect of this, is because GPUs, mainly from NVIDIA, power all AI, right? All of the large language models, right? For the most part, with any AI tool you use, there is a high likelihood it is using NVIDIA's chips.

So much of our day-to-day work, and how the future of work is changing, is actually being impacted by NVIDIA chips, right? So that's why it's pretty important. So Fred is asking here: how will increased tariffs impact NVIDIA?

Yeah, we've seen some reports from the incoming Trump administration here in the U.S. of tariffs anywhere between 10 and 20 percent, and I believe up to 60 percent on Chinese imports. So yeah, if these tariffs do come to fruition, we're probably going to be seeing higher costs on the hardware and software side.

That's not my personal opinion. That's actually, you know, third-party reports that are looking at this. So we will see. Our next piece of AI news: ChatGPT has rolled out some noteworthy desktop updates on both Mac and Windows.

So OpenAI has announced a new feature for the ChatGPT desktop app on macOS, allowing it to read code directly from several developer-focused coding apps like VS Code, Xcode, TextEdit, Terminal, and iTerm2. So this development eliminates the need for developers to copy and paste code into ChatGPT, streamlining the coding process by sending code sections directly to the chatbot as context. So unlike other AI coding tools such as GitHub Copilot, ChatGPT cannot yet write code directly into these apps, but it can assist by providing code suggestions.

So I did test this out last week as it came out. So there's also some confusion out there. People are thinking that this is the new Operator, the agentic

AI from OpenAI. It is not. Here's all this does on the ChatGPT Mac app, and you need to update it to the newest version. It works essentially kind of like how you would @-mention a GPT.

You can essentially read different files from those programs, right? So as an example, let's say you are using VS Code or something like that to code. ChatGPT can simply read the file that you are working on, but it cannot write.

So essentially what this is, is a great UI/UX update, so user interface, user experience. You know, it will save a little bit of time.

But right now, I think this is just more of an incremental feature, right? And hopefully it's creating a stronger bridge between the ChatGPT desktop app and how you work with native Mac apps. So we do not yet have that same capability on the Windows side.

However, Windows users were not left out of this, because also last week, OpenAI announced that free users can now download and use the ChatGPT app for Windows. Before, this was limited to paid users only. So if you are a free ChatGPT user, you can now download the desktop app for Windows. All right, more OpenAI news.

Let's keep it going. So according to reports from Bloomberg, OpenAI has proposed a North American AI alliance to compete with China. So OpenAI is urging U.S. legislators and regulators and its allies to collaborate on AI infrastructure to maintain a competitive edge over China, according to Bloomberg reports. So the company suggests forming a North American compact for AI to streamline access to talent, financing, and supply chains.

So OpenAI envisions this compact expanding globally to include U.S. allies and partners, such as countries in the Middle East.

The proposal was unveiled at an event by the Center for Strategic and International Studies, highlighting its significance. So OpenAI's policy blueprint emphasizes the need for the U.S. to support energy infrastructure projects by committing to purchase power, so the company recommends establishing AI economic zones to expedite permitting processes and revitalize nuclear reactors. And one big thing:

We saw another report that we covered in the newsletter last week that said up to 40 percent of AI data centers are going to run into power problems in 2025. Not good. So expanding nuclear energy capacity by utilizing compact reactors from the U.S. Navy was also proposed. So OpenAI views AI as an opportunity to reindustrialize the U.S. and foster broad-based economic growth.

The initiative is seen as crucial for national security, promoting democratic values in AI development against China's influence. So OpenAI has sought funding from investors in the Middle East to support chip, energy, and data center expansion. So OpenAI CEO Sam Altman has reportedly engaged with U.S. officials to garner support for the infrastructure plan. So the proposal coincides with an upcoming change in the U.S. administration, with President-elect Donald Trump acknowledging the need for expanded energy capacity.

So this one is pretty interesting here, right? It almost seems like, on the AI side, at least according to reports, OpenAI is almost trying to establish a NATO-esque partnership among nations allied with the United States, just for AI infrastructure and development, specifically to compete with China. So I do think that China was maybe a little late to the game.

But I think if you look at just the last six to nine months, China, aside from the U.S., has been the leader in AI development. And it's not even close in terms of first; it's like 1A and 1B, and there's really no one else in second or third right now. At least as of recently, it really is just the U.S. and China. So a pretty strategic move here, according to reports from Bloomberg, from OpenAI, trying to take the driver's seat in terms of global dominance on the AI side.

So it is a very important, I'd say, piece of the puzzle to keep an eye on, right? I think people sometimes devalue AI's role in the world economy. It is everything right now, right? AI is being used at the forefront of the military; it's being used in transportation and supply chains.

Literally every aspect of the world economy is becoming maybe overly reliant on artificial intelligence, on GPUs, on the ability to generate power, right? So much of the real economy right now is heavily reliant, and quickly becoming potentially over-reliant, on AI and related industries.

So I do think it is important to see what happens in terms of partnerships on a global scale. I don't think people are paying too much attention to it, but I think that we should. So pretty big news there coming from Bloomberg. And hey, as a reminder, y'all, this is always in our newsletter. So if you're listening to this maybe on the treadmill or walking your dog, something like that, and you're like, "What happened there?", it is always going to be in the newsletter, so don't worry about that.

Hey, this is Jordan, the host of Everyday AI. I've spent more than a thousand hours inside ChatGPT, and I'm sharing all of my secrets in our free Prime Prompt Polish ChatGPT course that's only available to loyal listeners like you. Take it from Ben, a company founder who took the PPP course.

Your Prime Prompt Polish class was absolutely fantastic. If you're new to AI like me, it leapfrogs your learning immensely. If you have been playing around with it for a little while, I can imagine how it would help you just a little bit. You're learning that much faster. So I encourage you to give it a try and check it out.

Everyone's prompting wrong, and the PPP course fixes that. If you want access, go to podppp.com. Again, that's podppp.com. Sign up for the free course and start putting ChatGPT to work for you.

All right, our next piece of AI news, some big news from Google. So Google's Gemini AI is reportedly set for a major upgrade with some conversational capabilities. This sounds like a small feature, but stick with me.

I'm going to tell you why this is actually pretty big. So Google Gemini AI is poised for a significant upgrade, potentially enhancing its ability to interact with various media types like files, photos, and recordings. So this is with Gemini Live, the live interactive voice agent.

So this development mirrors the capabilities of OpenAI's ChatGPT, particularly a feature missing in its Advanced Voice Mode. So according to Android Authority, a recent beta version of the Google Gemini app revealed code suggesting that Gemini Live might soon enable discussions about user-uploaded attachments. So although the feature is not yet active, the code hints at a user experience where questions can be asked and insights can be gained directly from files. Here's why this is important: also in the last couple of days, Google has released Gemini as a dedicated app on the iPhone, which includes access to its Gemini Live voice assistant. So the combination of having Gemini on the iPhone, with access to the neural Gemini Live agent, and the ability for Gemini Live to potentially read your files would put it ahead of all other voice assistants.

So I'd say right now, at least here in the U.S., by far OpenAI's Advanced Voice Mode is the best live AI voice assistant, right? And presumably we will hopefully see some of these roll out to companies, or to services like Amazon's Alexa, like Apple's Siri, and such. But right now, it is really just OpenAI's Advanced Voice Mode and Google Gemini Live that I think are pretty far above everyone else. But the downside, at least right now, is that these very neural, low-latency, very human-sounding AI voice assistants powered by large language models cannot access actual data.

It really makes them a little less useful, right? So we've done a couple of shows here, as an example, on Advanced Voice Mode from OpenAI and ChatGPT. The downside, though, is you can't use all of the other tools, so it cannot access the web.

For Advanced Voice Mode on ChatGPT's app, which is right now only available on the iPhone... actually, no, I do believe it was rolled out, yeah, during my little vacation that I had celebrating my anniversary, it was rolled out on desktop as well. But the downside is you cannot upload files to OpenAI's Advanced Voice Mode, and you actually can't even type. As soon as you type, you can no longer use Advanced Voice Mode. So, like I said, this sounds like a small potential feature from Google Gemini and Gemini Live.

But it's actually pretty big, because if that does actually roll out, this would mark the first very smart, large-language-model-powered AI voice assistant that can access your data, which I think would really change how we work, right? Think, as an example, if you have a long commute, or if you're taking a long walk, things like Advanced Voice Mode and Google Gemini Live are great to talk to. However, it's not super useful, right? Because if you do want to really gain some great insights when it comes to your business, or something that you're personally working on, you have to spend a lot of time verbalizing and talking all of that information into the context window, which is, I mean, redundant, number one.

Number two, it's extremely time consuming. So the ability to upload files and use Google Gemini Live in the new iPhone app that brings Gemini and Gemini Live to iPhone users could be a huge announcement, right? Last but... So, not necessarily. Michael, great question from YouTube, asking: is this kind of like NotebookLM? Not necessarily.

So the big thing right now with Gemini Live and Advanced Voice Mode is you can talk to these agents, right? NotebookLM, I think, is the product of the year for 2024, unless OpenAI drops Sora or drops, you know, GPT-5 or Orion. But I still think NotebookLM right now is the product of the year for 2024. NotebookLM is by Google.

It is a grounded, state-of-the-art model that has built-in retrieval-augmented generation. It is wild that I'm saying those words and that it is available. However, with NotebookLM, you can't talk to it, right? You can type to it, you can chat with all of your documents, and you can also generate podcasts, right? But you can't talk to it.

So I think this is where, once you can upload your files to a neural live AI agent and talk to it, that's where the consulting game is going to completely change. So not necessarily like NotebookLM; slightly different. All right, our last piece of AI news for the week, and I'd say this is probably the biggest news.

So according to reports, OpenAI is working on an agentic AI system called Operator. So OpenAI is preparing to release a new AI tool code-named Operator in January 2025, designed to automate tasks with minimal human intervention, according to Bloomberg reports. So Operator, like I said, is an AI agent that acts like a personal assistant capable of performing multistep processes and executing actions on behalf of users, distinguishing it from traditional AI chatbots.

So the introduction of Operator signifies a major shift in how businesses could operate, potentially transforming daily workloads by automating complex and labor-intensive tasks. So the tool is expected to be released as a research preview first, and also through OpenAI's API for developers.

So it will, according to reports, be released in those two tiers first. But at least right now, according to reports, it will not be released on the front end, like if you go to chatgpt.com or use the ChatGPT app, but first as a research preview and through the OpenAI API. So this development aligns with OpenAI's long-term focus on agentic AI, a priority highlighted by CEO Sam Altman, and reflects the company's ongoing commitment to advancing AI capabilities. So competitors like Anthropic, Google, and Microsoft are also making strides in agentic AI with products that we've talked about a lot here on the show over the last few weeks: Anthropic's computer use, which is live now;

Google's Jarvis, which was a Chrome extension that Google launched and then kind of took away, but which will reportedly be released in December; and Microsoft's autonomous AI agents in Copilot Studio, which according to reports will be publicly released in a beta tomorrow. So essentially, I mean, those are the big four right there, right? Anthropic, OpenAI, Google, and Microsoft, all in the fourth quarter of 2024, are either going to be releasing agentic AI or reports are coming out that they will be releasing it. So essentially, from October to, reportedly, January, so in a four-month span, we will likely see agentic AI from the four biggest players in AI.

So the introduction of AI agents like Operator could accelerate the shift toward more automated work environments, emphasizing the need for businesses to invest in AI readiness and redefine job roles. I cannot underestimate, or sorry, I cannot overstate how important and impactful this is. I think we've been slowly walking our way through large language model implementation over the last two years.

This changes everything. This changes how we work. It changes what jobs look like. It changes the human role in traditional knowledge work.

So as AI becomes more integrated into business processes, issues of safety, governance, and data compliance are expected to become even more critical, requiring organizations to establish robust guardrails and human-in-the-loop systems. All right. That was a lot.

Just one more reminder: Microsoft Ignite is tomorrow. Let's just say there are going to be some big announcements, and you are going to want to make sure you join the show tomorrow, and you should probably join us live at 7:30 a.m. Central Standard Time. I will say that I can't say a whole lot more.

But as a recap, here is the AI news that matters for the week of November 18th. So first, we have Google's new Gemini 1114 topping the AI arena leaderboards as the world's most powerful AI model. For now. I do expect OpenAI to release an updated version of GPT-4o that would probably take that top spot.

Next, according to a new study, ChatGPT is outperforming doctors in medical diagnosis. Next, NVIDIA is poised for strong quarter three earnings, which are expected this Wednesday. Next, ChatGPT has rolled out some pretty noteworthy desktop updates for both Mac users and Windows users.

Next, according to reports, OpenAI has proposed a North American AI alliance to compete with China. Next, Google Gemini AI is set for a potential major upgrade with conversational capabilities, giving users the option to upload files while using Gemini Live, plus the new iPhone release. And then last but not least, reports are saying that OpenAI's new agentic AI system called Operator will be released in January.

All right, well, that was a lot. I hope that was helpful. If this was helpful, and if you haven't already, please make sure to go to youreverydayai.com. Sign up for the free daily newsletter.

Hey, and while you're here, if you're listening, whether it's on the podcast or on the livestream, please click that repost button and go ahead and tag someone in the comments who needs to hear this. If you're on the podcast, please follow or subscribe to the Everyday AI Show, whether you're listening on Apple or Spotify or somewhere else. And if you could leave us a rating, we would super appreciate that. We spend so much time every single day.

Yeah, the show is Monday to Friday. Maybe you only listen on Monday for the AI news that matters. But we do this every single day, bringing you the brightest minds in AI from all over the world, from every single sector, and giving you direct access to them.

So if you're normally a podcast listener, I suggest you try to schedule it into your day to come to the livestream. It's great to be able to ask questions of some of the world's leading AI experts and get answers in real time. All right, I hope this was helpful.

Make sure to shout out. Let me know if you're going to be at Microsoft Ignite in Chicago.

This is starting tomorrow, my birthday. What a great birthday present, right? You know, the keynote and the kickoff of Microsoft Ignite.

So if you want to connect, chat AI, or maybe tell me a little bit about what your company is doing in the AI space, make sure to reach out to me. We'll set up a little meeting.

Thank you for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thank you.

And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up for our daily newsletter so you don't get left behind. Go break some barriers, and we'll see you next time.