You like my Buck's t-shirt?
I love your Buck's t-shirt.
I went for the first time just two weeks ago, when I was down there for a meeting, and the nostalgia is just unbelievable.
I can't believe you hadn't been before. I know Jensen is a Denny's guy, but I feel like he would meet us at Buck's if we asked him.
Or at the very least, we should figure out some NVIDIA memorabilia to get on the wall.
The wall at Buck's. Totally fits, right?
Alright.
Let's do it.
(Intro music: Who got the truth? Is it you, is it you, is it you? Who got the truth now? Sit me down, say it straight, another story on the way.)
Welcome to Season 13, Episode 3 of Acquired, the podcast about great technology companies and the stories and playbooks behind them. I'm Ben Gilbert. I'm David Rosenthal. And we are your hosts. Today we tell a story that we thought we had already finished: NVIDIA. But the last eighteen months have been so insane, listeners, that it warranted an entire episode on its own.
So today is a Part III for us with NVIDIA, telling the story of the AI revolution, how we got here, and why it is happening now, starting all the way down at the level of atoms and silicon. So here's something crazy: I did a transcript search to see if this was true, and in our April 2022 episodes, we never once said the word "generative." That is how fast things have changed.
Totally crazy. And the timing of all of this AI stuff in the world is unbelievably coincidental, and very favorable. So recall back to eighteen months ago: throughout 2022, we all watched financial markets, from public equities to early stage startups to real estate, just fall off a cliff due to the rapid rise in interest rates. The crypto and Web3 bubble burst, banks failed. It seemed like the whole tech economy, and potentially a lot with it, was heading into a long winter.
Including NVIDIA.
Including NVIDIA, who had that massive inventory write-off for what they thought was over-ordering.
Wow, how things have changed.
But by the fall of 2022, right when everything looked the absolute bleakest, a breakthrough technology finally became useful after years in research labs. Large language models, or LLMs, built on the innovative transformer machine learning mechanism, burst onto the scene, first with OpenAI's ChatGPT, which became the fastest app in history to a hundred million active users, and then quickly followed by Microsoft, Google, and seemingly every other company. In November of 2022, AI definitely had its Netscape moment, and, time will tell, it may have even been its iPhone moment.
That is definitely what Jensen believes.
Well, today we'll explore exactly how this breakthrough came to be, the individuals behind it, and of course, why the entire thing has happened on top of NVIDIA's hardware and software. If you want to make sure you know every time there's a new episode, go sign up at acquired.fm/email. You'll also get access to two things that we aren't putting anywhere else.
One, a clue as to what the next episode will be, and two, follow-ups from previous episodes, from things that we learned after release. You can come talk about this episode with us after listening at acquired.fm/slack. If you want more of David and I, check out our interview show, ACQ2. Our next few episodes there are about AI, with CEOs leading the way in this world we are talking about today, and a great interview with Doug DeMuro, where we wanted to talk about a lot more than just Porsche with him, but we only had eleven hours or whatever we had in Doug's garage. So a lot of the car industry chat, and learning about Doug and his journey and his business,
we saved for ACQ2. So go check that out. One final announcement: many of you have been wondering, and we've been getting a lot of emails, when will those hats be back in stock? Well, they're back, for a limited time. You can get an ACQ embroidered hat at acquired.fm/store. Go put your order in before they go back into the Disney vault forever.
This is great. I can finally get Jenny one of her own, so she stops stealing mine.
Yes. Well, without further ado, this show is not investment advice. David and I may have investments in the companies we discuss, and this show is for informational and entertainment purposes only. David?
History and facts. Oh man, well, on the one hand, we only have eighteen months to talk about...
Except that I know you're not gonna start...
Eighteen months ago. On the other hand, we have decades and decades of foundational research to cover. So when I was starting my research, I went to the natural first place, which was our old episodes from April 2022. And I was listening to them, and I got to the end of the second one.
And man, I had forgotten about this. I think Jensen maybe wishes we all had forgotten this. In one of NVIDIA's earnings slide decks in 2021, they put up their total addressable market, and they said they had a one trillion dollar TAM.
And the way that they calculated this was that they were gonna serve customers who provided a hundred trillion dollars worth of industry, and they were going to capture just one percent of it. And there was some stuff on the slide that was fairly speculative, you know, like autonomous vehicles and the Omniverse, and I think robotics were a big part of it.
And the argument is basically like: well, cars plus factories plus all these things added together is a hundred trillion, and we can just take one percent of that, because surely the computing will amount to one percent of that. Which I'm not arguing is wrong, but it is a very blunt way to analyze the market. It's usually not the right way to think about starting a startup: "if we can just get one percent of this big market..."
But it's the most top-down way I can think of to size a market.
So you basically called this out at the end of NVIDIA Part II. You know, to justify where NVIDIA is trading at the moment, you actually have to believe that all of this is gonna happen, and happen soon: autonomous cars, robotics, everything.
Yeah. Importantly, I felt like the way for them to become worth what they were worth at that time literally had to be to power all of this hardware in the physical world.
Yeah, I kind of can't believe that I said this, because it was unintentional and uninformed, and I was kind of grasping at straws trying to play devil's advocate for you. We'd just spent most of that whole episode talking about how machine learning, powered by NVIDIA, ended up having this incredibly valuable use case, which was powering social media feed recommendations, and that Facebook and Google had grown bigger than anyone ever imagined on the internet with those feed recommendations, and NVIDIA was powering all of it.
And so I just sort of idly proposed: well, maybe, but what if you don't actually need to believe any of that to still think that NVIDIA could be worth a trillion dollars? What if, maybe, just maybe, the internet and software and the digital world are gonna keep growing, and there will be a new foundational layer that NVIDIA can power? Is that possible? I think we were both like, yeah, I don't know.
Let's end the episode. Yeah, sure. We shrugged it off. We were like...
Alright, carry on. But the crazy thing is that, of course, at least in this time frame, most things on Jensen's trillion dollar TAM slide have not come to pass. But that crazy question just might have come to pass, and from NVIDIA's revenue and earnings standpoint, definitely has. It's just wild.
Right. So how did we get here?
Let's rewind and tell the story. So back in 2012, there was the Big Bang moment of artificial intelligence, or as it was more humbly referred to back then, machine learning. And that was AlexNet. We talked a lot about this on the last episode. It was three researchers from the University of Toronto who submitted the AlexNet algorithm to the ImageNet computer science competition. Now, ImageNet was a competition where you would look at a set of fourteen million images that had been hand-labeled with what the pictures were of, like a strawberry or a cat or a dog or whatever.
And David, you were telling me that labeling the ImageNet data set was the largest-ever use of Mechanical Turk up to that point. Yeah.
Well, I mean, until this competition and until AlexNet, there was no machine learning algorithm that could accurately label images. So thousands of people on Mechanical Turk got paid, however much, two bucks an hour, to label these images.
Yeah. And I'm remembering from our episode, basically what happened is the AlexNet team did way better than anybody else had ever done. A complete step-change better. I think the error rate went from mislabeling images twenty-five percent of the time to suddenly only mislabeling them fifteen percent of the time. And that was a huge leap over the tiny incremental progress that had been made along the way.
You are spot on. And the way that they did it, which would completely change the fortunes of the internet, of Google, of Facebook, and certainly of NVIDIA, was they actually used old algorithms: a branch of computer science and artificial intelligence called neural networks, specifically convolutional neural networks, which had been around since the sixties. But they were really computationally intensive to train.
And so nobody thought it would be practical to actually train and use these things, at least not anytime soon, or in our lifetimes. And what these guys from Toronto did is they went out, probably to their local Best Buy or the equivalent in Canada, and they bought two GeForce GTX 580s, which was the top-of-the-line card at the time. And they wrote their algorithm, the convolutional neural network, in CUDA, NVIDIA's software development platform for GPUs. And they trained this thing on like a thousand dollars worth of consumer-grade hardware.
And basically, the approaches that other people had been trying over the years just weren't massively parallel the way that graphics cards enable. But if you actually can consume the full compute of a graphics card, then perhaps you could run some unique, novel algorithm, and do it in a fraction of the time and expense that it would take in these supercomputer laboratories.
Yeah. Everybody before was trying to run these things on CPUs. CPUs are awesome, but they only execute one instruction at a time. GPUs, on the other hand, execute hundreds or thousands of instructions at a time. So GPUs, NVIDIA graphics cards, "accelerated computing," as Jensen and the company like to call this, you can really think of it like a giant Archimedes lever: whatever advances are happening in Moore's law and the number of transistors on a chip, if you have an algorithm that can run in parallel, which is not all problem spaces, but many, then you can basically lever up Moore's law by hundreds of times, or thousands of times, or today, tens of thousands of times, and execute something a lot faster than you otherwise could.
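(A quick illustration for the technically inclined, our own sketch rather than anything from the episode: the same element-wise operation done one value at a time versus in one bulk operation. NumPy on a CPU is just a stand-in here for what a real CUDA kernel does across thousands of GPU cores.)

```python
import time
import numpy as np

# The same element-wise add, done two ways: one element per step
# (sequential, CPU-style) vs. one bulk operation over the whole array
# (the massively parallel style that GPUs accelerate).
a = np.random.rand(5_000_000)
b = np.random.rand(5_000_000)

start = time.perf_counter()
out = np.empty_like(a)
for i in range(len(a)):            # sequential: one add at a time
    out[i] = a[i] + b[i]
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
out_bulk = a + b                   # bulk: the entire array in one op
bulk_seconds = time.perf_counter() - start

print(f"one-at-a-time: {loop_seconds:.2f}s, bulk: {bulk_seconds:.4f}s")
```

On typical hardware the bulk version is orders of magnitude faster, and on an actual GPU with thousands of cores the gap gets far wider still, which is exactly the lever being described.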
And it's so interesting that there was this first market called graphics that was obviously parallel, where every pixel on a screen is not sequentially dependent on the pixel next to it. It literally can be computed independently and output to the screen. So you have however many tens of thousands, now hundreds of thousands, of pixels on a screen that can all actually be done in parallel. And little did NVIDIA realize, of course, that AI and crypto and all these other linear algebra, matrix-math-based things, this whole shift to accelerated computing, pulling things off the CPU and putting them on GPUs and other parallel processors, was an entire new frontier of other applications that could use the very same technology they had pioneered for graphics.
Yeah, it was pretty useful stuff. And this AlexNet moment, these three researchers from Toronto, kicked off what Jensen calls, and he's absolutely right, the Big Bang moment for AI.
So David, the last time we told this story in full, we talked about this team from Toronto, but we did not follow what this team of three went on to do afterwards.
Yes. So basically what we said was, it turned out that a natural consequence of what these guys were doing was: oh, actually, you can use this to surface the next post in a social media feed, like an Instagram feed or the YouTube feed, or something like that. And that unlocked billions and billions of dollars of value.
And those guys, and everybody else working in the field, they all got scooped up by Google and Facebook. Well, that's true. And then as a consequence of that, Google and Facebook started buying a lot of NVIDIA GPUs. But it turns out there's also another chapter to that story that we completely skipped over. And it starts with the question: who are these people?
Yes.
So the three people who made up the AlexNet team were, of course, Alex Krizhevsky, who was a PhD student under his faculty advisor, the legendary computer science professor Geoff Hinton. I have an amazing piece of trivia about Geoff Hinton. Do you know who his great-great-grandparents were?
No, I have no idea.
He is the great-great-grandson of George and Mary Boole. You know, like Boolean algebra and Boolean logic.
This guy was born to be a computer science researcher. Oh my god, right?
Foundational stuff for computation and computer science.
I also didn't know there were people named Boole, but that's where that came from. That's hilarious.
Yeah. You know, the AND, OR, XOR, NOR operators? That comes from George and Mary. Wild. So Geoff is the faculty advisor. And then there was a third person on the team.
Alex's fellow PhD student in this lab, one Ilya Sutskever. And if you know where we're going with this, you are probably jumping up and down in your seat right now. Ilya is the co-founder and current chief scientist of OpenAI.
Yes.
So after AlexNet, Alex, Geoff, and Ilya do the very natural thing: they start a company. I don't know what they were doing in the company, but, uh...
It made sense to start one. And whatever they did, it was going to get...
Acquired real fast, by Google, within six months. So they get scooped up by Google, and they join a bunch of other academics and researchers that Google had been monopolizing, really, in the field.
Three specifically: Greg Corrado, Jeff Dean, and Andrew Ng, the famous Stanford professor. The three of them had just formed the Google Brain team within Google to turbocharge all of this AI work that had been unleashed by AlexNet, and of course, to turn it into a huge amount of profit for Google.
Turns out, individually serving advertising that's perfectly targeted on the internet, through Facebook or Google or YouTube, is an enormously profitable business, and one that consumes a whole lot of NVIDIA GPUs.
Yes. So about a year later, Google also acquires DeepMind, famously. And then right around the same time, Facebook scoops up computer science professor Yann LeCun, who also is a legend in the field. And the two of them basically establish a duopoly on leading AI researchers. Now, at this point, nobody is mistaking what these companies and these people are doing for true human-level intelligence, or anything close to it.
This is AI that is very good at narrow tasks, like we talked about: social media feed recommendations. So the Google Brain team, and Geoff and Alex and Ilya, one of the big projects they work on is redoing the YouTube algorithm. And this is when YouTube goes from crazy, money-losing thing that Google acquired to the absolute juggernaut that it is today.
I mean, back then, like 2013, 2014, we did our YouTube episode not that long after, the majority of views of YouTube videos were embeds on other web pages. This is when they build it into a social media site: they start the feed, they start autoplay. All of this stuff is coming out of AI research. Some of the other stuff that happens at Google famously, after they acquire DeepMind:
DeepMind builds a bunch of algorithms to save on cooling costs. And Facebook, of course, they probably had the last laugh in this generation, because they're using all this work, and Yann LeCun is doing his thing and hiring lots of researchers there.
This is just a couple years after they acquired Instagram. Man, we need to go back and redo that episode, because Instagram would have been a great acquisition anyway, but it was AI-powered recommendations in the feed that made it into a hundred, two hundred, five hundred billion dollar asset for Facebook.
And I don't think you're exaggerating. I think that is literally what Instagram is worth to Meta now. By the way, I've bought a lot of things from Instagram ads, so the targeting works?
It absolutely does. There's this amazing quote from Astro Teller, who ran Google X at the time and still does, in a New York Times piece, where he says that the gains from Google Brain during this period, and I don't think this even includes DeepMind, just the gains from the Google Brain team alone, in terms of profits to Google, more than funded everything they were doing in Google X.
Which, has there ever been anything profitable out of Google X?
Out of Google X? Google Brain, yes. I mean, yeah, we'll leave it at that.
So this takes us to 2015, when a few people in Silicon Valley start to realize that this Google-Facebook AI duopoly is actually a really, really big problem. And most people had no idea about this; it was really visionary of these two people. And it's not just a problem for, like, the other big tech companies, though you could make that argument: Siri is terrible, all the other companies that have lots of consumer touch points have pretty bad AI at the time. But the concern is for a much greater reason.
I think there are three levels of concern here. One, obviously, the other tech companies. Then there's the problem for startups: this is terrible for startups. How are you gonna compete with Google and Facebook when this is the primary value driver of this generation of technology?
There really is another lens to view what happened with Snap, what happened with Musical.ly having to sell themselves to ByteDance and becoming TikTok and going to the Chinese. Maybe it was business decisions, maybe it was execution or whatever that prevented those platforms from getting to independent scale. Snap's a public company now, but, like, it's no Facebook. Maybe it was that they didn't have access to the same AI researchers that Facebook and Google had.
That feels like an interesting question. It's probably a couple steps too far on the conclusion, but still sort of fun to draw.
And fun to think about. Fundamentally, this is definitely a problem. The third layer of the problem is just, like, this sucks for the world, that all these people are locked up in Google and Facebook.
This is probably a good time to mention that the founding of OpenAI was motivated by the desire to find AGI, or artificial general intelligence, first, before the big tech companies did. And DeepMind was the same thing.
It was gonna be this winding and circuitous path at the time, since really nobody knew then, or knows now, the best path to get to AGI. But the big idea of OpenAI's founding was: whoever figures out and finds AGI first will be so big and so powerful, so quickly, that they'll have an immense amount of control. And that is best done in the open.
So these two people who were quite concerned about this convene a very fateful dinner in 2015 at, of all places, you guessed it, the Rosewood. The Rosewood Hotel on Sand Hill Road, naturally. Wouldn't it have been way better if it were at a Denny's or a Buck's or somewhere like that? But that does actually...
It just shows where the seeds of OpenAI came from, and it's very different than the sort of organic, scrappy way that the NVIDIAs of the world got started. You know, this is powers on high and existing money saying: no, we need to will something into existence.
So of course, those two shadowy figures are Elon Musk and Sam Altman, who at the time was president of Y Combinator. So they get this dinner together, and they invite basically all of the top AI researchers at Google and Facebook.
And they're like: what is it gonna take for you to leave, and to break this duopoly? And the answer from almost all of them is: nothing. You can't. Why would we ever leave? We're happy as clams here. We get to hire the people that we want. We've built these great teams. There's a money spigot pointed at our faces.
Right. Not only are we getting paid just an ungodly amount of money, but we get to work directly with the best AI researchers in the field. If you were still at an academic institution, you know, say you're at the University of Washington, an amazing academic institution for computer science, or the top of the world, the University of Toronto, where these guys came from, you're still in a fragmented market.
Whereas if you go to Google or you go to Facebook, you're with everybody. Yeah. So the answer is no from basically everybody, except there's one person who is intrigued by Elon and Sam's pitch. And to quote an amazing Wired article from the time by Cade Metz, which we will link to in our sources: "The trouble was, so many of the people most qualified to solve all these AI problems were already working for Google and Facebook, and no one at the dinner was quite sure that these thinkers could be lured to a new startup, even if Musk and Altman were behind it."
But one key player was at least open to the idea of jumping ship. And then they have a quote from that key player: "I felt there were risks involved, but I also felt it would be a very interesting thing to try."
And that key player was Ilya Sutskever. Yes. So after the dinner, Ilya leaves Google and signs up to become, as we said, co-founder and chief scientist of a new independent AI nonprofit research lab backed by Elon and Sam: OpenAI.
Okay, listeners. Now is a great time to tell you about longtime friend of the show, ServiceNow.
Yes, as you know, ServiceNow is the AI platform for business transformation, and they have some new news to share: ServiceNow is introducing AI agents. So only the ServiceNow platform puts AI agents to work across every corner of your business.
Yeah. And as you know from listening to us all year, ServiceNow has been pretty remarkable about embracing the latest AI developments and building them into products for their customers. AI agents are the next phase of this.
So what are AI agents? AI agents can think, learn, solve problems, and make decisions autonomously. They work on behalf of your teams, elevating their productivity and potential. And while you get incredible productivity enhancements, you also get to stay in full control.
With ServiceNow, AI agents proactively solve challenges, from IT to HR, customer service, software development, you name it. These agents collaborate, they learn from each other, and they continuously improve, handling the busywork across your business so that your teams can actually focus on what truly matters.
Ultimately, ServiceNow and agentic AI is the way to deploy AI across every corner of your enterprise. They boost productivity for employees, enrich customer experiences, and make work better for everyone.
Yeah. So learn how you can put AI agents to work for your people by clicking the link in the show notes, or by going to servicenow.com/ai-agents. Okay, so David: OpenAI is formed. It's 2015. ChatGPT is eight years later. Super windy path from there to here.
Right. So as we were talking about a little bit, AI at this point in time is super good for narrow use cases, but looks nothing like GPT-4 today. The capabilities were pretty limited, and one of the big reasons was that the amount of data you could practically train these models on was pretty limited. Take the AlexNet example: you were talking about fourteen million images. In the grand scheme of the internet, fourteen million images is a drop in the bucket.
And this was both a hardware and a software constraint. On the software side, we just didn't actually have the algorithms to, if we could be so bold, train one single foundational model on the whole internet. It wasn't a thing.
That was a crazy idea.
Right. People were excited about the concept of language models, but we actually didn't know how we could algorithmically get it done. So in 2015, Andrej Karpathy, who was then at OpenAI, went on to lead AI for Tesla, and is actually now back at OpenAI, writes this seminal blog post called "The Unreasonable Effectiveness of Recurrent Neural Networks." And David, I won't go deep into it on this episode, but note that recurrent neural networks are a bit of a different thing than the convolutional neural networks of the 2012 paper.
The state of the art had evolved.
Yes. And right around that same time, there's also a video that hits YouTube, a little bit later, in 2016, on NVIDIA's channel, actually. And there are two people in this very short, one minute and forty-five second video.
One is a young Ilya Sutskever, and two is Andrej Karpathy. And here is a quote from Andrej from that YouTube video: "One algorithm I'm excited about is a language model: the idea that you can take a large amount of data, and you feed it into the network, and it figures out the pattern in how words follow each other in sentences."
"So, for example, you could take a large amount of data on how people talk to each other on the internet. You can train, basically, a chatbot. But you can do it in a way that the computer learns how language works and how people interact. Eventually, we'll use that to talk to computers, just like we talk to each other."
This is 2015.
This is two years before the transformer, while Karpathy is at OpenAI. He both comes up with, or espouses, the idea of a chatbot. So that had already been discussed even before we had the transformer, the method to actually pull this off. And there's an important part here: "it figures out the pattern in how words follow each other in sentences." There's this idea that the very structure of language, and the way to interpret knowledge, is actually embedded in the training data itself, rather than requiring labeling.
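(As a concrete illustration of "figuring out the pattern in how words follow each other," here's a toy next-word predictor in Python, our own sketch, vastly simpler than any real language model: it just counts which word follows which in a corpus and predicts the most common successor.)

```python
from collections import Counter, defaultdict

# Toy "language model": count word-successor frequencies in a corpus,
# then predict the most frequent next word. Real LLMs learn vastly
# richer patterns, but the objective is the same: predict the next
# word from raw, unlabeled text.
corpus = "the cat sat on the mat the cat ate the fish".split()

successors = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    successors[word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    if word not in successors:
        return None
    return successors[word].most_common(1)[0][0]

print(predict_next("the"))  # -> 'cat' (follows 'the' twice; 'mat'/'fish' once)
```

Note there's no labeling anywhere: the structure comes entirely from the raw text itself, which is the point Karpathy was making.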
This is so cool. So at spring GTC this year, Jensen did a fireside chat with Ilya, and it's amazing, you should go watch the whole thing. But in it, this question comes up.
Jensen kind of poses it as a straw man, like: hey, some people say that GPT-3, GPT-4, ChatGPT, everything going on, all these LLMs, they're just probabilistically predicting the next word in a sentence. They don't actually have knowledge.
And Ilya has this amazing response to that. He says: okay, well, consider a detective novel. At the end of the novel, the detective gathers everyone together in a room and says, "I am now going to tell you all the name of the person who committed the crime, and that person's name is: blank." The more accurately an LLM predicts that next word, i.e. the name of the criminal, ipso facto, the greater its understanding, not only of the novel, but of all general human-level knowledge and intelligence, because you need all of your experience of the world, and as a human, to be able to guess who the criminal is. And the LLMs that are out there today, GPT-3, GPT-4, LLaMA, Bard, these others: they can guess who the criminal is.
Yeah. We'll punt on the understanding-versus-predicting debate, a hot topic for sure. So David, is now a good time to fast forward two years, to 2017, to the transformer paper?
Absolutely.
Then tell us about the transformer.
Okay. So, Google, 2017: the transformer paper comes out, a paper called "Attention Is All You Need."
And it's from the Google Brain team, right? That Ilya had just left.
Just left, two years before, to join OpenAI. So, machine learning on natural language, just to set the table here, had long been used for things like autocorrect or foreign-language translation. But in 2017, Google came out with this paper and discovered a new model that would change everything for these fields, and unlock another one.
So here's the scenario: you're translating a sentence from English to French. You could imagine that one way to do this would be one word at a time, in order. But for anyone who's ever traveled abroad and tried to do this, you know that words are sometimes rearranged in different languages.
So that's a terrible way to do it. "United States" in Spanish is "Estados Unidos." So: failure on the very first word, in that example.
So, enter this concept of attention, which is a key part of this research paper. This attention is the fairly magical component of the transformer paper. It literally is what it sounds like: it is a way for the model to attend to different areas of the input text at different times. You can look at a large amount of context while considering what word to pick next in your translation. So for every single word that you're about to output in French, you can look over the entire set of input words to figure out which ones you should weigh heavily in your decision for what to do next.
This is why AI and machine learning were so narrowly applicable before. If you anthropomorphize it and think of it like a human, it was like a human with a very, very short attention span.
Yes. Now here's the magical part. While it does look at the whole input text to consider what the next word should be, it doesn't throw away the notion of position entirely. It uses a technique called positional encoding, so it doesn't forget the position of the words altogether. So it's got this cool thing where it weights the parts relevant to your particular word, and it still understands position. And remember, I said the attention mechanism looks over the entire input every time it's picking what word to output.
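(For the technically inclined, here's a minimal sketch of that attention computation in Python with NumPy. It's our own simplification: it ignores the learned projection matrices, multiple heads, and masking that the actual transformer uses.)

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention over a whole sequence at once.
    Q, K, V: (seq_len, d) arrays; returns (seq_len, d)."""
    d = Q.shape[-1]
    # (seq_len, seq_len): every position scored against every other
    scores = Q @ K.T / np.sqrt(d)
    # softmax each row into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V                 # weighted mix over the full input

# Toy input: 4 token embeddings of dimension 8. In a real transformer,
# positional encodings are added to these embeddings first, which is how
# position survives even though attention itself ignores word order.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)        # self-attention -> (4, 8)
```

The key thing to notice is that `scores` is an n-by-n matrix: every word attends to every word, which sets up exactly the cost discussion that comes next.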
That sounds very computationally hard.
Yes. In computer science terms, this means that the attention mechanism is O(n²).
Oh, that's giving me the heebie-jeebies, flashing back to my CS classes in college.
Just wait, it goes deeper. So obviously, yes, traditionally you'd say this is very, very inefficient. And it actually means that the larger your context window, a.k.a. token limit, a.k.a. prompt length, gets, the more computationally expensive it gets, on a quadratic basis.
So doubling your input quadruples the cost to compute an output, and tripling your input means nine times the cost. It gets expensive early. Yeah.
It gets real, real fast. But: GPUs to the rescue. The amazing news for us here is that these transformer computations can be done in parallel.
So even though there are lots of them to do, if you have big GPU chips with tons of cores, you can do them all at exactly the same time. Previous technologies to accomplish this, like recurrent neural networks, or LSTMs, long short-term memory networks, which are a type of recurrent neural network, those required knowing the output of each step before beginning the next one, before you picked the next word. So in other words, they were sequential, since they depended on the previous word. Now with transformers, even if the string of text that you're inputting is a thousand words long, it can happen just as quickly, in humanly measurable time, as if it were ten words long, supposing there are enough cores in that big GPU. So the big innovation here is that you could now train sequence-based models in a parallel way. You couldn't train models of this size at all before, let alone cost-effectively.
Yeah, this is huge. Probably for all listeners out there, this is starting to sound very familiar, like the world that we live in today.
Yeah. I sort of did a sleight of hand there, morphing translation into using words like "context window" and "token length." I think you can see where this is going. Yep.
So this transformer paper comes out in 2017, and the significance is huge. But for whatever reason, there's a window of time where the rest of the world doesn't quite realize it.
So Google obviously knows how important this is. And there's like a year where Google's AI work, even though Ilya has left and OpenAI is a thing now, accelerates again beyond anybody else in the field. This is when Google comes out with Smart Compose in Gmail, and they do that thing where they have an AI bot that will call local businesses for you. Remember that demo?
I know the one. Did that ever ship?
I don't know, maybe it did. Maybe it is Google here, like, the capabilities are there; the product, who knows. This is when they really start investing in Waymo. But again, where it really manifests is just back to serving ads in search and recommending YouTube videos. Like, they're just crushing it in this period of time. OpenAI and everyone else, though, they haven't adopted transformers yet. They're kind of stuck in the past, and they're still doing these research-y computer vision projects. So, like, this is when they build a bot to play Dota 2, Defense of the Ancients 2, the video game.
Super impressive stuff. Like, they beat the best Dota players in the world at Dota by literally just consuming computer vision, like consuming screenshots and inferring from there. And that's a really hard problem, because Dota 2 is not a game where you get to see the whole board at once. So it has to do a lot of really intelligent reconstruction of the rest of the game based on just a single player's worth of input. So it's unbelievably cutting-edge research...
For the past era. It's a faster horse, basically.
Maybe, yeah. I mean, they're also doing stuff like Universe, which was a 3D modeled world to train self-driving cars. You don't really hear anything about that anymore, but they built this whole thing.
I think it was using Grand Theft Auto as the environment, and then it was doing computer vision training for cars using the GTA world. I mean, crazy stuff, but it was kind of scattershot. Yeah, it was scattershot.
I guess what I'm saying is, it was still in this narrow use case world. They weren't doing anything approaching GPT at this point in time. Meanwhile, Google had kind of moved on.
Now, one thing I do want to say in defense of OpenAI, and everybody else in the field at the time: they didn't just have their heads in the sand. To do what transformers enabled you to do, which we're going to talk about in a sec, cost a lot in computing power, GPUs, NVIDIA. The transformer made it possible, but to work with models of the size we're talking about, you're talking about spending an amount of money that, certainly for a nonprofit, and for anybody really except Google, was untenable.
Right. It's funny, David, you made this leap to expensive and large models. All we were doing before, though, was merely talking about translating one sentence to another.
The application of a transformer does not necessarily require you to go and consume the whole internet and create a foundational model. But let's talk about this. Transformers lend themselves quite well, as we now know, to a different type of task.
So for a given input sentence, instead of translating to a target language, they can also be used as next-word predictors, to figure out what word should come next in the sequence. You could even do this idea of pre-training with some corpus of text, to help the model understand how it should go about predicting that next word. So, backing up a little bit, let's go back to the recurrent neural networks, the state of the art before transformers. Well...
They had this problem: in addition to the fact that they were sequential rather than parallel, they also had a very short context window. So you could do a next-word predictor, but it wasn't that useful, because it didn't know what you were saying more than a few words ago.
By the time you'd get to the end of a paragraph, it would forget what was happening at the beginning. It couldn't sort of hold onto all that information at the same time. So this idea of a next-word predictor that was pre-trained with a transformer could really start to do something pretty powerful: consume large amounts of text, and then complete the next word based on a huge amount of context.
Yes. So we're starting to come upon this idea of a large language model. And we're going to flash forward here, just for a moment, for purposes of illustration, then we'll come back to the story. GPT-1, the first OpenAI model, the generative pre-trained transformer, GPT, used unsupervised pre-training, which basically meant that as it was consuming this corpus of language, it was unlabeled data. The model was inferring the structure and meaning of language merely by reading it, which is a very new concept in machine learning.
The canonical wisdom was that you needed extremely structured data to train your smallish model on, because how else were you going to learn what the data actually means? This was a new thing: you can learn what the data means from the data itself.
It's like how a child consumes the world, where only occasionally does their parent say: no, no, no, you have that wrong, that's actually the color red. But most of the time, they're just self-teaching by observing the world.
As a parent of a two-year-old, I can confirm.
And then a second thing happens after this unsupervised pre-training step: you then have supervised fine-tuning. The unsupervised pre-training used a large corpus of text to learn the sort of general structure of language, and then it was fine-tuned on labeled data sets for the specific tasks that you actually want the model to be useful for.
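(Here's a schematic of those two phases in PyTorch, a hedged sketch with a tiny stand-in model and random token data, just to show the shape of the process, not how OpenAI actually did it.)

```python
import torch
import torch.nn as nn

# Tiny stand-in "language model": embed a token, predict the next one.
vocab, dim = 100, 16
model = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Phase 1: unsupervised pre-training. No labels needed: the "label"
# for each position is simply whatever token comes next in the raw text.
tokens = torch.randint(0, vocab, (1000,))    # stand-in for a text corpus
for step in range(100):
    optimizer.zero_grad()
    logits = model(tokens[:-1])              # predict token t+1 from token t
    loss = loss_fn(logits, tokens[1:])
    loss.backward()
    optimizer.step()

# Phase 2: supervised fine-tuning. A small *labeled* dataset adapts the
# same weights to the specific task you actually care about.
inputs = torch.randint(0, vocab, (50,))
labels = torch.randint(0, vocab, (50,))      # task-specific targets
for step in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()
```

Same model, same weights, two very different data regimes: a huge unlabeled corpus first, then a small labeled one.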
So, to give people a sense of why we're saying that training on very, very, very large amounts of data is crazy expensive: GPT-1 had roughly 120 million parameters. GPT-2 had 1.5 billion. GPT-3 had 175 billion. And GPT-4? OpenAI hasn't announced it, but it's rumored to have about 1.7 trillion parameters. This is a long way from AlexNet here.
It's scaling like NVIDIA's market cap. There was this interesting discovery, basically, that the more parameters you have, the more accurately you can predict the next word. These models were basically bad sub ten billion parameters, maybe even sub a hundred billion parameters: they would just hallucinate, or they would be... it's funny, when you look at some of the one billion parameter models, you're like, there is no chance that turns into anything useful, ever. But by merely adding more training data and more parameters, it just gets way, way better. There's this weirdly emergent property where transformer-based models scale really well, due to the parallelism. So as you throw huge amounts of data at training them...
training them, you can also throw huge amount of video GPU that processing that exactly.
And the output unexpectedly gets magically better. I mean, I know I keep saying that, but it is like: wait, we don't change anything about the structure? We just give it way more data, and let these models run for a long time, and make the parameters of the model way bigger? And like, no researchers expected them to reason about the world as well as they do. It just kind of happened as they were exploring larger and larger models.
So, in defense of OpenAI, they knew all this. But the amount of money that you would have to spend to buy GPUs, or to rent GPUs in the cloud, to train these models was prohibitively expensive. And, you know, even Google at this point in time, this is when they start building their own chips, TPUs, because they're still buying tons of hardware from NVIDIA, but they're also starting to source their own here.
Yep. And importantly, they have at this point, or are getting ready to, release TensorFlow to the public, so they have a framework where people can develop AI stuff. And they're like: look, if people are developing using our software framework, then maybe it should run on our hardware that's optimized to work with our software.
So they actually do have this very plausible story around why their hardware, why their software framework. It was kind of a surprising move when they open-sourced it; people were like, gasp, you know, why is Google giving away the farm for free here? But this was three, four years early, and a very prescient move to really get a lot of people using Google-architecture compute at scale...
All within Google Cloud. Yeah. So with this, it starts to look like maybe, despite this whole OpenAI gambit, the world's AI resources are more than ever just locked back into Google. So in 2018, Elon gets super frustrated by all this, basically throws a hissy fit, and quits and peaces out of OpenAI.
There's a lot of drama around this that we're not gonna cover now. He may or may not have given an ultimatum to the rest of the team that he would either take over and run things, or leave. Who knows.
But whatever happened, this turns out to be a major catalyst for the rest of the OpenAI team. It's a real turning-on-a-knife-point moment. It was also probably a super bad decision by Elon. But again, a story for another day.
So there's this great explanation of what happened in the Semafor piece that we'll link to in our sources. The authors say that that fall, it became even more apparent to some people at OpenAI that the costs of becoming a cutting-edge AI company were going to go up. Google Brain's transformer had blown open a new frontier where AI could improve endlessly, but that meant feeding it endless data to train it, a costly endeavor.
OpenAI made a big decision to pivot toward these transformer models. And on March 11th, 2019, OpenAI announced it was creating a for-profit entity, so it could raise enough money to pay for all the compute power necessary to pursue the most ambitious AI models. "We want to increase our ability to raise capital while still serving our mission, and no pre-existing legal structure we know of strikes the right balance," the company wrote at the time. OpenAI said it was capping profits for investors, with any excess going back to the original nonprofit. Less than six months later, OpenAI took a one billion dollar investment from Microsoft.
Yeah. And I believe this is mostly, if not all, due to Sam Altman's influence, and him taking over here. So, you know, on the one hand, you can look at this sort of skeptically and say: okay, Sam, you took your nonprofit and you converted it into an entity worth thirty billion dollars today.
On the other hand, knowing this history now, this was kind of the only path they had. They had to raise money to get the computing resources to compete with Google, and Sam goes out and does these landmark deals with Microsoft.
Yeah, truly amazing. And their rationale at the time for why they're doing this is basically: this game is super expensive. We still have the same mission, to ensure that artificial general intelligence benefits all of humanity, but it's going to be ludicrously expensive to get there. And so we need to basically be a for-profit enterprise, and a going concern, and have a business that eventually funds our research to pursue that mission.
Yep. So in 2019, they do the conversion to a for-profit company. Microsoft invests a billion dollars, as you say, and becomes the exclusive cloud provider for OpenAI, which is going to become highly relevant here for NVIDIA. More on that in a minute. June 2020: GPT-3 comes out, and in September of 2020, Microsoft licenses exclusive commercial use of the underlying model for...
Microsoft products. 2021: GitHub Copilot comes out. Microsoft invests another two billion dollars in OpenAI. And then, of course, this all leads to November 30th, 2022, in Jensen's words, "the AI heard around the world." OpenAI comes out with ChatGPT, as you said, Ben, the fastest product in history to a hundred million users. In January 2023, this year, Microsoft invests ten billion dollars in OpenAI and announces they're integrating GPT into all of their products. And then in March of this year, GPT-4 comes out. And that basically catches us up to today.
We eventually need to do a whole other episode about all the details here of OpenAI and Microsoft. But for today, the salient points are: one, thanks to all this, generative AI as a user-facing product emerges as this enormous opportunity. Two, to facilitate that happening, you need an enormous amount of GPU compute, obviously benefiting NVIDIA. But just as important, three, it becomes obvious now that the predominant way companies are gonna access and provide that compute is through the cloud. And the combination of those three things turns out to be basically the single greatest moment that could ever happen for NVIDIA.
Yes. So, you're teeing all of this up, and so far I'm thinking: this is like the OpenAI and Microsoft episode? What does this have to do with NVIDIA? And God, there's a great NVIDIA story here to be told. So let's get to the NVIDIA side of it.
Alright listeners, our next sponsor is a new friend of the show, Huntress. Huntress is one of the fastest growing and most loved cybersecurity companies today. It's purpose-built for small and mid-sized businesses, and provides enterprise-grade security with the technology, services, and expertise needed to protect you.
They offer a revolutionary approach to managed cybersecurity that isn't only about tech: it's about real people providing real defense around the clock.
So how does it work? Well, you probably already know this, but it has become pretty trivial for an entry-level hacker to buy access and data about compromised businesses. This means cybercriminal activity toward small and medium businesses is at an all-time high.
So Huntress created a fully managed security platform for their customers, to guard against these threats. This includes endpoint detection and response, identity threat detection and response, security awareness training, and a revolutionary security information and event management product that actually just got launched. Essentially, it is the full suite of great software that you need to secure your business, plus 24/7 monitoring by an elite team of human threat hunters in a security operations center, to stop attacks that software-only solutions could sometimes miss. Huntress is democratizing security, particularly cybersecurity, by taking security techniques that were historically only available to large enterprises and bringing them to businesses with as few as ten, a hundred, or a thousand employees, at price points that make sense for them.
In fact, it's pretty wild: there are over 125,000 businesses now using Huntress, and they rave about it from the hilltops. They were voted by customers in the G2 rankings as the industry leader in endpoint detection and response for the eighth consecutive season, and the industry leader in managed detection and response again this summer.
Yeah. So if you want cutting-edge cybersecurity solutions, backed by a 24/7 team of experts who monitor, investigate, and respond to threats with unmatched precision, head on over to huntress.com/acquired, or click the link in the show notes. Our huge thanks to Huntress. Okay, so: NVIDIA.
Okay. So we just said these three things that we painted the picture of in the first part of the episode here. One: generative AI is possible, it's a thing, and it's now getting traction. Two: it requires an unbelievably massive amount of GPU compute to train. And three: it looks like the predominant way that companies are going to use that compute is going to be in the cloud. The combination of these three things is, I think, the most perfect example we've ever covered on this show of the old saying about luck being what happens when preparation meets opportunity, for NVIDIA here. So obviously, the opportunity is generative AI. But the preparation: NVIDIA has literally just spent the past five years working insanely hard to build a new computing platform for the data center.
The GPU-accelerated computing platform, to, in their minds, replace the old CPU-led, Intel-dominated x86 architecture in the data center. And for many years, I mean, they were getting some traction, right? The data center segment was growing for NVIDIA. But people were like: okay, you want this to happen, but why is it gonna happen, right?
There's these little workloads here and there that we'll toss you, Jensen, that we think can be accelerated by your GPUs. And then, you know, crazy things like crypto happen, and there were, like, AI researchers in academic labs that were using them as, you know, supercomputers. But for the longest time, with NVIDIA's data center segment, it just wasn't clear that organizations had enormous parts of their software stack that they were gonna shift to GPUs. Like, why? What's driving this? And now we know what could be driving it, and that is AI.
Not only could be, but if you look at their most recent quarter, absolutely freaking is.
Okay. So now it begs the question: why is AI driving it? And David, are you open to me giving a little computer science lecture on computer architecture?
Oh, please do.
Alright, I'll do my best professor impression here.
Man, I loved computer science in college. Those were my favorite classes.
I will say, doing these episodes, this one, TSMC, it really does bring back the thrill of being in a CS lecture, being like: oh, that's how that works. It's just really fun. So, let's take a step back and consider the classic computer architecture, the von Neumann architecture.
Now, the von Neumann architecture is what most computers, most CPUs, are based on today, where they can store a program in the computer's memory and run that program. You can imagine why this is the dominant architecture: otherwise, we'd need a computer specialized for every single task.
The key thing to know is that the memory of the computer can store two different things: the data that the program uses, and the instructions of the program itself, the literal lines of code. And in this example we're about to paint, all of this is wildly simplified, because I don't want to get into caching and speeds of memory and, you know, where memory is located or not located.
So let's just keep it simple. The processor in the von Neumann architecture executes a program written in assembly language, which is the language that compiles down to the byte code that the processor itself can speak. So it's written in an instruction set architecture, an ISA. From ARM...
For example, or Intel before that.
Yes. And each line of the program is very simplistic. So we're going to consider this example, where I'm going to use some assembly language pseudocode to add the numbers two and three to equal five.
Are you gonna write the program live, on Acquired?
Well, pseudo assembly language code. So the first line is: we're going to load the number two from memory. We're going to fetch it out of memory, and we're going to load it into a register on the processor.
So now we've got the number two actually sitting right there on our CPU, ready to do something with. That's line of code number one. Line two: we're going to load the number three, in exactly the same fashion, into a second register. So we've got two CPU registers with two different numbers. The third line: we're going to perform an add operation, which performs the arithmetic to add the two registers together on the CPU, and store the value either in some third register or into one of those existing registers.
So that's a more complex instruction, since it's arithmetic that we actually have to perform. But these are the things that CPUs are very good at: doing math operations on data fetched from memory. And then the fourth and final line of code in our example: we're going to take that five that has just been computed, and is currently held temporarily in a register on the CPU, and we're going to write it back to an address in memory. So the four lines of code are: load, load, add, store.
Sounds familiar to me.
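(Written out, that toy program might look something like this, a hypothetical pseudo-assembly sketch for illustration, not any real instruction set:)

```
LOAD  R1, [A]      ; cycle 1: fetch the value 2 from memory address A into register R1
LOAD  R2, [B]      ; cycle 2: fetch the value 3 from address B into register R2
ADD   R3, R1, R2   ; cycle 3: arithmetic on the CPU: R3 = 2 + 3 = 5
STORE [C], R3      ; cycle 4: write the result 5 from R3 back to memory address C
```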
So you can see: each of those four steps is capable of performing one, and only one, operation at a time. And each of these happens within one cycle of the CPU. So you've heard of gigahertz: that's the number of cycles per second.
So a one gigahertz computer could handle the simple program that we just wrote 250 million times in a single second. But you can see something going on here: three of our four clock cycles are taken up by loading and storing data to memory. Now, this is known as the von Neumann bottleneck, and it is one of the central constraints of AI, or at least it has been, historically.
Each step must happen in order, and only one at a time. So in this simple example, it actually would not be helpful for us to add a bunch more memory to this computer; it couldn't do anything with it.
It's also only incrementally helpful to increase the clock speed. If I double the clock speed, I can only execute the program twice as fast. If I need, like, a million-x speedup for some AI workload that I'm doing, I'm not going to get there with just a faster clock speed. That's not going to do it. And it would, of course, be helpful to increase the speed at which I can read and write to memory, but I'm kind of bound by the laws of physics there. There's only so fast that I can transmit data over a wire.
Now, the great irony of all of this is that the bottleneck actually gets worse over time, not better, because the CPUs get faster and the memory size increases, but the architecture is still limited to this one pesky single channel, known as a bus. You don't actually get to enjoy the performance gains nearly as much as you should, because you're jamming everything through that one channel, and it only gets used one time per clock cycle.
So the magical unlock, of course, is to make a computer that is not a von Neumann architecture: to make programs executable in parallel, and massively increase the number of processors, or cores. And that is exactly what NVIDIA did on the hardware side, and what all these AI researchers figured out how to leverage on the software side. But interestingly, neither of those is the constraint anymore, David. The constraint is not the clock speed or the number of cores anymore; for these absolutely enormous language models, it's actually the amount of on-chip memory.
That's the constraint, I think. And this is why the data center, what NVIDIA has been doing there, is so important.
Yes. There's an amazing video on the Asianometry YouTube channel that we'll link to, which we also linked to on the TSMC episode. But the constraint today is actually how much high-performance memory is available on the chip. These models need to be in memory, all at the same time, and they take up hundreds of gigabytes.
And while memory has scaled up, I mean, to flash all the way forward here, the H100's on-chip RAM is like eighty gigabytes, memory hasn't scaled up nearly as fast as the models themselves have scaled in size.
The memory requirements for training AI are just obscene. So it becomes imperative to network multiple chips, and multiple servers of chips, and multiple racks of servers of chips, together into one single computer, and I'm putting "computer" in air quotes there, in order to actually train these models. It's also worth noting, we can't make the memory chips any bigger, because of the constraints of the extreme ultraviolet photolithography, the EUV that we talked about on the TSMC episode. Chips are already the full size of the reticle. It's a physics and wavelength constraint. You really can't etch chips any larger without some new invention that we don't have commercially viable yet. So what it ends up meaning is: you need huge amounts of memory, very close to the processors, all running in parallel, with the fastest possible data transfer. And again, this is a vast oversimplification, but you kind of get the idea of why all of this becomes so important.
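(To put rough numbers on that, here's our own back-of-envelope sketch, not figures from the episode:)

```python
# Just holding a big model's weights in 16-bit floats dwarfs a single
# GPU's on-chip memory, before even counting gradients, optimizer state,
# and activations, which training also needs.
params = 175e9                    # a GPT-3-class model: 175B parameters
bytes_per_param = 2               # fp16/bf16 weights
weights_gb = params * bytes_per_param / 1e9
hbm_per_gpu_gb = 80               # H100-class on-package memory

print(f"weights alone: {weights_gb:.0f} GB")                               # ~350 GB
print(f"GPUs needed just to hold them: {weights_gb / hbm_per_gpu_gb:.1f}") # ~4.4
```

And that's inference-level math; training multiplies the memory needed several times over, hence networking whole racks of chips together.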
Okay. So back to the data center. And here's what NVIDIA is doing that I don't think anybody else out there is doing, and why it's so important that all of this new generative AI world, this new computing era, as Jensen dubs it, runs in the data center.
So NVIDIA has done three things over the last five years. One, and probably most importantly, related to what you were talking about, Ben: they made one of the best acquisitions of all time, back in 2020, and nobody had any idea. They bought a quirky little networking company out of Israel called Mellanox.
Well, IT wasn't little. They paid seven .
billion dollars. Okay, yeah. And IT was already a public company.
right? It was, yeah.

Yeah, but it was definitely quirky. And what was Mellanox? Mellanox's primary product was something called InfiniBand, which we talked about a lot

with Chase Lochmiller, from Crusoe, on our ACQ2 episode. And actually, InfiniBand was an open standard managed by a consortium. There were a bunch of players in it. But the conventional wisdom was: well, InfiniBand is way faster, way higher bandwidth, a much more efficient way to transfer data around the data center — but at the end of the day, Ethernet is the lowest common denominator, and everyone has to implement Ethernet anyway.

And so most companies actually exited the market, and Mellanox was kind of the only InfiniBand spec provider left. Yeah.
So, you said it: what is InfiniBand? It is a competing standard to Ethernet. It is a way to move data between racks in a data center. And back in 2020, everybody was like, Ethernet's fine.

Why do you need more bandwidth than Ethernet between racks in a data center? What could ever require 3,200 gigabits a second of bandwidth running down a wire in a data center? Well, it turns out, if you're trying to address hundreds — maybe more than hundreds — of GPUs as one single compute cluster to train a massive AI model, yeah, you want really fast data interconnects between them, right?

People thought: oh, sure, for supercomputers, for these academic purposes. But what the enterprise market needs in my shared cloud computing data center is Ethernet, and that's fine. Most workloads are going to happen right there on one rack.

And maybe, maybe, maybe things will expand to multiple computers on the rack, but certainly they won't need to network multiple racks together. And NVIDIA steps in, and you've got Jensen saying: hey, dummy, the data center is the computer. Listen to me when I tell you the whole data center needs to be one computer. And when you start thinking that way, you start thinking: jeez, we're really going to be cramming huge amounts of data through the wires that run between these racks. How can we think about all of that as if it's on-chip memory — or as close as we can make it to on-chip memory — even though it's in a box three feet away?
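A quick bit of arithmetic shows why the wire speed dominates here — the payload size is illustrative; the link speeds are the ones quoted above:

```python
payload_gb = 100                 # say, 100 GB of model state to exchange
payload_bits = payload_gb * 8e9

links = [("10 Gb/s Ethernet", 10),
         ("400 Gb/s InfiniBand link", 400),
         ("3,200 Gb/s aggregate fabric", 3200)]

for name, gbps in links:
    seconds = payload_bits / (gbps * 1e9)
    print(f"{name}: {seconds:.2f} s")
# 80 s vs. 2 s vs. 0.25 s. Training repeats exchanges like this over and
# over, so the slowest link effectively sets the whole cluster's pace.
```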
So that's piece number one of NVIDIA's grand data center plan over the last five years. Piece number two: in September 2022, NVIDIA makes a quite surprising announcement of a new chip. Not just a new chip — an entirely new class of chips that they're making, called the Grace CPU processor. NVIDIA is making a CPU. This is like heresy. But, Jensen —

I thought all computing was going to be accelerated! What are we doing here with these Arm CPUs?

Yeah, these Grace CPUs are not for putting in your laptop. They are the CPU component of your entire data center solution, designed from the ground up specifically to orchestrate with these massive GPU clusters.

This is the endgame of a process that has been in motion for thirty years. Remember when the graphics card was subservient to the PCIe slot on Intel's motherboard? Then eventually, you know, we fast-forward to the future,

and NVIDIA makes these GPUs that live in beautiful standalone boxes in your data center — or perhaps these little workstations that sit next to you while you're doing graphics programming, while you're directly programming your GPU. And then, of course, they need some CPU to put in that box, so they're using AMD or Intel, or they're licensing some CPU.

And now they're saying: you know what, we're actually just going to do the CPU, too. So now we make a box, and it's a fully integrated NVIDIA solution, with our GPU, our CPU, our NVLink between them, our InfiniBand networking out to other boxes. And, you know, welcome to the show.
One more piece to talk about — the third leg of the stool in their strategy — before we get to what it all means, which I think is where you're about to go. Spoiler alert: you say solution, I hear gross margin. The third part of it is the GPU.

Up until NVIDIA's current GPU generation — the Hopper generation of GPUs for the data center — there was only one GPU architecture at NVIDIA. That same architecture, those same chips from the same wafers made at TSMC: some of them went into consumer gaming graphics cards, and some of them went into A100 GPUs in the data center.

It was all the same architecture. Starting in September of 2022, they broke the two business lines out into different architectures. So there's the Hopper architecture, named after the great computer scientist Grace Hopper, who I believe was a rear admiral in the U.S.

Navy. Grace Hopper: you get a Grace CPU, a Hopper GPU — the H100 — Grace Hopper.

And that was for the data center. Then, on the consumer side, they start a whole new architecture called Lovelace, after Ada Lovelace. And that is the RTX 40 series. If you buy a top-of-the-line RTX 40-whatever gaming card right now, that is no longer the same architecture as the H100s that are powering ChatGPT. It's got its own architecture. This is a really big deal, because what they do with the Hopper architecture is start using what's called chip-on-wafer-on-substrate: CoWoS.
When you start talking to the real semiconductor nerds, that's when they start busting out the CoWoS conversation.

This is where a certain section of listeners gets really excited. So, essentially, what this is — back to this whole concept of memory being so important for GPUs and for AI workloads — is a way to stack more memory on the GPU chips themselves, essentially by going vertical in how you build the chips.

This is the absolute bleeding-edge technology coming out of TSMC. And by forking their chip architectures, so that the gaming segment doesn't use this latest CoWoS technology, NVIDIA gets to monopolize a huge amount of TSMC's capacity to make CoWoS chips specifically for these H100s. And that allows them to have way more memory than other GPUs on the market.
Yes. So this gets to the point of: why can't they seem to make enough chips right now? Well, it's literally a TSMC capacity problem. There are these two components — and they're extremely related — that you're talking about: the CoWoS, chip-on-wafer-on-substrate, and the high-bandwidth memory.

There's a great post from SemiAnalysis where the author points out that a 2.5D chip is basically how you assemble this stuff to get the memory really close to the processor. And of course, 2.5D is literally 3D, but "3D" means something else — that's even more three-dimensional.

So they came up with this 2.5D naming. Anyway, the 2.5D chip packaging technology from TSMC is where you take multiple active silicon dies — the logic chips and the stacks of high-bandwidth memory — and stack them up on one piece of silicon.

There's more complexity here, but the important thing is: CoWoS is the most popular technology for packaging these GPU and AI accelerator chips, and it's the primary method to co-package high-bandwidth memory.

Again, think back to the thing that's most important right now: getting as much high-bandwidth memory as you can, as close as possible to the logic, to get the most performance for training and inference.

So CoWoS represents, right now, about ten to fifteen percent of TSMC's capacity, and many of the facilities are custom-built for exactly these types of chips they're producing. So when NVIDIA needs to reserve more capacity, there's a pretty good chance they've already reserved some large part of that ten to fifteen percent of TSMC's total footprint, and TSMC needs to, like, go build more fabs in order for NVIDIA to have access to more CoWoS-capable capacity.
Yeah. Which, as we know, takes TSMC years to do.

Yep. There are more experimental things happening, too. I would be remiss not to mention there are actually experiments in doing compute in memory. As we shift away from von Neumann, sort of all bets are off, and we're open to new computing architectures. There are people exploring: what if we just process the data where it is, in memory, instead of doing the very lossy, expensive, energy-intensive thing of moving data over the copper wire to get to the CPU? All sorts of trade-offs in there. But it is very fun to dive into the academic computer science world right now, where they really are rethinking, like, what is a computer?
So, with these three things that NVIDIA has been building — the dedicated Hopper data center GPU architecture, the Grace CPU platform, and the Mellanox-powered networking stack — they now have a full-suite solution for generative AI data centers. And when I say solution —

I hear margins. But let's be clear: you don't need to offer some sort of solution to get high margins right now. NVIDIA's price is set where supply meets demand, and they are adding as much supply as they possibly can right now. Believe me — for all sorts of reasons, NVIDIA wants everyone who wants H100s to have H100s. But for now, the price is kind of like, "I'll write you a blank check, NVIDIA, and you write whatever you want on the check." Their margins are crazy right now literally because there's way more demand than supply for these things. Yes.
Okay. So let's break down what they're actually selling. Of course, you can — and lots of people do — just go buy H100s: I don't care about the Grace CPU, I don't care about this Mellanox stuff, I'll run them in my own data center.

Right. And the people most likely to do that are the hyperscalers — or, as NVIDIA refers to them, the cloud service providers.

This is AWS, this is Azure, this is Google, this is Facebook for their internal use.

NVIDIA, don't give me one of these DGX servers that you assemble. Just give me the chips, and I will integrate them.

I am a world-class data center architect and operator. I don't want your integration, I just want your chips. So they sell a lot of those.

Now, NVIDIA, of course, has also been seeding new cloud providers out there in the ecosystem — like our friends at Crusoe, also CoreWeave and Lambda Labs, if you've heard of them. These are all new GPU-dedicated clouds NVIDIA is working closely with. So they're selling H100s — and A100s before that —

to all these cloud providers. But let's say you are an arbitrary company in the Fortune 500 that is not a technology company, and, my god, do you not want to miss the boat on generative AI, and you've got a data center of your own. Well, NVIDIA has a DGX for you.
Yes, they do. A full GPU-based supercomputer solution in a box. You can just plug it into your data center, and it just works. There is nothing else on the market like this.

And it all runs CUDA. It all speaks the exact language of the entire ecosystem of developers who know exactly how to write software for the thing,

which means that whatever developers you already had working on AI or anything else, everything they were working on is just going to come right over and run on your brand-new, shiny AI supercomputer, because it all runs CUDA. Amazing. More on CUDA in a minute. But as we said: you say solution, I hear gross —

— margin. NVIDIA sells these DGX systems for like a hundred fifty to three hundred thousand dollars a box. That's wild. And now, with all three new legs of the stool — Hopper, Grace, and Mellanox — these systems are just getting way more integrated, way more proprietary, and way better.
So, if you want to buy a new top-of-the-line DGX H100 system, the price starts at five hundred thousand dollars for one box. And if you want to buy the DGX GH200 SuperPOD — this is the AI wall that Jensen recently unveiled, the huge, like, room full of AI —

it's like twenty racks wide. Imagine an entire row in a data center.

Yes. This is two hundred fifty-six Grace Hoppers in DGX racks, all connected together in one wall. They're billing it as the first turnkey AI data center that you can just buy, and it can train a trillion-parameter, GPT-4-class model. The pricing on that is "call us," of course, but I'm imagining hundreds of millions of dollars. I doubt it's a billion, but hundreds of millions, easily. Wild.
Well, let's do the H100 baseball card right here, on this insane thing they've built. So: they launched it in September 2022. It's the successor to the A100. One H100 GPU costs forty thousand dollars. So that's how you get to the price point

you're talking about. That's what they're selling to Amazon and Google and Facebook.

Right. And you mentioned that five-hundred-thousand-dollar price point: the five hundred thousand dollars is eight of the forty-thousand-dollar H100s in a box, with the Grace CPU and, you know, the nice bow around it all. Yep.
So do the math on that. Eight times forty thousand: that's three hundred twenty thousand dollars. So that's essentially an extra hundred eighty thousand dollars of margin that NVIDIA is getting out of selling the solution. It's an Arm CPU — it doesn't cost them that much to make.

And these forty-thousand-dollar H100s have margin of their own. So every time they bundle more, there's more margin in the fully assembled thing. I mean, that's literally bundle economics.

You are entitled to margin when you bundle more things together and provide more value for customers. But just to illustrate the way this pricing works: the reason you want an H100 is that it's thirty times faster than an A100 — which, mind you, is only like two and a half years old.

It is nine times faster for AI training. The H100 is literally purpose-built for training LLMs, and stuff like the full-self-driving video workloads is super easy to scale up.
It's got eighteen and a half thousand CUDA cores. You know how we were talking about the von Neumann example earlier — like, that is one computing core that is able to handle, you know, those four assembly-language instructions? This one H100, which they're calling a GPU, has eighteen and a half thousand cores that are capable of running CUDA software.

It's got six hundred forty Tensor Cores, which are highly specialized for matrix multiplication. It has eighty streaming multiprocessors. So what are we up to here? Close to twenty thousand unique cores on this thing.

It's got meaningfully higher energy usage than the A100. I mean, a big takeaway here is that NVIDIA is massively increasing the power requirement every time they come out with the next generation. They are both figuring out how to push the edge of physics, and also constrained by physics.

Some of this stuff is only possible with way more energy. This thing is seventy pounds — this is one H100 system. Jensen makes a big deal about this

in every keynote that he gives. Like, it's got

a quarter trillion transistors across thirty-five thousand parts. It requires robots to assemble it — physical robots — and it requires AI to design it. They're actually using AI to design the chips themselves. I mean, they have completely reinvented the notion of what a computer is. Totally.
And this is all part of Jensen's pitch here to customers: yes, our solutions are very expensive. However — and he uses this line that he loves — the more you buy,

the more you save. If you could get your hands on some, right?

But what he means by that is, like: okay, say you're McDonald's, and you're trying to build a generative AI so that, I don't know, customers can order or something — you're using it in your business. If you were going to try and build and run that on your existing data center infrastructure, it would take so much time and cost you so much more over the long run in compute than if you just went and bought my SuperPOD here. You can plug and play, and have it up and running in a month.

Yep. And because this is all accelerated computing, the things you'll be doing on it you literally wouldn't be able to do otherwise — or they might take you a lot more energy, a lot more time, a lot more cost. There is a very valid story that buying and running your workloads here — or renting from any of the cloud service providers and running your workloads there — is more performant: the results just happen much faster, much cheaper, or at all. Yep.

And energy — this is also Jensen's argument — yes, these things take a ton of energy, but the alternative takes even more energy. So we're actually saving energy, if you assume this stuff is going to happen anyway. Now, there's a bit of a caveat there, in that it can't happen except on these types of machines — so he enabled this whole thing. But it's a point.
Oh, I totally buy it, though. I mean, I think there's a very real case around: look, you only have to train a model once, and then you can do inference on it over and over again. The analogy that I think makes a lot of sense for model training is to think about it as a form of compression.

LLMs are turning the entire internet of text into a much smaller set of model weights. That has the benefit of storing a huge amount of usefulness in a small footprint, and it also makes the compute relatively inexpensive — again, relatively speaking — in the inference step, every time you need to prompt that model for an answer. Of course, the trade-off you're making there is that once you encode all the training data into the model, it is very expensive to redo it.

So you'd better do it right the first time, or figure out little ways to modify it later — which a lot of ML researchers are working on. But I always think a reasonable comparison here is compressing a zillion-layer Photoshop file, for anybody who's ever dealt with one. Oh, I've got a three-gigabyte Photoshop file?

Well, that's not a thing you're going to send to a client. You're going to compress it into a JPEG and send them that. And the JPEG is, in many ways, more useful as a compressed facsimile of the original layers comprising the Photoshop file.

But the trade-off is, you can never get from that compressed JPEG back to the original thing. So I think the analogy here is: you're saving everyone from needing to make the full PSD every time, because you can just use the JPEG the vast, vast majority of the time.
So hopefully we've now painted a relatively coherent picture of both the advances that made the generative AI opportunity possible — that it has truly become a real opportunity — and why NVIDIA, even beyond the obvious reasons, was just so well positioned here: particularly because of the data-center-centric nature of these workloads, and the fact that they had been working so hard for the past five years to fundamentally rearchitect the data center.
Yep. So, on top of all this, NVIDIA recently announced yet another pretty incredible piece of their cloud strategy. So today, like we've been saying, if you want to use A100s and H100s — say you're an AI startup —

the way you're probably going to do that is you're going to go to a cloud, either a hyperscaler or a dedicated GPU cloud like Crusoe, CoreWeave, or Lambda Labs, and you're going to rent your GPUs. Ben, you did some research on this. So, what does that cost?

I just looked at the pricing pages on the public clouds today — I think Azure and AWS are where I looked. You can get access to a DGX-style server, that's eight A100s, for about thirty bucks an hour. Or you can go on AWS and get a p5.48xlarge instance, which is eight H100s — which I believe is an HGX server — for about one hundred dollars an hour, so about three times as much.
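Normalizing those quoted rates per GPU makes the comparison clearer — the hourly figures are the rough ones just quoted, and actual cloud pricing varies by region and commitment:

```python
a100_server_per_hr = 30.0    # ~8x A100 instance, as quoted above
h100_server_per_hr = 100.0   # ~8x H100 (p5-class) instance, as quoted above
gpus_per_server = 8

print(f"A100: ${a100_server_per_hr / gpus_per_server:.2f} per GPU-hour")
print(f"H100: ${h100_server_per_hr / gpus_per_server:.2f} per GPU-hour")
# Roughly $3.75 vs. $12.50 per GPU-hour: about a 3x premium for the
# newer architecture -- assuming you can get the capacity at all.
```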
And again, when I say you can "get access," I don't actually mean you can get access. I mean, that's the price.

If you could get access, that's what you would pay for it. Correct. Okay, that's just renting the GPUs. But say you buy into everything we were talking about a minute ago — say you're McDonald's or UPS or whoever, and you're like, you know, I really like Jensen.

I buy what you're selling. I want this whole integrated package. I want an AI supercomputer in a box that I can plug into my wall and have it run.

But I'm all in on the cloud. I don't run my own data centers anymore. Well, NVIDIA has now introduced DGX Cloud.
Yeah. And of course, you could rent these instances from Amazon, Microsoft, Google, or Oracle.

But, like, you're not getting that fully integrated solution, right? And —

you're getting some integration — the way the cloud service provider wants to create the integration, using their proprietary services. And, to be honest, you might not have the right people on staff to deal with this stuff in a pseudo-bare-metal way. Even if it's not in your data center and you're renting it from the cloud, you might actually need — based on your workforce — to just use a web browser: a really nice, easy web interface where you load in some models from a trusted source, easily pair them with your data, and just click run — and not have to worry about any of the complexity of managing a cloud application in Amazon or Microsoft, something a little bit scarier and closer to the metal.
Yeah. So NVIDIA has introduced DGX Cloud, which is a virtualized DGX system that is provided to you — right now — via other clouds: Azure and Oracle and Google.

Right. The boxes are sitting in the data centers of these other clouds.

They're sitting in the other cloud service providers' data centers. But as a customer, it looks like you have your own box that you're renting.

You log into the DGX Cloud website through NVIDIA, and it's all nice WYSIWYG stuff. There's an integration with Hugging Face, where you can easily deploy models right off of Hugging Face. You can upload your data. Everything is just really plug-and-play, is probably the way to describe it.
This is unbelievable. NVIDIA launched their own cloud service through other clouds. NVIDIA does have,

I think, six data centers of their own, but I don't believe that's what they're actually using to back DGX Cloud.

No. So, the starting price for DGX Cloud is thirty-seven thousand dollars a month, which will get you an A100-based system — not an H100-based system. So the margins on this are insane for NVIDIA and their partners.

A listener helped us out and estimated that the cost to actually build an equivalent A100 DGX system today would be something like a hundred and twenty K. Remember, this is the previous generation — these are not H100s — and you can rent it for thirty-seven K a month. So that's roughly a three-month payback on the capex for NVIDIA and their cloud partners together. And even more important for NVIDIA, longer-term: for the enterprises that buy this, NVIDIA now has a direct sales relationship with those companies, not intermediated by sales through Azure or Google or AWS — even though the computer is sitting in their clouds.
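Quick sanity check on that payback claim, using the listener's rough numbers — these are estimates, not disclosed figures, and they ignore power, cooling, staffing, and utilization:

```python
build_cost = 120_000      # estimated cost to build an equivalent A100 DGX
rent_per_month = 37_000   # DGX Cloud starting price

payback_months = build_cost / rent_per_month
print(f"Payback: ~{payback_months:.1f} months")   # ~3.2 months
# Everything after roughly month four is gravy, split between
# NVIDIA and the hosting cloud partner.
```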
And that direct relationship is crucially important, because at this point — the CFO, Colette Kress, said this on their last earnings call — about half of the revenue from the data center business unit is CSPs, and then I believe after that it's consumer internet companies, and after that it's enterprises. So there are a few interesting things in there. One of which is: oh my god, their revenue for this is concentrated among like five to eight companies, the CSPs. Two:

they don't necessarily own the customer relationship. They own the developer relationship through CUDA — they've got this unbelievable moat of NVIDIA developers that's stronger than ever — but in terms of the actual customer, half of their revenue is intermediated by the cloud providers. And the other interesting thing is that even today, in this AI explosion, the second-biggest segment of data center revenue is still the consumer internet companies.

It's still that stuff you were talking about before: the uses of machine learning to figure out what should show up in your social media feeds, and to match ads to you. That's actually bigger than all of the direct enterprises buying from NVIDIA. So the DGX Cloud play is a way to sort of shift some of that CSP revenue into direct-relationship revenue.
So all of this brings us to 2023. In May of this year, NVIDIA reported their Q1 fiscal '24 earnings — NVIDIA's on this weird January fiscal year, so Q1 fiscal '24 is essentially Q1 calendar '23 — in which revenue was up nineteen percent quarter over quarter to 7.2 billion dollars. Which is great, because remember, they'd had a terrible end of 2022, with the write-off, crypto falling off a cliff, and all that.

Yes. It's amazing: in that Stratechery interview — when was that, March of 2023? — Jensen said last year was unquestionably a disappointing year. This is the year ChatGPT was released! It is wild, the roller coaster this company has been on.

The time frame is so compressed here.

And part of that, of course, is Ethereum moving to proof of stake — the end of the crypto thing for NVIDIA, which I'm sure they were actually thrilled about. But part of it was they had also put in eight hundred million of preorders for capacity with TSMC that they then thought they weren't going to need, so they had to write it down. From an accounting perspective, that looked like a big loss — a really big blemish on their financials last year. But now, oh my god, are they glad they reserved that capacity.
Yep, it's turning out to be quite valuable. So, speaking of: this Q1 earnings is like, great — up nineteen percent quarter over quarter. But then they drop the bombshell: due to unprecedented demand for generative AI compute in data centers, NVIDIA forecasts Q2 revenue of eleven billion dollars, which would be up another fifty-three percent quarter over quarter from Q1, and sixty-five percent year over year. The stock goes up

twenty-five percent in after-hours trading. This is a trillion-dollar company — or at least, this is what made them a trillion-dollar company. A company that was previously valued at around eight hundred billion dollars popped twenty-five percent after earnings.

Well, even crazier: back when we did our episodes last April, NVIDIA was the eighth-largest company in the world by market cap, at about six hundred sixty billion dollars. That was down slightly off the highs, but that was the order of magnitude back then. It then crashed down below three hundred billion,

and then, within a matter of months, it's back up over a trillion. Just wild. And then all of this culminated last week, at the time of this recording, when NVIDIA reported Q2 fiscal '24 earnings. We usually don't talk about individual earnings releases on Acquired, because in the long arc of time, who cares? But this was a historic event. I think this was one of — if not the — most incredible earnings releases by any at-scale public company ever. Seriously, no matter what happens going forward, last week was a historic moment.
The thing that blows my mind the most is that their data center segment alone did ten billion dollars in the quarter. That's more than doubling off of the previous quarter — in three months, they grew from four-ish billion to ten billion of revenue in that segment. And revenue only happens when they deliver products to customers. This isn't preorders.

This isn't clicks. This isn't wave-your-hands-around stuff. This is: we delivered stuff to customers, and they paid us six billion dollars more this quarter than they did last quarter.

So here are the full numbers for the quarter: total company revenue of 13.5 billion dollars, up eighty-eight percent from the previous quarter and over a hundred percent from a year ago. And then, like you said, in the data center segment: revenue of 10.3 billion.

So, 10.3 out of 13.5, for a segment that basically didn't exist for the company five years ago. That's up one hundred forty-one percent from Q1, and one hundred seventy-one percent from a year ago. This is ten billion dollars. That kind of growth, at this scale — I've never seen anything like it. No, and neither has the

market, right?
And so — this is the first time I noticed it; Jensen talked about it in Q1 earnings too, so it wasn't the first time — he brings back the trillion-dollar TAM. Not in a slide; I think this time he just talks about it. No,

but in a new way, which I think is a better way to slice it.

This time is different. Look, we'll spend a while here talking about what we think about this, but this is very different. This time, he frames NVIDIA's trillion-dollar opportunity as the data center. And this is what he says: there is one trillion dollars' worth of hard assets sitting in data centers around the world right now,

growing at two hundred fifty billion a year.

The world will spend on data centers to update and add to that — capex of two hundred fifty billion dollars a year — and NVIDIA has certainly the most cohesive, fulsome, and coherent platform to be the future of what those data centers are going to look like, for a large amount of compute workloads. This is a very different story than, like, "we're going to get one percent of this hundred-trillion-dollar industry out there."
And the thing you have to believe now — because whenever someone paints a picture, you say, okay, what do I have to believe? — the thing you have to believe is: there is real user value being created by these AI workloads and the applications they enable.

And there's pretty good evidence. I mean, ChatGPT made it: OpenAI is rumored to be doing over a billion dollars of run rate right now — maybe multiple single-digit billions — and still growing meaningfully. And so that is the shining example; again, that's the Netscape Navigator of this whole boom.

But the bet, especially with all these Fortune 500s, is that there are going to be GPT-like experiences in everyone's private applications, and in a zillion other public interfaces. I mean, Jensen frames it as: in the future, every application will have a GPT front end. It will be a more natural way that you decide you want to interact with computers. And I don't think he just means versus clicking buttons. I think he means everyone gets to become a programmer — but the programming language is English.

And so when you're sort of like, why is everyone spending all of this money? It's that the world's executives — the ones with the purchasing power to go write ten billion dollars of checks a quarter to NVIDIA for all this stuff — wholeheartedly believe, from the data they've seen so far, that this technology is going to change the world enough for them to make these huge bets. And the thing we don't know yet is: is that true? Are GPT-like experiences going to be an enduring thing for the far future, or not?
There's pretty good evidence so far that people like this stuff — that it is quite useful and is transforming the way, you know, everyone lives their lives, goes about their day-to-day, does their jobs, goes through school, and on and on. But that is the thing you have to believe. We want to thank our longtime friend of the show, Vanta, the leading trust management platform. Vanta, of course, automates your security reviews and compliance efforts — frameworks like SOC 2, ISO 27001, GDPR, and HIPAA compliance and monitoring. Vanta takes care of these otherwise incredibly time- and resource-draining efforts for your organization and makes them fast and simple.

Yeah, Vanta is the perfect example of the quote we talk about all the time here on Acquired, from Jeff Bezos: this idea that a company should only focus on what actually makes your beer taste better — i.e., spend your time and resources only on what's actually going to move the needle for your product and your customers, and outsource everything else that doesn't. Every company needs compliance and trust with their vendors and customers. It plays a major role in enabling revenue, because customers and partners demand it, yet it adds zero flavor to your actual product. Vanta takes care of

all of it for you. No more spreadsheets, no fragmented tools, no manual reviews to cobble together your security and compliance requirements. It is one single software pane of glass that connects to all of your services via APIs and eliminates countless hours of work for your organization. There are now AI capabilities to make this even more powerful, and they even integrate with over three hundred external tools. Plus, they let customers build private integrations with their internal systems.

And perhaps most importantly, your security reviews are now real-time instead of static, so you can monitor and share with your customers and partners to give them added confidence.

So whether you're a startup or a large enterprise, and your company is ready to automate compliance and streamline security reviews like Vanta's seven thousand customers around the globe — and go back to making your beer taste better — head on over to vanta.com/acquired and just tell them that Ben and David sent you.

And thanks to friend of the show, Christina, Vanta's CEO: all Acquired listeners get a thousand dollars of free credit at vanta.com/acquired. Okay — so, David: analysis. We've got to talk about CUDA before we start analyzing anything else here. We've talked about a lot of hardware so far on this episode, but there is a huge piece of the NVIDIA puzzle that we haven't talked about since
part two. And CUDA, as folks know, was the initiative started in 2006 by Jensen and Ian Buck and a bunch of other folks on the NVIDIA team to really make a bet on scientific computing: that people could use graphics cards for more than just graphics, and that they would need great software tools to help them do that. It also was the glimmer in Jensen's eye of: maybe I can build my own relationship with developers. You know, there can be this notion not of a Microsoft-and-Intel developer who just happens to have a standardized interface to my chip, but of my own developer ecosystem — which has been huge for the company.

So CUDA has become the foundation that everything we've talked about — all the AI applications — is written on top of today. You hear Jensen in these keynotes reference CUDA the platform, CUDA the language. And I spent some time — watching developer sessions and literally learning some CUDA programming — trying to figure out what is the right way to characterize

it, and what is the right way to characterize it today? Because it has evolved a lot.
Yes. So today, CUDA is — starting from the bottom and going up — a compiler, a runtime, a set of development tools like a debugger and a profiler. It is its own programming language:

CUDA C++. It has industry-specific libraries. It works on every card that they ship, and have shipped, since 2006, which is a really important thing to know: if you're a developer, your stuff works on everything — anything NVIDIA — all with this unified interface. It has many layers of abstraction and existing libraries that are optimized —

libraries of code that you can call to keep your development work short and simple, instead of reinventing the wheel. So, you know, there are things where you can decide you want to write in C++ and just rely on their compiler to make it run well on NVIDIA hardware for you. Or you can write stuff in their native language and implement things yourself in CUDA C++. The answer is: it's incredibly flexible.
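CUDA itself is C++, but just to give a flavor of the programming model being described here, below is a minimal sketch using Python's Numba CUDA bindings — our illustration, not NVIDIA sample code; it assumes an NVIDIA GPU and the numba package:

```python
import numpy as np
from numba import cuda

@cuda.jit
def saxpy(a, x, y, out):
    # Each GPU thread handles exactly one array element -- the
    # "thousands of cores, one simple program each" model.
    i = cuda.grid(1)
    if i < out.size:
        out[i] = a * x[i] + y[i]

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)
out = np.zeros_like(x)

# Launch roughly one thread per element, grouped into blocks.
threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
saxpy[blocks, threads_per_block](2.0, x, y, out)
```

The kernel body would look nearly identical in CUDA C++; the point of the platform is that scheduling that million-thread launch across the cores is NVIDIA's job, not yours.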
CUDA is very well supported, and there's a huge community of people developing alongside you and building stuff for you to build on top of. And if you look at the number of CUDA developers over time: it was released in 2006, and it took four years to get the first hundred thousand people.

Then, by 2019 — thirteen years in — they got to a million developers. Then, just two years later, they got to two million. So: thirteen years to add their first million, then two years to add their second. In 2022, they hit three million developers,

and then just one year later, in May of 2023, CUDA has four million registered developers. So at this point, there is a huge moat for NVIDIA. And I think when you talk to folks there — and frankly, when we did talk to folks there — they don't describe it this way. They don't think about it like, well, CUDA is our moat versus the competition.

It's more like: well, look, we envisioned a world of accelerated computing in the future. We thought there were way more workloads that should be parallelized to be made more efficient, that we want people to run on our hardware, and we need to make it as easy as possible for them to do that. And we're going to go to great lengths — and have one or two thousand people who work at our company be full-time software engineers — building this programming language, the compiler, the foundation, the frameworks, and everything on top of it, to let the maximum number of people build on our stuff.

That is how you build a developer ecosystem. It's different language, but the bottom line is: they have a huge reverence for the power that it gives them at the company.
This is something we touched on in our last episode, but it really crystallized for me in doing this one: NVIDIA thinks of themselves as — and I believe is — a platform company. Especially this week, after the blowout earnings and everything that happened this quarter with the stock and whatnot, a popular take out there that you've been seeing a lot is: oh, we've seen this movie before. This happened with Cisco.

You could say, over a longer time scale, this happened with Intel. Yeah — these hardware providers, these semiconductor companies: they're hot when they're hot and people want to spend capex, and then when they're not hot, they're not hot.

But I don't think that's quite the right way to characterize NVIDIA. They do make semiconductors, and they do make data center hardware, but really, they are a platform company. The right analogy for NVIDIA, also, is Microsoft. They make the operating system, they make the programming environment, they make many of the applications, right?

Cisco doesn't really have developers. Intel never had developers — Microsoft had developers, and Intel had Microsoft.
But Intel didn't have developers. NVIDIA has developers. I mean, they've built a new architecture that is not a von Neumann computer. They've bucked fifty years of progress.

Instead, every GPU has streaming multiprocessors, and, as you'd imagine, you need a whole new type of programming language and compiler and everything to deal with this new computing model. And that's CUDA. And it freaking works. And there are all these people whose livelihoods are built on it.

You talk to Jensen, and you talk to other people at the company, and they will tell you: we are a foundational computer science company. We're not just slinging hardware here. Yeah. I mean,

it's interesting. They're a platform company, for sure. They're also a systems company, effectively selling mainframes. It's not that different from IBM way back when: they're trying to sell you a hundred-million-dollar wall that goes in your data center, and it's all fully integrated,

and it all just works. Yeah, and maybe IBM actually is a really good analogy — old-school IBM. They make the underlying technology, they make the hardware, they make the silicon, they make the operating system for the silicon, they make the solutions for customers — they make everything, and they sell it as a solution. Yeah.
Okay. So, a couple of other things to catch us up here as we're starting analysis. One big point I want to make is: let's look at a timeline, because I didn't discover this until like two hours before we started recording.

In March of 2019, NVIDIA announced they were acquiring Mellanox for about seven billion dollars in cash. And I think Intel was considering the purchase, and NVIDIA came in and kind of blew them out of the water. And it is fair to say nobody really understood what NVIDIA was going to do there, and why it was so important.

But the question is: why? Well, NVIDIA knew that these new models coming out would need to run across multiple servers, multiple racks, and they put a huge level of importance on the bandwidth between the machines. And of course, how did they know that? Well, in August of 2019, NVIDIA released what was at the time the largest transformer-based language model, called Megatron:

8.3 billion parameters, trained on five hundred twelve GPUs for nine days, which at retail would have cost them something like half a million dollars — which, at the time, was a huge amount of money to spend on model training. And that's only four years ago. The point is: because they do a huge amount of research at the company, and they work with every other company doing AI research, they were like, ah yes, this stuff is going to work, and this stuff is going to require the fastest networking available. And I think that has a lot to do with why no one else saw how valuable the Mellanox technology could be.
Another thing that I want to talk about for NVIDIA's business today is this notion of "the data center is the computer." Jensen did a great interview with Ben Thompson last year where he talks about the idea that they build their systems full-stack — like, their dream is that you own and operate a DGX SuperPOD. And he says: we build our systems full-stack, but we go to market in a disaggregated way, integrating into the compute fabric of the industry.

So I think that's a way of saying: look, customers need to use us in a bunch of different ways, so we need to be flexible on that. But we want to build each of our components such that if you do assemble them all together, it's an unbelievable experience — and we will figure out how to provide the right experience to you if you only want to use us in piecemeal ways, or you want to use us in the cloud, or the cloud providers want to use us. Again: build the product as a system, build the system full-stack, but go to market in a disaggregated way.

And I think, if I remember that interview, Ben picked up on this and was like, wait — are you building your own cloud? And Jensen was like, well, maybe. And of course, then they launched DGX Cloud — in a "well, maybe, we'll see" sort of way.
Yeah. You could imagine there are more NVIDIA data centers likely on the way that are fully NVIDIA-owned and -operated. Speaking of all of this, we've got to talk about some of the numbers on margin. This last quarter, they had a gross margin of seventy percent, and they forecast next quarter to have a gross margin of seventy-two percent.

I mean, if you go back pre-CUDA, when they were a commoditized graphics card manufacturer, it was twenty-four percent. So we've gone from twenty-four to seventy on gross margin, and with the exception of a few quarters along the way, for those strange one-time events, it's basically been a linear climb, quarter over quarter, as they've deepened their moat and deepened their differentiation in the industry.

We're definitely at a place right now that I think is temporary, due to the supply shortage, where the world's enterprises — and in some cases even governments; you look at the U.K.,

or some of the Middle Eastern countries — are like, blank check: I just need access to NVIDIA hardware. That's going to go away. But I don't think this very high, you know, sixty-five-percent-plus margin is going to erode too much. Yes?
I mean, I think two things here. One: I really do believe what we were talking about a minute ago, that NVIDIA is not just a hardware company. They're not just a chip company.

They are a platform company, and there is a lot of differentiation in what they do. If you want to train GPT-4 or a GPT-4-class model, there's one option: you're doing it on NVIDIA. There's one option. And yes, we should talk about it — there's lots of less-than-GPT-class stuff out there that you can do, and especially inference,

which is more of a wide-open market versus training; you can do that on other platforms. But they're the best. And they're not just the best because of their hardware. They're not just the best because of their data center solutions. They're not just the best because of CUDA. They're the best because of all of those. So the other sort of illustrative thing for me that shows how wide their lead is — we haven't talked

about China yet. The land of the 800s.
Yes. So, what's going on there? Last year, sales to mainland China were twenty-five percent of NVIDIA's revenue, and a lot of that was selling to the hyperscalers, to the cloud providers in China: Baidu, Alibaba, and others.

And by the way, Baidu has potentially the largest model of anyone. Their GPT competitor is over a trillion parameters, and may actually be larger than GPT-4.

Wow, I didn't know that. Yeah. Well, so then — I believe also in September of 2022, last year — the Biden administration announced pretty sweeping regulations and bans on sales of advanced computing infrastructure. David —
They're export controls. Don't say bans.

I mean, yes, it's a fine line — this is pretty close to bans. With what the administration introduced, NVIDIA can no longer sell their top-of-the-line A100s or H100s to anybody in China. So they created nerfed SKUs, essentially, that meet the performance regulations: the A800 and the H800.

Which, I think, basically just crank down the NVLink data transfer speeds. So it's like buying a top-of-the-line A100, but without data connections as fast as you need — which basically makes it so you can't train large models,

or at least you can't do it as well or as fast as you could with the latest stuff. The incredibly telling thing is that those chips and those machines are still selling like hotcakes in China. They are still the best hardware and platform you can get in China, even in a crippled version. And I think that's true anywhere in the world.

And there has been an even more recent spike in them, because a lot of Chinese companies are reading the tea leaves and saying: U.S. export controls might get even more severe, so I should get these 800s while I still can.

Yeah. So, I mean, I can't think of a better illustration of just how wide their lead is.
Yeah, that's a great point. Talking about the rest of NVIDIA just for a moment — I mean, this episode is about the data center segment —

but they still make gaming cards.

They do. And it is worth talking about this idea that Omniverse is starting to look really interesting. As of their conference six months ago, they had seven hundred enterprises signed up as customers. And the reason this is interesting is that it could be where their two different worlds collide:

3D graphics with ray tracing — which is new and amazing, and the demos are mind-blowing — and AI. They've been playing in both of these markets, since the workloads are both massively parallel; that's the original reason for them to be in the AI market at all. If you recall, way back in part one, the original mission of NVIDIA was to make graphics a storytelling medium. And then their mission expanded as they realized: my god, our hardware is really good at other stuff that needs to be parallelized, too. But fascinatingly, with Omniverse, the future could actually look like applications where you need both amazing graphical capability and AI capability for the same application. I mean, on top of all the other amazing uniqueness about NVIDIA that we've been talking about, and how well positioned they are — they're the number one provider of graphics hardware and software, and of AI hardware and software, and, by the way, there's this huge application class emerging where you actually do need both. They're going to knock it out of the park if that comes true.
There was a super cool demo at a recent keynote — it might have been SIGGRAPH — where NVIDIA created a game environment: a fully ray-traced game environment that looks like a playable game, that looks amazing, basically indistinguishable from reality. Like, you've really got to look hard to tell that this isn't real, and that this isn't a real human you're talking to.

So there's a non-player character that you're talking to — an NPC who's giving you, like, a mission — and they show this demo, and it looks amazing, and they're like: the script, the words that character was saying to you? Not scripted. That was all generated with AI, dynamically. And you're like, holy crap. You know, you think about it: you play a video game, the characters are scripted. But in this world you're talking about, you can have generative-AI-controlled avatars that are unscripted, that have their own intelligence, and that drives the

story. Totally. Or, you know, an airplane in a simulation — not just of a wind tunnel, but simulating millions of hours of flying time, using the real-time weather that's actually going on in the world, and using AI to predict the weather in the future. So you can sort of know the real-world potential things that your aircraft could encounter, all in a generated graphical AI simulation.
I mean, there's going to be a lot more of this stuff to come. Totally. Another thing to know about NVIDIA that we really didn't talk about last episode: they're pretty employee-efficient.

They have twenty-six thousand employees, and that sounds like a big number. But for comparison, Microsoft, whose market cap is only twice as big, has two hundred twenty thousand. So that is five times

the number of employees per dollar of market cap going on over at Microsoft. This is a little bit farcical, since, you know, NVIDIA only recently has had such a massive market cap.

But the scale of the platform that NVIDIA is building is on the order of magnitude of Microsoft's scale.

Right. They have forty-six million dollars of market cap per employee. Wild. Crazy. Which I think
translates into the culture there. We've gotten to know some folks there, and it really is a very unique kind of culture. Like, it is a big-tech-scale company, but you never hear about the same kind of silly big-tech stuff that you hear about at other companies. At NVIDIA — as far as I know; I could be wrong on this —

there is no, like, work-from-home or return-to-the-office policy. It's like: no, you just do the job, and, you know, nobody's forcing anybody to come into the office here. And, like, they've accelerated their chip release cycles.

Well, I also get the sense that it's a "do your life's work or don't be here" situation. Like, Jensen is rumored to have forty direct reports, and his office is basically just an empty conference room, because he's just bouncing around so much, and he's on his phone, and he's talking to this person and that person. And, like, you can't manage forty people directly if you're worrying about someone's career ambitions.

He's talked about this. He's like: I have forty direct reports.

They are the best in the world at what they do. This is their life's work. I don't talk to them about their career ambitions — I don't need to. You know, yeah, for recent college grads we do mentoring. But if you are a senior employee — you've been here for twenty years, you're the best in the world at what you do — we're hyper-efficient. And I start my day at five a.m., seven

days a week, and you do, too. Crazy.
Yeah. There's actually this amazing quote from Jensen, from an interview with him that I was listening to. Towards the end of the conversation, it's like: Jensen, you and NVIDIA do all these amazing things — what do you do to relax? And — this is a direct quote — "I relax all the time.

I enjoy relaxing at work, because work is relaxing for me. Solving problems is relaxing for me. Achieving something is relaxing for me." And he's one hundred percent serious. Like, a thousand percent serious.

How old is Jensen?

The dude is sixty years old.
It kind of feels like all of his peers have either decided to retire and relax, or, you know, to relax while running their companies — there's a whole other crop of people doing that — and that is just not at all interesting to him, or what he's doing. And I get the sense he's got another thirty years in him, and he's architected the company in such a way that that's the plan.

I don't think there's anyone else there where they're, like, getting ready for that person to take over. I think the company is an extension of Jensen's thoughts and will and drive and belief about the future. And that's kind of what happens.

I don't know if there is or isn't a Jensen and Lori Huang foundation, but if there is, he's not spending his time on it. He's not buying sports franchises. He's not buying mega-yachts — or if he is, he isn't talking about it, and he's working from them. And

he's not buying social media platforms and newspapers.

Yeah, totally.
I mean, it is quite telling that when you watch one of their keynotes, it's Jensen on stage, plus some customer demos. It's not like the Apple keynotes, where Tim Cook calls up one Apple employee after another. It is the Jensen show.

Right. Nobody would accuse Tim Cook of not working hard, I don't think. But at those keynotes, Tim does the welcome and then the handoff, and, you know, a parade of other executives talk about stuff.
Good morning!

Tim Apple.

I love you.

I'd love to chat about the shoes sometime.

That would be a nice one. All right — Power?

Power.
All right. So, for listeners who are new to the show: this is the section where we talk about what it is about the company that enables it to achieve persistent differential returns — or, in other words, to be more profitable than its closest competitor, and to do so sustainably. And NVIDIA is fascinating, because they sort of have a direct competitor, but that's not the most interesting form of competition for them.

It's the disintermediation. Sure, ostensibly there's NVIDIA versus AMD, but AMD doesn't have all this capacity reserved at TSMC — at least not for the 2.5D packaging process for the high-end GPUs. AMD doesn't have the developer ecosystem from CUDA. They're the closest direct comp.

But it's Amazon building Trainium and Inferentia. It's: what if Microsoft decides to go build their own chip, as they're rumored to be doing with AMD? It's Google and the TPU. It's Facebook developing PyTorch, and then leveraging their foothold with PyTorch in the developer community to figure out how to extend underneath PyTorch. There are a lot of competitive vectors coming at NVIDIA, just not direct ones.

Not to mention all the data center hardware providers that are their direct competitors now, too. Yep — and so on down the line. Yep.
Now, all that said, they've got a lot of powers. So, as we move through these one by one, I think let's just name them all, and we can decide whether there's something to talk about for each. Counter-positioning is the one where I actually don't think there is anything here. I don't think there is anything NVIDIA does where another company is actively choosing not to do the same thing — because any company would want to be NVIDIA right now.

I would have agreed with you, but I actually think there is strong counter-positioning in the data center world. NVIDIA and Jensen put a flag in the ground several years ago where they said: we are going to rearchitect the data center. And all the existing data center hardware and compute providers had strong incentives not to do that.

But, like, right now — what do you think other data center hardware providers —

what are they not doing? Yeah, they're all trying to put GPUs in the data center now, too.

Everyone's just chasing exactly what NVIDIA is already doing. That's the market right now. Yeah, okay, right. And the question is: will NVIDIA be able to stay ahead in ways that matter? That, I think, is the entire analysis on the company right now. In the ways that matter to customers, at large scale and in large markets,

will they be able to sustainably stay ahead of the people who are chasing them and trying to copy what they're doing — because the margin profile is so fat, and people don't want to pay it? So, the second one: scale economies. This has CUDA written all over it. You can make massive fixed-cost investments when you have the scale to amortize that cost across.
And when you have four million developers who want to develop on your platform, you can justify whatever IT is. Sixteen hundred people who actively on linked in at and video today have the word CUDA in their job title. I mean, i'm sure it's actually even more than that who just started.
You know, they are saying software something like that. But thousands of people of an investment that they don't make any money on software, they may they make a diminished amount on software. But that is immoral, zed, across the entire developer base.
I think it's worth saying a bit more here on this one too, which we also talked about on our last episode. To me, the dynamics here are a lot like Apple and iOS versus Android.
Apple has thousands and thousands of developers working on iOS. Android also has thousands and thousands of developers working on it, across a widespread ecosystem. But at Apple, it's all tightly controlled and coupled with the hardware, and Android is not. As a user, maybe you'll get the latest operating system update, maybe you won't.
I think this is exactly the right framing, that NVIDIA is the Apple of AI, and PyTorch is sort of the Android, because it's open source and it's got a bunch of different companies that care about it. OpenCL is the Android as it pertains to graphics, but it's pretty bad and pretty far behind. ROCm is the CUDA competitor made by AMD for their hardware.
But again, not a lot of adoption. They're working on it, but they've open sourced it because they realized they can't go directly head to head with NVIDIA; they need some different strategy. But yes, NVIDIA is absolutely running the Apple playbook.
And I think in the current state of things, that's even more favorable to NVIDIA than iOS versus Android, because NVIDIA has had first dozens, then hundreds, and now thousands of engineers working on CUDA for sixteen years. Meanwhile, the Android equivalent out there in the open source ecosystem has only just been getting going.
You know, if you think about the delta in the timeline between iOS and Android, it was a year and a half, two years. There's probably at least a ten-year, probably closer to fifteen-year, lead that NVIDIA has. And so we talked to a few people about this. We were like, oh, what's going on in the open source ecosystem?
Is there an Android equivalent? And even the most bullish people we talked to were like, oh yeah, you know, now that Facebook has moved PyTorch into a foundation outside of Facebook, other companies can now contribute, you know, a couple dozen engineers to work on it. And you're like, cool. So AMD is going to contribute a couple dozen, maybe a hundred, engineers to work on PyTorch, and so will Google, and so will Facebook, and so will everybody else. NVIDIA has thousands of engineers working on CUDA and a ten-year head start.
I made this graph of my estimated number of employees working on CUDA per year since its inception in 2006. If you look at the area under the curve and just take the integral, it's approximately ten thousand person-years that have gone into CUDA. Like, good luck.
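To make that back-of-the-envelope concrete, here's a minimal sketch of the area-under-the-curve math. The headcount ramp below is invented purely for illustration; NVIDIA's actual CUDA staffing numbers aren't public, and only the method mirrors what's described above.

```python
# Toy illustration of the person-years estimate: "integrate" an assumed
# CUDA headcount curve over 2006-2023. The ramp is made up for the example.
headcount = {}
for year in range(2006, 2024):
    if year < 2010:
        headcount[year] = 25 * (year - 2005)         # early ramp: dozens
    elif year < 2016:
        headcount[year] = 150 + 75 * (year - 2010)   # hundreds
    else:
        headcount[year] = 600 + 120 * (year - 2016)  # heading toward ~1,500

# One employee for one year = one person-year, so the integral is a sum.
person_years = sum(headcount.values())
print(f"~{person_years:,} person-years")  # ~10,435 with these assumptions
```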
Now, to be fair, open source is a very powerful thing, and the market incentives are absolutely there for this to happen.
Right, and that is the interesting point: a moat only works if the castle is sufficiently small. If the prize at the finish line becomes sufficiently large, you're going to need a bigger moat, and you need to figure out how to, you know, defend the castle harder. I'm mixing so many metaphors here.
But you get the idea, yeah.
I love that. This was a perfectly fine moat when the addressable market was a hundred billion dollars. Is it at a trillion-dollar market opportunity? Probably not. Basically, that means margins come down and competition gets fiercer over time.
And I think NVIDIA totally gets this, because, as I was alluding to, part of this is COVID-related. We talked way back in Part One about how NVIDIA, to save the company, moved to a six-month shipping cycle for their graphics cards when their competitors were on a one-to-two-year shipping cycle. That persisted for several years, and then they relaxed back to an annual shipping cycle.
GTC used to be annual. Since COVID, NVIDIA has reaccelerated to a six-month shipping cycle; they've been doing two GTCs a year most years since COVID, which is insane for the level of technology complexity that they're shipping. Yeah, imagine Apple doing two WWDCs a year.
Yeah, that's what's happening at NVIDIA. Crazy. So on the one hand, that's a culture thing. On the other hand, that is an acknowledgment of, like, we need to be pedal to the floor right now to outrun competition.
We've built some structural ways to defend the business, but we need to continue running as fast as we've ever run to stay ahead, because it's such an attractive
race that we're in. Yeah. So that's scale economies. Let's move to switching costs.
So far, everything of consequence, especially model training, especially on LLMs, has been built on NVIDIA, and that alone is a big pile of code and a big amount of organizational momentum. So switching away from that, even just from the software perspective, is going to be hard. But there are companies today, in 2023, both among the hyperscalers and among Fortune 500 companies that own their own data centers,
making data center purchase and rollout decisions that will last at least the next five years, because these data center rearchitectures don't happen very often. And so you'd better believe that NVIDIA is trying as hard as they can to ship as much product as they can while they have the lead, in order to lock in that data center architecture for the next ten years.
Yeah. We talked to many people in preparation for this episode, and one of the most interesting conversations was with some of our favorite public market investors out there, the NZS Capital
guys, who we got many insights from for this episode.
They obviously have been following NVIDIA and the space for a long time, and they made the point that data center revenue and data center capex is some of the stickiest revenue known to humankind. The organizational switching costs involved in data center procurement and data center architecture standardization decisions, at Fortune 500 companies and the like, are such that they're not changing that more than once a decade, at most.
So even if we're sort of in this bubble moment around the excitement of generative AI, before we necessarily know the full set of applications, NVIDIA is leveraging this excitement to go get that lock-in. I've seen some people on the internet saying NVIDIA loves how supply constrained they are. I don't think so. I think they're looking for capacity every way they can get it, to exploit this opportunity while it
exists. I completely agree with that. I think, you know, again, we didn't get to ask Colette, NVIDIA's CFO, about this, but I strongly suspect if I were them, I would be happy to trade some of this gross margin right now for increased throughput on sales.
Yep. But there is only one TSMC, and there are only so many fabs they have that can do what they call the 2.5D packaging.
So, shall we talk cornered resource?
Yeah, this is probably the textbook cornered resource. NVIDIA has access to a huge amount of capacity at TSMC that none of their competitors can get their hands on. Now, they did luck into this cornered resource a little bit: they reserved all that wafer supply for a different purpose, partially
crypto mining. But AMD doesn't have it. AMD does have a ton of capacity at TSMC, it's worth saying, for their other products, like data center CPUs, which they've actually been doing very well in. But NVIDIA did end up with this wide-open lane all to themselves on CoWoS capacity at TSMC, and they've got to make the most of it for as long as they have it.
Yep. And I guess to say a little more: this is not a commodity. As we talked about on our TSMC episode, although TSMC is a contract manufacturer, it is the opposite of a commodity, especially at the highest-end leading edge.
It's like an invention delivered by aliens that very few humans know how to actually do.
Yes. It is worth
acknowledging it's kind of a two-horse race for LLM
training. I know we've been harping on NVIDIA, but Google TPUs are also manufactured at volume; you can just only get them through Google Cloud. And, I don't know if you still have to use the TensorFlow framework, which has been waning in popularity relative to PyTorch, but it's certainly not an industry standard to use TPUs the way that it is to use NVIDIA's hardware. I suspect a lot of the volume of TPUs is being used internally by Google, for Bard, for doing stuff in Google Search. Like, I know they've added a lot of the generative AI capability to search.
Yep, totally. Two points on this, just sticking to the scope of this business and market discussion: this is a major casualty of a strategy conflict
for Google. Obviously, the way you want to do this is the way NVIDIA is doing it: your customers want to buy through the cloud, so you want to be in every cloud. But obviously, Google is not going to be in AWS and Azure and Oracle and all the new cloud providers. They're only going to be in GCP.
Maybe, David, but I was going to
say, though, to expand the lens, I think this makes sense for Google, because their primary business is their own products.
Right. And they run among the most profitable businesses the world has ever seen. So anything they can do to further advantage and extend that runway, they probably should
do. Nothing has changed through all of this with respect to the fact that what the previous generation of AI enabled, machine learning applied to social media and internet applications, remains the most profitable cash flow machine known to man. None of that has changed. That is still true in this current world, and still true for Google.
The last one that I had highlighted is network economies. They have a large number of developers out there, and a large number of customers, that they can amortize these technology investments across, and who all benefit from each other. I mean, remember, there are people building libraries on top of CUDA, and you can use the building blocks other people built to build your own code.
You can write amazing CUDA programs that just don't have that many lines of code, because they're calling other preexisting stuff, as in the sketch below. And NVIDIA made a decision in 2006 that at the time was very costly, a big investment decision, but that looks genius in hindsight: to make sure that every GPU that went out the door was fully CUDA-capable.
And today, there are five hundred million CUDA-capable GPUs for developers to target. It's just very attractive. Although, putting this under network economies... I think it's probably more a scale economy than a network economy.
But you could imagine a lot of people grumbling around NVIDIA from 2006 to 2012, saying, why do I have to make it so that my software fits on this tiny little footprint? Why do we have to include CUDA, taking up a huge amount of space on this thing, and make all these tradeoffs in our hardware, so that, what, people will use CUDA someday? And today it just looks genius.
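As a concrete illustration of the "few lines calling prebuilt building blocks" point, here's a sketch using CuPy, one of the open-source Python libraries built on top of CUDA. Each line dispatches to an NVIDIA-maintained GPU library without the programmer writing a single kernel. It assumes an NVIDIA GPU and a CuPy install; the specific workload is arbitrary.

```python
# A handful of lines that run on the GPU by calling prebuilt CUDA
# libraries through CuPy -- no hand-written kernels anywhere.
import cupy as cp

x = cp.random.rand(4096, 4096, dtype=cp.float32)  # cuRAND generates this
y = x @ x.T                                       # cuBLAS does the matmul
spectrum = cp.fft.fft(y[0])                       # cuFFT does the transform

print(float(y.sum()), float(cp.abs(spectrum).max()))
```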
We've talked about this many times on the show, including with Hamilton Helmer himself. For platform companies, which NVIDIA clearly is, there is a special brand of power that is the combination of scale economies and network economies. That is what you're getting at.
Yeah, they do have branding power for sure. Yeah, I actually .
think it's worth talking about this a little,
because this is the "nobody gets fired for buying IBM" thing. NVIDIA is the modern IBM in the AI era. Yep.
Look, I don't feel confident enough to, like, pound the table on this. But given the nature of how the company started, how long they've been around, and the fact that they also have the market-leading product in a totally different business in graphics, which is both consumer but also professional graphics, I think that probably does lend some brand power to them, especially when the CIO and the C-suite at, say, McDonald's are making a big decision here. Like, everybody knows NVIDIA.
You're saying they carried their consumer brand into their enterprise posture?
This is way, way, way down the stack of powers. But I don't think it's hurt them. They've always been known as a technology leader, and the whole world has known for decades at this point that the stuff they enable is magical.
There's a big strength-leads-to-strength thing here too, where I bet the revenue result from last quarter massively dwarfs any brand benefit they ever got from the consumer side. I think it's just the fact that, like, hey, look, everyone else is buying NVIDIA;
I'd be an idiot not to. Nobody is getting fired for buying it,
or taking a big dependency on them, or targeting that development platform. It's just, like, if you are innovating in your business, you don't want to take risk on the platform you're building on top of. You want to be the only risk in the value chain.
And then the last one, right? This is process power.
Yeah, and this is probably the weakest one. Even though I'm sure you can make some argument that they have process power, it's just that all the other powers are so much more valuable.
It's always so tricky to tease out. Yeah, you know, I think the argument here would just be, like, NVIDIA's culture and their six-month shipping cycle, which clearly they had in the past, then didn't have for a while, and now have again. I don't know, I think you can make an argument here. Is it feasible?
Let's do a thought exercise. Could any of their competitors really, in any domain, move to a six-month ship cycle? It'd be really hard. Yeah. You know, could an Apple-sized company do two WWDCs a year? Like, no.
The question is, does that actually matter? There are so many people using A100s right now. And in fact, most workloads can be run on A100s, unless you are doing model training of GPT-4.
I just don't know that it actually matters that much, or as much as other factors. And I'll give you an example: AMD does have 3D packaging on one of their latest GPUs. It's a more sophisticated way of doing copper-on-copper direct connection without a silicon interposer.
I'm getting into the details a little bit, but basically, it's more sophisticated than the 2.5D process that the H100 is using to make sure that memory is extremely close to compute. And does that matter? Not really. What matters is everything else we've been talking about, and nobody's going to make a purchase decision on this thing because it's, you know, a little bit of a better mousetrap.
Thinking about this more, I think brand actually is a really important power for NVIDIA right now.
Yeah, and in a strength-to-strength way, so you can see why they are trying to sort of seize this moment.
Yeah. Playbook, right? Let's move on to playbook.
So one thing that I want to point out is, Jensen keeps referring to this as the iPhone moment for AI. And when he says it, the common understanding is that he means a new mainstream method for interacting with computers. But there's another way to interpret it. Does this sound familiar, David? When I say: a hardware company, differentiated by software, that then expanded into services.
Yes, yes, it does.
It's quite tongue-in-cheek to be referring to the iPhone moment of AI while referring to oneself, NVIDIA, as the Apple. But I really do think the parallels are uncanny. They have this integrated hardware and software stack provided by NVIDIA; you use their tools to develop for it. They've shipped the most units, so developers have a big base of hardware to target in that market.
It has the best individual buyers to target, because they are the least cost-sensitive and they appreciate you building the best experiences for them. I mean, it's the iPhone, but in many ways it's better, because the target is a B2B target instead of consumers. The only way in which it's different is that Apple has always had a market cap that sort of lagged its proven value to users, whereas NVIDIA right now is arguably out over its skis.
Well, let's save that for bull and bear at the end. Great.
The second one is that they've moved on from being a hardware company to truly being a systems company. While NVIDIA's chips are typically ahead, it really doesn't matter on a chip-to-chip comparison; that is not the playing field.
It is all about how well multiple GPUs, and multiple racks of GPUs, work together as one system, with all the hardware and networking and software that enables that. They have just entirely changed the vector of competition, which I think lots of companies can learn from. And my third one here is this quote that Jensen had, again from that same Stratechery interview, which is: you build a great company by doing things that other people can't do. You don't build a company by fighting other people to do things that everyone can do.
And I think it's so salient. It comes out in all these interesting ways, one of which is that NVIDIA never dedicated resources to building a CPU until there was a differentiated way and a real reason for them to build their own CPU, which is now. And the way that they're doing it, by the way, is not terribly differentiated: it's off-the-shelf Arm architecture that they're putting some of their own secret sauce on. It's not like they're doing an Apple-style creation of a chip from scratch.
It's not the hero product, right?
There are many ways that NVIDIA applies this. I think we talked about it in the last episode: if they think it's going to be a low-margin opportunity, they don't go after it. But the nicer way to say that is, we don't want to compete for things that anybody can do. We want to do things that only we can do. And, by the way, we will fully realize the value of those things
when we do them. Yeah. I think there may be a related playbook theme here for NVIDIA of strike when the timing is right. I suspect that a lot of the inner competitive drive and motivation for Jensen and the company over the past ten, fifteen years has been to really fight against Intel. Intel tried to kill them.
As we talked about many times in the previous episodes, we talked to somebody who framed it as: Intel was the country club, and NVIDIA is the fight club. And back in the day, the entire country club didn't want to let NVIDIA in. Intel controlled the motherboard. Intel controlled the most important chip, the CPU.
Intel would integrate and commoditize all the other chips into the motherboard eventually, and if they couldn't do that, well, they'd try and make the chips themselves. And they tried to run all these playbooks on NVIDIA.
And NVIDIA just barely survived. And then Intel controlled the data center for so long. PCI Express, you know, that was the interconnect in the data center for so long, and NVIDIA had to live within it. And I'm sure they hated every single minute of it, but they didn't turn around ten years ago and just say, guess what, we're making a CPU too. They waited until the time was right.
It is crazy. They used to have to plug into other people's servers. Then they started making servers that plugged into other people's racks and rows and architectures, and then they started making their own entire rows and walls. And at some point here, they are going to start running their own buildings full of servers too, and they're going to say, we don't have
to plug in to anything. Yep. But I think for a lot of other leaders, it would be hard to have the patience that they've had.
Totally. You only get to do the stuff they're doing if you invested ten years ahead of the industry, were wildly inventive and innovative in creating these true breakthrough innovations, and were really, really right about huge markets. None of this stuff applies unless you're doing those three
things. Yeah, Fortune 500 CIOs aren't making buying decisions on you unless all of what you just said is true.
Right. So there is this interesting conversation I wanted to have with you before winding it up with the bull and bear cases. Think back to our AWS episode.
We talked a lot about how AWS is just locked in. The databases are a ridiculously durable advantage: once your data has been shipped to a particular cloud, often literally in semi trucks full of hard drives. Snowmobile, yeah. It is hard to move off of it.
There is a sort of interesting question of: will winning cloud 1.0, for Google, Microsoft, Amazon, will that foothold actually enable them to win in the cloud AI era? On the one hand, you'd think yes, absolutely, because I want to train my AI models right next to where my data is. It's really expensive to move my data somewhere else to do that.
Case in point: Microsoft is the exclusive cloud infrastructure provider for OpenAI, which runs, as far as we know, solely on NVIDIA hardware. But they buy it all through Microsoft, right?
On the other hand, the experience that customers are demanding is the full-stack NVIDIA experience, not, oh, you found the cheapest possible cost-of-goods-sold way to offer me something that's sort of like the experience that I want. And sometimes the cloud providers have to offer me an A100 or an H100, because my code is way too complicated to ever rearchitect for whatever accelerated computing devices they're offering me that are first-party and cheaper for them. I don't know. I just think, for the first time in the last five years or so, I've sort of cocked my head a little bit at the moat of these existing cloud providers and said, huh, maybe there really is a vector to compete with them, and cloud is not a settled frontier.
Yeah. Well, there's a duality here. Cloud is the, uh, euphemism for data centers, right? There is so much more to the hyperscalers and public clouds than just data centers. There are miles of distance, metaphorically, between, like, an Equinix and AWS.
But they are data centers. And there is a fundamental shift, at least according to Jensen, that is happening in the data center. So I think that probably does create some shifting sands that the cloud market is going to have to navigate.
Yep. I bet the way that plays out is that where you landed in cloud 1.0 strongly dictates where you land in this AI cloud era, because at the end of the day, customers are demanding NVIDIA stuff, and the cloud providers have every incentive in the world to make it so that you can run your applications great in their cloud.
But also, there is more to this too. CoreWeave exists. Crusoe exists. Lambda Labs exists. These are well-funded startups with billions of dollars that a lot of smart people think there's a major cloud-sized opportunity for. Yeah, that would not have happened a few years ago.
Super true. Alright, let's do the bull case and bear case and bring this one home.
Ah, we've been trying to delay this as long as possible. This is the crux of the question right now.
Yeah. I mean, part of it is: is their existing moat big enough if GPUs actually become a hundred-billion-dollar-a-year market? Right now, GPUs in the data center are like a thirty-billion-dollar-a-year market, going to something like fifty billion next year. And if this actually goes the way that everyone seems to think it's going to, there are just too many margin dollars out there for these big companies not to invest heavily.
Meta threw tens of billions of dollars at making the metaverse. Apple has put, rumor has it, fifteen billion dollars into their headset. Amazon has put tens of billions of dollars into devices, which by all accounts was a terrible investment. How is Echo paying any of that back?
I mean, total sidebar: I am so disappointed. I have standardized my house on the Echo ecosystem, and it keeps getting dumber. How, in this world of incredibly accelerating AI capabilities, are my Echoes getting dumber?
They need to put Trainium and Inferentia to work a little bit harder.
Geez. Okay, rant over. Yeah.
I mean, never doubt big tech's willingness to throw tens of billions of dollars into something if the payoff could be big enough. These are ludicrously profitable companies. Okay, except Amazon, not that profitable.
Yeah, but Google, Facebook, Apple. At some point here, there is a game of chicken that ends, and some of these companies go all in and say, ah, we have smart engineers too. Like, we're
going to figure this out. Yeah, but also never underestimate the inability of big tech to execute on stuff like this, especially when it requires a major strategy shift.
Yeah, yeah. Alright, so let's actually do this. Let's start with the bear case.
So you just illustrated, I think, bear case number one, which is that literally everybody else in the technology ecosystem is now aligned and incentivized to say, I wanna take a piece of NVIDIA's pie. And these companies have untold resources. Yep.
And to put a finer point on that, let's look at PyTorch for a minute. Now that all the developers, or lots of developers, are using PyTorch, it does enable PyTorch to aggregate customers, which gives them the opportunity to disintermediate.
Maybe you've got to write a lot of new stuff underneath and ship a lot of hardware. I mean, the cloud service providers have taken some steps here. PyTorch was originally developed by Meta.
And while it's open source, it's still hard for all these companies to invest in it if it's really sort of owned and controlled by Meta. So now PyTorch has been moved out into a foundation that a lot of companies are contributing to. And again, it is an absolute false equivalence to say PyTorch versus NVIDIA. But in real Ben Thompson aggregation theory parlance, if you aggregate the customers, you have the opportunity to take more margin, to disintermediate, to direct where that attention is going. And PyTorch has that opportunity. That feels like the vector that a lot of these CSPs will try to compete on, saying, look, if you're building for PyTorch, it runs really well on our thing too.
Yes, for sure, no doubt that that's going to happen, right? So that's bear case number two, kind of as part of bear case number one.
The next one is that, like, literally, the market isn't as big as the market cap reflects. I think there's a pretty reasonable chance that there's some falter in the next twelve to eighteen months, where there's a crisis of confidence among investors. At some point, something will come out where we all observe: oh, maybe GPTs aren't as useful as we thought. Maybe people don't want chat interfaces.
And that crisis of confidence, that mini bubble bursting, will trickle out to America's CEOs and CIOs, and make it harder to advocate in the boardroom for making this big fundamental purchase and rearchitecting our whole budget from this year, the one we agreed on, that I am now proposing we change.
There's a crypto-like element to an excitement bubble bursting that will, for some companies, slow their spend. And the question is sort of when that happens, because it's not an if, it's a when. I have a hard time believing, given all the hype around everything right now, that AI will be even more useful than everyone believes and will continue in a linear fashion, without any drawdowns, where everyone's excitement only gets bigger from here. It may end up being way more useful than anyone thought, but at some point there will be some trough, and it's sort of about how NVIDIA fares during that crisis of confidence.
It's funny. You know, again, we talked to a lot of people for this episode, including some of the foremost AI researchers and practitioners out there, and founders and C-suites of companies that are doing all this. And pretty much to a tee, they all said the same thing.
When we asked them about this question, they all said: yeah, this is overhyped right now, of course, obviously, but on a ten-year timescale, you haven't seen anything yet. You can't even imagine the transformative change that we believe is coming.
The most interesting thing about the overhype is that it is actually showing up in revenue. Everyone who is buying access to all this compute believes something. And for NVIDIA, because it's showing up in the form of revenue, the belief is real. So they just need to make sure they smooth the gap until customers actually realize as much value as the CIOs of the world are currently investing ahead of. Yep.
So I think the sub-point to that that's worth discussing right now is, like: okay, generative AI, is it all it's cracked up to be? Well, David, I asked you about
this, like, a month ago. A month ago, you were pounding the table, insisting to me: I have no need for it, I've never used ChatGPT, I can't find it to be useful, it's hallucinating all the time, I never think to use it, it's not a part of my workflow. Like, where
are you at? Still basically there, including after forcing myself to try to use it a bunch in preparation for this episode. But also, as we talked to more people, I think I've realized that, like, David Rosenthal's use case doesn't really matter here at all, right? Because as a business, we are such a hyper-specialized, unique little unicorn thing, where accuracy and the depth of the work and thought that we ourselves put into episodes is the paramount thing.
And we have no coworkers. There are so many things about our business that are weird; like, we never have to prepare a brief for a meeting.
Right. All this stuff, anything external that we prepare, is a labor of love for us. And there is nothing we prepare internally.
I know people who use ChatGPT to set their OKRs. And I'm like, okay, what's an OKR? And they're like, I wish my life were like that too. That's why I have ChatGPT
do it. Right. Honestly, though, in doing this and talking with some folks and reading, I think there's a very compelling use case for writing code right now. No matter what level of software developer you are, from zero all the way up through elite software developer, you can get a lot more leverage today with GitHub Copilot. So is that valuable? Yeah, for sure that's valuable.
The LLMs are unbelievably good at writing code and helping you write code. I'm a huge believer in that use case.
Yeah. And then I think now there's slightly more speculative stuff, but you can actually sort of see it now, like that gaming demo that I mentioned recently from NVIDIA: oh, you're talking to a non-playable character that wasn't scripted. We did an ACQ2 episode recently with Cristóbal Valenzuela, the CEO of Runway, whose product was used in Everything Everywhere All at Once. And he said, that is the tip of the iceberg. The stuff that you can do, that is happening, that's out there today with generative AI in these domains, is astounding.
Yeah. I think what you're saying is, one could be a bear based on your own experience: every time you try to use a generative AI application, it doesn't fit into your workflow, you don't find it useful,
it doesn't stick for you. But on the other hand, what AI will actually be is a sum of a whole bunch of niches. There's a video game market, there's a writing market, there's a creative writing market, there's a software developer market, there's a marketing copy market. There are a million of these things, and you just may happen to not fall into one of the first few niches of it.
Yes. I think for me, at least, again speaking personally too, I had a very strong element of skepticism initially, because the timing was just too perfect. You know, it's like: all you VCs out there, you just told everybody how crypto was the future, or whatever you were talking about, and then interest rates went to, you know, five percent and your world fell off a cliff.
Uh, the number of people who were out raising a fund and were like,
the future is AI. Yes,
this is the best time ever to have been investing.
And so there was a large part of me that was just like, come on, guys.
yeah, it's too perfect.
It's too perfect. But these most recent couple of months, and this quarter for NVIDIA, have shown that, put all that aside: Fortune 500s are adopting this stuff, CIOs are adopting this stuff, NVIDIA is selling real dollars. And we're learning also about what it takes to train these models, and the step function of knowledge and utility going from billion-parameter to ten-billion-parameter to hundred-billion to trillion-parameter models. Yeah, like, something is going on there for sure.
So this leads me to my next bear case, which is: the models will get good enough, and then they'll all be trained, and then we'll shift to inference, and most of the compute will be on inference, where NVIDIA is less differentiated. There's a bunch of reasons I don't believe that; it is a popular narrative, though.
One of the big reasons I don't believe it is that the transformer is not the end of the road. In a bunch of the research that we did, David, it is very clear that there are things beyond the transformer in the research phase right now, and the experiences are only going to get more magical and more efficient. So there is sort of a second bear case there, which is: right now we threw a brute-force kitchen sink at training these things, and all of that revenue accrued to NVIDIA, because they are the ones who make the kitchen sinks.
And over time, you look at, uh, Chinchilla or Llama 2: they actually use fewer parameters than GPT-4 and have equivalent quality, or, you know, other people can be the judge of that, but they were high-quality models with fewer parameters. So there is this potential bear case that future models will be more clever and not require as much compute. It's worth saying that, at least until very recently, the vast majority of AI workloads didn't look like LLMs.
LLMs are like the current maximum in human history of jobs to be done that require a ton of compute. And I guess the question is, will that continue? I mean, many other magical recent AI experiences have happened with far less expensive model training, like diffusion models and the entire genre of generative AI on images, which we really haven't talked about a lot on this episode because they are less compute-intensive. Many tasks don't require an entire internet of training data and a trillion parameters to pull off.
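To put rough numbers on the "clever models" point: a standard napkin approximation for training compute is about 6 FLOPs per parameter per training token. Using published figures for GPT-3 and Chinchilla (GPT-4's are undisclosed), the sketch below shows the subtlety: Chinchilla's smaller parameter count didn't actually mean less training compute, because it trained on far more tokens. Where fewer parameters really pay off is inference, where cost scales with model size on every query.

```python
# Napkin training-compute math using the common C ~ 6 * N * D
# approximation (FLOPs ~ 6 x parameters x training tokens).
# Treat the outputs as rough orders of magnitude, not benchmarks.
def train_flops(params, tokens):
    return 6 * params * tokens

gpt3 = train_flops(175e9, 300e9)        # GPT-3: 175B params, ~300B tokens
chinchilla = train_flops(70e9, 1.4e12)  # Chinchilla: 70B params, ~1.4T tokens

print(f"GPT-3:      {gpt3:.1e} training FLOPs")       # ~3.2e23
print(f"Chinchilla: {chinchilla:.1e} training FLOPs")  # ~5.9e23
# Fewer parameters but MORE training compute -- yet ~2.5x cheaper to serve
# per token at inference, since inference cost scales with parameter count.
```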
Yep, that makes sense to me. And I think there also is something to workloads shifting to inference; that is happening. I agree with you, I don't think training is going anywhere. But think back to the Google days: training was what everybody was spending money on, that's what everybody was focused on. As usage scales with this stuff, then inference, inference of course being the compute that has to happen to get outputs out of the models after they're already trained, becomes a bigger part of the pie. And as you say, the infrastructure and ecosystems around doing that are less differentiated
than training. Yep.
Okay, those are the bear cases. There is probably also a bear case around China, which is a legitimate one, because that's going to be a problem for lots of people.
A large market that they won't be able to address in a meaningful way for the foreseeable future.
And just, what's going to happen generally? Obviously, China is racing to develop their own homegrown ecosystem and competitors, and that's going to be a closed-off market. So what's going to come out of there?
What's going to happen? Yep, that's definitely one. My last one is a bear case, but it ends up not being a bear case. For most companies,
I would say that if they were trading at this very high multiple, and they just experienced this tremendous real growth in revenue and operating profit, that sort of spike to the system, when it goes away, would irreparably hurt the company. When things slow down, stock compensation is an issue, employee morale is an issue, customer perception is an issue. But this is NVIDIA. Yeah, this is nothing new.
The number of times that they've risen from the ashes after, you know, years-long terrible sentiment, with something mind-blowingly innovative... they're probably the best-positioned company, or the company with the best disposition, to handle that when it happens.
Ooh, I like that, a great turn of phrase. You've been training your language model over there.
You should see the number
of parameters. Oh, alright.
Alright, just to list the bull cases. One: Jensen is right about accelerated computing. The majority of workloads right now are not accelerated; they're bound to CPUs, and they could be accelerated.
And that shifts from some crazy low number, like five or ten percent of workloads being accelerated today, to fifty-plus percent in the future. And there's way more compute happening in parallel, and that mostly accrues to NVIDIA.
Oh, I have a new one I want to add to that. On the surface, I think a lot of people look at that and they're like, yeah, come on. But I think there actually is a lot of merit to that argument in the generative AI world.
Given everything we've talked about in this episode, I don't think Jensen and NVIDIA are saying that traditional compute is going away or getting smaller. I think what he's saying is that AI compute will be added on to everything, and the amount of compute required for doing that will dwarf what's happening in general-purpose compute.
So it's not that people are going to stop running SharePoint servers, or that whatever products you use are going to stop using whatever interfaces they use. It's that generative AI will be added to all of those things, and new use cases will pop up, which will also use traditional general-purpose CPU-based computing. But the amount of workload that goes into making those things magical is just going to be
so much bigger. Yep. Also, just a general statement on software development: writing parallelized code is really hard unless you have a framework to do it for you. Even writing code with multiple threads is hard.
Like, if anybody remembers a CS college class where they had a race condition, or they needed to write a semaphore: some of these are the hardest things to debug; there's a toy demonstration of the problem below. And I would argue that a lot of things that could happen in an accelerated way aren't accelerated just because it's hard to develop for. And so if we live in some future where NVIDIA has reinvented the notion of a computer, a shift away from the von Neumann architecture into this stream processor architecture that they've developed, and they have the full stack to make it just as easy to write new applications and move existing applications over, especially once all the hardware has been bought and paid for and is sitting in data centers, there are probably a lot of workloads that actually do make sense to accelerate, if it's easy enough to do so.
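For anyone who didn't take that CS class, here's a minimal sketch of the kind of bug being described: a lost-update race between two plain Python threads. The sleep(0) only widens the window to make the failure reproducible; real races behave the same way, just less predictably.

```python
# Classic lost-update race: two threads do read-modify-write on a shared
# counter with no lock, so increments get silently clobbered.
import threading
import time

counter = 0

def bump(n):
    global counter
    for _ in range(n):
        tmp = counter        # read
        time.sleep(0)        # yield, letting the other thread sneak in
        counter = tmp + 1    # write back a now-stale value

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # far less than the expected 20,000 -- updates were lost
```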
Yeah, that's right. Your point is that there's a lot of latent, addressable, acceleratable computing out there that just hasn't been accelerated.
It's like, ah, this workload's not that expensive, and I'm not going to pay an engineer to go rearchitect the system, so it's fine how it is. There's a lot of that.
So, bull case one: Jensen is right about accelerated computing. Bull case two: Jensen is right about generative AI. I mean, combined with accelerated computing, this will massively shift spend in the data center to NVIDIA's hardware.
And as we have mentioned, OpenAI is rumored to be doing over a billion dollars in recurring revenue on ChatGPT. Let's call it three billion, because that's the most credible estimate that I've heard, though maybe that was a forecast for next year.
But they're not the only one. Google with Bard, which I've actually found tremendously useful preparing for this episode: they're not directly monetizing that, but they're sort of retaining me as a Google Search customer by doing it. There is a lot of real economic value even today, not nearly the amount that's baked into the valuation. But I suppose the bear case of this is that everything has to go right for
NVIDIA. And the bull case is, indications are things are going right for NVIDIA. Third bull case: NVIDIA just moves so fast. Whatever the developments are, it's hard to believe that they're not going to find a way to be really well positioned to capture them.
It's just a cultural thing. Four is the point that you brought up earlier: there's a trillion dollars installed in data centers, with two hundred fifty billion more being spent every year to refresh and expand capacity, and NVIDIA could take a meaningful share of that. I think today, what's their annual revenue?
At like thirty billion or something? If you run-rate this current quarter, then it's like fifty-plus.
Yeah. So right now, that puts them at like twenty percent of the current data center spend. You could imagine that being much higher. Okay,
wait, that fifty includes the gaming revenue. It's about forty, because the data center revenue is ten a quarter, forty annualized.
Right, so sixteen percent. Yeah. But you can imagine that creeping up: if the accelerated computing and generative AI belief comes true, they'll expand that two-fifty number and they'll take a
greater percent of it.
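The arithmetic behind that exchange, using the figures as quoted on the show (treat them as the hosts' round numbers, not reported financials):

```python
# Hosts' napkin math: NVIDIA data center run-rate vs. annual industry spend.
dc_quarterly = 10e9               # ~$10B/quarter data center revenue
dc_run_rate = 4 * dc_quarterly    # ~$40B/year annualized
industry_spend = 250e9            # ~$250B/year refresh + expansion

print(f"{dc_run_rate / industry_spend:.0%}")  # 16% -- the corrected figure
```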
Yep. An interesting way to do a sort of check on this math is to look at what other people in the ecosystem are reporting in their numbers. TSMC, in their last earnings, said that AI hardware currently represents only six percent of their revenue, but all indications are that they expect AI revenue to grow fifty percent per year for the next five years. Wow.
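And the compounding math on that TSMC comment, under the simplifying assumption that the non-AI rest of the business stays flat (an assumption of this sketch, not something TSMC said):

```python
# If AI is ~6% of revenue and grows ~50%/yr for five years while the rest
# stays flat, AI's share of the (now larger) total becomes:
ai, rest = 0.06, 0.94
for _ in range(5):
    ai *= 1.5
print(f"{ai / (ai + rest):.0%}")  # ~33% of total revenue
```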
So we're trying to come at it from the customer workload side and ask, is this useful? But you can come at it from the other side: what are NVIDIA's suppliers forecasting? And they have to put their money where their mouth is, building new fabs to be able to facilitate that, and packaging, and all the other things that go into a chip. It's expensive for TSMC to be wrong. So that's another bull case. And then the last one that I have, before leaving you with one final thought:
Are you saying you have
one more thing? Yes. It's that NVIDIA isn't Intel. And I think that's the biggest realization that you helped me have. And it's not Cisco.
Yeah, the comparison we were making in the last episode was wrong. They are Microsoft: they control the whole software stack, and they can simultaneously have relationships with the developer and customer ecosystems. And, I mean, it may even be better than Microsoft, because they make
all the hardware too. Maybe they're old IBM.
Right? Imagine if IBM had operated in a computing market of today's magnitude, versus computing being a tiny little market back then.
Right. I mean, it took the PC wave to disrupt IBM, and the personal computer is, in today's parlance, edge computing, you know, device-based computing. IBM dominated the B2B mainframe cycle of computing. And again, if you believe everything Jensen is saying, and how he's steered the company for the last five years, we are going back into a centralized, data-center-dominated, modern version of the mainframe era of computing.
Yeah, although I suspect a lot of inference will get done on the edge. You think about the insane amount of compute that's walking around in our pockets that is not fully leveraged right now. There is going to be a lot of machine learning done on phones that, like, call up to cloud-based models for the hard stuff.
But I don't think training is happening at the edge anytime soon.
No, I certainly agree with that. Right. Well, just like our TSMC episode, I wanted to end and leave you with a thought, David, of what it would take to compete with NVIDIA, because my big takeaway from the TSMC episode was, like, wow, that's a lot of things you have to believe, even with a government putting billions of dollars in.
And I was like, what's the equivalent for NVIDIA? So here's what you would need to do to compete. Let's say you could design GPU chips that are just as good, which arguably AMD, Google, and Amazon are doing.
You'd of course need to build up chip-to-chip networking capabilities like NVLink, which very few have. And you'd of course need to build relationships with hardware assemblers like Foxconn to actually build these chips into servers like the DGX. And even if you did all that, you'd need to create server-to-server and rack-to-rack networking capabilities as good as Mellanox, who was the best on the market with InfiniBand, which NVIDIA now fully owns and controls, and which basically nobody else has. And even if you did all that, you'd need to go convince all the customers to buy your thing, which means you'd need to be either better or cheaper or both, not just equal to NVIDIA,
and by a wide margin, thanks to the brand. You're not going to get fired for buying NVIDIA anytime soon. Like, you've got to be ten x better than NVIDIA on this stuff if you're going to convince a CIO.
Yep. And even if you got the customer demand, you'd need to contract with TSMC to get the manufacturing capacity at their newest cutting-edge fabs to do this 2.5D CoWoS lithography and packaging, of which there, of course, isn't any more to be had. So, you know, good luck getting that.
And even if you figured out how to do that, you'd need to build software that is as good as or better than CUDA. And of course, that's going to take ten thousand person-years, which would cost you not only billions and billions of dollars, but all that actual time. And even if you made all these investments and lined all of this up, you'd of course need to go and convince the developers to actually start using your thing instead of CUDA.
Meanwhile, NVIDIA also wouldn't be standing still, so you would have to do all of this in record time, to catch up to them and surpass whatever additional capabilities they developed since you started this effort. So I think the bottom line here is that it's nearly impossible to compete with them head on. If anybody's going to unseat NVIDIA in the future of AI and accelerated computing, it's either going to be from some unknown flank attack that they don't see coming, or the future will turn out to just not be accelerated computing and AI, which seems very unlikely.
Yeah. Well, when you put it that way, I think the conclusion we can come to is that Marc Andreessen was right. What year was this that we were talking about?
Was it 2015 or something?
Yeah, like 2015, 2016.
They should have put every dollar of every fund they raised into NVIDIA at the market price of the stock, every single day.
Yeah, because they were seeing all of these startups doing deep learning, machine learning at the time, early AI, and they were all building on NVIDIA. And they should have just said, no thank you, to all of them and bought NVIDIA instead. Marc is very... once again, strength leads to strength there.
There it is. Well, listeners, I acknowledge that this episode generalized a lot of the details, especially for the technical listeners out there, but also for the finance folks who are listening. Our goal was to make this more of a lasting, big-picture NVIDIA Part Three episode than sort of a how-did-they-do-last-quarter-and-what-are-the-implications-for-the-next-three-quarters episode. So hopefully this holds up a little bit longer than just some current NVIDIA commentary. But thank you so much for going on the journey with us.
We also, as we alluded to throughout the show, owe a bunch of thank-yous to lots of people who were so kind to help us out, including people who have way better things to do with their time.
So we're very, very grateful. Number one, Ian Buck from NVIDIA, who leads the data center effort and is one of the original team members that invented CUDA way back when. Really grateful to him for speaking with us to prep for this.
Absolutely. Also, big shoutout to friend and listener of the show Jeremy from ABC Data, who prepared a PDF for us, completely unprompted, like an annotated write-up about a lot of the technical
details behind all this. A private blog post.
Yeah, a private blog post. Our Acquired community is just the best. You guys continue to blow us away. So thank you.
Julien, the CTO of Hugging Face; Oren Etzioni from AI2; Luis from OctoML. And of course, our friends at N
ZS Capital. Thank you all for helping us research this.
Alright, let's do carve outs. What've you got for carve outs?
What've I got? My wife and I have been on an Alias binge.
Oh, wow. Yeah, Jennifer Garner. Yes, I remember
when it came out. It is, like, the perfect early-2000s junk food for when you have one more hour at the end of the day and you're just lying on the couch.
Man, I never have one more hour in the day. I have a two-year-old. But I'll really appreciate it sixteen years from now when they go to college.
Come on, you play games.
That's true, but that's research. I'm just in it for the graphics, honestly.
So my review of Alias is: it's a little bit campy, and they repeat themselves pretty often. I mean, it's weird to observe how much TV has changed between now and then, because they make very similar shows today, but they're just much more subtle.
They're much darker; they leave much more to the imagination. And in the early 2000s, everything was just so explicit and on the nose, and restated three times. I'm just glad the show doesn't have a laugh track. But it's well worth the watch. Sometimes you have to imagine it has a different soundtrack, because every episode has, like, a Matrix-type song, dun dun dun dun dun.
Yes, that's right. It is like the TV version of The Matrix.
Yes. But it's great. We've had a lot of fun
watching it. Mine, relatedly, for my stage of life, is also something I missed and discovered recently. We just watched our first full movie, a full Disney movie,
Wow, a major
milestone. And she freaking loved it. I think we picked a great one: Moana, which neither Jenny nor I had seen before. And in reading just a little bit about it afterwards, you know how, super sadly, Pixar has kind of fallen off in recent years? Such a bummer. I mean, they're still Pixar, but their pictures
are... it's not the guaranteed hit every time that it used to be.
Yeah. So Moana came out in this kind of generation, with Tangled and some of the other stuff out of actual Disney Animation after the Pixar acquisition, that was this return to form. Disney made it just, like, firing on all cylinders. And we loved it. We watched it with our brother- and sister-in-law, who don't have kids and are thirty-somethings living in San Francisco. They loved it. Our daughter loved it. Highly recommend Moana, no matter what life phase you're in.
Great, adding it to my list.
And it's got The Rock.
How can you complain? Alright. Well, listeners, if you want to be notified every time we drop a new episode, and you want to make sure you don't miss it, and you want little hints to play a guessing game at our next episode, or you want follow-ups from our previous episodes in case we learn things from listeners, hey, here's a little piece of information that we wanted to pass along: we will exclusively be dropping those in the email.
acquired.fm/email. It's fun. I think we were about to talk about Slack.
Oh, people in the Slack were talking about the hints for this episode. We read the little teaser, and I was like, ah, everybody's going to know
exactly what this is. No one got it. I was shocked.
Yeah, eventually somebody did, but it took a couple days.
Yeah. We have a hat! You should buy it. And this is not a thing that we make a lot of margin on; we just are excited about more people sporting the ACQ around. So participate in the movement; show it to your friends.
It's not our super pod, but you know...
The pod is
the super pod.
Um, if you become an Acquired LP, you can come closer to the kitchen and help us pick an episode once a season, and we'll do a Zoom call every other month or so: acquired.fm/lp. Check out ACQ2 for more Acquired content in any podcast player, and come talk about this episode with us in the Slack: acquired.fm/slack. Listeners, we'll see you next time.
Next time. Who got the truth? Is it you? Is it you? Is it you? Who got the truth now?