Building the Open Source AI Revolution (with Hugging Face CEO, Clem Delangue)

2024/10/14
ACQ2 by Acquired

People
Clem Delangue
Topics
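Clem Delangue sees Hugging Face as the leading platform for AI builders: it provides models, datasets, and apps and supports building AI applications. More than 5 million AI builders use the platform every day, and they have shared over 3 million models, datasets, and apps. Hugging Face is similar to GitHub, but for AI models, and supports both public and private projects. A new model, dataset, or app is built on the platform every 10 seconds, and collaboration features make it easier for teams to build AI together. Large technology companies have thousands of internal users on the Hugging Face platform. Hugging Face began as a chatbot aimed at teenagers and later pivoted to an AI platform, which illustrates the importance of staying flexible and seizing opportunities. The platform's development has always been community-driven, refined step by step in response to community feedback. Hugging Face encourages openness without requiring it, and believes that open AI and open science drive shared progress and create a safer future. Clem Delangue regards today's models as building blocks toward AGI, though AGI is not what science fiction depicts; it is closer to a new technology paradigm. Open source is the foundation of all AI, and even closed-source companies rely heavily on open-source technology. The AI field has become less open in recent years, but Clem Delangue considers it misleading to cite AI's risks as a reason to restrict openness. The competitive landscape differs from earlier eras, with power more concentrated in a few large technology companies, but open AI helps more AI companies emerge. Clem Delangue thinks most AI companies may need to develop their own foundation models, which requires heavy investment, and that successful AI companies operate differently from traditional software companies. Hugging Face's revenue model differs from that of other AI companies: it uses a sustainable business model to support open source and a free platform. Hugging Face does not take part in the compute price war; instead it offers value-added services, integrating compute with platform features for a more seamless user experience. How AI will make money is still unclear, but Hugging Face's success depends on usage by AI builders and community participation. Its challenges are staying visible, keeping pace with fast-moving technology, and balancing a mature, stable platform against rapid innovation. Clem Delangue sees building AI on top of APIs as a transitional approach; in the future, companies will increasingly build their own models, and every technology company will become an AI company that builds its own models. There may be a large number of AI companies rather than a few dominant ones. The field needs more talent and easier-to-use tools; the barrier to entry for AI is lower than for software, and AI companies should not follow the traditional lean-startup model. The economies of scale in AI are still unclear, but Clem Delangue hopes the field will not be monopolized by a few companies.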

Clem Delangue, welcome to ACQ2. Thanks for having me. It's a pleasure to have you here. We have heard so much about Hugging Face over the last few years. It just feels appropriate in this moment to talk to you about the company directly. I feel like it's a very critical time for AI and with Hugging Face, we have the pleasure and the honor to be at the center of it. So excited to be able to share some of the things that we're seeing.

I think the listeners who are tuning into this are saying, what is this episode going to be about? We want to frame it as: you should come in and you don't need to know anything about AI, and you should walk out with a pretty clear understanding of open source AI and the more closed ecosystem. What is the difference between the two? What are the trade-offs? What are the virtues of each one? And we're going to tell it through the Hugging Face story. So what role do you play in the ecosystem? Who do you work with? Who do you not?

How did this thing spring up out of quite an unlikely place, given the name of your company? We'll kind of work our way backwards. At this moment in time today, how do you describe what Hugging Face is? So Hugging Face has been lucky to become the number one platform for AI builders.

So AI builders are kind of like the new software engineers in a way, right? Like in the previous paradigm of technology, the way you would build technology was by writing code. You would write like a million lines of code and that would create a product like a Facebook, like a Google or all the products that we use in our day-to-day life. Now today, the way that you create technology is by training models, using data sets and building AI apps, right?

So most of the people that do that today are using the Hugging Face platform to find models, find data sets and build apps. So we have over 5 million AI builders that are using the platform every day to do that.

The ecosystem around Hugging Face in many ways reminds me of the 2008-to-2010 era of Web 2.0, the sort of RESTful APIs that everybody was publishing, where you could suddenly daisy-chain together a million different companies. Mashups. Services into, yeah, this sort of like API mashups. It kind of feels like there's a loose analogy, or at least the movement that you're on is similar to that one. What can we create with a bunch of these sort of more open, flexible building blocks?

Yeah, it's super exciting because it's replacing some of the previous capabilities. Now you're starting to see search being built with AI. You're starting to see social networks being built with AI. But at the same time, it's empowering new use cases. It's unlocking new capabilities that weren't possible before, to some extremes, right? Like some people are talking about superintelligence, AGI, completely new things that we weren't even thinking about in the past.

So we're at this kind of like very interesting time where the technology is starting to catch up to the use cases and we're seeing the emergence of a million new things that weren't possible before. It's cool. And just so listeners understand the scale at which you're operating, Hugging Face is currently valued as of recording at four and a half billion dollars. Investors include Nvidia, Salesforce, Google, Amazon, Intel, AMD, Qualcomm, IBM. It's a pretty wild set.

What are some metrics that you care about as a company that you can sort of use to describe the scale at which developers are using it today? So I was saying that we have 5 million AI builders using the platform, but more interestingly, I think it's the frequency and volume of usage that they have on the platform. So collectively, they shared over 3 million models, data sets, and apps on the platform.

So some of these models, you might know them, might have heard of them, like Llama 3.1. Maybe you've heard of Stable Diffusion for image. Maybe you've heard of Whisper for audio or Flux for image.

We're soon going to cross 1 million public models that have been shared on the platform, and almost as many that have not been shared and that companies are using internally, privately for their use cases. So the analogy and model for you guys really is just like GitHub, except for AI models, right? You can publish open source, open to everybody, and companies can also use internal closed source repositories for their own use, right?

Yeah, it's a new paradigm. So AI is quite different than traditional software. So it's not going to be exactly the same. But we're similar in the sense that we're the most used platform for this new class of technology builders. For GitHub, it was software engineers. And for us, it's AI builders.

And to add to kind of like the usage side of things, one interesting metric is now that a model, a data set or an app is built every 10 seconds on the Hugging Face platform. So I don't know how long this podcast is going to last, but by the end of this podcast, we're going to have a few hundred more models, data sets and apps built on the Hugging Face platform.

And to continue to maybe torture the repository comparison, the set of things that need to exist goes beyond "I'm going to upload a pile of code that everyone can see and potentially attempt to modify." It's also the datasets themselves, it's also a platform to actually run applications, and also a compute platform where, if you want to train a model, that is also possible on Hugging Face, right?

Yeah, and one additional aspect that sometimes people underestimate is a lot of features around collaboration for building AI. The truth is that you don't build AI by yourself as a single individual. You need the help of everybody in your team, but also sometimes people in other teams in your company or even people in the field, right? So things like the ability to comment on a model, on a dataset, on an app, to version your code, your models, your datasets, to report bugs, to comment and add reviews about your code, your models, your datasets.

These are kind of like some of the most used features on the platform because it enables bigger and bigger teams to build AI together. And that's something we're seeing at companies: a few years ago, maybe there was a small team of five, 10 people leading the AI teams at companies. Now it's much bigger teams. So for example, at Microsoft, at NVIDIA, at Salesforce, we have thousands of users using the Hugging Face platform all together privately and publicly.

So I have a whole bunch of questions, kind of philosophical ones about where AI goes from here and sort of how the mental model for the AI ecosystem is different than previous generations. But to get there, I think it's helpful to understand how you arrived here. So in 2016, you co-founded a company named after the Unicode code point hugging face, the emoji. And as far as I can tell, it was an emoji that you could talk to as a chatbot aimed primarily at teenagers. Is that right? Yes, absolutely correct. It was a long journey. So you started neither an AI infrastructure company, nor did you even start in the current era of AI.

No, but we did start based on our excitement and passion for AI, even if we weren't even calling it AI at the time, right? We were saying more machine learning, deep learning. I was lucky enough, I think it's now almost 15 years ago, a few years more, to work at a startup in Paris that was called Moodstocks, where we were doing machine learning for computer vision.

So much before a lot of people were talking about AI. And it kind of made me realize some sort of the potential for the new technology and the way we could change things with AI. So when we started Hugging Face with my co-founders, Julien and Thomas, we were super excited about the topic. And we were like, okay, it's going to enable a lot of new things. So let's start with a topic that is both scientifically challenging and fun.

And so we started with conversational AI. We were like at the time, okay, Siri, Alexa, they suck. We remember our Tamagotchis, which were this kind of like fun virtual pets that you would play with. So let's build an AI Tamagotchi, like a conversational AI that would be fun to talk to.

And that's what we did. We worked on it for three years. We raised our first two rounds of funding on this idea. So shout out to our first investors who invested in a very different idea than what we are today. Who were your early investors? So our earliest investor was Betaworks in New York. I had no idea. Yes, with John Borthwick and Matt Hartman, who were our first investors.

Really backed us when we were like random French dudes with like no specific background or credentials, with a broken English. I assume you're now the most valuable company Betaworks has ever invested in. Yes, and more proud of the fact that we're now the company that they invested the most money in. So we're like the biggest bet that they've made. They've been extremely, extremely supportive of us.

But also the support from a bunch of very important, impactful angel investors for us, like Richard Socher, who is the founder of You.com and who was the chief scientist at Salesforce at the time. And then the support of the Conway family with A Capital, run by Ronny Conway, that led our next rounds. And Ron Conway, who also kind of like supported us throughout the early days of Hugging Face.

That's awesome. And so this was all still for the "I'm going to chat with an emoji" idea. Yes. Yes. And to put a finer point on it, you started the company in 2016; 2017 is when the transformer paper gets released from Google. So we are not yet to the era of even people in the AI community really knowing LLMs are close to the forefront. Like, OpenAI hadn't made their big pivot yet. And so the state of the art for natural language processing is still like pretty limited, small models trained on very particular, well-cleaned data sets. Is that right?

Yeah. And surprisingly or luckily, that's what led to what Hugging Face is today. Because at the time, the way you were doing conversational AI is by stitching a bunch of different models which would do like very different tasks. So you would need one model to extract information from the text, one model to detect the intent of the sentence, one model to generate the answer, one model to understand the emotion linked with the model.

And so very early on in the journey of Hugging Face, we started to think about like how do you build a layer, a platform, an abstraction layer that allows you to have multiple models with multiple datasets, because we wanted the chatbot to be able to talk about the weather, talk about sports, talk about so many different topics that you needed a bunch of different datasets. And that was kind of like the foundation to what Hugging Face is today, like this platform to host so many models, so many datasets.
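
Editor's illustration (not part of the conversation): the multi-model setup described above, with one model per task behind a shared layer, might be sketched roughly as follows. This is a minimal sketch under stated assumptions: every function, class, and rule here is a hypothetical toy stand-in invented for illustration, not Hugging Face's original chatbot code and not any real library API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical toy stand-ins for separately trained models.
# In a real system each would be its own model trained on its own dataset.

def detect_intent(text: str) -> str:
    """Toy intent 'model': keyword rules instead of a trained classifier."""
    lowered = text.lower()
    if "weather" in lowered:
        return "weather"
    if "game" in lowered or "score" in lowered:
        return "sports"
    return "chitchat"

def detect_emotion(text: str) -> str:
    """Toy emotion 'model': an exclamation mark counts as excitement."""
    return "excited" if text.rstrip().endswith("!") else "neutral"

def extract_entities(text: str) -> List[str]:
    """Toy extraction 'model': capitalized words after the first word."""
    words = text.split()
    return [w.strip(".,!?") for w in words[1:] if w[:1].isupper()]

# One toy answer generator per topic, keyed by intent.
GENERATORS: Dict[str, Callable[[List[str]], str]] = {
    "weather": lambda ents: (
        f"Let me check the weather for {ents[0]}." if ents else "Which city?"
    ),
    "sports": lambda ents: "Did you watch the game?",
    "chitchat": lambda ents: "Tell me more!",
}

@dataclass
class Turn:
    intent: str
    emotion: str
    entities: List[str]
    reply: str

def respond(text: str) -> Turn:
    """The 'abstraction layer': run each model, then route to a generator."""
    intent = detect_intent(text)
    emotion = detect_emotion(text)
    entities = extract_entities(text)
    reply = GENERATORS[intent](entities)
    return Turn(intent, emotion, entities, reply)
```

Calling `respond("What is the weather in Paris?")` routes the message through the intent, emotion, and extraction stand-ins before picking the weather generator. The point of the sketch is only the shape: many small task-specific models behind one layer that hosts and routes between them.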

So it's a very interesting fate, a very interesting thing. Obviously, it reinforces for people who are listening the importance of being flexible, being opportunistic and being able to seize kind of like new opportunities even three years in, right? For us, it was three years in with maybe $6 million raised, completely changing what we're doing, what we're going after, what we're building, right?

Obviously, we don't regret it at all, but it's a good learning for everyone listening that even with $6 million raised three years in, you can still pivot and find a new direction for your company. And it is for the best. How did those conversations start? How did they go? How much time did it take to go from talking about it to doing it?

Yeah, surprisingly, the transition wasn't as hard as we thought. It all started from an initiative from Thomas, who is our third co-founder and our chief scientist. I think it's right at the time when BERT, so the first very popular transformer model, came out. That's Google's model? Google's model that they open sourced.

I think on a Friday, that day, I remember really vividly, Thomas told us like, oh, there's this new transformer model that came out from Google. It's amazing, but it sucks because it's in TensorFlow. And at the time, the most popular language for AI was and still is actually PyTorch. And it was like, oh, I think I'm going to spend the weekends porting this model into PyTorch.

And Julien and I were like, okay, yeah, if you don't have anything better to do during your weekend, just have fun, do it. And on Monday, he released a PyTorch version of BERT, tweeted about it. And I think his tweet got maybe like a thousand likes. And for us at the time, we were like, what is happening here? We broke the internet. Thousand Twitter likes?

That's insane. The developer demand is so obviously, at that point in time, PyTorch, but since it was born out of Google, of course they're going to implement it in TensorFlow. They had to use their own sort of endorsed stack. It's just waiting there for the first person to realize, oh my God, this thing needs to exist in PyTorch, to like go and get all the internet points by doing that. Yeah. Yeah. I guess it's another gift from fate or from the universe to us that we managed to seize thanks to the work of Tom.

And after that, we saw the interest, doubled down on it. And I think six months later, we told our investors, look, this is the adoption. This is the usage that we're getting on this new platform. We think we need to pivot from one to another. And luckily, they were all super supportive. And that's what led to the pivot and to the direction that we took.

Wow. How did you take Thomas' experience porting BERT from TensorFlow into PyTorch into the idea of like, oh, there should actually be a platform for this? It was very organic. What we did is really followed community feedback. So what happened is after this first model release, we just started to hear from other scientists building other models who expressed interest in adding their models to our library.

So I think at the time it was things like XLNet actually coming, if I'm not mistaken, from Guillaume Lample, who is like the founder of Mistral now. There was, I think it was GPT-2 from the OpenAI team at the time, which was open source. That's right. It used to be OpenAI. Yes. And they told us that they wanted to add their model, so we really followed the community feedback on it.

And that's what kind of like took it from a single model repository, I think the first name was pre-trained PyTorch BERT, to, I think it was PyTorch Transformers, to Transformers, and then it expanded to the Hugging Face platform as we see it now. And that's the thing you kind of got famous for, was that Transformers library. And you were sort of the steward of that open source project, and you sort of constructed the Hugging Face platform around it to sort of host and facilitate all the community interaction on Transformers. And it turned out, oh my gosh, there's a lot of other people who are building something that looks like our Transformers libraries that also want a place for that same infrastructure.

Exactly. It was the same process. At some point, users in the community started to tell us, oh, I have bigger models. I can't host them on GitHub anymore.

All right, let's build a platform for that. Or I want to host my data sets, but I want to be able to search in my data sets to see, you know, is there good data, bad data? How can I filter my data and things like that? So we started to build that. And a few months later, we realized that basically we built kind of like a new GitHub for AI. So our development has always been very community driven, really following the feedback from the community.

And I think that's a big part of the reason why we've been so successful over the years and why the community has contributed so much to our platform and to our success. We couldn't be anywhere close to where we are without the millions of AI builders, contributors that are sharing open models, open data sets, open apps that are contributing with comments, with bug fixes.

It's the main reason for success today. You're sort of famously open. I mean, you really embrace this. We literally will build the product that the community tells us they want. Internally, you have a very open policy. The Twitter account, your social media accounts are actually accessible, I think, by all employees, right?

Yes, yes. As someone who is a champion of open source, how much openness is too much openness? Like you're not a DAO. You don't do the thing where you like publish everyone's salaries, I don't think. What do you like to be open versus what do you feel is good that it's proprietary? What we like to do is to give tools for companies to be more open than they would be without us.

but without forcing them in any way. So I was mentioning the number of models, data sets, and apps that are built on the platform. Something that people don't know as well is that half of them are actually private, right? The companies are just using internally for their own AI systems that they're not sharing.

And we're completely fine with that because we understand that some companies build more openly than others. But we want to provide them tools to open what they feel comfortable opening. So sometimes it's not a big model. It's not a big data set. They can share a research paper on the platform because, obviously, openness is even more important for science than it is for AI in general.

And progressively, it allows them to share more and contribute more to the world because ultimately, we believe that openness and open source AI, open science is really kind of like the tide that lifts all boats, right? That enables everyone to build, that enables everyone to understand, to get transparency on how AI is working, not working, and ultimately leads to a safer future.

It's like a lot of people right now are talking about AGI. I'm incredibly scared of a non-decentralized AGI. You know, like if only one company, one organization gets to AGI, I think that's when the risk is the highest versus if we can give access to the technology to everyone, not only private companies, but also policymakers, nonprofits, civil society, I think it creates a much safer future and a future I'm much more excited about.

I was going to not go here because it's almost like too much of a shiny question to ask, but we're talking AGI, so we have to do it. Do you feel that the models today are on a path to AGI or do you feel like AGI is something completely separate and these are not stepping stones to it?

Well, I think they're building blocks for AGI, surely, in the sense that we're learning how to build some better technologies. But I think at the same time, there's some sort of a misconception based on the name of the technology itself. We call it AI, artificial intelligence. And so in people's minds, it brings association with sci-fi, with acceleration, with singularity.

Whereas for me, what I'm seeing on the ground is that it's just a new paradigm to build technology. So I prefer to call it almost software 2.0, right? Like you had software before, you have software 2.0. And I think it will keep improving in the next few years, the way software has kept improving in the past few years.

But it's not because we call it AI that it makes it kind of like closer to some sort of RoboCop scenario of kind of like an all-dominating AI system that is going to take over the world.

It does feel like there's kind of these two different things that masquerade under the same name as AI. One of them is, I kind of like software 2.0, because software gave humans leverage to do more and to scale more with a small set of humans. And this new era of software, it really feels like it's just that on steroids. The richness of applications that you can build very quickly is astonishing and is another 10x improvement on top of the amazing software paradigms that we had until now. There is a completely separate thing, which is things that pass the Turing test. I'm talking to something and I'm pretty convinced that thing is a human, but it's not. And it is a little bit funny to me that these are both sort of referred to as AI, where one is really just like leverage for builders on how much they can make.

Yes, it's also maybe because we overestimate the second field that you're talking about. To me, it doesn't feel incredibly difficult and incredibly mind-blowing that we finally managed to build a chatbot. You thought you could do it in 2016, right? Yeah. If anything, I'm surprised that we didn't manage to build a good chatbot before. So to me, even that kind of like falls into development of the technology for the past few decades.

And I think sometimes we forget because we're so entrenched in kind of like today and we are more impressed with like progress of today than progress in the past. But imagine the first vehicles that were going faster than humans. You know, imagine the first computer that can retrieve information much better than humans. Imagine the first time you would go on Google and find any information in a matter of a few seconds.

These are all like impressive progress. Now we take them for granted, but they were impressive progress. So I think technology continues to progress the way it's been progressing for the past few years. Obviously, some of the builders of these technologies are hyping it, right? And are excited about it, which is normal. But as a society, I think it's good to keep some moderation, right?

And understand that the technology will keep improving, that we need to take it into the direction that is positive for us, for society, for humans. And that everything is going to be fine, that we're not going to fall into a doomsday scenario in a few months because of a chatbot.

Fascinating. It's funny, as you were talking, you linked it to the bicycle. I always think back to the Steve Jobs quote, a computer is a bicycle for the mind, which is in many ways saying it's leverage. It's a way for the mind to output way more than it otherwise could have, the way that a bicycle does to someone walking. And it's almost like this software 2.0 is a bicycle for the bicycle for the mind. It's like a compounded bicycle. Yeah.

Yep.

Yep, Vanta is the perfect example of the quote that we talk about all the time here on Acquired. Jeff Bezos, his idea that a company should only focus on what actually makes your beer taste better, i.e. spend your time and resources only on what's actually going to move the needle for your product and your customers and outsource everything else that doesn't. Every company needs compliance and trust with their vendors and customers.

It plays a major role in enabling revenue because customers and partners demand it, but yet it adds zero flavor to your actual product. Vanta takes care of all of it for you. No more spreadsheets, no fragmented tools, no manual reviews to cobble together your security and compliance requirements. It is one single software pane of glass that connects to all of your services via APIs and eliminates countless hours of work for your organization. There are now AI capabilities to make this even more powerful, and they even integrate with over 300 external tools. Plus, they let customers build private integrations with their internal systems. And perhaps most importantly, your security reviews are now real-time instead of static, so you can monitor and share with your customers and partners to give them added confidence. So whether you're a startup or a large enterprise, and your company is ready to automate compliance and streamline security reviews like Vanta's 7,000 customers around the globe and go back to making your beer taste better, head on over to vanta.com/acquired and just tell them that Ben and David sent you. And thanks to friend of the show, Christina, Vanta's CEO, all Acquired listeners get $1,000 of free credit. Vanta.com/acquired.

You've kind of been there for this whole arc of modern development of AI. How would you characterize open versus closed over the last, you know, call it six, seven years that you've been in this? Does it feel like the pendulum has shifted significantly during that time? Or is it like, oh, no, well, there was always open and closed. You know, you go back to the beginning and like, well, Facebook and Google were closed and the academic research community was open. How do you view it?

So first, the debate itself is a bit misleading because the truth is that open source is kind of like the foundation to all AI.

Something that people forget is even the closed source companies are using open source quite a lot. So if you think about OpenAI, if you think about Anthropic, they're using open research, they're using open source quite a lot. So it's almost like two different layers of the stack, where open source, open science is here. And then you can build closed source on top of this open source foundation.

But I do think if you look at the field in general, that it has become less open than it used to be.

We talked about 2017, 2018, 2019. At that time, most of the research was shared publicly by the research community, right? That's how Transformers emerged. That's how BERT emerged. Players like Google, OpenAI at the time were sharing most of their AI research and their models publicly.

which in my opinion led to the time that we are now. It's all this openness and this collaborativeness between the fields that led to much faster progress than we would have had if everything was closed source, right? OpenAI took Transformers, did GPT-2, GPT-3, and that led to where we are today.

For the past few years, maybe two, three years, it became a bit less open or a lot less open, depending on your point of view. Probably because more commercial considerations are starting to play a factor. Also because I think there have been some misleading arguments around the safety of openness against closedness, which leads to something weird where open source and open science is not as celebrated as it used to be.

Yeah, maybe talk about that. What is the argument and why do you feel it is misleading? There are a lot of people emphasizing the existential risk of AI to justify the fact that it shouldn't be as open as it is, right, saying that it's better not to share research because it's dangerous. A bad actor gets a hold of this and could do bad things. Exactly. That's not the first time that such things have been used. Actually, in every technology cycle, if you look at it, it's kind of the same, you know, like books are dangerous. They shouldn't be given to everyone, right? Like they should be controlled just by a few organizations. You need a license to write a book, to share a book.

It feels like that's never happened, though, in the software industry. Yes, that happened in the nuclear era. But I don't remember any of this around like, oh, my God, software as a service. That's terribly dangerous. Or the mobile apps. Ah, make sure state actors don't get a hold of that. Yeah, it's true.

Maybe the cycle has been faster with AI between people not knowing about the technology at all to everyone knowing. And so it creates more fears, more ability for people to manipulate and people to kind of like mislead. Maybe the name played a big factor, right? When you call it artificial intelligence, it's much more scary than when you call it software.

Even back in the day, the world viewed what was happening as like, oh, it's a bunch of nerds. Like there was its own community and the norms of the community were around openness, and really just coming out of the hippie movement in the Bay Area, frankly, in the 60s and 70s. But now the stakes are way higher.

Yeah. The competitive environment is quite different too, I feel like. The early days of software, I think it was easier for new companies, new actors to emerge than now, where you have much more concentration of power in the hands of a few big technology companies. So that might play a role. For me, one of the most important things in support of openness is that hopefully it's going to empower thousands of new AI companies to be built, which is incredibly exciting. Big companies are doing a lot of good and they're doing a great job in many aspects. But I think if we can use this change in paradigm between software and AI as a way to redistribute the cards and change things and empower a new generation of companies, of founders, of CEOs, of team members to play a bigger role in the world, it would be great. I think it would align in a way more the challenges and the preoccupations of society with what companies are actually building. So I'm excited to try to do that.

For listeners who haven't seen this firsthand, I was over the weekend with a good friend of mine who is a startup founder, non-technical, has a small bootstrapped company, decided to essentially build an AI product around it 10 days ago, you know, built it, well, probably decided a month ago, built it over the course of a couple of weeks, being non-technical, I'm sure using Hugging Face, launched it and it's like completely transformed his business. And like the output of it as a product is like mind blowing and world class thanks to these AI tools.

Yeah, it's incredibly exciting. That's one of the reasons why I feel like we don't need the doomsday scenario of AI or like the AGI superintelligence talks about AI, because just the fact that it's a completely new paradigm to build all tech is exciting enough. Just thinking about how many people it will empower, how many new capabilities, how many new startups, companies it's going to create is exciting enough for me and for a lot of people.

It's going to change a lot of things in the ways that you build companies, you build startups, as you mentioned, the way you invest in startups. I know a lot of investors are listening to this podcast. I think it's going to completely change the way you invest in startups. I've played a little bit with investments. At this point, I've done hundreds of angel investments in the past two years, mostly in the community around Hugging Face. And I think we're starting to see that building an AI startup is very different than building a software startup in many ways, which is, I think, impactful for the way you think about investing and returns for funds. Like, for example, it seems like it's the first time that you're seeing so many of these startups with very heavy need for capital, for compute, like a Mistral that we know, or like an OpenAI.

So I think it changes a little bit the way you think about investment, returns on investments, burn for startups.

That category of companies requires way, way more capital, but there aren't that many foundational model companies.

I think there could be. There could be. If you think of it, most of the investment now is going towards foundational LLMs.

But it's just one modality, text, right? What about foundational models for video, right? What about foundational models for biology, for chemistry, for audio, for image? What if actually foundational model companies are actually just normal AI companies, right?

The same way software companies were like the new type, the new default for companies in the software paradigm. The truth is that we don't know yet, right? I think it's still too early to tell exactly what the recipes for AI startups are. And so that's why it's super exciting as an investor too, because the truth is you can't apply the same playbook that you used to in software, right? In software, you were so mature that you had the playbooks, right? You needed like a co-founder, CTO, CEO, small team, and then you do the lean startup and then you follow your rounds and then you get to the highest probability of success.

What if in AI it's completely different? For example, most of the founders actually are not software engineers anymore. They're scientists. It's a totally completely different game. The lean startup doesn't work anymore because they need heavy capital investment before any sort of return. So what I'm saying is that it just completely changes the game.

And you have to forget everything that you've learned, everything that you've internalized and start from scratch.

It's funny, where I thought you were going to go with this was that AI companies, or companies that use AI, can be just a few people and get huge output because they're just using the APIs provided by these foundational model companies, and there's an extreme amount of leverage to produce great value for customers with few employees. You took it completely the other direction, which I think is quite contrarian, and said most AI companies, or perhaps you were saying most dollars deployed into AI, will require new foundational models. And therefore, they're going to be these unbelievably large investments to get these step function advancements in a lot of different fields.

Am I hearing you right?

Yeah. Yeah. And I think the truth is that nobody knows yet. So I'm not saying that I'm 100% sure that it's going to go that way, but I'm saying that it's possible. And so that's why it's exciting to see how it's going to evolve in the next few years.

One easy way you win that argument is that the dollars consumed by foundational model companies are so large that even if there are a thousand times more regular startups consuming APIs provided by AI companies, it's still the case that most investment dollars will actually go to foundational models and large training runs.

I mean, if you look at some of the successful companies so far, if you look at Hugging Face, if you look at OpenAI, companies like that, I don't think they acted in the traditional way you would expect a software company to act, right? And maybe on OpenAI, they started with a billion-dollar raise, did open source, open science for six, seven years, and then started a completely new model. For Hugging Face, we operated in fully open source for many years, really community-driven, a very different kind of organization than what everyone was telling us to be.

So I think there's something to be said about really throwing away the playbooks, throwing away the learnings from the software paradigm, and really starting from scratch, maybe starting from first principles, and building a new model, a new playbook for AI.

Has Hugging Face as a company been particularly capital intensive? And if so, why?

We haven't. So we raised a bit more than $500 million so far over the course of seven years. We actually spent less than half of that money.

And we're lucky enough to be profitable.

Congratulations.

Which is quite unusual for most AI startups. We have a different kind of model than some other AI companies.

I assume you all don't have nearly the same kind of capital expenditure requirements that, say, an OpenAI does in terms of compute and training.

Yeah, yeah. And we have enough free usage already that, with a quite straightforward and quite permissive freemium model, we can easily get to a level of revenue that is meaningful. We have some specificities for sure that allow us to do that. And it was also an intentional decision for us because as a community platform, we want to make sure that we're not going to be here just for a year or two years. When people build on top of you, when they contribute to the platform, I think you have some sort of a responsibility towards them to be here for the long term. And so finding a profitable, sustainable business model that doesn't prevent us from doing open source and sharing most of the platform for free was important for us to be able to deliver to the community that we're catering to.

Your customers do use Hugging Face for very capital-intensive things, training these models, but that doesn't show up in your financials as, oh my god, we had to sink a billion dollars into a training run. You partner with a cloud provider on the back end and pass it along to whoever is doing the training run, right? Yeah.

Yeah, we try to find sustainable ways to do that, either by partnering with the cloud providers, by providing enough value so that the companies that are buying the compute are okay with paying a markup on the compute that makes it high margin, or by providing paid features that are basically 100% margin. For example, a lot of companies are now subscribed to our Enterprise Hub offering, which is an enterprise version of the Hub, and which obviously has a different kind of economics than selling compute.

Yeah, very proven business model. You get to choose how you make money. Are you marking up compute? Are you selling SaaS? Are you going the enterprise route and developing this custom package for every engagement? I'm very curious on the routes where you choose to apply a margin or a markup on top of compute. What is it?

Because clearly you're not ashamed of this, and I think it's a great business model. What is it that Hugging Face can provide where a customer goes, yeah, I'll do it through Hugging Face instead of going and figuring out how to do it myself directly on a cloud provider?

We've never been so interested in taking part in the race to the bottom on compute. It's a much more challenging business model than a lot of people think, especially with the hyperscalers being in such a position of strength, both in terms of offering, but also in terms of cash flow, right? Giving them the ability to do a lot of things that other organizations wouldn't be able to do.

And so the way we think about it is instead of taking part in this race to the bottom, we're trying to provide enough value with the platform, the features and the compute so that companies are comfortable paying a sustainable amount of money for it. So when you use, for example, an offering like Inference Endpoints or Spaces GPUs on the platform, the idea is that it's so integrated with the features of the platform that it actually makes it 10 times easier for you as a company to use that as a bundle versus using just the platform and then going to a cloud provider for the compute.

So it's what I call locked-in compute. It's not the kind of compute that you can trade in, where it doesn't really matter to you if you switch from AWS, Google Cloud, or another provider. It's more that we make the experience so much more seamless, so much less complex, which is the name of the game for AI, right? AI is still complex for most companies. So at the end of the day, yes, companies are paying more for it. But instead of having 10 ML engineers, maybe they're going to have one or two.

The alternative to this would be you have your AI researchers working on models, and then when you want to go train or deploy them, not through Hugging Face, you basically need a whole other team of AI infrastructure deployment engineers, right?

Yeah, yeah. As we mentioned before, I think we're in the early days of AI, in the early days of AI monetization. Today, no one knows what a profitable, sustainable business model for AI is, right? Even the big players. I mean, OpenAI is, of course, generating a lot of revenue.

But the question of profitability and sustainability of this revenue is still an open question. And I think they're going to figure it out, and I hope they're going to figure it out. But we're so early in figuring out business models for AI that there's a lot to build. And so that is extremely exciting.

And I would argue you're not figuring out any business model. You are using time-tested, proven ways to make money, where you occupy a particular part in the value chain where you're providing a rich set of experiences to developers. They're willing to pay for that directly. They're willing to pay for it in the form of slightly more expensive compute. The nice thing is you get to innovate on all the AI things without having to build a business model from scratch. These foundational model companies, that is where there's this big open question of what exactly is the business model, especially when the consumer expectation with interacting with all these AI chat style agents is that that is free for a huge set of functionality.

Yeah. The beauty of the position we're in is that if you're the number one platform that AI builders are using, and if AI becomes the default to build all tech, it's pretty obvious that there's a sustainable, massive business model around it, right? Otherwise, we would be doing something wrong. That's why we're so focused on the usage, on the community, because we believe if we keep figuring that out, if we keep leading on the usage and the adoption, if we keep empowering the community to use our tools and be successful with our tools, there are going to be good things in the future for us, for Hugging Face and hopefully for the community.

There are some businesses that are just perfect. Like you sort of analyze them. Visa is a good example. And you're like, man, there's basically nothing wrong with this business model. Everything about it is just glorious if you are a shareholder of Visa. And every business shy of Visa has these things where you're like, that's an exceptional thing about that business. And here's the thorn in my side, that as I'm operating this business, I just can't escape this thing that kind of sucks. We've talked a lot about all the ways in which you've positioned yourself in a remarkable place in the emerging AI ecosystem. What's the thing that you have to deal with where you're like, oh, it is such a thorn in my side?

For us, inherently, we have to almost take a step back from the communities that we're empowering. That's a little bit the curse of the platforms.

So like if you think, for example, of GitHub, it's probably the company in the past 20 years that has empowered the most the way you build technology, right? Because virtually all software engineers have used GitHub as their way of collaboratively building technology.

And yet people don't talk about them, right? They don't talk about the product. It's not as visible as Facebook, Google or these companies can be. So we have some sort of a curse around, I would say, visibility, maybe sexiness. We'll never be an OpenAI in terms of sexiness and hotness and people talking about us, and we'll always stay a little bit in the background.

Back in the day, though, when GitHub was in its earlier years and was a startup, it was very... The $100 million Series A. I still remember it. I remember that for sure. It was plenty buzzy. But to your point of as an infrastructure company or a, well, developer writ large, in your case, AI builder platform, you're more behind the scenes. And then another challenge for us is that, yes, AI is starting to be mainstream in terms of usage.

But if you really look at it, the underlying technology foundations are still evolving really fast. And so there's this constant battle between building mature, stable platforms and solutions, but at the same time, innovating, iterating fast enough so that you don't miss the next wave. So for us, more as a company-building strategy, it's something that we're always worried about. We have 250 team members in the company. We say that we always want to stay an order of magnitude fewer team members than our peers. Like we could be 2,000 people, but we prefer to be 200 rather than 2,000, as a way to reconcile this difficult challenge between building really, really fast, but really building tools that scale.

That's an important challenge for us, for sure.

That's such a good point. And you made it a minute ago. I hadn't really considered, you know, we might still be in the sort of, you know, Yahoo, AltaVista era of foundational model companies. Many of them are very successful. You know about them. As you were saying, they make a lot of revenue, but like, are they fundamentally profitable endeavors yet? Probably not.

I think we are. Even when you think about how companies are building with AI. To me, an AI company using an API sounds very unintuitive, or it doesn't sound like the optimal way to build AI, and more like a transitional time where the technology is still a bit too hard for all companies to build AI themselves.

But I would be surprised if it didn't happen. It's almost like the early days of software where you had to use, I don't remember what they were at the time, but like a Squarespace, you had to use a no-code platform to build websites.

Dreamweaver and Microsoft FrontPage.

Yeah, yeah. Before technology companies could learn, before software engineers could learn to build code themselves. We might be at the same point in AI, where companies are using APIs because they haven't yet built the capabilities, the trust, the ability to do AI themselves.

At some point, they will. They know their customers. They know their constraints. They know the value that they're providing. At some point in history, all tech companies will be AI companies.

And that means that all companies, all these technologies, they're going to build their own models, optimize their own models, fine-tune their own models for their own use case, for their own constraints, for their own domains.

I think this is pretty contrarian too. I, coming into this conversation, would have fallen in the opposite camp: there are going to be five to eight players, maybe even consolidating more from there, that need to spend 10 to 100 billion dollars every couple of years. And no one else has that ability to spend or attract that sort of research talent. And so we all consume their APIs.

And you're proposing a very opposite future. Yeah, I mean, I'm a bit biased, obviously, by the usage that we see. Well, you're a lot closer to it than we are. As I was saying, there's a new model, data set, or app that is built on Hugging Face every 10 seconds.

So I can't believe that these new models are just created for the sake of new models. I think what we're seeing is that you need new models because they're optimized for a specific domain. They're optimized for a specific latency, for a specific hardware, for a specific use case.

And so they're smaller, more efficient, cheaper to run. So ultimately, I believe in a world where there are almost as many models as code repositories today. And actually, if you think about models today, they're somehow similar to code repositories, right? A model is like a tech stack. So I can't imagine that only a few players are going to build the tech stacks and that everyone else is just going to try to ping them through APIs to use their tech stacks.

So I envision a bit of a different world.

Yeah, it makes sense. And implicit in your comment is that 99.9-something percent of models are inexpensive to train and do inference on, and they're small and they're purpose-built. And it's nice that this thing happened in the last three years where these sort of god models seem to be able to do everything better than all the specialized models that people spent 10 years building before, but that's a blip in time. And we're going to kind of shift back to specialized cheap models handling a lot of the labor as everyone gets better at the state of the art.

Yeah, or something in between, right? It's always a gradient. And I think some companies, some contexts, some use cases will require very large generalist models.

So when you're doing a ChatGPT, yes, of course, you need a big generalist model because your users are asking everything. But when you're building a banking customer support chatbot, you don't really need it to tell you the meaning of life, right? So you can save some of the parameters to make sure that your chatbot is smaller and has been trained more on the data that is relevant to you, so that it's going to cost you less and reply faster. So that's, of course, also very dependent on the use cases that you plan to use AI for.

I'm curious, if you're listening to this and thinking about starting a company, thinking about starting an AI company, maybe you have a use case or vertical knowledge that you want to go after.

What are the ingredients and skill sets that you need on your team if you buy what you're saying, that, hey, you could use APIs, but really ultimately you want to build your own model? What do you need to build your own model and build a great one?

So for me, the main difference between the software paradigm and the AI paradigm is that AI is much more science-driven than software. It's a bit of a paradox because in software, sometimes we call people computer scientists, right? But the reality is that they're not really scientists in the true sense of it, right?

Such a misnomer. They're engineers, yeah. This always bothered me studying computer science in college. All of the other sciences are things that occur in our natural world: biology, chemistry, physics. And computer science is like, no, you're learning how a thing that is man-made works and how to operate it.

Yeah. So to me, that's the main difference between the software paradigm and the AI paradigm. So when it comes to founding teams and capabilities, I think having more science backgrounds is actually a must. Having one co-founder who is a scientist, I think is a big, big plus. If you look at most of the successful AI companies, they actually have a science co-founder. We do at Hugging Face. I think OpenAI has, of course, with Ilya. That's one big thing.

How would you describe the difference in mindset and skill set between a traditional software startup and the engineering skill set you need for that versus the scientist skill set and the research skill set? Timing is very different. And the way you look at how fast to build something, ship something.

When I was working more at software startups, right, we had the cult of shipping really fast. This might not be as true for AI.

I think you want to ship as fast as you can, but realistically, to train a model and optimize a model, it's more, at best, a matter of months than a matter of days. So you probably want to look differently at how you're shipping, how fast you're shipping, how you're iterating on things.

The skills are quite different too. I think an AI scientist has the potential to be more skilled at math, pure math, than an engineer. I think they're thinking more in terms of, how can I make foundational or meaningful progress compared to the state of the art? And you're looking at bigger scales of improvement. I think in the software paradigm, you can almost think, okay, if I make my product 5% better than others, it's going to be enough because I'm going to make it 5% better now, and then in two weeks, 5% more, and in two weeks, 5% more. And at some point, you'll have enough differential in terms of value add to get users and convince and retain users.

For science, it's almost like you don't create any value: you work on something for six months, and then after six months you have something 10 times better than what exists. In a way, that's what OpenAI did. They worked for six years barely releasing anything, or anything successful, but at some point they were able to release something that was probably 10 times better than others.

So that's kind of like a different way of looking at it too.

I'd push back on that. I think that's a little bit revisionist history. I'm sure you were watching OpenAI very closely. It felt like they were releasing all sorts of stuff. None of it had any commercial value and all of it felt super researchy. But that thing where they trained Universe on Grand Theft Auto, I mean, GPT and GPT-2, they weren't known in the mainstream, but it was pretty remarkable watching that. I think them going all in on the Transformer and deciding, hey, we need to fundamentally change the set of things that we're working on. I think that company has worked incredibly fast, shipped pretty fast. And now they're shipping faster than ever because they're actually in this arms race. I definitely don't think of them as a go away, think and build for 10 years and then finally release something.

They did release a lot of things, but compared to their size and their scale, knowing that they started with a $1 billion investment, maybe they were releasing one thing every three months or one thing every six months. So relative to their size and their scale and the amount of money that they raised, I think they were shipping and releasing way fewer things than the typical software company would have with their budget. But I agree with you that it was an iterative way.

I guess to the point too, if you have a large model, you're not going to do continuous deployment because you've got to retrain the model if nothing else, right?

Yeah, it's just a different approach. The best advice I give to people is to trash their lean startup book when they're starting an AI company, because I think these kinds of things have been so ingrained into our minds, into our way of building as software entrepreneurs, that it's really easy to fall into the trap of doing it without even realizing we do it, instead of completely changing the paradigm, changing the operating system of the startup builder, which in my opinion leads to much better results.

Well, Clem, this brings us to a topic that I've been wanting to ask you about, which I think will be kind of our last major topic for today. In the discussion of which approach will sort of win in the marketplace of open source versus closed source AI, there's a pretty compelling argument, which is that as more real-time training data is required, people's interactions with an application will become incredibly valuable to fine-tune or train the next version of the model. There's sort of a compelling argument that closed source AI will win because they're just going to get all of that directly from users when you own the model and the application and you sort of have tightly integrated everything, versus in the open source world, like, great, you publish something and then a bunch of people fork it and they build their own applications, and then the real-time interaction data with the application doesn't make its way all the way back upstream to make the model smarter. How do you think about that?

Well, I think a lot of people are thinking and talking about moats and economies of scale for AI. I think that all of those are open questions at this point. I think nobody really knows how to create a moat or how to generate economies of scale for AI. My intuition is that they're not going to be so different than in the software paradigm, and that you're going to find the same kind of moats, maybe applied differently. You're going to have the cost economies of scale, right? Similar maybe to a cloud provider or a hardware provider who can get an advantage from larger scale to reduce prices. I think you're going to have the social moats or the network effect, right? That's more the game that we play in, where when you have collaborative usage, your platform becomes more and more useful the more users you have. And so it makes it difficult for anyone to compete with you, right? That's why GitHub has never really been challenged, or that's why social networks are arguably very hard to compete with.

They're going to maybe be more intense than in the software paradigm. So maybe the cost moat of compute will be more extreme. But it's an open question because, you know, if you think about the current winners, some of the current winners, they didn't have so much of these advantages from the get-go. If you look at OpenAI, they didn't really have more access to data than most companies, right? They ended up scraping the web and getting data that everyone else could get. If you look at Hugging Face, I don't think going in, we had any specific advantage except being as community-driven as we were, which enabled us to develop the social network effects. It's still an open question. I would be careful of people and companies overplaying and overhyping one sort of moat compared to others. And even if you think ethically about the kind of world that we need, I hope that we're not going to have just a few companies winning. It would be a shame, it would be quite sad if we ended up with just five companies winning in AI.

I think it would be dangerous, right? Imagine if only a few companies were able to do software. We would be in a very different world than we are today. I hope many companies win. I think the technology is impactful enough so that there can be almost more AI companies winning than software companies winning in the past.

That'd be very exciting to me. And you make a very credible argument that it's going to empower more people than ever to build products. And so it stands to reason that there should be more companies, or at least more attempts to start companies, that can serve a particular customer need in this generation than in any previous generation.

AI is the opportunity of the century to shake things up, break the monopolies, break up the established positions and do something a bit new.

I'm curious, to get there, do you think that we need just a lot more people getting trained in how to be AI builders and AI scientists? Or do we need the tools and infrastructure to get a lot easier to use, or both?

Both, but I think it's much more important that we get many more AI builders than we have today. If you're looking at Hugging Face, as I was saying, we have 5 million AI builders, right? So we can assume most AI builders are using Hugging Face one way or another. So you can estimate that there are around 5 million AI builders in the world today. There are probably around 50 million software engineers or software builders, depending on how you set this definition. I think GitHub has over 100 million users. A lot of them obviously are not software engineers, but probably half of them are. So we're still in the early innings, right? It wouldn't be surprising if in a few years you had more AI builders than software builders, right? So maybe in a few years you're going to have 50 million, 100 million AI builders. Even more, because the beauty of AI is that it's a bit less constrained than software in terms of the way that people can contribute to it. In a way, to be a software builder, you have to learn a programming language and write lines of code, which is a pretty high barrier to entry. Versus for AI, you can be considered an AI builder if you contribute expertise, if you contribute data to a model that improves the model.

Maybe we're going to have 10 times more AI builders than software builders, which would also be good for the world because it would mean that more people could contribute, could understand and could shape the technology to be more aligned with what they want. I think sometimes in San Francisco, in Silicon Valley or in tech in general, we forget that it's a very small number of people shaping products for a much bigger number of people. Whereas if you maybe include more people in the building process, you can not only build better products, but more inclusive products, maybe products that can solve more social issues than we've been solving.

And so that's quite an exciting future for sure. Well, Clem, I can't imagine a better place to leave it. Where should listeners go to learn more about you or Hugging Face or get involved?

Huggingface.co, actually .com. We just got the .com a few days ago.

Hey, congratulations.

Yes, it's a good example that you shouldn't sweat the small things early on, right? Our name, Hugging Face, is obviously very unusual for the kind of things we do. Our domain name, for like seven years, we kept Huggingface.co, but it didn't create too many problems for us.

I'm on Twitter. I share a lot on X and on LinkedIn. So you can follow me there or ask me questions there and happy to answer. Awesome. Well, thank you so much. And listeners, we'll see you next time. We'll see you next time.