
Live at Tech Week: Delivering AI Products to Millions

2024/7/12

a16z Podcast

Key Insights

Why is the current time so exciting for AI technology?

It's exciting because of key breakthroughs in technology like transformers and diffusion models, which allow us to use more data to train models, leading to better results.

Why is data quality crucial in AI models?

If the quality of the data is poor, the output will be half-baked with lots of mistakes. High-quality data is essential for achieving the expected results.

How has Descript's approach to AI evolved over time?

Initially, Descript embedded AI without highlighting it. Now, with better AI, they are integrating all AI features into one space and adding new ones, focusing on how humans and computers interact.

Why is it important to solve customer problems with AI rather than just showcasing the technology?

Solving customer problems is essential for retention and long-term success. Just showcasing technology can lead to users not knowing how to use it and leaving.

How can AI be used effectively in marketing?

Using the term 'AI' in marketing can help convey that a product is a step change from existing solutions, but it must meet the expectations set by the marketing claims.

What is the role of building your own models versus using APIs in AI products?

Building your own models can provide a competitive edge and allow for deeper integration. However, using APIs can be more cost-effective and quicker for certain applications.

What are the key factors for defensibility in the AI market?

Defensibility comes from user retention, community engagement, and multiple layers of value, such as foundational models, data, and partnerships.

How does retention differ between consumer and B2B AI products?

Retention is more challenging in consumer products due to the high turnover and the need to solve specific user problems effectively.

What metrics does Descript focus on to measure customer value?

Descript focuses on time to expression (how quickly users can create shareable content) and editing richness (the quality and complexity of the content created).

What are the challenges in preventing misuse of AI products?

Preventing misuse is an ongoing battle, requiring continuous innovation and investment in safety features. Open source tools can make it harder to control misuse.

How does internationalization play a role in AI product development?

Internationalization is crucial from the beginning, as AI products can bring change to workflows across different cultures and regions. Adapting to local needs and languages is essential.

How do companies balance the use of open source models with proprietary models?

Open source models can be complementary to proprietary models, offering a balance between cost and features. Proprietary models can provide a competitive edge and more advanced features.

Chapters
Advancements in AI models, particularly in handling unstructured data, have revolutionized how we process information. This progress is fueled by breakthroughs like transformers and diffusion models, along with increased access to larger datasets for training.
  • Unstructured data is now usable thanks to transformers and diffusion models.
  • Larger datasets and hardware/software advancements drive better results.
  • Data quality is crucial for avoiding errors and achieving desired outcomes.

Transcript

We've existed for about three years and we've passed everybody in revenue in like literally a year and a half. Usage is important, but that does not define the long-term success of an actual customer. I think that daily active use is a pretty terrible metric to uncover customer value. There have been companies built in the past on just great design. There's no reason that they can't be built on the AI side.

Between June 3rd and June 9th, a16z ran its second annual New York Tech Week. Now this week had thousands of people attend a record-breaking 700-plus events, including one event run by our podcast team.

Now this a16z live recording is exactly what you're about to hear. But first, let's take a quick trip down memory lane. When ChatGPT was launched in November 2022, it quickly became the fastest-growing consumer application in history. But text-based AI was just the beginning.

In the next 500 days, a flurry of AI models launched that spanned new modalities, from images to video to audio to 3D, that all yielded an entire ecosystem of applications that have upended, quite frankly, the way we work, learn, create, and even play. Now here in mid-2024, competition is fierce, but I don't think I have to convince you of that.

So for this live recording, we brought in key leaders at three AI companies to discuss how they've managed to stand out amongst the noise because they have products that reach millions of users.

So in this conversation, you'll hear from Gaurav Misra, co-founder and CEO of Captions, Carlos Reina, Chief Revenue Officer of Eleven Labs, and Laura Burkhauser, VP of Product at Descript. Together, we explore what ladders up to AI products that people actually use, including what features really matter, when AI is necessary or distracting, whether you need to own your models, designing for retention and international expansion, and of course, where we all go from here. I hope you enjoy this recording as much as I did.

Thank you.

And so we're actually less than two years since that, but a lot of people are familiar with text-to-text. But all three of the products here go into several other modalities, right? We've got audio, we've got video, imagery. So I think that's really exciting. But maybe we could actually just start with the why now, and specifically maybe the unlock that we've seen with unstructured data, right? Before we used databases and everything needed to be really structured in order for us to make sense of it.

Today, that's not quite the case. So Gaurav, maybe we start with you. And what do you see really today as the why now? Yeah, I mean, I think it's a really exciting time generally, just because obviously there's been a couple of key breakthroughs just in terms of technology with transformers and diffusion models and so on and so forth. But I think the key here is we're able to use a lot more data to train these models now than ever before, right? And there's a bunch of things happening both on the hardware side, the software side, right?

And the data side to enable that to happen. And that's why we're seeing such amazing results, right? If you look at a lot of what the key players in this industry are doing, they're just training these models with more and more and more data every iteration, right? And that's able to produce reliably better and better and better results, which is pretty amazing to see. And the end is not in sight.

So far. Carlos, maybe we'll go to you before we talk about Descript in a second. I think it's correct. The key message for us, from experimentation at Eleven Labs, has been like, garbage in, garbage out, right? If the quality of the data that you put in is not that great, then essentially what you end up producing is half-baked with lots of mistakes and things like that, right? And we can see that with Whisper.

How many of you have tried Whisper and it comes out that like, "Subscribe, subscribe, subscribe," and things like that, right? All the time. That's true. We've seen it all the time. But so I think like for us, like there's been a layer. Initially, we trained it with a lot of data and then over time we end up curating the data to make sure that like it is very high quality. Otherwise, you're not able to achieve the results that you are expecting or that you, your consumers or your businesses would need, right? But that's a fundamental change that has happened in the market.

amounts of data being used with transformers and LLMs to generate this human content generated, whether that's speech or text or anything else, right? Yeah, 3D models. We're seeing all types of stuff. So the reason I wanted to wait to talk to you, Laura, is because I don't know how many of you have used Descript, but any guesses on when Descript started?

We talked about ChatGPT, November 2022. So, Descript has been around since 2017. The reason I wanted to frame that is because, obviously, the last couple of years, very exciting, but machine learning, AI, the 50s is when this really got going. And obviously, there have been unlocks. But I want to get your pulse, Laura, on the importance of putting AI at the forefront. A lot of AI is embedded in the applications that probably people in the room are building as well. But Descript long used machine learning before really saying, hey, you're using machine learning, AI, etc. So what are your thoughts?

That's right. So Descript is software that lets you edit video just like a text document. So if you can edit a Google document, congratulations, you're also a video editor. You can just download Descript and now you can edit video. And it turns out that the tech

Right.

right? They don't care what is the technology that is creating this value for me. What they care about is there is value here. This is helpful for me. And so that was long our way of designing software. And it probably would have continued that way forever, except that

Actually, when I think about the thing that is making us change our minds, in addition to some of these cool models that are coming out, it is that the way that humans and computers are interacting is totally different. So you can talk to your computer now. You can use human language to communicate more subtle intentionalities that you have for how you want to edit your video or create your video.

So, as this technology has gotten better, we've thought, "Well, gosh, do we actually want to design AI in the product differently? And if so, how?" And so with our latest release, we're actually bringing all of the AI features that we've long had in the product into the same space and adding a ton of new ones. And we had a big discussion with our design team about how do we do this? And one of the big discussions we had is, "Is AI a magic wand or is it an entity?"

And one of the big decisions you have to make there is that traditional creators are much more used to interacting with Pro Tools software or creative software in a point and click way. And so they want a magic wand. But you have this whole new wave of people that are now generating and editing video and audio. And they're used to using kind of more of this entity interaction. They want an entity.

Then you start talking about an entity, right? And you get into internal discussions like, I don't know if it's an entity. That might be a bad idea because what about our robot overlords, our inevitable robot overlords, right? That's kind of like one side of the debate. And then hilariously, you have the other side of the debate that I don't want an entity because actually it turns out this technology is really stupid sometimes. And if you make it an entity...

you know, you said like, hey, welcome. This is like your co-editor. And it turns out your co-editor is just like a total moron that makes horrible suggestions sometimes because it's hallucinating. And so we're like, okay, how do we deal with that? So what we decided to do with this newest release is we're actually, we're calling it Underlord. And it's a nod to the potentially apocalyptic future of AI while also admitting that right now this thing is kind of like

a very eager, like somewhat competent intern that does a really great job at the first pass of the worst parts of your workflow. So that's some of the story about how we've thought about designing with AI over the years. I'd love to get both of your takes. Like, how do you think about that same question? What part of AI do I put at the forefront? Or do I just use this really powerful technology and kind of give my users what they want, but not really sell this AI thing too much?

So I'd say at the end of the day, you have to solve customer problems. That's what we're trying to do. I think the biggest mistake that can be made is to say, "Hey, here's a technology. You can have technology, do whatever you want." People can't just take that and be like, okay, I know what to do with this, right? I think you have to mold it into a product that solves a problem at the end of the day. So I think that's like traditional. Nothing's changed there, right? It's exactly the same as before. And if you're not doing that, then essentially you're going to see retention problems, right? You're going to see people coming in, trying out the thing, not knowing exactly what to do with it, not working perfectly for their use case, and then they'll leave, right? Kind of tourism is what we're calling it, right? But

I think at the same time on the marketing side, like stepping away from product for a second, there is something to be said about sort of having AI in your message on the marketing side. Here's why. If I just say, I have a better product, it's so much better. You won't believe it. I'll be saying the same thing that people have been saying for literally a hundred years about every product, right? Like, yeah, yeah. Trust me, it's better, right? Trust me, come on and try it out. This is every single product that exists, right? But

Putting in that AI term in there, just from the marketing side, it's just tactical, actually lets people understand, oh, wait, this is going to be a step change, right? Of course, if you don't meet that expectation, when they land on the product, you're going to have a problem. But if you're able to meet that expectation, putting that in kind of does inform people about like, okay, this is not going to be sort of like the better product, it's going to be a step change completely.

compared to everything else we've seen. So that's the general guide. I do feel like a lot of people are just throwing in the AI term on the marketing side now just to kind of get the eyeballs there. And maybe that message will kind of get lost a little bit. But so far, the innovation has just been so strong that the message has kind of remained strong. And if it continues this way, the marketing side can continue as well. But at some point, it might get muddled. We'll see.

Maybe just I can add on a modifier for you, because I think not only do you have to market the product, but if you use this bucket term of AI, right, that means many different things. Do you own your models? Do you build your own models? Are you an API wrapper? And so I'd love to hear from you, Carlos, at Eleven Labs in particular, like in building your own models as well, like how does that play into it? Is it a whole marketing packaging, thinking about what you share and what you don't?

Yeah, we need to be open. Like we are an AI company. Sorry, guys. And we say it all the time. Like we say, like we do AI voices, we do AI sound effects. We're going to be doing AI music in many ways. So for us, like it's all about the audio sphere, right? It's like that layer infrastructure that allows you to create high quality engaging content, whether that is like with voice, with like audio overall. And

And the way we thought is, well, actually, there wasn't really a good quality text-to-speech available before we invented our own site. So we were fundraising initially. It was difficult because the market is not there. Like, how are you going to be getting customers and so on? So it was like, it was really tough in the early days. But we thought, look, if you're able to deliver quality, voices that sound engaging, and build applications on top of it,

then you end up having a market that is just fully on top, right? So how do you do that? AI voices, simple and plain, right? And that worked really well. So we started with like the LLM, pure like API play with a very simple UI. That was end of January last year when we launched the product. And we thought, well, actually, there's going to be like some pieces of like some content creators that might want to use the UI, but we expect

on the API side is going to be quite big purely because people might want to build their own applications on top of it. And it worked really well. And since then, we also realized that, well, you cannot expect all of the business to have the capabilities to build their own applications. So what if we end up going full end-to-end and we build our own applications for areas where we really care about?

And that's how we end up creating Projects or Audio Native or the dubbing product and a bunch of other pieces, right? So it's been very interesting for us. And of course, we always say that it's AI-driven because at the end of the day, we're a foundational model that happens to also build applications on top of it. But I think the beauty of it is that anyone can build anything they fancy on top of the API. And today we power quite a lot of different companies. More than 41% of Fortune 500 companies use Eleven Labs.

We power a lot of startups and we are very proud to help all of these companies like succeed as well. Right. So it's been very interesting, like having both sides, both motions, like the pure API play and the application layer on top of it. It's challenging as well, because then you end up having two different profiles in terms of like on the product side, on the engineering side and everything. Right. So you always need to balance it.

Absolutely. Maybe we can actually jump straight to that question of competition. I feel like if there's one question that comes up on this podcast the most, everyone's excited about AI and they're like, okay, well, where does differentiation come up? Where do moats arise? I'd love to probe all three of you on that. I know we're early, but where do you think you can stand out? Do you really need to be building at the model layer? We talked about the infrastructure layer. Or can you really just build a really great UI and capture the app layer? What do you think about that?

Maybe I'll start here by saying, again, not much has changed in terms of like, there have been companies built in the past on just great design. So I think there's no reason that they can't be built on the AI side. But at this point of the journey, there's so much to innovate on, so much to build on. It does help to have models that are foundational and built in-house, because it does give you that extra differentiation. And that extra step, it is a competitive field. And the deeper you can go, and the more you can build from the ground up, really connecting these different layers together, right? You can deliver super fast speeds on your models. You can deliver the highest quality that anyone's seen, right? And you can deliver a great user experience that solves a real problem. Then you have an advantage there. So I would say though, for consumer companies, which we're a consumer company, right? Like we're used by literally millions and millions of people around the world. And people make over a hundred thousand videos a day published through our platform.

For a consumer company, it does matter a lot to have that differentiation at this stage. I think in the longest term, if you think about what differentiates a consumer company in the longest of terms, it's probably just brand, right? And that's kind of what you're building over a period of time. And the only way a brand dies is with the generation. It also takes a generation to build a brand too, right? So I think...

that's kind of the ultimate goal of where you want to get to. But I think in the meantime, there are many moats that last different lengths of time, whether that's a data moat or a model moat or a UI/UX moat, whatever it might be.

So at Descript, I would say that we are a horizontal editor and we're a very powerful human editor, which is something that I think a lot of kind of newer, just started in the age of AI, in the second chapter of AI companies can't say because it takes a long time to build a really powerful, horizontal, human-driven editor. So you can do like really complex editing jobs.

with Descript if you already are like an expert who's great at this work. And you can do it really quickly with low barriers to entry if you're new to it. For that reason, I think the application layer is especially important to us. And I almost see us as a mirror to kind of what Eleven Labs was saying, where I think like in general, we...

have a may the best model win sort of mentality when it comes to all of the different models that we use in our application layer. And that's because we're trying to do everything, not just AI voices, but things like eye contact, things like avatars, things like AI speech, transcription, editing video with text. If there's like a cool thing happening in AI, when video generation, when Sora comes out, that will be in Descript, we're going to have it.

And so I think generally, we have an attitude that is may the best model win. We want to give our customers the absolute best experience. If we don't see interesting enough work happening in a space that we want to be in, we'll build that model. And I think there are real places for Descript to differentiate because we own so much of the editing workflow and have really great editing workflow data that that may be a place where our models become differentiated. But in general, if you're trying to provide a

ton of different services to customers across a ton of different workflows, it can really make sense to not try to build every single one of those in-house, but instead to be very thoughtful about where it makes sense to own versus buy or borrow.

I think there's an element here, if you think purely about differentiation these days, because the market has bought a lot from purely foundational picks and shovels, and now the transition is towards the app side. What you end up thinking about, or how I think about defensibility, is purely about, like,

your users, your consumers, or your businesses, right? Like that's essentially what will drive defensibility over the long term. And if you think about like Instagram or Meta or like a Facebook in the early days, what was their defensibility? There was literally nothing out there, but they were able to fast grow, like outpace everyone in terms of growth, deliver value. And then the UI was not even that great, right? But it was actually like you were feeling that you were part of the community and it was like the experience that you were getting, right? So the defensibility was coming from the actual users versus the product itself, right?

And I think like the transition that we're seeing today from the foundational models to, like, the app side, it's actually very interesting because then you're able to engage different types of generations or different types of users that like, if you retain them and you give them the best experience possible, they will stay there for the coming year. Right. Whether that is because they're building their own applications on top of the API or because they essentially are like, well, I want to use your app overall.

And the way we also think about this at Eleven Labs is like layers, right? So having the foundational layer, which is like the research that we provide. Great. We do LMs and essentially we provide the best text-to-speech and AI voices in the market. Fantastic. What else do you have on top of it? The data that we've acquired, that we've licensed from partners, the end-to-end products that we're building, the partnerships that we have, the customers that we have. So you end up creating all of these multiple layers that essentially end up building your core defensibility in the market that hopefully will sustain us for the coming years, right? As the market changes,

If one of the layers ends up getting replaced, absolutely fine, because then essentially you have all of the other ones that will back you up in the long term, right? Yeah, and something you spoke to there is just like this new generation. And I think we're all kind of trying to figure out what can now be done with AI. You talked about UX even, or designing a new UI. Voice is now in the mix in ways that it wasn't before. But then you also have this question of, do I want to completely reinvent the wheel, show someone a very powerful UI,

that they're maybe just not familiar with and that you don't retain them. So Gaurav, I'd love to probe you on retention. I mean, even just from the perspective of desktop versus mobile, you do have a mobile app. How do you think about designing for that? Because we've seen over and over the last, let's say two years, there's this extreme willingness to try. But then I think someone internally coined this like AI tourist phenomena, right? It's people try and then a lot of them do leave. So how do you think about that?

Yeah, I mean, it's something we think about a lot because at the end of the day, I think you can kind of go by metrics and you can really worry about like, oh, there's a retention number. It should be at that number. And you can kind of get caught up in that a little too much when the reality is like those micro-optimizations are not going to solve whatever retention problem or any other metric problem that you might have, right? At the end of the day, it's about the user experiences. It's about solving a real problem. I think generally, if you want a complete hit end-to-end, you need to have a breakthrough technology that's applied to solve

a very specific problem that a user actually has, right? And then you need to have an engine that can deliver that solution to people who have that problem as quickly as possible across the world, right? If you have all those pieces, then you won't have a retention problem or an acquisition problem or any other problem, basically, right? Now, the cool thing about this time right now is the technologies are being developed and there's actually a crazy number of technologies out there, right? I think it's a very unique time from that perspective, right? And

For product people, the main problem is, hey, how do we actually solve problems? Actually solve real problems that people have. And not just sell the technology as technology. Like, hey, we have technology. Just that. But actually convert it into a real value delivery for users for a specific use case, even a niche use case, whatever it might be. And then I think for marketers, the problem is

How do we actually educate people that there's a new way to solve these problems, right? Like, people may not think the first thing, oh, you know what, I'm going to Google AI for this, right? That might not be the first thing that people think about, right? They might be searching for just whatever they were normally doing, right? Which may be something that takes a long time. And, or they might be like, not aware that there's new solutions available to these problems, right? So I think,

That's sort of the end to end. I think if you focus on that, at that level, like all the other numbers sort of follow on their own. And that's kind of what we've seen, both across our desktop app, and our mobile apps as well. And we're in the consumer space. So retention is definitely a very hard game to crack compared to say B2B businesses. But we've been able to do it really well. And like, I think it's because of that high level focus across technology, product and marketing.

Hey, it's Steph. In addition to hosting the show, I love tinkering on new projects in my spare time. Who doesn't? And one podcast that helps me become an even smarter creator is the Signal Award-winning podcast, Creator Science, hosted by Jay Clouse.

Now, it's never been easier to create content. So breaking through the noise gets more difficult every single day. And the only way to get ahead is to learn from other creators who are playing and also hopefully winning the game right now. And Creator Science gives you that edge. Each week, they explore what's working right now to help you grow your audience, generate more revenue, and break through that noise. You'll hear from today's top professional creators like James Clear, Tim Urban, and you might even see me on there.

This isn't your typical interview-style podcast. This show is a classroom and their guests are the professors. So don't miss out. Follow Creator Science wherever you get your podcasts.

Yeah, maybe Laura, you used to work at Twitter. What are you learning in terms of products that reach so many people? We're talking daily active users. What have you learned from that space that you can apply to AI when you are trying to fix this retention problem?

I will say that I am so glad to be out of the game of trying to optimize for mDAU, for monetizable daily active users. I think that daily active use is like a pretty terrible metric to uncover customer value, right? And so one of the things that I just love most about working at Descript is being able to identify alternative metrics to think about how we're doing right by the customer.

Two that I really like to think about, that are a bit in tension with each other, they act as guardrails, are time to expression and editing richness. So I think if Descript is doing its job really well, the amount of time it takes you from starting a project to getting it into a shareable state, whether you're a marketer who's like trying to repurpose a webinar into clips,

or someone who is more of a creator trying to make your latest YouTube review, or you're someone in learning and development trying to create a training. I want the amount of time it takes you to create that to go down and down. And so you're able to just create more and more of the content.

Is anyone here a creator in any way? Have a YouTube channel or a marketer? Do you know about just like the gaping maw that can never be fully fed or sated for content that I find so many of our customers are just staring into with despair? And so getting kind of their time to expression down is really important. But one of the ways you do that is just like by creating worse and worse content, that it's just A-roll with an iPhone and you slap some captions on it, which is great for some use cases, but for others, just like a missed opportunity, like you could have done so much more to create really high quality video content. And so if Descript is also winning on increasing the editing richness, the number of jobs that you're able to do with us and the number of things you're able to do to transform your media and make it really high quality...

The interaction of those two metrics is such a great way to drive towards customer value. I will say that what Gaurav said around just good product fundamentals with retention totally resonates with me. My attitude toward the tourists is you've got to triage the tourists.

Some component of them just don't have a use case for your software. They want to create a voice clone. They want to see it. They're like, "Ooh, that looks cool." But they don't have anything to do with that voice clone. And it's like, "Great, let's let them do that. That's awesome. Maybe one day you'll think about Descript or Eleven Labs and come back."

But then who are these tourists who actually have a legitimate use case and they just don't know it yet? They could be using video to communicate within their company. They could be using text-based video editing to create all of their marketing clips and they don't know that yet. And how can I create software that activates really well, that

displays all of our use cases and lets them have a good first time. And I find that like often retention problems are just activation problems in disguise in a trench coat. And so what I really try to focus on to improve retention is just like the activation experience.

Just having come from a social media background as well at Snap, such a good point about just DAU and how that can be such a trap. I think social media companies obviously optimize DAU for a reason because money is coming from a different source. And so actually, it's good to be out of that game. And really interestingly, with the generative AI space, it seems like it's kind of having the opposite effect on what it's trying to achieve, like social media on one end.

is using AI as well, but really to consume time from people as much as possible, right? Consume as much of your time and it's succeeding, right? And on the other end, generative AI is actually kind of giving back time to people, right? So they can actually do more. So pretty cool. Yeah. We talked about this on a recent episode, how some tools, I'm sure people would resonate with this. If you had one excellent session, it could have saved you four hours of work in

five minutes, that's actually more valuable than spending 20 minutes every day in an app. And you don't see that in the same metrics, right? So I love that you brought up different metrics that you're paying attention to, Laura. Carlos, is there anything that jumps to mind there for you in terms of how you might rethink a business model in terms of what metrics you're paying attention to, the way that you're monetizing a product that might be different because the willingness to pay we've also seen is there, even if it is just, I'm using this once a month, once every two months even.

Yeah, and I think it's a really good point, right? Some consumers actually feel that if they need to do something twice, the product is not working well, right? It's that element that we've gone from one side to the other side. So probably somewhere in the middle is what fits well. I was actually in a meeting with a customer, and we presented to a C-level last week and they

The question they came back with was like, okay, so how much time am I going to save? And I was like, well, you're going to save anywhere between 50 to 60 times the time. Like it's going to be like 50 to 60, like it's slashed by 50 to 60. And they were like, no, that's not possible. And I was like, let's do the math right now. And we did the math and it was very interesting. So I think there is an emphasis on that side. But I think like sometimes we try to

overemphasize the effects of the efficiency that you're getting with generative AI when in fact generative AI is not perfect, right? I think that's one of the main reasons why the AI tourists are there and why there are so many of them: everyone comes with such a big expectation that it's going to be solving all of my problems and it's going to be cooking dinner for me tonight as well. And unfortunately, it's not going to cook dinner for you. It's just not going to solve all of your problems, but it's going to help you

quite a lot, either because you can do a lot more monetization with your customers, you can reach new markets, or you can actually do it much quicker, right? But I think framing it on actually what is valuable for you as a business or as an individual is much more important. So initially, our metrics were pure usage, right? And over the past month, we've ended up switching to: usage is important,

but that does not define the long-term success of an actual customer for us, right? It's more about the activation side. It's about what use case you actually have, how we measure that over the long term, and how we understand and try to infer the use case based on

the way you're using the product, right? So that we can offer you the best tools and the best tips and all that stuff. For us, essentially, those are the key metrics today versus like usage. Usage is still super important, but I don't really mind if someone uses the product today and then doesn't do it for like a week or two weeks because I know that like if we've nailed it, they're going to come back two weeks later, right? I think that's how we think

about it. You don't have those social notifications that are like, "A friend of a friend maybe posted something. Please come to our app." All right. Well, so we're going to open up to questions very soon. So if you have any questions, start thinking about them. But I want to do rapid fire one or two more. So the importance of optimizing an application for a specific

role or someone's use case. Who are you? What are you trying to do? So each of you actually comes from different backgrounds, right? So Gaurav, you've done design and development. You've been an engineer. Laura, you've been immersed in product, Carlos, operations. And so those are roles where there's, gosh, I don't know how many other people who fit that subset. So I'd just love to hear your perspective, independent of your company, how

do you think of AI over, let's say, the next five years? What does an AI-powered engineer look like in your case, Gaurav, or like an AI-powered operations person? What do you need? What's missing? Are there products out there that actually fit that use case and are doing it well? Yeah, I mean, thinking about it from an engineering perspective or even from a design perspective, I think...

maybe the closest on the engineering side would be like a tech lead manager, someone who's actually setting up the overall architecture of whatever's being built. Right. But a lot of the work's being done by AI and they're coming in, they're making edits, they're reviewing stuff. They're like, "Maybe we need to change this." Right. Same on design. Right. Like kind of giving high-level instructions, like, "Let's have this, let's maybe use this style over here. Let's change these components." Right. And getting that output back and kind of

reviewing it, leaving comments the same way that a manager might, right? And being able to produce hopefully a lot more value and output. So that means that companies can be going to much larger revenue scales with way fewer people, which is going to be interesting. Yeah, I think a lot about this. What is the AI product manager? The paradigm that I use is more like how do I want to interact with AI to do my job better? One of the use cases I'm excited about is a rubber duck who talks back.

You guys hear about rubber ducking where you keep a rubber duck on your desk and you talk through difficult problems with that rubber duck. And I think I'm never going to cede control of the creativity and the genius to the entity. Clearly, have you met me? I'm in charge of that. But I think it can be fun to toss the ball around with someone. And I think I'm excited to see how AI continues to develop to be a fun thing to toss the ball around and then can take...

all of the stuff that you're just spewing out, all of the kind of word garbage and turn it into something crisp and readable and easy to understand. So that's a use case that I'm excited about. I think from an operations side, it's even more complex, right? Because there's so many things that you need to do. How do you automate or how do you get someone to help you on that front, right? So ideally, you end up having a product that helps you do twice as much in the same amount of time.

Not because I'm thinking about it from an efficiency perspective, but much more how I can potentially generate more revenue for the business, right? I think that's where potentially and hopefully the market is going to be going. On the sales side, it's much easier because you end up having AI SDRs these days. We'll end up having AI CSMs and all of those pieces that are going to be

already there in many cases, right? But purely on the operations side, it's a lot more complex. ChatGPT is your friend for sure, right? Or Anthropic, if you use it, or any of those tools that will help you generate quite a lot of different things on a day-to-day basis. Is that giving you a 2x? Not yet, right? So I'm not sure, like I still haven't found the right product that would help anyone optimize and become 2x themselves. Maybe someone will build it in the room. I guess final question, and anyone feel free to jump in. All three of your products have

a lot of customers, people are using it. It seems like maybe for the retention problem, what challenges are you facing? Whether it's like regulation or not having the right models or hoping that the open source models catch up or just curious if anything jumps out where just calling out a challenge that you'd like to be solved in the next few years. Yeah, I'd say for us, it's hiring, actually. It's very traditional, right? But I think hiring the right people to solve the particular problems that we're having in our company and

Problems grow really quickly if the company's growing really quickly, right? And you have to kind of keep an eye on all the different things that are happening, where new needs might come up, especially with a company like ours, where we've existed for about three years and there's video companies that have been around for a long time. We've passed everybody in revenue in like literally a year and a half. And with growth at that scale,

You just have to constantly be thinking about what are the new problems that are coming up and who can we hire to solve those problems, right? So I think that's like a very traditional answer. And maybe there's some AI recruiters out there, but we have a great team. So I don't think we need them, at least not yet. Maybe AI can help with that.

I think it's just that we're in the middle of a paradigm shift, right? Like we haven't gotten to the end of it. We're in the middle now. And what I can tell you is that the way that we're going to edit video and audio in a year or in two years is going to look completely different than how we're doing it right now. But we don't know how yet. And on one hand, like that's why I'm here. That's like why I'm doing this job because this is a place where the next generation of like

product managers and designers, we're going to reinvent the way that humans and computers interact with each other. Someone's going to figure it out. And God, I hope it's like me or that I'm part of it in some small way. But that's also just like a very fragile moment, right? Like it's both a challenge and an opportunity. And I think it's like the challenge of our industry right now.

I think for us, there's two sides of it. One is definitely hiring. I can relate a lot on that. It's difficult. We've gone from like zero to like tens and tens of millions in months, not even years, in months. And it's really difficult to find people that have experienced that previously. Also because the market has evolved very quickly in such a short timeframe. So that's one side. So there's a lot of commitment that we expect from people at the company, and we need that to be able to actually keep growing at this pace.

And on the research side, it's extremely difficult to find the right researchers. On the engineering side, on the operations side, on sales, even support, like across the board, right? But that's one side of the equation. The other side of the equation is preventing misuse, right? And I think realistically, that is something that we have an entire team dedicated to, day and night, 24/7. But every time that we put together something, there are 20 other different things that people make up to try to game it. And it is similar to fraud, where you're always like two steps behind.

And it's really difficult to catch and I keep fighting it. So I think those two elements are the biggest challenges that we're constantly facing as a company. Like we're winning, but still it's just a matter of making sure that you're constantly innovating and having resources for something that is important. Otherwise, regulators come or consumers complain and things like that, right?

Yeah, you need unprecedented people for an unprecedented pace. Quick questions. Laura, who's our wonderful producer at the a16z Podcast, is going to go around. So if anyone does have a question, just raise your hand and she'll come find you. I'm curious, how are we thinking about internationalization or serving users of like various levels of digital literacy?

We've had an international audience from the beginning, including every country and every region you could possibly imagine. So I think it's been a high priority from the beginning, right? Because

The interesting thing is a lot of the development that AI is bringing is not just things that are usable in like, oh, it's just an English thing or, oh, it's just like a US thing or something. It actually brings change in workflows across almost every country and every culture you can imagine. And it actually works, right? Like I think we've gone and launched new markets where we've had zero users and

overnight had an explosion of users in that market. But then we learned something about that particular market where, oh, they don't like this particular thing. Or if you think about, for example, the Middle East, right? Text is written in the opposite way. And so that changes a lot about the UI and changes a lot about the user experience, right? And

we've done a lot of work to make that good and make that as usable and as amazing of an experience as it is in any other language. So those are the types of efforts we've made high priority from the beginning. Would you say that other countries or regions are actually more readily adopting the products? Because I'm just thinking through, well, actually, maybe they can't hire the software engineer, or maybe they can't pay for the traditional video editor, those thousands of dollars. So they're actually more readily adopting these technologies because they're bringing the cost down. Absolutely. I mean, I think...

Around the world, people are super open to trying something new to see if they can change their workflow, right? I think as long as you can provide something that is, once you try it, you can't go back to what you were doing before. That's it. That's the difference, right? If you can provide that experience in any language, any culture, any country, people will use the product.

I mean, I think for us, internationalization has been there since day one. We have a full international team. Everyone is fully remote. So there's actually a very strong correlation, funny enough, between the actual employee profile and the fact that we have multiple countries. Everyone can be based wherever they want and travel and all of that stuff,

and the actual user base that we've gathered, right? So yes, in the initial days, like a lot of our growth came from North America and European markets. But actually these days, when you look at the entire pie, it's like super spread out across the world. I can relate to that purely like on the fact that people want the best tools that will help them on a day-to-day basis, right? And you don't really need to spend these days like thousands of dollars or like hundreds of dollars to actually produce a video or to produce a podcast or

preview something, right? You could do it much cheaper using tools. And that's the beauty of it. So by default, anyone that truly wants to have a cost-efficient solution will end up using any of the tools: Descript, Captions, or ElevenLabs, or anything else that you have out there. So by default, you end up having a strategy that is about international markets, multilingual content, like trying to engage your audiences where they are and trying to personalize it to them anyway.

Otherwise, I think you end up having a problem of being very skewed towards one market. Traditionally, it's always been that you go to one market, you conquer it, and then you expand to another one. And these days, it's just not. It has just evolved quite a lot on that front. Yes. We have time for maybe one, maybe two more. I see one at the back.

I'm just wondering what barriers or stopgaps you might be putting in place for people who may be using your products for nefarious purposes and thinking about trust and safety. I think like from ElevenLabs, we invest like millions every single year on actually like preventing misuse, right? And we were the first company to implement like a fingerprinting system for any content that gets generated. So since we launched, the fingerprinting has been in place. We then opened up the API and the UI to make

sure that anyone can check whether something was generated by us or not. And since then, we've also essentially engaged on monitoring the content that our users generate. So that essentially, if someone is generating things that they shouldn't, then essentially we block them. We've gone as far as to build the no-go voices, which is a model that will prevent anyone that tries to clone a celebrity voice, for instance, right? We're constantly adding all of these layers to try to make sure that we stay ahead of the curve.

But as I was saying earlier, it's an uphill battle overall, right? There are always ways in which you can game it. But at the same time, you have open source tools, right? So we can try to do our side of the equation, like anything that is open source and

To some extent, you don't really have that much control over those tools, right? But I think it's important as a company, we will keep investing like millions every single year and we'll increase it as the market grows as well. I have to just quickly ask because it's very timely and I'm sure people in the audience are wondering with some of the recent news around AI voices, let's just leave it at that and celebrities. Are you finding there to be a bunch of false positives? Because I feel like that's maybe something that people wonder. You hear a celebrity's voice, but...

how unique can a voice be? And so if you're trying to filter out certain people's voices, are you finding that actually like our voices maybe aren't that unique?

That's a really good question, right? The voices are not as unique as everyone thinks, but they're still quite unique. So you end up having false positives for sure. But the way we ended up doing it, if it's a false positive, if it tells you like, oh, you don't have permission for this voice, automatically it tells you like, oh, but you can still pass the voice captcha, and it will show you the voice captcha. So if you pass it because it is your voice, then you're able to actually use your own voice, right? I have a twin brother, for the ones that don't know. We

do sound exactly the same. And even my parents, they sometimes make mistakes, right? So truly, I could be talking, but you could be thinking it's my twin brother. We have exactly the same voice. And that is a challenge we have as a company and as a society, right? But I think if you end up building layers from a product perspective

to help filter those false positives, I think people understand that you're trying to go from "everything is a free-for-all and you can misuse it as much as you want" to "let's put some controls in place," and even if there are some false positives, people understand down the line.
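The layered check described in this answer (a protected-voice match that can be overridden by a voice captcha) can be sketched roughly as follows. This is only an illustration of the flow as described on stage, not ElevenLabs' actual implementation: the similarity score, the `threshold` value, and every name in the code are hypothetical.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_CAPTCHA = "require_captcha"
    BLOCK = "block"


def review_clone_request(
    similarity: float,
    captcha_passed: Optional[bool] = None,
    threshold: float = 0.85,  # hypothetical cutoff, not a real product setting
) -> Decision:
    """Decide what to do with a voice-clone request.

    similarity: assumed score in [0, 1] for how closely the uploaded voice
        matches any voice on a protected ("no-go") list.
    captcha_passed: None if the user has not attempted the voice captcha yet,
        otherwise whether they proved the voice is their own.
    """
    # Layer 1: voices that do not resemble a protected voice go straight through.
    if similarity < threshold:
        return Decision.ALLOW
    # Layer 2: a match may be a false positive (for example, a twin),
    # so offer the captcha instead of refusing outright.
    if captcha_passed is None:
        return Decision.REQUIRE_CAPTCHA
    # Layer 3: the captcha result settles it.
    return Decision.ALLOW if captcha_passed else Decision.BLOCK
```

The point of the middle state is the one made in the answer: a false positive costs a legitimate user one extra verification step rather than a hard refusal.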

Something about the product side of this too, which I do think is super important, is to build the safety features into the product from the ground up. And that's kind of the difference between offering a technology versus offering a product. If you just say, hey, come to our website, make deep fakes, right? That's offering a technology. And some people might be out there doing that, right? I don't know, right? But I think if you build that into a product, like for example, we have the language translation feature, right? Which can translate whatever you're speaking to a different language, changes your lip movements as well. And yes, that's using...

the same technology, but in a very opinionated way, where you can't change what was said, but you can change what language it was said in. And so that limits the scope of abuse immediately quite a bit. And then all the traditional methods can be used on top of that as well.

I mean, with Descript, you can create a voice clone of yourself and sort of like intermingle. We have this thing called Overdub, where if I say the wrong word, I can go back and with the text, say the word that I actually meant to say. And then it will, with my voice clone, kind of create that. But obviously, there are a lot of misuses there. And so whenever we launch a product, we launch it with protections in place and do a bunch of...

testing and hire outside people to try to crack it and try to make sure that we do our very best to make sure that it's un-gameable. But like you said, if people are extremely determined to crack through security, they will always find new ways to do it. And this was the case when I was in social media too, where you do all kinds of things to try to protect your platform and bad actors...

They get up every morning and grind just as hard as you do. And so you're just sort of in the eternal struggle. And I think like every single tech product should be thinking about like, how are people going to misuse us and making sure that they're responsibly providing a bunch of resources to stay in the fight.

So as VP of Revenue at Eleven, how do you view the role of open source? Because as a developer myself, I would rather use, for example, Falcon 70B, which is a dollar in dollar out per million tokens, as opposed to GPT-4, which is 30 and 50 out. So do you think that open source is a threat to your business, especially as companies like Meta are kind of taking a scorched earth approach to releasing models?
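To make the questioner's price gap concrete, here is a small worked sketch. The per-million-token prices are the figures as quoted in the question (they are not verified list prices), and the workload size is a hypothetical chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def cost_usd(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for a workload at the given per-million-token prices."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


# Prices as stated by the audience member, not checked against any price sheet.
open_model = Pricing(input_per_million=1.0, output_per_million=1.0)
closed_model = Pricing(input_per_million=30.0, output_per_million=50.0)

# Hypothetical monthly workload: 2M input tokens and 0.5M output tokens.
open_cost = cost_usd(open_model, 2_000_000, 500_000)
closed_cost = cost_usd(closed_model, 2_000_000, 500_000)

print(f"open: ${open_cost:.2f}, closed: ${closed_cost:.2f}, "
      f"ratio: {closed_cost / open_cost:.0f}x")
```

Under those assumptions the closed model costs $85.00 against $2.50, a 34x gap. That is token price only; it leaves out the hosting, tooling, and engineering effort that the answer below points to.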

I mean, I think it's complementary, actually. You always end up having businesses or people that can go and use open source and they have the means and the tools and the knowledge to make that work. And then you end up having quite a lot of different people that don't really have those means or knowledge, right? So it just ends up becoming different sides of the business or different sides of the market, right? However you want to segment it. When I think about voices, we've been talking to each other as humans for the past 50,000 years, right? And

there wasn't really a good technology that was able to replicate how we talk as humans. So the fact that as a platform or even open source, you're able to actually replicate people's voices with their permission, make it sound natural, engaging, and then power a new type of communication and platform and experience, that market is massive. So by default, you need to have both sides to be able to actually counterbalance each other and push each other. But...

Open source also comes at a cost, which is that the number of features you will have is more limited, right? So you will also end up having fewer voices. You don't have the UI. So what's your preference as a business or as an individual? Is it purely building on top of it? Then maybe open source is a good way. Like today, the quality is not there yet, but I'm sure that within the next three years, the quality is going to be matching anything that is private, right? So it's going to be more about the actual ecosystem that you build around it

to make sure that people start using it in a much easier way and then embed it. But I actually think it's complementary. Without one, we cannot have the other one, purely because the market needs both sides. So just to follow up, would you say that it's important for picks-and-shovels companies, closed source, to build an application layer on top to stay competitive?

I think anyone that has actually built a full LLM, if they're not able to build applications on top of it to make life easier for consumers and businesses, will end up struggling down the line. Whether that is in six months' time or in 18 months' time, you will struggle. Because at the end of the day, I want to launch my own application or my product or use the product like this, immediately. And if I need to spend the next months coding and building the UIs and everything, I

might give up and go somewhere else, even if it's more expensive, especially if I don't even know whether I have product market fit. And product market fit, we always think about actual startups, but big corporates might not have even product market fit. So if you want to iterate quickly and then go to market as quickly as possible,

then you might want to have a stack that is like fully readily available for you. But once you're ready and you've tested it and the technology feels good enough with other LLMs or like open source, then you might end up looking to switch. And

we've seen that with OpenAI, like the big migration from developers that started using OpenAI's ChatGPT APIs and GPT-3.5. And now they're migrating to like Anthropic and Mistral or Llama. That's been happening for the past six months. It will continue happening, right? So you start by validating that everything goes well, and then you figure out what the alternatives are, whether that is renegotiating pricing or open source. If you liked this episode, if you made it this far, help us grow the show.

Share with a friend, or if you're feeling really ambitious, you can leave us a review at ratethispodcast.com/a16z. You know, candidly, producing a podcast can sometimes feel like you're just talking into a void. And so if you did like this episode, if you liked any of our episodes, please let us know. I'll see you next time.