Agents @ Work: Dust.tt

2024/11/11

Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

AI Deep Dive AI Chapters Transcript

A

Alessio

C

Charly

S

Stanislas

S

Swyx

Charly: Dust.tt 从最初的技术平台发展成为一个面向非技术用户的、可扩展的、开源平台，用于构建连接公司数据的个性化 AI 助手，并紧密集成 Slack、Notion、GitHub 等工具。 Swyx：Stanislas Polo 在 OpenAI 的经历以及对 GPT-4 的早期接触，促使他创立了 Dust 公司。Dust 的目标是通过提供用户友好的界面和易于理解的语言，降低非技术用户使用 AI 的门槛。 Stanislas：在 OpenAI 的经历类似于他一直想做的博士后研究，虽然研究工作充满挑战，但让他对 AI 的发展有了更深入的理解。相信 AGI 的到来时间可能不会太长，现在是创业的最后时机。Dust 最初的目标是为开发者提供工具，与 LangChain 采取了不同的方法，LangChain 侧重于广泛的社区采用和集成，而 Dust 则强调 UI 驱动的开发和更好的可观察性。Dust 采取了水平扩展的策略，虽然市场渗透率高，但市场推广难度也更大。与之相比，垂直解决方案的市场推广更容易，但公司内部的潜力有限。Dust 的目标是提高企业资源利用率，而非单纯取代工作岗位。 Charly: Dust.tt is an open source extensible platform for non technical users to build personalized AI assistance connected to company data with tight integrations with slack, notion, github and more. Swyx: About two years ago, you left OpenAI to start Dust. I think you were one of the first OpenAI alum founders. Stanislas: My journey into AI is a... I mean, Greg Brockman. Yeah, yeah. From Greg, of course. And Daniela, actually, back in the days, Daniela Amodei. My journey started as anybody else, you're fascinated with computer science and you want to make them think, it's awesome, but it doesn't work. I mean, it was a long time ago, it was like maybe 16, so it was 25 years ago. Then the first big exposure to AI would be at Stanford, and I'm going to, like, disclose a whole lamb, because at the time it was a class taught by Andrew Ng, and there was no deep learning. It was half features for vision and a star algorithm. So it was fun. But it was the early days of deep learning. At the time, I think a few years after, it was the first project at Google. But you know, that cat face or the human face trained from many images. I went to, hesitated doing a PhD, more in systems, eventually decided to go into getting a job. Went at Oracle, started a company, did a gazillion mistakes, got acquired by Stripe, worked with Greg Buckman there. And at the end of Stripe, I started interesting myself in AI again, felt like it was the time, you had the Atari games, you had the self-driving craziness at the time. And I started exploring projects, it felt like the Atari games were incredible, but there were still games. And I was looking into exploring projects that would have an impact on the world. And so I decided to explore three things, self-driving cars, cybersecurity and AI, and math and AI. It's like I sing it by a decreasing order of impact on the world, I guess. First, I'm not a trained researcher. So going through OpenAI was really kind of the PhD I always wanted to do. But research is hard. You're digging into a field all day long for weeks and weeks and weeks, and you find something you get super excited for twelve skirts. And at the certain seconds you like all the blues and you go back to you to digging. I'm not a trained, like formally trained researcher. And IT wasn't kind of a necessary ambition of me of creating, of having AA research career. And I felt the hardness of IT, I enjoyed a lot of like that a turn. And the other fun motivation was like, I mean, if we believe in AGI and if we believe the timelines might not be too long, it's actually the last train leaving the station to start a company. After that, it's going to be computers all the way down. So the starting point at the time was, if you wanted to talk about LLMs, it was still a rather small community, a community of mostly researchers and to some extent, very early adopters, very early engineers. It was almost inconceivable to just build a product and go sell it to the enterprise, though at the time there was a few companies doing that. The one on marketing, I don't remember its name, Jasper. But so the natural first intention, the first, first, first intention was to go to the developers and try to create tooling for them to create product on top of those models. And so that's what Dust was originally. It was quite different than Lanchain, and Lanchain just beat the s**t out of us, which is great. It's a choice. So technically we were open source and we still are open source, but I think that doesn't really matter. I had the strong belief from my research time that you cannot create an LLM-based workflow on just one example. Basically, if you just have one example, you overfit. So as you develop your interaction, your orchestration around the LLM, you need a dozen examples. Obviously, if you're running a dozen examples on a multi-step workflow, you start paralyzing stuff. And if you do that in the console, you just have like a messy stream of tokens going out and it's very hard to observe what's going there. And so the idea was to go with an UI so that you could kind of introspect easily the output of each interaction with the model and dig into there through an UI, which is- And I noticed that in your language, you are very much focused on non techy users. You don't really mention API here. You mention instruction instead of system prompt this very cautious. Alessio [00:54:14]: I've written a post called Maximum Enterprise Utilization, kind of like you have MFU for GPUs, but it's basically like so many people are focused on, oh, it's going to like displace jobs and whatnot. But I'm like, there's so much work that people don't do because they don't have the people. And maybe the question is that you just don't scale to that size, you know, to begin with. And maybe everybody will use Dust and Dust is only going to be 20 people and then people using Dust will be two people. And no, I think we didn't dive into the vertical versus horizontal approach to AI agents. We mentioned a few things. We spike at penetration and that's just awesome because we carry the tool that the entire company has and use. So we create a ton of value, but it makes our go-to-market much harder. Vertical solutions have a go-to-market that is much easier because they're like, oh, I'm going to solve the lawyer stuff. But the potential within the company after that is limited. So there's really a nice tension there. We are true believers of the horizontal approach and we'll see how that plays out. But I think it's an interesting thing to think about when as a founder or as a technical person working with agents, what do you want to solve? Do you want to solve something general or do you want to solve something specific? And it has a lot of impact on eventually what type of company you're going to build.

Deep Dive

Welcome back, listener. This is charly, your A I co host. We're back in the studio for a deep dive into agents with stanislas polo cofounder and CEO of dust. This is the first of a two part series covering agents at work as we are hearing a lot of interest in A I enhanced productivity and automation as a bonus stand worked with greg brock men, ellia sutch giver and sam altman at stripe and OpenAI prechAmber GPT. So we tell some stories there too.

Dust has come a long way from its earliest days as a technical platform, al alternative to LangChain, and now is an open source extensible platform for non technical users to build personalized AI assistance connected to company data with tight integrations with slack, notion, github and more. In lightning space news, we are gearing up for our next big rec cap episode, and we are taking listening questions had to speak pipe dot com slash lighten space to submit questions and messages for a chance to appear on the show. Also subscribed to our calendar for our singapore, europe and all upcoming metus. Watch out and take care.

everyone. Welcome to the living space focus. This is celestial partner and C T O L partners and i'm joined my michon ai.

hey. And today we're in the studio with stamp to welcome. Thank you very me visiting from party is and you have had a very distinguish career.

It's very hard to summarize, but you went to a college in in both equal polytechnique and stanford, and then you work in a number places, oracle, todas stripe, and then open the, I preach a tub t we will spend a little bit of time about that. About two years ago, you left open eye to start dust. I think you are one of the first OpenAI alarm sounders.

yeah. I think IT was about at the same time as a dead guys. So yes.

the first wave, yeah. And people really love David episode. We love a few as the open eyes story for back in the day like we're talking about three recording public status.

Al, limitations on the some of those stories has expired. You can talk a little bit more freely without without them coming after you. But maybe we'll just talk about like what was your journey into A I you know us strive for for almost five years. There are a lot of stripe of times going to open the I think the strike .

culture has come into open eee E A bit. Yes, I think the buses of strike people, uh, really started flowing in I S after ChatGPT. But yeah, my joining to eyes .

is a great.

And and Daniel actually back in the days.

and yes, he was C O O. I mean, SHE o SHE had a .

pretty high job at open at the time my journey started. As anybody else, you've fascinated with computer science and you want to make them think it's awesome. But IT doesn't work. I mean, IT was a long time ago, was like maybe sixteen. So IT was twenty five years ago.

Then the first big exposure to me, I would be at, i'm going to deal like disclose a whole lime because at the time IT was a class thought by Andrew anging, 嗯， and was there was no deep learning. IT was half we chose for vision and a star algorithm. So was fun.

But IT was the early days of of deep learning at at that time. I think few years after IT was the first project at google. But do you know that cat face for the human face train from many images when to uh is dated? Doing A P, G morning systems eventually decided to do, uh to go into a into getting a job.

When that hole started, company did get in mistakes, got a quiet by stripe, walked with greg book man there, and I D A try by started stressing myself in A I again felt like IT was a time you at the ata games, you self driving a great at the time. And I started expLoring project, and I felt like the atari games were incredible, but there were still games. And I was looking into expLoring project that would have an impact on the world, and so decided to explore three things, driving cars, cyber security and A I and math and A I. It's like, I think that by a decreasing order of impact on the world, I guess discovering .

new math would be very foundational.

IT is extremely foundational.

is not as direct as driving .

people around. And you necessary a bit of time where I started expLoring the member of work with friends on, uh, trying to get lc cars to drive autonomously. Almost started the company in france or europe about the south driving trucks.

We decided to not go for IT because we like IT was like probably very Operational. And I think the end of the company of of the team wasn't there. And also, I realized that if I wake up a day and because of bug, I wrote, I killed the family IT would be a bad experience.

And so just decided, like now that's that's just a crazy then I explored a cyber security with a friend. We're trying to apply transforms to cut fuzzing. So cut fuzzing h, you have kind of sorry, a and that goes really fast and tries to mutate tes, the inputs of library to find bugs.

And I tried to apply a transformer to that and do in for learning with the signal of how much propagate within the binary didn't work at all, because the transforms are so slow compared to evolutionary algorithms that is kind of didn't work. And I started interesting, makes something, the math and the AI, and started walking on set solving with the I, and the same time open. I was kind of studying the reasoning team that were attacked in that project as well.

I was in chat in dutch with greg, and eventually you get in touch with india, and finally found my way to open the idea. I don't know how much you want to dig in into that. The way to find your way to OpenAI when you're in paris was .

kind of an interesting. So I want to know, this was the two months journey. You did all listen two months. The search, the search, your search. Y next thing, because you left in july, you don't open september gonna ashamed .

to say that you were .

searching before. Yeah, was searching.

The truth that I move back to paris trip yeah and I just felt the hardship of being removed from your team. I know this way. And so I kind of freed a bit of time for me to stogy version before sorry, bad trick, sorry.

listening, joining open eye from paris and from like, obviously you had work with greg but not .

if not anyone else no, yes. So I didn't work with with greg, but not I hated with the and yeah was going because he knew that I was a good general progress, I presume. But I was not a train researcher.

P. H. D, of the research and shutting IT was excited, all with the point way.

But like, come back into music. And I think he didn't care where I was. You just wanted to try working together. So I go to S, F.

I go through the interview process, get an offer and so I get but my grew on the phone for the first time is like, hey, stand is awesome. You've got an offer. When are you're going to a safe? I'm like, it's awesome. I'm not coming to the, yes, I rise in paris and we just moved is like it's so some well, you don't have enough anymore.

Oh my god.

IT was not as as that, but it's basically the idea, uh and he took me like maybe a couple more time to keep chatting and they eventually decided to try a contractor set up and that's how I can have started working at open the eye officially as a contractor but but in practice really felt .

like being an what did .

you work on so was sorry, focused on mass and A I and in particular in the application. So study of the large grade models, mathematical reasoning capabilities, and in particularly in the context of formal mathematics, the motivation was simple. Prince formers are very creative, but yet they do mistakes, and a formal math systems are of the ability to verify and proof, and the tactics that can use to solve problems are very mechanical.

Who missed the creativity? So the idea was to try to explore both together, you would get the creativity of the l, ms. And the kind of a verification captive of the formal system, a formal system just to give a be of contact, is a, is a system in which a proof is a program. And the formal stem is a type system. A type stem is so evolved that you can verify the program if the type checks IT means that the program is write.

is the verification much faster than the, then actually executing program is vertically, is instantaneous.

basically. yeah. So the truth is, is that what you code in involve tactics that may involve competition to search for solutions. It's not instant time as you do have to do the competition to expand the tactics into the actual proof. The verification of the proof of the very low .

level is instant. You how quickly do you run into like you know halting problem p mp type things like impossibilities .

where you're just like that. I mean, you don't run into IT. At the time, I was really trying to solve very easy problems. So so example that done, yeah and I think that was the low end part of the mass bench market of time because that mass band tracking includes c problems. M, C, eight, m, stand twelve to deserve easy ones, the amy problems somewhat harder, and some ami problems like listers.

We covered this in our best Price one to one episode. Abc is literally the grade of, like high school grade eight, great. Ten grade. exactly. So you can solve just briefly to mention this because I don't think of you again this a bit of work with and then with you know more recently with with deep mind doing like scoring like silver on I M O O. Any commentary on like how math is evolved from your early work to today?

I mean, that result is mind blowing. I mean from my perspective spent three years and that at the same time give me love in paris. Um we were both in paris actually he was at fair was working as some problems.

We were pushing the boundaries and the goal was the I imo and we crack a few problems here and there. But the idea of getting a medal, and I imo, was like, just remote. yeah. So this is an impressive result and we can I think the deep mind team just did a good job of scaling. I think there's nothing too magical in their approach and if IT IT doesn't been published as a dancer ver talk from seven days ago where he goes into into more details, IT feels like there's nothing magical. There is really applying a reinforcement, noting and skating up the amount of data contact through the formation ation so we can dig in to what autoist ation .

means if you let's talk about the the taye and maybe of the ordinance. So you joined and you're like, i'm gonna a map and do one of these things. I saw one of your blog post you mentioned you find two over ten thousand models, that of I, using ten million, one hundred hours. How did the research above from, you know, the jup ity to and then getting closer, tony, the venture w tree. And then you have, just before a chat, P, D was release, tell people bit more about the research path that talking.

I I can give you my perspective of IT. I think um at any I there has always been like a lot ching of the computer that was reserved to train the GPT, which makes sense. So was pre entropic split.

Most of the computer was a going to a process nest, which was basically GPT free. And then you had A A bunch of let's a remote not call research teams that were trying to explore a maybe more specific problems or maybe the algorithm part of IT. The interesting, pat, I know if if if IT was where your question was going, is that in those labs, you managing researchers, so by definition, you shouldn't be managing them.

But in that space there is imagine to that, the great, which is computer location. Basically, by imagine the computer location, you can message the team of where you see the pride you should go. And so IT was really a question of, you were free as as a researcher to walk on whatever you, you, you wanted.

But if IT was not a line with opening a mission, and it's fair, you wouldn't get the the computer location. As IT happens, solving mass was very metro line with with the computer, with the direction of of any I. And so I was lucky to generally get the computer I needed to to make good progress.

What do you need to show as incremental results to get funded for further results?

It's an imperfect process because that a bit of there's a bit of if you're working on mass and A I obviously kind of a prior, that it's going to be a line with the company. So it's much Better than to go into something much. I guess you have to show incremental progress.

I guess it's like you ask for a certain amount of computer and you you deliver a few weeks after and you say so you demonstrate have a progress. Progress might be a positive view. Progress might be a strong negative result.

And a strong negative result is actually often much harder to get a much, much more interesting than a positive result and then is generally goes into as any organization, you would have kind of a people finding your project or any of the project kind of a cool infancy and so you would have that kind of phase of growing up computer location for its all the way to a point and and then maybe you richer can A A pix and then maybe you go back to rot mostly to zero and restart the process because you're going in different direction. That's why I felt explore, exploit. Yes, exactly, exactly, exactly. Student search process and you are .

reporting to alia, like the results .

are kind of bring him back to him or like what's the structure? It's almost like when you're doing such cutting get research, you need to report to somebody who was actually really smart to understand .

the direction is right? So we had a reasoning team, which was a working on the on reasoning, obvious ly and and so mass in general. So and that team had a manager, but he was extremely involved in the team as adviser.

I guess since he brought me in OpenAIr, was lucky to mostly for doing the first years to have a kind of a uh, direct access to him, he would really coach me as a training the true, I guess, with good injury skills. And elia, I think at open eyes, he was the one showing the showing the norstar right. He was his job, and I think he really enjoyed that. And he did. The super world was going through the teams and saying, this is where we should be going, and trying to you flock the different teams together towards .

towards an objective. I say, like the public perception of him is that he was the strongest believer in scaling. He was, he always persued that the compression, this is, you have words of him personally, the public not know about how he works.

I think is really focused on building the vision and communicating the vision within the company, which was extremely useful. I was personally surprised that spend so much time, you know, working on communicating that vision and getting that seems to .

work together with a vision is agi.

Vision is like, yeah, it's the belief in compression and and spending computer. I remember when I started working on the horizon team was the statement was really about scaling the computer run reasoning and that was really the the belief we wanted to grain in the team. And that's what has mean useful to the team and and with the deep mind result shows that IT was the right, the success objective for and stuff shows that he was .

right about IT was IT according to the neural scaling laws, the captain paper that .

was before that, because those ones came with jah basically at the time of the people being released or being ready interview. But before that, this really was a strong belief in in scale. I think IT was just the believed that the transformer was a generic enough architecture that you could learn anything and that this was just a .

question of scaling any other fun stories you want to do.

And I didn't with .

greg a when I was at any eye. He was always been mostly focused on on train the GPT and write fully. One thing about sandals man, he really impressed me because when I joined, he had joined not that long ago, and he felt like he was a kind of A A very high level C, E, O. And I was mind blown by how deep IT was able to go into the subjects within a year o or something all way to the situation where when I was having lunch, uh, by year too, I was at home, I was him. He would just quite no deeply what I was doing and .

and with no mail background like.

You know yeah was not remember, but I didn't I didn't have any easy. So like that's exchange. But I think you can it's a question about really you don't need to understand the very technicalities of whole things are down, but you you need to understand what's to go and what's being done and what are the recent results and all of that he knew and we could have kind of a very productive discussion. And that will impress me given this size at the time of open english ble.

Yeah I mean, you are founder before. You're founder now and you've seen sam as a founder. How is he affected you as as a founder?

I think having that capability of a changing the the scale of your attention in the company, because most of time you Operate at very hole's l but being able to go deep down and being in the known of what's happening on the ground is something that I feel is really and lightning. That's not a place in which I ever was as a founder because first company, we went all the way to ten people. Current company, this twenty five of us. So the hide of the sky and the ground, that premature, the same place.

You been too humble. I mean.

strike was also like you. So I was a like, I OpenAI was a really happy being being on the ground, pushing the machine. Make at work.

Yeah last opening, I question the anthropic s split. You mentioned here you are around for that very dramatic. David also left in around that time. You left this year. We've also had a similar management, a shake up that just call IT.

Can you compare what he was like going through that split during that time? And then like, does that have any similarities now? Like are we're going to see a new topic emerge from these folks that are just .

left that I really, really don't know. At the time, the spit was pretty surprising because they had been trained to the street, was a success and to be a completely parent. I wasn't in the weeds of the split. What I understand that is that there was a disagree ever of the commercialization of that technology. I think the focal point of that disagreement was the fact that we started working on the I and wanted to make those models of available from an is that really the court directed that I know .

was the safety, was the commercialization and did they just want to start a company?

actly? That I don't know, but I think what I was surprised of is how quickly OpenAI recovered at the time. And I think it's just because the we were mostly a research ogg, and the mission was so clear that some divergence in some teams, some people live a the mission is still there. We have the computer. We we have should I .

just keep going? Yeah, very bench, like just a lot of talent. Yeah yeah.

So that was the open eyes in part of the history. exactly. So then you leave over the temple, twenty twenty two. And I would say slow on valley, the two hardest companies at the time where you and link in what was that start like and what did you decide to start with a more developer focus, kind of like A A I engineer tool rather than going back to do some more research on something else.

First, i'm not a trained researcher, so going group, and I was really kind of the P. H, E always wanted to do. But research is hot.

You're digging into a field all day long for weeks and weeks and weeks, and you find something you get super excited for twelve skirts. And at the certain seconds you like all the blues and you go back to you to digging. I'm not a trained like formally trained researcher.

And IT wasn't kind of a necessary ambition of me of creating, of having A A research career. And I felt the hardness of IT, I enjoyed a lot of like that a turn. But at the time, I I that I I wanted to to go back to something more productive.

And the other of fun motivation was like, I mean, if we believe in ia and if we believe the time lines might not be too long, it's actually the last train leaving the station to start a company. After that, it's going to be computers all the way down. And IT was kind of the for motivation for like trying to go to go there.

So that kind of the coal motivation to building personally and the motivation for stalling company was a president. I had seen GPT four internally at the time. I was september twenty two, so he was preach a GPT.

But GPT four was ready. Since I had been ready for a few months internally, I was like, okay, that's that's obvious. The capabilities out there to create an intention of value to the wall and yet the deployment in is not there. Yes, the revenue of opening time where ridiculous ly small competitive today. And so the tested was this popular to be done at the product level to unlock the usage?

Yeah, let's talk a about more about the foreign factor. Maybe I think one of the first success is you have was kind of the web GPT like thing, like using the models to reverse the bib and like summer I thing and the brothers was really the the interface. What did you start with the brows? Er, like what I was an important then you build XP one which was kind like the .

brothers extension. So the starting point of the time was, so if you wanted to talk about the L. M. S. IT was still a rather small community as a committee of mostly researchers and some extent of very early adopters, very early engineers, IT was almost inconceivable to just build a product and go set IT to the rise. Though at the time, there was a few companies doing that, the one on marketing and reverts name jasper, but for the yeah the natural first intention and the first, first, first intention was to go to the developers and try to to create tooling for them to create product on top of those models. And so that's what dust was originally IT was quite different than link chain and LangChain just beat the shit dow of us.

great. It's a choice. You were cloud in close source.

They were open source, yes. So technique, we were open source still opens. But I think that doesn't really matter.

I had a strong belief from my research time that you cannot create a name based workflow on just one example. Basically, if you just have one example, you have a fit. So as you develop your interaction, your orchestration around the area, you need a dozen example.

Obviously, if you running a dozen example on a milk step workflow, you start paralyzing stuff. And if you do that in the console, you just have like a messy stream of tucker's going out, and it's very hard to observe what's going there. And so the idea was to go with an us so that you could kind of introspect easily the outfit of each interaction with the model and digging their through an U. I.

which is, was that open source? I actually didn't IT was not.

I mean, dust is entirely opens today. We're not going for if matters.

I don't know that .

the reason why is because we not open this, because we not doing an open your strategy. Yeah, it's not enough sous go to market at all. We can because we can and .

it's fun marketing. You have all the downside because people can you.

But I I think that that side is a big vacation. okay? Yes, anybody can close this today. But the value of dust is not the current states.

The value of dust is number of ebel and and hands of developers that are creating to IT in the future. And so yes, anybody can close today, but that wouldn't change anything. There is some value in being and source in a discussion with the security team.

You can be extremely transparent and just showed the code when you have discussion with users and there's a bug of future missing, can just point to the issue actually. That's but but you can show the progress. If the person that you think with technical, they really enjoy seeing the public quest advancing and seeing all way to deploy.

And then the the downsides are mostly around security. You never want to do security by your city, but the truth is that your vector of attack is facilitated by you being a pencils. But the same time, it's a good thing because if you're doing anything like a bug bonus stuff like that, you just give much more tools to the bug bontius.

So the output is much Better. So as many, many, many in trade off, I don't believe in the value of the good base. I think it's really the people that are on the good base that have the value and the go to market and the product and all of the things that are around the good base. Obviously, that's not true for every every code base into walking on a very secret cardinal to accelerate the inference of M S. I would buy that you don't want to be open source, but for product stuff, I really think that's no that's very risk.

Yeah, I signed up for X P. one. I was looking january twenty, twenty three.

I think at the time you were on the venture. Nw, given the U. S. In jup. Ty, for how do you feel having to push your product that was like using this model that was like so inferred and you're like, please just use IT today. I promise it's going to get Better. It's like just overall, as I found there, like how do you build something that maybe doesn't quite work with the model today, but you're just expecting the new model to be Better?

yes. So actually one was even on on the smaller one, that was the bus chat. P. T, release more version.

So IT. Bage.

no, no, no, no. Not that far away. But IT was the the small version of of chat. Vt, basically, I remember his name. Yes, you have a frustration there.

But at the same time, I think explain was design was an experiment that was designed as as which were useful at the current capability of the model. If you just want to a extract data from a linking page, that model was just fine. If you want to summarize an article on on a newspaper, that model was just fine.

And so IT was a really question of trying to find a product that works with the current capability, knowing that you will always have tel winds as models get Better and faster and cheaper. So that was a there's a bit of a frustration because you know what's there and you know that, you know, I have access to IT yet. But to also interesting to try to find the product that works with the current capture.

And we highlighted X P one in our tony y post in April of last year, which was, you know where are all the agents, right? So that we spent thirty minutes getting to what you're building now. yes.

So you busy had developer framework then then you had a brother or extension that you had all these things, and then you can go toward justice today. So maybe just give people never be of what does this today. Yeah and cause is hind IT.

Yeah, course does really. We want to build infrastructures so that companies can deploy agents within their teams. We are original by nature because we strongly believe in the emergence of use cases from the people having access to creating a agents that don't need to be developers.

They have to be thinker's, they have to be curious. But they can like anybody, can create an agent that will solve an Operational thing that they are doing, that is job. And to make those agents useful, there's two focus, which is interesting.

The first one is an infrastructure focus. You have to build a pipes so that the agent has access to the data. You have to build a pipes such as the agents can take action, can access the web as so that's really an infrastructure play.

Maintaining connections to notion, slack, getae. All of them is a of work. IT is boring, more boring. Frames cc to work, but at something that we know is extremely valuable in the same way that stripe is treating valuable because IT maintains the pipes. And we have that dual focus because we also building the product for people to use IT.

And there is fascinating because everything started from the conventional face, obviously, which is a great starting point. But we are only scratching the surface roads. I think we are the pm level of L.

M. productivity. And we have an invented, the safe three. We have an invented to strike. We invented yet. So this is a really our mission is to have to really create the product that let people equip themselves to just get away all the work that can be automated or assisted by.

Can you just comment on different takes that you had to maybe the most open is like auto GPT. It's just kind like just try and do anything. It's like it's all magic.

There's no way for you to do anything. Then you had the adapt, you know we had David on the podcast. They're very like super hands zone with each individual custom to build super Taylor. How do you the decide where to draw the line between this is magic, this is exposed to you, especially in the markets are most people don't know to build .

with expected to approach treme ly exciting. But we know that the agents capability of models are not quite yet. He just gets last.

So we are. We starting weight walks, same with the experiment. And the weight walks is prety simple. It's like simple workflows that involve a couple tools where you don't even need to have the model decide which tools is used in the sense of you just want people to put IT in the instructions, like take that page, do that search, pick up that documents, do the work that I want in the format I want and give me the results. There's no smart tester right in terms of orchestrating the tools is smoothly using english for people to program a walk flow where you don't have the constant of having competitive ly pi between the two.

That kind of personal automation would you say is kind of like a alli's appear type of thing, like if this then that then you know do this in this so .

very your programing with english, so programming with english. So you're just saying, or do this, and then that you can even create some m of, as you say, when I give you the command, x do this when I think, command, why do this? And you describe the work flow, but you don't have to create boxes and create the workflow explicated just need to describe what are the test was to do and IT would be and make the tool available to the agent.

Tool can be a semantics s surge. The tool can be clearing into a structure database. The tool can be searching on the web, and obviously, the interesting tool that we only starting to scratch and creating external elections, like, remember, sing something on stripe a, sending an email, clicking on the button in the end mean, or something at .

a do you maintain all these inaudible today?

We maintain most integrations. We do always have an escape edge for people to customers. But the great is that the rate of the market today is that people just want to to walk, right? And so it's mostly maintain the integration as an example of a very good source of information that is tRicky to produce this cell force because sales force is very clear database on the us, and they do the fact they want with IT.

And so every company is different models and stuff like that. So right now, we we don't support IT natively and the type of support or real native support is will be a slightly more and just all thing into its like is the case with slack as an a couple because it's probably gonna. Oh, you want to connect yourself to west.

Give us the secure, that's the salesforce al language. Give us the queries you want us to run on IT and inject in the context of dust. So that's the interest resting not into integrations.

I could and some of them require we have work on the user. And for some of them, they are really valuable to our users. But if we don't support yet, they can just build them internally and push the data to us.

I think I understand the world first thing, but let me just clarify that are using a browser automation because they're no .

A P I for something. No, no, no, no in in that case. So we do have brother automation for all the use cases implied the the public web, but for most of the integration with the international system of the companies runs for A P I.

Haven't you felt the pull to R, P, A, rather automation? And I would have .

been saying for a long time, maybe wrong, is that if the future is that you going to stand in front computer and looking at the agent, clicking on stuff, then i'll eat my computer. And my computer is a big leeboo. It's black, doesn't sounds good at all compared to a mac.

And if the API is out there, we should use them. Always can be a long tail of stuff that don't have A P S. But the wall is moving forward that that's disappearing.

So the court appeal value the in the best as reading all this, this old ninety product doesn't have an aps on IT to use the U I. To automate. I think for most of the icp companies, the company is are icp for us.

The scale apps that are between five hundred and five thousand people take companies most of the assess the use of as not as an interesting question for the open web because there are stuff that you want to do to involve websites that we have aps and the current state of web integration from which is us and open a and anthropic. I don't even know if they have web and navigation. The could instead of affairs is really, really broken because you have what you have basically search and headless browsing, but added bruising. I think everybody's doing icy body that inner text and fill that into the model, right? There's parsis into markdown and stuff.

But it's very excited by .

the companies that are explained, the capital of of random, a web page into where is competitive for a model being able to maintain these selectors so that the basically the place where to click in the page through that process exposed the actions to to the model, have the model sector action in a way that is compatible with model, which is not a big page of a full dumb that is very a noisy, and then being able to decompress that back to the original page and take the action.

And that's something that is really exciting and that we can change the level of things that about that, that agents can do on the web that I feel exciting. But I also feel that the bulk of the useful stuff that can do within the company can be done through API. The data can be rev P, I D actions can for listeners.

I know that you are basically completely .

disagreeing .

with David one actually. And you I mean, dust is where IT is and you and does this where IT is. So does this those standing?

Can we just quickly comment on function calling? You mention you don't need the models to be that smart to actually pick the tools, everything. The models not be good enough is IT, just like you just don't want to put the complexity in there? Like is there any room for improvement left and functional calling? Or do you feel usually consistently get always the right .

response to that product question? Because if you if the instructions are good and precise, then you don't have in the issue because it's scripted for you and the mall. Just look at the script and just foot and say, oh, he's probably talking about that action and i'm going to use IT and the primatologist objects, the state of the conversation that has with IT.

If you provide a very high level kind of auto GPT s level at the instructions and provide sixteen different tools to your model, yes, we're seeing the models in that state making mistakes. And there is obviously some progress. Some, of course, can be made under capabilities, but interesting parties that there is already so much work that we can assist.

Argument accelerate by just going with pretty simply screw for actions. agents. What i'm excited about by stalling in like pushing our users to create rather simple agents is that once you have those working real, you can create mita agents that use the agents as actions.

And the of you can have have a ari of responsibility, let that would probably gets you almost to the point of the auto gp t value. IT require the construction of intimidation artifacts, but you're probably gonna able to achieve as something great. I'll leave us some example.

We have our incidents are shared in slack, in a special channel or shipped or show in slack. We have a weekly meeting where we have a table about incidents and uh, ship stuff. We're not writing that weekly meeting table anymore.

We have an assistance that just go find right data slack and create the table for us. And that is instant, works perfectly. It's trivalent simple rights.

Take one week of data from that channel and just create the table. And then we have in that weekly meeting uh some uh obviously some graphs and and reporting about our financials and our progress and our err. And we've created assistance to generate those graph directly.

And those assistance works great by creating those assistance that covered those small parts of that weekly meeting slowly beginning to in wall where we'll have a week meeting. Assistance will just call IT. You going to prompt to me to say anything is going to run those different asianers and get that notion page just ready. And by doing that, if you get there and that's no beauty for us, to us using dust, get there. You're saving, I know, an hour of company time every time you .

yeah that's my topic of nm for agents is like how do you build dependently graphs of agent and how do you share them? Because why do I have to rebuild some of the smaller levels of already?

A question on agents managing other agents is the topic of a lot of research, both from like microsoft and even even in startups. What you've discovered the best for less is like a manager agent controlling bunch of smaller ents too weak communication. I don't know this. There should be a protocol .

format to be related. State we are at right now is creating the simple. So even yet, the mitre ents, we know it's there, we know it's gonna valuable and we know it's gonna awesome, but we're starting there because it's just simple place to start.

And it's also what the market in distance. If you go to a company, random set B2B com pany tha t nec essary spe cialized in the eye, and you take an Operational team and you tell them, build some tool for yourself, they understand the small agents. You tell them, build to GPT to what?

And I noticed that in your language, you are very much focused on non techy users. You don't really mention API here. You mention instruction instead of system prompt this very cautious.

yeah. It's very conscious. It's a mark of our designer ad. We kind of pushed us to create a friendly product. I was a deep into a when I started, obviously, and my profondo Gabriel was a was a stripe as well.

Um we started a company that got a quiet my stripe fifteen years ago was that alan health care company in in paris. After that, IT was a less needed in an AI but really focus on product. And I didn't realize how important is to make that technology not scary to end users.

IT didn't feel scary to me, but I was really seen by head, our designer, that IT was feeling scary to the users. And so we were very proactive and very deliberate about creating a brand that feels not too scary and creating a wording and language, you say that, that really try to communicate. If that is going be fine, be easy.

And another big point that David had about a debt is like we need to build an environment for like the agents to act. And then if you have the environment, you can simulate what to do. How's that different when you're interact in with ap and you're kind of touching systems that you can not really simulate like, you know if you called the sales, you're just call in IT?

yes. So I think that goes back to the the DNA of the company that are very different. I didn't I think was a prog company with a very stronger research DNA.

And they were still doing research. One of their girl was building a model, and that's why they raised a lot of lot of money to exit. We are one hundred percent deliberately product company.

We don't do research. We don't try in models. We don't even run G P.

S. We are using the models that exist. And we try to push the products under U.

S. Fast possible with the existing models. So that creates an issue indeed. So to answer your question, when you're interacting in the real world where you cannot simulate so you cannot improve the models, even contacting your improving your instructions is complicated.

For builder, the hope is that you can use models to evaluate the conversations, so that can get at least feedback, and you could get constructive information about the performance, severe assistance. But if you take actual trace of interaction of humans with those agents, IT is, even for us, human, extremely hard to decide whether IT was a proactive interact or really bad interaction. You don't know why the person left.

You don't know if they left happy or not. So being extremely, extremely, extremely pragmatic here, IT becomes a product issue. We have to build a product that specifies using the end users to provide feedback so that as a first step person that is building, the agent can edit on IT. As a second step may be later when we start train model and post training. And we can be around that of any you .

see in the future products offering kind of a simulation environments the same way all set now going to offer apps to build programmatically like in saber security. There are a lot of companies working on building simulative environment, so say games, so like red team, but I ve really think .

that I know mini. That's a super interesting question. I think really gonna depend on how much, because you need to simulate to generate data, you need to join data to train models. And the problem is that the an is I was gonna train models or I would just going to be using models as they are on that question. I don't have a strong opinion.

IT might be the case that will be trained models because in all of those A I first product, the model is so close to the to the products of face that as you get big and you want to really own your product, you can have to own the model as well. The model doesn't mean doing the pretrail ing that would be crazy. But at least having a an internal bus training reliant loop makes lot of sense.

And so if we see many companies going to start all the time, then there might be intensive for the diseases of the world to provide assistance in getting there. But the same time is attention, because the same, they only be interacted by and they are they by agents. They want the human to click on the biton.

So it's you.

exactly.

Thank you.

Just a quite question on models. I'm sure you've use many, probably not just open the eyes. Would you characterize some models is Better than others to use any open source model? What happened? The trends and models over the last two years.

we've seen over the past two years of a bit of a race, uh, in between models and at the at at times it's the opening, a model that is the best. At times, it's the anthropic models that is the best. I'll take some studies that we are gonna c and we let our users pick the model. yes. So when you create an assistance, your agent, you can you just say i'm going to run in time, bo.

don't you think for the montescue ser, that is actually an election that you should take away from.

We have the same default. So we take we move the default to the latest model school and we have the same default. And it's actually not very visible in the flow to create an agents, you would have to go go in advance and go pick your model. So is something that the technical person care about, but that something that obviously is a bit too complicated for the year?

And do you care most about function calling or instruction following or somebody else?

I think we care us for a function calling because you want there's nothing more than a function including incorrect boneless or being a bit off because it's just, uh, drive the .

whole interaction off. Yeah so got the L I fcon these .

days it's funny how comparison between GPT for all and GPT for turbo is still up in the air on function calling. I I don't have proof, but I know many people and i'm party part of them to think that GPT falls. Turbo is still Better than gp fall and functions calling.

We ll see what comes out of the own class. If IT ever gets function calling and cloth grouped in five senate is played as well. They kind of innovated in interesting way, which was never quite purpose.

But it's they have that kind of chain of salt step whenever you use a clothes model or senate model, wis functions calling. That of the step doesn't exist when you're just interact with IT just for for answers questions. But when you use function clean, you get that step and IT IT really helps getting Better future.

Yeah we actually just recorded the cast with the berkeley team that runs that little board this week. So they just release v three. Yeah, he was v one. Like two months ago, the turbos on top turbo, turbo over four o and then a third places, x lam from sales force, which is a large country model, been turned to popular rise. One mini is actually on here, I think o one mini, uh, number eleven.

but do you use leaderboard .

y of your own emails? I mean, this is kind of intuitive, right? Like using the older model is Better. I think most people just upgrade now. Yeah, what's the evil process like?

It's funny because i've been doing research for three years and we have bigger stuff to cook when you deploy in the company. One thing where we respect is that when we manage to activate the company, we have a crazy prevention. The highest entrain we have is eighty eight percent daily active users within the entire employ of the company.

The kind of average contrasting and exception we have current enterprise customers is something like more like sixty to seventy percent weekly active. So we basically have the entire company interacting with us. And when you're there, there is so minister of that marrows most than getting eval's getting their best model because there is some places where you can create products or do stuff that will give you the eighty percent with the work you do.

Where is deciding if it's gpi for all gpi for tubal or exempt? And I just give you the the five percent improvement. Yeah, but the reality is that you want to focus on the places where you can really change the direction or change the interaction of motorically, but to do eventually because we want in some ways.

the the model labs are competing for you, right? You don't have to do any effort. You just switch model and you grow.

What is you really limited by? Is that additional sources is not models, right? You're not really limited by quality .

of model right now. We right now, we are limited by, yes, the infrastructure is the available of ability to connect easily for users to all the data and need to do the the job they want to .

do because the stuff is out there. They are starting to provide integrations as a service. I used to work in the migration.

It's just that there is some intrigues about how you chunk stuff and how you process information from one platform to the other. If you look at the end of the spectrum you could think of, you could say all i'm gna support .

airboat and Albert .

french .

fans and seeing him today. And the reality is that you look at notion about those, the job of taking notion and putting into an a structured way. But that's the way that is not really usable to actually make IT available to models in a useful way because you get all the blocks details as a dry.

which is not the rate of notion, is get sometime .

you have a sometime you so when you have a page, there's a lot of structuring IT and you want to capture structure and shank the information in a way that respect the structure in nothing. You have databases. Sometimes those databases are real tablet data.

Sometimes those databases are full of text. You want to get the distinction and understand that this database should be considered like text information. Where is also one is actually consider information and to really get a very high quality interaction with that piece of information, I even find a solution that will work without us owning the connection into that's .

why I don't invest in these are this composition. There's um all hands from from gram new big. There's all these other companies that are like we will do the innovations for you.

You just we have the open source community will do on the show, but then you are so specific in your means that you want to own IT. Yes, exactly. You can talk to me about. He wants to put the A I and air. Well.

what are we missing? You know what? I like the things that are like snickey hard that you're tackling.

ye people realize is really building in fat at walks for the agents because it's a tenise walk. It's a an ever Green piece of work because you always have an extra integration that will be useful to non english able set of your users. I am super excited about is that there is so many interactions that shouldn't be conversational interactions and that could be very useful.

Basically know that we have the firehose of information of those companies. And does that will be that many companies that captured the firehose of information? When you have the virus of information, you can do tons of stuff with those with models that are just not accelerating people, but giving them supreme and capability even with the current model capture, because can just save through much more information.

An example is documentation repair. If I have the firers of slack messages and new notion pages, if somebody says I owe that page, I want to be updated when there is piece of information that should update that page, this is not possible. You get any email, remember, sing.

Oh, look at that slack message. IT says the opposite of what you have in that paragraph. Maybe you want to update or just think that people, that person, I think that there is a lot to be explored on the product layer in terms of what that means to contract productively with those models. And that's a problem. This extremely hard and extremely exciting.

One thing you keep mentioning about infrared, obviously does is building and serving that in a very consumer friendly way you always talk about in for being additional sources, additional connectors. That is very important by the most interested in, like the vertical info. There is an orchestrator underlying all these things, right? Where you're doing a connect work, for example.

Just the simplest one is a cron job is doing just schedule things. But also, if isn't that, you have to wait for something to be executed and and preceded the next task. I used to work on an orchestrator as well.

Temporal.

we use temporal temple. No, how was the experience I need to?

Yes, we're doing a self discovery. Al, no.

well, but you can also complain to me because I don't work there anymore.

Now we love some days. Is that a bit rough? Surprisingly rough. And you would say.

why is so? Is always vision stuff .

like that? Uh, but we really love IT, and we use IT for for exactly what you said, like managing the the entire set of of stuff that needs to happen so that see my real time we get all the updates from a slack on notion of into the system and whenever we see that piece of infection goes through, maybe trigger workforce because to run agents, uh, because they need to provide alerts to users and stuff like that. And temper is great. Love IT.

You haven't evaluated other you do want to .

build your own your happy with the business of of replacing some competitive product, very general.

If it's there's an interesting syria, but by verses build, I think, uh, in that case, when you high gross company you buy, build, trade off is very much on the side of by because if you have the capability IT just going be saving time can focus on your core computers six at a and it's funny because we are I was trying to see the post high growth company, post s of company going back on that trade off. interesting. So the clan news about removing as then that so .

do you believe? No, no, I .

well not talking .

about the .

customer traffic. I'm talking about building AI on top of south force and and desk music, if I understand correctly. And all the sudden you prox surface become much as smaller because you're acting with A I system that will take some actions.

And so all of the things you don't need the product they are anymore and you realize that, oh, those things are just database that I pay hundred time the Price, right? Because your post as cave company. And do you have tech capabilities, you are intensified to reduce your costs and you have the capability to do so.

And then you make sense to scratched the us away. So it's interesting that we might see I have A A bad time for SaaS in post ibc growth tech companies. So it's still a big markets.

It's not that big because if you're not to take company, you know have capabilities to reduce this coast. If you are high growth company, always going to be buying because did you go fester with that? That is interesting in new space, uh, new category of .

companies that h might remove some ah .

less firm has a interesting pieces on the future there at all. idea. It's all labor interface where you're asking somebody to do something for you, whether that's a person I or yes.

Yeah, that's interest. I have to ask you paying for temporary class.

Are you we we interest .

one one .

crazy expenses shareholder .

like to hear that festive.

So we happy to be other .

things in the inference stack. I am just one of this for other founders to think about ops, API, gateway, evel. Anything interesting there you build or or, or buy?

I mean, there's always an interesting question. We've been building a lot around the interface between models. And because does the does the original version was an orchestration platform, and we basically provide a unified in face to every model providers.

That's what I call gateway that we add because dust was that. And so we continued building upon and we owe IT. But as an interesting question was you want to build that .

or that I say like them, is the current open source consensually?

That's an interesting question.

Der ops dog just tracking.

Oh yeah. So is an oba. What are the mistakes that I regret? I started as sure driver scrip, not take script. And I think you wondering, I want to go fast. I'll a bit of java, just start with type ships.

And I Better. That is interesting, a research engineer that came out to open the eye that the.

well, the is that if you building a product, you're going to be doing a lot of and next, we're using next as an example to great is a great platform. Internal service is actually a build in pones. Its building rests that another choice.

the next year story is interesting because the next year is obviously looking on the world in java land. But recently ChatGPT just, we wrote from next year to remix. We are gonna having them on to talk about the big g right? That is like the biggest news in front. And words.

yeah in A I are just a rap. In twenty twenty three, you predict that the first billion dollar company would just one person running IT. And you said that basic like cosine V G, I really get you and you said that already been started. Any twenty twenty four updates on the take .

that quote was probably independently invented but a saltman told IT from me anyway that's that's it's a good quote. So I participated was maybe already being started. But if if it's a unique son company, he would probably grow really fast.

And so we should probably see IT already. I guess we gonna have to wait for IT little bit. And I think it's because the dust of the world don't exist.

And so you don't have that thing that lets you run those, just do anything with models. But one thing that is exciting is maybe that we're gonna be able to scale a team much further than before. Generations of company might be the first billion companies with injuring teams of twenty people.

That would be so exciting as well. That would be so great. You know you don't have a management hurdle. You just twenty focus people with a lot of assistance from machines to achieve your job, that would be great. That are you believe in a .

more yeah everything a post called maximum mental Price utilization can like you have M A few of her pus is busy like so many people people focus on it's going to like disport jobs and what not. But i'm like there's so much work that people don't do because they don't have the people. And maybe the questions that you just don't scale to that eyes, not to begin with, and maybe everybody will use us then um that is only gonna twenty people, and then people using that be two people.

My hot take is I actually know what vertical they'll be in. They'll be content .

created in podcasting ers.

So my so most people would regard Jimmy donal's like mr. Piece as a but his team is he's a all about like two hundred people, so he's not a single person company. The closer one actually is joe rogan where he .

basically .

just has like A Y ji. But you I don't saw this future to spotify, so he's not going to hit that billions. Statute non concedes one. He will be at the hot just anyway.

But like you want, creators were empowered by, uh, a bunch of agents, dust agents, to do all this stuff because then ultimately is just the brand, the creation. What is the role? The human? Then what what is that? What one person's supposed do if you have all these agents?

That's good question. Um I think I think the IT was uh I think that was pinter or drop x funder at the time was h when you see you, you must still have an editor position you you here to say yes, no to things you have.

Okay, so I make a daily A I news letter where I just ninety nine percent A I generated, but I would deserve the world as the editor write commentary, I choose between four options.

You say what does and goes out and something. As you said, you build your brand through those many decisions and persue creators. Yeah and you have made I think you've made a you've upcoming podcast with not booking m.

which has been doing yes, they are just in here yesterday. I'll tell you one agent that if you want to pursue the creative market, the one agent that we haven't pay for is our video editor agent. yeah. So if you want, you need to you know wrap F F peg in in the GPT and s this was great.

Anything we missed any final kind of call to action hiring is like, obviously.

people should buy the product. Dive into the voronin approach to to A I A. We a few things we .

Spike at penetration.

And that's just awesome because we Carried a tool that at the Young re company has and use. So we Carried a tone of value. But IT makes all go to market much harder.

Vertical solutions have a good to market that is much easer because they are like, oh, i'm going to solve the the layer stuff, but the potential within the company after that is limited. So there's really a nice tension there. We with true believer s of the arizona onal approach and will see that plays.

But I think it's it's an interesting thing to think about when as a as a technical person working with agents, what do you want to solve? You want to solve something general or do you want to solve something specific? And IT has a lot of impact .

on on eventually what type of company provide my response one of products, and it's basically your sense of their products drives your platform development. In other words, like if you're you're trying to be as many things, as many people as possible, we're just going to be one thing. We build our brand in one specific nih and in future, if we want to choose to spin off platforms for other things, we can because we have their their brand. So for example, perplexity, we went for products in search here, but that we also propel ity labs that here's the info that we .

use for the counter argument to that is that you always have literal movement within companies, but if you then desk is gona .

be serving. There are few of successes on both sides. The amazon .

and zon platform mean platform as, I mean, the product that is useful to everybody was in the company. And all take on that is that there is so many Operations within the company, some of them have been extremely rationalized by the markets like self people like support is been extremely alizon. And so you can probably create very powerful vertical product or on that. But there are so many Operations that make up a company that are specific to the company that you need a product to help people get assisted on those Operations. And that's kind of bet we and thanks .

again for the time.

Thank very much driving me.

That was so much fun. Yeah, great discussion. Thank you.

Agents @ Work: Dust.tt 01:00:06 Share

Latent Space: The AI Engineer Podcast — Practitioners talking LLMs, CodeGen, Agents, Multimodality, AI UX, GPU Infra and all things Software 3.0

Deep Dive

Shownotes Transcript

Agents @ Work: Dust.tt