This is episode number 835 with Bryan McCann, CTO of You.com.
Welcome to the Super Data Science podcast, the most listened-to podcast in the data science industry. Each week, we bring you fun and inspiring people and ideas, exploring the cutting edge of machine learning, AI, and related technologies that are transforming our world for the better.
I'm your host, Jon Krohn. Thanks for joining me today. And now, let's make the complex simple.
Welcome back to the Super Data Science podcast. Prepare to be blown away by today's tremendously intelligent, successful, and well-spoken guest, Bryan McCann. Bryan is co-founder and CTO of You.com, a prominent AI company that has raised $99 million in venture capital, including a $50 million Series B in September that valued the firm at nearly a billion dollars. He was previously lead research scientist at Salesforce and an assistant on courses at Stanford, such as Andrew Ng's wildly popular machine learning course. He holds a master's in computer science, a bachelor's in computer science, and a bachelor's in philosophy, all from Stanford University.
Today's episode should be fascinating to anyone interested in AI. In it, Bryan details the philosophical underpinnings of the breakthroughs that led to the leading AI models we have today, as well as the ones that will emerge in the coming years. He talks about how a coding mistake he made serendipitously revealed fundamental insights about meaning and language model development. He talks about why he believes humanity is entering an existential crisis due to AI, but nevertheless remains optimistic about the future. He talks about the fascinating connection between language models and biological proteins, and why AI systems might soon be able to make scientific discoveries humans could never dream of making. All right, ready for this extraordinary episode? Let's go.
Bryan, welcome to the Super Data Science podcast. It's awesome to have you here. And your audio is so good.
Thanks for having me. I'm honored to be here, and thank you for helping me with my audio setup.
Yeah, we had a fun moment. We were scheduled to record several hours earlier than Bryan and I are actually recording, and he went out and bought a microphone. In case anyone's ever wondered, if you really like my sound quality lately:
A couple of months ago, I bought a Shure, S-H-U-R-E, MV7+. It's super easy: you just plug it in via USB-C to your laptop.
And it has all kinds of built-in things that make the sound really good. Both Bryan and I are in very, very echoey rooms, but you can probably barely tell.
I sure hope so.
It's so...
They talked me into the mic stand too. It's, like, fifteen pounds that someone will have to carry on the plane back to New York. Yeah,
fresh from the conference, which is where you're recording today, right? Yeah, yeah. Nice. Right, so let's dig right into the technical content here. You're the co-founder and CTO of You.com. I'm super thrilled to have you on the show, as You.com is a really cool company.
I've been using your platform for some time now, and you can explain it better than me, but I can queue it up for you. You.com reimagined search: instead of having a search engine, it built what you describe as a "do engine." And the first product that made a big splash was connecting us to lots of different large language models. So you can go to You.com.
It's free to use, or at least there's some amount of it you can use for free. You can sign up with your email or log in via Google or Apple authentication, and then there are dozens of different LLMs that you can choose from. So if you want to be using the latest state-of-the-art models from Anthropic or OpenAI, they're there; open-source models, they're there.
So anyway, that was my introduction to You.com, and it's probably what I use it for the most. But yeah, tell us more.
You can do a much better job pitching it, I'm sure. But if I were to pick a favorite feature, I would say the model-agnosticism, if you will, is definitely a big selling point. People love coming to You.com for the latest and greatest, to try out OpenAI and Anthropic and Gemini and whoever else; a lot of the most recent models are typically there.
We work with all of these companies, and the groups that are making these models, to make sure that you're going to have them as early as possible. And it's just really convenient for people to have, say, one place they can go to try these things out side by side, to some extent for free. And then they can also go into our more premium modes and get a subscription, and that's where you can access maybe some of the more advanced models to a greater extent: larger context windows, more advanced file-upload features, and some of the premium offerings where we decide which models to use for more complex use cases, like really deep research.
So that's something that I'd like to understand. And obviously, you're not going to be able to go into intellectual property and spill all your secrets. But for example, and listen up, listeners: instead of clicking on, say, GPT-4o, I can click on Smart mode. I mean, I guess that's not the LLM I'm choosing, but I'm choosing kind of like a Smart mode in You.com. And by the way, listeners, it's You.com, Y-O-U dot com. Not me dot com, but you dot com, and that must have been an expensive URL. But I'll
get to that. Yeah, yeah. Well, let me touch on Smart mode. There's a Research mode that's my personal favorite, and there's also a Genius mode, but we're in the process of essentially combining Research and Genius into something more advanced.
So Smart mode is our version of a GPT-4o, but then surrounded with maybe a dozen other models that are improving and rewriting your queries, that are potentially rewriting prompts and dynamically constructing prompts for your use case and your intent. So we pick the best prompt and the best model, and in Smart mode we also try to include a little bit of citation, so that it's a little bit more grounded in search results. Now, the next step up... so, Smart mode is free.
That's the free one. That's what you can go to if you don't want to think about which model to use: you use Smart mode. Research mode, for me, that's where it's at, because in that one we're using more advanced models by default for everything, every time you type in a query.
We're not just doing a search behind the scenes; we're doing multiple searches behind the scenes for all the things you probably should have asked, but didn't, in that original query, and getting all the search results back. And then that model, really that mode, is optimized less for quick, factoid-style, concise answers and more for comprehensiveness and accuracy, being really, really grounded in that information. If you try a Research mode response and compare it to a Smart mode response, it's going to be much longer. You're going to get very accurate answers, usually one citation per sentence essentially, and you're going to get a hundred different sources or something like that. So you can start to see what it would be used for if you are, say, a biological researcher or an analyst or something like that.
Very cool. And what you're describing actually reminds me of some of the kinds of things that I'm doing in my own company, Nebula, with our LLMs. And this is the kind of thing that you might have some experience with, because you are an advisor at, literally, a competitor of my company's called Moonhub.
And I dare say Moonhub is the best known of the companies in my space, these kinds of talent acquisition automation platforms. And what we have is an encoding LLM. So we have 180 million public profiles that we've scraped, people in the US and their professional information, and, asynchronously, we have precomputed vectors using an encoding LLM for each of those 180 million people.
So for our listeners, you can think of that as a big table or a big spreadsheet, where it has 180 million rows representing each of these people, each of these profiles, and then there are a thousand columns that are just numbers, and those thousand columns indicate a location in space. And so somewhere in our thousand-dimensional space there are data scientists, and probably nearby them there are data analysts, and software developers are probably not far away. But servers at restaurants would probably be in quite a different part of this space.
And actually, the word "server" is a good example, because what's great about these kinds of sophisticated embeddings, these vectors and the way they're created with modern LLMs, is that a word like "server" means a very different thing in the context of somebody who's talking about liquor and food relative to somebody who's talking about Python and SQL. And so anyway, we have that encoding LLM running in real time, which allows people to make queries, and we convert each query also into that thousand-dimensional space. And then you can rank, in milliseconds, the top people in the US.
You rank 180 million people. The third part of the problem is that people don't always put in optimal queries, like you were just describing. And so we have our own generative LLM that takes whatever input people provide to us, and we convert that into something that's optimized for our encoding LLM to turn into a vector downstream.
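The embedding-based ranking Jon describes, precomputed profile vectors plus a query encoded into the same space and scored by cosine similarity, can be sketched roughly like this. This is a toy illustration, not Nebula's actual system: a handful of random vectors stand in for the 180 million real profile embeddings, and the query vector, which would really come from the encoding LLM, is just a reused profile vector here so the example is self-checking.

```python
import numpy as np

# Toy stand-in for the real table: 180M rows x 1,000 columns in the
# actual system; 6 random unit vectors here. Row i is profile i.
rng = np.random.default_rng(0)
profile_vecs = rng.normal(size=(6, 1000))
profile_vecs /= np.linalg.norm(profile_vecs, axis=1, keepdims=True)

def rank_profiles(query_vec: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the top-k profiles by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = profile_vecs @ q          # cosine, since rows are unit-norm
    return np.argsort(scores)[::-1][:k]

# Reusing profile 2's own vector as the "query": it must rank itself first.
top = rank_profiles(profile_vecs[2], k=3)
```

The matrix-vector product is why this scales to milliseconds: ranking is one dense multiply (or, at 180M rows, an approximate nearest-neighbor index doing the same thing).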
And so it sounds like you guys have done a similar kind of thing, where for, I guess, each of the different kinds of models, whether it's the OpenAI API or the Claude API or the Cohere API, you've figured out different tricks with prompts and with restructuring, or, it sounds like, even rewriting, and in Research mode running maybe multiple queries. Yeah, so everything you're saying kind of makes sense to me, and I didn't want to go into too much detail about my own...
No, no, it's great. I love it, because that's a big component of our stack as well. You can imagine the flow for us in our Research mode looking something like this: a query comes in.
We ask an LLM: hey, what are the top three, five, ten queries that should be asked? So it's not just rewriting the query; it's actually a little bit more than that, to ask the questions that you're not asking that could be relevant. And then we'll go out and do searches for all of those.
So we will go to some sort of search engine, like the one we've built internally, but also potentially external tools. Maybe some of the search is done through a vector database the way you're describing, and sometimes it's a lexical-based search. Those things are all part of what I'd call query understanding and intent understanding.
So depending on the intent of the user or the type of query we're dealing with, all of these things can vary. Sometimes we personalize the answers, sometimes we won't; sometimes we won't do a search at all. But then once you get all these sources back and bring them in, you could just put them into a context window for a general LLM, but you can spend a lot of time optimizing that prompt.
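The flow Bryan outlines, expanding the query with an LLM, searching for every sub-query, then packing the retrieved sources into one grounding prompt, can be sketched as below. This is an illustrative reconstruction, not You.com's code: `call_llm` and `search` are hypothetical stand-ins that return canned text purely so the sketch runs end to end.

```python
def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; returns canned sub-queries.
    return "founders of Acme\nAcme funding history\nAcme competitors"

def search(query: str) -> list[str]:
    # Stand-in for web / vector / lexical search; returns fake snippets.
    return [f"[source] result for: {query}"]

def research(query: str) -> str:
    # Step 1: ask the LLM which questions the user should also be asking.
    sub_queries = call_llm(
        f"List the top questions a user asking '{query}' should also ask."
    ).splitlines()
    # Step 2: search for the original query plus every sub-query.
    sources = [s for q in [query, *sub_queries] for s in search(q)]
    # Step 3: all retrieved sources become grounding context for the answer.
    return f"Answer '{query}' using only these sources:\n" + "\n".join(sources)

prompt = research("Should I pitch Acme?")
```

In a production system, step 3 would itself be a model call; the point of the sketch is that one user query fans out into several searches before any answer is generated.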
And what we've found is that for every model there is an ideal way, or a more ideal way, to write prompts. And actually, for every combination of model and, to some extent, even user, you can change these things to get much better responses. We'll also have models that are checking. So instead of a normal RAG-based approach of maybe pulling snippets from some of these documents and summarizing them, we'll actually go out and crawl the pages live, again in Research mode, to make sure that they're a hundred percent up to date and fresh.
So yes, you're using the search index as a cache to know where to go, but then you actually do go get the freshest, latest, most up-to-date information. And we have models then go and check: okay,
does this sentence generated by the language model actually say what that source said? Because sometimes, even if you give them a source, they hallucinate, right? So we check for implication and entailment in both directions.
And sometimes it's okay if one direction is off, but again, that depends on the intent, depends on what people are asking. So all together, it's maybe a dozen or so models that will run on a single research query, which is part of why it might take a little bit longer, but the output is usually so much better and more accurate.
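The bidirectional check Bryan describes, whether the source entails the generated sentence and whether the sentence entails the source, might be organized like the decision logic below. This is a sketch under stated assumptions: `nli_entails` is a crude word-overlap placeholder for a real trained NLI model, and the labels and threshold are invented for illustration, not You.com's actual checker.

```python
def nli_entails(premise: str, hypothesis: str) -> float:
    # Placeholder "NLI model": fraction of hypothesis words found in the
    # premise. A real system would use a trained entailment model here.
    prem = set(premise.lower().split())
    hyp = hypothesis.lower().split()
    return sum(w in prem for w in hyp) / len(hyp)

def check_citation(source: str, generated: str, threshold: float = 0.8) -> str:
    fwd = nli_entails(source, generated)   # source => sentence (faithfulness)
    bwd = nli_entails(generated, source)   # sentence => source (coverage)
    if fwd >= threshold and bwd >= threshold:
        return "faithful paraphrase"       # both directions hold
    if fwd >= threshold:
        return "supported"                 # entailed, but omits some detail
    return "possible hallucination"        # sentence claims more than source

src = "the drug reduced tumor size in mice"
label = check_citation(src, "the drug reduced tumor size")
```

The asymmetry is the point: a sentence the source entails but that omits detail ("supported" above) may be acceptable, while a sentence that goes beyond the source is the hallucination case to flag.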
And we can see user behavior changing to ask for more of that. You know, people are willing to wait a couple of extra seconds if the output is dramatically better. If the output weren't dramatically better, then speed is king. But if you can do something that feels almost magical, then it's worth waiting for. And now we're seeing this translate into B2B use cases for us, which is where it gets really, really exciting as well.
As a Super Data Science listener, you're probably interested not only in data-powered capabilities like ML and AI models, but also in the underlying data themselves. If so, check out Data Citizens Dialogues, a forward-thinking podcast brought to you by the folks over at Collibra, a leading data intelligence platform. On that show, you'll hear firsthand from industry titans, innovators, and executives from some of the world's largest companies, such as Databricks, Adobe, and Deloitte,
as they dive into the hottest topics in data. You'll get insight into broad topics like data governance and data sharing, as well as answers to specific, nuanced questions like: how do we ensure data reliability at a global scale? For folks interested in data quality, data governance, and data intelligence, I found Data Citizens Dialogues to be a solid complement to this podcast, because those aren't topics I tend to dig into on this show. So while data may be shaping our world,
Data Citizens Dialogues is shaping the conversation. Follow Data Citizens Dialogues on Apple, Spotify, YouTube, or wherever you get your podcasts.
Nice. Can you go into a little bit of an example, like a case study? I mean, even before going to B2B: when you're using Research mode, what are some examples of queries you've done recently where you're like, wow, I'm really glad I did this in Research mode as opposed to Smart mode, or instead of in ChatGPT or in Claude? A hundred percent.
Yeah, this will be revealing of my daily life as well as one of my secrets, but I pretty regularly use our Research mode before I meet any customers. So that's a perfect example of a use case where, if I were to just type in the name of a company that I'm about to go and give some technical pitch to, I'd get some links
if I was using something like Google. If I was using ChatGPT, I might get some information about that company. If I use Research mode and I type in the company's name, it's also going to ask, you know, maybe who the founders are; it's going to ask who the key people are that I should be talking to. So I can kind of have all of this extra research done for me.
And if that takes five, six seconds, that's actually way more efficient than me sitting there trying to think about all the things I should be asking and then making follow-up searches. I want as much done in that one query, without me having to think about it, as possible. And then that translates into some of the B2B use cases we're working with, maybe biotech companies, VC firms, things like this, where it's huge.
You actually start seeing the queries change. So one company we've been working with is called Elucidata. They have a lot of researchers who are sifting through millions of research reports, PDFs, and CSVs of clinical trial data associated with those research reports, and they're literally trying to figure out: what is the next best experiment I could run to get rid of cancer, or something like that?
It's a lot of data for one human mind to understand. And if you want to ask a question now of our solution for Elucidata, it's basically like going to a Research mode for them, over their data and the public web data. So you can say: look, does this drug have a positive effect on cancer patients? Question mark.
They have all these research reports; we can do search over those. They have all this clinical data that we can actually analyze in a coding kind of agent for data analysis there. And we can go out to the public web and get research reports that might not be from their portfolio, but might be from other companies or academia, and do the same thing.
And with that one question, we're going to bring all of it back, break it down, and do more than summarize: actually synthesize certain parts of their research for them, all with very clear citations and attribution, so they can click through and verify over time. Another, maybe even better, example was a VC who is trying to know: should I invest in company X? Ask about company X,
or, you know, company Y, let's say, and the same holds. They have all the investments they've made in the past from their firm. They have maybe information and a bunch of CSVs from their due diligence. And there is a whole lot of information about that company, very likely, on the public web.
So when you say, "Should I invest in this company?", what you're really asking, and what our research or deep search modes over this kind of proprietary use case answer as well, is: who are the founders? Who are their key employees? Did their founders have any prior exits? How did their last companies do? In addition to that, let's go through all of the due diligence materials.
Does this fit into your previous investments that were successful? Of the ones that weren't successful but look like this one, why weren't they successful? Because we're optimizing for comprehensiveness.
And those responses, they might take a minute. But all of the VCs that are using it are telling me: hey, do more like it. It could take a week, I don't...
Right, they don't care.
Because if you can actually answer that question accurately and do all the research for them, it would take them two or three weeks to do the same work. So if that takes one week, that's fine. Now, we don't actually know how to do something that runs for a week right now.
But that exciting user behavior change, to be asked a question like that, is very, very interesting, and that's why I'm excited about automating more and more of these workflows. For now, Research mode on your account will generate these text reports, but Research mode for some of these companies will generate slide decks, you know, with images, because we're calling out to image generation utilities on top of all of the research we have going. So you can just walk into your investment committee and almost have the decision made. We try not to have the AI actually make the decision, but again, comprehensiveness, so it gives you all the information you would want
to make the decision. This idea of longer inference times, taking a minute, or maybe in the future, as you guys do more R&D, taking an hour, a day, a week, or longer, is reminiscent for me, although under the hood I suspect it's a completely different kind of process, of OpenAI's o1 model, which is explicitly designed to do that kind of scaling at inference time. It seems like there's a lot of potential in that, in terms of getting really accurate, really comprehensive, like you're describing, really full results. So it sounds like you're kind of following that same scaling opportunity.
Definitely, there's a similar theme going on here. Their approach is promising; they were focusing primarily on mathematical reasoning and capabilities like that, whereas we're focused generally on productivity, and typically on, like, a company's most valuable space. You know, it's not doing emails and Slack and Notion; it's usually this kind of core biotech research, or what's at the heart of your investment thesis. But it is a very similar theme, and I'm super excited about it.
What does a company get, like a VC firm that's working with you, or a biotech company? Or, you know, maybe there are people or companies listening who think, wow, I'd love to be using You.com in a B2B use case. What are the kinds of advantages of engaging in that way, kind of having that B2B enterprise relationship, as opposed to just going to You.com and getting a subscription and using the research tool there?
Yeah, so typically what we see is some individual starts with You.com. They're using that, they try out Research mode, they get a subscription, but then fairly quickly they want to use it for work, and they want to start uploading files, files that maybe they need some data guarantees on, or they want to share responses, or they want a team to collaborate on those things.
And so then you can move into the Teams options on You.com, and you can move into the Enterprise options. But once you get further beyond that, once you maybe have some specific needs that aren't offered by the out-of-the-box platform itself, then it really makes sense to reach out to us, because sometimes we'll help people integrate our APIs so they can build some of their own solutions and their own tooling. So some people just use You.com as it is, and that's essentially... if it's doing everything you need, great.
That is what it's there for. If you need to build an internal tool for your company that looks a lot like You.com but that can't run, you know, on the public web or something like that, well, then you can use all of our APIs, which back Smart mode, our search, and Research mode.
And we can make those work over your data, and if you just don't want to think about the rest, we will help do that; then you can have your applications. And a lot of people at this point, in fact, my favorite customers, are the ones that have spent maybe a year or so, with five, six, maybe a dozen people, trying to build a RAG-like solution, using OpenAI and following a blog post to set up a vector database and do all this.
And they just don't really see the ROI, or they don't get it adopted within their company. Like, a lot of large companies are not seeing adoption, even once they've built their internal RAG tool. And we can come in and bring all these extra models, this extra layer I'll often call a trust layer, on top, to actually make this stuff work right and make it trustworthy for your employees. And that's where we'll walk them through evaluations if they don't have
evaluations set up. And if they do, then we crush those evaluations and show them how much better it will be with our technology. At the same time, you're future-proofed against, you know, future models coming out from OpenAI, Anthropic, and the rest. For the same reason you might not want to be locked into a single vendor on anything crucially important, you can come to You.com for that as well.
Everything you're saying, Bryan, makes a huge amount of sense to me. It sounds, you know, with my entrepreneur hat on, or if I were busy investing in things, like you guys have figured out all the angles just right in terms of having a great B2C product, a valuable B2B product, and getting those enterprise kinds of functionalities in there, like collaboration, data protection, customization. And yeah, it must be really exciting to be working on a product like this.
It's been a phenomenal year for the company, and it makes it a really exciting time to partner with folks that also have alignment with our longer-term roadmap. And, you know, if there's anyone out there that knows anything about my background or my co-founder's background: we try to push pretty hard on the innovation front. And so we've always got a million ideas, and we're looking for design partners to work with as well. And if there's just something you can't quite figure out how to do with You.com, or otherwise, then I think it still makes sense to chat with us and see whether that's something that we're looking at
doing. Super cool. When people hear some of the functionality you've been describing so far, particularly this idea of bringing back results from the web and using LLMs to present them, some of our listeners might think of Perplexity, for example. How do you guys distinguish yourselves
against them? Yeah, so Perplexity and us, I would say maybe a year or so ago, a year and a half now, were very, very similar. They've continued down a path that is seemingly much more focused on the consumer side of things and those kind of Google-replacement queries: to be your replacement for a daily-driver search engine. We are really not focused on that anymore. For us...
You're not focused on Google; you're focused on "you."
Yeah, not Google, right. We are focused on these deeper, more complex, automated workflows. That's kind of where our technical and product roadmap is going, where we can...
Our functionality is getting further and further away from those quick, let's say, knowledge-based answers that you can find on the web, and focusing more and more on: where do you get the most value in your company? What could be automated there? How could we double your productivity, or double your bottom line, or whatever the metric is, without changing your headcount at all? Everything related to that is what's most important to us.
So our success is really dependent on customer success. It has nothing to do anymore with, you know, beating Google or having a certain market share of the search space. That's just something we're not focused on. We're just focused on "you," as I said before.
Ready to take your knowledge in machine learning and AI to the next level? Join SuperDataScience and access an ever-growing library of over 40 courses and 200 hours of content. From beginners to advanced professionals,
Super Data Science has tailored programs just for you, including content on large language models and generative AI, with 17 unique career paths to help you navigate the courses and stay focused on your goals. Whether you aim to become a machine learning engineer, a generative AI expert, or simply add data skills to your career, Super Data Science has you covered. Start your free 14-day trial today at superdatascience.com.
Nice.
I love that. Another thing that seems to distinguish You.com from maybe anyone else out there, that I'd love to get into in a lot more detail, and in fact I have quite a few questions on this coming up, is AI agents.
So, you know, I talked right at the beginning of the episode about how, instead of being a search engine, You.com talks about itself as being a "do engine." But so far we've mostly been talking about, I mean, I guess to sum it up best, we've been talking about doing research, I suppose. But the result of doing that research is still kind of bringing back search results, even if they are a lot more comprehensive, a lot more thoughtful.
You've covered a lot more bases. But it seems to me like it's the AI agents that are really prominent in your platform these days that make this really a "do engine." So maybe you could tell our listeners a bit about your perspective on agentic AI, particularly given that you have predicted, in a VentureBeat article, that there will be more AI agents than people in 2025, which is next
year. Yeah, if I have you all just, you know, trivially create enough agents to make that true... So this is great, because that kind of goes back to our origin in starting You.com.
We entered startup life coming out of our research time, our AI research years, and entered into search in particular because our intuition was that search would have the most dramatic changes: that really, our relationship to how we retrieve and interact with information would change dramatically. And so the nature of a search box, you know, in 2020, when we started, would really change, and you could do a lot more.
So when we started out, we were talking about a search box, and we started out building the foundations of a search engine, but trying to mix in generative AI along the way, so we could do more than just give you ten blue links. Today we call those agents, right? Like, everything in that world of trying to do more for users, whether it's booking flights for you, or booking reservations, or whatever it is, taking actions on your behalf, is typically referred to as an agent.
I maybe have a slightly stricter definition than the last year and a half's worth of "agents," which is, you know, letting the LLMs decide what to do. So I do think there is a bit of a sense in which people are calling all software agents now, or anything that does anything automatically, which is probably a little bit of a stretch of the term, but I understand why it's happening. But a year and a half ago, we started developing LLMs that were deciding which tools to use.
So it's intimately related to tool use. Some of those tools can be a search engine, you know. So some of our more advanced agents can use search engines, or even our Research mode, as a tool in these more complex workflows. So for us, yes, you can put a query into Research mode, and it can do all that research for you.
Or you can build one layer up and have an agent that can go use Research mode, but then can also interact with, you know, some of your own internal tools, and it can also then go write code based on all of that research. And perhaps that code is to do some of the data analysis, or perhaps it's to, you know, write VBA code so that you have a slide deck. And so every time, you're changing the tool being used automatically, just based on what the LLM wants to do or how you prompted your agent to work. That's what I see as extending these workflows to do a lot more, and that's what a lot of users are finding to be really exciting as well. It's not just deep research, "do all the research for me," but then do a bunch of other things with that, whether it's updating my, you know, CRM, or going and sending some emails, creating slide decks, writing code that's going to run on these integrations. Those can be research-based, but those are pretty different actions from research.
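The agent layer Bryan describes, an LLM choosing among tools like research, code generation, and CRM updates for each step, can be sketched as a minimal tool-selection loop. Everything here is a hypothetical stand-in: the tool names are illustrative, and `pick_tool` uses keyword routing where a real agent would ask an LLM to choose (and to decide when to stop).

```python
def run_research(task: str) -> str:
    return f"research notes on {task}"

def write_code(task: str) -> str:
    return f"# generated script for {task}"

def update_crm(task: str) -> str:
    return f"CRM updated: {task}"

# Tool registry the "agent" can draw from; names are illustrative.
TOOLS = {"research": run_research, "code": write_code, "crm": update_crm}

def pick_tool(step: str) -> str:
    # Stand-in for the LLM's choice: keyword routing instead of a model.
    for name in TOOLS:
        if name in step.lower():
            return name
    return "research"  # default to research when nothing matches

def run_agent(plan: list[str]) -> list[str]:
    results = []
    for step in plan:
        tool = TOOLS[pick_tool(step)]  # a different tool may run per step
        results.append(tool(step))
    return results

out = run_agent(["research Acme Corp", "code the data analysis", "crm: log meeting"])
```

The design point is that control flow lives in the model's choices rather than in hard-coded branching: swapping `pick_tool` for an LLM call turns this fixed router into the stricter kind of agent Bryan defines.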
Nice. Yeah, and it seems to me like, even though I don't have a subscription to You.com, I still have access to these agents, and so any listener can go create a free account and get things like help with booking travel, that kind of stuff, through the...
Platform, for sure. Definitely give it a try. Super cool.
And congrats: recently you had a $50 million fundraising round, which is really exciting. I think this is going to be maybe my last question on You.com, and then I'm going to get into some of the research you've been doing and publishing. What's it like when you raise $50 million? What does that change in terms of, I don't know, company structure, the way you do things? Or does it not change anything at all?
They change things for us. We raised forty five before that. And so but over the course of three and a half years, this round was specifically to start building out this B2B sid e of the bus iness. And so that meant we had to hire a lot, you know, we had to build teams that were equipped to um more create and sale and market enterprise solutions like so that the org did change and conversations changed and IT wasn't all about um in other consumer side in these subscriptions.
It was suddenly about trying to do sales, and finding myself in Germany, in London, spending time with these enterprises, doing workshops and helping them learn how to transform their company so that they can actually better use You.com. A lot of the enterprise stuff for us is API-based, usage-based, which is very different in many ways. It's quite exciting to see the company shift and change.
And my role has changed in the ways I just described, and I've really loved it. I mean, it's been super exciting, and it means that we've been able to accelerate our revenue growth and kind of keep up the pace, I guess, that we'd been having for the previous years.
Yeah, congrats, Bryan, and thank you for taking the time to be on the show with all that's going on right now. I really appreciate it; I'm sure the audience does as well.
Um, so let's dig into your research. And feel free to still continue talking about You.com wherever you want to; it's very interesting. But I want to dig into some of your research that spans not just, you know, recent years at You.com but years before that, because you've been producing machine learning research for over a decade: at Stanford, at Salesforce, and now at You.com. You hold over a dozen patents, mostly in natural language processing, in areas like unified models, explainable AI, LLM evaluation, and controllable text generation. I don't know if you want to give... I mean, I guess I've kind of just given an overview, but I don't know if there's anything else you want to say as a general overview before I start digging into some specific questions, some specific
topics. — Yeah, I think I can just highlight some of the big themes and motivators that I had when I was doing that research, before we get into the specifics. You know, I came at natural language processing, I came at AI, from a philosophy perspective, initially.
I was doing philosophy of language, with computer science on the side for fun. I was interested in meaning, and in the academic world, in analytic philosophy, those questions of meaning become questions of language. But I felt like I was running into a bit of a dead end, at least with armchair philosophy.
So I wanted to use computers to study language, and really I wanted to use computers to study meaning: like, what is this thing that we're doing when we make stuff mean things? And I don't just mean in the cold, logical-positivist kind of way, true and false and so on. Like, I don't know, this can be a meaningful conversation; you know, maybe to someone listening out there, this is a meaningful thing for them to listen to. And it's not just about the content and the semantics; it's something more than that.
So I was always looking for that in my research. My first papers on contextualized word vectors, and really my broader pursuit and thesis that we should have unified models for all of AI, were also driven by this kind of desire to understand meaning. And even in focusing on controllable generation and other areas, it was all about giving us tools for understanding what we were doing. Yeah, maybe we can pause there.
Yes, let's dig into the unified models you just mentioned there, because incidentally that's the next topic I had lined up. On your website, you describe pioneering unified AI systems for NLP, for natural language processing. Tell us more about that.
What does that mean? I mean, in my mind, when I hear that, it makes me think of how LLMs, say, have been scaling: we end up discovering that they can do this kind of modeling of the world, having a world model. So, for example, when OpenAI's Sora generates a video, it seems to encode within its model parameters, somehow,
an understanding of how physics works. If a ball is moving through the footage in Sora, it kind of continues at the same speed and on the same trajectory, or if it's moving downward, its speed should increase, that kind of thing. So when I think of a unified model, I think of, yes, this LLM scaling, maybe involving more and more modalities, being able to handle any kind of intelligence task, any kind of cognitive task. That's what comes to mind for me, but I don't know if that's what you're talking about when you talk about unified systems.
Yeah, I think that's in the right direction, if not exactly what I'm talking about. When I was starting out in research, and when I walked over to the computer science department, I was looking for someone who was working on language. I found my current cofounder, Richard.
He was doing his PhD, and he was working on deep learning for natural language processing, and his dissertation...

I've known of him for a long time, because back in two thousand sixteen I started running this deep learning study group in New York, and we would decide together what coursework to follow. We started off with a deep learning textbook, but then we'd kind of covered the basics, and we wanted to get to the cutting edge. So we first went through Andrej Karpathy's course with Fei-Fei Li, you know, the kind of general deep learning course, which has a lot of machine vision applications. And then the next thing that we did was Christopher Manning and Richard Socher's deep learning for NLP
course, right. — So that's exactly the material that I was encountering for the first time in two thousand thirteen, when I met Richard. And a lot of the core hypotheses were being explored there, you know, like this idea of a word's meaning being associated with the contexts in which it's used. I think I'm going to butcher the quote, but there's a guy named Firth, and he says something like: you shall know the meaning of a word by the company it keeps.
Mathematics forms the core of data science and machine learning, and now, with my Mathematical Foundations of Machine Learning course, you can get a firm grasp of that math, particularly the essential linear algebra and calculus. You can get all the lectures for free on my YouTube channel.

But if you don't mind paying a typically small amount for the Udemy version, you get everything from YouTube plus fully worked solutions to exercises and an official course completion certificate. As countless guests on the show have emphasized, to be the best data scientist you can be, you've got to know the underlying math. So check out the links to my Mathematical Foundations of Machine Learning course in the show notes, or at jonkrohn.
com/udemy. That's jonkrohn.com slash U-D-E-M-Y.

Yeah, exactly. I think it's originally Wittgenstein...

Exactly, yes, exactly. It goes back to... that's what I was noticing, right? I was coming over and saying, like, hey, this deep learning stuff and these word vectors are actually a great way for me to test out some of these philosophical hypotheses about meaning. Wittgenstein was exactly that; the idea had been sitting in my mind.
Since you brought up that quote: I think I end up misattributing it. I have many times over the years taught it, and I said, you shall know a word by the company it keeps; but I just looked it up quickly and you're right, it's John Rupert Firth.

Yeah, and it's similar enough to Wittgenstein's ideas that it makes the same brain connection I made back in two thousand thirteen: like, oh, that's basically the same thing. But they were quoting Firth a lot, and I was thinking a lot about Wittgenstein instead. And the first demo I saw was Mikolov's word2vec demo: this minus this plus this equals this, and wow, this is working too.
We should be able to do that quickly here on air. It's, like, king minus man plus woman equals queen, yeah.
That's the one, and they had a nice demo. You could go and type in different ones; you'd do, like, dentist minus teeth plus heart, and you'd get cardiologist. And I go, this is so cool.
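The analogy arithmetic being described can be sketched in a few lines. The tiny 3-d vectors below are invented for the demo (real embeddings are learned from corpora and have hundreds of dimensions), but the mechanics are the same: subtract, add, then take the nearest remaining word by cosine similarity.

```python
# Toy illustration of the word2vec-style analogy "king - man + woman ~ queen".
# The 3-d vectors are made up for the demo; real embeddings are learned.
import math

vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "apple": [0.2, 0.3, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def analogy(a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the query words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("king", "man", "woman"))  # → queen
```

In a real toolkit like gensim this is what the `most_similar(positive=..., negative=...)` style of query computes over the full learned vocabulary.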
Actually, it ties in, in a way, to what we were talking about earlier with vector search, where the semantic meaning of things is encoded in a high-dimensional space, and it's that same kind of property. I mean, Tomas Mikolov, like you're describing, with word2vec... that was really my introduction as well to this idea of meaning, or maybe it was Christopher Manning, or Socher.
But I remember the way that Christopher Manning, who may have introduced me to this idea of smeared meaning, would say it in his Australian accent: as opposed to trying to have discrete points in, like, a tree, you know, using some kind of tree ontology to represent the semantic relationships of things, instead you allow meaning to be gently smeared across all of the thousand dimensions, or whatever you have, of a vector space.

And I think that was a controversial idea at the time, you know. Mikolov or Chris Manning aside, no, it was not accepted, necessarily, that you could cram the meaning of a sentence into a vector; that was a pretty radical idea. But it felt so right to me. And more than that, there were some early papers by Collobert and Weston as well; a wonderful one was called, like, NLP (Almost) from Scratch.
And I got really obsessed with this idea that, at base, this was working: there was some sort of structure to language, and maybe meaning, that we could encode with these models. And it was starting to look like it could actually work.
But the way that most of the field still did research, and the way that they were building AI, was by taking a task, like machine translation or sentiment analysis or question answering or named entity recognition, whatever it is. We would pick out a conceptually well-defined task, and you'd architect a neural network for that task, you'd get data for that task, you'd train the model, and then you'd have a model for that task. And that just seemed so wrong to me from day one.

So much smarter and more forward-looking than me, because I'd have been like, well, that's just how the science is always done. Oh no, so small-minded.
This is, this is the thing that's wrong, right? Like, the step that we were taking with deep learning for language and vision was mostly for perception-based tasks.
But why would we... why would we try to teach every model that's going to do a language-based task language from scratch? Like, is it not important, in some sense, for that model to know all of English and be as fluent as possible before it decides whether a sentence is positive or negative? Now, is it absolutely necessary in all cases?
Well, if you have a big enough dataset, maybe not, you know. But this was also very controversial. Most people believed the task-specific approach was right. I got a lot of pushback.
I mean, it was an uphill battle. You know, I had so much conviction; it was like a war. But Richard and I agreed on this as well.
So my first kind of research paper, before I was doing it professionally, was while I was in Richard's class. I'd met him when he was a PhD student, and while I was in school he did his first startup, and then he came back as an adjunct professor to teach the deep learning for natural language processing course. So I did a class project on, sort of, multi-task and multimodal learning. I was saying, I want to create a single model that's going to do visual question answering, so answering questions over an image, and sentiment analysis.
So, classify whether the sentence is positive. Why did I want to do that? Because when people did visual question answering at the time, the way you would design that model was actually picking from a list of answers, a list of possible answers to the questions. You weren't generating a sentence. Say, okay, what color are the bananas in the image? And it would say yellow; it wasn't generating a word the way that we generate with an LLM. It was picking "yellow" or "three" as a class out of a list of possible answers, like it was a giant multiple-choice test.
Again, that felt morally wrong to me, because your model is constrained to that space; it can't really do visual question answering. And the same is true for sentiment analysis: you could choose the class positive or negative, but the words "positive" and "negative" weren't anywhere in the model. It was just a class, zero or one, and we assigned the labels positive and negative to those classes.
So I built a model that was saying, well, you don't get to do that. You have to have a shared vocabulary, and you have to pick a word, or a sequence of words, that does this. It didn't work, you know, super well. But that was the idea: of saying, one, tasks, and the way that we very nicely and conveniently, conceptually defined them, are made up; let's stop doing that. And modalities, differences between modalities, are not necessarily made up, but they could be very important for each other.
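The contrast being drawn can be sketched minimally. The "classifier" below returns an opaque class index that a human maps to a label after the fact; the "generative" framing makes the model emit the answer word itself, from a vocabulary shared with every other word. Both scoring functions are trivial stand-ins, not real models.

```python
# Two framings of sentiment analysis. The scorers are stand-ins for models.

LABELS = ["negative", "positive"]

def classifier(sentence):
    # classification framing: returns only an index; the words "positive"
    # and "negative" exist nowhere inside the "model"
    return 1 if "great" in sentence else 0

VOCAB = ["negative", "positive", "yellow", "three", "the", "movie"]

def generative_model(prompt):
    # generative framing: the answer is itself a word drawn from the
    # shared vocabulary, the way an LLM would emit it
    return "positive" if "great" in prompt else "negative"

idx = classifier("a great movie")
print(LABELS[idx])                                   # label applied after the fact
print(generative_model("sentiment: a great movie"))  # answer is itself a word
```

The generative framing is what lets one model, one vocabulary, serve many tasks, which is the unified-model thesis in miniature.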
And that's what they then hired me at Salesforce to do, essentially. When I joined Salesforce Research there were maybe four of us, after Richard's startup was acquired. My whole thing was, okay, how do we make a unified model, for NLP specifically? When I was at Salesforce I always deep down wanted to do all modalities, but there was enough work to do with NLP that I focused on that first. And for the first year, I tried, I really tried, and I built a very generic architecture. I was taking summarization, question answering, machine translation, which were some of the hardest tasks at the time, with a shared vocabulary. I was trying to make it work, but the datasets were really early; we didn't have a lot of the common ones people use today, or even a year later. So it didn't really work.
That reminds me... maybe it was around the same time as the T5

model from Google? — This would have been before T5; this would have been twenty fifteen, twenty sixteen, actually. Yeah, it didn't work super well, but I backed off and I said, okay, I'm going to try to establish some connection between tasks. I'm going to show something like: if I train a model to be good at machine translation from English to German, then the part of the model that learns English should be learning something helpful for a model that wants to do question answering on English data, on Wikipedia. That was just the basic connection I wanted to establish: that transfer learning in natural language processing could work beyond just word vectors. Because people were using word vectors, but they weren't transferring the architecture, the models on top of the word vectors. Everyone was using GloVe and word2vec, but the layers of the neural net, whether they were LS...
That was my next question, exactly, because this predates... two thousand sixteen predates Transformers being used for this kind of thing, and I was about to ask if you used some kind of recurrent approach. LSTM, is the answer.
Which, yeah, doesn't surprise me. I transferred some bidirectional LSTM layers. So you'd start with word2vec or GloVe.
Then you would train these biLSTM layers for a task like translation or summarization or whatever, and then you'd transfer that and show that you do better on the second task if you learned the first task. Which, you know, again, for some people was not at all intuitive.
Why would that help? But to me it just felt so right: you're just learning more of the statistics of English from one task like translation, where we had a lot of data, relatively speaking, versus QA, so you'll actually do better. And that was my first paper where the accidental insight... well, there were two accidental insights that were very important for me when I was doing that research.
One was the hypothesis itself, which originally I set out as unified models, and then I backed off to transfer learning and found: oh, it's all about context. Word vectors were about context. Now these contextualized word vectors are about incorporating more context.
And if you want to think about cross-modality training, that's extra context. And then for years after, it was obvious to me that we needed to get really, really large context windows, because you need to have longer context. So for a long time, I worked on really long context windows.
It was also clear to me that GPU memory was super important, because then I could fit more into context. So a lot of these things just became natural implications of this idea of context, context, context. And the second insight, which will be really relevant to your Transformer question, was... I messed up an experiment.
I accidentally ran an experiment for this translation model where I trained it translating English sentences into German sentences, but I accidentally made the word vectors on the English side completely random and untrainable. So they couldn't change.
They were just nonsense. The German word vectors could change, and the layers on top could change, but those word vectors that were supposed to encode semantic meaning were random. And the model did just as well, just as well as without that mistake.
It was fine. So, what's going on there? I started thinking: that's odd. I'd been looking for meaning; I kind of thought there was some meaning in these word vectors. What's actually happening is that my encoder, or my decoder, through this attention mechanism, is aligning these symbols. It's aligning these word vectors so that it can solve the translation problem. As long as some of them get to move, they can move around so that this random set is aligned and the problem can still be solved. So the translation problem is really about alignment of symbols, and not meaning. It wasn't translating meaning, which was disappointing for me philosophically; there was, like, no meaning anywhere. And very interesting, because then I got obsessed with this idea: could I get rid of the recurrence? The language I used at the time was: it's just an alignment problem.
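The accident described above can be replayed at toy scale: assign each source word a random vector that is frozen (never updated), and train only the layer on top. Because the random vectors are distinct, the trainable layer can still align each symbol with its translation; no "meaning" in the embeddings is required. The tiny English-German lexicon, the dimensionality, and the softmax-regression "model" below are all invented for the demo.

```python
# Frozen random embeddings, trainable layer on top: the layer learns to
# ALIGN each fixed random symbol with its translation anyway.
import math
import random

random.seed(0)
english = ["dog", "cat", "house", "tree"]
german = ["Hund", "Katze", "Haus", "Baum"]
DIM = 8

# frozen random embeddings: assigned once, never trained
embed = {w: [random.gauss(0, 1) for _ in range(DIM)] for w in english}

# trainable softmax layer: DIM inputs -> one class per German word
W = [[0.0] * DIM for _ in range(len(german))]

def scores(x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in W]

def train(epochs=500, lr=0.1):
    for _ in range(epochs):
        for target, word in enumerate(english):
            x = embed[word]
            s = scores(x)
            m = max(s)
            exp = [math.exp(v - m) for v in s]
            z = sum(exp)
            for k in range(len(german)):
                grad = exp[k] / z - (1.0 if k == target else 0.0)
                for d in range(DIM):
                    W[k][d] -= lr * grad * x[d]  # only W updates, never embed

def translate(word):
    s = scores(embed[word])
    return german[s.index(max(s))]

train()
print([translate(w) for w in english])  # learns the alignment anyway
```

The embeddings carry no semantics at all here; all that matters is that they are distinct, which is exactly the "alignment, not meaning" observation.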
And so, when the Transformer paper came out... we had been trying to get rid of recurrence for a long time, because it was very slow; it didn't really make sense for GPUs. It was bad. So I had tried to make attention-only models myself, thinking: there's this attention mechanism, and this is just an alignment problem. So it was no surprise to me when Transformers came out and it was "Attention Is All You Need." It was like, finally someone figured it out and unlocked this part of the problem. And I think that architecture ended up being even more general than I expected. But it was something that several of us in different labs were on the hunt for, you know: there was something fishy going on with recurrence, and recurrent neural networks, that didn't need to be there.
Then I came back to the unified model approach. So in two thousand seventeen, my first paper got published at NeurIPS, Transformers came out, and then the CNN/Daily Mail dataset came out, and SQuAD came out from Stanford. So then there were more canonical ways to do these hard tasks.
And I did my first real unified NLP paper, which was never published; it was very much rejected. I gave talks on it at Google Brain and Apple and a bunch of different universities, and it really kind of split the room. Everywhere I walked into, it was like fifty-fifty, or seventy-thirty, with people saying this is a terrible idea. The majority, you know, was saying: this is a bad idea, we should not be doing this, for the same reasons.
An engineer, for example, at Google would say, you know, this is a bad idea. You can't be mixing all of this together and making giant models. How's it going to scale? We need to be decomposing the problem into smaller problems and then having task-specific, more specialized things, and we'll always do better on a task if we do that.
That felt wrong to me. I really wanted to teach a model language, so that we could use language to describe what we wanted it to do, and so that I could use language to do all of the things that we do with language, rather than have the model be stuck on some artificial task.
And that paper was very, very similar to the T5 paper, almost identical in concept, and the T5 paper came out, like, fifteen months later, I think, something like that. Because BERT came out fifteen months after my contextualized word vectors paper, and then T5 came out fifteen months after my decaNLP paper. I remember it was always fifteen months for somebody. I would have this idea, I'd do this paper, and then they would replace my LSTMs with Transformers, and then they would add a bunch of, like, way more data and way more compute, and then it would be the same, but better.
You know, honestly, at the time, I was in this tiny lab, Salesforce Research, thinking to myself: how do I get the Googles and the Facebooks and every Brain team to, like, do my research for me? So I kept trying to put out these ideas, and they would get obsessed with, say, the contextualized word vectors, and then fifteen months later we'd have ELMo, and versions of ELMo, and BERT.
And, oh yes, in the meantime I would have gone to work on the unified model stuff, and then over that fifteen months people would build things like T5 and the other ones, while I was working on my next thing, which is what came out after that. So yeah, that was an awesome paper, the T5 paper.
I think it's the one that better showed that this was a viable direction. But the same core idea was there: that you should describe what you want the model to do in language; you should have a general approach to generating the answer; you should not be picking from a fixed set of classes, although the line is blurred. It should be as general as possible. And yeah, that was the second big one for me, and that's what was really unified-AI about my research, I suppose. And then I always wanted to incorporate vision, but NLP kept me busy enough in the research that I never got around to it myself.
It's been super cool to hear that story. I've been, like, on the edge of my seat this whole time. It's so cool to hear, kind of from the inside, these things that I've been studying for a decade completely from the outside, just seeing pieces emerge. But yeah, you were right...
You were right there in the machinery, causing some of this stuff to happen. So, super cool to hear that from you. Another concept that really interested me from your research, and it's something that I don't understand... it showed up in our research for this episode, and I haven't dug into it enough to know what it means, because I knew I could ask you on air.
So, this is about controllable generation. You worked on conditional Transformer language models for controllable text generation, and it sounds like the model, or the approach, was called CTRL, like the Control key on the keyboard, and then a summarization version called CTRLsum, C-T-R-L-S-U-M. So yeah, what is that? What are these things?
Yeah. So I guess, to continue the story a little bit from the decaNLP paper: as I was saying, a lot of people did not pick that up and did not buy that direction, but there were two groups that did. There was a group inside Google, a smaller group, that fifteen months later did T5.
And there was a group inside OpenAI that, roughly eight or nine months later, released GPT-2. And if you go back and look at the GPT-2 paper, they cite McCann et al. 2018 on this idea that we could be using language as a generic way to get a model to do things. When they released GPT-2, it was super exciting, but they had this release strategy of not sharing, like, the full model right away.
And so I said, okay, I will make my own, but I'm going to do it slightly differently. They had done it the way that I think most people would know best now: they took a language model and trained it to predict the next word.
So, given a sequence of words, maybe five hundred eleven tokens, predict the five hundred twelfth token. And that's how you get your language model. That's how you generate all the text that we're generating today.
More or less, you say: what's the next word? Now, what's the next word? And what's the next word after that? And you get a language model that's really, really good at predicting the next word.
That makes a lot of sense for all sorts of situations. But at the time, you know, GPT-2 wasn't that good. It was not GPT-4; it was not GPT-3.
It was sometimes generating sentences that seemed like maybe the right style; people liked that you could generate a sentence or two, maybe a paragraph, of coherent text before it started repeating itself. And people started immediately anthropomorphizing these models, right away, like, oh my god, it's conscious, it's a person, whatever. And for me... so this is all explaining why I took this approach of, one, trying to make the language model very controllable, kind of from an almost moral, ethical perspective.
People started asking things like: well, if I type something into a language model and then it generates a bunch of nonsense that's, like, bad, is that my responsibility? People started asking these kinds of questions, right? Well, if I put this false information on the web, but I didn't write it, the language model wrote it, do I have any real accountability? And I really liked that direction: you know, the models were getting better and better, and I'd prefer if they were more controllable, so that, one, the accountability is there, and we could get what we wanted more easily. It was hard to get what you wanted: if you wanted to write a poem in the style of, say, Shakespeare, you'd start writing a sonnet in the style of Shakespeare, and then the language model would pick up on that and keep going. But you couldn't say: write me

a sonnet in the style of Shakespeare about X. — Mm-hmm, you'd often have to do few-shot prompting to get it to work.
Exactly. So I just trained it slightly differently: I gave it the source, a little like a URL. Every time I got a document from the internet to train this language model, I put the source, like the URL, at the start of the sequence. So every piece of text that the language model learned to generate was always conditioned, source-wise; conditional generation on the source.
And the URL... the nice effect that had was, let's say you take a URL like cnn.com/politics/the-president-went-to-germany-today, and there's an article associated with that. Well, at inference time, if you trained it with that structure, then you can go in and, instead of trying to find a clever way to start writing the news article, or to prompt the LLM to get what you want, you just say cnn.com/, you know, sports/, the name of the article you want, and the date, and then it would write that article.
So it's a much more reliable way of generating. You could also do source attribution through the language model, whereas GPT-2 couldn't do that: you could give it a sentence and say, tell me which source this was from, by looking at the conditional probabilities of the different sources. That was useful. And you could kind of play with these parameters in a way that felt more like knobs and dials instead of just, like, alchemy with text. A lot of that is still used for different tasks that have structure, like generating proteins, where you really need to condition on the function and the family of a protein. Whenever there's a really conditional structure, it's useful to have these control codes, as we called them; they give you just a little bit of extra constraint on the output.
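The source-attribution trick can be sketched with a toy conditional model per control code. Here, each "source" is represented by add-one-smoothed unigram counts, a crude stand-in for a CTRL-style LM's conditional probabilities; the two sources and their training text are invented for the demo.

```python
# Toy source attribution via conditional probability: score a sentence
# under a tiny per-source model and pick the likeliest control code.
import math
from collections import Counter

CORPUS = {  # invented training text per control code
    "cnn.com/politics": "the president met the chancellor to discuss trade policy",
    "cnn.com/sports": "the striker scored twice as the team won the final match",
}

counts = {code: Counter(text.split()) for code, text in CORPUS.items()}
vocab = {w for text in CORPUS.values() for w in text.split()}

def log_prob(code, sentence):
    """Add-one-smoothed unigram log-probability of sentence given a source."""
    c = counts[code]
    n = sum(c.values())
    return sum(math.log((c[w] + 1) / (n + len(vocab))) for w in sentence.split())

def attribute(sentence):
    return max(CORPUS, key=lambda code: log_prob(code, sentence))

print(attribute("the team scored in the match"))              # → cnn.com/sports
print(attribute("the chancellor will discuss trade policy"))  # → cnn.com/politics
```

Generation works the same way in reverse: fix the control code as a prefix and sample text conditioned on it, which is the "knobs and dials" behavior described above.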
Very cool. That ties to maybe my last really technical question. In addition to generating text, you repurposed language models for protein generation, which makes sense to me because, as you know, I have a biology background, a neuroscience background.
And so I'm aware, though maybe not all of our listeners are, that the proteins in your body do all the functional stuff for you; nearly every imaginable thing your body can do happens because of, well, with some small exceptions, but generally speaking, proteins doing all the work. And proteins are a sequence; they're a one-dimensional sequence, just like a character string. They're made up of these things called amino acids, and each amino acid has different properties, and you basically create this chain of amino acids. You could think of them as like letters of the alphabet.
And yeah, they allow you to create the vast, incredible amount of functional capability that our bodies have. You know, the proteins that allow your eye to see, your liver to detoxify alcohol, your skin to do all the things it does. You could go on with examples for everything else our bodies do, and all of this is executed by these one-dimensional sequences of a relatively small number of amino acids, twenty or so in humans. And so, something interesting: there's this connection I want to bring up, and this is just a stab in the dark.
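The point that proteins are one-dimensional strings over a small alphabet is literal: a protein sequence tokenizes exactly like character-level text, which is why language-model machinery transfers so directly. The short sequence below is invented for illustration.

```python
# Proteins as strings: the 20 standard amino acids form the alphabet,
# and a sequence tokenizes exactly like character-level text.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard one-letter codes
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def tokenize(seq):
    """Map a protein sequence to token ids, one per residue."""
    return [VOCAB[aa] for aa in seq]

seq = "MKTAYIAKQR"  # invented example sequence
print(tokenize(seq))
```

From here, the same next-token training objective used for English applies unchanged; only the vocabulary differs.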
I don't know what you're going to say here, but about a year ago I went to Berlin and I interviewed Ingmar, who has a startup that... they're in the business of doing this, of... yeah, exactly, exactly. And when I was there with Ingmar, he mentioned your cofounder, Richard Socher, having recently been there at the Merantix AI campus in Berlin. I don't know, it just seems like there's a connection there in some way, which could be spurious.
Yeah, I'll have to meet him one day. My connection with the protein world has evolved primarily through my coauthor on the ProGen paper from the Salesforce days. As you said, we kind of took the CTRL model and trained it for protein generation; that was called ProGen.
And we were doing exactly what you were thinking: there are way more sequences of proteins than secondary and tertiary structures, which are expensive to generate. So, what if we can make a model that just depends on the sequences? That then spun out into a startup called Profluent; my coauthor's name is Ali Madani, a great guy.
They've been doing great work, because we had shown that you could generate these proteins and you could synthesize them in a wet lab; you could get proteins that did not appear in nature but had better fitness, lower energy. So they were, like, better overall, better at the tasks they were designed for. And that became Profluent. So I'm still connected to that world, not through Ingmar directly, but I'd love to talk to him. Maybe we've met and I can't remember.
But I think,
to generalize a little bit: continuing to push all of these themes. Like, okay, deep learning versus machine learning, right? Getting out of the way. For me, that move is about getting ourselves out of the way of the algorithms as much as possible. So instead of designing features, you just make them parameters. And then with Transformers we kind of got out of the way again: instead of having this recurrence and our conceptual biases, you just have an architecture that more or less says, make your models big.
And then it allows for sharing of information in context. Context, context, context, keep adding context: larger context windows, context vectors, whatever. Unify as much as possible, because whether it's vision and language, or just different parts of language like code, the fact that code helps language models with logical tasks, right, helps you do better on those kinds of questions. So that is kind of interesting. The fact that taking CTRL, which was a model trained on English, and using that to train on proteins gave a much more stable training curve and a faster learning curve than training from scratch, that's odd.
What does English have to do with sequences of amino acids? There's something general enough about learning how to do sequence generation, or some similarity at the core of it all. And I think we need to keep pushing all of this into the natural sciences more and more, so biology, chemistry, physics. You know, I don't know if I've said this before, at least publicly, but I think the same way I felt in twenty thirteen about
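To make that "it's all just sequences" intuition concrete, here's a minimal, purely illustrative Python sketch. The alphabet, sequence, and function names below are my own toy examples, not ProGen's actual tokenizer: the point is simply that an amino-acid chain drops into the same next-token-prediction setup as English text.

```python
# Purely illustrative sketch (toy alphabet and sequence, not ProGen's
# actual tokenizer): a protein is "just a sequence" to a language model.
# The 20 amino-acid letters play the role of a tiny vocabulary, and the
# training objective, predicting the next token, is identical to the
# one used for English text.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def tokenize(sequence, vocab=AMINO_ACIDS):
    """Map each residue to an integer id, exactly as a character-level
    text model would map letters."""
    lookup = {ch: i for i, ch in enumerate(vocab)}
    return [lookup[ch] for ch in sequence]

def next_token_pairs(ids):
    """Build (context, next-token) training pairs: the same objective
    whether the tokens came from English prose or a protein chain."""
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]

pairs = next_token_pairs(tokenize("MKTAYIAK"))  # toy 8-residue sequence
```

Nothing in this setup knows it is looking at biology, which is why weights pretrained on one kind of sequence can plausibly transfer to another.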
the deep learning transition being good, but our imposition of constrained tasks on how we were doing AI being bad, and that we needed to move towards more unified stuff. I feel the same way about science. Like, there's something about the way that we've been doing science.
It's a little bit constrained by our perceptions and our projections onto the world, which perhaps AI, broadly speaking, some sort of computational approach, could unlock for us. And it might feel very similar, right? It might feel at first like it's less explainable, right? People always go back:
well, the move from machine learning to deep learning was less explainable. Then it's all the data, and that's less explainable. Oh, and we can't explain what's really in a word vector anymore.
I think there's an opportunity to go for something really, really fundamental about our understanding of the universe by getting out of the way and just giving as much context to these systems as possible. And, you know, I keep a topology book with me, a couple of them, because I feel like there's something missing that we probably can't figure out, but maybe AI can for us. And we might not explain it in our current terms, but it'll be a much better predictor of how things work, and we'll find use cases for that. It makes perfect sense to me.
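As a side note on the word-vector explainability point: here's a tiny sketch with made-up vectors (not from any real model) showing what "less explainable" means in practice. No single dimension has a readable meaning, yet the geometry, the angle between vectors, still encodes similarity.

```python
import math

# Made-up 4-dimensional "word vectors" for illustration only; no real
# model produced these numbers. Individual dimensions are opaque, but
# nearby vectors correspond to related words.
vectors = {
    "king":  [0.8, 0.3, -0.2, 0.5],
    "queen": [0.7, 0.4, -0.1, 0.6],
    "car":   [-0.5, 0.9, 0.4, -0.3],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, near 0 or negative
    means unrelated directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v)

# In this toy space, "king" sits much closer to "queen" than to "car".
similar = cosine(vectors["king"], vectors["queen"])
unrelated = cosine(vectors["king"], vectors["car"])
```

The meaning lives in the relationships between vectors, not in any inspectable coordinate, which is exactly the trade-off being described.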
I think you're spot on, one hundred percent. I mean, to kind of make this a little bit more concrete, or explain it in a slightly different way: when we go to university, you know, you studied philosophy and computer science in separate departments. And those separate departments cover a kind of standard curriculum that has evolved over time; it's like, this is what's important in philosophy.
This is what's important in computer science; you're going to learn them in their own courses, everyone is going to do it over here. But there are those constraints of saying this stuff, lots of stuff, belongs over here, in this building, with these people; computer science belongs over there. And you know, some people like you do study philosophy and computer science, and in some way your mind might be able to then make connections between them and have interesting ideas about semantic meaning and how natural language models could work, or unified models could work.
But we can only get exposed to so many different things as humans. An AI system can scale way, way, way more than us, and it can not just be learning philosophy and computer science; it can be learning every subject and putting all subjects into a high-dimensional vector representation,
or a context window.
Yes, right. And somehow, yeah, in ways that we might not be able to understand, it'll be able to make predictions and see similar ideas across all knowledge in a way that no human ever could.
I think so, yeah. So I'm looking forward to the next few decades of science. As we learn how to incorporate these tools more and more, maybe our fundamental understanding of the universe will change, and we can turn to the other problems we have with it. Yeah, and so then maybe this
is kind of my last question, which could potentially be a doozy. But you recently wrote a blog post in August called Searching for Meaning in the Age of AI, and in it you forewarn that humanity is at the beginning of an existential crisis. So do you want to explain more about that blog post? Maybe just, more generally, provide us with some insight into where you think this is going. We kind of got a sense right there from your science perspective, which I think is spot on, but what are the implications for humanity in a world where there are superintelligent systems? Kind of like, the episode of the podcast that I released today is on Dario Amodei's fifteen-thousand-word blog post; you know, he's really techno-optimistic about powerful AI. And yeah, do you agree that we'll potentially be heading towards this hopeful kind of state where AI is solving a century's worth of science problems in a decade, like Dario describes? Or, you know, how do you see things playing out?
So I'm generally, I think I'm a techno-optimist here too. I'm generally very optimistic. I have an even earlier blog post on a similar theme from twenty sixteen, and maybe some other ones too. I was reading a book called The Life of the Mind by Hannah Arendt, and she's talking about, you know, Hegel and thinking and all these other things, and I wrote something like, you know, our future with AI is going to destroy mankind. But I didn't mean it in a Terminator kind of way, but more so because so much of our focus is on truth and knowledge and science.
Well, we may very well be obviated in that process, because, as we were just saying, these systems might be better at all of that than us. But so much of our identity as mankind, or Hegel's world spirit, or Geist, as some people will use that word, is related to that, right? Like, we've built so much of who we are and our sense of what humanity is on this pursuit of truth, pursuit of knowledge, as researchers. And so I was writing about how, at the time, and again in twenty twenty-four, there's the short term, the medium term, and the long term. The short term is that it's all wrong.
We're going to figure that out. The medium term is really, really hard, because our sense of identity will be challenged more than ever before: from chess, to Go, to now language and meaning, and potentially all of knowledge and science, AI will be better at than us.
So in the medium term, jobs will be lost, people will have to transition and be reskilled, and it's going to happen really, really quickly. And that's going to be hard. The medium term is going to be hard. And then the long term, I was thinking, well, that will probably be the best of all times to be alive.
You could argue that that's already the case, but I think it will definitely be true, especially long term, because if you think about us being kind of freed from that pursuit in some sense, then we can focus on all of the other things that maybe we are still uniquely special at. And I don't really know what those things are. But as we get pushed out of certain domains, we'll find out what humanity is really special at, if anything. And if nothing, then great, we can just focus on beauty and wonder and admiration, and see whether that separates us from machines or not. We may not be uniquely intelligent, but we may very well be uniquely privileged in the way we perceive the world, and we can enjoy it by creating meaning. I love that, that
is beautiful. I couldn't agree more, and I think that's a really great point to end on. But before I let my guest go, and you've been really generous with your time, we've run over on our recording time, but quickly, do you have some book recommendations for us,
our audience? Yeah, well, from the engineering side, you know, as I was becoming a CTO and starting on my startup journey and things like that, I really liked a book called An Elegant Puzzle.
I like that one as a conversation starter, not as how I necessarily did things, but it's one I usually have all my people read as they become managers and staff engineers. And pick up any book on topology if you want to try to figure out what's going on in the universe. I think there's something interesting there that we don't quite understand yet, but that might be like the new linear algebra in five years or ten years.
You're talking about,
like, donuts? Kind of, yeah, like general topology, and just seeing the world in terms of stuff and functions. I think that's going to be a more important way of seeing things in the future. And I don't know, if you want to read along with me, we have a book club where we're reading Moby-Dick, so we'll do that
one. That's cool. How do people follow along with the book
club? Um, you can reach out to me if you want.
Um, so just like an email?
Yeah, you can email me if you want, bryan at you dot com, and we can chat about Moby-Dick. Our past book club books were Ulysses by James Joyce and Middlemarch by George Eliot. One of the fun things that we did with the Ulysses one: Ulysses takes place in Dublin, and it happens on one day.
Take a seven-hundred-page book; it happens on one day. It's called Bloomsday in Ireland: Dublin, June sixteenth. And so after the pandemic, over eighteen months, we read two chapters a month. And then we all went to Dublin on Bloomsday and met there in person, some people I'd never met before and some friends from college. So yeah, don't forget to get your daily dose of literature and fiction, to keep your mind nimble in what's not real yet. I love that,
and this is also a perfect segue to my final question, which is how people should follow you. So you already kindly provided us with your email address; are there social media platforms or something like that where we can follow you for your thoughts after the episode?
Yeah, I'm most active right now on LinkedIn, so you know, you can follow me and connect with me there. I do have Twitter, but really the easiest way is just to go to my website and you can find all the links there. It's just bryanmccann.org, so B R Y A N M C C A N N dot O R G, and you'll see my links, you'll see my blog posts, you'll see my poetry, my paintings, and all the things that we can potentially connect on and make a meaningful
connection on. Very nice. And we'll have that website linked in the show notes for sure.
Bryan, thank you so much for taking the time. This has been a fascinating episode. I've loved every second of it. I really appreciate you taking the time.
Likewise, thank you so much for having me. It was really great.
Amazing episode. In it, Bryan filled us in on how natural language processing was revolutionized by the idea that words derive meaning from their context, leading to innovations like word embeddings and transformers. He also talked about how early unified AI models showed that training on one language task, like translation, can improve performance on other tasks, like question answering, by learning deeper language understanding.
He talked about how You.com distinguishes itself from other information-retrieval AI companies like Google and Perplexity by focusing on complex automated workflows and enterprise solutions, rather than competing as a search engine. He talked about how a language model trained on English text proved surprisingly effective at generating novel protein sequences, suggesting fundamental similarities in how different types of sequences are structured. And he talked about how, while AI may outpace humans at knowledge and science tasks, this could be a good thing.
It could free humanity to focus on other unique qualities we may have, like beauty, wonder, and finding meaning. As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, and the URLs for Bryan's social media profiles, as well as my own, at superdatascience.com/835. Beyond social media, another way we can interact is coming up on December fourth, when I'll be hosting a virtual half-day conference on agentic AI, a very hot topic. You won't want to miss it; it'll be an interactive and practical session, and it'll feature some of the most influential people in the development of AI agents as speakers.
It'll be live in the O'Reilly platform, which many employers and universities provide access to. If you don't already have access, however, you can grab a free thirty-day trial of O'Reilly using our special code SDSPOD23. We've got a link to that code for you in the show notes.
All right, thanks to everyone on the Super Data Science podcast team for producing yet another extraordinary episode for us today. For enabling that super team to create this free podcast for you, I am so very grateful to our sponsors. You can support the show by checking out our sponsors' links, which are in the show notes, and if you yourself are interested in sponsoring an episode, you can get the details on how to do that by heading to jonkrohn.com/podcast. All right, share this episode with
people who would love to hear about Bryan's amazing thoughts. Review the episode on your favorite podcasting platform; that helps us out. Subscribe, of course, so that you don't miss updates if you're not already a subscriber. But most importantly, I just hope you'll keep on tuning in. I'm so grateful to have you listening, and I hope I can continue to make episodes you love for years and years to come. Until next time, keep on rockin' it out there, and I'm looking forward to enjoying another episode with you very soon.