
The Future of Cloud Data Collaboration (with Samooha Co-Founder Kamakshi Sivaramakrishnan)

2023/3/20

ACQ2 by Acquired


The episode discusses the importance of cloud data warehouses and their role in enterprise data management, setting the stage for a deeper dive into cloud collaboration.


Hello, Acquired LPs. We have a super cool episode today. David and I were brainstorming after the AWS episode, after we sort of talked about the power of cloud data warehouses, what would be the appropriate way to dive deeper into AWS,

why cloud data warehouses matter so much, how this became this big strategic part of cloud, and how enterprises actually use them? Because I think we ended the episode, David, by saying that, oh, this is like the most important thing in cloud today after, you know, databases. And okay, great, Snowflake's doing well. Billion-dollar miss by Amazon.

We'll leave it at that. And so the way that we wanted to explore this is through the launch of a new company called Samooha. And we have the founder and CEO of the company with us today, Kamakshi Sivaramakrishnan. Kamakshi, welcome to Acquired. Thank you so much, Ben and David. It's great to be here. Excited about the next hour plus or so of us talking about

Everything cloud and Samooha. Yeah. Well, listeners, just so you understand the crazy credentials of Kamakshi, before we sort of dive in to have her articulate the cloud landscape to us today, Kamakshi, your company was called Drawbridge, is that correct, before starting Samooha? That's right. So actually, after Stanford, I was a part of this like...

what I would call kind of a phenomenal company called AdMob that was acquired by Google. So we were all called the mobsters, like the PayPal mafia. That was kind of the new version of what we had. And that was kind of... And that was the first mobile ad network, right? That's right. And it was kind of crazy. This was way back in 2007. I was just graduating from Stanford and we all had these flip phone devices and

And you had these scary kind of, you know, message coming from your telco operator saying, are you sure you want to launch your internet browser? Your data rates are $1.99 per minute. You're like, ah, no. So then kind of this whole notion of monetizing mobile web kind of sounded familiar because Google and kind of the others had done it on the traditional web, but

mobile web wasn't very accessible to everyone. So it was kind of an interesting company, obviously an interesting experience. So many cool people that AdMob brought together. And then from that point onward, it's a journey that I've never kind of looked back on. I fell in love with entrepreneurship and kind of being a founder. And being a founder is such a cathartic and transformative experience.

So from that point onwards, it's been kind of that journey one after another. So AdMob exited to Google in 2009. The transaction closed in 2010. A bunch of us spent a flash of a few months at Google and...

There were a number of companies that were born out of AdMob, and I started Drawbridge. It was an interesting journey in and of itself, another transformative experience for me. Sold that business to LinkedIn and Microsoft in 2019. Interesting part there was I literally was selling my company and giving birth to my daughter at the same time. Oh, my gosh. Not like in a...

You know, it was happening around kind of, you know, the same quarter. No, it was happening the same day, couple of days. Oh my God. It was kind of an interesting exercise. And Kamakshi, we're not going to ask you to confirm this, but just for listeners, if you Google around, you see a sort of reported $300 million...

neighborhood for what that acquisition was. So important purchase by LinkedIn for their advertising products. And then I think you went on to work closely with the CTO of Microsoft after that, right? That's right. So, LinkedIn being a Microsoft company, I spent a couple of years at LinkedIn. LinkedIn has this best-kept secret called LinkedIn Marketing Solutions. So when you're on the LinkedIn feed, a lot of enterprise marketing for a captive audience of, you know, professionals and this kind of, you know,

professional networking environment called LinkedIn happens on that feed. And that's LinkedIn Marketing Solutions, as I said, a best-kept secret because it's a highly profitable business, a high-margin business, and a large business at that as well. And so Drawbridge was... LinkedIn is the best-kept secret within Microsoft. It's $10 billion plus revenue annually, growing 30, 40% a year. Absolutely. And a very well-run business. So many factors...

I had kind of a healthy, what I would say, cynicism about like corporate environments coming kind of being this proverbial founder. But it was amazing, both Jeff Weiner at LinkedIn in terms of understanding and learning from kind of an environment at that scale. And then you go to the mothership Microsoft at the office of the CTO with Kevin Scott and

That's a whole other experience because that brings kind of a canvas that is humongous. The office of the CTO is kind of...

I was discussing this with Kevin and the deputy CTO, Lila Tretikov, who I reported into. It's kind of the SWAT or SEAL team, like a McKinsey, though a McKinsey could not do what we do, because these are such deeply technical problems that the office of the CTO takes a look at, and that helps Satya, Microsoft at large, and certainly Kevin Scott as well

look into some of these strategic directions. It was a highly educational exercise in Microsoft parlance, this kind of the H1, H2, H3 of like, you know, multiple horizons, like the, you know, H3 horizon strategic imperatives for the company. And I was constantly contemplating kind of what next for me. I could fade away into semi-retirement or I could look into kind of, you know, interesting things to do.

The first one doesn't really seem, knowing you a little bit now, how you're wired. Yeah, that didn't seem as a compelling alternative for me. I felt like I had at least one, if not two companies in me. Being at Microsoft, obviously, like the whole cloud movement, and you kind of touched upon this in the intro as well. You just cannot ignore that you're in Microsoft. So Azure was a big part of kind of the learning experience for me, having that kind of wide canvas experience.

That education gave me some conviction that, look, there's the cloud movement, to the point that you raised: data warehouses, whether structured or semi-structured, analytics workloads, and new kinds of workloads, since a lot of the imperative on the clouds is about new kinds of workloads, especially AI-centric workloads. When you think about all of that, you see that some of the problems haven't quite hit P0 yet. And I think that's a good thing.

And when I say P0, I'm talking from kind of the P0 priority perspective. And that's where I would say that there's an implicit and explicit understanding, I feel at least across all the cloud players, that we are living in a reality of cross-clouds.

We are not living in a winner-take-all reality. There is no winner-take-all dynamic here. It is a fragmented market. No major enterprise, and even mid-sized enterprise, but certainly no Fortune 500 is like 100% on AWS or Azure or GCP. Too much risk. And it's too much pricing power to give the vendor. 100%, we are talking about a fragmented market. And then when you combine that with the fact that no enterprise, none of us have a full view of customer data. So when you put some of these...

trends and facts together. You're looking at a modern data stack that has to lend itself to collaboration. Collaboration not from a perspective of goodwill, from a perspective of business outcomes. To complete those vacuums of data that we all as businesses and enterprises have.

And that's where I connect back to the comment around P0. None of the clouds had quite, at least at that time, made it a P0 imperative to support cross-cloud collaboration, to create a secure airway across clouds. When you say cross-cloud, that's not even thinking about collaborating with your customer or collaborating with a partner. It's literally like, I have data in two different clouds, some on Azure, some on AWS, and I need to be able to use the data that's in two places for my own use case. Oh.

A hundred percent. And even if you extend that problem and even further simplify it and say that, look, I have my data on a cloud, but I have compliance requirements within my enterprise as a result of which the right hand has restricted access to the left hand in terms of, you know, the data boundaries. Even that has restricted solutions today. This is less about kind of a statement around the fact that clouds cannot solve the problems. Obviously, these are hyperscalers massively, you know,

kind of resourced companies with highly talented people. It's about the prioritization and it's about the product experience. Has this been made super easy for businesses and enterprises to make data sharing, data collaboration within their enterprise boundary and across their enterprise boundaries super easy? Is there an incentive misalignment too where any marginal engineering resource at any

call it AWS, would go toward building products that make people more inclined to put all of their data and compute on AWS rather than making it easier to interoperate with Azure.

Absolutely. The reality is, if you think about resources, especially technical, whatever scale you are, you have constrained resources. So that incremental allocation does go towards kind of the core mission, if you may, of being able to create the gravity around that cloud ecosystem rather than creating collaboration tools. But if you think about where innovation has been strong, it is certainly around bringing a bunch of collaboration tools together:

communication-oriented collaboration tools, productivity-oriented collaboration tools. But then if you think about data-oriented collaboration tools, that's a hard problem. The moment you put the word data before collaboration,

You're talking about security, you're talking about governance, you're talking about privacy, you're talking about compliance. It's not an easy problem to solve. And as a result of which, unless and until there's a certain trigger from an opportunity perspective, I think that's kind of where my characterization of P0 effectively comes in.

And that's where a company like Samooha, I would say, probably arbitrages the opportunity. In a near-term sense, I think there's an opportunity for us to innovate, and we are going all in on that.

We want to thank our longtime friend of the show, Vanta, the leading trust management platform. Vanta, of course, automates your security reviews and compliance efforts. So frameworks like SOC 2, ISO 27001, GDPR, and HIPAA compliance and monitoring, Vanta takes care of these otherwise incredibly time and resource draining efforts for your organization and makes them fast and simple.

Yep, Vanta is the perfect example of the quote that we talk about all the time here on Acquired. Jeff Bezos, his idea that a company should only focus on what actually makes your beer taste better, i.e. spend your time and resources only on what's actually going to move the needle for your product and your customers and outsource everything else that doesn't. Every company needs compliance and trust with their vendors and customers.

It plays a major role in enabling revenue because customers and partners demand it, but yet it adds zero flavor to your actual product. Vanta takes care of all of it for you. No more spreadsheets, no fragmented tools, no manual reviews to cobble together your security and compliance requirements. It is one single software pane of glass that connects to all of your services via APIs and eliminates countless hours of work.

for your organization. There are now AI capabilities to make this even more powerful, and they even integrate with over 300 external tools. Plus, they let customers build private integrations with their internal systems. And perhaps most importantly, your security reviews are now real-time instead of static, so you can monitor and share with your customers and partners to give them added confidence. So whether you're a startup or a large enterprise, and your company is ready to automate compliance and streamline security reviews

like Vanta's 7,000 customers around the globe and go back to making your beer taste better, head on over to vanta.com slash acquired and just tell them that Ben and David sent you. And thanks to friend of the show, Christina, Vanta's CEO, all Acquired listeners get $1,000 of free credit. Vanta.com slash acquired.

Help us understand, where does Snowflake fit into this picture? I mean, they're obviously an independent company, not part of AWS or Azure. How does that fit into this picture? And obviously, it doesn't fully solve the problem. And to dumb it down even further, why are cloud data warehouses a thing versus there were already eight database solutions available from AWS? So why doesn't everyone just use a relational database or a key value store? Yeah, I think to your point of where does Snowflake fit in, I would say that

I think Snowflake found, talk about opportunity, and someone said that if you want to define the term product-market fit, it was what Snowflake went through starting in 2011. Because there were these cloud environments and there were these, to your point, warehousing tools that were not quite heavily invested in. Pretty much the same dynamic that we are going through at this time vis-a-vis collaboration. AWS had a solution, Redshift, before,

but it wasn't like a major area of investment for them. And then here comes

a product, a solution, Snowflake, that basically abstracts away the underlying data stack and/or the cloud infrastructure and says that for all the high-performance access to warehouses and databases, here's basically the way we are going to solve for it. So we are abstracting away the complexity of the underlying cloud infrastructure, and here's basically the application around the warehouses that we are delivering to. In that way, we find ourselves very aligned with kind of that structure,

mode of thinking. Forget the complexity of the underlying infrastructure, the cloud infrastructure. If we are able to build a collaboration environment that abstracts away that complexity, then we are basically talking about an intuitive product experience. So Snowflake comes in in the sense that I think Samooha's vision is being able to make

data collaboration a highly intuitive product experience, so businesses have a big easy button to be able to share data within their enterprise boundary or across enterprise boundaries.

That seems to align very well with how Snowflake thought about the problem of data warehouses, if you may. For folks who haven't sort of followed the cloud data warehouse world, you're basically saying you had to make a lot of upfront technical decisions in a pre-Snowflake world around performance, around scalability, around...

how easy is it going to be to retrieve this data, around security of whatever underlying technology I pick to store the data. And with Snowflake, they said, no, no, no, it's all at the application level. We're going to figure out all the guts behind the scenes, and it's all going to live in the cloud, and it's all going to be super easy and fast and everything you want. And you just interact with our application. And there's APIs, but you don't have to do a lot of the PhD-level computer science stuff underneath. And what you're saying with Samooha is...

There's this whole next generation now of collaboration where it's kind of the same thing around interoperability between data that lives in multiple places in multiple clouds. Very well said. And more importantly, not just the PhD level computer science, but the same problem gets solved over and over again across enterprises. That's also another thing. Again, speaking of the definition of product market fit,

I think that's what happens. It's a pain point that every business, every enterprise, and every kind of, you know, data team and engineering team was going through. And here, basically, that abstracting away, and being able to focus on the core product use cases at hand of the said enterprise or business, was the value prop that Snowflake was able to offer. And exactly to the point that you said, I think it's us being able to extend that,

to then make more sense out of the data. And making sense out of the data is how do we get insights out of the data? How are we able to kind of create more actionable, valuable business outcomes? And this has been kind of a goal for many SaaS enterprise businesses over the decades.

I think we are coming at it from the perspective of collaboration. And if you think about data collaboration, how do we do it today? Very simply put, we share documents over email. We put confidentiality kind of phrases as appropriately needed. We share it over Dropbox or Box. And there is no workload. This is just basically a share.

hardly a secure share. That's basically all that we do at this time. But if you want to bring a true collaborative experience where data can come together while maintaining the sanctity of the data, without making copies of that data, that's where data collaboration, at least the way Samooha is going for it,

has the potential to be disruptive. And that's kind of the vision for which we are building this company. It's such a good point. I hadn't really considered this before, but, you know, absent building your own custom engineering tools to make AWS talk to Azure or GCP, there's probably a lot of people who are under-resourced in organizations doing cowboy stuff, like downloading

a file, and your web browser kicks off a thing that says the download will be ready in 10 to 15 minutes, and then it sends you an email that it's ready, and then you download it. So now it's moved over the network into your corporate network or someone's box at home. Now it's sitting on their desktop. They're uploading it to somewhere else, probably through some web interface because they want to use some tool. So there's like copies of the data all over the place and it's moving through encrypted and unencrypted ways to get there. I totally see how this is a problem.

To the point that you said, it's moving unencrypted in most scenarios. And that's where kind of the safety and a lot of this is depending on the industries and depending on the enterprises and businesses.

There's a lot of sensitive data associated with it. Imagine versions of this happening in the healthcare industry, financial services as an industry, and certainly in even more recently, advertising media and marketing had a certain MO by which that industry operated. And now with the increased regulation, it's beginning to look like a financial services or healthcare industry with increased regulation, and rightfully so. So that's where most of these...

Industries are now having to be more accountable to how data moves, data mobilizes, and

data is not copied, data is encrypted, and yet you are able to not render the data valueless through this process. You have to still continue to extract value from the data because that's where we are. As a generation, we are smarter because of the data behind it. Particularly now, I'm curious how much this visibility you got into this when you were at Microsoft and in the office of the CTO and also how you're thinking about it now. But like,

With AI, you know, and the importance of data for that and using the data that you have, and in particular, your own unique data that you own that nobody else does, mobility has never been more important, right? There were a number of conversations around from a thought leadership perspective that happened at Microsoft translating to product initiatives. I'm sure versions of this happened across the other sort of hyperscalers as well.

Most data was treated by bringing data to compute. And that's why there's this whole notion of copies, fragments, and breadcrumbs of data, with trails being left along the way and hence, along the way, a lot of exposure. So if you change that paradigm and bring compute to data, that's where you're mobilizing data without needing the data itself to move.

So I think that's where even our alignment with the likes of Snowflake comes in, because we think about this new framework of bringing compute to data. At Samooha, our data collaboration application is built on the fundamental paradigm of bringing compute to data. And a very narrow instantiation of that is what are called clean rooms, right?
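As an illustration of the "bring compute to data" idea described above, here is a minimal Python sketch. It is not Samooha's or Snowflake's implementation; the `DataOwner` class, the `total_spend` query, and the sample rows are all hypothetical names invented for this example. The point it tries to show is only that the query travels to each data owner and a single aggregate comes back, so raw rows are never copied out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class DataOwner:
    """Hypothetical party that keeps its rows in place and only runs queries locally."""
    name: str
    _rows: List[Row]

    def run(self, query: Callable[[List[Row]], float]) -> float:
        # The query is shipped to the data; only the aggregate result leaves.
        return query(self._rows)


def total_spend(rows: List[Row]) -> float:
    """Example aggregate query: sum of the 'spend' column."""
    return sum(row["spend"] for row in rows)


# Two owners, for example on two different clouds (illustrative data only).
owner_a = DataOwner("owner_a", [{"spend": 10.0}, {"spend": 5.0}])
owner_b = DataOwner("owner_b", [{"spend": 7.5}])

# Each owner computes locally; the caller only ever sees two numbers.
combined = owner_a.run(total_spend) + owner_b.run(total_spend)
print(combined)
```

In a real system the set of allowed queries would itself be governed, since an arbitrary query could simply return the raw rows; this sketch leaves that out.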

That's basically acquired the name because the industry that has been most affected by this, and that at least in the recent past has had an existential crisis, is this advertising, media, and marketing industry.

So they rose to the occasion because that industry operated a certain way and suddenly the rug is pulled underneath them and then there's kind of an existential problem. You're talking about when there's leaks of customer email addresses because they were sort of sharing customer data back and forth between multiple sources. Actually worse, actually worse. The industry basically was effectively...

Over the course of the last five, 10 years, we've seen all these third-party cookies surreptitiously capturing information that you would have on browsers. There is not a clear chain of consent and awareness from an end-user's perspective on what all data is being collected about them, whether it is through experiences of media, content, gaming, et cetera, that they're engaging in. And that was at the heart of how kind of, you know,

quote unquote, tracking would work in the media marketing use cases, because the promise of digital advertising and marketing is that it is measurable. So that means a user has to be tracked. And that's exactly where, you know, a lot of this, whether it is the Cambridge Analytica and kind of how Facebook kind of came into a lot of

public anger, if you may. But this is how the entire industry operated. And then comes... And here we are today now with the, you know, Apple app tracking, you know, like you say, the rug has been pulled on the whole industry. And then in response to that, you basically have Apple, Google, etc. change the way in which tracking works on their OEM, whether it's on the browser, on the application environment, on the devices that are shipped and manufactured.

by these large players. So that changes the paradigm of how advertising and marketing and media as an industry is operated. So they rose to the occasion and that's why now you have to share sensitive data. Okay, a user is identified and appropriately consented around an email address that they have shared with, I don't know, a certain game and or experience and or

media application that they're consuming from, and similarly on the brand side. So data is being matched and joined against what are sensitive email addresses and other personally identifiable tokens. Now you see the analogy of why it starts looking like the healthcare industry, where you have personally identifiable but also sensitive pieces of data around health information, etc., that come in.
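To make the matching step above concrete, here is a small Python sketch of joining two audiences on keyed hashes of email addresses rather than on the raw addresses. This is an assumption-laden illustration, not how any particular clean room works: the shared key, the `token` helper, and the sample addresses are hypothetical, and keyed hashing alone is weaker than the cryptographic techniques discussed later in the conversation.

```python
import hashlib
import hmac

# Hypothetical secret agreed between the two parties for this one collaboration.
SHARED_KEY = b"hypothetical-shared-secret"


def token(email: str) -> str:
    """Normalize an email address and return a keyed SHA-256 token for it."""
    normalized = email.strip().lower()
    return hmac.new(SHARED_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative audiences held by a publisher and a brand.
publisher_emails = {"Ann@example.com", "bob@example.com", "cy@example.com"}
brand_emails = {"ann@example.com ", "dee@example.com", "cy@example.com"}

# Each side tokenizes its own list; only tokens are compared.
publisher_tokens = {token(e) for e in publisher_emails}
brand_tokens = {token(e) for e in brand_emails}

overlap_count = len(publisher_tokens & brand_tokens)
print(overlap_count)
```

Normalizing before hashing matters here: without the `strip().lower()` step, "Ann@example.com" and "ann@example.com " would produce different tokens and the match would be missed.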

Anyway, so that industry is basically having to overnight transform the way it operates. And they stood up to the problem of secure data collaboration in a very narrow instantiation as clean rooms. The reason the instantiation of clean rooms makes sense in secure data collaboration is that the underlying cloud infrastructure provides the security posture.

Secure data collaboration, Ben, to the point that you were talking about, if we take it across clouds: an arbitrary business sitting on Azure, an arbitrary business on AWS or GCP are effectively bringing their data sets together to model for fraud better, to model for, I don't know, to do cycle detection and anti-money laundering, assuming that these enterprises are financial institutions.

You need a secure airway or a secure airspace across these three clouds, and there is no such solution out there. There's no underlying security posture, right? You're going across the cloud boundary. And that's where the zero trust paradigm effectively comes in. And you have to have cryptographic multi-party computing techniques that have to come to the rescue to be able to have this ability to

run workloads across cloud tenants that are across the cloud boundaries. We are probably going rather quickly into some of the more technical aspects of this, but hopefully I'm able to illustrate why these problems are not easy problems, especially when you reduce it down to the zero trust environment.
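One of the simplest building blocks behind the multi-party techniques mentioned above is additive secret sharing. The Python sketch below is a toy example under stated assumptions, not Samooha's protocol: three parties each hold a private number, the `share` and `reconstruct` helpers and the modulus are choices made for this illustration, and the parties are assumed to follow the protocol honestly. It shows how a joint sum can be computed while no single party sees another party's input.

```python
import secrets
from typing import List

# Arithmetic is done modulo a large prime (an illustrative choice).
MODULUS = 2**61 - 1


def share(value: int, n_parties: int) -> List[int]:
    """Split value into n random-looking shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: List[int]) -> int:
    """Recombine shares; any strict subset reveals nothing about the value."""
    return sum(shares) % MODULUS


# Each party's private input (illustrative values).
private_inputs = [120, 45, 300]

# Every party splits its input and sends share j to party j.
all_shares = [share(value, 3) for value in private_inputs]

# Party j adds up the shares it received; it never sees a raw input.
partial_sums = [sum(s[j] for s in all_shares) % MODULUS for j in range(3)]

# Only the combined partial sums reveal the total.
total = reconstruct(partial_sums)
print(total)
```

Production systems layer much more on top of this idea, such as protection against parties that deviate from the protocol, which is part of why the speakers describe the problem as hard.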

Yeah, I mean, what you're describing sounds like a magic trick and sounds like doing the impossible. When you say things like, well, we're going to make sure that the data is never unencrypted. And we're going to make sure that the compute moves to where the data is stored rather than, you know, uploading the data to a new source for the compute to happen there. And you start to think about some of the problems that you have to solve in doing this. This is sort of

where you think, well, who in the world is best suited to sort of perform these magic tricks of what sounds like theoretically very difficult to do? And I want to bring your co-founder, Abhishek's background in here, because I think it's like literally the background of someone who solved this at greatest scale in the world. So Abhishek formerly worked at Apple as the head of ML privacy and cryptography.

And for anyone who has sort of been a keen watcher of the Apple keynotes over the years, you'll remember something called differential privacy. The magic of what Apple can do on Apple Photos on your device, rather than needing to upload unencrypted photos, you know, to Apple's cloud the way that a Google would do to run all of the ML in the cloud. And it's this like breakthrough innovation of, whoa, we can in a secure private way on device where the data lives,

go and perform a lot of these super complex machine learning workloads.
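For readers unfamiliar with differential privacy, the classic textbook illustration of the local variant is randomized response, sketched below in Python. This is not Apple's implementation; the function names, the 0.75 truth probability, and the simulated population are assumptions for this example. Each device adds noise to its own answer before reporting, yet the aggregate rate can still be estimated.

```python
import random
from typing import List


def randomize(true_bit: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return true_bit
    return rng.random() < 0.5


def estimate_rate(reports: List[bool], p_truth: float) -> float:
    """Invert the noise: E[report] = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated population: 30% of 10,000 devices have the sensitive attribute.
true_bits = [i < 3000 for i in range(10000)]

# Each device randomizes locally; the collector only sees noisy reports.
reports = [randomize(bit, 0.75, rng) for bit in true_bits]

estimate = estimate_rate(reports, 0.75)
print(round(estimate, 3))
```

No individual report can be trusted to be that person's true answer, which is the privacy property, while the estimate across many devices lands close to the true 30% rate.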

And then, of course, he goes on to work on the technology for the COVID-19 exposure notification system, which, Kamakshi, to your point, we're starting to treat all data like healthcare data, where we need just the sort of most unidentifiable bits to compare against other data sources in order to determine whatever outcome we're trying to understand without exposing people's sensitive information. I'd love for you to tell us a little bit about, like,

how you two found each other and how you sort of decided that you should aim both of your backgrounds, with yours in ad tech and marketing tech and obviously important cloud problems at Microsoft, and his in solving these like really unique, nuanced data and ML problems, together to do this.

So Ben, you very aptly describe my co-founder's background here, and I wouldn't have the courage and conviction to go for this problem, which you aptly described as magic brought to bear with math.

And he's the kind of person to be able to be the ideal partner to solve this. And especially his background at being able to have done this, I'm going to use his phrasing at the scale of, you know, a billion plus devices in an ecosystem like Apple that kind of is the bastion for being able to do this securely and even bringing this new paradigm of what you called it, of like being able to do all of this compute and processing on device. That's basically the paradigm of bringing compute to data.

Otherwise, it's basically centralizing all data to kind of a cloud-based processing and then disseminating learning. And that's, I think, very much behind, you know, his expertise, at least his view has been how we can bring this more at scale, democratize this, bring access to this across businesses. Not just have a large company like Apple do this across its ecosystem, just enable businesses to get smarter about doing this and doing this the right way. Yeah.

So in short, I would only do this, as I said, because I have the power of a co-founder like him and certainly excellent background. Like, you know, many times when we speak to customers, market, when we are in meetings jointly, he goes about, he introduces himself. And it's hard to follow suit when such an illustrious founder goes first. And it's hard for me to follow suit when I have to go and then introduce myself.

It's a great coming together from a perspective of, I bring in a bunch of product perspective to the point of advertising, marketing, and media industry. This industry both abuses and also innovates. You're at the forefront of it. It's a very dynamic, fast-moving industry. So having had live experiences, live use cases,

I come in from the perspective of how can we replicate this across other industries? How can we make like, you know, pharma life sciences and health systems and payers and insurers collaborate with each other without having to go through extensive legalese and BAA agreements? How do we get financial services

and institutions to be able to avail themselves of the enterprise-grade, highly secure, mathematically guaranteed security, governance, and privacy practices that these applications can offer? How do we bring these practices into these other industries? I come in from that perspective. So the joining of forces between me and Abhishek, I think, is at the heart of why we have a shot at solving this problem. And again, going back to that arbitrage analogy,

Hopefully, we are able to arbitrage the opportunity ahead of when maybe, you know, the hyperscalers deploy a thousand talented cryptologists and, you know, engineers behind this problem. So coming from the office of the CTO at Microsoft, I'm sure you get asked this by, you know, investors in meetings all the time. Why won't...

the hyperscalers do this? Or why should this be an independent company? So here's the thing, right? I think there's a reckoning across all the hyperscalers that we live in a cross-cloud world. And there is absolutely movement happening from their cloud applications and services to support and accommodate

their businesses, and a recognition of the fact that, to the point that you mentioned earlier, not 100% of their enterprise and customer data is on their cloud. So there is certainly a move to provide support services and applications, and make it kind of a friendlier environment. But has that catapulted to the level of being able to build a very secure,

as I said, data airway that is built on a zero trust paradigm, that makes no assumptions about the underlying security postures of the cloud itself, and that really provides mathematical guarantees such that these businesses have a very secure way to collaborate across the cloud boundary?

I wouldn't say there is an imperative for this, right? You know, even the definition of this basically calls for an independent company to go solve this problem rather than one of the clouds themselves. I don't know if Hamilton Helmer would quite identify this as counter-positioning per se, but there is an element of counter-positioning here. Like for business model reasons, none of the hyperscalers are like super inclined to dedicate 500 engineers to doing this.

At the same time, I think there's an understanding. And that's why we are able to engage in conversations with these hyperscalers because they are very curious. They know it's happening. They know it's happening. They know their customers are asking for this. And if there is kind of, you know, a credible application that is able to, at a technical level, at an engineering level, at a mathematical level, legitimately solve this problem, it piques their interest. So that's exactly why we are able to engage the cloud environments.

And this is not taking away from the core movement to the cloud. This is just mobilizing data better, faster, and bringing kind of the same Snowflake-grade efficiencies that were brought into the core data stack. This is also bringing it to the data stack, but more at an application level. I'll probably just add one more thing, which is that we are under no illusion that this is an easy problem for us to solve from a product technical perspective.

And also kind of the elbowing that we would need to do to be able to really do this at scale natively across the, you know, big three hyperscaler clouds themselves. There are challenges to be able to do this. So in some sense, this is going to be a very high grade experiment and a learning experience.

But that's why Snowflake also comes in, right? Because it's a bounded environment where precisely this level of abstraction from the cloud, from a warehouse perspective has happened. And so we are able to build, learn from a product perspective. We are able to learn and understand the dynamics of businesses wanting to work with each other and understand kind of

Time to market, time to sell this product, time to understand use cases across multiple industries. How does this get adopted? So I think that's where the Snowflake equation gets cemented a bit more, to be able to have that innovation partner, design partner who's aligned at scale.

At a core level, it's a more bounded environment from a complexity scale perspective. And the learnings from here, hopefully we are able to take and apply at a larger scale as well.

Tell us the story of how the company, Samuha, came together because you've met Abhishek at this point. You two have decided to team up to take on this incredibly difficult but rewarding challenge. And the spoiler is that your round ends up getting put together by Snowflake Ventures and our good friend Brad Gerstner at Altimeter. Obviously, we end the story and we've said Snowflake's name so many times. We end the story deciding that

Hey, Snowflake is the right sort of abstraction layer to build this. The punchline is that Snowflake is deeply, corely aligned with the company, as you say. But how do we get there? How did it all come together? Oddly enough, I haven't had a long operating and/or working history with my co-founder. I got introduced to him about a year and a half back at this time

by a friend who was like, hey, you know, you're thinking about this problem. And here's a guy at Apple I know who's arguably a domain expert in this. You guys should kind of, you know, connect and just share thoughts and compare notes.

I got on a call with Abhishek, and this is like COVID times. And so it's a Zoom call. There's a certain distance that Zoom adds to be able to do this. And on the first calls, while he really piqued my interest because he's a person with a stellar background, I had no reason to believe that we were actually going to go ahead and do something together. I had an inkling kind of hearing him and getting to know him.

So basically, the exercise the two of us went through was whether it is engineering design, whether it is product design, whether it is problem space exploration and discovery, we are going to go through this, not just...

across those dimensions, but we are going to go through this iteratively till a point where we are going to find every reason not to do this, because either the hyperscalers are going to do this themselves, or this is going to be an incredibly hard problem. The elbowing from sort of, you know, the partnership dynamic, the go-to-market motion around these things are incredibly hard to solve, right?

The good news was from an engineering product design perspective, this is not an insurmountable problem with the right talent and for the right kind of individual. And that's kind of where Abhishek comes in.

And having spent the months to do that, I think we came to a point where we felt, yeah, let's still go ahead and do this. And even we had to establish that kind of, you know, founder dynamic between ourselves. And I'd like to kind of best characterize it as a spiritual connection that we developed over the months. You know, he hadn't been a startup guy before, right? Was he looking to go start a company or how? That's exactly the thing, right? I have done this before.

I kind of carry the scars of being a founder. Let me see if you appreciate my characterization of this. Entrepreneurship and founding a company is a high fixed cost problem, exercise that every founder goes through. There's a very high fixed cost, whether it's a

30 million outcome, 300 million outcome, 3 billion or 30 billion or an IPO company. There's a certain amount of fixed cost that goes through because until you achieve certain escape velocity, the fixed cost is coming from the founder and the founding team squarely. So it was something I was extremely aware of. So establishing the right dynamic with a co-founder was very, very important for me. And especially with a founder who hasn't done it before.

It could be you're on different kind of trajectories and evolution curves, if you may. My hope here is that I'm able to help him in ways where I can pattern recognize both opportunities and challenges. And similarly, he's able to help me in ways in which he's able to uniquely solve this problem ahead of others.

When you think about sort of the incumbent space as well, right? Yeah, there are probably a couple of the companies that comprise a set of venture backed startups who have taken a crack at this problem, probably unsuccessfully so, or at least kind of going through it at this time, because for one reason or another, there are different reasons why, you know, the escape velocity for companies are not quite established. Right.

The reason I feel that we will have a shot at doing this is also both from an engineering technology perspective, how we are approaching this, the partnership dynamic, whether it is Snowflake today or the hyperscalers tomorrow. And the Snowflake today kind of establishes a proof point for us to be able to then go ahead and do this with the hyperscalers. So I think all of these things coming together, we feel like we have a shot at a mission level as well to go ahead and do this. And to the point of Altimeter and Snowflake coming together,

I sit on the board of a public company called iHeartMedia, which owns close to a thousand kind of radio stations across the country, led by a legendary CEO, Bob Pittman. So I sit on that board with Brad Gerstner.

I honestly had a conversation with Brad, not because we designed it as a problem to go raise capital from Altimeter. It was just because Brad is an avid advocate and voice, not just for Snowflake, but kind of what I would say companies innovating around the modern data stack. And so to the extent that Snowflake is a part of our kind of, you know,

on-ramp journey here. I just was curious in understanding how should we be thinking about this problem. On a similar way, the partnership with Snowflake made sense because Snowflake abstracting away, warehousing, that's kind of where they started over the cloud and kind of building a data cloud economy as the abstraction layer over the underlying hyperscalers themselves. You take that one step further is basically mobilizing data and creating an application economy

over that data cloud that Snowflake is, and effectively, we have a very, very, what I would say, a side by side adjacent kind of position, if you may. There are adjacencies, and there are also kind of, you know, consequentialities, meaning that we are a very natural consequence of what Snowflake's cloud is. So that's where the partnership with Snowflake also

made a lot of sense. And it helps us, as I said, establish our learnings on how we take this to the hyperscalers. Somebody asked me this question, and I think you did too, Ben, is

Why not do this at Microsoft? Like I was there. And I have to imagine you talked with Kevin Scott and Satya about this. Of course, I did talk to Kevin. I did talk to a bunch of players. I think, you know, as well-intentioned a group of people are and Kevin and all the execs at Microsoft are in that bucket of highly well-intentioned, putting their money where their mouth is.

But I would say that if you think about kind of the physics of the problem at the scale of an organization like what Microsoft is, there are restrictions to kind of the agility and nimbleness with which a team can go and achieve. And that's the age-old dynamic of innovation happening independently as a company rather than within another organization. So that's kind of what we ended up choosing as a company.

And that does not in any way, you know, circumvent any opportunity for us to participate and partner with all of the hyperscalers. It's absolutely, that's kind of the eventual karma for this company. Their customers all want this. Well, it's just like Snowflake did. Exactly. That's another kind of interesting thing. Like many a times, like even the Snowflake dynamic itself, there are some questions like how far does the Snowflake dynamic go?

Well, it goes as far as it makes sense. And then, you know, it evolves. Snowflake went through that themselves. And Samuha will go through it ourselves. There's this funny thing if you keep taking Microsoft as the example, but all the hyperscalers are this way.

They know their customers want this. They're not really sure they want to accelerate the development of it that much or make it their core competency to commoditize themselves and enable data to flow freely between them and all their competitors. But

if someone that they have a great relationship with is going to go start this, it's like, great, go do this for our customers. We're super excited for you to provide this and we're going to keep a close eye and make sure that we're supporting all the use cases that make our customers the most delighted, whether it's with you or any other way that they're sort of meeting their needs. Well said. And I think all of this is grounded on the fact that

I think borrowing from Kevin's words, are you able to mathematically prove to me and establish guarantees around the privacy, security, and governance of the underlying data? And I think that's kind of where at the core of it, our strength comes in.

If you are able to do that, I think the hyperscalers have it in their advantage as well, especially at this stage of evolution to see how we are able to play it out. And to your point, there are probably opportunities in the future for how that can take shape. I have a question about the Snowflake partnership. So because they are a cloud agnostic layer that sits on top as the data warehouse,

did you ask them when you were starting to get to know their team, hey, why aren't you doing this? Why aren't you trying to facilitate this secure connection between data that lives inside Snowflake, outside Snowflake, under different underlying clouds? Does it seem like it's in their wheelhouse or should it be an independent startup? Actually, Snowflake has something called clean room primitives that they provide themselves.

But they're precisely that, they're primitives. Primitives effectively means it's thousands of lines of code that businesses, enterprises have to take and implement themselves, typically with their own engineering resources to be able to then translate that as an end application for themselves. That is where I would say kind of the friction point exists for the primitives to kind of become an at-scale adoption across the Snowflake ecosystem. And then from there onwards,

all the various combinations that include Snowflake and the hyperscalers or not. So that's where, again, Samuha comes in from a product perspective on how we can be the big easy button

to enable this collaboration, no matter what the underlying security posture, the underlying data stack, the underlying data infrastructure is. So a big part of how we would do this, and this is where I even go to kind of the partnership between me and Abhishek, is the product experience of this.

If I may venture to say, the intuitive product experience of collaboration that, you know, Workspace and Slack and Teams and SharePoint, et cetera, offer,

that degree of intuitive experience has to be there in a product like this. We have to increase the addressability of this product from the data scientists and the data engineers who typically deal with data and touch sensitive data within their enterprise, across enterprises. We have to increase the addressability to the non-technical user persona as well, which is...

analysts, operations professionals, et cetera, within an enterprise who are analytical but are not technical. So that's a big part of how we build and deliver the product experience by truly making it easy and consumable.

That's a great jumping off point, too. We've done a great job so far, I think, setting the stage for the level of challenge of what you're undertaking here, how you had to bring this whole ecosystem together to bear to even begin to attack it.

What are you doing now? Like, how do you start this company? Like, what do you build first? What's the implementation? What does the product sort of look like? So what we did organically is build off of the Snowflake primitives. We have built a native application that is distributed via the Snowflake marketplace today. It's built on Streamlit, a web application framework. The web application is built on Streamlit.

So it's as easy as a one-click install that we are used to as consumers for an app that is distributed in the App Store. We are basically, it's an app in the marketplace that with one click is installed on the enterprise tenant on Snowflake or off.

And the business is very easily able to then go ahead, enable it and provision it to whichever users within the organization. And the application allows you to collaborate on data within the enterprise itself, if there are compliance boundaries within the enterprise, or across enterprises.

All of the workloads are brought to bear with templates that are offered that are easily pre-baked. So if you want to basically run queries about how my data joins and looks vis-a-vis kind of my partner's data, how many common records exist around my data and my partner's data. So these templates are pre-baked and offered within the application.
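To make the "how many common records" template concrete, here is a minimal Python sketch of an overlap count between two parties' identifier lists. It is an illustration only, not Samuha's implementation or Snowflake's clean room primitives: the function names, the sample email addresses, and the shared salt are all assumptions, and salted hashing on its own is far weaker than the mathematical guarantees described in this conversation.

```python
import hashlib


def blind(identifiers, shared_salt):
    """Hash each identifier with a salt both parties agreed on (hypothetical setup).

    Identifiers are trimmed and lower-cased first so that the same email
    written two different ways still produces the same hash.
    """
    return {
        hashlib.sha256((shared_salt + ident.strip().lower()).encode("utf-8")).hexdigest()
        for ident in identifiers
    }


def overlap_count(blinded_a, blinded_b):
    """Count the records both parties hold, comparing only hashed values."""
    return len(blinded_a & blinded_b)


# Made-up sample data for a publisher and a brand.
publisher = ["ann@example.com", "bob@example.com", "cy@example.com"]
brand = ["BOB@example.com ", "cy@example.com", "dee@example.com"]
salt = "demo-salt"  # assumption: exchanged out of band between the two parties

common = overlap_count(blind(publisher, salt), blind(brand, salt))
print(common)
```

Only the count leaves the comparison here, which is the shape of answer a pre-baked template would return to a non-technical user.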

We have actually put together a little 30-second video snippet that basically gives a tangibility to the look, shape, and form of what this application looks like. Oh, yeah. We'll drop a link in the show notes. Another point that we are kind of excited about is we are living at a time where generative AI is dominating our collective consciousness.

So having this natural language interface to a data application wherein you ask questions in natural language and that is translated,

into a set of APIs that queries the secure airspace across clouds. Have you done that? We have done that too. And I think it's still very early days in terms of where this would take. I heard Brad or someone else talk about kind of citizen data scientists. If you have to really create kind of a greater awareness of data, that means you're creating accessibility of data, not just for enterprises, but also for consumers. But let's start with enterprises.

Arguably, in any business or enterprise, there is more of a non-technical audience than a technical audience. When I say technical, I'm talking about code writing audience. And so many non-technical people have forced themselves to learn SQL over the last decade, but there are still 10 times more people who haven't and have questions about the data that the enterprise

has that would enable them to actually do their jobs better. Exactly. And they're analytical, meaning they understand data and understand correlations and causations and value of data and understand models better. But they are probably not well-equipped to be able to write code themselves. And that's kind of where having this kind of a natural language interface that is able to query model data

But the heavy lift that is necessary for us to be able to do is basically translate that natural language into a language of APIs that is built on the secure airspace across the clouds. That, I think, is such a transformative opportunity of how we can take this even beyond just the promise of secure data collaboration agnostic of clouds and

It's about how we make it then super accessible to anybody in the business. Kamakshi, you're at the beginning journey of the company here. How are you thinking about go-to-market in these early days? Obviously, if you are successful at what you are setting out to do, Samuha will be a giant, broad, horizontal platform used by every industry, just like Snowflake is. Anybody listening who thinks that

what we're talking about could be helpful for your enterprise, get in touch with Samuha. But do you start with a particular vertical in terms of go-to-market, a particular use case? How do you think about it? So the way we are coming to market is...

kind of trying to verticalize this so that we are able to create more easy adoption rather than we could offer horizontal abstractions and these templates that I talked about, these query templates that allow you to query the shared data, the multi-collaborator shared data, multi-party shared data. These templates, we could stop at a certain point and allow for further customization by the said business and our customers themselves.

But what we are doing instead is we are actually verticalizing it and going deeper so that we are able to solve for end use cases. So what we offer in the product is whether it is an audience overlap use case between a large publisher media house, a Roku, an NBCUniversal, think of these kind of large media houses and brands that buy inventory and are trying to understand audience behavior on these media houses.

So there's this proverbial question of how do I model the audience that I want to go reach?

So that end use case is built and delivered all the way through within this application with a bunch of enablement points across third-party platforms. There are audience management platforms, content management platforms, media activation platforms, et cetera, who action on the secure insights that are captured within the clean room application. So we actually build this far out. Similarly, in the healthcare and financial services use case as well, we are building

a cycle detection for an anti-money laundering use case when there are multiple parties bringing in data that are all encrypted. So we are verticalizing the use cases across industries and developing end applications. Obviously, it is a pretty high bar for us to be able to do this as a multi-industry category and develop multiple applications at any given time. So we are bringing this

sort of an industry category one at a time. And that's where today there is kind of a bigger pull towards the advertising media marketing use cases. Every industry does need this and will need this more over time. But that's where like there's a real hair on fire problem right now with what's going on with Apple, with Facebook, with GDPR, etc.

Correct. So that establishes reference customers for us. We are able to then apply that and carry forward. Then we are able to land and expand into some of their core product use cases as well.
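The cycle detection mentioned for the anti-money laundering use case can be sketched in a few lines of Python. This is a plain depth-first search over a directed graph of transfers, shown on unencrypted, made-up account names; how the same check would run over encrypted multi-party data is not described in this conversation, and the function name `find_cycle` is hypothetical.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes (first node repeated at the end), or None.

    `edges` is an iterable of (sender, receiver) pairs. A cycle means money
    leaves an account and eventually flows back into it.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: nxt is already on the current path, so we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Made-up transfers contributed by two different institutions.
bank_a = [("acct1", "acct2"), ("acct2", "acct3")]
bank_b = [("acct3", "acct1"), ("acct3", "acct4")]

cycle = find_cycle(bank_a + bank_b)
print(cycle)
```

Neither institution's edges contain a loop on their own; the loop only appears once both sets are combined, which is why this is a collaboration problem.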

So it is basically one vertical at a time in terms of how we are able to perfect the art of the application or template that is all the way through. That is not simply horizontal where there's a fair amount of lift that our customer will have to do at their end. But the reason we are able to still even contemplate this is because we do partner with Snowflake to be able to bring this to market because it is distributed in the Snowflake marketplace.

The Snowflake sales team is able to bring us in appropriate conversations where the ecosystem of 6,000 to 7,000 Snowflake customers have raised their hand and expressed a need for a solution or a problem or a product or a service like this. And that's where I would say that

I think Snowflake reminds me of, I don't know, Facebook or Salesforce a decade plus or so back when there was an entire ecosystem being developed around them. If the birth of Zynga off of Facebook, or Veeva off of Salesforce, was true, then I think

fast forward time to today, there is Snowflake and there will either be a bunch of successful large scale vertical applications that are built out of Snowflake and or developers who will build successful horizontal applications as well. I think that's where the dynamic with Snowflake is an interesting one that is playing out at this time. It's super cool how aligned you are with them. It's like, A, that Snowflake is now at the point where they are becoming that level of platform where

But they're just at the beginning and they're deeply aligned with you and helping you. You know, this is the core of your initial go to market for this. I love it. Maybe one other point there is I think this is less about us with only Snowflake and Snowflake only with us.

I would say that it's an example of how if you leverage the data cloud and have an appropriate application, Snowflake is a great partner, not just for us, for any other kind of startup that is able to build itself successfully around it.

Similarly, for us, if we are able to prove this on Snowflake, to be able to prove this across the hyperscalers and extend the learnings both from a product and go-to-market perspective, that's true for us too. So in terms of building successful companies, Samuha is just one example on Snowflake. Snowflake is also just one example for us to be able to do this from a cloud ecosystem perspective. So Kamakshi, as we drift toward the end here, I do want to bring back one thing you brought up earlier and scratch at it a little bit deeper.

Because the fixed costs of doing a startup are so high, and you've already had some success, some very nice successes, did it change the required outcome or the outcome that is interesting enough to justify the super high fixed costs to you? And do you feel like you had to do something even more ambitious in order to justify those fixed costs?

I think it's a great question because it's at the heart of the questions I asked myself, like, what would it take for me to go ahead and incur the fixed costs again? I think very broadly put, there is a desire for a higher impact radius.

And maybe put in management consulting parlance, that proverbial two-by-two grid. Always love a good two-by-two grid. People knock it, but it is a great way to make decisions. It's a good framework. It, I think, makes things easier for some of us folks. If you think about impact from like a problem space perspective as kind of one axis, and the other axis as kind of the size of the problem that you're solving for,

It is still very early days in terms of how we are thinking about this. But if we navigate this right, to be able to offer a first-class service that makes it easy for any business to share data in a compliant fashion within its enterprise and or across its partner ecosystem,

what collaboration tools did to communication and productivity, Samuha does to data collaboration. It's as easy. You get into a business, you get your badge, you get your laptop, you get a bunch of enterprise applications, a bunch of shared data resources come in appropriately permissioned for you. Behind the scenes, there is a highly complex and/or secure layer that

It permissions this data, enables you to be able to run analytics workloads, etc. You don't have to be a technical person. You are an analytical person to the points that we have discussed. This makes it super easy for you to be able to get insights out of your enterprise data. Enterprise data, I think you would agree with me, is highly valued but highly underutilized as well. And...

I think if we are able to kind of generalize and bring that kind of an intuitive product experience and also verticalize it and kind of solve for bespoke use cases that are industry specific, I think the opportunity size for this problem is large. So that's kind of the value axis, and the impact axis is if this makes healthcare better because bias in clinical trials can be solved for better, because you're able to have easier data collaboration without onerous

legalese that kind of precludes either party from kind of going through what needs to be done. I don't know, from a kind of an impact perspective, it feels like there are industries and use cases where this has a high impact.

So that got me over the hump to say that, look, it's worth going through this experiment. Every founder and company is an experiment at this early stage. I say this with respect to the audience that you have: no LPs and investors like to hear the word experiment, but it is an experiment until escape velocity has been proven.

It felt like I got the conviction around that two by two axes, that if we navigate this and design this experiment and continue to iterate the right way and most importantly, iterate fast enough, then we're going to be able to do it.

then we probably have the top right quadrant of high value and high impact. I mean, again, like you say, this is an experiment. It might not work, but it is pretty rare that you can identify, I think, as clearly as you have. And certainly it seems to me the opportunity for a broad

horizontal industry, you know, in the way that Snowflake did in data warehouses, in the way that the hyperscalers originally did in cloud. Like, it's just not that often that you see a white space like this. True. And that's kind of where, for me, this question around

I phrased it as arbitraging. I hope I got my point across. I think it's the time until the hyperscalers identify this as an important enough problem, a large enough problem, and a persistent enough problem where they are not able to ignore it any longer from their customers.

I think it's basically arbitraging in that moment of time. That's kind of the time that we are arbitraging to be able to get ahead in the innovation curve. And at such time, we hope to be able to partner with the hyperscalers to bring the solution exactly at scale within their ecosystems. Of course, we are building across their ecosystems, on top of their environments, on top of their stacks. I think there is certainly a white space, but that white space exists because...

As I said, I think this is somewhere a prerogative, but not a priority yet at the highest level. So that's why it gets solved ineffectively across the big cloud environments. Although this is really interesting, over the course of the last six, seven months since we've been at this, at probably every instance of the large investment banks having kind of their forums, the big cloud CEOs consistently talk about data sharing and collaboration. Right.

So this arbitrage opportunity probably doesn't exist for too long. And that's when it is important for us to be very innovative and establish the proof points at scale in this time that we have. Also, along the way, we are partnering with the hyperscalers themselves to be able to do this. So it's not simply an arbitrage opportunity, and I hope I do not minimize it in phrasing it as such. I just phrase it to make sure that we are capitalizing on the white space and time dimension.

There's sort of an area under the curve thing here. And I don't know what the y-axis is, but the x-axis is definitely time. And there's sort of the window between when you started approximately a year ago and when everyone else wakes up to what a big problem this is. You sort of have to speed run the market in between these two edges of the x-axis. I like to quote my previous boss, Kevin Scott. I'll paraphrase it, not quote him as...

I think the right time for a compelling product is when you're a little ahead of kind of market primeness, right? As the demand grows, as there is a greater need for this, you're riding the wave and you're not starting then. So I think it's a phrasing around timing the market for opportunities like this. I hope that that dynamic is somewhere true for us. We'll see. At least that's kind of what we're trying to prove out at this time.

Well, Kamakshi, I think that's a great place to leave it. We will definitely link in the show notes to where listeners can check out the video that you mentioned earlier that helps sort of articulate the product implementation a little bit more. But where else would you direct listeners around the web to follow you or Samuha or anything else? Certainly active on Twitter. We certainly want to be thought leaders. Certainly it's about Samuha and what we're doing at this point. We are early in the journey that there's a lot that we say through our product and our company.

But I think the space is so nascent and the opportunity is, as David, you said, there is enough white space in this opportunity that we want to lead through

thought leadership as well, in the most distributed mediums, whether it's on Twitter, on the social side, or our blogs. We intend to publish actively on a blog that we host on samuha.tech, our domain. We want to be very technical about this problem, so we are able to demonstrate, again, going back to kind of this, with provable mathematical guarantees that security indeed implies

a definable, quantifiable degree of security of the underlying data. So our perspective is that we remain true to our grain of being able to be mathematical about the truth of what we are offering, the product truth of what we are offering, while also educating businesses who probably have some degree of appreciation for the engineering and math behind the problem, but in actuality, they have a product need for this or they have a use case to solve for.

So trying to bridge between the two. So I think between social, our blog destination for the more deep dive of content, and then we hope to be able to be back at the appropriate times in forums such as this and others as well, where certainly my co-founder Abhishek also talks about kind of how we continue to innovate and develop from a product perspective.

I'd love to continue to talk about how we are navigating the journey from where we are today across the hyperscalers themselves. There are lots of learnings that we are going to get from market in terms of someone told me this, and I'm really keen to learn this. I have some initial observations. Building any business you're selling to customers, it's hard enough.

What Samuha is doing is not selling to one customer, but we are selling to multiple customers because it's a collaboration problem. By definition, there are multiple parties involved. So I was asking myself, is this a polynomial complexity or an exponential complexity, given how many customers we are selling to?
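One way to read the polynomial-versus-exponential question is to count how many distinct collaborations could exist among n companies. The Python sketch below is only an illustration of that counting, not a model anyone in this conversation proposed: if collaborations are strictly pairwise the count grows polynomially, as n(n-1)/2, while if any group of two or more parties can form a collaboration it grows exponentially, as 2^n - n - 1. The function names are hypothetical.

```python
from math import comb


def pairwise_collaborations(n):
    """Number of distinct two-party collaborations among n companies: n choose 2."""
    return comb(n, 2)


def multi_party_groups(n):
    """Number of distinct groups of two or more companies.

    All 2**n subsets, minus the n single-company subsets and the empty set.
    """
    return 2**n - n - 1


for n in (2, 5, 10, 20):
    print(n, pairwise_collaborations(n), multi_party_groups(n))
```

With 10 companies that is 45 possible pairs against 1,013 possible groups, which gives a feel for why a partner that brings pre-qualified customers to the table matters for go-to-market.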

I think this is, again, where I would say that some of the partnership dynamic that we have chosen for this stage, we hope that this extends beyond today. I think the partnership with Snowflake, selling into that ecosystem, being able to validate the proof points of collaboration interests, this polynomial complexity, it's at least polynomial complexity, right? You're selling to at least multiple parties.

That is eased because the captive, qualified customers are brought to bear by our partner in question here, aka Snowflake. Similarly, I think this is something that hopefully will help us develop the right recipe when we are launching this across hyperscalers as well.

So the from here and now, how we continue to talk about this, educate the market around our learnings, learn from the market, especially I think this dynamic of how does the good market of this look like when you're selling to multiple entities at any given time? I think that's going to be very educational for us.

The learnings from a partnership perspective, how we do it with Snowflake today and with others tomorrow, are also going to be something that we would love to share. My promise to myself is this: as founders, there are lots of forums where we all kind of learn from other founders. But I think successes are talked about more than mistakes or failures, right?

I think this is going to be an interesting exercise. There are going to be mistakes that we make along the way. I think death, taxes, and mistakes in foundership and entrepreneurship are certainties.

I want to be able to, when I come back, and whether it's this forum or elsewhere, generally talk about kind of the learnings of how it is that you solve for complex go-to-market questions like what we are facing when we have to bring multiple parties to bear at any given time. I want to start a new podcast called Death, Taxes, and Founder Mistakes.

I do feel very strongly that these are three certainties equally. Exactly. I love it. Well, Kamakshi, thank you so much. Last quick question for you. Who should reach out if someone listens to this and they're thinking, oh, I might want to work at Samooha. It sounds really interesting. Or I might want to be a customer. What do those sorts of people look like?

Of course, on working at Samooha, we'd love to talk to talented, curious people who we align with culturally, in spirit and in action.

So please do reach out to us if you're interested in learning more about how we build stuff, and how we are not just building stuff from a product perspective but building the company as well. Are you remote? Is the company located in the Bay Area? We are fully remote. We have a center of mass in terms of employees who live in the Bay Area, but we are fully remote.

I would just say to those of your listeners who are listening in that the best way to learn about building companies is to be a part of an early founding team. There's a dynamic of being able to learn on someone's back or along with someone rather than it feeling solitary and more punitive. So it's a great exercise, something for you to consider. If you're interested, we'd love to talk to you.

And as far as customers are concerned, of course, that's obviously number one for us: if you're a CDO, a CISO, a CMO, or a chief product officer across innovative healthcare companies, innovative pharma and life sciences companies, innovative fintech companies, innovative marketing tech companies, innovative D2C retail companies, etc.

All of you are dealing with problems. In fact, horizontally speaking, no matter which business you are in, a CISO office, if this message around, you know, hey, look, any employee of mine gets onboarded, here's a badge, here's a laptop, here's a set of enterprise tools, and here's how they get access to enterprise data. If you see a world as a CISO where this is true, you're going to see a lot of people

We'd love to hear from you on your perspectives, and we'd love to talk to you about how we are solving for the problem. And to the extent that our product fits your needs as you see them today and as they evolve, we would certainly love to talk to you about that. Kamakshi, thank you so much. Thank you very much. It was such a pleasure, Ben and David. Really, this was one of the highs of my entrepreneurial journey. Hopefully, there'll be many more chances to talk to the two of you, Ben and David. Awesome. Awesome.

I think we'll have many more opportunities. I agree. Listeners, we'll see you next time.