Edge computing brings computation and data storage closer to where data is generated, reducing latency compared to the centralized cloud model and enabling real-time decision-making without sending data on long round trips to and from the cloud.
Industries benefiting from edge computing include manufacturing, connected cars, retail, and live sporting events, where real-time data processing and low latency are critical.
For applications involving humans, responses need to come back within roughly 160 to 200 milliseconds, as the human brain tends to switch context beyond this threshold.
Migrating applications to the edge requires rethinking how they are built to handle intermittent connectivity, limited compute resources, and offline scenarios, which are not issues in the cloud.
AI at the edge enables real-time inference, prompt augmentation using real-time data, and multi-model agentic workflows, which are critical for applications like self-driving cars and manufacturing anomaly detection.
Practical examples include personal AI assistants trained on user data, automating DevOps workflows, and AI-based search engines, all of which are currently in use.
Advancements in hardware, such as NVIDIA's new chips, synthetic data generation, and virtual reality environments for robot training, are accelerating the integration of AI with robotics.
The most exciting use cases include self-driving cars, where real-time decision-making is critical, and manufacturing, where edge computing enables real-time anomaly detection and autonomous behavior.
Welcome to DataFramed. This is Richie. There's been a big push towards cloud computing over the last decade.
It's brilliant because it saves organizations having to worry about managing their own infrastructure. But there are times when it isn't the best option. If you need your compute to be close to where the data is generated, self-driving cars being the obvious example, then you need an alternative. Enter edge computing. Today we're going to explore the use cases for edge computing, when to use edge computing or cloud computing or even on-premise computing, and how you can make use of edge computing in your business.
On top of the infrastructure discussions, I also want to find out how edge computing affects real-time data analysis and see what the fuss is about with edge AI. I have two guests for you from Synadia, a distributed computing platform company. Synadia is a pioneer of secure messaging to help drive security and collaboration into IoT, edge, and cloud computing systems. Derek Collison is the CEO and founder.
He spent 30 years working on distributed systems and cloud computing. He's previously been a CTO at VMware, where he designed and architected Cloud Foundry, a technical director at Google, where he co-founded the Ajax APIs group and created the largest CDN for popular JavaScript libraries, and a senior VP at TIBCO. He's also a serial entrepreneur, having previously founded the security platform Apcera, which was sold to Ericsson.
Next up we have Justina Back. She's the VP of Marketing, where she helps customers understand how they can use AI and edge computing. She's a seasoned marketing executive with 15 years' experience in data and technology marketing. Previously, Justina was Head of Product Marketing for Data and AI Cloud and Head of Product Strategy and Marketing for the Firebase mobile application platform at Google.
She's also a member of the Harvard Business Review Advisory Council. So let's take this episode to the edge. Hi, Derek and Justina. Welcome to the show. Thanks, Richie, for having us. Thank you. So just to begin with, can you talk me through what's the difference between edge computing and more traditional computing?
Well, I think it's a great question. And I think the reason that Synadia exists today is, for those that were around, we went through a pretty massive transformation of how we did things on premise or in our own data centers to cloud computing.
And what Synadia believes in is that the transition to edge computing will be as big, if not bigger, and will happen a lot faster than the change to cloud computing. Most folks realize that whether we knew it at the very beginning, now there's a very different set of rules on how you build, let's say, cloud-native applications. We believe the same transition is going to happen for edge-native applications and systems.
Okay, so it sounds like the next big thing is moving from cloud computing to edge computing. But can you talk me through, what are the benefits of it? Why would you want edge computing? Well, I mean, I think the folks at Synadia and Justina especially know this about me. I try to simplify things to make me understand them better. But one of our North Stars, believe it or not, is the assumption that
people are going to try to decrease latency to access distributed technology, whether it be services or data. And that's just always been the case. And so we went from data centers to cloud, and then cloud to multi-cloud.
CDN providers, you know, this notion of what we call far edge, nearest the cloud providers, who are trying to hold on for dear life. I call it their Blockbuster moment. But the far edge are kind of like the Akamais and the Fastlys and the Cloudflares and Netlifys and, of course,
the Vercels and Deno Deploys of the world. But what you're starting to see, and I think we started to see this right about when we started Synadia, and that was one of our big bets, was that it won't stop there. They'll keep pushing into their vehicles, their manufacturing systems, factories, distribution centers, medical devices, whatever that is. And again, it's just this major driver to decrease latency to access either data or a service. And for us, it's kind of a combination of both.
Okay, so it seems like the big goal is reducing latency. So you've got your compute happening near where it needs to be used. But also, companies don't want the hassle of sort of managing their own infrastructure as much, which was the original benefit of cloud, I guess. So is it a sort of best of both worlds situation there?
Well, it's definitely both compute, but also data. So for example, if you have compute running locally, let's say inside of a vehicle, but it still needs to trombone back and forth, right, to get access to data, you could imagine a world where moving the data closer as well would make a lot of sense. Okay, so compute and data together, that makes sense. Maybe we'll try and make this a bit more concrete, talk about like who's actually making use of edge computing. Justina, can you talk me through some of the use cases?
Well, edge computing enables real-time decision-making. So it really unlocks the value hidden in the data. Every industry and every use case that relies on real-time insights into how the business is doing will benefit from edge computing.
And so some of the examples that we're seeing today could be industrial manufacturing, where you'll have sometimes hundreds of sensors monitoring different machines on the assembly line, monitoring the levels of vibration, the temperature. If you can process their data as soon as it's generated and you can identify an anomaly, maybe one of the machines is getting overheated, the moment you can act on this information can save you money because you will keep your assembly line going.
It can save you revenue because you keep producing whatever your factory is producing, whether it's vehicles, whether it's other things. And it's also keeping things efficient. And by the time you would have to send this data into the cloud, it would be inefficient because it would introduce a delay in the time when you can act on this data. It would be very expensive because this data will consume a lot of bandwidth.
And in some cases, the data should not even leave the premises because traversing regions would put you up against compliance regulations. To add to what Justina is saying a little bit more concretely in terms of the verticals, right? Manufacturing, we really believe manufacturing is going through a renaissance right now. Not only just decreasing latency, but autonomous behavior, east-west, you know, anomaly detection type stuff, meaning, hey, this part looks weird in our factory and we have 80 of these factories worldwide. Does it look weird in your factory as well?
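To make the anomaly-detection idea concrete, here is a minimal sketch of edge-side processing using the open-source NATS messaging system that Synadia maintains. The subject names, temperature threshold, and message format are illustrative assumptions, not anything described by the guests.

```go
package main

import (
	"fmt"
	"strconv"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to a NATS server running on the factory floor (a local edge node).
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	const overheatThreshold = 90.0 // degrees Celsius; an illustrative limit

	// Listen to temperature readings published by line sensors.
	// The subject hierarchy (factory.line1.temp.<sensorID>) is hypothetical.
	nc.Subscribe("factory.line1.temp.*", func(m *nats.Msg) {
		temp, err := strconv.ParseFloat(string(m.Data), 64)
		if err != nil {
			return // ignore malformed readings
		}
		if temp > overheatThreshold {
			// Act locally and immediately: raise an alert for the line
			// controller without a round trip to the cloud.
			alert := fmt.Sprintf("%s overheating: %.1fC", m.Subject, temp)
			nc.Publish("factory.line1.alerts", []byte(alert))
		}
	})

	select {} // keep the process alive so the subscription keeps handling readings
}
```

The point is that each reading is evaluated on the factory floor the moment it is published; nothing has to cross a WAN link before the alert fires.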
The whole notion of a vehicle becoming a technology platform that just happens to have wheels, so the connected car is a big one. We're also seeing it in kind of a revitalization of what we call the physical store experience. So a lot of people went to pure digital online, obviously with Amazon, and Walmart moved to keep pace. And I think...
What we've seen as we both entered into COVID as kind of a forcing function, which we can talk a little bit more about why we think that was.
but exiting out of COVID, of how do we redefine the experience when someone's in a store? And you can imagine we need lots of access to information at a very, very quick rate. And to Justina's point, if for some reason the cell connection from our store to the cloud is down, or our Starlink or whatever that might be, we want the store to have a certain level of autonomous behavior. And then the last example, believe it or not, is
We're seeing a massive explosion in how people want to enhance the fan experience. So think of live sporting events, events like racing and things like that, where they want to immerse the fan more with what the drivers are experiencing, the data that's coming off of their car, the data that's coming off of other cars, the track.
And you can imagine a world where, again, to Justina's point, if we were tromboning either to get the data to the cloud and then go to the cloud to get data back, right, to ask a question or to enhance some experience either on their phone or whatever that is, people are going to want to cut that out. And what they found is that you can't just forklift an app from the cloud to this edge. It is a very different world, like we talked about earlier on in the podcast, when we went from our server room to a data center to the cloud.
Obviously, I really like Justina's example of the manufacturing thing. I can certainly see how you don't want to, like, shove things into a cloud and wait for some batch process to run, and then you get the answer the next day. You want to know when there's a problem immediately. And Derek, to add to your point, it seems like most of the examples where you want edge computing, it's basically anywhere you've got sensors running.
Is that about accurate? I think more broadly, and I'll let Justina go in as well, is there's a couple of things happening at the edge, right? The first thing is, most of the data interacting with your customers or your partners or whatever is happening at the edge. So you could easily reason that taking it, moving it to the cloud just to bring it right back wouldn't make a lot of sense. Now, that being said, a lot of these AI inference models are still trained in the cloud, right?
And I don't think that's going to change for quite some time. And so do you want the applications to have to worry about the condition of the cell phone network to get the data to there? Or can they just say, hey, here's the data, make sure it gets over there for training. And by the way, whenever you update a model, make sure I get that.
And then anything that the model needs. So for example, prompt augmentation, agentic workflows using multiple models, make sure I know where those things are, you know, in a location-independent way, which we can talk about, and decrease the latency as fast as possible. And again, you're even seeing some of the newer models. I know we're wading a little bit into AI at the edge, but
But there's always going to be this massive scrutiny on how long did that take? What was the latency for me to get the first byte back or the first response back? Or in AI, the first meaningful context that I said, aha, it's answering my question. So it seems like latency is the big driver then. So how long can you wait for an answer? Maybe can you give me some examples of what an appropriate latency is for some of the different use cases you talked about?
Well, it's, you know, this is really interesting. And so, you know, way back when, you know, in the early 2000s, I spent time at Google. I worked at Google and Google's consumer was the person.
And for the most part, the human brain switches context somewhere between 160 and 200 milliseconds, depending on how fast your brain is oscillating, right? And so Google had a very, very hard and fast rule that once a request came in, let's say a Google search query, it had to be back out the door in under 200 milliseconds or else you didn't turn on the service.
And so anytime you're dealing with a person, most people generalize around those types of numbers. How much processing do we have to do to be able to get something so that the person, let's say me, doesn't context switch and like flip to something else?
Now, when you flip to other types of machinery and things like that, whether it's manufacturing or connected car, you could pick examples where that number needs to be a lot, lot lower. Should I be able to change a lane? You're not going to say, let me go ask an LLM in the cloud whether or not I should do that type stuff. And so it's semantically relevant what you're trying to do. Humans are about 160 to 200, but we're starting to see
Some of these processes within manufacturing and again, like connected car, things like that, that are getting very, very low. And you could imagine, again, anything that's actually physically controlling what the factory or what the car is actually doing. They have very, very strict tolerances around latency and how fast they can respond all the way down, obviously, to the hardware and operating system, things like that.
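One way to see how a latency budget like Google's 200-millisecond rule shows up in code is to put a hard deadline on every edge request and fall back when it is missed. This is a hedged sketch using the NATS Go client; the subject name, payload, and fallback behavior are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	// Human-facing budget: roughly 200 ms. A control loop in a vehicle or on a
	// factory floor would use a far tighter deadline.
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()

	// Ask a nearby service for an answer. If it cannot respond within the
	// budget, fall back rather than keep the person (or the machine) waiting.
	resp, err := nc.RequestWithContext(ctx, "store.recommendations", []byte(`{"shopper":"richie"}`))
	if err != nil {
		fmt.Println("budget exceeded, serving a cached result instead")
		return
	}
	fmt.Println("answer within budget:", string(resp.Data))
}
```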
Okay, so that's really interesting that people get bored after 0.2 of a second. They'll continue to switch and go and do something else. But yeah, I can certainly see how that's even like way too slow in the context of a self-driving car. Okay, let's try and figure out how we go about doing this. So you've got your cloud application and suppose your CEO goes, right, let's migrate it to the edge. What do you do? Like, where do you begin?
So first of all, it all depends what's your destination. Because if you're going from cloud to edge and your edge is a modern industrial manufacturing assembly line, you will have hundreds of sensors, you will have rather stable connectivity. So it's almost like a small data center at the edge. Now, things will be dramatically different when you go to a remote oil rig.
And, you know, data is the new oil. So thinking about that and a remote rig where conditions are really way harsher than in a modern factory because connectivity can be intermittent. Latency is definitely going to be higher. Compute may even be limited.
So an application that would thrive in the cloud, where all these resources are unlimited and always on, will definitely not do so well at the edge. So we need to fundamentally rethink how we're building applications that can thrive in the harshest of conditions at the edge. And these applications, they have to be by design able to operate in offline scenarios, because connectivity can go down and you still have to be saving data,
analyzing data, and using it for the critical decisions in keeping that remote oil operation up and running and making things safe for the folks who are there on site. But at the same time, the application needs to keep the data even if the connectivity goes down, and the database in the cloud is not going to help us in any way if we are not connected to the cloud. The application has to be resilient. It has to be self-healing, handle these conditions.
And so the vision that we have at Synadia for enabling that future generation of applications is truly nomadic apps, ones that can really thrive in the harshest of conditions. They don't have external dependencies. Everything they need in order to operate is captured in a small binary under 20 megabytes. And they are fully resilient and fully ready to operate in offline scenarios.
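As a rough illustration of the offline-first idea Justina describes, the sketch below persists sensor data to a NATS JetStream stream running on the edge node itself, so writes keep succeeding while the uplink is down. The stream name, subjects, and retention window are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to the NATS server running on the rig itself, not in the cloud.
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		panic(err)
	}

	// Persist readings on the edge node so nothing is lost while the uplink is
	// down; the stream can later be mirrored or sourced into the cloud.
	_, err = js.AddStream(&nats.StreamConfig{
		Name:     "RIG_SENSORS",
		Subjects: []string{"rig.sensors.>"},
		MaxAge:   72 * time.Hour, // keep roughly three days of data locally
	})
	if err != nil {
		panic(err)
	}

	// Writes land in the local stream whether or not the cloud is reachable.
	if _, err := js.Publish("rig.sensors.pressure", []byte("412.7")); err != nil {
		fmt.Println("local write failed:", err)
	}
}
```

When connectivity returns, that local stream can be replicated into a cloud-side stream without the application changing.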
To add to Justina's excellent response there, I think one of the reasons Synadia was very misunderstood at the very beginning is because we, meaning Synadia, decided to attack the problem in a very different way.
Traditionally, you just said, okay, I've got an application. How do I get it to the edge? Meaning that that's the first thing you focus on. And by the way, in a previous lifetime, I designed systems the exact same way. Focus on the workload first. How do you get it running somewhere else? Then you figure out, oh, it needs data. Then it needs a network to connect it to the data. Then you need to secure it. And so you see this workload, data, network, security type of a pipeline. And so things that I've designed in the past, like platforms as a service, but also Kubernetes, OpenStack, all did it that way.
What we did was to say, to get to that end point where the application can truly run in any region, any cloud provider, all the way out into, let's say, a connected car type stuff. We started with the connectivity layer first. So we said, if this connectivity layer is intelligent, meaning everything is location independent, I don't need to know where you are. I can get access to you.
securely because the network is secured at the very, very fundamental pillar. Then we move to data based on intelligent connectivity. Things can move and ebb and flow in a secure way and be very location independent. And then at the very end, we care about the workload. And so at the very beginning of Synadia's lifetime, we were very misunderstood because we were taking pretty much a totally reverse, upside-down path to the problem. But to get to Justina's point around truly nomadic applications, you can't start with the application.
You have to start with the connectivity, the networking, security, data layers before. And then in addition, there's a lot of things that make deploying things in the cloud on current architectures very easy because they're all right there. They're like a little click away. And all of a sudden, when you're inside of a retail store or a cafe or a manufacturing plant, it's like, oh, we don't have one-click load balancers and GSLBs and service discoveries. And next thing you know, you're not...
just replicating your nomadic application or the one you want to be nomadic, that might be a pretty simple application. It's like, oh, wait a minute. When we were running that in the cloud, we had security, we had networking, we had VPCs, we had data services, we had load balancers, we had API gateways, we had all of this stuff to get around, at least in our opinion,
the fundamental limitations of the current architectures, which mostly is around the connected pillar, which means everything in the current state of the art is location dependent. I have to know where you are, and everything is a one-to-one request-reply. So anything based on HTTP, gRPC, all of these types of different protocols. And so we said, hey, how do we change that such that fundamentally it unlocks so many different opportunities as we go up the stack to data and to workloads?
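A small sketch of what location-independent addressing can look like with subject-based messaging, here via the NATS Go client: the caller names a subject rather than an IP address, DNS name, or load balancer, so the responder can move from a cloud region to a cell tower to the vehicle without the caller changing. The subject and payload are invented for illustration.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// The service side: answer requests on a subject. Wherever this process
	// runs (a cloud region, a cell tower, inside the vehicle), callers address
	// the subject, not an IP address, DNS name, or load balancer.
	nc.Subscribe("vehicle.telemetry.summary", func(m *nats.Msg) {
		m.Respond([]byte(`{"battery":"81%","range_km":284}`))
	})

	// The caller side: request by subject. Moving the responder closer changes
	// only the observed latency, never this code.
	resp, err := nc.Request("vehicle.telemetry.summary", nil, time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(resp.Data))
}
```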
That's absolutely fascinating. And it does sound like there are a lot of things you need to have in place before you can successfully create these edge applications. So you have all these worries about low connectivity, limited resources and compute capability, things like that. And you need to put all this infrastructure in place. So
How much of this can be like hardware infrastructure you need to sort out? How do you need to change your own applications before you get going? And, you know, you can get access to this robust connectivity and things like that. What do you need to do before you can start building these edge applications?
Well, I think the important part for the audience to understand is that, let's say we're three architects in a company, right? And we're in front of a whiteboard designing an application or designing a service. What we're going to be drawing and what we're going to be talking about in the modern world that Synadia sees is actually not changing. We're going to be talking about microservices and access to relational stores and key value and object stores and streaming data and events and things like that, right? So the boxes and the circles and the lines all are going to look the same, right?
Where Synadia comes in is the how. The how is radically different, almost like going from a petrol car to a Tesla. Still has a gas pedal, still has a steering wheel type stuff, and that's because that's what people are used to. So we're fundamentally not changing the what, but we are changing the how. And so we start at the connectivity layer and then the data layer. But for the audience, to give them a little bit of groundedness in what we're talking about, imagine a world where there's something inside of a connected vehicle that's asking the question.
I don't know what the question is, but let's say it's asking something. But you can imagine the tech company within the automotive industry saying, we want Richie to be able to design V2
We want him to be able to test some sample data on his laptop if that's allowed, right? Then he can scale it up into different clouds, different regions, put it on a telephone pole, and then eventually move it all the way into the vehicle without the thing that's running in the vehicle that's asking the questions, ever having to go down, be reconfigured, anything. And hopefully the audience, if they kind of go,
hmm, how might I do that with today's technology of how I normally go about things with HTTP or REST APIs or sockets and DNS and load balancers and GSLBs. And you realize that something that feels like it should be trivial, like within, let's say, Richie, you finished the V2 of the really cool Richie microservice, to be able to deploy that the next day type stuff. And what we see is that most companies are like,
Once Richie's done, it's probably going to take us six plus months to actually get this all the way out to something like running inside of a vehicle or inside of a factory or whatever. And so one of Synadia's big goals is to provide a modern tech stack for what we consider this modern technology
landscape of where you need to deploy these things. But you should be able to reduce complexity, to Justina's point, batteries should be included. So all of those things we keep mentioning, GSLBs and DNS tricks and things like that, aren't needed with a Synadia tech stack. And so with that one binary, not your application, but that one binary of a Synadia NATS server that can be Lego-bricked anywhere and run anywhere, once that's actually running, all of those other pieces aren't needed. And so now you can see a very
rapid response to Richie microservice V2. That makes sense.
Okay, yeah, that certainly sounds like once you've got the prototype, then it takes six months to put things in place. That's a lot of commitment there. It's a lot of effort to create these things. So making that easier. Well, six months after you're finished. So you've done your job. You're like, hey, Richie V2 looks great and it's running. And then as you look at, oh, well, if we deploy it to the cloud in one region and one cloud provider that we're used to, we might be able to do that fairly quickly, right? We've got Kubernetes set up. We know how that works.
where the friction and the time delays come in to deploying that is in that near, far, and beyond edge. Does it always have to be that long? Is there like a hello world equivalent for edge computing? Like what's the sort of simplest, useful project you can do?
That's where it becomes really interesting and why we took such a different approach. And again, I'm not pointing fingers at other people because I did the same thing, but you could say the hello world might be just a stateless application. How can I figure out how to get it to run inside of a manufacturing plant or a distribution center or a connected car like we're talking about? That's not hard, but that's also not reality.
Every technology is using multiple moving parts, lots of microservices, access to different data types and data stores and things like that that are all spread out. Some could be in the edge, some could be in a remote location but close. For example, with connected cars, they could be running in the base of the cell towers or it could be in different regions within different cloud providers. And so I think that's the big challenge. It's when you're trying to deploy into these unknown landscapes, that's where it becomes interesting. But
it's not just unknown, it's you don't have all of the moving pieces that a cloud has to do what I call the unnatural acts to paper over the fact that
All technologies, in my opinion, for the most part, that run what we're built on today are all location dependent. I need to know where you are, your IP, believe it or not. And people could argue, oh, no, I have an IP for the load balancer. That, in my opinion, is an unnatural act to get around the basic limitations that everything we do today is location dependent and one-to-one request reply.
And we're starting to see architectures that need quite a lot more and they don't want to have to wait for six, 12 months sometimes, right, to have...
not only Richie, but let's say Derek and Justina as the supporting characters on platform engineering, try to recreate everything in the cloud. And so what we do for our customers and partners in the ecosystem is we drastically accelerate the time to value. We also drastically reduce the complexity. You can imagine a world where if the cloud provider is running all of these services, you don't necessarily have to worry about them. You know that they're running and all.
But now all of a sudden, inside of a vehicle or inside of a factory, it's like, oh, do we want to replicate every single thing that the cloud provider does? I mean, that's a massive, massive cost. And you could go down that path or...
What Synadia did was we said, let's reframe the problem from the ground up and try to think differently about what that might look like, such that we don't have to take six to 12 months to try to recreate everything that was in the cloud. And by the way, I love cloud, don't get me wrong. I do think it's the new mainframe. And I hope that the audience realizes, and I'm sure they do because I'm sure they're very smart, that the cloud providers are incentivized to be a Hotel California. You can check in, but they don't want you to check out.
I'm sure. Okay. So yeah, it sounds like there's still quite a lot of effort involved in creating these things. So I guess a few different teams within your organization are going to be involved. So if you're trying to create your first edge application, which teams or roles are going to be involved in this? Yeah, go on, Justina, you've not had a go for a while. Talk us through it.
Well, building a new application always starts with the business outcome that the application needs to drive. And during my time at Google, when I had a front row seat to the generative application revolution, I saw that the customers who had the biggest success were the ones who focused on a very specific business outcome they wanted, on the one business process they wanted to optimize end-to-end rather than trying to boil the ocean and trying too many things at the same time. So once you...
have this end goal, you'll assemble your team that can help you drive towards that goal. And so obviously you will have a solution architect who will be designing the application and looking at some of the alternative solutions to what we always use to build applications, and considering approaches such as Synadia's edge-native tech stack that allows you to build applications that will thrive under any circumstances, even in the harshest of the conditions you will encounter at the edge, where you'll have intermittent
connectivity, high latency, bandwidth may be scarce, and sometimes you may even have limited compute. So solutions architects are definitely needed, but also your data team, because the applications need the data. And to Derek's point, you will not always have access to robust databases in the cloud. You may need to be able to operate in offline scenarios, but still be able to write the data and eventually ensure data consistency.
I think one of the really cool applications that we are seeing now and we'll be seeing more of is Edge AI. And the one functionality of this application that is so revolutionary is that you can start classifying the data that's generated into the high value versus low value data and that high value data that is really critical for you to act effectively.
on in real time because it includes the business insight that will help you drive to the business outcome. So for instance, if your business outcome is to develop a monitoring application for your manufacturing plant,
making sure that all the machines are working in concert. And if something, if an anomaly is detected, you act on it. And maybe you can even create a more advanced version of AI where you chain several functions. So first you detect the problem, then you classify it. Is it urgent or is it something I'm just monitoring? You triage it. And then if it's urgent, you start fixing it.
So you also want to have the ML specialist who will be able to run inference at the edge, then classify the data, and maybe send some of the data back to the cloud to further train the models with this interesting data that's just been captured. Or maybe you want to train the small AI model at the edge.
But I think we will see more collaboration overall, with solution architects looking at new approaches to building truly nomadic applications. And you will see the collaboration between the ML specialists and the data teams, because you cannot have AI without data.
Okay, I love that. So we've got obviously software development and engineering because you're building some kind of application. You've got the data people who need data. And I love the use case of you're doing anomaly detection, so you're going to need some data scientists or machine learning specialists in there as well. I guess the teams that haven't been mentioned, there's no mention of sort of business teams. Do any commercial teams need to be involved in these things?
Well, I think Justina led with the business case, the business teams and what we're trying to achieve. And more succinctly to your point, it's less about the application itself that's running there and it's more about what does it need access to that drives the business outcome. So for example, if my application needs access to three other microservices and they need access to data stores and those all are remote,
The business team might say, I can't wait for the display in the car to update when you touch your phone and it takes five seconds to update. That's just a bad experience. It's a bad business outcome for our users. So then that trickles down. And now all of a sudden it's less about the app itself. It's more about platform engineering, data engineering going, well, what if we need to move those services and that data actually into the vehicle as well? And that's where a lot of our customers, when they hit that crossover point, that's when they reach out to us.
Okay, yeah, that certainly seems to make sense. You have the business people involved in the requirements, and they're maybe less concerned with some of the technicalities. Now, you both mentioned the idea of AI at the edge. So I'd like to know, what are some sort of examples of this? Are we talking about sort of more traditional or predictive AI, or are there generative AI use cases? Like, do you want chatbots on the edge? Talk me through it. Well, I think what you're going to see is
inference, wherever it's running, right, is kind of a, it's its own ecosystem, in my opinion. And it has two distinct factors. Again, no matter where it's running. One is prompt augmentation, which I'm sure the audience is kind of familiar with. We can actually do it by hand. You might hear things like RAG and RAG++ and things like that. And it's essentially saying the raw prompt that you give me, I need a whole bunch of access to real-time data sources to augment it.
And so again, we're in that, hey, do we want to pay the latency if all of that stuff is not where we really need it? The second one, which is just starting to come into fruition, but I think to your use case that you're talking about here is very applicable, is what most people term an agentic system. But what I actually term it as is probably a DAG, a directed acyclic graph traversal through multiple models. So it could be that
I'm going to be talking. I think voice interactions, especially in certain situations, will dominate. And so LLMs and things like that might actually be the first thing that says, okay, I get it. I get what you're trying to do. And here's the plan. And oh, by the way, you need all of the access to these different data sources and models to kind of make this work. And where I think Synadia and our partners can benefit from this, and we have a lot of AI customers today, is
You don't need to know where they are, but you need to know that it's secure and that you're really talking to the model that you want to. But it could be a mix of models between LLM and generative to predictive to onboard vision models that are running in vehicle or on the manufacturing floor of robots. So you can see this ecosystem just keep playing out. And again,
more access to more data, more latency sensitive access to these things, and then traversals through multiple models where you don't know where they're running. And you could even, in my opinion, have the ability to move these models again from the cloud to a cell tower to in the factory or in the vehicle, so to speak. Okay, yeah, so two very cool examples. I like the idea of just using real-time data sources to augment your prompts. And that seems like a cool edge use case.
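To ground the prompt-augmentation idea, here is a hedged sketch: before a prompt reaches any model, the edge application pulls a couple of pieces of real-time context from nearby services and prepends them. The subjects, the 100-millisecond deadlines, and the prompt shape are all illustrative assumptions; the model call itself is left out.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

// augmentPrompt shows the idea only: gather fresh local context and prepend it
// to the raw prompt before it ever reaches a model. The subjects and prompt
// shape are illustrative; they are not Synadia's or anyone's real API.
func augmentPrompt(nc *nats.Conn, rawPrompt string) (string, error) {
	// Fetch two pieces of real-time context from nearby services, each with a
	// tight deadline so augmentation never dominates the latency budget.
	line, err := nc.Request("factory.line1.status", nil, 100*time.Millisecond)
	if err != nil {
		return "", err
	}
	shift, err := nc.Request("factory.shift.current", nil, 100*time.Millisecond)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf(
		"Context:\n- line status: %s\n- current shift: %s\n\nQuestion: %s",
		line.Data, shift.Data, rawPrompt), nil
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Close()

	prompt, err := augmentPrompt(nc, "Why did throughput drop in the last hour?")
	if err != nil {
		panic(err)
	}
	fmt.Println(prompt) // hand this to whichever model performs the inference
}
```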
I think on that one, just real quick to kind of also highlight some of the things that we talk around the connectivity layer. Imagine a world where there's inference and it's a very large service. Let's say we go from 10 requests a second to a million. I'm making up a number, right? Obviously, you're going to have a very big bill if you're always talking to big LLMs. But more importantly, if you're doing prompt augmentation and everything is request reply point to point,
Every data layer that you're accessing now also needs to scale with your inference layer. And so when you see a RAG, RAG is retrieval. I ask and retrieve the value. And what we and our customers and partners kind of talk through is this.
hey, let's say Richie's the main service and Richie's now running at a million a second. And Justina and I are the data, you know, real-time data things for prompt augmentation. Instead of you having to ask Justina and me at a million times a second as well, I mean, we have to scale that commensurate with your scaling.
In our world, we have both push and pull. Push just means you can say, hey, Derek, anytime, let's say a sensor temperature changes on this, just let me know. And I know it's only going to change maybe a couple times, you know, over 30 seconds. But now all of a sudden, I don't have to scale to a million a second. I just know that
I have a change and I just, in our world, just send it out as an event that anyone who's interested in that event, meaning all 10,000 of you, right, that are running this million requests per second, and maybe Justina is doing audit logs or something else with that.
In our world, that's just very natural and very simple. And so even at the just prompt augmentation and access to real-time data, when we talk about real-time data, yes, you can ask for it. You know, you can do request response. But what's, I think, a highlight point of a system like Sanadia's is you can architecturally move that to a push model and just say, hey, you guys, whenever something changes, just let me know. And then I'll hold on to that data for a certain period of time and just reuse it.
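Derek's push-versus-pull point can be sketched in a few lines: the data side publishes an event only when a value changes, and every interested consumer, whether the inference service caching the latest reading or an audit logger, receives it without adding load on the publisher. The subjects and values are made up for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Inference side: cache the last known temperature. Every incoming prompt
	// reads the cache instead of issuing its own request to the data service.
	var (
		mu       sync.RWMutex
		lastTemp = "unknown"
	)
	nc.Subscribe("factory.line1.temp.changed", func(m *nats.Msg) {
		mu.Lock()
		lastTemp = string(m.Data)
		mu.Unlock()
	})

	// A second, independent consumer (say, an audit log) gets the same event
	// for free; the publisher's load does not grow with the number of readers.
	nc.Subscribe("factory.line1.temp.changed", func(m *nats.Msg) {
		fmt.Println("audit:", string(m.Data))
	})

	// Data side: publish only when the value actually changes, perhaps a few
	// times a minute, regardless of how many requests per second the inference
	// layer is serving.
	nc.Publish("factory.line1.temp.changed", []byte("87.4"))
	nc.Flush()
	time.Sleep(100 * time.Millisecond) // give the async handlers a moment (demo only)

	mu.RLock()
	fmt.Println("inference uses cached temp:", lastTemp)
	mu.RUnlock()
}
```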
Okay, so not even having to ask the data, it's just being given to you and saying, here's something interesting completely automatically. Okay, I like that. It's sort of zero work for humans you have to do. So yeah, everything just happens without you intervening. All right, so do you have any success stories from organizations that have used agents? So you mentioned that as your second use case. Everyone's talking about agents, but real world examples are a little hard to come by. So have you seen this work in practice?
Oh, yes. We have lots of innovative AI startups as our customers, and they build a variety of interesting use cases. So for instance, we have a company who's building personal AI assistants that are trained on your own data, and they can become useful assistants helping you discover and answer some questions about, maybe even, career choices based on the past data that you've had about your preferences. We have customers who are
automating DevOps workflows and helping DevOps teams be more productive and focus on what they do best by automating some of the mundane and repetitive tasks. And then we also have organizations who are building a search engine based on AI. So there is a whole range of use cases that are very practical, providing lots of value, and they are not far away in the future. They are happening right now.
To Justina's point, but also to echo what you were saying, Richie, it is still very early on. And so we know, or at least I firmly believe, that in 2025 you will see these multi-model agentic workflows really kind of explode. I think...
And then in the second half of 2025, I think we're going to see kind of an explosion of physical AI. So when these agents can actually start affecting physical environments, which, I'll be honest, a couple of years ago I thought that physical AI component, like robotics and things like that, was going to be very far out. And everything that AI is doing, it's making a lot of us look kind of foolish because it's proving us wrong as we go. So
We're definitely seeing customers starting to utilize our tech stack to make these easier, but it is very, very early days. But second half of 2025, it's coming around fast. What's changed to make the AI interactions with robotics easier?
I think it's kind of like the iPhone. And what I mean by that, and the audience might go, what? But the iPhone was, it was just a massive, perfect storm of all of these technologies reaching kind of an apex where the iPhone was possible. And so...
you know, with generative AI, with the power of these models, with, for the most part, you know, at least in our opinion, looking at the AI landscape. And I studied this in university. Unfortunately, when we got out of university, or at least when I got out, we went into the second longest AI winter ever, right? But now what you're seeing is that hardware is moving faster than I ever thought it would, you know, with NVIDIA's new chips, you know, where they go from H100 to Blackwell, which is, I think, I might have messed up that name, coming out so fast.
And then everyone was really concerned around energy. And they're like, well, we'll just go to nuclear. And we really think fusion's probably within a decade, which will solve all those problems. So most of the forward thinkers in the AI space have said, let's assume that data is plentiful. We're going to figure out synthetic data. Compute will become faster and cheaper, and energy will go to zero.
So when you start thinking about that, you start seeing things like NVIDIA's Omniverse. Well, they've created a completely virtual reality environment to train virtual robots, but completely on the laws of physics. And they can really just transfer it to the physical thing, and they're getting amazing results. But again, I think it's like this mini iPhone moment, which I just did not see happening for physical AI this fast.
But for us, it's great because manufacturing is a big vertical for us, right? We have lots of customers in that space. So it's great to see. That's a big gamble if you're assuming nuclear fusion is going to happen. Like I remember learning about nuclear fusion maybe coming soon when I was at school and that's several decades ago. So yeah, but certainly it's a very interesting idea. But the interesting part is, you're right. And I do believe it will be within a decade. But I think every hyperscaler already has contracts to build nuclear powered
data centers, AI data centers and things. And so even in the short term, you can, at least on paper, I know this isn't reality, but you can mentally trend energy to zero. But at the same time, the energy requirements of the hardware is accelerating faster than any tech I've seen in my 36 plus year career. So it's definitely kind of bonkers where we're at. And it's exciting, at least for me, very exciting.
Absolutely. That's very cool stuff. All right. So just to wrap up, what are you most excited about in the world of edge computing?
So one of the most exciting use cases of AI at the edge are definitely self-driving cars. Lots of sensors capturing lots of data in real time. And we need to act on this data in real time because it's a mission critical workload. Either it's safe to make a left turn in the intersection or it's not and you need to wait. And if something is really going wrong with the vehicle, you need to pull over and wait for help.
You need to have a tech stack that supports this mission-critical workload. And connectivity is not a given. Sometimes these vehicles, they have to really operate fully autonomously. And all the decisions have to be made at the vehicle. And so having a tech stack that supports it, it's key to making autonomous vehicles a reality.
I can tell you, at least from my perspective, Richie, I think the biggest thing is that to Justina's point, we focus both on training and inference, but where we're really concentrating is inference. And what we've seen is it's always going to be at the edge.
And for us, what at least I am most excited about is that all of the bets we made, for example, prompt augmentation needing access to lots of different data in real time that can be spread out. And then this multi-model agentic workflow that's exploding. All of it kind of needs the tech stack that we bet on about seven years ago. So that's what I'm most excited about, that some of the bets that we crossed our fingers on are kind of taking root.
All right. That's wonderful stuff. It's a very exciting future. So, yeah, thank you so much for your time. Absolutely. Thank you. I appreciate it. Thank you, Richie.