Summary
In this episode of the Data Engineering Podcast, Anna Geller talks about the integration of code and UI-driven interfaces for data orchestration. Anna defines data orchestration as automating the coordination of workflow nodes that interact with data across various business functions, discussing how it goes beyond ETL and analytics to enable real-time data processing across different internal systems. She explores the challenges of using existing scheduling tools for data-specific workflows, highlighting limitations and anti-patterns, and discusses Kestra's solution, a low-code orchestration platform that combines code-driven flexibility with UI-driven simplicity. Anna delves into Kestra's architectural design, API-first approach, and pluggable infrastructure, and shares insights on balancing UI and code-driven workflows, the challenges of open-core business models, and innovative user applications of Kestra's platform.
Announcements
Interview
Introduction
How did you get involved in the area of data management?
Can you start by sharing a definition of what constitutes "data orchestration"?
There are many orchestration and scheduling systems that exist in other contexts (e.g. CI/CD systems, Kubernetes, etc.). Those are often adapted to data workflows because they already exist in the organizational context. What are the anti-patterns and limitations that approach introduces in data workflows?
What are the problems that exist in the opposite direction of using data orchestrators for CI/CD, etc.?
Data orchestrators have been around for decades, with many different generations and opinions about how and by whom they are used. What do you see as the main motivation for UI vs. code-driven workflows?
What are the benefits of combining code-driven and UI-driven capabilities in a single orchestrator?
What constraints does it necessitate to allow for interoperability between those modalities?
Data Orchestrators need to integrate with many external systems. How does Kestra approach building integrations and ensure governance for all their underlying configurations?
Managing workflows at scale across teams can be challenging in terms of providing structure and visibility of dependencies across workflows and teams. What features does Kestra offer so that all pipelines and teams stay organised?
What are the most interesting, innovative, or unexpected ways that you have seen Kestra used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Kestra?
When is Kestra the wrong choice?
What do you have planned for the future of Kestra?
Contact Info
Parting Question
Closing Announcements
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
In this episode of the Data Engineering Podcast, host Tobias Macey interviews Anna Geller, a data engineer turned product manager, about the integration of code and UI-driven interfaces for data orchestration. Anna shares her journey from working with data during an internship at KPMG to her current role as a product lead at Kestra. She provides her insights into the concept of data orchestration, emphasizing its broader scope beyond just ETL and analytics, and discusses the challenges and anti-patterns that arise when using existing scheduling systems for data-specific workflows.
Anna explains the overlap between CI/CD, scheduling, and orchestration tools, and the limitations that occur when these tools are used for data workflows. She highlights the importance of visibility and governance at scale and the need for a dedicated orchestrator like Kestra. The conversation also delves into the challenges of using data orchestrators for non-data workflows and the benefits of combining code and UI-driven approaches.
Anna discusses Kestra's architecture, which supports both JDBC and Kafka backends, and its focus on API-first interactions. She explains how Kestra handles task granularity, inputs, and outputs, and the flexibility provided by its plugin system. The episode also explores Kestra's approach to data as assets, the target audience for Kestra, and how it bridges different workflows across organizational boundaries.
The discussion touches on Kestra's open-core model, the challenges of balancing open-source and enterprise features, and the innovative ways Kestra is being applied. Anna shares insights into Kestra's local development experience, the lessons learned in building the product, and the upcoming features and projects that Kestra is excited to explore.