595: Data Engineering 101

2022/7/26
Super Data Science: ML & AI Podcast with Jon Krohn

Tune in as Joe Reis and Matt Housley, co-founders of Ternary Data and co-authors of the book “Fundamentals of Data Engineering,” join Jon Krohn to discuss the major undercurrents across the data engineering lifecycle and their top tools and techniques.

In this episode you will learn:

• What is data engineering? [3:55]

• Why Joe and Matt identify as “recovering data scientists” [6:12]

• What kinds of people tend to become data scientists vs. data engineers? [10:38]

• Key components of Joe and Matt’s book [26:31]

• Major undercurrents across the data engineering lifecycle [28:26]

• The most under-utilized tool in a data engineer's toolbox [34:39]

• Why every data pipeline latency decision involves tradeoffs, even though faster is typically the default assumption [38:55]

• Joe and Matt’s favorite data engineering tools and techniques [43:39]

Additional materials: www.superdatascience.com/595