Summary
Data is one of the core ingredients for machine learning, but the format in which humans understand it is not a useful representation for models. Embedding vectors structure data in a form that is native to how models interpret and manipulate information. In this episode Frank Liu shares how the Towhee library simplifies the work of translating your unstructured data assets (e.g. images, audio, video) into embeddings that you can use efficiently for machine learning, and how it fits into your workflow for model development.
Announcements
Interview
Introduction
How did you get involved in machine learning?
Can you describe what Towhee is and the story behind it?
What is the problem that Towhee is aimed at solving?
What are the elements of generating vector embeddings that pose the greatest challenge or require the most effort?
Once you have an embedding, what are some of the ways that it might be used in a machine learning project?
Are there any design considerations that need to be addressed in the form that an embedding takes, and how does that form impact the resultant model that relies on it (whether for training or inference)?
Can you describe how the Towhee framework is implemented?
What are some of the interesting engineering challenges that needed to be addressed?
How have the design/goals/scope of the project shifted since it began?
What is the workflow for someone using Towhee in the context of an ML project?
What are some of the types of optimizations that you have incorporated into Towhee?
What are some of the scaling considerations that users need to be aware of as they increase the volume or complexity of data that they are processing?
What are some of the ways that using Towhee impacts the way a data scientist or ML engineer approaches the design and development of their model code?
What are the interfaces available for integrating with and extending Towhee?
What are the most interesting, innovative, or unexpected ways that you have seen Towhee used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Towhee?
When is Towhee the wrong choice?
What do you have planned for the future of Towhee?
Contact Info
Parting Question
Closing Announcements
Links
The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra / CC BY-SA 3.0