Dr. Ronen Dar (Co-Founder/CTO of @runailabs) talks about the challenges of running compute infrastructure for AI, the GPU ecosystem, sizing LLMs, and more.
SHOW: 739
**CLOUD NEWS OF THE WEEK -** http://bit.ly/cloudcast-cnotw
**NEW TO CLOUD? CHECK OUT "CLOUDCAST BASICS"**
SHOW SPONSORS:
SHOW NOTES:
**Topic 1 -** Welcome to the show. Tell us a little bit about your background and what you focus on at Run:ai.
**Topic 2 -** Let’s begin by talking about the challenges of running AI applications. What unique characteristics and requirements do AI applications have?
**Topic 3 -** Most AI applications run on GPUs. How do things change when using GPUs vs. CPUs to power AI applications? What is needed to get the most out of GPUs?
**Topic 4 -** As environments grow larger, what is needed to scale up, both in terms of scheduling applications and managing the underlying GPU infrastructure?
**Topic 5 -** GPUs are not only expensive resources but also in high demand. How are companies doing capacity planning with GPUs? What struggles are you seeing as they plan for AI projects?
**Topic 6 -** Are the new Large Language Models (LLMs) much different in size from the AI models of the past?
**Topic 7 -** How well prepared is the industry to deal with this new wave of interest in AI?
FEEDBACK?