What does it take to train a frontier model? What's the know-how, the secret sauce that lets firms like OpenAI and DeepMind push the limits of what's possible? How much are Chinese firms benefitting from Western open source, and in the long term is it possible for Western labs to maintain an edge?
The hosts of the excellent Latent Space podcast, Alessio Fanelli of Decibel VC and Shawn Wang of Smol AI, come on to discuss.
We get into:
How the secret sauce used to push the frontier of AI diffuses out of the top labs and into Substacks
How labs are managing the culture change from quasi-academic outfits to places that have to ship
How open source raises the global AI standard, but why there's likely to always be a gap between closed and open source
China as a "GPU Poor" nation
Three key algorithmic innovations that could reshape the balance of power between the GPU rich and GPU poor
Outro music: CHEKI https://open.spotify.com/track/1zKL2bOEkMDGuIjLhG34YA?si=9a713a88aa3d4f71
Cover photo: "Inkstand with A Madman Distilling His Brains" 1600s Urbino. Kind of like training a model! https://www.metmuseum.org/art/collection/search/188899
The Met's description: In this whimsical maiolica sculpture, a well-dressed man leans forward in his seat with his head in a covered pot set above a fiery hearth. The vessel beside the hearth almost certainly held ink. The man’s actions are explained by an inscription on the chair: "I distill my brain and am totally happy." Thus the task of the writer is equated with distillation—the process through which a liquid is purified by heating and cooling, extracting its essence.