Deep learning allows engineers to build models that can make decisions based on training data. These models improve over time using stochastic gradient descent. When a model gets big enough, the training must be broken up across multiple machines. Two strategies for doing this are “model parallelism,” which divides the model across machines, and “data parallelism,” which divides the data across multiple copies of the model.
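To make the two strategies concrete, here is a minimal single-process sketch in plain Python. It is only an illustration under stated assumptions, not how Intel Nervana or any particular framework implements distributed training: the "workers" are simulated with list slices, the model is a toy linear fit, and the function names (`data_parallel_step`, `model_parallel_forward`, `split`) are hypothetical.

```python
# Toy illustration only: workers are simulated in one process with list slices.

def gradient(w, shard):
    """Mean gradient of the squared error 0.5 * (w*x - y)^2 over one data shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def split(items, n_workers):
    """Round-robin split of a list into n_workers shards."""
    return [items[i::n_workers] for i in range(n_workers)]

def data_parallel_step(w, batch, n_workers, lr):
    """Data parallelism: each worker holds a full copy of w and sees one shard of the data."""
    shards = split(batch, n_workers)
    # In a real system each gradient would be computed on a separate machine.
    grads = [gradient(w, shard) for shard in shards]
    # Averaging stands in for the communication step (for example, an all-reduce).
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

def model_parallel_forward(weights, x, n_workers):
    """Model parallelism: each worker owns a slice of the weights and computes a partial result."""
    index_shards = split(list(range(len(weights))), n_workers)
    partials = [sum(weights[i] * x[i] for i in idxs) for idxs in index_shards]
    # The partial dot products are combined to form the full output.
    return sum(partials)

# Fit y = 3x with four simulated data-parallel workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, n_workers=4, lr=0.01)
print(round(w, 3))

# Compute a dot product with the weights split across two simulated workers.
print(model_parallel_forward([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0], n_workers=2))
```

With equal-sized shards, the averaged gradient in the data-parallel step equals the gradient over the whole batch, so the update matches what a single machine would compute. In a real deployment, the cost of the averaging and combining steps is network communication between machines.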
Distributed deep learning brings together two advanced software engineering concepts: distributed systems and deep learning. In this episode, Will Constable, the head of distributed deep learning algorithms at Intel Nervana, joins the show to give us a refresher on deep learning and explain how to parallelize the training of a model.
Full disclosure: Intel is a sponsor of Software Engineering Daily, and if you want to find out more about Intel Nervana, including other interviews and job postings, go to softwareengineeringdaily.com/intel. Intel Nervana is looking for great engineers at all levels of the stack, and in this episode we’ll dive into some of the problems the Intel Nervana team is solving.
Related episodes about machine learning can be found here.
The post Distributed Deep Learning with Will Constable appeared first on Software Engineering Daily.