#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning

2020/4/3

Lex Fridman Podcast

David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar and MuZero, and has done a lot of important work in reinforcement learning.

Support this podcast by signing up with these sponsors:

MasterClass: https://masterclass.com/lex
Cash App - use code "LexPodcast" and download:
Cash App (App Store): https://apple.co/2sPrUHe
Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS: Reinforcement learning (book): https://amzn.to/2Jwp5zG

This conversation is part of the Artificial Intelligence podcast. If you would like to get more information about this podcast, go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube, where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.

Here's the outline of the episode. On some podcast players you should be able to click the timestamp to jump to that time.

OUTLINE:
00:00 - Introduction
04:09 - First program
11:11 - AlphaGo
21:42 - Rule of the game of Go
25:37 - Reinforcement learning: personal journey
30:15 - What is reinforcement learning?
43:51 - AlphaGo (continued)
53:40 - Supervised learning and self play in AlphaGo
1:06:12 - Lee Sedol retirement from Go play
1:08:57 - Garry Kasparov
1:14:10 - Alpha Zero and self play
1:31:29 - Creativity in AlphaZero
1:35:21 - AlphaZero applications
1:37:59 - Reward functions
1:40:51 - Meaning of life
