Difference between revisions of "Reinforcement Learning - Introductory Guide"
From iCenter Wiki
Revision as of 01:59, 14 January 2017 (Saturday)
Introduction to Reinforcement Learning
Textbooks
An Introduction to Reinforcement Learning (Intro_RL)
Algorithms for Reinforcement Learning (RLAlgsInMDPs)
Research
AlphaGo and Computer Go
- Bandit based Monte-Carlo planning, ECML 2006.
- Combining Online and Offline Knowledge in UCT, ICML 2007.
- Achieving Master Level Play in 9 × 9 Computer Go, AAAI 2008.
- Mimicking Go Experts with Convolutional Neural Networks, ICANN 2008.
- Monte-Carlo tree search and rapid action value estimation in computer Go, Artificial Intelligence, Elsevier, 2011.
- The grand challenge of computer Go: Monte Carlo tree search and extensions, CACM 2012.
- Training Deep Convolutional Neural Networks to Play Go, ICML 2015.
- Mastering the game of Go with deep neural networks and tree search, Nature 2016.