Differences between revisions of "Reinforcement Learning: Introductory Guide"
From iCenter Wiki
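Revision as of 01:36, 17 January 2017 (Tuesday)
Introduction to Reinforcement Learning
Textbooks
Classic textbooks on reinforcement learning (called either 增强学习 or 强化学习 in Chinese)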
- An Introduction to Reinforcement Learning Intro_RL
- Algorithms for Reinforcement Learning RLAlgsInMDPs
Research
AlphaGo and Computer Go
Monte-Carlo tree search
- Bandit Based Monte-Carlo Planning, ECML 2006.
- Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, CG 2006.
- Combining Online and Offline Knowledge in UCT, ICML 2007.
- Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go, Artificial Intelligence (Elsevier), 2011.
Neural networks
- Mimicking Go Experts with Convolutional Neural Networks, ICANN 2008.
- Training Deep Convolutional Neural Networks to Play Go, ICML 2015.
Progress
- Achieving Master Level Play in 9 × 9 Computer Go, AAAI 2008.
- The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions, CACM 2012.
- Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature 2016.