Changes

Reinforcement Learning: An Introductory Guide

553 bytes added, 08:38, 13 January 2017 (Friday)
/* Research */
==Research==
 
<B>AlphaGo and Computer Go</B>
 
Bandit Based Monte-Carlo Planning, ECML 2006.
 
Combining Online and Offline Knowledge in UCT, ICML 2007.
 
Achieving Master Level Play in 9 × 9 Computer Go, AAAI 2008.
 
Mimicking Go Experts with Convolutional Neural Networks, ICANN 2008.
 
Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go, Artificial Intelligence, Elsevier 2011.
 
The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions, CACM 2012.
 
Mastering the Game of Go with Deep Neural Networks and Tree Search, Nature 2016.