Difference between revisions of "Big Data Intelligence"
(→Speech recognition) |
|||
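Line 1: | Line 1: | ||
− | = | + | == Artificial Intelligence == |
− | + | Artificial intelligence (AI) refers to a computer system's capacity for human-like intelligence, ranging from listening, speaking, reading, and writing to searching, reasoning, decision-making, and answering questions; in short, the ability to perceive, understand, and decide. | |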
− | + | ||
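− | ==History of Artificial Intelligence== | + | === History of Artificial Intelligence === |
The field has gone through two booms and two downturns | The field has gone through two booms and two downturns | ||
Line 12: | Line 11: | ||
Algorithmic advances in machine learning driven by big data | Algorithmic advances in machine learning driven by big data | ||
− | + | === Four Layers === | |
− | + | * Goals and functions | |
− | + | : Speech recognition, machine vision, natural language understanding | |
+ | : Intelligent question answering is a higher-level system that combines the functions above | ||
− | + | * Core technologies | |
− | + | : Task-specific algorithms, machine learning algorithms, deep neural networks | |
− | + | * Software tools | |
− | + | : TensorFlow / Caffe / Torch | |
− | + | * Underlying hardware | |
− | + | : Field-programmable gate arrays (FPGA) / general-purpose graphics processing units (GPGPU) / general-purpose CPU clusters | |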
− | + | ||
− | === | + | === International Research === |
− | + | [http://research.google.com/teams/brain/ Google Brain] | |
+ | ([http://research.google.com/pubs/jeff.html Jeffrey Dean]) | ||
− | + | [https://research.facebook.com/ai Facebook AI Research (FAIR)] | |
− | + | ([http://yann.lecun.com/ Yann LeCun]) | |
− | [ | + | |
− | ([http:// | + | |
− | + | ||
− | + | ||
− | + | ||
[https://www.microsoft.com/en-us/research/group/dltc/ MSR Deep Learning Technology Center (DLTC)] | [https://www.microsoft.com/en-us/research/group/dltc/ MSR Deep Learning Technology Center (DLTC)] | ||
− | (Li Deng) | + | ([https://www.microsoft.com/en-us/research/people/deng/ Li Deng]) |
[https://www.openai.com/blog/ OpenAI] | [https://www.openai.com/blog/ OpenAI] | ||
− | (Ilya Sutskever) | + | ([http://www.cs.toronto.edu/~ilya/ Ilya Sutskever]) |
+ | |||
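+ | == Machine Learning == | ||
+ | |||
+ | Machine learning (ML) refers to machines automatically analyzing data to extract regularities and then using those regularities to make predictions on unseen data. | ||
+ | |||
+ | === Reading Material === | ||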
+ | |||
+ | # Jordan, M. I., and T. M. Mitchell. "Machine learning: Trends, perspectives, and prospects." Science 349, no. 6245 (2015): 255-260. [http://science.sciencemag.org/content/349/6245/255 Machine_Learning_Science_2015] | ||
− | == | + | === Tools === |
− | + | '''Python''' | |
[http://scikit-learn.org scikit-learn] | [http://scikit-learn.org scikit-learn] | ||
+ | ([https://github.com/scikit-learn/scikit-learn Source Code]) | ||
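− | + | == Deep Learning == | |
− | + | Deep learning is a family of machine learning methods based on representation learning; its algorithms attempt to build high-level abstractions of data using multiple processing layers that have complex structure or are composed of multiple nonlinear transformations. | |
− | + | === Neural Networks === | |
− | [[卷积神经网络|Convolutional neural networks]] | + | Deep neural networks (DNN) |
+ | |||
+ | [[卷积神经网络|Convolutional neural networks]] (CNN) | ||
+ | |||
+ | History: The rebirth of neural networks, ISCA 2010. | ||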
+ | [http://pages.saclay.inria.fr/olivier.temam/homepage/ISCA2010web.pdf Rebirth_NN] | ||
+ | |||
+ | === Reading Material === | ||
[[深度学习-入门导读|Deep Learning: Introductory Guide]] | [[深度学习-入门导读|Deep Learning: Introductory Guide]] | ||
− | == | + | === Tools === |
− | + | '''Google''' | |
− | [https://github.com/tensorflow/tensorflow | + | [https://www.tensorflow.org/ TensorFlow] |
+ | ([https://github.com/tensorflow/tensorflow Source Code]) | ||
− | [http://download.tensorflow.org/paper/whitepaper2015.pdf | + | [http://download.tensorflow.org/paper/whitepaper2015.pdf TensorFlow_Whitepaper] |
− | + | '''Facebook''' | |
− | |||
[http://torch.ch/ Torch] | [http://torch.ch/ Torch] | ||
+ | ([https://github.com/torch/torch7 Source Code]) | ||
− | + | [https://github.com/facebook/fbcunn fbcunn] | |
− | + | ||
− | [ | + | |
− | + | '''Microsoft''' | |
− | + | [http://cntk.ai CNTK] | |
+ | ([https://github.com/microsoft/cntk Source Code]) | ||
− | [http://dmlc.ml/ | + | '''[http://dmlc.ml/ DMLC]''' |
[http://mxnet.io/ MXNet] | [http://mxnet.io/ MXNet] | ||
− | [https://github.com/dmlc/mxnet | + | ([https://github.com/dmlc/mxnet Source Code]) |
− | + | '''Université de Montréal''' | |
+ | [http://www.deeplearning.net/software/theano/ Theano] | ||
+ | ([https://github.com/Theano/Theano/ Source Code]) | ||
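− | + | == Reinforcement Learning == | |
− | = | + | Reinforcement learning (RL) is an area of machine learning concerned with how an agent should act in an environment so as to maximize its expected reward. | |
− | + | ||
+ | === Reading Material === | ||
[[增强学习-入门导读|Reinforcement Learning: Introductory Guide]] | [[增强学习-入门导读|Reinforcement Learning: Introductory Guide]] | ||
− | ==Tools | + | === Tools === |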
− | + | ||
− | + | '''Google''' | |
− | [https://github.com/deepmind/lab | + | |
+ | [https://github.com/deepmind/lab DeepMind Lab] | ||
+ | |||
+ | '''OpenAI''' | ||
− | + | [https://universe.openai.com/ OpenAI Universe] | |
− | + | ([https://github.com/openai/universe Source Code]) | |
− | [https://github.com/openai/universe | + | |
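− | =Machine Perception= | + | == Machine Perception == |
Machine perception covers modalities such as speech, images, video, gestures, and posture | Machine perception covers modalities such as speech, images, video, gestures, and posture | ||
− | + | The sections below focus on | |
+ | '''deep-learning-based machine perception''' | ||
− | ===Speech Recognition=== | + | === Speech Recognition === |
Automatic speech recognition (ASR) | Automatic speech recognition (ASR) | ||
Line 125: | Line 142: | ||
Basic tools | Basic tools | ||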
− | |||
− | |||
− | *: | + | *: Long short-term memory neural network (LSTM) |
− | :# | + | :# Long short-term memory, Neural Computation 9 (8), 1735-1780, 1997. [http://ieeexplore.ieee.org/document/6795963 LSTM] |
− | *: | + | *: Connectionist temporal classification (CTC) |
− | :# | + | :# Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006. |
− | + | *: Gated Recurrent Unit (GRU) | |
− | :#Towards End-To-End Speech Recognition with Recurrent Neural Networks, ICML 2014. | + | :# On the Properties of Neural Machine Translation: Encoder-Decoder Approaches, SSST-8, 2014. |
− | :#Speech recognition with deep recurrent neural networks, 2013. | + | |
− | :#Hybrid speech recognition with deep bidirectional LSTM, ASRU 2013. | + | [http://www.cs.toronto.edu/~graves/ Alex Graves], a researcher at DeepMind, pioneered several speech recognition techniques. |
− | :#Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006. | + | |
+ | :# Towards End-To-End Speech Recognition with Recurrent Neural Networks, ICML 2014. | ||
+ | :# Speech recognition with deep recurrent neural networks, 2013. | ||
+ | :# Hybrid speech recognition with deep bidirectional LSTM, ASRU 2013. | ||
+ | :# Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006. | ||
Google Speech | Google Speech | ||
+ | |||
:# Google Speech Processing from Mobile to Farfield, CHiME 2016. [http://spandh.dcs.shef.ac.uk/chime_workshop/presentations/CHiME_2016_Bacchiani_keynote.pdf Google_Speech_Processing] | :# Google Speech Processing from Mobile to Farfield, CHiME 2016. [http://spandh.dcs.shef.ac.uk/chime_workshop/presentations/CHiME_2016_Bacchiani_keynote.pdf Google_Speech_Processing] | ||
− | ===Computer Vision=== | + | === Computer Vision === |
Computer vision (CV) | Computer vision (CV) | ||
− | + | Object Detection | |
− | + | [http://www.rossgirshick.info/ Ross Girshick], a researcher at FAIR, is the originator of the R-CNN algorithm. | |
− | + | ||
− | [ | + | |
:<B>R-CNN (Region-based Convolutional Network method)</B> | :<B>R-CNN (Region-based Convolutional Network method)</B> | ||
第162行: | 第180行: | ||
:<B>Faster R-CNN (Faster Region-based Convolutional Network method)</B> | :<B>Faster R-CNN (Faster Region-based Convolutional Network method)</B> | ||
::#Faster R-CNN: Towards real-time object detection with region proposal networks, NIPS, 2015. | ::#Faster R-CNN: Towards real-time object detection with region proposal networks, NIPS, 2015. | ||
+ | |||
+ | ::• R-CNN(Matlab): https://github.com/rbgirshick/rcnn | ||
::• Fast_R-CNN(Python): https://github.com/rbgirshick/fast-rcnn | ::• Fast_R-CNN(Python): https://github.com/rbgirshick/fast-rcnn | ||
− | ::• Faster_R-CNN( | + | ::• Faster_R-CNN(Matlab): https://github.com/ShaoqingRen/faster_rcnn |
::• Faster_R-CNN(Python): https://github.com/rbgirshick/py-faster-rcnn | ::• Faster_R-CNN(Python): https://github.com/rbgirshick/py-faster-rcnn | ||
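− | =Machine Cognition= | + | == Machine Cognition == |
− | Machine cognition (Machine | + | Machine cognition covers natural language understanding, reasoning, attention, knowledge, learning, decision-making, interaction, and more. |
− | + | '''Technical approach:''' | |
+ | Deep learning + reinforcement learning | ||
− | + | == Advances in Frontier Applications == | |
− | = | + | === Natural Language Understanding === |
− | + | Natural language understanding (NLU) relies on techniques known as natural language processing (NLP). | |
− | + | === Intelligent Question Answering === | |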
− | + | ||
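− | ==Intelligent Question Answering== | + | |
A question answering (QA) system that integrates automatic speech recognition (ASR), computer vision (CV), and natural language processing (NLP). | A question answering (QA) system that integrates automatic speech recognition (ASR), computer vision (CV), and natural language processing (NLP). | ||
+ | Related reading: | ||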
Reasoning in vector space: An exploratory study of question answering, ICLR 2016. | Reasoning in vector space: An exploratory study of question answering, ICLR 2016. | ||
− | + | Related course: | |
− | + | ||
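[[实验室探究课-智能问答与智能系统|Lab Inquiry Course: Intelligent Question Answering and Intelligent Systems]] | [[实验室探究课-智能问答与智能系统|Lab Inquiry Course: Intelligent Question Answering and Intelligent Systems]] |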
Revision as of 17:07, 26 January 2017 (Thursday)
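Artificial Intelligence
Artificial intelligence (AI) refers to a computer system's capacity for human-like intelligence, ranging from listening, speaking, reading, and writing to searching, reasoning, decision-making, and answering questions; in short, the ability to perceive, understand, and decide.
History of Artificial Intelligence
The field has gone through two booms and two downturns
Computing power enabled by networks and cloud computing
Algorithmic advances in machine learning driven by big data
Four Layers
- Goals and functions
  - Speech recognition, machine vision, natural language understanding
  - Intelligent question answering is a higher-level system that combines the functions above
- Core technologies
  - Task-specific algorithms, machine learning algorithms, deep neural networks
- Software tools
  - TensorFlow / Caffe / Torch
- Underlying hardware
  - Field-programmable gate arrays (FPGA) / general-purpose graphics processing units (GPGPU) / general-purpose CPU clusters
International Research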
Facebook AI Research (FAIR) (Yann LeCun)
MSR Deep Learning Technology Center (DLTC) (Li Deng)
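Machine Learning
Machine learning (ML) refers to machines automatically analyzing data to extract regularities and then using those regularities to make predictions on unseen data.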
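As a rough illustration of extracting a regularity from data and applying it to unseen input, the sketch below fits a straight line by ordinary least squares in plain Python. The helper names `fit_line` and `predict` and the toy data are assumptions made for this example; they do not come from any library listed on this page.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned regularity to an input not seen in training."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data drawn from y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]

model = fit_line(train_x, train_y)
prediction = predict(model, 10.0)  # x = 10 was not in the training set
```

A library such as scikit-learn wraps the same fit-then-predict pattern behind estimator objects; the hand-written version is used here only to keep the example dependency-free.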
Reading Material
- Jordan, M. I., and T. M. Mitchell. "Machine learning: Trends, perspectives, and prospects." Science 349, no. 6245 (2015): 255-260. Machine_Learning_Science_2015
Tools
Python
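Deep Learning
Deep learning is a family of machine learning methods based on representation learning; its algorithms attempt to build high-level abstractions of data using multiple processing layers that have complex structure or are composed of multiple nonlinear transformations.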
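To make "multiple processing layers composed of nonlinear transformations" concrete, here is a minimal forward pass through a two-layer network in plain Python. The weights are hand-picked (an assumption for illustration, not trained values) so that the network computes |x1 - x2|, a function that no single linear layer can represent. The function names `dense`, `relu`, and `forward` are hypothetical.

```python
def dense(inputs, weights, biases):
    """Affine map: one output per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise nonlinearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Apply each (weights, biases) layer, with ReLU after all but the last."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hand-picked weights (hypothetical): the hidden layer computes
# relu(x1 - x2) and relu(x2 - x1); the output layer adds them.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]

out_a = forward([3.0, 1.0], layers)
out_b = forward([1.0, 3.0], layers)
```

Both calls return the same value because the stacked layers encode the absolute difference; in practice the weights would be learned from data rather than set by hand.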
Neural Networks
Deep neural networks (DNN)
Convolutional neural networks (CNN)
History: The rebirth of neural networks, ISCA 2010. Rebirth_NN
Reading Material
Tools
Microsoft
Université de Montréal
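Reinforcement Learning
Reinforcement learning (RL) is an area of machine learning concerned with how an agent should act in an environment so as to maximize its expected reward.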
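One of the simplest settings for "acting in an environment to maximize expected reward" is a multi-armed bandit. The sketch below uses an epsilon-greedy agent in plain Python; the function name `run_bandit`, the reward means, and the noise range are all assumptions chosen for illustration.

```python
import random


def run_bandit(mean_rewards, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with noisy rewards per arm."""
    rng = random.Random(seed)
    counts = [0] * len(mean_rewards)
    estimates = [0.0] * len(mean_rewards)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(len(mean_rewards))
        else:
            # Exploit: pick the arm with the highest estimated reward.
            arm = max(range(len(estimates)), key=estimates.__getitem__)
        # Environment response: the arm's mean plus bounded noise.
        reward = mean_rewards[arm] + rng.uniform(-0.1, 0.1)
        counts[arm] += 1
        # Incremental update of the running average for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, counts, total


# Hypothetical environment: three arms with different mean rewards.
estimates, counts, total = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=estimates.__getitem__)
```

The agent is never told which arm is best; it infers this from the rewards it observes, which is the trial-and-error loop that fuller reinforcement learning methods build on.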
Reading Material
Tools
OpenAI
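Machine Perception
Machine perception covers modalities such as speech, images, video, gestures, and posture.
The sections below focus on deep-learning-based machine perception.
Speech Recognition
Automatic speech recognition (ASR)
Survey of Traditional Methods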
- Karpagavalli, S., and E. Chandra. "A Review on Automatic Speech Recognition Architecture and Approaches." International Journal of Signal Processing, Image Processing and Pattern Recognition 9, no. 4 (2016): 393-404.
Basic Tools
- Long short-term memory neural network (LSTM)
- Long short-term memory, Neural Computation 9 (8), 1735-1780, 1997. LSTM
- Connectionist temporal classification (CTC)
- Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006.
- Gated Recurrent Unit (GRU)
- On the Properties of Neural Machine Translation: Encoder-Decoder Approaches, SSST-8, 2014.
Alex Graves, a researcher at DeepMind, pioneered several speech recognition techniques.
- Towards End-To-End Speech Recognition with Recurrent Neural Networks, ICML 2014.
- Speech recognition with deep recurrent neural networks, 2013.
- Hybrid speech recognition with deep bidirectional LSTM, ASRU 2013.
- Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006.
Google Speech
- Google Speech Processing from Mobile to Farfield, CHiME 2016. Google_Speech_Processing
Computer Vision
Computer vision (CV)
Object Detection
Ross Girshick, a researcher at FAIR, is the originator of the R-CNN algorithm.
- R-CNN (Region-based Convolutional Network method)
- Region based convolutional networks for accurate object detection and segmentation, TPAMI, 2015.
- Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014.
- Fast R-CNN (Fast Region-based Convolutional Network method)
- Fast R-CNN, ICCV 2015.
- Faster R-CNN (Faster Region-based Convolutional Network method)
- Faster R-CNN: Towards real-time object detection with region proposal networks, NIPS, 2015.
- R-CNN (Matlab): https://github.com/rbgirshick/rcnn
- Fast_R-CNN (Python): https://github.com/rbgirshick/fast-rcnn
- Faster_R-CNN (Matlab): https://github.com/ShaoqingRen/faster_rcnn
- Faster_R-CNN (Python): https://github.com/rbgirshick/py-faster-rcnn
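Machine Cognition
Machine cognition covers natural language understanding, reasoning, attention, knowledge, learning, decision-making, interaction, and more.
Technical approach: deep learning + reinforcement learning
Advances in Frontier Applications
Natural Language Understanding
Natural language understanding (NLU) relies on techniques known as natural language processing (NLP).
Intelligent Question Answering
A question answering (QA) system that integrates automatic speech recognition (ASR), computer vision (CV), and natural language processing (NLP).
Related reading: Reasoning in vector space: An exploratory study of question answering, ICLR 2016.
Related course: Lab Inquiry Course: Intelligent Question Answering and Intelligent Systems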