Difference between revisions of "Speech Recognition"
From iCenter Wiki
Revision as of 02:13, 6 March 2017 (Monday)
Speech Recognition
Automatic Speech Recognition, abbreviated ASR
Basic Tools
LSTM
Long short-term memory neural network
- Long short-term memory, Neural Computation 9 (8), 1735-1780, 1997.
CTC
Connectionist temporal classification
- Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006.
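As a small illustration of the labelling idea behind CTC, the sketch below maps a per-frame label path to an output sequence by merging consecutive repeats and then dropping the blank symbol. It covers only this collapsing step, not the CTC loss or any full decoder from the paper above. The function name `ctc_greedy_collapse` and the choice of `0` as the blank index are assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (illustrative choice)


def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Collapse a per-frame label path: merge consecutive repeats, then drop blanks."""
    # groupby yields one key per run of identical consecutive labels
    merged = [label for label, _ in groupby(frame_labels)]
    # Blanks separate genuine repeats, so they are removed only after merging
    return [label for label in merged if label != blank]


if __name__ == "__main__":
    # Per-frame argmax labels from a hypothetical acoustic model
    path = [0, 1, 1, 0, 1, 2, 2, 0]
    print(ctc_greedy_collapse(path))
```

Merging before removing blanks is what lets a path emit the same label twice in a row: the blank between the two runs of `1` keeps them distinct.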
GRU
Gated Recurrent Unit
- On the Properties of Neural Machine Translation: Encoder-Decoder Approaches, SSST-8, 2014.
Research
Survey of Traditional Methods
- Karpagavalli, S., and E. Chandra. "A Review on Automatic Speech Recognition Architecture and Approaches." International Journal of Signal Processing, Image Processing and Pattern Recognition 9, No. 4 (2016): 393-404.
Alex Graves, DeepMind researcher and originator of several speech recognition techniques
- Speech recognition with deep recurrent neural networks, ICASSP 2013.
- Hybrid speech recognition with deep bidirectional LSTM, ASRU 2013.
- Towards End-To-End Speech Recognition with Recurrent Neural Networks, ICML 2014.
- Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, ICML 2006.
Google Speech
- Google Speech Processing from Mobile to Farfield, CHiME 2016. Google_Speech_Processing (http://spandh.dcs.shef.ac.uk/chime_workshop/presentations/CHiME_2016_Bacchiani_keynote.pdf)
- Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, ICASSP 2016.
- Convolutional, long short-term memory, fully connected deep neural networks, ICASSP 2015.
- Context dependent phone models for LSTM RNN acoustic modelling, ICASSP 2015.
- Learning the Speech Front-end With Raw Waveform CLDNNs, InterSpeech 2015.
Baidu
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, ICML 2016 (JMLR W&CP vol. 48).
- Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling