Dario Amodei et al., "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin." JMLR, 2016.
Geoffrey Hinton et al., "Deep neural networks for acoustic modeling in speech recognition." IEEE Signal Processing Magazine 29.6 (2012): 82-97.
Yajie Miao et al., "EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding." ASRU, 2015.