Images
- ImageNet
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR09, 2009.
- ILSVRC 2012 dataset
Large Scale Visual Recognition Challenge 2012 (ILSVRC2012)
- Microsoft COCO dataset
T. Lin, M. Maire, S. Belongie, L. D. Bourdev, R. B. Girshick, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common Objects in Context. CoRR, abs/1405.0312, 2014.
Audio
- Google AudioSet dataset
- TIMIT
DARPA-ISTO, The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT), speech disc cd1-1.1 edition, 1990.
TIMIT Acoustic-Phonetic Continuous Speech Corpus
- Hub5'00
Dataset (English): https://catalog.ldc.upenn.edu/LDC2002T43
- Switchboard
Text
Chinese Linguistic Data Consortium (Chinese LDC): http://www.chineseldc.org/
- Douban movie reviews