Chinese Speech Database - Aishell

2231 views, uploaded 2020-07-24 18:27:19, 0KB

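The AISHELL open-source Mandarin speech corpus AISHELL-ASR0009-OS1 contains 178 hours of recordings and is part of the AISHELL Mandarin speech database AISHELL-ASR0009. The AISHELL-ASR0009 recording scripts cover 11 domains, including smart home, autonomous driving, and industrial production. Recording took place in a quiet indoor environment using three devices simultaneously: a high-fidelity microphone (44.1kHz, 16-bit), an Android phone (16kHz, 16-bit), and an iOS phone (16kHz, 16-bit). The high-fidelity microphone audio was downsampled to 16kHz to produce AISHELL-ASR0009-OS1. 400 speakers from different accent regions of China took part in the recording. After transcription and annotation by professional speech proofreaders and strict quality inspection, the text accuracy of the corpus is above 95%. The corpus is divided into training, development, and test sets.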
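The figures in the description allow a quick back-of-the-envelope check. The sketch below reduces the 44.1kHz to 16kHz conversion to its integer up/down factors and estimates the raw PCM size of 178 hours of 16kHz, 16-bit audio. The mono channel count is an assumption, since the description does not state it, and the function names are illustrative only.

```python
from fractions import Fraction

SOURCE_RATE_HZ = 44_100   # high-fidelity microphone rate given in the description
TARGET_RATE_HZ = 16_000   # sampling rate of the released corpus
BYTES_PER_SAMPLE = 2      # 16-bit samples
HOURS = 178               # stated recording duration
CHANNELS = 1              # assumption: mono audio (not stated in the description)


def resampling_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Return target/source as a reduced fraction (upsample/downsample factors)."""
    return Fraction(target_hz, source_hz)


def raw_pcm_bytes(hours: int, rate_hz: int, bytes_per_sample: int, channels: int) -> int:
    """Size of uncompressed PCM audio in bytes, ignoring file headers."""
    return hours * 3600 * rate_hz * bytes_per_sample * channels


ratio = resampling_ratio(SOURCE_RATE_HZ, TARGET_RATE_HZ)
size = raw_pcm_bytes(HOURS, TARGET_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS)

print(f"upsample by {ratio.numerator}, downsample by {ratio.denominator}")  # 160 and 441
print(f"about {size / 1e9:.1f} GB of raw PCM")  # about 20.5 GB
```

Under the mono assumption the uncompressed audio comes to roughly 20.5 GB, which is in line with a compressed archive in the mid-teens of gigabytes.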


Aishell is an open-source Chinese Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd.

400 people from different accent areas in China were invited to participate in the recording, which was conducted in a quiet indoor environment using a high-fidelity microphone and downsampled to 16kHz. Through professional speech annotation and strict quality inspection, the manual transcription accuracy is above 95%. The data is free for academic use. We hope to provide a moderate amount of data for new researchers in the field of speech recognition.




Aishell

Identifier: SLR33

Summary: Mandarin data, provided by Beijing Shell Shell Technology Co., Ltd.

Category: Speech

License: Apache License v.2.0

Downloads (use a mirror closer to you):
data_aishell.tgz [15G]   ( speech data and transcripts )   Mirrors: [China] 
resource_aishell.tgz [1.2M]   ( supplementary resources, incl. lexicon, speaker info )   Mirrors: [China] 
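Once downloaded, the two archives are ordinary gzip-compressed tarballs. A minimal sketch of unpacking one with the Python standard library follows. The `extract_archive` helper and the local paths in the usage comment are hypothetical, and the internal layout of the archives is not described here, so the function simply reports the member names it finds.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path: str, dest_dir: str) -> list[str]:
    """Extract a gzip-compressed tar archive into dest_dir and return its member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python releases can reject members that would escape dest_dir.
            tar.extractall(dest, filter="data")
        else:
            # Older releases have no extraction filters; only use trusted archives.
            tar.extractall(dest)
    return names


# Hypothetical usage, assuming the archive was saved to the current directory:
# members = extract_archive("resource_aishell.tgz", "aishell")
# print(len(members), "members extracted")
```

Extracting the 15G speech archive this way needs at least as much free disk space again for the unpacked files.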

You can cite the data using the following BibTeX entry:

@inproceedings{aishell_2017,
  title={AIShell-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline},
  author={Hui Bu and Jiayu Du and Xingyu Na and Bengu Wu and Hao Zheng},
  booktitle={Oriental COCOSDA 2017},
  pages={Submitted},
  year={2017}
}

External URL: http://www.aishelltech.com/kysjcp (full description from the company website)


