kaldi-master

Category: Network programming
Development tool: Java
File size: 16535KB
Downloads: 3
Upload date: 2019-06-06 12:39:18
Uploader: 画画1234
Description: The example experiment clusters speech using MFCC (mel-frequency cepstral coefficient) features. It first computes the cluster centers of the training samples with the LBG algorithm, then performs classification. Note: before using the code, you need to change the file paths yourself.
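The description above names the LBG algorithm but does not show it. The following is a minimal sketch of LBG codebook training and nearest-codebook classification, not the uploaded code itself. It assumes that MFCC frames are already available as lists of floats, that one codebook is trained per class, and that a test sample is assigned to the class whose codebook gives the lowest average distortion. The function names (`lbg`, `distortion`, `classify`) and the parameters `eps`, `tol`, and `max_iter` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(train, size, eps=0.01, tol=1e-6, max_iter=100):
    """Train a codebook by LBG splitting; `size` should be a power of two."""
    codebook = [mean(train)]  # start from the global centroid
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign each training vector to its nearest codeword.
            cells = [[] for _ in codebook]
            total = 0.0
            for v in train:
                i = min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
                cells[i].append(v)
                total += sq_dist(v, codebook[i])
            # Move each codeword to the centroid of its cell (keep it if the cell is empty).
            codebook = [mean(cell) if cell else cw for cell, cw in zip(cells, codebook)]
            # Stop when the total distortion no longer improves noticeably.
            if prev - total <= tol * max(total, 1e-12):
                break
            prev = total
    return codebook


def distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, cw) for cw in codebook) for f in frames) / len(frames)


def classify(frames, codebooks):
    """Return the label of the codebook with the lowest average distortion."""
    return min(codebooks, key=lambda label: distortion(frames, codebooks[label]))
```

With a dictionary such as `{"speaker_a": lbg(frames_a, 16), "speaker_b": lbg(frames_b, 16)}`, calling `classify(test_frames, codebooks)` returns the best-matching label. The labels and the codebook size of 16 are illustrative only.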

File list:
.travis.yml (1455, 2019-06-04)
COPYING (17264, 2019-06-04)
INSTALL (258, 2019-06-04)
docker (0, 2019-06-04)
docker\debian9.8-cpu (0, 2019-06-04)
docker\debian9.8-cpu\Dockerfile (814, 2019-06-04)
docker\ubuntu16.04-gpu (0, 2019-06-04)
docker\ubuntu16.04-gpu\Dockerfile (855, 2019-06-04)
egs (0, 2019-06-04)
egs\aidatatang_200zh (0, 2019-06-04)
egs\aidatatang_200zh\s5 (0, 2019-06-04)
egs\aidatatang_200zh\s5\RESULTS (1460, 2019-06-04)
egs\aidatatang_200zh\s5\cmd.sh (865, 2019-06-04)
egs\aidatatang_200zh\s5\conf (0, 2019-06-04)
egs\aidatatang_200zh\s5\conf\cmu2pinyin (195, 2019-06-04)
egs\aidatatang_200zh\s5\conf\decode.config (112, 2019-06-04)
egs\aidatatang_200zh\s5\conf\mfcc.conf (73, 2019-06-04)
egs\aidatatang_200zh\s5\conf\mfcc_hires.conf (624, 2019-06-04)
egs\aidatatang_200zh\s5\conf\online_cmvn.conf (96, 2019-06-04)
egs\aidatatang_200zh\s5\conf\online_pitch.conf (114, 2019-06-04)
egs\aidatatang_200zh\s5\conf\pinyin2cmu (421, 2019-06-04)
egs\aidatatang_200zh\s5\conf\pinyin_initial (50, 2019-06-04)
egs\aidatatang_200zh\s5\conf\pitch.conf (25, 2019-06-04)
egs\aidatatang_200zh\s5\local (0, 2019-06-04)
egs\aidatatang_200zh\s5\local\chain (0, 2019-06-04)
egs\aidatatang_200zh\s5\local\chain\compare_wer.sh (2426, 2019-06-04)
egs\aidatatang_200zh\s5\local\chain\run_tdnn.sh (21, 2019-06-04)
egs\aidatatang_200zh\s5\local\chain\tuning (0, 2019-06-04)
egs\aidatatang_200zh\s5\local\chain\tuning\run_tdnn_1a.sh (7166, 2019-06-04)
... ...

Aidatatang_200zh is a free Chinese Mandarin speech corpus provided by Beijing DataTang Technology Co., Ltd under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License.

**About the aidatatang_200zh corpus:**

- The corpus contains 200 hours of acoustic data, most of it recorded on mobile devices.
- 600 speakers from different accent areas in *** were invited to participate in the recording.
- The transcription accuracy for each sentence is higher than ***%.
- Recordings were made in a quiet indoor environment.
- The database is divided into training, validation, and test sets in a ratio of 7:1:2.
- Detailed information such as speech data coding and speaker information is preserved in the metadata file.
- Segmented transcripts are also provided.

You can get the corpus from [here](https://www.datatang.com/webfront/opensource.html).

DataTang is a community of creators, of world-changers and future-builders. We're invested in collaborating with a diverse set of voices in the AI world, and are excited about working on large-scale projects. Beyond speech, we also provide resources for image and text data. For more details, please visit [datatang]().

**About the recipe:**

To demonstrate that this corpus is a reasonable data resource for Chinese Mandarin speech recognition research, a baseline recipe is provided here so that everyone can explore their own systems easily and quickly.

In this directory, each subdirectory contains the scripts for a sequence of experiments. The recipe in subdirectory "s5" is based on the hkust s5 recipe and the aishell s5 recipe. It generates an integrated phonetic lexicon from the CMU dictionary and the CC-CEDICT dictionary. This recipe follows the Mono+Triphone+SAT+fMLLR+DNN pipeline. In addition, this directory will be extended as scripts for speaker diarization and other tasks are created.
