hmm

Category: Speech synthesis
Development tool: Visual C++
File size: 25KB
Downloads: 83
Upload date: 2010-06-01 00:54:22
Uploader: dawn_89
Description: Hidden Markov model source code, implementing the three most basic HMM algorithms.

File list (path, size in bytes, last modified):
隐马尔可夫模型源代码\hmm-1.03\generate_seq.cc (1611, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\hmm.cc (30348, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\hmm.h (4145, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\Makefile (1167, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\makefile.old (815, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\random.dsp (3399, 2005-09-27)
隐马尔可夫模型源代码\hmm-1.03\random.dsw (537, 2005-09-27)
隐马尔可夫模型源代码\hmm-1.03\random.h (301, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\random.ncb (27648, 2010-05-18)
隐马尔可夫模型源代码\hmm-1.03\random.opt (48640, 2005-09-27)
隐马尔可夫模型源代码\hmm-1.03\random.plg (246, 2005-09-27)
隐马尔可夫模型源代码\hmm-1.03\random.suo (10752, 2010-05-18)
隐马尔可夫模型源代码\hmm-1.03\test.hmm (138, 1995-08-21)
隐马尔可夫模型源代码\hmm-1.03\test_hmm.cc (632, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\train_hmm.cc (1822, 1995-08-22)
隐马尔可夫模型源代码\hmm-1.03\Debug (0, 2005-09-27)
隐马尔可夫模型源代码\hmm-1.03 (0, 2010-05-26)
隐马尔可夫模型源代码 (0, 2005-09-15)

HIDDEN MARKOV MODEL for automatic speech recognition                    7/30/95

This code implements in C++ a basic left-right hidden Markov model and the corresponding Baum-Welch (ML) training algorithm. It is meant as an example of the HMM algorithms described by L. Rabiner (1) and others. Serious students are directed to the sources listed below for a theoretical description of the algorithm. K. F. Lee (4) offers an especially good tutorial on how to build a speech recognition system using hidden Markov models.

Jim and I built this code in order to learn how HMM systems work, and we are now offering it to the net so that others can learn how to use HMMs for speech recognition. Keep in mind that efficiency was not our primary concern when we built this code; ease of understanding was. I expect people to use this code in two different ways. People who wish to build an experimental speech recognition system can use the included "train_hmm" and "test_hmm" programs as black-box components. The code can also be used in conjunction with written tutorials on HMMs to understand how they work.

HOW TO COMPILE IT:

We built this code on a Linux system (8 MB of RAM) and it has been tested under SunOS as well; it should run on any system with GNU C++, and it has been tested for ANSI compliance. To compile and test the program:

1) extract the code:                 tar -xf hmm.tar
2) compile the programs:             make all
3) create test sequences:            generate_seq test.hmm 20 50
4) train using the existing model:   train_hmm test.hmm.seq test.hmm .01
5) train using random parameters:    train_hmm test.hmm.seq 1234 3 3 .01

After steps 4 and 5 you can compare the file test.hmm.seq.hmm with test.hmm to confirm that the program is working.

FILE FORMATS:

There are two types of files used by these programs. The first is the HMM model file, which has the following header:

    states: <number of states>
    symbols: <number of output symbols>

A series of ordered blocks follows the header, each of which is two lines long. Each block corresponds to a state in the model. The first line of each block gives the probability of the state recurring, followed by the probability of generating each of the possible output symbols when it recurs. The second line gives the probability of the state transitioning to the next state, followed by the probability of generating each of the possible output symbols when it transitions. The file "test.hmm" gives an example of this format for a three-state model with three possible output symbols (a sketch of a reader for this layout appears below).

The second kind of file is a list of symbol sequences on which to train or test the model. Symbol sequences are space-separated integers (0 1 2 ...) terminated by a newline ("\n"). Sequences may either all be of the same length or of different lengths; the code detects which case applies and processes each slightly differently. Use the output of step 3 above for an example of a sequence file. A file containing sequences that are all of the same length should train slightly faster.

ASR IN A NUTSHELL:

A complete automatic speech recognition system is likely to include programs that perform the following tasks:

1) convert audio/wave files to sequences of multi-dimensional feature vectors (e.g. DFT, PLP, etc.)
2) quantize the feature vectors into sequences of symbols (e.g. VQ)
3) train a model for each recognition object (i.e. word, phoneme) from the sequences of symbols (e.g. HMM)
4) optionally, constrain the models using grammar information

Most of the above components are readily available as freeware, and building a system from them should not be too difficult. Making it work well, however, could be a major undertaking; the devil is in the details.
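The following is a minimal sketch of a reader for the model-file layout described under FILE FORMATS above: a header giving the state and symbol counts, then two lines per state (recur probability plus per-symbol output probabilities, and transition probability plus per-symbol output probabilities). It is an assumed, illustrative reader, not the parser in hmm.cc; the struct and function names and the exact header tokens are assumptions about the format.

    // Minimal sketch of a reader for the two-lines-per-state model file layout.
    // NOT code from hmm.cc; names (StateParams, read_hmm) and the exact header
    // tokens ("states:", "symbols:") are illustrative assumptions.
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    struct StateParams {
        double recur_prob = 0.0, trans_prob = 0.0;
        std::vector<double> recur_out, trans_out;   // one entry per output symbol
    };

    // Returns one StateParams per state, or an empty vector if parsing fails.
    std::vector<StateParams> read_hmm(const std::string& path) {
        std::ifstream in(path);
        std::string label;
        int n_states = 0, n_symbols = 0;
        if (!(in >> label >> n_states >> label >> n_symbols))   // "states: N" / "symbols: M"
            return {};
        std::vector<StateParams> states(n_states);
        for (StateParams& s : states) {
            s.recur_out.resize(n_symbols);
            s.trans_out.resize(n_symbols);
            in >> s.recur_prob;                      // line 1: recur prob + output probs
            for (double& p : s.recur_out) in >> p;
            in >> s.trans_prob;                      // line 2: transition prob + output probs
            for (double& p : s.trans_out) in >> p;
        }
        return in ? states : std::vector<StateParams>{};
    }

    int main() {
        std::vector<StateParams> model = read_hmm("test.hmm");
        std::cout << "read " << model.size() << " states\n";
        return 0;
    }

Run against the included "test.hmm", this should report three states if the assumed header layout matches the actual file.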
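The model file format above describes an arc-emission, left-right HMM: each state has a self-loop ("recur") arc and a single forward arc, and output symbols are generated on the arcs. As a rough illustration of the kind of likelihood computation the training and testing programs perform internally, here is a minimal, self-contained sketch of the forward algorithm for such a model. It is not taken from hmm.cc; the function and variable names, and the start-in-state-0 / end-in-last-state convention, are assumptions made for the example.

    // Minimal sketch of the forward (likelihood) computation for an arc-emission,
    // left-right HMM.  NOT code from hmm.cc; all names are illustrative.
    #include <algorithm>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // recur_prob[i]   : probability of staying in state i (self-loop arc)
    // recur_out[i][k] : probability of emitting symbol k on the self-loop arc of i
    // trans_prob[i]   : probability of moving from state i to state i+1
    // trans_out[i][k] : probability of emitting symbol k on the i -> i+1 arc
    double forward_probability(const std::vector<double>& recur_prob,
                               const std::vector<std::vector<double>>& recur_out,
                               const std::vector<double>& trans_prob,
                               const std::vector<std::vector<double>>& trans_out,
                               const std::vector<int>& seq)
    {
        const std::size_t n = recur_prob.size();
        std::vector<double> alpha(n, 0.0), next(n, 0.0);
        alpha[0] = 1.0;                     // left-right model: start in state 0
        for (int sym : seq) {
            std::fill(next.begin(), next.end(), 0.0);
            for (std::size_t i = 0; i < n; ++i) {
                // stay in state i and emit sym on the recur arc
                next[i] += alpha[i] * recur_prob[i] * recur_out[i][sym];
                // advance from state i-1 to i and emit sym on the transition arc
                if (i > 0)
                    next[i] += alpha[i - 1] * trans_prob[i - 1] * trans_out[i - 1][sym];
            }
            alpha.swap(next);
        }
        return alpha[n - 1];                // assume the sequence ends in the last state
    }

    int main() {
        // Toy 3-state, 3-symbol model, similar in shape to "test.hmm".
        std::vector<double> recur = {0.5, 0.5, 0.5}, trans = {0.5, 0.5, 0.5};
        std::vector<std::vector<double>> r_out(3, {0.6, 0.2, 0.2});
        std::vector<std::vector<double>> t_out(3, {0.2, 0.6, 0.2});
        std::printf("P(seq 0 1 2) = %g\n",
                    forward_probability(recur, r_out, trans, t_out, {0, 1, 2}));
        return 0;
    }

Baum-Welch training, as implemented in the package, additionally computes the corresponding backward probabilities and uses the two together to re-estimate the arc and output probabilities; the forward pass above is the building block shared by both training and testing.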
FUTURE:

I would like to eventually put together all of the necessary components for a complete speech recognition test bench. I envision something that could be combined with a standard speech database such as the TIMIT data set. Such a test bench would allow researchers to swap in and evaluate their own methods at various stages in the system. Reported results could then be compared against the performance of a standard, non-optimized system that would be publicly available. This way two methods could be compared while controlling for different data sets and pre/post processing. Unfortunately, speech recognition is mostly a sideline to Jim's graduate work in neural networks, and I currently have a job that has taken me away from the field of speech recognition.

If someone uses this code in a complete system, we would appreciate hearing about it. Questions and comments can be directed to: Richard Myers (rmyers@isx.com) and Jim Whitson (whitson@ics.uci.edu)

Bibliography:
-------------
1. L. R. Rabiner, B. H. Juang, "Fundamentals of Speech Recognition." New Jersey: Prentice Hall, c1993.
2. L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proc. of the IEEE, Feb. 1989.
3. L. R. Rabiner, B. H. Juang, "An Introduction to Hidden Markov Models," IEEE ASSP Magazine, Jan. 1986.
4. K. F. Lee, "Automatic Speech Recognition: The Development of the SPHINX System." Boston: Kluwer Academic Publishers, c1989.
