HMMmodel
Category: Speech synthesis
Development tool: C++
File size: 15KB
Downloads: 55
Upload date: 2007-10-10 20:59:28
Uploader: JohnsonCow
File list:
马尔可夫模型\generate_seq.cc (1611, 1995-08-22)
马尔可夫模型\hmm.cc (30348, 1995-08-22)
马尔可夫模型\hmm.h (4145, 1995-08-22)
马尔可夫模型\Makefile (1167, 1995-08-22)
马尔可夫模型\makefile.old (815, 1995-08-22)
马尔可夫模型\random.h (301, 1995-08-22)
马尔可夫模型\test.hmm (138, 1995-08-21)
马尔可夫模型\test_hmm.cc (632, 1995-08-22)
马尔可夫模型\train_hmm.cc (1822, 1995-08-22)
马尔可夫模型 (0, 2007-10-10)
H I D D E N   M A R K O V   M O D E L
for automatic speech recognition
7/30/95
This code implements in C++ a basic left-right hidden Markov model
and corresponding Baum-Welch (ML) training algorithm. It is meant as
an example of the HMM algorithms described by L. Rabiner (1) and
others. Serious students are directed to the sources listed below for
a theoretical description of the algorithm. K. F. Lee (4) offers an
especially good tutorial of how to build a speech recognition system
using hidden Markov models.
Jim and I built this code in order to learn how HMM systems work and
we are now offering it to the net so that others can learn how to use
HMMs for speech recognition. Keep in mind that efficiency was not our
primary concern when we built this code, but ease of understanding
was. I expect people to use this code in two different ways. People
who wish to build an experimental speech recognition system can use
the included "train_hmm" and "test_hmm" programs as black box
components. The code can also be used in conjunction with written
tutorials on HMMs to understand how they work.
HOW TO COMPILE IT:
We built this code on a Linux system (8 MB of RAM) and it has been
tested under SunOS as well; it should run on any system with GNU C++
and has been tested to be ANSI compliant.
To compile and test the program:
  1) extract the code:
       tar -xf hmm.tar
  2) compile the programs:
       make all
  3) create test sequences:
       generate_seq test.hmm 20 50
  4) train using existing model:
       train_hmm test.hmm.seq test.hmm .01
  5) train using random parameters:
       train_hmm test.hmm.seq 1234 3 3 .01
After steps 4 and 5 you can compare the file test.hmm.seq.hmm with
test.hmm to confirm that the program is working.
FILE FORMATS:
There are two types of files used by these programs. The first is
the hmm model file which has the following header:
  states: <number of states>
  symbols: <number of symbols>
A series of ordered blocks follows the header, each of which is two
lines long. Each block corresponds to a state in the model. The
first line of each block gives the probability of the state recurring
(a self-transition), followed by the probability of generating each of
the possible output symbols when it recurs. The second line gives the
probability of the state transitioning to the next state, followed by
the probability of generating each of the possible output symbols when
it transitions.
The file "test.hmm" gives an example of this format for a three state
model with three possible output symbols.
The second kind of file is a list of symbol sequences to train or
test the model on. Each sequence is a line of space-separated integers
(0 1 2...) terminated by a newline ("\n"). Sequences may either all be
of the same length or be of different lengths. The program detects
which case applies and processes each slightly differently. Use the
output of step 3 above for an example of a sequence file. A file
containing sequences which are all of the same length should train
slightly faster.
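The sequence format is simple enough to read in a few lines. The following is a sketch only, not the reader used by train_hmm. The function names readSequences and allSameLength are hypothetical, and skipping blank lines is an assumption.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader: one sequence per line, symbols separated by spaces.
std::vector<std::vector<int>> readSequences(std::istream& in) {
    std::vector<std::vector<int>> sequences;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<int> seq;
        int symbol;
        while (fields >> symbol) seq.push_back(symbol);
        if (!seq.empty()) sequences.push_back(seq);  // assumption: skip blank lines
    }
    return sequences;
}

// True when every sequence has the same length, the case that the text
// above says should train slightly faster.
bool allSameLength(const std::vector<std::vector<int>>& sequences) {
    for (std::size_t i = 1; i < sequences.size(); ++i) {
        if (sequences[i].size() != sequences[0].size()) return false;
    }
    return true;
}
```

A caller could open test.hmm.seq with std::ifstream, pass it to readSequences, and use allSameLength to choose between a fixed-length and a variable-length code path.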
ASR IN A NUTSHELL:
A complete automatic speech recognition system is likely to include
programs that perform the following tasks:
  1) convert audio/wave files to sequences of multi-dimensional
     feature vectors (e.g. DFT, PLP, etc.)
  2) quantize feature vectors into sequences of symbols (e.g. VQ)
  3) train a model for each recognition object (i.e. word or
     phoneme) from the sequences of symbols (e.g. HMM)
  4?) constrain models using grammar information
Most of the above components are readily available as freeware and
building a system from them should not be too difficult. Making it
work well, however, could be a major undertaking; the devil is in the
details.
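As an illustration of task 2 above, the sketch below maps each feature vector to the index of its nearest codebook entry, which is the core of vector quantization. It is not part of this package. The names Vec, quantize and quantizeAll are hypothetical, and the sketch assumes a codebook that has already been trained and whose entries have the same dimension as the feature vectors.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<double>;

// Index of the codebook entry closest to x (squared Euclidean distance).
// Returns -1 if the codebook is empty.
int quantize(const Vec& x, const std::vector<Vec>& codebook) {
    int best = -1;
    double bestDist = std::numeric_limits<double>::infinity();
    for (std::size_t k = 0; k < codebook.size(); ++k) {
        double d = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double diff = x[i] - codebook[k][i];
            d += diff * diff;
        }
        if (d < bestDist) {
            bestDist = d;
            best = static_cast<int>(k);
        }
    }
    return best;
}

// Turn a sequence of feature vectors into a sequence of symbols, i.e. one
// line of the sequence file format described earlier.
std::vector<int> quantizeAll(const std::vector<Vec>& frames,
                             const std::vector<Vec>& codebook) {
    std::vector<int> symbols;
    for (const Vec& f : frames) symbols.push_back(quantize(f, codebook));
    return symbols;
}
```

The number of codebook entries would then be the "symbols" count of the HMM model file.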
FUTURE:
I would like to eventually put together all of the necessary
components for a complete speech recognition test bench. I envision
something that could be combined with a standard speech database such
as the TIMIT data set. Such a test bench would allow researchers to
swap in and evaluate their own methods at various stages in the
system. Reported results could be compared against the performance of
a standard non-optimized system which would be publicly available.
This way two methods could be compared while controlling for different
data sets and pre/post processing.
Unfortunately, speech recognition is mostly a sideline to Jim's
graduate work in neural networks and I currently have a job that has
taken me away from the field of speech recognition. If someone uses
this code in a complete system, we would appreciate hearing about it.
Questions and comments can be directed to:
Richard Myers (rmyers@isx.com) and Jim Whitson (whitson@ics.uci.edu)
Bibliography:
-------------
1. L. R. Rabiner, B. H. Juang, "Fundamentals of Speech Recognition."
New Jersey : Prentice Hall, c1993.
2. L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected
Applications in Speech Recognition," Proc. of the IEEE,
Feb. 1989.
3. L. R. Rabiner, B. H. Juang, "An Introduction to Hidden Markov
Models," IEEE ASSP Magazine, Jan. 1986.
4. K. F. Lee, "Automatic speech recognition : the development of the
SPHINX system." Boston : Kluwer Academic Publishers, c1989.