zaoshengfenli

Category: System design solutions
Development tool: Visual C++
File size: 1069KB
Downloads: 109
Upload date: 2007-03-27 13:22:43
Uploader: fD3hHpbjaq
Description: Computational auditory scene analysis (CASA): a C++ implementation of the noise separation program from DeLiang Wang's paper. A highly authoritative paper and program.

File list:
HuWang.dsp (3401, 2007-03-12)
HuWang.dsw (537, 2007-03-12)
HuWang.exe (163890, 2007-01-25)
HuWang.h (4776, 2007-01-25)
HuWang.ncb (41984, 2007-03-15)
HuWang.opt (48640, 2007-03-15)
HuWang.plg (2265, 2007-03-12)
Hu-Wang.tnn04.pdf (925593, 2007-01-25)
pitch.cpp (11393, 2007-01-25)
pitch.h (5218, 2007-01-25)
segment.cpp (4382, 2007-01-25)
segment.h (3399, 2007-01-25)
tool.cpp (2930, 2007-01-25)
tool.h (439, 2007-01-25)
Untitled.asv (147, 2007-01-25)
v3n3.dat (100883, 2007-01-25)
data.h (1863, 2007-01-25)
feature.cpp (13874, 2007-01-25)
feature.h (4644, 2007-01-25)
group.cpp (5876, 2007-01-25)
group.h (1283, 2007-01-25)
v3n3mask.dat (37152, 2007-01-25)
v3n3res1.dat (288333, 2007-01-25)
v3n3res.dat (288333, 2007-01-25)
HuWang.cpp (4358, 2007-01-25)

This C++ program is a software implementation of the voiced speech segregation algorithm described in detail in the appendix of the following book chapter:

Guoning Hu and DeLiang Wang (2006): "An auditory scene analysis approach to monaural speech segregation," in Topics in Acoustic Echo and Noise Control, edited by E. Hänsler and G. Schmidt. Springer, Heidelberg, pp. 485-515.

This algorithm is a simplified and slightly improved version of the one in their 2004 IEEE Trans. on Neural Networks article.

You may run the program with the following command:

    HuWang input output

"input" is a text file that contains the waveform of the input, i.e., a speech mixture.
"output" is a text file that contains the waveform of the output, i.e., the resynthesized target speech.

If you want to store the final speech stream, i.e., the binary mask for resynthesis (see the chapter for more details), you may run the program with the following command:

    HuWang input output mask

"mask" is a text file that stores the binary mask. Each line of the file corresponds to the time-frequency units in one time frame, from filter channel 1 to filter channel 128.

We have also included some sample files here.
"v3n3.dat": a mixture of a male voice and a "cocktail party" noise.
"v3n3res.dat": the corresponding resynthesized speech.
"v3n3mask.dat": the corresponding mask.

We have included the source code here. This program was developed with Microsoft Visual C++ 6.0. The 'HuWang.exe' file is built with the following settings:

SAMPLING_FREQUENCY: the sampling frequency of the input, which is set to 16000.
MAX_SIG_LENGTH: the maximum number of samples in the input, which is set to 200000 (the actual input can be shorter than this, but not longer).

If you need to change these numbers, specify them in the file "data.h", and then rebuild 'HuWang.exe'.

We distribute this program freely, but please cite the paper if you have made any use of this program.

------------------------------------------------------------
Dr. Guoning Hu
Perception and Neurodynamics Lab.
The Ohio State University
2015 Neil Ave.
Columbus, OH 43210-1277, U.S.A.
Email: hu.117@osu.edu
Phone: 614-292-7402
URL: http://www.cse.ohio-state.edu/~hu
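
The mask layout described above (one line per time frame, filter channels 1 to 128 within each line) can be loaded with a short helper. The following is only a minimal sketch and is not part of the distributed sources: the names kNumChannels, Mask, readMask and keptFraction are hypothetical, and the code assumes that each mask entry is written as the character '0' or '1', optionally separated by whitespace. The README does not spell out the exact encoding, so check a sample file such as "v3n3mask.dat" before relying on it.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Number of filter channels per time frame, as stated in the README.
const std::size_t kNumChannels = 128;

// One row per time frame; each row holds kNumChannels entries
// (1 = time-frequency unit kept, 0 = unit discarded).
typedef std::vector<std::vector<int> > Mask;

// Reads a binary mask from a text stream.
// ASSUMPTION: entries are the characters '0' and '1', optionally
// separated by whitespace. Any other character (spaces, '\r') is ignored.
// Lines that do not yield exactly kNumChannels entries are skipped.
Mask readMask(std::istream& in) {
    Mask mask;
    std::string line;
    while (std::getline(in, line)) {
        std::vector<int> frame;
        for (std::size_t i = 0; i < line.size(); ++i) {
            if (line[i] == '0' || line[i] == '1') {
                frame.push_back(line[i] - '0');
            }
        }
        if (frame.size() == kNumChannels) {
            mask.push_back(frame);
        }
    }
    return mask;
}

// Returns the fraction of time-frequency units that the mask keeps,
// or 0.0 for an empty mask.
double keptFraction(const Mask& mask) {
    std::size_t kept = 0;
    std::size_t total = 0;
    for (std::size_t t = 0; t < mask.size(); ++t) {
        for (std::size_t c = 0; c < mask[t].size(); ++c) {
            kept += static_cast<std::size_t>(mask[t][c]);
            ++total;
        }
    }
    return total == 0 ? 0.0 : static_cast<double>(kept) / total;
}
```

To use it, open the mask file with a std::ifstream, pass the stream to readMask, and read the number of frames from the size of the returned vector. Index [t][c] then refers to frame t and filter channel c + 1.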
