se_relativisticgan-master

Category: Other
Development tool: Python
File size: 75KB
Downloads: 5
Upload date: 2020-10-06 12:57:52
Uploader: jyh351
Description: GAN-based speech enhancement algorithms; also includes the original GAN network for speech enhancement, WGAN, and others.

File list:
GT_16channel_31tap.mat (5880, 2019-09-19)
__pycache__ (0, 2019-09-19)
__pycache__\data_ops.cpython-37.pyc (1955, 2019-09-19)
__pycache__\file_ops.cpython-37.pyc (1063, 2019-09-19)
__pycache__\keras_contrib_backend.cpython-37.pyc (5172, 2019-09-19)
__pycache__\models.cpython-37.pyc (3635, 2019-09-19)
__pycache__\normalizations.cpython-37.pyc (17610, 2019-09-19)
__pycache__\wgan_ops.cpython-37.pyc (992, 2019-09-19)
data_ops.py (1919, 2019-09-19)
download_dataset.sh (1061, 2019-09-19)
file_ops.py (1091, 2019-09-19)
keras_contrib_backend.py (6259, 2019-09-19)
models.py (5871, 2020-06-14)
normalizations.py (25843, 2019-09-19)
prepare_data.py (5076, 2019-09-19)
run_aecnn.py (9062, 2019-09-19)
run_lsgan_se.py (11894, 2019-09-19)
run_rsgan-gp_se.py (13823, 2019-09-19)
run_wgan-gp_se.py (14350, 2019-09-19)
test_wav.txt (10712, 2019-09-19)
train_wav.txt (150436, 2019-09-19)
wgan_ops.py (798, 2019-09-19)

# Keras framework for speech enhancement using relativistic GANs

Uses a fully convolutional end-to-end speech enhancement system. Implementation details of the paper accepted to ICASSP-2019:

**Deepak Baby and Sarah Verhulst, _SERGAN: Speech enhancement using relativistic generative adversarial networks with gradient penalty_, IEEE-ICASSP, pp. 106-110, May 2019, Brighton, UK.**

> This work was funded with support from the EU Horizon 2020 programme under grant agreement No 678120 (RobSpear).

----

### Pre-requisites

1. Install [tensorflow](https://www.tensorflow.org/) and [keras](https://keras.io/)
1. Install [tqdm](https://pypi.org/project/tqdm/) for monitoring the training progress
1. The experiments are conducted on a dataset from Valentini et al., which can be downloaded from [here](https://datashare.is.ed.ac.uk/handle/10283/1942). The following script can be used to download the dataset. *Requires [sox](http://sox.sourceforge.net/) for converting to 16kHz*.

```bash
$ ./download_dataset.sh
```

### Running the model

1. **Prepare data for training and testing the various models**. The folder path may be edited if you keep the database in a different folder. This script needs to be executed only once; all the models read from the same location.

    ```bash
    python prepare_data.py
    ```

2. **Running the models**. The models available in this repository are listed below. Every implementation offers several cGAN configurations. Edit the `opts` variable to choose the configuration. The results are automatically saved to separate folders. The folder name is generated by `file_ops.py` and automatically includes the chosen configuration options.
    1. `run_aecnn.py` : Auto-encoder CNN model with L1 loss term (no discriminator)
    2. `run_lsgan_se.py` : SEGAN with least-squares loss [1]
    3. `run_wgan-gp_se.py` : GAN model with Wasserstein loss and gradient penalty
    4. `run_rsgan-gp_se.py` : GAN model with relativistic standard GAN with gradient penalty
    5. `run_rasgan-gp_se.py` : GAN model with relativistic average standard GAN with gradient penalty
    6. `run_ralsgan-gp_se.py` : GAN model with relativistic average least-squares GAN with gradient penalty

3. **Evaluation on the test set is done together with training**. Set `TEST_SEGAN = False` to disable testing.

----

### Misc

* **This code loads all the data into memory to speed up training**. If you don't have enough memory, it is possible to read the mini-batches from disk using HDF5 reads. In `run_.py`, change the lines

    ```python
    clean_train_data = np.array(fclean['feat_data'])
    noisy_train_data = np.array(fnoisy['feat_data'])
    ```

    to

    ```python
    clean_train_data = fclean['feat_data']
    noisy_train_data = fnoisy['feat_data']
    ```

    **But this can lead to a slow-down of about 20 times (on the test machine)**, as the mini-batches have to be read from disk in every epoch.

----

### References

[1] S. Pascual, A. Bonafonte, and J. Serra, _SEGAN: speech enhancement generative adversarial network_, in INTERSPEECH, ISCA, Aug 2017, pp. 3642-3646.

----

#### Credits

The keras implementation of cGAN is based on the following repos

* [SEGAN](https://github.com/santi-pdp/segan)
* [DCGAN](https://github.com/carpedm20/DCGAN-tensorflow)
* [pix2pix](https://github.com/phillipi/pix2pix)
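
#### Sketch: mini-batch slicing for in-memory and on-disk data

The Misc note on memory use comes down to whether `feat_data` is copied into a NumPy array up front or left as an HDF5 dataset that is sliced on demand. The following is a minimal sketch of a batch loop that works for both cases. It is not the code used in the training scripts: `iter_minibatches` is a hypothetical helper, and the toy arrays only stand in for `fclean['feat_data']` and `fnoisy['feat_data']`, with made-up shapes.

```python
import numpy as np


def iter_minibatches(clean, noisy, batch_size):
    """Yield aligned (clean, noisy) mini-batches as NumPy arrays.

    `clean` and `noisy` may be NumPy arrays or h5py datasets; both support
    len() and contiguous slicing. With an h5py dataset, each slice is read
    from disk only when the loop reaches it.
    """
    n = len(clean)
    if len(noisy) != n:
        raise ValueError("clean and noisy data must have the same length")
    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        # np.asarray is a no-op for in-memory arrays and materialises
        # the slice for lazily read datasets.
        yield np.asarray(clean[start:stop]), np.asarray(noisy[start:stop])


# Toy stand-ins for the real feature arrays (hypothetical shapes and values).
clean_train_data = np.arange(40, dtype=np.float32).reshape(10, 4)
noisy_train_data = clean_train_data + 0.5

batch_sizes = [c.shape[0] for c, _ in iter_minibatches(clean_train_data, noisy_train_data, 4)]
print(batch_sizes)  # [4, 4, 2]
```

Passing the open HDF5 datasets instead of the toy arrays keeps only one mini-batch in memory at a time, at the cost of the disk reads described in the Misc section.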
