wavenet
所属分类:音频处理
开发工具:Python
文件大小:0KB
下载次数:0
上传日期:2018-11-13 10:16:33
上 传 者:
sh-1993
说明: 小波网,,
(wavenet,,)
文件列表:
main.py (11199, 2018-11-13)
runme.sh (426, 2018-11-13)
utils/ (0, 2018-11-13)
utils/config.py (458, 2018-11-13)
utils/datasets.py (3207, 2018-11-13)
utils/models.py (14885, 2018-11-13)
utils/mu_law.py (2239, 2018-11-13)
utils/utilities.py (1544, 2018-11-13)
### WaveNet pytorch implementation
This code implements speech synthesis with WaveNet using pytorch. This code is partly translated from the tensorflow implementation: https://github.com/ibab/tensorflow-wavenet
## Requirements
pytorch: 0.4.0
## Data preparation
VCTK Corpus includes 44,352 speech sentences uttered by 109 native English speakers. Each utterance last for a few seconds. Download VCTK dataset from https://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html. The dataset looks like:
dataset_dir
├── wav48
│ ├── p225
│ │ ├── p225_001.wav
│ │ └── ...
│ ├── p226
│ │ ├── p226_001.wav
│ │ └── ...
│ └── ...
├── txt
│ ├── p225
│ │ ├── p225_001.txt
│ │ └── ...
│ ├── p226
│ │ ├── p226_001.txt
│ │ └── ...
│ └── ...
├── speaker-info.txt
└── ...
## Run
Modify the paths in runme.sh
Run commands in runme.sh line by line.
## Results
## FAQ
This code runs wtih a single GPU card with 12 GB memory. If you are running out of GPU memory, then modify Dataset to shorten the audio clip.
## References
[1] Van Den Oord, Aron, et al. "WaveNet: A generative model for raw audio." SSW. 2016.
[2] Paine, Tom Le, et al. "Fast wavenet generation algorithm." arXiv preprint arXiv:1611.09482 (2016).
近期下载者:
相关文件:
收藏者: