Notes on end-to-end papers and assorted slides in speech/music generation and audio processing, jotted down before I forget

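Lately the number of end-to-end papers in this field seems to have shot up, so I'm noting them down for myself before I forget, along with other papers that look interesting. Honestly, I've had my fill, but there's no swimming against the tide. Most of these are on arXiv, so their reliability is not guaranteed; treat them as reference only. I'll add a one-line comment when I feel like it.
Note: I deliberately left out speech recognition papers.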

Paper

Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
- URL https://arxiv.org/abs/1704.01279
- Blog & Demo NSynth: Neural Audio Synthesis
- Google Brain and DeepMind’s work

Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
- URL https://arxiv.org/abs/1703.10135
- Demo https://google.github.io/tacotron/
- Google’s work, "submitted to Interspeech 2017"

MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation using 1D and 2D Conditions
- URL https://arxiv.org/abs/1703.10847
- Academia Sinica’s work

SEGAN: Speech Enhancement Generative Adversarial Network
- URL https://arxiv.org/abs/1703.09452
- Demo http://veu.talp.cat/segan/
- Code https://github.com/santi-pdp/segan
- an end-to-end speech enhancement method

Raw Waveform-based Speech Enhancement by Fully Convolutional Networks
- URL https://arxiv.org/abs/1703.02205
- an end-to-end speech enhancement method

Deep Voice: Real-time Neural Text-to-Speech
- URL https://arxiv.org/abs/1702.07825
- Demo http://research.baidu.com/deep-voice-production-quality-text-speech-system-constructed-entirely-deep-neural-networks/
- Baidu’s work; an end-to-end speech synthesis method

Char2Wav: End-to-End Speech Synthesis
- URL https://openreview.net/forum?id=B1VWyySKx
- Demo http://josesotelo.com/speechsynthesis/

SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
- URL https://arxiv.org/abs/1612.07837

WaveNet: A Generative Model for Raw Audio
- URL https://arxiv.org/abs/1609.03499

For GANs, here are a few for now. GAN papers, too, keep popping up like bamboo shoots.

Towards Principled Methods for Training Generative Adversarial Networks
- URL https://arxiv.org/abs/1701.04862

Wasserstein GAN
- URL https://arxiv.org/abs/1701.07875

Improved Training of Wasserstein GANs
- URL https://arxiv.org/abs/1704.00028
- Code https://github.com/igul222/improved_wgan_training

Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks
- URL https://arxiv.org/abs/1704.00849
- Demo https://jeremycchsu.github.io/vc-vawgan/

BEGAN: Boundary Equilibrium Generative Adversarial Networks
- URL https://arxiv.org/abs/1703.10717

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
- URL https://arxiv.org/abs/1703.10593

The following are also for reference.

Slide

[DL輪読会] Wasserstein GAN / Towards Principled Methods for Training Generative Adversarial Networks
- DL reading group slides by DeepLearningJP2016, hosted on www.slideshare.net

Generative Model-Based Text-to-Speech Synthesis
- URL https://research.google.com/pubs/pub45882.html
- Video https://www.youtube.com/watch?v=nsrSrYtKkT8

音響分野におけるブラインド適応信号処理の展開 (Developments in Blind Adaptive Signal Processing in Acoustics)
- URL https://www.slideshare.net/kame_hirokazu/kameoka2017-ieice03ver2-73894508

音声信号の分析と加工 ― 音声を自在に変換するには？ (Analysis and Processing of Speech Signals: How Can Speech Be Converted Freely?)
- URL https://drive.google.com/open?id=0B8UaDFgTTWodU0c2N2hFZWV0THc

音声変換技術の進展と課題 (Progress and Challenges in Voice Conversion Technology)
- URL https://drive.google.com/open?id=0B8UaDFgTTWodV3k1TkE3MlpKdmc

Website

Fantastic GANs and where to find them
- URL http://guimperarnau.com/blog/2017/03/Fantastic-GANs-and-where-to-find-them