
gantasmo/Diffusion-GAN-for-Audio-Synthesis

 
 


CS570 Artificial Intelligence and Machine Learning

Repository for the CS570 group project, KAIST, Spring 2023

Collaborators

Sorted by Korean alphabetical order

About the Course

Please refer to the syllabus

About the Project

Diffusion-GAN Model for Audio Synthesis

Report

Overview of architecture

[Figure: architecture diagram]

Demo

The mel-spectrograms generated by our Diffusion-GAN (100 samples) are in diffgan_output.zip. Each mel-spectrogram .png file can be converted into a .wav file with mel2wav.py.
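To convert all 100 samples at once, something like the following sketch could be used. It only relies on the `--input` and `--save` flags of mel2wav.py documented in this README. The function names, the output directory names, and the assumption that the archive contains `.png` files with unique file names are ours and have not been checked against the repository.

```python
import subprocess
import zipfile
from pathlib import Path


def build_commands(png_paths, wav_dir, script="mel2wav.py"):
    """Return one mel2wav.py command line per mel-spectrogram PNG."""
    wav_dir = Path(wav_dir)
    commands = []
    for png in sorted(Path(p) for p in png_paths):
        # Hypothetical naming rule: sample.png -> <wav_dir>/sample.wav
        wav = wav_dir / (png.stem + ".wav")
        commands.append(["python", script, "--input", str(png), "--save", str(wav)])
    return commands


def convert_archive(zip_path="diffgan_output.zip",
                    png_dir="diffgan_output",
                    wav_dir="diffgan_wav"):
    """Extract the archive and run mel2wav.py on every PNG found inside it."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(png_dir)
    Path(wav_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(Path(png_dir).rglob("*.png"), wav_dir):
        # check=True stops at the first file that fails to convert.
        subprocess.run(command, check=True)
```

Calling `convert_archive()` from the repository root (where mel2wav.py lives) would then write one `.wav` per spectrogram into `diffgan_wav/`. If two PNGs in different subfolders share a file name, the later one would overwrite the earlier output.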

About the Code

Pre-processing

Use wav2mel.py to convert audio.wav into mel.png:

$ python wav2mel.py --input audio.wav --save mel.png
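Storing a mel-spectrogram as a PNG means squeezing its values into 8-bit pixels, which is worth keeping in mind when the image is later turned back into audio. This README does not document how wav2mel.py scales its values, so the snippet below is only a sketch of one common convention: clip the log-magnitude to a fixed decibel range and map it linearly to 0..255. The -80 dB to 0 dB range and both function names are assumptions, not the script's actual behaviour.

```python
def db_to_pixel(db, min_db=-80.0, max_db=0.0):
    """Map a decibel value to an 8-bit grey level, clipping to [min_db, max_db]."""
    clipped = min(max(db, min_db), max_db)
    return round((clipped - min_db) / (max_db - min_db) * 255)


def pixel_to_db(pixel, min_db=-80.0, max_db=0.0):
    """Invert db_to_pixel, up to the rounding error of the 8-bit grid."""
    return min_db + pixel / 255 * (max_db - min_db)
```

Under these assumptions, a round trip through the image changes each value by at most half a quantization step (80 / 255 / 2, roughly 0.16 dB), and anything quieter than `min_db` is lost entirely.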

Training

We built on the publicly available source code of DiffWave and Diffusion-GAN for our project.

Our experimental setup is described in our report.

Convert Mel-spectrogram to audio

Use mel2wav.py to convert mel.png into output.wav:

$ python mel2wav.py --input mel.png --save output.wav
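After conversion, a quick sanity check of output.wav can catch empty or truncated files. The helper below is a small sketch using only Python's standard `wave` module. The name `wav_info` is ours, and it assumes mel2wav.py writes an uncompressed PCM WAV file, which this README does not confirm.

```python
import wave


def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration
```

For example, `wav_info("output.wav")` returns the sample rate, channel count, and length in seconds. If the file is stored as floating-point samples rather than PCM, `wave.open` raises `wave.Error`, and a third-party audio reader would be needed instead.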
