CS570 Artificial Intelligence and Machine Learning

Repository for Group Project of CS570, KAIST, 2023 Spring

Collaborators

Listed in Korean alphabetical order

About the Course

Please refer to the syllabus

About the Project

Diffusion-GAN Model for Audio Synthesis

Report

Overview of architecture

Demo

Our Diffusion-GAN mel-spectrogram results (100 samples) are in diffgan_output.zip. Each mel-spectrogram .png file can be converted into a .wav file with mel2wav.py.
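To convert all samples in one pass, a small driver script along the following lines may help. This is only a sketch: it assumes the archive contains .png files (possibly in subfolders) and that mel2wav.py sits in the current directory. The folder names diffgan_output and wav_output and the helper names are hypothetical; only the --input and --save flags come from the usage shown in this README.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def build_command(png_path, wav_dir, script="mel2wav.py"):
    """Build the mel2wav.py command line for one spectrogram image."""
    wav_path = Path(wav_dir) / (Path(png_path).stem + ".wav")
    return [sys.executable, script, "--input", str(png_path), "--save", str(wav_path)]


def convert_archive(archive="diffgan_output.zip",
                    png_dir="diffgan_output",   # hypothetical extraction folder
                    wav_dir="wav_output"):      # hypothetical output folder
    """Extract the archive and run mel2wav.py on every .png found inside it."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(png_dir)
    Path(wav_dir).mkdir(parents=True, exist_ok=True)
    for png_path in sorted(Path(png_dir).rglob("*.png")):
        subprocess.run(build_command(png_path, wav_dir), check=True)


# Only run the conversion when the archive is actually present.
if __name__ == "__main__" and Path("diffgan_output.zip").exists():
    convert_archive()
```

Each output file keeps the name of its source image, so sample_001.png would become wav_output/sample_001.wav.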

About the Code

Pre-processing

Use wav2mel.py to convert audio.wav into mel.png:

$ python wav2mel.py --input audio.wav --save mel.png
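For readers unfamiliar with this step, the following NumPy sketch shows one common way to turn a waveform into an 8-bit log-mel image. It is not the code in wav2mel.py: the function names, the HTK-style mel formula, and every parameter value (sr, n_fft, hop, n_mels, top_db) are illustrative assumptions, and the settings actually used are those in the script and the report.

```python
import numpy as np


def hz_to_mel(hz):
    """HTK-style mel scale (an assumption; other variants exist)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def wav_to_mel_image(signal, sr=22050, n_fft=1024, hop=256, n_mels=80, top_db=80.0):
    """Return a uint8 array of shape (n_mels, frames), ready to save as a grayscale image."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply a Hann window.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2        # (frames, bins)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T        # (n_mels, frames)
    db = 10.0 * np.log10(np.maximum(mel, 1e-10))
    db = np.maximum(db, db.max() - top_db)                   # limit the dynamic range
    scaled = (db - db.min()) / max(db.max() - db.min(), 1e-12)
    return np.round(255.0 * scaled).astype(np.uint8)
```

The returned array could then be written to disk with an image library of your choice. Because the values are rescaled to the range 0 to 255, the absolute signal level is not preserved in the image.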

Training

We used the publicly available source code of DiffWave and Diffusion-GAN for our project.

Our experimental setup is described in our report.

Convert Mel-spectrogram to audio

Use mel2wav.py to convert mel.png into output.wav:

$ python mel2wav.py --input mel.png --save output.wav
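As a rough illustration of what the inverse direction involves, the sketch below maps an 8-bit mel image back to an approximate linear-frequency magnitude spectrogram. It is not the code in mel2wav.py. It assumes the pixel values encode decibels linearly over a top_db range, uses an HTK-style mel filterbank, and picks hypothetical parameter values; a phase-reconstruction step (for example Griffin-Lim) or a vocoder would still be needed to obtain a waveform.

```python
import numpy as np


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular HTK-style mel filters, shape (n_mels, n_fft // 2 + 1)."""
    def to_mel(hz):
        return 2595.0 * np.log10(1.0 + hz / 700.0)

    def to_hz(mel):
        return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = to_hz(np.linspace(to_mel(0.0), to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        tri = np.minimum((fft_freqs - lo) / (mid - lo), (hi - fft_freqs) / (hi - mid))
        bank[i] = np.clip(tri, 0.0, None)
    return bank


def mel_image_to_magnitude(img, sr=22050, n_fft=1024, top_db=80.0):
    """Approximate a linear magnitude spectrogram from a uint8 mel image."""
    img = np.asarray(img, dtype=float)
    # Assumed encoding: pixel 255 is 0 dB (the peak), pixel 0 is -top_db.
    db = (img / 255.0 - 1.0) * top_db
    mel_power = 10.0 ** (db / 10.0)
    bank = mel_filterbank(sr, n_fft, img.shape[0])
    # The mel projection is not invertible; the pseudo-inverse gives a
    # least-squares estimate, clipped because power cannot be negative.
    linear_power = np.maximum(np.linalg.pinv(bank) @ mel_power, 0.0)
    return np.sqrt(linear_power)                 # (n_fft // 2 + 1, frames)
```

Since the image stores only relative levels, the recovered magnitudes are scaled relative to the loudest pixel rather than to the original recording.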

