I used the pretrained checkpoints (64md_8k) for the sc09 dataset and generated samples as recommended. I used the following to load the samples and listen to them:
import scipy.io
import IPython.display as ipd

fname = 'commands_listen.mat'
mat = scipy.io.loadmat(fname)  # load the generated samples
sr = 22050  # sample rate
ipd.Audio(mat['reconstructed'][0, :], rate=sr)  # play the first sample (a NumPy array)
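In case it helps narrow things down, here is a minimal sketch of a sanity check I can run on a loaded waveform before playing it. The helper `summarize_waveform` is my own hypothetical name, not something from the repository, and the 16 kHz rate is only an assumption (Speech Commands, which SC09 is drawn from, is distributed at 16 kHz); I do not know what rate the checkpoint actually generates at. The demo uses a synthetic tone as a stand-in for `mat['reconstructed'][0, :]`.

```python
import numpy as np

def summarize_waveform(x, sr):
    """Return basic statistics for a 1-D waveform (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64).ravel()
    return {
        "num_samples": int(x.size),
        "duration_s": x.size / sr,  # playback length at the given rate
        "peak": float(np.max(np.abs(x))) if x.size else 0.0,
        "has_nan": bool(np.isnan(x).any()),
    }

# Synthetic stand-in for one generated sample: one second of a 440 Hz tone.
sr_assumed = 16000  # assumption, not confirmed for this checkpoint
t = np.arange(sr_assumed) / sr_assumed
demo = 0.5 * np.sin(2 * np.pi * 440 * t)

stats = summarize_waveform(demo, sr_assumed)
print(stats)
```

If the samples really are 16 kHz, playing them with `rate=22050` would make them shorter and higher pitched, which might contribute to what I am hearing, but that is a guess on my part.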
- I find that most samples are unintelligible, though I can make out some sounds here and there. Is that normal?
- Out of curiosity, are the examples you present on the website cherry-picked?
- Am I doing something wrong in generating the samples?