Problem with generated audio from pre-trained checkpoints #11

@swamiviv

I used the pretrained checkpoints (64md_8k) for the sc09 dataset and generated samples as recommended. I used the following to load them and listen to them:

import scipy.io
import IPython.display as ipd

fname = 'commands_listen.mat'
mat = scipy.io.loadmat(fname)  # load the generated samples from the .mat file
sr = 22050  # playback sample rate in Hz
ipd.Audio(mat['reconstructed'][0, :], rate=sr)  # play the first generated sample

  1. Most of the samples are unintelligible, although I can make out some sounds here and there. Is that normal?
  2. Out of curiosity, are the examples presented on the website cherry-picked?
  3. Am I doing something wrong when generating the samples?
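
One thing that may be worth ruling out on the playback side is the sample rate. The sketch below only summarizes a clip at a few candidate rates so that durations and amplitudes can be compared. It makes several assumptions: `summarize_clip` is a hypothetical helper, not part of the repository; the synthetic sine wave stands in for `mat['reconstructed'][0, :]`; and 16000 Hz is listed as a candidate only because the Speech Commands recordings that sc09 is drawn from are distributed at 16 kHz. Whether this checkpoint actually expects that rate is not confirmed here.

```python
import numpy as np


def summarize_clip(samples, sample_rate):
    """Return basic statistics that help sanity-check a generated clip."""
    x = np.asarray(samples, dtype=np.float64).ravel()
    return {
        "num_samples": int(x.size),
        # Duration implied by playing the clip at this sample rate.
        "duration_s": x.size / float(sample_rate),
        # Largest absolute amplitude (0.0 for an empty clip).
        "peak": float(np.max(np.abs(x))) if x.size else 0.0,
        # NaNs in the output usually point at a generation problem.
        "has_nan": bool(np.isnan(x).any()),
    }


# Synthetic stand-in for mat['reconstructed'][0, :]; the length is arbitrary.
clip = 0.1 * np.sin(2 * np.pi * 440.0 * np.arange(16384) / 16000.0)

# Compare the implied duration at two candidate playback rates.
for sr in (16000, 22050):
    print(sr, summarize_clip(clip, sr))
```

If the clip was generated at a lower rate than the one passed to `ipd.Audio`, it plays back faster and at a higher pitch, which can make speech hard to understand even when the generated sample itself is fine.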
