MultiModal Emotion Recognition using Cross modal Interaction module and multiloss
- Data: KEMDy19
- Modality: Audio, Text
- Linux
- Python 3.7+
- PyTorch 1.11.0 or higher and CUDA
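Once the environment is set up, the requirements listed above can be checked from Python. The following is a minimal sketch and not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the thresholds simply mirror the list above (Python 3.7+, PyTorch 1.11.0 or higher, CUDA available).

```python
import re
import sys


def parse_version(version):
    """Turn a string such as '1.12.1+cu113' into (1, 12, 1).

    Local build suffixes after '+' or '-' are ignored, and the result is
    padded with zeros to at least three components.
    """
    core = re.split(r"[+\-]", version, maxsplit=1)[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


def check_environment():
    """Report whether Python, PyTorch, and CUDA match the requirements."""
    report = {"python_ok": sys.version_info >= (3, 7)}
    try:
        import torch  # optional here so the check also runs before PyTorch is installed
    except ImportError:
        report["torch_ok"] = False
        report["cuda_ok"] = False
    else:
        report["torch_ok"] = meets_minimum(torch.__version__, "1.11.0")
        report["cuda_ok"] = torch.cuda.is_available()
    return report
```

Calling `check_environment()` returns a dictionary with the keys `python_ok`, `torch_ok`, and `cuda_ok`; any `False` entry points to the requirement that still needs attention.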
a. Create a conda virtual environment and activate it.
conda create -n MER python=3.7
conda activate MER
b. Install PyTorch and torchvision following the official instructions.
c. Clone this repository.
git clone https://github.com/Mirai-Gadget-Lab/Multimodal_Emotion_Recognition
cd Multimodal_Emotion_Recognition
d. Install requirements.
pip install -r requirements.txt
a. Prepare data
- root_path: path to the original KEMDy19 dataset, e.g. /home/ubuntu/data/KEMD_19/
- save_path: output folder (default: ./data/)
python preprocess.py --root_path your_KEMD_19_path --save_path ./data/
Here is the preprocessing flow chart.
Note that wav_length clipping is performed in train_hf.sh or inference.py.
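To illustrate the clipping step mentioned in the note, here is a minimal sketch. It assumes that "wav_length clipping" means truncating each waveform to a maximum duration. The function name `clip_waveform` and its parameters are hypothetical, and the repository's scripts may implement this step differently.

```python
def clip_waveform(waveform, sample_rate, max_seconds):
    """Truncate a 1-D waveform to at most `max_seconds` of audio.

    `waveform` can be any sliceable sequence of samples (a list, a NumPy
    array, or a 1-D tensor). Shorter waveforms are returned unchanged.
    """
    if sample_rate <= 0 or max_seconds <= 0:
        raise ValueError("sample_rate and max_seconds must be positive")
    max_samples = int(sample_rate * max_seconds)
    return waveform[:max_samples]
```

For example, with a 16 kHz sample rate and a 2-second limit, a 3-second clip of 48000 samples is cut to 32000 samples.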
b. Set config
Edit config.py to match your environment.
However, the default config settings are recommended.
Run the training code
bash train_hf.sh
Check your GPU, and change train_hf.sh accordingly.
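Before editing train_hf.sh, it can help to see how many GPUs PyTorch detects and what a matching `CUDA_VISIBLE_DEVICES` value would look like. The following is a minimal sketch: the helpers `count_gpus` and `visible_devices_value` are hypothetical and not part of the repository, and the sketch makes no assumption about which lines of train_hf.sh need to change.

```python
def count_gpus():
    """Return the number of CUDA devices PyTorch can see (0 if PyTorch is missing)."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def visible_devices_value(gpu_ids):
    """Format GPU indices as a CUDA_VISIBLE_DEVICES value, e.g. [1, 0] gives '0,1'."""
    ids = sorted(set(int(i) for i in gpu_ids))
    if any(i < 0 for i in ids):
        raise ValueError("GPU indices must be non-negative")
    return ",".join(str(i) for i in ids)
```

For instance, `visible_devices_value(range(count_gpus()))` yields a value that exposes every detected GPU.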
If you trained the model yourself using the code above, run the command below.
CUDA_VISIBLE_DEVICES=0 python inference.py --model_save_path ./models_zoo/checkpoint/
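The same inference command can also be launched from Python, for example when evaluating several checkpoints in a loop. The following is a minimal sketch: `build_inference_call` is a hypothetical helper, not part of the repository, and it only uses the `--model_save_path` flag and the `CUDA_VISIBLE_DEVICES` variable shown in the command above.

```python
import os
import sys


def build_inference_call(model_save_path, gpu_id=0, python_exe=None):
    """Return (argv, env) equivalent to the shell command above.

    `python_exe` defaults to the interpreter running this script.
    """
    argv = [
        python_exe or sys.executable,
        "inference.py",
        "--model_save_path",
        model_save_path,
    ]
    # Copy the current environment and restrict the visible GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    return argv, env


# Usage, run from the repository root:
#   import subprocess
#   argv, env = build_inference_call("./models_zoo/checkpoint/")
#   subprocess.run(argv, env=env, check=True)
```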
