HassanHayat08/Modeling-Subjective-Affect-Annotations-with-Multi-Task-Learning

Modeling Subjective Affect Annotations with Multi-Task Learning

(Code Licensed under the MIT License)

For further details, please refer to the paper:
Hassan Hayat, Carles Ventura, Agata Lapedriza, “Modeling Subjective Affect Annotations with Multi-Task Learning”, Sensors, 2022, 22(14):5245. DOI: 10.3390/s22145245

This package was developed by Mr. Hassan Hayat (hhassan0@uoc.edu). For any inquiries regarding this package, please feel free to contact him. The package is provided free for academic use and is distributed at your own risk.


Operating System

  • Ubuntu Linux

Requirements

  • Python: 3.x.x
  • GPU: With CUDA support
  • Audio Features: VGGish
  • Visual Features: I3D Model
  • Textual Features: BERT-Base
  • TensorFlow: 1.14

Datasets


Dataset Preprocessing

COGNIMUSE Dataset

  • Movie Clips Creation:
    The COGNIMUSE dataset provides subtitle information with timestamps that indicate the start and end frames where a subtitle appears. These timestamps are used to create movie clips.

  • Visual Feature Extraction:
    An I3D model is used to extract visual features. The output from the 'Mixed-5c' layer of the model represents the visual features of each frame. Prior to feeding the frames into the model, each frame is resized to 224x224 pixels.
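The clip-creation step above can be sketched as follows. This is a minimal illustration, not the code used in this package: it assumes SRT-style timestamps (`HH:MM:SS,mmm`) and a fixed frame rate of 25 fps, neither of which is stated in this README, and the function names `timestamp_to_ms` and `clip_frame_range` are hypothetical.

```python
import re

# Assumption: timestamps look like "00:01:02,500" (SRT style).
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def timestamp_to_ms(stamp):
    """Convert an SRT-style timestamp string to integer milliseconds."""
    match = _TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def clip_frame_range(start, end, fps=25.0):
    """Return (first_frame, last_frame) for the clip covering one subtitle.

    The frame rate is an assumed value; use the rate of the actual video.
    """
    first = int(timestamp_to_ms(start) * fps / 1000)
    last = int(timestamp_to_ms(end) * fps / 1000)
    if last < first:
        raise ValueError("subtitle ends before it starts")
    return first, last


# Example: one subtitle shown from 1:02.5 to 1:04.0
first, last = clip_frame_range("00:01:02,500", "00:01:04,000")
```

The frames in each returned range would then be resized to 224x224 pixels before being passed to the I3D model.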

IEMOCAP Dataset

  • Audio Feature Extraction:
    The dataset’s dyadic conversations are divided into five sessions. A pre-trained VGGish model is used to extract audio features for each utterance. The audio representation is taken from the output of the last convolutional layer. Please refer to the paper for further details.

SemEval_2007 Dataset

  • Textual Feature Extraction:
    The BERT-Base model is employed to obtain textual semantics for each news headline. The representation is derived from the final layer (i.e., the 12th layer) of the model. Please refer to the paper for more details.

Setup Instructions

Single-Task (ST) Learning using Multi-Modalities

To train, validate, and test with multiple modalities in a single-task setting, run:

./st_main.py

Multi-Task (MT) Learning using Multi-Modalities

To train, validate, and test with multiple modalities in a multi-task setting, run:

./mt_main.py

About

We compare two generic deep learning architectures: a Single-Task (ST) architecture and a Multi-Task (MT) architecture. The ST architecture models a single emotional perception at a time, whereas the MT architecture jointly models every individual emotional perception and the aggregated emotional perception at once.
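To make the difference concrete, the sketch below builds the training targets each setting would see for one sample. It is an illustration under stated assumptions, not the code in `st_main.py` or `mt_main.py`: labels are assumed to be categorical, the aggregated perception is assumed to be a majority vote over annotators (the paper may aggregate differently), and the names `majority_label`, `mt_targets`, and `st_targets` are hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Most frequent label; ties are broken alphabetically (an assumption)."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels given")
    return min(counts, key=lambda label: (-counts[label], label))


def mt_targets(annotations):
    """MT setting: one sample carries every annotator's label plus the aggregate."""
    targets = dict(annotations)
    targets["aggregated"] = majority_label(annotations.values())
    return targets


def st_targets(annotations):
    """ST setting: each perception becomes its own single-output problem."""
    return [{name: label} for name, label in mt_targets(annotations).items()]


# Hypothetical annotations of one utterance by three annotators.
sample = {"annotator_1": "joy", "annotator_2": "joy", "annotator_3": "sadness"}
joint = mt_targets(sample)      # one dict holding four targets, learned jointly
separate = st_targets(sample)   # four dicts, one per separately trained model
```

In this framing an MT model has one output per entry of `joint` and is trained on all of them at once, while the ST setting trains a separate model for each entry of `separate`.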
