A hybrid deep learning model to predict voxel-wise fMRI brain activation from EEG signals, trained on simultaneous EEG-fMRI recordings.
This project implements a comprehensive pipeline for training a neural network to predict brain activation (fMRI) from EEG signals. The model combines:
- Feature Extraction: Spectral features (bandpower) and Common Spatial Patterns (CSP) from EEG
- Multi-lag Learning: Captures temporal dependencies at multiple time lags to account for hemodynamic response
- Hybrid Architecture: Dense feature pathway for non-linear feature refinement + CNN spatial pathway + decoder for voxel-wise predictions
- Complete Preprocessing: ICA-based artifact removal for EEG, motion correction & registration for fMRI
Simultaneous EEG-fMRI Dataset (Gallego-Rudolf et al., 2020)
- Subjects: 20 healthy volunteers (all male, ages 22–35)
- EEG: 32 channels at 1000 Hz, recorded during fMRI acquisition
- fMRI: 3T GE scanner, TR = 2 s, 4 mm slices
- Conditions: Eyes-closed/open resting state + eyes-open/closed alternating task
- Location: Simultaneous EEG-fMRI/BIDS_dataset_EEG/ and BIDS_dataset_MRI/
ne422_final_project/
├── config.py # Global configuration & hyperparameters
├── data_loading.py # BIDS data loader utilities
├── main.py # Main pipeline script
├── preprocessing/
│ ├── eeg_preprocessing.py # EEG filtering, ICA, epoching
│ └── fmri_preprocessing.py # fMRI motion correction, registration
├── features/
│ └── feature_extraction.py # Spectral & CSP feature extraction
├── models/
│ └── eeg_to_fmri_model.py # Hybrid CNN model architecture
├── training/
│ ├── train.py # Dataset class & training loop
│ └── evaluate.py # Evaluation metrics & visualization
├── utils/
│ ├── metrics.py # Evaluation metrics (MSE, correlation, R²)
│ └── visualization.py # Plotting utilities
├── requirements.txt # Python dependencies
├── data/
│ ├── preprocessed/ # Preprocessed EEG & fMRI
│ └── features/ # Extracted features
├── checkpoints/ # Saved model checkpoints
└── results/ # Results & evaluation outputs
pip install -r requirements.txt
Key dependencies:
- mne: EEG processing
- nibabel: fMRI (NIfTI) loading
- torch: Deep learning
- scikit-learn: ML utilities
- nilearn: fMRI analysis
python -c "from data_loading import BIDSDataLoader; loader = BIDSDataLoader(); print(loader.list_available_runs())"
python main.py --stage all
# 1. Preprocess EEG and fMRI
python main.py --stage preprocess
# 2. Extract features from preprocessed data
python main.py --stage features
# 3. Train model
python main.py --stage train
# 4. Evaluate on test subjects
python main.py --stage evaluate
python main.py --data-check
- Multi-lag EEG features: (batch × n_lags × features_per_lag)
- Features: 32 channels × 5 frequency bands + CSP components
- Lags: 0, -2s, -4s, -6s, -8s (account for hemodynamic delay)
- Dense layers: 830 → 128 → 64 → 32
- ReLU activation + Dropout(0.2)
- Non-linear feature refinement
- 2D Convolutions: 1 → 16 → 32 → 64 channels
- Learned spatial feature maps
- Weighted aggregation of lag-specific features
- Learnable fusion layer
- Transpose convolutions: 64 → 32 → 16 → 8 → 1
- Upsamples to full brain voxel grid
- 3D activation map: 91 × 109 × 91 (MNI152 space)
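The dimensions listed above can be checked with a short sketch. It assumes the 830 inputs are 5 lags × 166 features per lag, which implies 6 CSP features per lag on top of the 160 spectral ones; that count is inferred from the arithmetic, not stated in this README. The `stack_lags` helper is hypothetical and only illustrates one way lagged features could be assembled (the project's own `create_multi_lag_features` may differ, for example in how it treats the first few epochs).

```python
import numpy as np

N_CHANNELS, N_BANDS, N_LAGS = 32, 5, 5
N_SPECTRAL = N_CHANNELS * N_BANDS                 # 160 bandpower features
N_INPUT = 830                                     # first dense layer width
FEATURES_PER_LAG = N_INPUT // N_LAGS              # 166
N_CSP_FEATURES = FEATURES_PER_LAG - N_SPECTRAL    # 6 (inferred, see lead-in)
N_OUTPUT_VOXELS = 91 * 109 * 91                   # MNI152 grid size


def stack_lags(features, lags_in_trs=(0, 1, 2, 3, 4)):
    """Stack each epoch with the epochs that precede it.

    features: (n_epochs, n_features), one row per fMRI volume (TR).
    lags_in_trs: how many TRs back to look; with TR = 2 s these
        correspond to 0, -2 s, -4 s, -6 s, -8 s.
    Returns (n_epochs - max_lag, n_lags, n_features). The first max_lag
    epochs are dropped because they lack a full history (an assumption).
    """
    max_lag = max(lags_in_trs)
    n_epochs = features.shape[0]
    lagged = [features[max_lag - lag : n_epochs - lag] for lag in lags_in_trs]
    return np.stack(lagged, axis=1)


features = np.random.default_rng(0).normal(size=(10, FEATURES_PER_LAG))
x = stack_lags(features)
print(x.shape)                       # (batch, n_lags, features_per_lag)
print(x.reshape(len(x), -1).shape)   # flattened input for the dense pathway
print(N_OUTPUT_VOXELS)               # voxels in the output grid
```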
Edit config.py to modify:
# Dataset
N_SUBJECTS = 20
EEG_SAMPLING_RATE = 1000 # Hz
FMRI_TR = 2.0 # seconds
# EEG preprocessing
EEG_HIGHPASS = 0.5 # Hz
EEG_LOWPASS = 100 # Hz
N_ICA_COMPONENTS = 32
# Features
N_SPECTRAL_FEATURES = 160 # 32 channels × 5 bands
N_CSP_COMPONENTS = 3
LAGS = [0, -2, -4, -6, -8] # seconds relative to the fMRI volume (0 to 4 TRs back at TR = 2 s)
# Training
BATCH_SIZE = 16
LEARNING_RATE = 1e-3
MAX_EPOCHS = 100
EARLY_STOPPING_PATIENCE = 15
# Model
DENSE_HIDDEN_DIMS = [128, 64, 32]
CNN_CHANNELS = [16, 32, 64]
SMOOTHING_FWHM = 5.0 # mm
- Notch filtering (60 Hz + harmonics)
- Bandpass filtering (0.5–100 Hz)
- Bad channel detection (variance + correlation-based)
- ICA artifact removal (ECG, blink, BCG components)
- Downsampling to 250 Hz
- Epoching to match fMRI TRs
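The last step, epoching to match fMRI TRs, can be sketched as follows. This is a minimal illustration that assumes the EEG has already been downsampled to 250 Hz and cropped so that sample 0 coincides with the first fMRI volume. The `epoch_to_trs` function is hypothetical and is not part of `eeg_preprocessing.py`.

```python
import numpy as np


def epoch_to_trs(eeg, sfreq=250.0, tr=2.0, n_volumes=None):
    """Cut continuous EEG into back-to-back epochs, one per fMRI volume.

    eeg: (n_channels, n_samples), aligned to the first volume onset.
    Returns (n_epochs, n_channels, samples_per_tr).
    """
    samples_per_tr = int(round(sfreq * tr))        # 500 samples at 250 Hz, TR = 2 s
    n_epochs = eeg.shape[1] // samples_per_tr      # drop any trailing partial epoch
    if n_volumes is not None:
        n_epochs = min(n_epochs, n_volumes)        # never exceed the fMRI run length
    trimmed = eeg[:, : n_epochs * samples_per_tr]
    epochs = trimmed.reshape(eeg.shape[0], n_epochs, samples_per_tr)
    return epochs.transpose(1, 0, 2)


eeg = np.random.default_rng(0).normal(size=(32, 1250))   # 5 s of 32-channel EEG
epochs = epoch_to_trs(eeg)
print(epochs.shape)   # two full 2 s epochs; the last 1 s is discarded
```

Each resulting epoch then yields one feature vector, so the feature matrix has one row per fMRI volume.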
- Motion estimation & correction
- Framewise displacement (FD) outlier detection
- 24-parameter motion regression
- Gaussian smoothing (5 mm FWHM)
- z-score normalization
- Brain mask generation
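Two of the fMRI steps above, framewise displacement and the 24-parameter motion model, reduce to a few lines of array arithmetic. The sketch below assumes six realignment parameters per volume ordered as three translations in mm followed by three rotations in radians (tools differ on this ordering), uses the common 50 mm head-radius convention for FD, and builds the 24 regressors as the parameters, their one-volume-lagged copies, and the squares of both. Function names are hypothetical, and the FD threshold used by this project is not stated here, so none is applied.

```python
import numpy as np


def framewise_displacement(motion, head_radius_mm=50.0):
    """FD per volume: sum of absolute changes in the six motion parameters.

    motion: (n_volumes, 6) = [tx, ty, tz (mm), rx, ry, rz (rad)].
    Rotations are converted to mm of arc on a sphere of the given radius.
    The first volume has no predecessor, so its FD is set to 0.
    """
    motion = np.asarray(motion, dtype=float)
    diffs = np.abs(np.diff(motion, axis=0))
    diffs[:, 3:] *= head_radius_mm
    return np.concatenate([[0.0], diffs.sum(axis=1)])


def expand_24(motion):
    """24 nuisance regressors: R_t, R_(t-1), R_t^2, R_(t-1)^2."""
    motion = np.asarray(motion, dtype=float)
    prev = np.vstack([np.zeros((1, motion.shape[1])), motion[:-1]])
    return np.hstack([motion, prev, motion**2, prev**2])


motion = np.zeros((4, 6))
motion[2, 0] = 1.0     # 1 mm shift in x at volume 2
motion[3, 3] = 0.01    # 0.01 rad rotation at volume 3
print(framewise_displacement(motion))
print(expand_24(motion).shape)
```

Volumes whose FD exceeds a chosen threshold can then be flagged as outliers, and the 24 columns regressed out of each voxel time series before smoothing and z-scoring.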
- Spectral: Welch's PSD in delta/theta/alpha/beta/gamma bands
- CSP: Common Spatial Patterns for eyes-closed vs. eyes-open discrimination
- Multi-lag: Separate features at t, t-2s, t-4s, t-6s, t-8s to capture temporal patterns
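As an illustration of the spectral features, the sketch below computes Welch band power for one epoch using only NumPy. The band edges, segment length, and 50% overlap are assumptions chosen for the example, since this README does not list them, and `welch_psd` / `bandpower` are hypothetical names rather than functions from `feature_extraction.py`.

```python
import numpy as np

# Assumed band edges in Hz; the project's actual values may differ.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}


def welch_psd(x, sfreq, nperseg=250):
    """Welch PSD: average Hann-windowed periodograms of 50%-overlapping segments.

    x: (n_channels, n_samples). Returns freqs (n_freqs,), psd (n_channels, n_freqs).
    """
    window = np.hanning(nperseg)
    starts = range(0, x.shape[-1] - nperseg + 1, nperseg // 2)
    segs = np.stack([x[..., s : s + nperseg] for s in starts], axis=-2)
    segs = segs - segs.mean(axis=-1, keepdims=True)      # remove per-segment mean
    spec = np.abs(np.fft.rfft(segs * window, axis=-1)) ** 2
    spec /= sfreq * np.sum(window**2)                    # density scaling
    if nperseg % 2 == 0:
        spec[..., 1:-1] *= 2                             # one-sided: keep DC and Nyquist
    else:
        spec[..., 1:] *= 2
    return np.fft.rfftfreq(nperseg, 1.0 / sfreq), spec.mean(axis=-2)


def bandpower(epoch, sfreq=250.0):
    """Return a flat (n_channels * n_bands,) vector of band powers."""
    freqs, psd = welch_psd(epoch, sfreq)
    df = freqs[1] - freqs[0]
    feats = [psd[:, (freqs >= lo) & (freqs < hi)].sum(axis=1) * df
             for lo, hi in BANDS.values()]
    return np.stack(feats, axis=1).ravel()


t = np.arange(500) / 250.0
epoch = np.tile(np.sin(2 * np.pi * 10 * t), (32, 1))    # pure 10 Hz on 32 channels
features = bandpower(epoch)
print(features.shape)                                    # 32 channels x 5 bands
print(list(BANDS)[features.reshape(32, 5)[0].argmax()])  # band with the most power
```

With 32 channels and five bands this yields 160 values per epoch, matching N_SPECTRAL_FEATURES in config.py.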
- Adam optimizer with learning rate scheduling
- Early stopping (patience=15)
- L1 + L2 regularization
- Spatial smoothness (TV loss) for realistic activation patterns
- Data augmentation (time shifts, feature jitter)
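Two of the training components above are easy to show in isolation. The sketch below gives one possible form of the spatial smoothness (TV) penalty, the anisotropic variant that averages absolute differences between neighbouring voxels, and a small early-stopping counter with the stated patience of 15. Both are illustrative stand-ins written in NumPy and plain Python; the project's own loss in `train.py` operates on torch tensors and may be defined differently.

```python
import numpy as np


def total_variation(volume):
    """Anisotropic TV: mean absolute neighbour difference, summed over axes."""
    return sum(np.abs(np.diff(volume, axis=ax)).mean() for ax in range(volume.ndim))


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=15, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta      # hypothetical tolerance, not from config.py
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


flat = np.ones((4, 4, 4))
ramp = np.arange(4.0)[:, None, None] * np.ones((4, 4, 4))
print(total_variation(flat), total_variation(ramp))   # smooth map vs. unit gradient

stopper = EarlyStopping(patience=3)
for epoch, loss in enumerate([1.0, 0.9, 0.95, 0.96, 0.97]):
    if stopper.step(loss):
        print("stop at epoch", epoch)
        break
```

In training, the TV term would be added to the reconstruction loss with a small weight so that predicted maps vary smoothly across neighbouring voxels.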
- Voxel-wise MSE, correlation, R²
- Regional aggregation metrics (ROI-based)
- Temporal correlation analysis
- Leave-one-subject-out (LOSO) cross-validation
- Anatomical visualization (predicted vs. actual activation overlays)
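The voxel-wise metrics and the LOSO scheme listed above can be sketched in a few lines. This is an illustrative version assuming predictions and targets are arranged as (timepoints × voxels) arrays; the function names are hypothetical and need not match `utils/metrics.py`.

```python
import numpy as np


def voxelwise_metrics(pred, true, eps=1e-12):
    """Per-voxel MSE, Pearson r, and R² over time.

    pred, true: (n_timepoints, n_voxels). Each output has shape (n_voxels,).
    """
    mse = ((pred - true) ** 2).mean(axis=0)
    pc = pred - pred.mean(axis=0)
    tc = true - true.mean(axis=0)
    r = (pc * tc).sum(axis=0) / (np.sqrt((pc**2).sum(axis=0) * (tc**2).sum(axis=0)) + eps)
    r2 = 1.0 - ((true - pred) ** 2).sum(axis=0) / ((tc**2).sum(axis=0) + eps)
    return mse, r, r2


def loso_splits(subjects):
    """Yield (training subjects, held-out subject) for each subject in turn."""
    for held_out in subjects:
        yield [s for s in subjects if s != held_out], held_out


rng = np.random.default_rng(0)
true = rng.normal(size=(50, 10))
pred = true + 0.5 * rng.normal(size=true.shape)
mse, r, r2 = voxelwise_metrics(pred, true)
print(r.shape, r2.shape)

for train, test in loso_splits(["sub-001", "sub-002", "sub-003"]):
    print(test, "<-", train)
```

Note that R² can be negative for a voxel when the prediction is worse than that voxel's temporal mean, so it is reported alongside correlation rather than instead of it.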
import torch
from data_loading import BIDSDataLoader
# preprocess_full_pipeline is assumed to live in preprocessing/eeg_preprocessing.py
from preprocessing.eeg_preprocessing import preprocess_full_pipeline
from features.feature_extraction import extract_features_per_epoch, create_multi_lag_features
from models.eeg_to_fmri_model import EEGtoFMRIModel
# 1. Load data
loader = BIDSDataLoader()
raw = loader.load_eeg_raw("sub-001", "fmrieoec")
# 2. Preprocess
epochs, ica, log = preprocess_full_pipeline(raw, fmri_duration=444)
# 3. Extract features
features, info = extract_features_per_epoch(epochs)
multi_lag_features = create_multi_lag_features(features)
# 4. Train model
model = EEGtoFMRIModel(n_input_features=830)
# ... training code ...
# 5. Predict (cast to float32 to match the model weights; no gradients needed at inference)
model.eval()
with torch.no_grad():
    predictions = model(torch.from_numpy(multi_lag_features).float())
Expected performance on validation set (16 train / 2 val / 2 test subjects split):
- Correlation: 0.3–0.5 (depends on subject-specific neural patterns)
- R²: 0.1–0.3 (difficult prediction task; voxel-wise fMRI is noisy)
- Best ROIs: Visual cortex (for eyes-closed/open task), motor areas
Regional predictions typically outperform whole-brain voxel-wise predictions.
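One way to compute the regional signals behind that comparison is to average voxels within each atlas label before scoring. The sketch below assumes a flat integer label per voxel with 0 as background. The atlas itself and the `roi_means` helper are hypothetical, since this README does not say which parcellation is used.

```python
import numpy as np


def roi_means(data, labels):
    """Average voxel time series within each ROI.

    data: (n_timepoints, n_voxels); labels: (n_voxels,) ints, 0 = background.
    Returns {roi_id: (n_timepoints,) mean time series}.
    """
    labels = np.asarray(labels)
    return {int(roi): data[:, labels == roi].mean(axis=1)
            for roi in np.unique(labels) if roi != 0}


data = np.array([[1.0, 3.0, 10.0, 20.0],
                 [2.0, 4.0, 30.0, 40.0]])
labels = [1, 1, 2, 0]            # two voxels in ROI 1, one in ROI 2, one background
for roi, series in roi_means(data, labels).items():
    print(roi, series)
```

Applying the same function to predicted and measured data gives paired ROI time series that can be scored with the same correlation and R² metrics used voxel-wise.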
Results are saved to results/:
- training_history.png: Training & validation loss curves
- {subject}_activation_comparison.png: Predicted vs. actual fMRI overlays
- {subject}_evaluation_report.txt: Per-subject metrics
- Improved Registration: Use FSL/SPM for proper motion correction & MNI registration
- Attention Mechanisms: Add attention layers to identify important EEG regions/frequencies
- Multi-condition Model: Separate models for resting vs. task states
- Hierarchical Architecture: Predict ROI-level activation first, then full-brain fine-tuning
- Cross-subject Generalization: Train on one cohort, test on independent dataset
- Integration with Neuroimaging Tools: Export predictions in standard formats (NIfTI, Brain Connectivity Toolbox)
- Gallego-Rudolf et al. (2020). "Resting-state and eyes open-closed simultaneous (and independent) EEG-fMRI: A multimodal neuroimaging dataset." Data in Brief, 32, 106064. DOI: 10.17632/crhybxpdy6.1
- MNE-Python: https://mne.tools/
- Nilearn: https://nilearn.github.io/
This project is provided for educational purposes. The dataset is available under the Public Domain Dedication and License (PDDL) v1.0.
Contact: For questions or issues, please refer to the project repository.