Advanced Deep Learning – CIMAT (Fall 2024)

Author: Ezau Faridh Torres Torres
Advisor: Dr. Mariano Rivera Meraz
Course: Advanced Deep Learning
Institution: CIMAT – Centro de Investigación en Matemáticas
Term: Fall 2024

This repository explores modern deep learning architectures, including transformers, recurrent networks, and diffusion models, together with adaptation techniques such as LoRA and state-exchange attention, applied to time series forecasting and physical system modeling. Each assignment targets a specific modeling challenge and applies techniques such as attention, transfer learning, and diffusion processes, in domains ranging from financial forecasting to PDE-based simulation.

Repository Structure

Each assignment comprises the following elements:

Jupyter Notebooks implementing the model and training pipeline.
Supporting scripts or utility functions if needed.

Technical Stack

Developed in Python 3.11 using:

Deep learning: TensorFlow, PyTorch
Time series & sequence models: LSTM, Transformer, Seq2Seq
Visualization: matplotlib, seaborn
Utilities: numpy, pandas, scikit-learn, yfinance

Some assignments also use keras, scipy, or the torch.nn.functional module of PyTorch.

Overview of Assignments

Each assignment is summarized below, with its primary objective:

Assignment 1 – Extreme Learning Machines

Implementation and comparison of a multilayer perceptron (MLP), a standard extreme learning machine (ELM), and a binary-weight ELM for emotion classification using VQ-VAE encoded inputs. The study evaluates the impact of different regularization strategies (none, Ridge, Lasso, ElasticNet) on the ELM’s output layer, using 12×12 integer matrices as input representations of facial expressions.
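The ELM idea described above can be sketched in a few lines of NumPy: the hidden layer is random and never trained, and only the output layer is solved in closed form with a Ridge penalty. This is a minimal sketch under stated assumptions, not the assignment's code: the tanh activation, the hidden size, the weight rescaling, and the synthetic data (random integer codes with an artificial two-class label) are all illustrative choices, and `fit_elm` and `predict_elm` are hypothetical names.

```python
import numpy as np

def fit_elm(X, Y, n_hidden=256, ridge=1e-2, binary=False, seed=0):
    """Fit an ELM: fixed random hidden layer, closed-form ridge output layer."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    if binary:
        W = np.sign(W)                      # binary-weight variant: entries in {-1, +1}
    W = W / np.sqrt(X.shape[1])             # assumed rescaling, keeps tanh out of saturation
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                  # hidden activations, never trained
    # Ridge solution for the output weights: (H^T H + ridge * I) beta = H^T Y
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def predict_elm(X, params):
    W, b, beta = params
    return np.tanh(X @ W + b) @ beta

# Synthetic stand-in for flattened 12x12 integer code matrices (not the real dataset).
rng = np.random.default_rng(1)
codes = rng.integers(0, 16, size=(120, 144))
X = codes / 15.0
labels = (X[:, :72].sum(axis=1) > X[:, 72:].sum(axis=1)).astype(int)
Y = np.eye(2)[labels]                       # one-hot targets

params = fit_elm(X, Y, binary=True)
train_acc = float(np.mean(predict_elm(X, params).argmax(axis=1) == labels))
print(f"training accuracy: {train_acc:.2f}")
```

Lasso and ElasticNet have no closed-form solution, so those variants would need an iterative solver (for example the scikit-learn estimators) fitted on the same hidden activations `H`.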

Assignment 2 – Seq2Seq Prediction for Cryptocurrency Time Series

Implementation of a sequence-to-sequence (Seq2Seq) model with attention and teacher forcing to predict future values in multivariate time series of cryptocurrency prices. Using data from 7 cryptocurrencies (including Bitcoin) over 100 hourly intervals, the model forecasts the final segment of each series. Historical data are fetched from Yahoo Finance with yfinance and normalized with MinMax scaling. The model uses LSTM layers with 1000 hidden units and is trained for 300 epochs.
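The data preparation implied by this setup (MinMax scaling, then splitting each window into encoder input, decoder input, and target) might look like the sketch below. It assumes an 80/20 split of the 100 steps, which the description does not state, and uses synthetic random walks in place of downloaded prices; `minmax_scale` and `make_seq2seq_arrays` are hypothetical helper names.

```python
import numpy as np

def minmax_scale(series):
    """Scale each column to [0, 1]; return the bounds so forecasts can be inverted."""
    lo, hi = series.min(axis=0), series.max(axis=0)
    return (series - lo) / (hi - lo), lo, hi

def make_seq2seq_arrays(series, n_out):
    """Split one window into encoder input, teacher-forced decoder input, and target."""
    n_in = len(series) - n_out
    enc_in = series[:n_in]                       # observed past
    target = series[n_in:]                       # final segment to forecast
    dec_in = series[n_in - 1:-1]                 # target shifted one step back in time
    return enc_in, dec_in, target

# Synthetic random walks standing in for 7 hourly price series of length 100.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.standard_normal((100, 7)), axis=0)

scaled, lo, hi = minmax_scale(prices)
enc_in, dec_in, target = make_seq2seq_arrays(scaled, n_out=20)
print(enc_in.shape, dec_in.shape, target.shape)   # (80, 7) (20, 7) (20, 7)
```

With teacher forcing, the decoder receives the true previous value (`dec_in`) at every training step; at inference time it would be fed its own previous prediction instead. Note that this sketch computes the scaling bounds over the whole window, including the segment to be forecast; fitting them on the observed part only is a stricter alternative.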

Assignment 3 – Transformer Encoder for Time Series Forecasting

Implementation of a Transformer encoder model in TensorFlow to predict future cryptocurrency prices from past sequences. The model is built from scratch using MultiHeadAttention layers, with three encoding blocks and a latent dimension of 64. Two versions are explored: one using logarithmic normalization, and another using MinMax scaling. The model’s performance is compared against a naive baseline (no change in price) and the Seq2Seq architecture from Assignment 2.
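The naive baseline mentioned above (no change in price) is simple enough to state exactly. The sketch below uses mean absolute error as the comparison metric, which is an assumption, since the metric used in the assignment is not named here; the function names and toy numbers are illustrative.

```python
import numpy as np

def naive_forecast(history, horizon):
    """Persistence baseline: repeat the last observed row for every future step."""
    return np.repeat(history[-1:], horizon, axis=0)

def mae(y_true, y_pred):
    """Mean absolute error over all steps and series."""
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy single-series example; a real comparison would use the held-out price windows.
history = np.array([[1.0], [2.0], [3.0]])
future = np.array([[4.0], [5.0]])

baseline = naive_forecast(history, horizon=len(future))
print(baseline.ravel(), mae(future, baseline))     # [3. 3.] 1.5
```

A forecasting model is only useful on this kind of data if its error on the same windows is lower than the baseline's.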

Assignment 4 – Transfer Learning and LoRA for Currency Forecasting

Extension of the Transformer model from Assignment 3 to a new dataset including exchange rates for multiple currencies and daily oil prices. Three strategies are compared: (1) training a new model from scratch, (2) full fine-tuning of the pretrained transformer, and (3) fine-tuning using Low-Rank Adaptation (LoRA) on affine layers. Multiple LoRA ranks are tested to evaluate efficiency vs. performance trade-offs.
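To make the efficiency trade-off concrete, here is a minimal NumPy sketch of LoRA applied to a single affine layer, following the formulation in Hu et al. (2021): the pretrained weight stays frozen and only a low-rank pair of matrices is trained. The class name `LoRALinear`, the 64×64 layer size, the `alpha` value, and the tested ranks are assumptions for illustration, and no training loop is shown.

```python
import numpy as np

class LoRALinear:
    """Affine layer y = x W + b with a trainable low-rank update: W + scale * A B."""

    def __init__(self, W, b, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W, self.b = W, b                                    # frozen pretrained parameters
        self.A = 0.01 * rng.standard_normal((W.shape[0], rank))  # trainable, small random init
        self.B = np.zeros((rank, W.shape[1]))                    # trainable, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        return x @ self.W + self.scale * (x @ self.A) @ self.B + self.b

    def n_trainable(self):
        return self.A.size + self.B.size

rng = np.random.default_rng(0)
W, b = rng.standard_normal((64, 64)), np.zeros(64)   # stand-in for a pretrained dense layer
x = rng.standard_normal((8, 64))

for rank in (1, 4, 16):
    layer = LoRALinear(W, b, rank)
    print(rank, layer.n_trainable(), "of", W.size)
```

Because `B` starts at zero, the adapted layer reproduces the pretrained layer exactly before fine-tuning begins, and the number of trainable parameters grows linearly with the rank rather than with the full weight size.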

Assignment 5 – Diffusion Model with Transformer Sampler

Adaptation of a DDIM-based diffusion model by replacing the original U-Net architecture with a custom Transformer for the reverse sampling process. The architecture uses 12 attention heads across 8 layers, maintaining all other parameters from the original implementation (e.g., number of diffusion steps, embedding size, learning rate). The model is trained for 50 epochs due to high computational cost.
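For reference, the deterministic DDIM update that the denoising network plugs into can be written as a single function. In this sketch the noise prediction is an oracle (the true noise) rather than a Transformer, so it only illustrates the sampling arithmetic; the function name, the toy shapes, and the noise level are assumptions.

```python
import numpy as np

def ddim_step(x_t, eps_pred, abar_t, abar_prev):
    """One deterministic DDIM update (eta = 0) from noise level abar_t to abar_prev."""
    # Estimate the clean sample implied by the current noise prediction.
    x0_pred = (x_t - np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(abar_t)
    # Re-noise that estimate to the previous (less noisy) level.
    return np.sqrt(abar_prev) * x0_pred + np.sqrt(1.0 - abar_prev) * eps_pred

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))          # toy clean sample
eps = rng.standard_normal((4, 4))         # noise actually added in the forward process
abar_t = 0.3                              # cumulative product of alphas at step t
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps

# With a perfect noise prediction, stepping to abar_prev = 1 recovers x0 exactly.
x_rec = ddim_step(x_t, eps_pred=eps, abar_t=abar_t, abar_prev=1.0)
print(np.allclose(x_rec, x0))             # True
```

In a trained model, `eps_pred` would come from the network (here, the Transformer that replaces the U-Net) evaluated on `x_t` and the step index, and the update would be applied repeatedly over a decreasing schedule of noise levels.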

Final Project – State-Exchange Attention (SEA) for Physics Transformers

Investigation and replication of the SEA architecture proposed by Esmati et al. (2024), which integrates a State-Exchange Attention mechanism into transformer-based models for simulating PDE-governed physical systems. The SEA module enables cross-field communication between state variables such as velocity, pressure, and volume fraction, which reduces the error that accumulates during autoregressive rollout. The original paper reports up to 91% error reduction for the full ViT-SEA framework relative to the baselines it compares against, on computational fluid dynamics scenarios. See Esmati et al. (2024) for details.
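The general idea of letting state variables exchange information through attention can be illustrated with plain scaled dot-product cross-attention. The sketch below is not the SEA module from Esmati et al. (2024): it has no learned projections, no multiple heads, and no ViT backbone, and the function names, token counts, and embedding size are assumptions chosen only to show one field's tokens attending to the tokens of the other fields.

```python
import numpy as np

def cross_attention(queries, context):
    """Scaled dot-product attention: `queries` tokens attend to `context` tokens."""
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over context tokens
    return weights @ context, weights

def exchange_states(fields):
    """Each field adds information attended from every other field (residual update)."""
    updated = {}
    for name, tokens in fields.items():
        others = np.concatenate([v for k, v in fields.items() if k != name], axis=0)
        attended, _ = cross_attention(tokens, others)
        updated[name] = tokens + attended
    return updated

# Random token embeddings standing in for encoded physical fields.
rng = np.random.default_rng(0)
fields = {name: rng.standard_normal((16, 32))
          for name in ("velocity", "pressure", "volume_fraction")}

updated = exchange_states(fields)
print({name: tokens.shape for name, tokens in updated.items()})
```

Each field keeps its own token sequence and shape, but after the exchange its representation depends on the other state variables as well.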

Learning Outcomes

Built custom deep learning models for time series forecasting and generative modeling.
Gained hands-on experience with transformer architectures, LSTMs, and ELMs.
Explored fine-tuning strategies including full transfer learning and LoRA.
Adapted diffusion models and autoregressive frameworks to novel architectures.
Analyzed and visualized model performance across financial and physical domains.

References

Esmati, S., Gholami, A., & Mahoney, M. W. (2024). State Exchange Attention for Physics Transformers. arXiv:2403.04603.
https://arxiv.org/abs/2403.04603
Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, L., & Chen, W. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685.
https://arxiv.org/abs/2106.09685

📫 Contact

📧 Email: ezau.torres@cimat.mx
💼 LinkedIn: linkedin.com/in/ezautorres
