Skip to content

UFcompressor/CAESAR

Repository files navigation

CAESAR

Conditional AutoEncoder with Super-resolution for Augmented Reduction

A C++ / LibTorch foundation model for efficient compression of scientific data

Platform C++ LibTorch zstd License


Overview

CAESAR is a unified framework for spatio-temporal scientific data reduction. The baseline model, CAESAR-V, is built on a variational autoencoder (VAE) with scale hyperpriors and super-resolution modules to achieve high compression ratios while preserving scientific fidelity.

It encodes data into a compact latent space and uses learned priors for information-rich representation. This repository ports CAESAR into C++ with LibTorch for deployment in high-performance computing (HPC) environments and scientific workflows.

Reference: Shaw et al., CAESAR: A Unified Framework of Foundation and Generative Models for Efficient Compression of Scientific Data


Build Instructions

1. Clone the Repository

git clone https://github.com/UFcompressor/CAESAR
cd CAESAR

2. Create and Activate a Python Virtual Environment

python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip wheel setuptools

3. Install Platform Dependencies

Linux (Ubuntu/Debian)
sudo apt-get update
sudo apt-get install -y cmake g++ zstd libzstd-dev

source venv/bin/activate

grep -v "^torch" requirements.txt | \
  grep -v "^torchvision" | \
  grep -v "^--extra-index-url" | \
  grep -v "^cupy" | \
  grep -v "^nvidia" | \
  grep -v "^$" > temp_requirements.txt

pip install --no-cache-dir -r temp_requirements.txt
pip install torch==2.9.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install compressai==1.2.6 imageio==2.37.0
rm temp_requirements.txt
macOS
brew install cmake zstd gcc

source venv/bin/activate

grep -v "^torch" requirements.txt | \
  grep -v "^torchvision" | \
  grep -v "^--extra-index-url" | \
  grep -v "^cupy" | \
  grep -v "^nvidia" | \
  grep -v "^$" > temp_requirements.txt

pip install -r temp_requirements.txt
pip install torch==2.8.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install compressai==1.2.6 imageio==2.37.0
rm temp_requirements.txt
Windows
# Install CMake, zstd, and a recent MSVC toolchain (Visual Studio Build Tools) first

venv\Scripts\activate

findstr /v /b "torch torchvision --extra-index-url cupy nvidia" requirements.txt > temp_requirements.txt

pip install --no-cache-dir -r temp_requirements.txt
pip install torch==2.9.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install compressai==1.2.6 imageio==2.37.0
del temp_requirements.txt

4. Download and Prepare Pretrained Models

./download_models.sh

python3 CAESAR_compressor.py cpu
python3 CAESAR_hyper_decompressor.py cpu
python3 CAESAR_decompressor.py cpu

5. Configure and Build with CMake

mkdir -p build && cd build

TORCH_PATH=$(python3 -c "import torch; print(torch.utils.cmake_prefix_path)")

cmake .. \
  -DCMAKE_PREFIX_PATH="$TORCH_PATH" \
  -DBUILD_TESTS=ON \
  -DCMAKE_BUILD_TYPE=Release

make -j6

For debug builds, replace -DCMAKE_BUILD_TYPE=Release with -DCMAKE_BUILD_TYPE=Debug.


GPU Support (NVIDIA)

GPU support requires CUDA and nvCOMP.

Install nvCOMP
wget https://developer.download.nvidia.com/compute/nvcomp/redist/nvcomp/linux-x86_64/nvcomp-linux-x86_64-5.0.0.6_cuda12-archive.tar.xz

mkdir -p ~/local/nvcomp
tar -xJf nvcomp-linux-x86_64-5.0.0.6_cuda12-archive.tar.xz -C ~/local/nvcomp --strip-components=1

export CMAKE_PREFIX_PATH=$HOME/local/nvcomp:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=$HOME/local/nvcomp/lib:$LD_LIBRARY_PATH
Build with GPU support
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 \
  --index-url https://download.pytorch.org/whl/cu128

cmake .. \
  -DCMAKE_PREFIX_PATH="$TORCH_PATH;$HOME/local/nvcomp" \
  -DCMAKE_CXX_FLAGS="-I$HOME/local/nvcomp/include" \
  -DCMAKE_EXE_LINKER_FLAGS="-L$HOME/local/nvcomp/lib" \
  -DBUILD_TESTS=ON \
  -DCMAKE_BUILD_TYPE=Release

Environment Variables

CAESAR resolves model files in the following priority order:

Priority Location
1 $CAESAR_MODEL_DIR environment variable (if set)
2 ../exported_model/ relative to the executable (development builds)
3 /usr/local/share/caesar/models (installed builds)
export CAESAR_MODEL_DIR=/path/to/your/models

Dependencies

Core

Dependency Minimum Version
LibTorch (PyTorch C++ API) 2.8
CMake 3.10
Zstandard (zstd) 1.5 (required)
Python 3.10

Optional — GPU

Dependency Notes
CUDA Toolkit Ensure nvcc is in PATH
nvCOMP NVIDIA compression library

Citation

If you use CAESAR in your research, please cite the following works:

@inproceedings{li2025foundation,
  title        = {Foundation Model for Lossy Compression of Spatiotemporal Scientific Data},
  author       = {Li, Xiao and Lee, Jaemoon and Rangarajan, Anand and Ranka, Sanjay},
  booktitle    = {Pacific-Asia Conference on Knowledge Discovery and Data Mining},
  pages        = {368--380},
  year         = {2025},
  organization = {Springer}
}
@article{li2025generative,
  title   = {Generative Latent Diffusion for Efficient Spatiotemporal Data Reduction},
  author  = {Li, Xiao and Zhu, Liangji and Rangarajan, Anand and Ranka, Sanjay},
  journal = {arXiv preprint arXiv:2507.02129},
  year    = {2025}
}

Contact

For questions, bug reports, or contributions, please open an issue on GitHub.


References

Resource Link
Original CAESAR (Python) Shaw-git/CAESAR
NVIDIA nvCOMP developer.nvidia.com/nvcomp
CUDA Toolkit developer.nvidia.com/cuda-toolkit
PyTorch pytorch.org
Zstandard facebook.github.io/zstd
CompressAI InterDigitalInc/CompressAI

Built for science. Engineered for performance.

About

CAESAR (Conditional AutoEncoder with Super-resolution for Augmented Reduction)

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors