zhicongsun/diff-4d-gaussian-rasterization

Differential 4D Gaussian Rasterization for D4DGS-SLAM

This is a modified differentiable 4D Gaussian Splatting rasterizer used in D4DGS-SLAM, the SLAM method introduced in "Embracing Dynamics: Dynamics-aware 4D Gaussian Splatting SLAM" (Sun, Lo, Hu; IROS 2025). [ paper | project page ] It is based on the original 3DGS rasterizer and extends it with two complementary features:

  • 4D Gaussian map representation (following 4DGaussians, Yang et al., 2024): each Gaussian carries a temporal center t, a temporal scale scale_t, and an extra rotation quaternion rotation_r, which together parameterize a 4D covariance via the double-quaternion form R = L(q_l) R(q_r). At a query timestamp the rasterizer projects each 4D Gaussian to a conditional 3D Gaussian with temporally modulated opacity (Eq. (1)–(3) of the paper).
  • MonoGS-style 6-DoF camera pose gradient (Matsuki et al., 2024): the backward pass computes a per-Gaussian se(3) gradient dL_dtau, sums it over all Gaussians, and splits the result into (grad_rho, grad_theta), so the camera pose can be treated as a leaf tensor for tracking and bundle adjustment.
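
The conditioning step in the first bullet follows the standard identities for slicing a multivariate Gaussian along one axis. As a sketch, assume the 4D mean is split as (mu_xyz, mu_t) and the 4D covariance Sigma into spatial, temporal, and cross blocks. This block notation, and the unnormalized temporal weight applied to the opacity o, are assumptions made for illustration and may differ from Eq. (1)–(3) of the paper:

```latex
\begin{align}
\Sigma &= R\,S\,S^{\top}R^{\top}, \qquad
  S = \mathrm{diag}(s_x, s_y, s_z, s_t), \qquad
  R = L(q_l)\,R(q_r) \\
\mu_{xyz\mid t} &= \mu_{xyz} + \Sigma_{xyz,t}\,\Sigma_{t,t}^{-1}\,(t - \mu_t) \\
\Sigma_{xyz\mid t} &= \Sigma_{xyz,xyz} - \Sigma_{xyz,t}\,\Sigma_{t,t}^{-1}\,\Sigma_{t,xyz} \\
o(t) &= o \,\exp\!\left(-\tfrac{1}{2}\,\frac{(t - \mu_t)^2}{\Sigma_{t,t}}\right)
\end{align}
```

Here t is the query timestamp, mu_t is the per-Gaussian temporal center, and s_t is the temporal scale. A Gaussian whose temporal center is far from t (relative to its temporal extent) therefore contributes almost nothing to the rendered frame.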

Warning

The Python/CUDA signatures are not drop-in compatible with the original 3DGS rasterizer. It is recommended to install this package into a new environment to avoid symbol/version conflicts.

Install

git clone https://github.com/zhicongsun/diff-4d-gaussian-rasterization.git
cd diff-4d-gaussian-rasterization
pip install --no-build-isolation .   # see note below

Important

Install your desired PyTorch into the environment first, and always pass --no-build-isolation when running pip install . (note the trailing dot). Without it, pip's PEP 517 build isolation fetches the latest PyTorch from PyPI (currently 2.11 / CUDA 13) into a temporary build environment; the torch.version.cuda seen at build time then no longer matches the in-env nvcc, and the build aborts with RuntimeError: detected CUDA version mismatches. In short, torch.version.cuda and nvcc --version must agree on major.minor.
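
Before building, the major.minor agreement described above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the helper names (nvcc_major_minor, versions_agree, check_environment) are hypothetical, and it assumes nvcc prints its usual "release X.Y" line.

```python
import re
import shutil
import subprocess


def nvcc_major_minor(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def versions_agree(torch_cuda, nvcc_output):
    """True when torch.version.cuda and nvcc share the same major.minor."""
    if torch_cuda is None:  # CPU-only PyTorch builds report None
        return False
    torch_mm = ".".join(torch_cuda.split(".")[:2])
    return torch_mm == nvcc_major_minor(nvcc_output)


def check_environment():
    """Compare the installed PyTorch against the nvcc found on PATH."""
    import torch  # imported lazily so the helpers above work without it

    nvcc = shutil.which("nvcc")
    if nvcc is None:
        raise RuntimeError("nvcc not found on PATH")
    out = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=True
    ).stdout
    return versions_agree(torch.version.cuda, out)
```

Calling check_environment() inside the target environment should return True before pip install --no-build-isolation . is attempted; if it returns False, fix the PyTorch or CUDA toolkit version first.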

For a fully isolated conda env install (creates diff4dgs-test, handles CUDA toolkit + gcc + PyTorch automatically), see scripts/install_on_ubuntu.sh and scripts/README.md.

Tested environment

The authors have only verified the configuration below end-to-end (build + install + all three examples in examples/ pass, with the analytic-vs-finite-difference gradient check in examples/03_pose_gradient.py reporting per-component ratios within ±1% of unity):

| Component | Version |
| --- | --- |
| OS | Ubuntu 20.04.6 LTS |
| GPU | NVIDIA GeForce RTX 4080 Laptop (compute 8.9) |
| NVIDIA driver | 535.129.03 |
| Python | 3.10.20 |
| PyTorch | 2.4.0+cu121 |
| CUDA toolkit | 12.1 (in conda env, via nvidia/label/cuda-12.1.0) |
| gcc / g++ | 11.4.0 (gxx_linux-64, in conda env) |
| nvcc | release 12.1, V12.1.105 |

By default setup.py builds for compute capabilities 7.0 / 7.5 / 8.0 / 8.6 / 8.9 / 9.0; override with TORCH_CUDA_ARCH_LIST to restrict (faster build) or extend. Other combinations are expected to build if they satisfy PyTorch's standard C++/CUDA extension constraints, but they have not been tested in-house. PRs extending this table are welcome.

Calling method

Important

The CUDA kernels read every 4×4 matrix in column-major order (the GLM convention). PyTorch tensors are row-major, so you must transpose every matrix before passing it to the rasterizer. This is the same convention MonoGS uses (see utils/camera_utils.py: world_view_transform = getWorld2View2(R, T).transpose(0, 1)). A direct row-major hand-off looks "almost right" in forward rendering but makes the MonoGS-style pose gradient point in the wrong direction.
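
To see why the transpose is needed, the following dependency-free sketch mimics a column-major reader (as GLM would index a flat buffer) on a toy world-to-camera matrix. The helper names are hypothetical and the matrix is illustrative only; it is not taken from the examples.

```python
def flatten_row_major(m):
    """Lay out a 4x4 nested list the way a contiguous PyTorch tensor is stored."""
    return [m[r][c] for r in range(4) for c in range(4)]


def read_column_major(buf, row, col):
    """Element (row, col) as a column-major reader sees the flat buffer."""
    return buf[col * 4 + row]


def transpose(m):
    return [[m[c][r] for c in range(4)] for r in range(4)]


# Row-major world->camera matrix: identity rotation, translation (1, 2, 3).
view = [
    [1, 0, 0, 1],
    [0, 1, 0, 2],
    [0, 0, 1, 3],
    [0, 0, 0, 1],
]

direct = flatten_row_major(view)                 # handed over as-is
transposed = flatten_row_major(transpose(view))  # view.transpose(0, 1).contiguous()

# Translation column as the column-major reader sees it in each case.
seen_direct = [read_column_major(direct, r, 3) for r in range(3)]
seen_transposed = [read_column_major(transposed, r, 3) for r in range(3)]
print(seen_direct)      # [0, 0, 0]  translation is lost from the last column
print(seen_transposed)  # [1, 2, 3]  translation is where the reader expects it
```

The same reasoning applies to viewmatrix, projmatrix, and projmatrix_raw alike.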

A minimal end-to-end snippet (a slightly trimmed version of examples/01_3d_sanity.py):

import torch
from diff_gaussian_rasterization import (
    GaussianRasterizationSettings,
    GaussianRasterizer,
)

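# H, W, tan_fovx, tan_fovy, bg, the camera matrices, and the per-Gaussian
# tensors are assumed to be defined already (see examples/01_3d_sanity.py).
# `view` is a row-major world->camera (SE3) tensor;
# `proj_raw` is a row-major pure perspective matrix.
# Both have shape (4, 4). `view_proj = proj_raw @ view`.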
settings = GaussianRasterizationSettings(
    image_height=H, image_width=W,
    tanfovx=tan_fovx, tanfovy=tan_fovy,
    bg=bg,                                                  # [3] background
    scale_modifier=1.0,
    viewmatrix=view.transpose(0, 1).contiguous(),
    projmatrix=view_proj.transpose(0, 1).contiguous(),
    projmatrix_raw=proj_raw.transpose(0, 1).contiguous(),
    sh_degree=sh_deg,                                       # spatial SH order
    sh_degree_t=sh_deg_t,                                   # temporal SH order (0 disables it)
    campos=cam_center,                                      # [3] camera origin in world
    timestamp=t_query,
    time_duration=T_total,
    rot_4d=True,                                            # double-quaternion 4D rotation
    gaussian_dim=4,                                         # 3 -> static 3DGS, 4 -> 4DGS
    force_sh_3d=False,                                      # if True, ignore temporal SH bands
    prefiltered=False,
    debug=False,
)

rasterizer = GaussianRasterizer(settings)

color, radii, depth, opacity, covs_com, n_touched = rasterizer(
    means3D=means3D,
    means2D=means2D,
    opacities=opacities,
    shs=shs,                # [P, K, 3]; K depends on sh_degree / sh_degree_t
    colors_precomp=None,
    ts=ts,                  # [P]   per-Gaussian time center
    scales=scales,          # [P, 3]
    scales_t=scales_t,      # [P]   per-Gaussian time scale
    rotations=rotations,    # [P, 4] q_l
    rotations_r=rotations_r,  # [P, 4] q_r
    cov3D_precomp=None,
    theta=theta,            # [1, 3] camera rotation perturbation leaf
    rho=rho,                # [1, 3] camera translation perturbation leaf
)

Outputs

| Tensor | Shape | Meaning |
| --- | --- | --- |
| color | [3, H, W] | Rendered RGB. |
| radii | [P] | Per-Gaussian screen-space radius (0 if culled). |
| depth | [1, H, W] | Alpha-composited depth. |
| opacity | [1, H, W] | Per-pixel 1 - T, the accumulated opacity. |
| covs_com | [P, 6] | Conditional 3D covariance of each Gaussian at the current timestamp. |
| n_touched | [P] | Number of pixels for which the Gaussian was the top contributor. |

covs_com is what the paper uses for the keyframe covisibility metric in §3.3.2; n_touched is convenient for visibility-aware densification / pruning.
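
For reference, the relationship between color, depth, and opacity at a single pixel can be sketched with standard front-to-back alpha compositing. This is a scalar toy version with a hypothetical composite helper; it assumes depth is accumulated with the same weights as color and is not normalized by opacity, which should be checked against forward.cu. The real kernel works per tile on the GPU.

```python
def composite(samples, background=0.0):
    """Front-to-back compositing of (alpha, color, depth) samples at one pixel.

    `samples` must be sorted from nearest to farthest.
    """
    T = 1.0  # transmittance: fraction of light still unblocked
    color = 0.0
    depth = 0.0
    for alpha, c, d in samples:
        weight = alpha * T  # contribution of this sample
        color += weight * c
        depth += weight * d
        T *= 1.0 - alpha
    opacity = 1.0 - T       # accumulated opacity, i.e. the `opacity` output
    color += T * background  # whatever light remains sees the background
    return color, depth, opacity


# Two half-transparent samples at depths 2 and 4.
result = composite([(0.5, 1.0, 2.0), (0.5, 0.0, 4.0)])
print(result)  # (0.5, 2.0, 0.75)
```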

See §3.1–3.3 of the D4DGS-SLAM paper for the 4D Gaussian formulation that this rasterizer implements, and cuda_rasterizer/forward.cu / backward.cu for the GPU code.

Examples

See examples/ for three runnable scripts: a 3DGS sanity check, a 4D temporal-conditioning demo, and a MonoGS-style pose recovery loop with a built-in analytic-vs-finite-difference gradient check.
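
The idea behind an analytic-vs-finite-difference check can be sketched without the rasterizer. The toy loss below stands in for a rendering loss and the helper names are hypothetical; examples/03_pose_gradient.py remains the reference for how the actual check perturbs the pose.

```python
import math


def central_difference(f, x, eps=1e-4):
    """Numerical gradient of f at x, one component at a time."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad


def gradient_ratios(analytic, numeric):
    """Per-component analytic / numeric ratio; values near 1 indicate agreement."""
    return [a / n for a, n in zip(analytic, numeric)]


def toy_loss(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1] + math.sin(x[2])


def toy_loss_grad(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0], math.cos(x[2])]


x0 = [0.5, -1.2, 0.3]
ratios = gradient_ratios(toy_loss_grad(x0), central_difference(toy_loss, x0))
assert all(abs(r - 1.0) < 0.01 for r in ratios)  # within 1% of unity
```

A ratio far from 1, or with the wrong sign, in a single component usually points at a convention error (for example a missing transpose) rather than at numerical noise.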

Notes / known limitations

  • Pose gradient API. viewmatrix / projmatrix / projmatrix_raw are accepted as plain tensors with no autograd. Pose gradients flow exclusively through the leaf tensors (theta, rho), which are held at zero during the forward pass and only serve as a target for dL_dtau in the backward pass. After every optimizer.step() the caller must (a) retract the update into the view matrix via T_new = exp((rho, theta)) @ T_cur, (b) recompute projmatrix = proj_raw @ T_new, and (c) zero (rho, theta). This mirrors MonoGS' update_pose routine in utils/pose_utils.py. See examples/03_pose_gradient.py for a worked example.
  • Optical-flow rendering is not part of this release. The original 4DGaussians codebase shipped an optional 2D flow output; D4DGS-SLAM does not use it, so the corresponding paths have been removed for clarity.
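
The retraction in step (a) of the pose-gradient note, T_new = exp((rho, theta)) @ T_cur, can be sketched as follows. This is a dependency-free illustration using nested lists in row-major order. The function names are hypothetical, and the small-angle threshold is an assumption. In practice the same arithmetic would be done on torch tensors as in examples/03_pose_gradient.py.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested row-major lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def skew(w):
    """3x3 cross-product matrix of a 3-vector."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def se3_exp(rho, theta):
    """Exponential map of tau = (rho, theta) to a 4x4 row-major SE(3) matrix."""
    angle = math.sqrt(sum(c * c for c in theta))
    W = skew(theta)
    W2 = matmul(W, W)
    if angle < 1e-5:
        # Taylor limits avoid dividing by a vanishing angle.
        a, b, c = 1.0, 0.5, 1.0 / 6.0
    else:
        a = math.sin(angle) / angle
        b = (1.0 - math.cos(angle)) / angle ** 2
        c = (angle - math.sin(angle)) / angle ** 3
    eye = [[float(i == j) for j in range(3)] for i in range(3)]
    # Rodrigues rotation and the matching left Jacobian V.
    R = [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)]
         for i in range(3)]
    V = [[eye[i][j] + b * W[i][j] + c * W2[i][j] for j in range(3)]
         for i in range(3)]
    t = [sum(V[i][k] * rho[k] for k in range(3)) for i in range(3)]
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]


def retract(T_cur, rho, theta):
    """Left-multiply the current world->camera matrix by exp((rho, theta))."""
    return matmul(se3_exp(rho, theta), T_cur)
```

After retract, steps (b) and (c) still apply: recompute projmatrix from proj_raw and the new view matrix, transpose both before handing them to the rasterizer, and reset rho and theta to zero.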

Citation

If you find this code useful, please cite D4DGS-SLAM:

@inproceedings{sun2025embracing,
  title        = {Embracing dynamics: Dynamics-aware 4d gaussian splatting slam},
  author       = {Sun, Zhicong and Lo, Jacqueline and Hu, Jinxing},
  booktitle    = {2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages        = {5331--5338},
  year         = {2025},
  organization = {IEEE}
}

BibTeX for upstream works (3DGS, 4DGaussians, MonoGS):

@article{kerbl3Dgaussians,
  title   = {3D Gaussian Splatting for Real-Time Radiance Field Rendering},
  author  = {Kerbl, Bernhard and Kopanas, Georgios and Leimk{\"u}hler, Thomas and Drettakis, George},
  journal = {ACM Transactions on Graphics},
  volume  = {42}, number = {4}, month = {July}, year = {2023}
}

@inproceedings{yang2024real,
  title     = {Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting},
  author    = {Yang, Zeyu and Yang, Hongye and Pan, Zijie and Zhang, Li},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2024}
}

@inproceedings{matsuki2024monogs,
  title     = {Gaussian Splatting SLAM},
  author    = {Matsuki, Hidenobu and Murai, Riku and Kelly, Paul H.J. and Davison, Andrew J.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2024}
}

License

This work inherits the Gaussian-Splatting License from the upstream project (research / evaluation use only, no commercial use without prior consent of Inria & MPII). See LICENSE.md for the full terms and NOTICE for the third-party attributions (diff-gaussian-rasterization, 4DGaussians, MonoGS, NVIDIA, GLM).
