banucavlak/CS585-Project-SAEdit

SAEdit-TopK: Token-Level Continuous Image Editing with Hybrid Top-K Aggregation

This repository is based on the official implementation of “SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder” by Kamenetsky et al. (2025).

While the original SAEdit method performs token-level editing using PCA/SVD-based aggregation of SAE directions, this project introduces a Top-K Hybrid aggregation strategy for computing edit directions during generation. The goal is to improve robustness and controllability by focusing on the most semantically active SAE dimensions instead of relying solely on global principal directions.

Overview

Large-scale text-to-image diffusion models provide powerful editing capabilities but lack fine-grained, disentangled control. SAEdit addresses this by manipulating token-level text embeddings using directions extracted from a Sparse AutoEncoder (SAE) trained on text embeddings.

This project:

  • Keeps the core SAEdit framework unchanged
  • Uses the same pretrained SAE and diffusion backbone
  • Modifies how SAE directions are aggregated during inference
  • Provides side-by-side vanilla vs. modified generation scripts

Key Differences from Original SAEdit

1. Aggregation Strategy

| Method | Aggregation |
| --- | --- |
| Vanilla SAEdit | PCA / SVD-based aggregation (aggregate="pca") |
| Ours (Top-K Hybrid) | Top-K sparse latent selection + hybrid aggregation (aggregate="top_k_hybrid") |

Instead of averaging or projecting across all active SAE dimensions, the Top-K Hybrid approach:

  • Selects the K most activated SAE latent dimensions
  • Aggregates directions only from these dominant components
  • Reduces noise from weak or irrelevant activations
  • Improves semantic locality of edits
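
The selection step described above can be sketched in a few lines of NumPy. This is only an illustration, not the code in this repository: the function name `top_k_direction`, the inputs `latent_delta` and `decoder`, the default `k`, and the activation-weighted sum are all assumptions made for the example. The "hybrid" part of the aggregation is not specified in this README, so it is left out.

```python
import numpy as np


def top_k_direction(latent_delta, decoder, k=8):
    """Build one edit direction from the k most activated SAE latents.

    latent_delta: (n_latents,) SAE activations for the edit (hypothetical input).
    decoder:      (n_latents, d_model) one decoder direction per latent (hypothetical).
    Returns the unit-norm direction and the sorted indices of the selected latents.
    """
    latent_delta = np.asarray(latent_delta, dtype=np.float64)
    decoder = np.asarray(decoder, dtype=np.float64)
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, latent_delta.size)

    # Indices of the k activations with the largest magnitude.
    top_idx = np.argpartition(np.abs(latent_delta), -k)[-k:]

    # Activation-weighted sum of the selected decoder directions only;
    # weak activations outside the top k contribute nothing.
    direction = latent_delta[top_idx] @ decoder[top_idx]

    norm = np.linalg.norm(direction)
    if norm > 0:
        direction = direction / norm
    return direction, np.sort(top_idx)


# Toy example: 4 latents, 3-dimensional embedding space.
toy_latents = np.array([0.05, 2.0, -0.1, 1.0])
toy_decoder = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])
direction, idx = top_k_direction(toy_latents, toy_decoder, k=2)

# Assumed additive edit of a token embedding, scaled like --factor.
token_embedding = np.zeros(3)
edited_embedding = token_embedding + 0.6 * direction
```

In the toy example only latents 1 and 3 are kept, so the two weak activations (0.05 and -0.1) do not affect the resulting direction.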

2. Example Scripts

Two separate scripts are provided for clarity and comparison.

Vanilla SAEdit (PCA-based)

python flux_dev_example.py \
  --variation_path configs/variations/smiling_man.yaml \
  --prompt "a portrait of a man riding a donkey in the snow" \
  --source_tokens man \
  --factor 0.6 \
  --output vanilla_output.png

Ours: Top-K Hybrid Aggregation

python flux_dev_example_ours.py \
  --variation_path configs/variations/smiling_man.yaml \
  --prompt "a portrait of a man riding a donkey in the snow" \
  --source_tokens man \
  --factor 0.6 \
  --output ours_output.png
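
To compare the two scripts over several edit strengths, the commands above can be generated programmatically. The following is a minimal sketch that assumes both scripts accept exactly the flags shown above; the helper names and the output file naming scheme are hypothetical and not part of this repository.

```python
import subprocess
import sys

# Arguments shared by both scripts, copied from the commands above.
COMMON_ARGS = [
    "--variation_path", "configs/variations/smiling_man.yaml",
    "--prompt", "a portrait of a man riding a donkey in the snow",
    "--source_tokens", "man",
]


def build_command(script, factor, output):
    """Return the argument list for one run of `script`."""
    return [sys.executable, script, *COMMON_ARGS,
            "--factor", str(factor), "--output", output]


def sweep_commands(factors):
    """Pair a vanilla run and a Top-K Hybrid run for every factor."""
    commands = []
    for factor in factors:
        tag = f"{factor:.2f}".replace(".", "p")  # e.g. 0.6 -> "0p60"
        commands.append(build_command("flux_dev_example.py", factor, f"vanilla_{tag}.png"))
        commands.append(build_command("flux_dev_example_ours.py", factor, f"ours_{tag}.png"))
    return commands


def run_sweep(factors):
    """Run every command in order, stopping at the first failure."""
    for cmd in sweep_commands(factors):
        subprocess.run(cmd, check=True)
```

Calling `run_sweep([0.3, 0.6, 0.9])` from the repository root would then write one vanilla and one Top-K Hybrid image per factor.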

Environment Setup

Our code depends on the diffusers library. To set up the environment, run:

conda env create -f environment.yaml
conda activate saeedit_env

Alternatively, install the requirements with pip:

pip install -r requirements.txt

Acknowledgements

This code builds on the official SAEdit code (Kamenetsky et al., 2025; arXiv, cs.GR): https://arxiv.org/abs/2510.05081
