75 commits
add7234
Add runtime requirements
lwneal Nov 24, 2022
f09f3ad
Add standard python gitignore with additions for IDEs and MacOS
kjerk Nov 24, 2022
d4e8234
Fix typo in ddpm.py
eltociear Nov 29, 2022
41c0646
Fix typo
UdonDa Dec 5, 2022
d4763cf
Merge pull request #10 from lwneal/main
dmarx Dec 6, 2022
4983241
Merge pull request #18 from kjerk/add-gitignore
dmarx Dec 6, 2022
d7440ac
Update README.md
hardmaru Dec 7, 2022
64888bc
Update modelcard.md
hardmaru Dec 7, 2022
2fc5104
Update modelcard.md
hardmaru Dec 7, 2022
ae721a6
Update README.md
hardmaru Dec 7, 2022
cc99e3a
Update README.md
hardmaru Dec 7, 2022
1b7bee1
Update README.md
hardmaru Dec 7, 2022
10d4a4a
Update README.md
hardmaru Dec 7, 2022
6e92cda
* Force cast to fp32 to avoid atten layer overflow
Dango233 Dec 7, 2022
f547c4a
Merge pull request #64 from eltociear/patch-1
rromb Dec 7, 2022
0611c60
Update README.md
hardmaru Dec 7, 2022
773e941
Update README.md
hardmaru Dec 7, 2022
e0efa32
Update README.md
hardmaru Dec 7, 2022
f0eeb79
Update README.md
hardmaru Dec 7, 2022
c7d5eb9
Update README.md
hardmaru Dec 7, 2022
e1797ae
Add env var for resume previous behavior
Dango233 Dec 7, 2022
8bde0cf
Merge pull request #89 from Stability-AI/dango.patch.atten_overflow
rromb Dec 7, 2022
dab18ab
Merge pull request #90 from hardmaru/main
hardmaru Dec 7, 2022
c12d960
add details on precision for 2.1
rromb Dec 7, 2022
99f1aae
Merge pull request #5 from Stability-AI/main
jamesthesnake Dec 10, 2022
d9ae297
Update README.md
jamesthesnake Dec 10, 2022
18724c1
Fix image link
ModelEarth Dec 12, 2022
cd8f328
Merge pull request #100 from datascape/main
miao-ju Dec 14, 2022
9718e11
Merge pull request #95 from jamesthesnake/main
miao-ju Dec 15, 2022
d55bcd4
Merge pull request #84 from UdonDa/patch-1
miao-ju Dec 15, 2022
7ad54c5
add cpu support & add intel ipex optimizations
aalbersk Dec 20, 2022
71e9042
add intel info to README
aalbersk Dec 20, 2022
872cc9e
add info about ninstance and license
aalbersk Jan 5, 2023
45287f9
stable unclip finetune
rromb Jan 14, 2023
929625a
make it work
rromb Jan 14, 2023
aad6e38
fix missing adm_in_channels and ClipImageEmbedder
rromb Jan 14, 2023
8ec7903
add noise-augmented unCLIP
rromb Jan 18, 2023
639b3f3
make it work in sampling script
rromb Jan 18, 2023
5ca0605
update for openclip release
rromb Jan 27, 2023
cddd65d
make it work
rromb Jan 29, 2023
d7980a2
add examples
rromb Jan 29, 2023
c81b231
no-ema in config, adapt noiseaugmtor
rromb Jan 29, 2023
4b71f18
support dpm
rromb Jan 30, 2023
3349693
adjust licenses and naming
aalbersk Feb 3, 2023
fc14884
Merge pull request #147 from aalbersk/intel_cpu_optimizations
dmarx Feb 7, 2023
edb2eb9
move unCLIP documentation to new .MD file
rromb Feb 20, 2023
fe1cf68
update examples for release
rromb Feb 23, 2023
4e89f57
support image mixing in streamlit
rromb Feb 23, 2023
89fdc12
increase default noise value for mixings
rromb Feb 23, 2023
88553b6
readme
rromb Feb 23, 2023
e04300b
final ckpt links for unclip
rromb Mar 20, 2023
c25a9a8
Bump gradio from 3.11 to 3.13.2
dependabot[bot] Mar 20, 2023
67fdc82
Merge pull request #207 from Stability-AI/dependabot/pip/gradio-3.13.2
rromb Mar 20, 2023
0bbbcbb
Add files via upload
hardmaru Mar 24, 2023
ae978e6
Update modelcard.md
hardmaru Mar 24, 2023
e272fe6
Update README.md
hardmaru Mar 24, 2023
f2aa661
Update UNCLIP.MD
hardmaru Mar 24, 2023
4e409af
Update UNCLIP.MD
hardmaru Mar 24, 2023
b4bdae9
add stable unclip
rromb Mar 24, 2023
afefb6b
Add diffusers integration
apolinario Mar 24, 2023
6dd3048
Update README.md
apolinario Mar 24, 2023
be2861a
Small Hugging Face as two words nit
apolinario Mar 24, 2023
fa4401e
merge unclip into main
rromb Mar 24, 2023
3cf0b08
merge readmes
rromb Mar 24, 2023
3396b0f
Merge pull request #215 from Stability-AI/unclip_prerelease
rromb Mar 24, 2023
06b5b40
Merge pull request #213 from apolinario/patch-2
rromb Mar 24, 2023
b7096e6
Update modelcard.md
hardmaru Mar 24, 2023
21236e8
Update README.md
hardmaru Mar 24, 2023
a451cec
Update README.md
hardmaru Mar 24, 2023
84d3c27
Update modelcard.md
hardmaru Mar 24, 2023
8d95d19
Fix diffusers code snippet
apolinario Mar 24, 2023
215046a
Merge pull request #216 from apolinario/patch-3
rromb Mar 24, 2023
b69cba5
Update modelcard.md
hardmaru Mar 25, 2023
616d52d
Update modelcard.md
hardmaru Mar 25, 2023
cf1d67a
Update modelcard.md
hardmaru Mar 25, 2023
165 changes: 165 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,165 @@
# Generated by project
outputs/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# General MacOS
.DS_Store
.AppleDouble
.LSOverride

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# IDEs
.idea/
.vscode/
79 changes: 68 additions & 11 deletions README.md
@@ -1,12 +1,34 @@
# Stable Diffusion 2.0
# Stable Diffusion Version 2
![t2i](assets/stable-samples/txt2img/768/merged-0006.png)
![t2i](assets/stable-samples/txt2img/768/merged-0002.png)
![t2i](assets/stable-samples/txt2img/768/merged-0005.png)

This repository contains [Stable Diffusion](https://github.com/CompVis/stable-diffusion) models trained from scratch and will be continuously updated with
new checkpoints. The following list provides an overview of all currently available models. More coming soon.

## News
**November 2022**


**March 24, 2023**

*Stable unCLIP 2.1*

- New stable diffusion finetune (_Stable unCLIP 2.1_, [Hugging Face](https://huggingface.co/stabilityai/)) at 768x768 resolution, based on SD2.1-768. This model allows for image variations and mixing operations as described in [*Hierarchical Text-Conditional Image Generation with CLIP Latents*](https://arxiv.org/abs/2204.06125), and, thanks to its modularity, can be combined with other models such as [KARLO](https://github.com/kakaobrain/karlo). Comes in two variants: [*Stable unCLIP-L*](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip/blob/main/sd21-unclip-l.ckpt) and [*Stable unCLIP-H*](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip/blob/main/sd21-unclip-h.ckpt), which are conditioned on CLIP ViT-L and ViT-H image embeddings, respectively. Instructions are available [here](doc/UNCLIP.MD).

- A public demo of SD-unCLIP is available at [clipdrop.co/stable-diffusion-reimagine](https://clipdrop.co/stable-diffusion-reimagine).


**December 7, 2022**

*Version 2.1*

- New stable diffusion model (_Stable Diffusion 2.1-v_, [Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-2-1)) at 768x768 resolution and (_Stable Diffusion 2.1-base_, [Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)) at 512x512 resolution. Both have the same number of parameters and the same architecture as 2.0 and were fine-tuned from 2.0 on a version of the [LAION-5B](https://laion.ai/blog/laion-5b/) dataset with less restrictive NSFW filtering.
By default, the attention operation of the model is evaluated at full precision when `xformers` is not installed. To enable fp16 (which can cause numerical instabilities with the vanilla attention module on the v2.1 model), run your script with `ATTN_PRECISION=fp16 python <thescript.py>`.
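
The precision note above can be illustrated with a small, standard-library-only sketch. It reads the `ATTN_PRECISION` variable named in the README and emulates half precision with `struct` to show why an attention logit can overflow in fp16. The helper names (`selected_precision`, `to_fp16`, `attention_logit`) are hypothetical and exist only for this illustration; the real attention module operates on tensors, not Python lists.

```python
import os
import struct


def selected_precision(environ=os.environ):
    """Return the requested attention precision, defaulting to fp32."""
    return environ.get("ATTN_PRECISION", "fp32")


def to_fp16(x):
    """Round x to IEEE 754 half precision; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return float("inf") if x > 0 else float("-inf")


def attention_logit(q, k, precision):
    """Dot product of q and k, with the running sum rounded to fp16 if requested."""
    acc = 0.0
    for a, b in zip(q, k):
        acc += a * b
        if precision == "fp16":
            acc = to_fp16(acc)  # the accumulator is stored in half precision
    return acc


q = k = [200.0] * 4
full = attention_logit(q, k, "fp32")  # finite
half = attention_logit(q, k, "fp16")  # exceeds the fp16 range (max 65504)
```

Here the exact dot product is 160000, which is beyond the largest finite half-precision value, so the fp16 path saturates to infinity while the fp32 path stays finite.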

**November 24, 2022**

*Version 2.0*

- New stable diffusion model (_Stable Diffusion 2.0-v_) at 768x768 resolution. Same number of parameters in the U-Net as 1.5, but uses [OpenCLIP-ViT/H](https://github.com/mlfoundations/open_clip) as the text encoder and is trained from scratch. _SD 2.0-v_ is a so-called [v-prediction](https://arxiv.org/abs/2202.00512) model.
- The above model is finetuned from _SD 2.0-base_, which was trained as a standard noise-prediction model on 512x512 images and is also made available.
- Added a [x4 upscaling latent text-guided diffusion model](#image-upscaling-with-stable-diffusion).
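
For readers unfamiliar with the v-prediction parameterization mentioned in the list above, here is a minimal scalar sketch. It assumes a variance-preserving noising step `x_t = alpha * x0 + sigma * eps` with `alpha**2 + sigma**2 == 1`, as in the linked paper; the function names are illustrative and are not taken from this repository.

```python
import math


def v_target(x0, eps, alpha, sigma):
    """Training target for a v-prediction model: v = alpha*eps - sigma*x0."""
    return alpha * eps - sigma * x0


def x0_from_v(x_t, v, alpha, sigma):
    """Recover the clean sample from the noisy sample and v."""
    return alpha * x_t - sigma * v


def eps_from_v(x_t, v, alpha, sigma):
    """Recover the noise from the noisy sample and v."""
    return sigma * x_t + alpha * v


# Any angle gives a valid (alpha, sigma) pair on the unit circle.
alpha, sigma = math.cos(0.3), math.sin(0.3)
x0, eps = 0.5, -1.2
x_t = alpha * x0 + sigma * eps
v = v_target(x0, eps, alpha, sigma)
```

Because both `x0` and `eps` can be recovered from `v`, a sampler written for noise prediction can be adapted to a v-prediction checkpoint by converting the model output first.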
@@ -54,7 +76,7 @@ Installation needs a somewhat recent version of nvcc and gcc/g++, obtain those,
export CUDA_HOME=/usr/local/cuda-11.4
conda install -c nvidia/label/cuda-11.4.0 cuda-nvcc
conda install -c conda-forge gcc
conda install -c conda-forge gxx_linux-64=9.5.0
conda install -c conda-forge gxx_linux-64==9.5.0
```

Then, run the following (compiling takes up to 30 min).
@@ -80,11 +102,11 @@ The weights are available via [the StabilityAI organization at Hugging Face](htt



## Stable Diffusion v2.0
## Stable Diffusion v2

Stable Diffusion v2.0 refers to a specific configuration of the model
Stable Diffusion v2 refers to a specific configuration of the model
architecture that uses a downsampling-factor 8 autoencoder with an 865M UNet
and OpenCLIP ViT-H/14 text encoder for the diffusion model. The _SD 2.0-v_ model produces 768x768 px outputs.
and OpenCLIP ViT-H/14 text encoder for the diffusion model. The _SD 2-v_ model produces 768x768 px outputs.

Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0,
5.0, 6.0, 7.0, 8.0) and 50 DDIM sampling steps show the relative improvements of the checkpoints:
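
As a reminder of what the guidance scale in these evaluations controls, the standard classifier-free guidance combination can be sketched as follows. This is a list-based illustration with a hypothetical function name, not the sampler code from this repository.

```python
def guided_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: move from the unconditional prediction
    towards (and past) the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


uncond = [0.0, 1.0]
cond = [1.0, 3.0]
mixed = guided_noise(uncond, cond, 2.0)
```

A scale of 1.0 reproduces the conditional prediction, and larger values extrapolate beyond it, which is why the evaluated scales start at 1.5.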
@@ -97,16 +119,16 @@ Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0,
![txt2img-stable2](assets/stable-samples/txt2img/merged-0003.png)
![txt2img-stable2](assets/stable-samples/txt2img/merged-0001.png)

Stable Diffusion 2.0 is a latent diffusion model conditioned on the penultimate text embeddings of a CLIP ViT-H/14 text encoder.
Stable Diffusion 2 is a latent diffusion model conditioned on the penultimate text embeddings of a CLIP ViT-H/14 text encoder.
We provide a [reference script for sampling](#reference-sampling-script).
#### Reference Sampling Script

This script incorporates an [invisible watermarking](https://github.com/ShieldMnt/invisible-watermark) of the outputs, to help viewers [identify the images as machine-generated](scripts/tests/test_watermark.py).
We provide the configs for the _SD2.0-v_ (768px) and _SD2.0-base_ (512px) model.
We provide the configs for the _SD2-v_ (768px) and _SD2-base_ (512px) model.

First, download the weights for [_SD2.0-v_](https://huggingface.co/stabilityai/stable-diffusion-2) and [_SD2.0-base_](https://huggingface.co/stabilityai/stable-diffusion-2-base).
First, download the weights for [_SD2.1-v_](https://huggingface.co/stabilityai/stable-diffusion-2-1) and [_SD2.1-base_](https://huggingface.co/stabilityai/stable-diffusion-2-1-base).

To sample from the _SD2.0-v_ model, run the following:
To sample from the _SD2.1-v_ model, run the following:

```
python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt <path/to/768model.ckpt/> --config configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768
@@ -125,6 +147,41 @@ Note: The inference config for all model versions is designed to be used with EM
For this reason `use_ema=False` is set in the configuration, otherwise the code will try to switch from
non-EMA to EMA weights.

#### Enable Intel® Extension for PyTorch* optimizations in Text-to-Image script

If you plan to run Text-to-Image on an Intel® CPU, try sampling an image with TorchScript and Intel® Extension for PyTorch* optimizations. Intel® Extension for PyTorch* extends PyTorch with up-to-date feature optimizations for an extra performance boost on Intel® hardware. It can convert the memory layout of operators to the channels-last format, which is generally beneficial on Intel CPUs, take advantage of the most advanced instruction set available on the machine, optimize operators, and more.

**Prerequisites**

Before running the script, make sure all required libraries are installed (the optimization was checked on `Ubuntu 20.04`). Install [jemalloc](https://github.com/jemalloc/jemalloc), [numactl](https://linux.die.net/man/8/numactl), Intel® OpenMP and Intel® Extension for PyTorch*.

```bash
apt-get install numactl libjemalloc-dev
pip install intel-openmp
pip install intel_extension_for_pytorch -f https://software.intel.com/ipex-whl-stable
```

To sample from the _SD2.1-v_ model with TorchScript+IPEX optimizations, run the following. Remember to specify the number of instances you want to run the program on ([more](https://github.com/intel/intel-extension-for-pytorch/blob/master/intel_extension_for_pytorch/cpu/launch.py#L48)).

```
MALLOC_CONF=oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:9000000000,muzzy_decay_ms:9000000000 python -m intel_extension_for_pytorch.cpu.launch --ninstance <number of instances> --enable_jemalloc scripts/txt2img.py --prompt "a corgi is playing guitar, oil on canvas" --ckpt <path/to/768model.ckpt/> --config configs/stable-diffusion/intel/v2-inference-v-fp32.yaml --H 768 --W 768 --precision full --device cpu --torchscript --ipex
```

To sample from the base model with IPEX optimizations, use:

```
MALLOC_CONF=oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:9000000000,muzzy_decay_ms:9000000000 python -m intel_extension_for_pytorch.cpu.launch --ninstance <number of instances> --enable_jemalloc scripts/txt2img.py --prompt "a corgi is playing guitar, oil on canvas" --ckpt <path/to/model.ckpt/> --config configs/stable-diffusion/intel/v2-inference-fp32.yaml --n_samples 1 --n_iter 4 --precision full --device cpu --torchscript --ipex
```

If you're using a CPU that supports `bfloat16`, consider sampling from the model with bfloat16 enabled for a performance boost, like so:

```bash
# SD2.1-v
MALLOC_CONF=oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:9000000000,muzzy_decay_ms:9000000000 python -m intel_extension_for_pytorch.cpu.launch --ninstance <number of instances> --enable_jemalloc scripts/txt2img.py --prompt "a corgi is playing guitar, oil on canvas" --ckpt <path/to/768model.ckpt/> --config configs/stable-diffusion/intel/v2-inference-v-bf16.yaml --H 768 --W 768 --precision full --device cpu --torchscript --ipex --bf16
# SD2.1-base
MALLOC_CONF=oversize_threshold:1,background_thread:true,metadata_thp:auto,dirty_decay_ms:9000000000,muzzy_decay_ms:9000000000 python -m intel_extension_for_pytorch.cpu.launch --ninstance <number of instances> --enable_jemalloc scripts/txt2img.py --prompt "a corgi is playing guitar, oil on canvas" --ckpt <path/to/model.ckpt/> --config configs/stable-diffusion/intel/v2-inference-bf16.yaml --precision full --device cpu --torchscript --ipex --bf16
```
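
The commands above are long and differ only in a few arguments, so a small helper can make the structure easier to see. The sketch below assembles the same command line from its parts using only flags that appear in this section; the function name `build_ipex_command` and its parameters are hypothetical, and the helper only builds a string, it does not launch anything.

```python
import shlex

# jemalloc tuning taken verbatim from the commands in this section.
MALLOC_CONF = ",".join([
    "oversize_threshold:1",
    "background_thread:true",
    "metadata_thp:auto",
    "dirty_decay_ms:9000000000",
    "muzzy_decay_ms:9000000000",
])


def build_ipex_command(prompt, ckpt, config, ninstance, size=None, bf16=False):
    """Return a shell command string for a TorchScript+IPEX txt2img run."""
    argv = [
        "python", "-m", "intel_extension_for_pytorch.cpu.launch",
        "--ninstance", str(ninstance), "--enable_jemalloc",
        "scripts/txt2img.py",
        "--prompt", prompt,
        "--ckpt", ckpt,
        "--config", config,
    ]
    if size is not None:
        argv += ["--H", str(size), "--W", str(size)]
    argv += ["--precision", "full", "--device", "cpu", "--torchscript", "--ipex"]
    if bf16:
        argv.append("--bf16")
    # shlex.join quotes the prompt so it reaches the script as one argument.
    return "MALLOC_CONF=" + MALLOC_CONF + " " + shlex.join(argv)


cmd = build_ipex_command(
    prompt="a corgi is playing guitar, oil on canvas",
    ckpt="path/to/768model.ckpt",  # placeholder path
    config="configs/stable-diffusion/intel/v2-inference-v-bf16.yaml",
    ninstance=1,
    size=768,
    bf16=True,
)
```

Building the argument list first and quoting it in one place avoids hand-escaping the prompt, which is easy to get wrong when copying commands between shells.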

### Image Modification with Stable Diffusion

![depth2img-stable2](assets/stable-samples/depth2img/merged-0000.png)
@@ -152,7 +209,7 @@ and the diffusion model is then conditioned on the (relative) depth output.

<p align="center">
<b> depth2image </b><br/>
<img src=assets/stable-samples/depth2img/d2i.gif/>
<img src=assets/stable-samples/depth2img/d2i.gif>
</p>

This model is particularly useful for a photorealistic style; see the [examples](assets/stable-samples/depth2img).
Binary file added assets/stable-samples/stable-unclip/panda.jpg
1 change: 1 addition & 0 deletions checkpoints/checkpoints.txt
@@ -0,0 +1 @@
Put unCLIP checkpoints here.
37 changes: 37 additions & 0 deletions configs/karlo/decoder_900M_vit_l.yaml
@@ -0,0 +1,37 @@
model:
  type: t2i-decoder
  diffusion_sampler: uniform
  hparams:
    image_size: 64
    num_channels: 320
    num_res_blocks: 3
    channel_mult: ''

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

channel_mult should be a list, not an empty string.

The channel_mult parameter is set to an empty string, but UNet architectures expect a list of integer multipliers (e.g., [1, 2, 4, 4] as seen in the Stable Diffusion configs). This will cause a runtime error when the model attempts to iterate over channel multipliers during initialization.

🔧 Proposed fix
-    channel_mult: ''
+    channel_mult: [1, 2, 4, 4]
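
Whether the empty string actually breaks initialization depends on how the loader treats it. In model factories that follow the guided-diffusion convention, an empty `channel_mult` means "pick a default from `image_size`", and a comma-separated string is split into integers, much like `attention_resolutions: 32,16,8` in this file. The sketch below shows that convention; the function name and the default table are assumptions for illustration, and it is worth checking the Karlo loader in this PR before applying the suggestion.

```python
# Hypothetical helper modelled on the guided-diffusion convention.
# The defaults below are assumptions, not values read from this repository.
_DEFAULT_CHANNEL_MULT = {
    256: (1, 1, 2, 2, 4, 4),
    128: (1, 1, 2, 3, 4),
    64: (1, 2, 3, 4),
}


def resolve_channel_mult(channel_mult, image_size):
    """Normalize a channel_mult config value into a tuple of ints."""
    if isinstance(channel_mult, (list, tuple)):
        # An explicit YAML list is used as given.
        return tuple(int(m) for m in channel_mult)
    if channel_mult == "":
        # Empty string: fall back to a size-dependent default.
        if image_size not in _DEFAULT_CHANNEL_MULT:
            raise ValueError(f"unsupported image size: {image_size}")
        return _DEFAULT_CHANNEL_MULT[image_size]
    # Comma-separated string such as "1,2,4,4".
    return tuple(int(m) for m in channel_mult.split(","))
```

If the loader does behave this way, replacing `''` with `[1, 2, 4, 4]` would change the architecture rather than fix a crash, and the result may no longer match the released checkpoint.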

    attention_resolutions: 32,16,8
    num_heads: -1
    num_head_channels: 64
    num_heads_upsample: -1
    use_scale_shift_norm: true
    dropout: 0.1
    clip_dim: 768
    clip_emb_mult: 4
    text_ctx: 77
    xf_width: 1536
    xf_layers: 0
    xf_heads: 0
    xf_final_ln: false
    resblock_updown: true
    learn_sigma: true
    text_drop: 0.3
    clip_emb_type: image
    clip_emb_drop: 0.1
    use_plm: true

diffusion:
  steps: 1000
  learn_sigma: true
  sigma_small: false
  noise_schedule: squaredcos_cap_v2
  use_kl: false
  predict_xstart: false
  rescale_learned_sigmas: true
  timestep_respacing: ''
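
The `noise_schedule: squaredcos_cap_v2` entry in the config above names the capped squared-cosine schedule used in GLIDE-style diffusion code. A minimal sketch of how such a schedule is commonly computed is shown below; the function name is illustrative, and the constants (offset 0.008, cap 0.999) are the usual defaults rather than values read from this PR.

```python
import math


def squaredcos_cap_v2_betas(num_steps, max_beta=0.999):
    """Betas derived from a squared-cosine cumulative alpha, capped at max_beta."""

    def alpha_bar(t):
        # Cumulative signal level at continuous time t in [0, 1].
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    betas = []
    for i in range(num_steps):
        t1 = i / num_steps
        t2 = (i + 1) / num_steps
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return betas


betas = squaredcos_cap_v2_betas(1000)  # matches `steps: 1000` in the config
```

The cap matters only near the end of the schedule, where the cumulative alpha approaches zero and the uncapped beta would approach one.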
27 changes: 27 additions & 0 deletions configs/karlo/improved_sr_64_256_1.4B.yaml
@@ -0,0 +1,27 @@
model:
  type: improved_sr_64_256
  diffusion_sampler: uniform
  hparams:
    channels: 320
    depth: 3
    channels_multiple:
      - 1
      - 2
      - 3
      - 4
    dropout: 0.0

diffusion:
  steps: 1000
  learn_sigma: false
  sigma_small: true
  noise_schedule: squaredcos_cap_v2
  use_kl: false
  predict_xstart: false
  rescale_learned_sigmas: true
  timestep_respacing: '7'


sampling:
  timestep_respacing: '7' # fix
  clip_denoise: true
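
The `timestep_respacing: '7'` setting above asks the sampler to use 7 of the 1000 training steps. One simple way to pick such a subset is to space the indices evenly across the full range, as sketched below. This is an illustration with a hypothetical function name; the rounding used by the actual Karlo code may differ.

```python
def respace_timesteps(num_train_steps, num_sample_steps):
    """Pick evenly spaced timestep indices that include the first and last step."""
    if num_sample_steps < 2 or num_sample_steps > num_train_steps:
        raise ValueError("num_sample_steps must be in [2, num_train_steps]")
    stride = (num_train_steps - 1) / (num_sample_steps - 1)
    return [round(i * stride) for i in range(num_sample_steps)]


steps = respace_timesteps(1000, 7)
```

With so few steps, each denoising step covers a large part of the schedule, which is why respacing is usually paired with a model trained or tuned for it.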
21 changes: 21 additions & 0 deletions configs/karlo/prior_1B_vit_l.yaml
@@ -0,0 +1,21 @@
model:
  type: prior
  diffusion_sampler: uniform
  hparams:
    text_ctx: 77
    xf_width: 2048
    xf_layers: 20
    xf_heads: 32
    xf_final_ln: true
    text_drop: 0.2
    clip_dim: 768

diffusion:
  steps: 1000
  learn_sigma: false
  sigma_small: true
  noise_schedule: squaredcos_cap_v2
  use_kl: false
  predict_xstart: true
  rescale_learned_sigmas: false
  timestep_respacing: ''