AutoCaption

AutoCaption is a simple script that generates captions for images using the llava-v1.5-13b vision model.

Installation

To get started, clone the repository and install the necessary dependencies:

git clone --recurse-submodules -j8 https://github.com/akiselevprivate/AutoCaption.git
cd AutoCaption
pip install --upgrade pip  # Enable PEP 660 support
pip install -e LLaVA/      # Install LLaVA module
pip install -r requirements.txt  # Install other dependencies

huggingface-cli login  # Log in with your access token to download the weights; skip if the token is already set as an environment variable

Usage

Once you have everything installed, you can use the script to generate captions for all images in a specified folder.

Basic Command

python main.py <image_folder> --prefix "<prefix>" --suffix "<suffix>" --encoder_prompt "<encoder_prompt>"

<image_folder>: Path to the folder containing the images you want to caption.
<prefix>: Optional prefix that will be added to the caption (default is empty).
<suffix>: Optional suffix that will be added to the caption (default is empty).
<encoder_prompt>: Optional addition to the prompt for the encoder model (default is empty).
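
The options above map naturally onto a standard command-line parser. The following is a minimal sketch, assuming the script uses Python's built-in argparse module; the actual main.py may parse its arguments differently, and the build_parser name and the help strings are illustrative only.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the documented options; not copied from main.py.
    parser = argparse.ArgumentParser(
        description="Generate captions for a folder of images."
    )
    parser.add_argument("image_folder", help="Folder containing the images to caption")
    parser.add_argument("--prefix", default="", help="Text added before each caption")
    parser.add_argument("--suffix", default="", help="Text added after each caption")
    parser.add_argument("--encoder_prompt", default="", help="Optional addition to the prompt")
    return parser


# Parse an explicit argument list so the sketch can run without a real command line.
args = build_parser().parse_args(["images", "--prefix", "Photo of [trigger], "])
print(args.image_folder, repr(args.prefix), repr(args.suffix), repr(args.encoder_prompt))
```

With this layout, only the image folder is required; any option that is omitted falls back to an empty string, matching the defaults listed above.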

Example

python main.py images --prefix "Photo of [trigger], " --encoder_prompt "for a t5 text encoder"

This will generate captions for all images in the images/ folder, and each caption will start with "Photo of [trigger], " followed by the description of the image generated by the model.

How It Works

Image Folder: The script reads the images from the folder specified.
Captioning: The script uses a pre-trained model (llava-v1.5-13b) to generate captions.
Prefix/Suffix: You can customize the captions with a prefix and/or suffix.
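
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in caption.py or main.py: the names list_images, build_caption, and caption_folder are hypothetical, the set of image extensions is assumed, and the describe argument stands in for the call to the llava-v1.5-13b model.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Assumed set of extensions; the real script may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def list_images(folder: str) -> List[Path]:
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def build_caption(description: str, prefix: str = "", suffix: str = "") -> str:
    """Wrap a model-generated description with the optional prefix and suffix."""
    return f"{prefix}{description}{suffix}"


def caption_folder(
    folder: str,
    describe: Callable[[Path], str],
    prefix: str = "",
    suffix: str = "",
) -> Dict[str, str]:
    """Map each image file name to its final caption.

    `describe` is a placeholder for the vision model call.
    """
    return {
        p.name: build_caption(describe(p), prefix, suffix)
        for p in list_images(folder)
    }


# The description here is a made-up placeholder, not real model output.
print(build_caption("a dog on a beach", prefix="Photo of [trigger], "))
```

Keeping the prefix and suffix handling separate from the model call means the same description can be reused with different wrappers without running the model again.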

License

This project is licensed under the MIT License.
