PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask
This repository is the official implementation of PromptDresser.
Jeongho Kim, Hoiyeong Jin, Sunghyun Park, Jaegul Choo
- [v] Inference code
- [v] Release model weights
- [ ] Training code
git clone https://github.com/rlawjdghek/PromptDresser
cd PromptDresser
conda create --name PromptDresser python=3.10 -y
conda activate PromptDresser
# install packages
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
python -m pip install diffusers==0.25.0
python -m pip install accelerate==0.31.0
# quote the requirement so the shell does not treat ">=" as an output redirection
python -m pip install "transformers>=4.25.1"
python -m pip install ftfy
python -m pip install Jinja2
python -m pip install datasets
python -m pip install wandb
python -m pip install onnxruntime-gpu==1.19.2
python -m pip install omegaconf
python -m pip install einops
python -m pip install torchmetrics
python -m pip install clean-fid
python -m pip install scikit-image
python -m pip install opencv-python
python -m pip install fvcore
python -m pip install cloudpickle
python -m pip install pycocotools
python -m pip install av
python -m pip install scipy
python -m pip install peft
python -m pip install huggingface-hub==0.24.6
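After installation, it can help to confirm that the pinned packages resolved to the versions listed above. The following is a minimal sketch, not part of the repository: `PINNED` and `check_pins` are hypothetical names, and the pins are copied from the install commands. It assumes that wheels from the cu121 index report a local suffix such as `2.1.0+cu121`, which is ignored during comparison.

```python
from importlib import metadata

# Versions pinned by the install commands above.
PINNED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "torchaudio": "2.1.0",
    "diffusers": "0.25.0",
    "accelerate": "0.31.0",
    "onnxruntime-gpu": "1.19.2",
    "huggingface-hub": "0.24.6",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: problem} for packages that are missing or mismatched."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Drop a local version suffix such as "+cu121" before comparing.
        if found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for name, problem in check_pins(PINNED).items():
        print(f"{name}: {problem}")
```

If the script prints nothing, every pinned package matched.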
First, download the SDXL and SDXL inpainting models into the pretrained_models folder using Git LFS.
Place the human parsing model and our checkpoint trained on VITON-HD in the checkpoints folder.
You can download the VITON-HD dataset from here, and the text file and two agnostic masks used for prompt-aware mask generation from here.
Inference requires the following dataset structure:
test_coarse
|-- image
|-- image-densepose
|-- agnostic-mask
|-- cloth
...
test_fine
|-- image
|-- image-densepose
|-- agnostic-mask
|-- cloth
...
test_pairs.txt
test_unpairs.txt
test_gpt4o.json
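Before running inference, the layout above can be checked with a short script. This is a sketch under one assumption: `test_coarse`, `test_fine`, and the three listing files sit directly under a single dataset root. The function name `missing_entries` is hypothetical, and only the entries shown in the tree are checked; anything elided by `...` is ignored.

```python
import sys
from pathlib import Path

# Entries taken from the directory tree above.
SPLIT_DIRS = ("test_coarse", "test_fine")
SUBDIRS = ("image", "image-densepose", "agnostic-mask", "cloth")
ROOT_FILES = ("test_pairs.txt", "test_unpairs.txt", "test_gpt4o.json")


def missing_entries(root):
    """List required directories and files that are absent under root."""
    root = Path(root)
    missing = [
        f"{split}/{sub}"
        for split in SPLIT_DIRS
        for sub in SUBDIRS
        if not (root / split / sub).is_dir()
    ]
    missing += [name for name in ROOT_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    # The dataset root is passed as the first argument; defaults to the current directory.
    dataset_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for entry in missing_entries(dataset_root):
        print(f"missing: {entry}")
```

An empty result means every listed directory and file was found.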
#### single gpu
CUDA_VISIBLE_DEVICES=0 python inference.py \
--config_p "./configs/VITONHD.yaml" \
--pretrained_unet_path "./checkpoints/VITONHD/model/pytorch_model.bin" \
--save_name VITONHD
#### multi-gpu
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --mixed_precision fp16 --num_processes 4 --multi_gpu inference.py \
--config_p "./configs/VITONHD.yaml" \
--pretrained_unet_path "./checkpoints/VITONHD/model/pytorch_model.bin" \
--save_name VITONHD
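The two commands above differ only in the launcher and the visible GPUs, so they can be assembled from a list of GPU ids. The helper below is a hypothetical convenience wrapper, not part of the repository. It reuses only the flags and paths shown above, and assumes that `python` and `accelerate` are on the `PATH` of the activated environment.

```python
import os
import subprocess

CONFIG = "./configs/VITONHD.yaml"
UNET = "./checkpoints/VITONHD/model/pytorch_model.bin"


def build_command(gpu_ids, save_name="VITONHD"):
    """Return (extra_env, argv) mirroring the single- or multi-GPU command."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    script_args = [
        "inference.py",
        "--config_p", CONFIG,
        "--pretrained_unet_path", UNET,
        "--save_name", save_name,
    ]
    if len(gpu_ids) == 1:
        argv = ["python"] + script_args
    else:
        argv = [
            "accelerate", "launch",
            "--mixed_precision", "fp16",
            "--num_processes", str(len(gpu_ids)),
            "--multi_gpu",
        ] + script_args
    extra_env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    return extra_env, argv


def run(gpu_ids, save_name="VITONHD"):
    """Launch inference from the repository root with the chosen GPUs."""
    extra_env, argv = build_command(gpu_ids, save_name)
    subprocess.run(argv, env={**os.environ, **extra_env}, check=True)
```

For example, `run([0])` corresponds to the single-GPU command and `run([0, 1, 2, 3])` to the multi-GPU one.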
If you find our work useful for your research, please cite us:
@misc{kim2024promptdresserimprovingqualitycontrollability,
title={PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask},
author={Jeongho Kim and Hoiyeong Jin and Sunghyun Park and Jaegul Choo},
year={2024},
eprint={2412.16978},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.16978},
}
Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).