This is the official implementation of the paper Diverse Score Distillation.
Yanbo Xu¹, Jayanth Srinivasa², Gaowen Liu², Shubham Tulsiani¹
¹Carnegie Mellon University, ²Cisco Research
Abstract: Score distillation of 2D diffusion models has proven to be a powerful mechanism to guide 3D optimization, for example enabling text-based 3D generation or single-view reconstruction. A common limitation of existing score distillation formulations, however, is that the outputs of the (mode-seeking) optimization are limited in diversity despite the underlying diffusion model being capable of generating diverse samples. In this work, inspired by the sampling process in denoising diffusion, we propose a score formulation that guides the optimization to follow generation paths defined by random initial seeds, thus ensuring diversity. We then present an approximation to adopt this formulation for scenarios where the optimization may not precisely follow the generation paths (e.g. a 3D representation whose renderings evolve in a co-dependent manner). We showcase the applications of our "Diverse Score Distillation" (DSD) formulation across tasks such as 2D optimization, text-based 3D inference, and single-view reconstruction. We also empirically validate DSD against prior score distillation formulations and show that it significantly improves sample diversity while preserving fidelity.
This project is built on Threestudio. Please follow the Threestudio installation guide to set up the environment.
If more than 40GB of VRAM is available, the full-resolution (512) results can be reproduced with the commands below:
python launch.py --config configs/dsd.yaml --train --gpu 0 system.prompt_processor.prompt="a toy robot" --seed 0
python launch.py --config configs/dsd.yaml --train --gpu 0 system.prompt_processor.prompt="pumpkin head zombie, skinny, highly detailed, photorealistic" --seed 0
python launch.py --config configs/dsd.yaml --train --gpu 0 system.prompt_processor.prompt="Mini Garden, highly detailed, 8K, HD." --seed 0
The results will be saved to outputs/diverse-score-distillation/.
To generate diverse samples, just change the seeds:
python launch.py --config configs/dsd.yaml --train --gpu 0 system.prompt_processor.prompt="a toy robot" --seed 0
python launch.py --config configs/dsd.yaml --train --gpu 0 system.prompt_processor.prompt="a toy robot" --seed 1000
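To produce several samples for one prompt, the two commands above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: `PROMPT`, `SEEDS`, `DRY_RUN`, and `run_seeds` are hypothetical names introduced here for illustration, and the seed values beyond 0 and 1000 are arbitrary. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: launch the same prompt with several seeds to obtain diverse samples.
# PROMPT, SEEDS and DRY_RUN are illustrative variables, not launch.py options.
PROMPT="${PROMPT:-a toy robot}"
SEEDS="${SEEDS:-0 1000 2000}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = actually train

run_seeds() {
  for seed in $SEEDS; do
    if [ "$DRY_RUN" = "1" ]; then
      # Print the command exactly as it would be typed by hand.
      echo "python launch.py --config configs/dsd.yaml --train --gpu 0 system.prompt_processor.prompt=\"$PROMPT\" --seed $seed"
    else
      # Run the trainings one after another on GPU 0.
      python launch.py --config configs/dsd.yaml --train --gpu 0 \
        system.prompt_processor.prompt="$PROMPT" --seed "$seed"
    fi
  done
}

run_seeds
```

Setting `DRY_RUN=0` runs the trainings sequentially; each seed is a separate invocation of the same command shown above.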
If VRAM is limited, we recommend reducing the rendering resolution by running the following command (tested on GPUs with 24GB of VRAM):
python launch.py --config configs/dsd_low_res.yaml --train --gpu 0 system.prompt_processor.prompt="a toy robot" --seed 0
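The choice between the two configs can be scripted. The sketch below is an assumption-laden helper, not part of the repository: `choose_config` and `THRESHOLD_MIB` are hypothetical names, and the 40960 MiB cutoff is only a literal reading of the "more than 40GB" guidance above, so adjust it to your hardware. It reads the total memory of GPU 0 with `nvidia-smi` when that tool is installed.

```shell
#!/bin/sh
# Hypothetical helper: map total GPU memory (MiB) to one of the two configs
# named in this README. The cutoff is an adjustable assumption, not a
# measured requirement.
THRESHOLD_MIB="${THRESHOLD_MIB:-40960}"

choose_config() {
  if [ "$1" -gt "$THRESHOLD_MIB" ]; then
    echo "configs/dsd.yaml"          # full resolution (512)
  else
    echo "configs/dsd_low_res.yaml"  # reduced rendering resolution
  fi
}

# Query GPU 0 only when nvidia-smi is installed; otherwise print nothing.
if command -v nvidia-smi >/dev/null 2>&1; then
  total_mib="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits -i 0 2>/dev/null | head -n 1)"
  if [ -n "$total_mib" ]; then
    echo "Suggested config: $(choose_config "$total_mib")"
  fi
fi
```

The printed path can then be passed to `--config` in the commands above.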
If you find our project useful, please consider citing it:
@article{xu2024diversescoredistillation,
title={Diverse Score Distillation},
author={Yanbo Xu and Jayanth Srinivasa and Gaowen Liu and Shubham Tulsiani},
journal={arXiv preprint arXiv:2412.06780},
year={2024}
}