📒A curated list of Awesome Diffusion Inference papers with code. For LLM inference, see 📖Awesome-LLM-Inference.
- 📙Awesome Diffusion Inference with Sampling
- 📙Awesome Diffusion Inference with Caching
- 📙Awesome Diffusion Inference with Multi-GPUs
- 📙Other Awesome Diffusion Inference Papers with codes
@misc{Awesome-Diffusion-Inference@2024,
title={Awesome-Diffusion-Inference: A small curated list of Awesome Diffusion Inference with Distributed/Caching/Sampling.},
url={https://github.com/DefTruth/Awesome-Diffusion-Inference},
note={Open-source software available at https://github.com/DefTruth/Awesome-Diffusion-Inference},
author={DefTruth},
year={2024}
}
Date | Title | Paper | Code | Recom |
---|---|---|---|---|
2020.06 | 🔥[DDPM] Denoising Diffusion Probabilistic Models(@UC Berkeley) | [pdf] | [diffusion] | ⭐️⭐️ |
2020.10 | 🔥[DDIM] DENOISING DIFFUSION IMPLICIT MODELS(@cs.stanford.edu) | [pdf] | | ⭐️⭐️ |
2022.02 | 🔥[PNDM] PSEUDO NUMERICAL METHODS FOR DIFFUSION MODELS ON MANIFOLDS(@) | [pdf] | [PNDM] | ⭐️⭐️ |
2022.02 | 🔥[DPM-Solver] DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps(@Cheng Lu) | [pdf] | [dpm-solver] | ⭐️⭐️ |
2022.11 | 🔥[DPM-Solver++] DPM-SOLVER++: FAST SOLVER FOR GUIDED SAMPLING OF DIFFUSION PROBABILISTIC MODELS(@Cheng Lu) | [pdf] | [dpm-solver] | ⭐️⭐️ |
2023.10 | 🔥[DPM-Solver-v3] DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics(@Kaiwen Zheng) | [pdf] | [DPM-Solver-v3] | ⭐️⭐️ |
2023.11 | 🔥[Parallel Sampling] Parallel Sampling of Diffusion Models(@Stanford University) | [pdf] | [paradigms] | ⭐️⭐️ |
2023.11 | 🔥[SAMPLER SCHEDULER] SAMPLER SCHEDULER FOR DIFFUSION MODELS(@sysu) | [pdf] | | ⭐️⭐️ |
2024.02 | 🔥[Parallel Sampling] Accelerating Parallel Sampling of Diffusion Models(@Zhiwei Tang) | [pdf] | [ParaTAA-Diffusion] | ⭐️⭐️ |
2024.01 | 🔥[YONOS] You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation(@Samsung AI) | [pdf] | | ⭐️⭐️ |
2024.01 | 🔥[S^2-DM] S^2-DMs: Skip-Step Diffusion Models(@Yixuan Wang) | [pdf] | | ⭐️⭐️ |
2024.08 | 🔥[StepSaver] StepSaver: Predicting Minimum Denoising Steps for Diffusion Model Image Generation(@intel) | [pdf] | | ⭐️⭐️ |
2024.09 | 🔥[DC-Solver] DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation(@Tsinghua University) | [pdf] | [DC-Solver] | ⭐️⭐️ |
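
For orientation, the deterministic (eta = 0) DDIM update that many of the fast samplers listed above build on can be sketched in a few lines. This is only an illustration, not code from any listed repository: the function name `ddim_step`, the plain-list representation of a sample, and the example values are assumptions made for the sketch.

```python
import math


def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) on a list of floats.

    x_t            : current noisy sample
    eps            : predicted noise for x_t (would come from the model)
    alpha_bar_t    : cumulative alpha product at the current timestep
    alpha_bar_prev : cumulative alpha product at the target timestep
    """
    sqrt_ab_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_ab_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_ab_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_ab_prev = math.sqrt(1.0 - alpha_bar_prev)

    out = []
    for x, e in zip(x_t, eps):
        # Estimate the clean sample implied by the noise prediction.
        x0_pred = (x - sqrt_one_minus_ab_t * e) / sqrt_ab_t
        # Re-noise that estimate to the target timestep's noise level.
        out.append(sqrt_ab_prev * x0_pred + sqrt_one_minus_ab_prev * e)
    return out


# Toy check: build x_t from a known x0 and noise, then jump straight to
# alpha_bar_prev = 1.0 (no noise left) using the true noise as the prediction.
x0 = [1.0, -2.0]
noise = [0.5, 0.25]
alpha_bar = 0.36
x_t = [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * n
       for a, n in zip(x0, noise)]
recovered = ddim_step(x_t, noise, alpha_bar, 1.0)
```

Because the update does not have to visit every training timestep, a sampler can apply it along a short, strided subsequence of timesteps, which is where the step reduction comes from.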
- UNet Based (DeepCache)
![image](https://private-user-images.githubusercontent.com/31974251/352797438-a7257462-80d3-40af-a4ce-3550508fabe7.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkwODI0NzIsIm5iZiI6MTczOTA4MjE3MiwicGF0aCI6Ii8zMTk3NDI1MS8zNTI3OTc0MzgtYTcyNTc0NjItODBkMy00MGFmLWE0Y2UtMzU1MDUwOGZhYmU3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA5VDA2MjI1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTliMDAxMzM5MTc5NjZkNTkwZTQ2MzlkMjIwOTgwZGIyYzIxMDMxZWM0MGRjMjY0OTIxMmYwMzUyNjY3ZjUzYzImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.6avTGF0Fe7sYku6wTTAuQ-b_fWxAqze-SA6tbe4kFJs)
- DiT Based (Fast-Forward Caching)
![image](https://private-user-images.githubusercontent.com/31974251/352797183-fad8f187-d4ac-4290-9943-7b34116fed05.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkwODI0NzIsIm5iZiI6MTczOTA4MjE3MiwicGF0aCI6Ii8zMTk3NDI1MS8zNTI3OTcxODMtZmFkOGYxODctZDRhYy00MjkwLTk5NDMtN2IzNDExNmZlZDA1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA5VDA2MjI1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWYxNDFjNzc3MzU4NjgwMGE4NDBiYmY0NWQ1Y2Y4MTMwNDI3YWVjNTIwNDY4OGVkZmM3OTQwYzc2NDA4NzVlNzImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.rd9a5KkIfZ0gVCW0TfmC60szZC64S7VO3R6wGYLUcyo)
Date | Title | Paper | Code | Recom |
---|---|---|---|---|
2023.05 | 🔥🔥[Cache-Enabled Sparse Diffusion] Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference(@pku.edu.cn etc) | [pdf] | | ⭐️⭐️ |
2023.12 | 🔥🔥[DeepCache] DeepCache: Accelerating Diffusion Models for Free(@nus.edu) | [pdf] | [DeepCache] | ⭐️⭐️ |
2023.12 | 🔥🔥[Block Caching] Cache Me if You Can: Accelerating Diffusion Models through Block Caching(@Meta GenAI etc) | [pdf] | | ⭐️⭐️ |
2023.12 | 🔥🔥[Approximate Caching] Approximate Caching for Efficiently Serving Diffusion Models(@Adobe) | [pdf] | | ⭐️⭐️ |
2024.06 | 🔥🔥[Layer Caching] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching(@nus.edu) | [pdf] | [learning-to-cache] | ⭐️⭐️ |
2024.07 | 🔥[ElasticCache-LVLM] Efficient Inference of Vision Instruction-Following Models with Elastic Cache(@Tsinghua University etc) | [pdf] | [ElasticCache] | ⭐️ |
2024.07 | 🔥🔥[Fast-Forward Caching(DiT)] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration(@microsoft.com etc) | [pdf] | [FORA] | ⭐️⭐️ |
2024.07 | 🔥🔥[Faster I2V Generation] Faster Image2Video Generation: A Closer Look at CLIP Image Embedding’s Impact on Spatio-Temporal Cross-Attentions(@Ashkan Taghipour etc) | [pdf] | | ⭐️⭐️ |
2024.04 | 🔥🔥[T-GATE V1] Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models(@Wentian Zhang etc) | [pdf] | [T-GATE] | ⭐️⭐️ |
2024.04 | 🔥🔥[T-GATE V2] Faster Diffusion via Temporal Attention Decomposition(@Haozhe Liu etc) | [pdf] | [T-GATE] | ⭐️⭐️ |
2024.06 | 🔥🔥[DiTFastAttn] DiTFastAttn: Attention Compression for Diffusion Transformer Models(@Zhihang Yuan etc) | [pdf] | [DiTFastAttn] | ⭐️⭐️ |
2024.06 | 🔥🔥[∆-DiT] ∆-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers(@Fudan University) | [pdf] | | ⭐️⭐️ |
2024.09 | 🔥🔥[TokenCache] Token Caching for Diffusion Transformer Acceleration(@Institute of Automation, Chinese Academy of Sciences) | [pdf] | | ⭐️⭐️ |
2024.11 | 🔥🔥[AdaCache] Adaptive Caching for Faster Video Generation with Diffusion Transformers(@Meta) | [pdf] | [AdaCache] | ⭐️⭐️ |
2024.11 | 🔥🔥[TeaCache] Timestep Embedding Tells: It’s Time to Cache for Video Diffusion Model(@Alibaba) | [pdf] | [TeaCache] | ⭐️⭐️ |
2024.11 | 🔥🔥[LazyDiT] LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers(@Adobe Research) | [pdf] | | ⭐️⭐️ |
2024.11 | 🔥🔥[Ca2-VDM] Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing(@ZJU) | [pdf] | [CausalCache-VDM] | ⭐️⭐️ |
2024.11 | 🔥🔥[SmoothCache] SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers(@Roblox) | [pdf] | [SmoothCache] | ⭐️⭐️ |
2024.10 | 🔥🔥[FasterCache] FASTERCACHE: TRAINING-FREE VIDEO DIFFUSION MODEL ACCELERATION WITH HIGH QUALITY(@S-Lab) | [pdf] | [FasterCache] | ⭐️⭐️ |
2024.10 | 🔥🔥[ToCa] ToCa: Accelerating Diffusion Transformers with Token-wise Feature Caching(@SJTU) | [pdf] | [ToCa] | ⭐️⭐️ |
2024.11 | 🔥🔥[SkipCache] Accelerating Vision Diffusion Transformers with Skip Branches(@SJTU) | [pdf] | [Skip-DiT] | ⭐️⭐️ |
2024.12 | 🔥🔥[DuCa] Accelerating Diffusion Transformers with Dual Feature Caching(@SJTU) | [pdf] | [DuCa] | ⭐️⭐️ |
2025.01 | 🔥🔥[FBCache] Fastest HunyuanVideo Inference with Context Parallelism and First Block Cache on NVIDIA L20 GPUs(@chengzeyi) | [docs] | [ParaAttention] | ⭐️⭐️ |
2025.01 | 🔥🔥[FlexCache] FlexCache: Flexible Approximate Cache System for Video Diffusion(@University of Waterloo) | [pdf] | | ⭐️⭐️ |
2025.01 | 🔥🔥[Token Pruning] Token Pruning for Caching Better: 9× Acceleration on Stable Diffusion for Free(@SJTU) | [pdf] | [DaTo] | ⭐️⭐️ |
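
The idea shared by many of the caching entries above, reusing an expensive intermediate result across neighbouring denoising steps instead of recomputing it, can be sketched with a toy example. This is a generic illustration under stated assumptions, not the implementation of DeepCache, FORA, or any other listed method: `StepCache`, `run_denoising`, the fixed refresh interval, and the scalar "features" are all hypothetical stand-ins.

```python
class StepCache:
    """Recompute an expensive block every `interval` steps; reuse it otherwise."""

    def __init__(self, expensive_fn, interval):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.expensive_fn = expensive_fn
        self.interval = interval
        self.cached = None
        self.calls = 0  # number of times the expensive block actually ran

    def __call__(self, step, x):
        # Refresh on the first call and on every `interval`-th step.
        if self.cached is None or step % self.interval == 0:
            self.cached = self.expensive_fn(x)
            self.calls += 1
        return self.cached


def run_denoising(num_steps, interval):
    # Hypothetical stand-in for the costly deep layers of a denoiser.
    cache = StepCache(lambda x: 0.5 * x, interval)
    x = 1.0
    for step in range(num_steps):
        deep = cache(step, x)  # possibly stale, reused feature
        x = x - 0.1 * deep     # cheap per-step update (shallow-layer stand-in)
    return x, cache.calls


exact_x, exact_calls = run_denoising(num_steps=10, interval=1)
cached_x, cached_calls = run_denoising(num_steps=10, interval=3)
```

With `interval=3` the expensive block runs on steps 0, 3, 6, and 9 only, and the result drifts slightly from the uncached run. The listed papers differ mainly in which features they cache and how they decide when a cached value is still good enough.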
- UNet Based: Displaced Patch parallelism (DistriFusion)
![image](https://private-user-images.githubusercontent.com/31974251/352798949-aefb2ae7-73eb-4e9c-bf1a-ec540f4dfa7d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkwODI0NzIsIm5iZiI6MTczOTA4MjE3MiwicGF0aCI6Ii8zMTk3NDI1MS8zNTI3OTg5NDktYWVmYjJhZTctNzNlYi00ZTljLWJmMWEtZWM1NDBmNGRmYTdkLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA5VDA2MjI1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTNiNGUzMTI0YmYyMTY3YThkZjVjNTI2YTZjYjExYTRiZDA3NjNmYjY3ZTg1YjhiMmNiODQ1ZDU3NGIzOTY1ZWUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.FcByT_1eprsbv2rDkYg8leqTLMQC8fEjyFqISmpm7KM)
- DiT Based: Displaced Patch parallelism (PipeFusion)
![image](https://private-user-images.githubusercontent.com/31974251/352799269-692c5d54-19b3-4ce7-9613-9eb8bb035c7d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkwODI0NzIsIm5iZiI6MTczOTA4MjE3MiwicGF0aCI6Ii8zMTk3NDI1MS8zNTI3OTkyNjktNjkyYzVkNTQtMTliMy00Y2U3LTk2MTMtOWViOGJiMDM1YzdkLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA5VDA2MjI1MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTYwNTI5MjgwNTc5NTliN2Q5ZTE0Y2U3NDQ1M2JhYzcxNWYxM2RjYTkxNzc1ZGEzZGM0Y2FiM2NkMzdlZTRjOWYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.BLSxRFZZ65gRWUwF4z16oveQoUMIJW5EOgLqn7lzhuM)
Date | Title | Paper | Code | Recom |
---|---|---|---|---|
2024.02 | 🔥🔥[DistriFusion] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models(@MIT etc) | [pdf] | [distrifuser] | ⭐️⭐️ |
2024.05 | 🔥🔥[PipeFusion] PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models(@Tencent etc) | [pdf] | [xDiT] | ⭐️⭐️ |
2024.06 | 🔥🔥[AsyncDiff] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising(@nus.edu) | [pdf] | [AsyncDiff] | ⭐️⭐️ |
2024.05 | 🔥🔥[TensorRT-LLM SDXL] SDXL Distributed Inference with TensorRT-LLM and synchronous comm(@Zars19) | [pdf] | [SDXL-TensorRT-LLM] | ⭐️⭐️ |
2024.06 | 🔥🔥[Clip Parallelism] Video-Infinity: Distributed Long Video Generation(@nus.edu) | [pdf] | [Video-Infinity] | ⭐️⭐️ |
2024.05 | 🔥🔥[FIFO-Diffusion] FIFO-Diffusion: Generating Infinite Videos from Text without Training(@Seoul National University) | [pdf] | [FIFO-Diffusion] | ⭐️⭐️ |
Date | Title | Paper | Code | Recom |
---|---|---|---|---|
2024.08 | 🔥[Transfusion] Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model(@meta) | [pdf] | [transfusion-pytorch] | ⭐️⭐️ |
2024.08 | 🔥[VQ4DiT] VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers(@ZJU) | [pdf] | | ⭐️⭐️ |
2024.08 | 🔥[LBQ] Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models(@toronto.edu) | [pdf] | | ⭐️⭐️ |
2024.08 | 🔥[EE-Diffusion] A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models(@KAIST AI) | [pdf] | [ee-diffusion] | ⭐️⭐️ |
2024.08 | 🔥[TFM-PTQ] Temporal Feature Matters: A Framework for Diffusion Model Quantization(@SenseTime) | [pdf] | | ⭐️⭐️ |
2024.08 | 🔥[Diffusion-RWKV] Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models(@Zhengcong Fei) | [pdf] | [Diffusion-RWKV] | ⭐️⭐️ |
2024.09 | 🔥[LinFusion] LINFUSION: 1 GPU, 1 MINUTE, 16K IMAGE(@NUS) | [pdf] | [LinFusion] | ⭐️⭐️ |
2024.11 | 🔥🔥[SVDQuant] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | [pdf] | [nunchaku] | ⭐️⭐️ |
GNU General Public License v3.0
Stars and pull requests to this repo are welcome!