A curated, up-to-date list of awesome papers on on-device Large Language Models (LLMs) / Small Language Models (SLMs) research.
When the AGI era arrives, on-device LLMs / SLMs will likely play an increasingly critical role. They offer unique advantages in privacy, responsiveness, and user-centric customization that cloud-based models may not match. However, LLMs are typically large and resource-intensive, making them difficult to deploy on devices with limited compute and memory. Techniques such as model pruning, quantization, and distillation can yield efficient small language models (SLMs) that retain performance while remaining lightweight enough for mobile devices, enabling a wider range of applications and improving accessibility.
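To make the quantization idea above concrete, here is a minimal sketch of symmetric per-tensor int8 weight quantization in plain Python. It is illustrative only; the function names are my own, and production on-device stacks use far more elaborate schemes (per-channel scales, activation-aware calibration as in AWQ, sub-byte formats as in the 1.58-bit work listed below).

```python
def quantize_int8(weights):
    """Map float weights to int8 values plus a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # avoid divide-by-zero for all-zero tensors
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [x * scale for x in q]

w = [0.12, -0.5, 0.33, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
# each recovered weight is within one quantization step of the original
assert all(abs(a - b) <= s for a, b in zip(w, w_hat))
```

Storing 8-bit codes plus one float scale cuts weight memory roughly 4x versus float32, which is the basic trade-off all the quantization papers below refine.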
If you find interesting work or projects, please reach out via issues or email: withhaotian [at] gmail [dot] com.
This list focuses only on on-device LLM / SLM research. If you are interested in edge AI computing and systems, please refer to awesome-edge-AI-papers.
This project is licensed under the GPL-3.0 license - see the LICENSE file for details.
- [arXiv'24] On-Device Language Models: A Comprehensive Review - [PDF] [Code]
- [arXiv'24] A Survey of Small Language Models - [PDF]
- [arXiv'24] Small Language Models: Survey, Measurements, and Insights - [PDF] [Code] [Demo]
- [arXiv'24] A Survey of Resource-efficient LLM and Multimodal Foundation Models - [PDF] [Code]
- [arXiv'24] A Survey on Model Compression for Large Language Models - [PDF]
- [arXiv'24] OpenELM: An Efficient Language Model Family with Open Training and Inference Framework - [PDF] [Code] [HuggingFace]
- [arXiv'24] Fox-1 Technical Report - [PDF] [HuggingFace]
- [arXiv'24] TinyLlama: An Open-source Small Language Model - [PDF] [Code]
- [arXiv'24] MobileVLM V2: Faster and Stronger Baseline for Vision Language Model - [PDF] [Code]
- [arXiv'24] The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits - [PDF]
- [arXiv'24] Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone - [PDF]
- [arXiv'24] MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT - [PDF] [Code]
- [arXiv'24] vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving - [PDF]
- [arXiv'24] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration - [PDF] [Code]
- [arXiv'24] Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation - [PDF]
- [arXiv'24] MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases - [PDF]
- [EdgeFM'24] Large Language Models on Mobile Devices: Measurements, Analysis, and Insights - [PDF]
- [arXiv'24] Toward Scalable Generative AI via Mixture of Experts in Mobile Edge Networks - [PDF]