Using this app, you can force DeepSeek R1 models to think more deeply by extending their reasoning process. It uses unsloth-optimized models for better performance, with context length limited only by available VRAM.
The app works by detecting when the model tries to end its thinking too early and replacing the end-of-thinking marker with prompts that encourage further reasoning, continuing until the minimum amount of thinking you configure has been reached.
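Conceptually, the generation loop looks something like the sketch below. This is a simplified illustration using plain Hugging Face transformers rather than the app's actual unsloth-based code; the model choice, continuation phrases, token threshold, and sampling settings are placeholders, not the app's real values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "unsloth/DeepSeek-R1-Distill-Qwen-1.5B"   # placeholder choice
MIN_THINKING_TOKENS = 512                              # user-configurable threshold
CONTINUATION_PHRASES = ["\nWait,", "\nHmm, let me double-check that."]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.bfloat16, device_map="auto"
)

def generate_with_forced_thinking(prompt: str) -> str:
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True, return_tensors="pt",
    ).to(model.device)

    response, thinking_tokens, phrase_idx = "", 0, 0
    while True:
        out = model.generate(
            input_ids, max_new_tokens=2048,
            do_sample=True, temperature=0.6, top_p=0.95,
        )
        new_ids = out[0, input_ids.shape[1]:]
        chunk = tokenizer.decode(new_ids, skip_special_tokens=False)
        thinking_tokens += new_ids.shape[0]   # rough running count

        # The model tried to close its <think> block before the minimum was
        # reached: drop the closing tag (and anything after it) and splice in
        # a continuation phrase so it keeps reasoning.
        if "</think>" in chunk and thinking_tokens < MIN_THINKING_TOKENS:
            kept = chunk.split("</think>")[0]
            kept += CONTINUATION_PHRASES[phrase_idx % len(CONTINUATION_PHRASES)]
            phrase_idx += 1
            response += kept
            cont_ids = tokenizer.encode(
                kept, add_special_tokens=False, return_tensors="pt"
            ).to(model.device)
            input_ids = torch.cat([input_ids, cont_ids], dim=-1)
            continue

        # Thinking is long enough (or the model finished normally): return
        # everything generated so far, including the final answer.
        return response + chunk
```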
App by anzorq. If you like it, please consider supporting me.
- 🤔 Force models to think longer and more thoroughly
- 🔄 Customizable reasoning extensions and thinking thresholds
- 🎯 Fine-grained control over model parameters (temperature, top-p, etc.)
- 💭 Visible thinking process with token count tracking
- 📝 LaTeX support for mathematical expressions
- 🖥️ Optimized for various VRAM configurations
- ♾️ Unlimited context length (VRAM-dependent)
- 🔄 Choose from multiple model sizes (1.5B to 70B parameters)
You can choose from any of the unsloth-optimized distilled DeepSeek R1 models:
- 1.5B parameters (Qwen): unsloth/DeepSeek-R1-Distill-Qwen-1.5B
- 7B parameters (Qwen): unsloth/DeepSeek-R1-Distill-Qwen-7B
- 14B parameters (Qwen): unsloth/DeepSeek-R1-Distill-Qwen-14B
- 32B parameters (Qwen): unsloth/DeepSeek-R1-Distill-Qwen-32B
- 8B parameters (LLaMA): unsloth/DeepSeek-R1-Distill-Llama-8B
- 70B parameters (LLaMA): unsloth/DeepSeek-R1-Distill-Llama-70B
Choose the model size based on your available VRAM and performance requirements. Larger models generally provide better quality responses but require more VRAM. Qwen and LLaMA architectures may perform differently on various tasks.
Note: You can run models up to 14B parameters on a free Google Colab T4 GPU.
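A model can be loaded with unsloth roughly as follows. This is a minimal sketch assuming unsloth's standard `FastLanguageModel` API; the exact settings the app uses may differ.

```python
from unsloth import FastLanguageModel

# 4-bit loading greatly reduces the memory footprint, which is what makes the
# 14B model fit within a free Colab T4's ~16 GB of VRAM.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Qwen-14B",
    max_seq_length=8192,   # raise if you have spare VRAM; context is VRAM-bound
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable unsloth's faster inference path
```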
The paper "s1: Simple test-time scaling" is an independent work by Niklas Muennighoff et al. that tests and validates the approach used in this repository. The key contributions of the paper include:
- Developing budget forcing to control test-time compute by forcefully terminating the model’s thinking process or lengthening it by appending “Wait” multiple times to the model’s generation.
- Curating a small dataset s1K of 1,000 questions paired with reasoning traces.
- Achieving strong reasoning performance and test-time scaling with the Qwen2.5-32B-Instruct language model.
For more details, see the paper's repository.
- Original idea and implementation - vgel's gist
- DeepSeek LLM - https://github.com/deepseek-ai/DeepSeek-LLM
- unsloth - https://github.com/unslothai/unsloth
- Gradio - https://github.com/gradio-app/gradio
This project is licensed under the MIT License. See the LICENSE file for details.