DeepSeek R1 32B

Name: DeepSeek R1 32B
Author: DeepSeek

License: MIT

DeepSeek · 32B · transformer-decoder

Released 2025-01-20 · 131K context · 32B params

Use Cases

chat, code, reasoning, math, writing

Quantization Options

Quant	Bits	VRAM	Quality
Q4_K_M (recommended)	4	20.7 GB	Good
Q5_K_M	5	23.9 GB	Good
Q8_0	8	34.0 GB	Excellent
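
To make the table easier to apply, here is a minimal Python sketch that picks the largest listed quantization fitting a given VRAM budget. The VRAM figures are copied from the table above; the function name `pick_quant` and the `headroom_gb` parameter (spare memory you may want to reserve for other processes or a longer context) are hypothetical and not part of any tool.

```python
# VRAM requirements in GB, copied from the Quantization Options table.
QUANT_VRAM_GB = {
    "Q4_K_M": 20.7,
    "Q5_K_M": 23.9,
    "Q8_0": 34.0,
}


def pick_quant(available_vram_gb, headroom_gb=0.0):
    """Return the largest listed quant that fits the budget, or None.

    headroom_gb is a hypothetical safety margin subtracted from the
    available VRAM before comparing against the table.
    """
    budget = available_vram_gb - headroom_gb
    fitting = [(vram, name) for name, vram in QUANT_VRAM_GB.items() if vram <= budget]
    if not fitting:
        return None
    # The quant needing the most VRAM is the highest-precision one that fits.
    return max(fitting)[1]
```

On a 24 GB card, `pick_quant(24.0)` returns `"Q5_K_M"`, which fits on paper with only 0.1 GB to spare; reserving 1 GB with `pick_quant(24.0, headroom_gb=1.0)` returns `"Q4_K_M"` instead.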

About this model

DeepSeek R1 32B is a distilled reasoning model built on the Qwen 2.5 32B architecture. It offers strong chain-of-thought reasoning in a size that fits on high-end consumer hardware, and it is a clear step up from the 14B variant on complex reasoning tasks. It is strongest at multi-step mathematical proofs, algorithmic problem solving, and analytical writing. At Q4_K_M quantization it needs about 20.7 GB of VRAM, so it fits on a single 24 GB GPU, which makes it a practical choice for users who want strong reasoning without a multi-GPU setup.
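
Because the model reasons step by step before answering, applications often want to separate the reasoning trace from the final answer. The sketch below assumes the reasoning is wrapped in `<think>...</think>` tags in the raw output, which this page does not state; check the output of your own runtime before relying on it. The helper name `split_reasoning` is hypothetical.

```python
import re

# Assumption: the chain of thought is emitted inside <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split raw model output into (reasoning, answer).

    If no <think> block is found, reasoning is an empty string and the
    whole text is treated as the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is the visible answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

For example, `split_reasoning("<think>\n2 + 2 = 4\n</think>\n\nThe answer is 4.")` returns `("2 + 2 = 4", "The answer is 4.")`.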

Benchmarks

MMLU: 83.2

Install

Ollama

ollama run deepseek-r1:32b-q4_K_M
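
Once the model has been pulled, it can also be called from a script through Ollama's local HTTP API. The following is a minimal sketch using only the Python standard library. It assumes an Ollama server running on the default port 11434 and reuses the model tag from the command above; the helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Tag copied from the `ollama run` command above.
MODEL_TAG = "deepseek-r1:32b-q4_K_M"


def build_request(prompt, model=MODEL_TAG):
    """Build a non-streaming POST request for the /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG, timeout=600):
    """Send the prompt and return the generated text.

    Requires a running Ollama server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Prove that the sum of two even numbers is even.")` returns the full response as one string. The generous timeout is a guess, since long reasoning traces on a 32B model can take a while.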

llama.cpp / GGUF

Download GGUF from HuggingFace

Specs

Parameters: 32B
Architecture: transformer-decoder
Context: 131K tokens
Min VRAM: 20.7 GB
Recommended: 20.7 GB
Family: DeepSeek R1
Released: 2025-01-20
License: MIT
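
The VRAM figures above do not say how much context they budget for, and the key-value cache grows linearly with context length. The sketch below gives a rough estimate of that cache size. All architecture defaults (64 layers, 8 key-value heads, head dimension 128, 16-bit cache values) are assumptions based on the published Qwen 2.5 32B configuration and are not stated on this page; the function names are hypothetical.

```python
def kv_cache_bytes(context_tokens, layers=64, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Estimate KV cache size in bytes for a given context length.

    Defaults are assumptions for a Qwen 2.5 32B style model with an
    unquantized 16-bit cache; pass your own values if they differ.
    """
    # Leading 2: one key tensor and one value tensor per layer.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens


def to_gib(num_bytes):
    """Convert bytes to GiB."""
    return num_bytes / 2**30
```

Under these assumptions, `to_gib(kv_cache_bytes(8192))` is 2.0 and `to_gib(kv_cache_bytes(131072))` is 32.0, so using the full 131K context would need far more memory than the minimum VRAM listed here.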