Llama 3.2 1B
by Meta · llama-3 family
1B parameters · text-generation · summarization
Llama 3.2 1B is the smallest model in the Llama 3.2 family, designed for ultra-lightweight deployment scenarios. It can handle basic text generation and summarization tasks while requiring minimal compute resources. This model is best suited for simple tasks, prototyping, or situations where hardware is extremely constrained. It runs on virtually any modern device and provides fast inference even on CPU-only setups.
Quick Start with Ollama

    ollama run llama3.2:1b-instruct-q8_0

| Creator | Meta |
| Parameters | 1B |
| Architecture | transformer-decoder |
| Context Length | 128K tokens |
| License | Llama 3.2 Community License |
| Released | Sep 25, 2024 |
| Ollama | llama3.2:1b |
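Beyond the CLI, the model can be queried programmatically through Ollama's local REST API (`POST /api/generate` on the default port 11434). Below is a minimal sketch of a summarization call; it assumes the Ollama server is running locally and that the `llama3.2:1b` tag has already been pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(text: str, model: str = "llama3.2:1b") -> dict:
    """Build the JSON payload for a non-streaming generate call."""
    return {
        "model": model,
        "prompt": f"Summarize the following text in one sentence:\n\n{text}",
        # stream=False returns one JSON object instead of newline-delimited chunks
        "stream": False,
    }

def summarize(text: str, model: str = "llama3.2:1b") -> str:
    """Send the prompt to the local Ollama server and return the completion."""
    payload = json.dumps(build_request(text, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `summarize(open("article.txt").read())` returns the model's one-sentence summary as a string.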
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M | 0.8 GB | 2.1 GB | ★★★★★ | 1b-instruct-q4_K_M |
| Q8_0 (recommended) | 0.9 GB | 3 GB | ★★★★★ | 1b-instruct-q8_0 |
| F16 | 1.9 GB | 4 GB | ★★★★★ | 1b-instruct-fp16 |
Compatible Hardware for Q8_0
Showing compatibility for the recommended quantization (Q8_0, 3 GB VRAM).
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 49.3 |