Phi-4 Mini 3.8B
by Microsoft · phi family
3.8B parameters
text-generation code-generation reasoning math summarization
Phi-4 Mini 3.8B is Microsoft's compact reasoning model with a 128K-token context window. It performs well above its size class on math and reasoning benchmarks, continuing the Phi tradition of showing that small, carefully trained models can compete with much larger ones. At Q4_K_M it needs about 4.5 GB of VRAM, so it fits on any 8 GB GPU or Mac with room to spare, which makes it a strong choice for users who want reasoning capability on resource-constrained hardware. It is well suited to math, coding, and structured tasks.
Quick Start with Ollama
ollama run phi4-mini:3.8b-q4_K_M
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 2.5 GB | 4.5 GB | ★★★★★ | 3.8b-q4_K_M |
| Q8_0 | 4.2 GB | 6.5 GB | ★★★★★ | 3.8b-q8_0 |
| F16 | 7.8 GB | 10.5 GB | ★★★★★ | 3.8b-fp16 |
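As a rough illustration of how the table above can be used, the sketch below picks the largest format that fits a given VRAM budget. The sizes are copied from the table; the function name `pick_quant` is hypothetical, and the rule "larger format means higher precision" is an assumption rather than something this page states.

```python
from typing import Optional

# (format, file size in GB, VRAM required in GB), copied from the table above
# and ordered from smallest to largest.
QUANTS = [
    ("Q4_K_M", 2.5, 4.5),
    ("Q8_0", 4.2, 6.5),
    ("F16", 7.8, 10.5),
]


def pick_quant(available_vram_gb: float) -> Optional[str]:
    """Return the largest format whose VRAM requirement fits, or None.

    Hypothetical helper: assumes the larger format is always preferable
    when it fits, which ignores speed and context-length trade-offs.
    """
    best = None
    for name, _file_gb, vram_gb in QUANTS:
        if vram_gb <= available_vram_gb:
            best = name  # later entries are larger, so keep overwriting
    return best


if __name__ == "__main__":
    for vram in (4, 8, 12):
        print(f"{vram} GB VRAM -> {pick_quant(vram)}")
```

On an 8 GB card this selects Q8_0, since its 6.5 GB requirement fits. Q4_K_M remains the page's recommended default and leaves more headroom for long contexts.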
Compatible Hardware
for Q4_K_M (4.5 GB VRAM)
| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~182 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~178 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~182 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~121 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~121 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~89 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | Runs | ~61 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | Runs | ~121 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | Runs | ~121 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | Runs | ~61 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | Runs | ~89 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | Runs | ~121 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | Runs | ~61 tok/s |
| Mac Studio M4 Max 36GB | 36 GB | mac | Runs | ~121 tok/s |
| MacBook Pro M3 Pro 36GB | 36 GB | mac | Runs | ~33 tok/s |
| NVIDIA GeForce RTX 5090 | 32 GB | gpu | Runs | ~398 tok/s |
| iMac M4 32GB | 32 GB | mac | Runs | ~27 tok/s |
| Mac mini M4 32GB | 32 GB | mac | Runs | ~27 tok/s |
| MacBook Air M4 32GB | 32 GB | mac | Runs | ~27 tok/s |
| AMD Radeon RX 7900 XTX | 24 GB | gpu | Runs | ~213 tok/s |
| NVIDIA GeForce RTX 3090 | 24 GB | gpu | Runs | ~208 tok/s |
| NVIDIA GeForce RTX 4090 | 24 GB | gpu | Runs | ~224 tok/s |
| iMac M3 24GB | 24 GB | mac | Runs | ~22 tok/s |
| Mac mini M2 24GB | 24 GB | mac | Runs | ~22 tok/s |
| Mac mini M4 Pro 24GB | 24 GB | mac | Runs | ~61 tok/s |
| MacBook Air M2 24GB | 24 GB | mac | Runs | ~22 tok/s |
| MacBook Air M4 24GB | 24 GB | mac | Runs | ~27 tok/s |
| MacBook Pro M4 Pro 24GB | 24 GB | mac | Runs | ~61 tok/s |
| AMD Radeon RX 7900 XT | 20 GB | gpu | Runs | ~178 tok/s |
| MacBook Pro M3 Pro 18GB | 18 GB | mac | Runs | ~33 tok/s |
| AMD Radeon RX 6800 XT | 16 GB | gpu | Runs | ~114 tok/s |
| AMD Radeon RX 7800 XT | 16 GB | gpu | Runs | ~139 tok/s |
| Intel Arc A770 | 16 GB | gpu | Runs | ~124 tok/s |
| NVIDIA GeForce RTX 4060 Ti 16GB | 16 GB | gpu | Runs | ~64 tok/s |
| NVIDIA GeForce RTX 4070 Ti Super | 16 GB | gpu | Runs | ~149 tok/s |
| NVIDIA GeForce RTX 4080 Super | 16 GB | gpu | Runs | ~164 tok/s |
| NVIDIA GeForce RTX 4080 | 16 GB | gpu | Runs | ~159 tok/s |
| NVIDIA GeForce RTX 5070 Ti | 16 GB | gpu | Runs | ~199 tok/s |
| NVIDIA GeForce RTX 5080 | 16 GB | gpu | Runs | ~213 tok/s |
| iMac M1 16GB | 16 GB | mac | Runs | ~15 tok/s |
| iMac M4 16GB | 16 GB | mac | Runs | ~27 tok/s |
| Mac mini M1 16GB | 16 GB | mac | Runs | ~15 tok/s |
| Mac mini M4 16GB | 16 GB | mac | Runs | ~27 tok/s |
| MacBook Air M2 16GB | 16 GB | mac | Runs | ~22 tok/s |
| MacBook Air M3 16GB | 16 GB | mac | Runs | ~22 tok/s |
| MacBook Air M4 16GB | 16 GB | mac | Runs | ~27 tok/s |
| MacBook Pro M1 16GB | 16 GB | mac | Runs | ~15 tok/s |
| MacBook Pro M2 Pro 16GB | 16 GB | mac | Runs | ~44 tok/s |
| AMD Radeon RX 7700 XT | 12 GB | gpu | Runs | ~96 tok/s |
| NVIDIA GeForce RTX 3060 12GB | 12 GB | gpu | Runs | ~80 tok/s |
| NVIDIA GeForce RTX 3080 12GB | 12 GB | gpu | Runs | ~203 tok/s |
| NVIDIA GeForce RTX 4070 Super | 12 GB | gpu | Runs | ~112 tok/s |
| NVIDIA GeForce RTX 4070 Ti | 12 GB | gpu | Runs | ~112 tok/s |
| NVIDIA GeForce RTX 4070 | 12 GB | gpu | Runs | ~112 tok/s |
| NVIDIA GeForce RTX 5070 | 12 GB | gpu | Runs | ~149 tok/s |
| NVIDIA GeForce RTX 2080 Ti | 11 GB | gpu | Runs | ~137 tok/s |
| NVIDIA GeForce RTX 3080 10GB | 10 GB | gpu | Runs | ~169 tok/s |
| AMD Radeon RX 7600 | 8 GB | gpu | Runs | ~64 tok/s |
| Intel Arc A750 | 8 GB | gpu | Runs | ~114 tok/s |
| NVIDIA GeForce RTX 3060 Ti | 8 GB | gpu | Runs | ~100 tok/s |
| NVIDIA GeForce RTX 3070 | 8 GB | gpu | Runs | ~100 tok/s |
| NVIDIA GeForce RTX 4060 Ti 8GB | 8 GB | gpu | Runs | ~64 tok/s |
| NVIDIA GeForce RTX 4060 | 8 GB | gpu | Runs | ~60 tok/s |
| MacBook Air M1 8GB | 8 GB | mac | Runs | ~15 tok/s |
| MacBook Air M2 8GB | 8 GB | mac | Runs | ~22 tok/s |
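The page does not say how the speed estimates were produced. A common rule of thumb for memory-bound token generation is to divide the device's memory bandwidth by the model's memory footprint, and that rule happens to match several rows of the table. The sketch below applies it under that assumption. The bandwidth figures are published vendor specifications supplied here for illustration, not values from this page, and the function names are hypothetical.

```python
# Memory footprint of the Q4_K_M build, from the quantization table (GB).
Q4_K_M_VRAM_GB = 4.5

# Assumed peak memory bandwidth in GB/s (vendor specs, not from this page).
BANDWIDTH_GB_S = {
    "NVIDIA GeForce RTX 4090": 1008,
    "NVIDIA GeForce RTX 3090": 936,
    "NVIDIA GeForce RTX 4060": 272,
}


def estimate_tok_per_s(bandwidth_gb_s: float, footprint_gb: float) -> float:
    """Upper-bound estimate: each generated token reads the model once."""
    return bandwidth_gb_s / footprint_gb


def seconds_for(tokens: int, tok_per_s: float) -> float:
    """Time to generate `tokens` tokens at a steady rate."""
    return tokens / tok_per_s


if __name__ == "__main__":
    for name, bw in BANDWIDTH_GB_S.items():
        speed = estimate_tok_per_s(bw, Q4_K_M_VRAM_GB)
        print(f"{name}: ~{speed:.0f} tok/s, "
              f"500 tokens in ~{seconds_for(500, speed):.1f} s")
```

With these inputs the estimates round to 224, 208, and 60 tok/s, in line with the corresponding table rows. Real throughput is usually lower, because the rule ignores compute limits, prompt processing, and KV-cache growth.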
Benchmark Scores
MMLU: 70.0