Mac mini M4 Pro 24GB
Apple · M4 Pro · 24GB Unified Memory · Can run 80 models
| Specification | Value |
|---|---|
| Manufacturer | Apple |
| Unified Memory | 24 GB |
| Chip | M4 Pro |
| CPU Cores | 14 |
| GPU Cores | 20 |
| Neural Engine Cores | 16 |
| Memory Bandwidth | 273 GB/s |
| MSRP | $1,399 |
| Released | Nov 8, 2024 |
AI Notes
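The Mac mini M4 Pro 24GB is a significant step up for local AI workloads, with 273 GB/s of memory bandwidth. With 24GB of unified memory, it runs models up to about 15B at 8-bit quantization (Q8_0) and 30B-class models at 4-bit quantization (Q4_K_M), although the largest of these fit tightly. A 13B model at full 16-bit precision needs about 26 GB for its weights alone, so it does not fit. Because token generation is largely memory-bandwidth-bound, the M4 Pro's higher bandwidth delivers noticeably faster token generation than the base M4.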
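As a rough check on which model sizes fit in 24 GB, the weight footprint can be approximated as parameter count times bits per weight, divided by 8. The sketch below is illustrative only: the function name `weight_footprint_gb` is hypothetical, the bit widths are nominal (real quantization formats carry some extra per-block overhead), and the figure covers weights alone, not the KV cache or runtime overhead.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: nominal bits per weight; KV cache and runtime overhead are ignored.
    """
    return params_billion * bits_per_weight / 8


UNIFIED_MEMORY_GB = 24  # unified memory of this machine, from the spec table

examples = [
    ("8B at 16-bit", 8, 16),
    ("13B at 16-bit", 13, 16),
    ("13B at 8-bit", 13, 8),
]

for name, params_b, bits in examples:
    gb = weight_footprint_gb(params_b, bits)
    verdict = "fits" if gb < UNIFIED_MEMORY_GB else "does not fit"
    print(f"{name}: ~{gb:.0f} GB of weights, {verdict} in {UNIFIED_MEMORY_GB} GB")
```

In practice the usable budget is smaller than 24 GB, because the operating system and other applications share the same unified memory.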
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~109 tok/s |
| Qwen 3.5 0.8B | 800M | Q4_K_M | 1.5 GB | Runs | ~182 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~137 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~91 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~91 tok/s |
| SmolLM2 1.7B | 1.7B | Q8_0 | 2.7 GB | Runs | ~101 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~68 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~83 tok/s |
| Gemma 4 E2B | 2B | Q4_K_M | 4 GB | Runs | ~68 tok/s |
| Qwen 3.5 2B | 2B | Q4_K_M | 3 GB | Runs | ~91 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~55 tok/s |
| StarCoder2 3B | 3B | Q4_K_M | 3.5 GB | Runs | ~78 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~47 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~61 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~55 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~61 tok/s |
| Gemma 4 E4B | 4B | Q4_K_M | 6 GB | Runs | ~46 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~61 tok/s |
| Qwen 3.5 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~61 tok/s |
| Yi 1.5 6B | 6B | Q4_K_M | 5 GB | Runs | ~55 tok/s |
| Codestral Mamba 7B | 7B | Q4_K_M | 6.9 GB | Runs | ~40 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs | ~30 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~40 tok/s |
| InternLM 2.5 7B | 7B | Q4_K_M | 5.5 GB | Runs | ~50 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs | ~30 tok/s |
| OpenChat 3.5 7B | 7B | Q4_K_M | 6.9 GB | Runs | ~40 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs | ~30 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs | ~30 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs | ~39 tok/s |
| StarCoder2 7B | 7B | Q4_K_M | 5.5 GB | Runs | ~50 tok/s |
| WizardLM 2 7B | 7B | Q4_K_M | 6.9 GB | Runs | ~40 tok/s |
| Aya Expanse 8B | 8B | Q4_K_M | 6.5 GB | Runs | ~42 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~36 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~36 tok/s |
| Dolphin 3 8B | 8B | Q4_K_M | 6 GB | Runs | ~46 tok/s |
| Granite 3.3 8B | 8B | Q8_0 | 10 GB | Runs | ~27 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs | ~27 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~36 tok/s |
| Nous Hermes 2 8B | 8B | Q4_K_M | 6 GB | Runs | ~46 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~36 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | Runs | ~25 tok/s |
| Qwen 3.5 9B | 9B | Q4_K_M | 7.5 GB | Runs | ~36 tok/s |
| Yi 1.5 9B | 9B | Q4_K_M | 6.5 GB | Runs | ~42 tok/s |
| Yi Coder 9B | 9B | Q4_K_M | 8 GB | Runs | ~34 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | Runs | ~32 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | Runs | ~32 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | Runs | ~26 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs | ~29 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~28 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~28 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | Runs | ~25 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~28 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | Runs | ~23 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | Runs | ~23 tok/s |
| StarCoder2 15B | 15B | Q8_0 | 17 GB | Runs | ~16 tok/s |
| InternLM 2.5 20B | 20B | Q4_K_M | 12 GB | Runs | ~23 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | Runs | ~19 tok/s |
| Devstral 24B | 24B | Q4_K_M | 17 GB | Runs | ~16 tok/s |
| Magistral Small 24B | 24B | Q4_K_M | 17 GB | Runs | ~16 tok/s |
| Mistral Small 3.1 24B | 24B | Q4_K_M | 18 GB | Runs | ~15 tok/s |
| Gemma 4 26B | 26B | Q4_K_M | 20 GB | Runs | ~14 tok/s |
| Gemma 2 27B | 27B | Q4_K_M | 17.7 GB | Runs | ~15 tok/s |
| Gemma 3 27B | 27B | Q4_K_M | 20 GB | Runs | ~14 tok/s |
| Qwen 3.5 27B | 27B | Q4_K_M | 19 GB | Runs | ~14 tok/s |
| Nous Hermes 2 34B | 34B | Q4_K_M | 19 GB | Runs | ~14 tok/s |
| Qwen 3.5 35B A3B | 35B | Q4_K_M | 12 GB | Runs | ~23 tok/s |
| Qwen 3 30B-A3B (MoE) | 30B | Q4_K_M | 22 GB | Runs (tight) | ~12 tok/s |
| Gemma 4 31B | 31B | Q4_K_M | 22 GB | Runs (tight) | ~12 tok/s |
| Aya Expanse 32B | 32B | Q4_K_M | 22 GB | Runs (tight) | ~12 tok/s |
| Cogito 32B | 32B | Q4_K_M | 21.5 GB | Runs (tight) | ~13 tok/s |
| DeepSeek R1 32B | 32B | Q4_K_M | 20.7 GB | Runs (tight) | ~13 tok/s |
| Qwen 2.5 32B | 32B | Q4_K_M | 20.7 GB | Runs (tight) | ~13 tok/s |
| QwQ 32B | 32B | Q4_K_M | 21.5 GB | Runs (tight) | ~13 tok/s |
| WizardCoder 33B | 33B | Q4_K_M | 22 GB | Runs (tight) | ~12 tok/s |
| Yi 1.5 34B | 34B | Q4_K_M | 21 GB | Runs (tight) | ~13 tok/s |
| Command R 35B | 35B | Q4_K_M | 22.5 GB | Runs (tight) | ~12 tok/s |
| Qwen 2.5 Coder 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~4 tok/s |
| Qwen 3 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~4 tok/s |
| Dolphin Mixtral 8x7B | 47B | Q4_K_M | 26 GB | CPU Offload | ~3 tok/s |
| Mixtral 8x7B | 47B | Q4_K_M | 29.7 GB | CPU Offload | ~3 tok/s |
25 models are too large for this hardware.
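The Est. Speed values for the rows marked "Runs" and "Runs (tight)" appear consistent with a simple bandwidth-bound estimate: memory bandwidth divided by the memory the model occupies, on the assumption that each generated token requires reading the whole model once. The sketch below reproduces a few rows under that assumption. The function name `estimated_tokens_per_second` is hypothetical, and this is an inferred rule of thumb, not a documented formula for this table.

```python
BANDWIDTH_GB_S = 273  # memory bandwidth from the spec table


def estimated_tokens_per_second(memory_used_gb: float,
                                bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Upper-bound decode speed if every token reads the full model from memory."""
    return bandwidth_gb_s / memory_used_gb


# (model, "VRAM Used" in GB, "Est. Speed" as listed in the table)
rows = [
    ("Llama 3.1 8B", 10.0, 27),
    ("Phi-4 14B", 9.9, 28),
    ("Gemma 3 27B", 20.0, 14),
]

for name, vram_gb, listed in rows:
    est = estimated_tokens_per_second(vram_gb)
    print(f"{name}: estimated ~{est:.0f} tok/s (table lists ~{listed})")
```

The "CPU Offload" rows are listed well below this estimate, which the sketch does not model. Measured speeds will also vary with context length, runtime, and quantization kernel.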