Qwen 3.5 4B
by Alibaba · qwen-3.5 family
4B parameters
Tags: text-generation, code-generation, reasoning, multilingual, vision, tool-use
Qwen 3.5 4B is an efficient multimodal model with strong reasoning and tool-use capabilities. It supports thinking and non-thinking modes, so you can trade response speed for reasoning depth. The Q4_K_M quantization requires 4.5 GB of VRAM, so it fits on 8 GB GPUs with about 3.5 GB of headroom, making it a practical daily driver for local AI on modest hardware.
Quick Start with Ollama
ollama run qwen3.5:4b-q4_K_M

| Property | Value |
|---|---|
| Creator | Alibaba |
| Parameters | 4B |
| Architecture | transformer-decoder |
| Context | 256K tokens |
| Released | Mar 2, 2026 |
| License | Apache 2.0 |
| Ollama | qwen3.5:4b |
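
Once the model has been pulled, it can also be called from code rather than the interactive prompt. The sketch below assumes an Ollama server is running on its default local address (`http://localhost:11434`) and posts to Ollama's `/api/generate` endpoint using the `qwen3.5:4b` tag from the table above. The helper names `build_generate_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "qwen3.5:4b"  # Ollama tag taken from the spec table


def build_generate_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=120):
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the server running, `generate("Summarize Apache 2.0 in one sentence.")` should return the model's reply as a string. Building the request separately from sending it keeps the payload easy to inspect without a live server.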
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 3.4 GB | 4.5 GB | | 4b-q4_K_M |
| Q8_0 | 5.5 GB | 6.5 GB | | 4b-q8_0 |
Compatible Hardware
Q4_K_M requires 4.5 GB VRAM
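
The VRAM figures in the quantization table can be turned into a simple fit check. The following is a minimal sketch, not an official sizing tool: `pick_quant` and its `headroom_gb` margin are hypothetical names introduced here, and the 1 GB default margin is an assumption rather than a documented requirement.

```python
# (format, file size in GB, VRAM required in GB, Ollama tag), copied from the
# quantization table and ordered from highest to lowest precision.
QUANT_OPTIONS = [
    ("Q8_0", 5.5, 6.5, "4b-q8_0"),
    ("Q4_K_M", 3.4, 4.5, "4b-q4_K_M"),
]


def pick_quant(vram_gb, headroom_gb=1.0):
    """Return the Ollama tag of the highest-precision format that fits, or None.

    headroom_gb is an assumed safety margin reserved for other GPU usage.
    """
    for _name, _file_size, vram_needed, tag in QUANT_OPTIONS:
        if vram_needed + headroom_gb <= vram_gb:
            return tag
    return None
```

Under these assumptions, `pick_quant(8.0)` returns `"4b-q8_0"`, `pick_quant(6.0)` returns `"4b-q4_K_M"`, and `pick_quant(4.0)` returns `None`. Raising `headroom_gb` steers the choice back toward the recommended Q4_K_M when more free memory is wanted.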