Qwen 3.5 35B A3B

Apache 2.0

Alibaba · 35B · transformer-decoder

2026-03-15 · 262K context · 35B params

Use Cases

chat · code · reasoning · multilingual · vision · tools · math

Quantization Options

Quant                 Bits  VRAM     Quality    Status
Q4_K_M (recommended)  4     12.0 GB  Good
Q8_0                  8     20.0 GB  Excellent

About this model

Qwen 3.5 35B A3B is a Mixture-of-Experts (MoE) model with 35B total parameters, of which only 3B are active per token. This sparse architecture is listed as performing comparably to dense 27B models while requiring only 12 GB of VRAM at Q4_K_M, so it fits on a 16 GB card. That makes it an excellent choice for users who want high-quality reasoning on consumer GPUs.
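
As a rough illustration of the numbers above, the following minimal sketch picks a quantization for a given GPU and computes the share of parameters active per token. It uses only the figures listed on this card. The function names and the optional headroom_gb margin are hypothetical, and the check compares listed VRAM against card capacity without accounting for context length or other runtime overhead.

```python
# Figures copied from the Quantization Options table on this card.
# Each entry: (name, bits, listed VRAM in GB, quality label)
QUANTS = [
    ("Q4_K_M", 4, 12.0, "Good"),
    ("Q8_0", 8, 20.0, "Excellent"),
]

TOTAL_PARAMS_B = 35.0   # total parameters in billions (from the card)
ACTIVE_PARAMS_B = 3.0   # parameters active per token in billions (from the card)


def active_fraction(total_b=TOTAL_PARAMS_B, active_b=ACTIVE_PARAMS_B):
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def pick_quant(gpu_vram_gb, headroom_gb=0.0):
    """Return the highest-bit quant whose listed VRAM fits, or None.

    headroom_gb is a hypothetical safety margin, not a value from the card.
    """
    budget = gpu_vram_gb - headroom_gb
    fitting = [q for q in QUANTS if q[2] <= budget]
    if not fitting:
        return None
    # Prefer the quant with the most bits among those that fit.
    return max(fitting, key=lambda q: q[1])[0]


if __name__ == "__main__":
    print(f"active share per token: {active_fraction():.1%}")
    for vram in (8, 16, 24):
        print(f"{vram} GB card -> {pick_quant(vram)}")
```

With the card's figures, a 16 GB GPU selects Q4_K_M and a 24 GB GPU selects Q8_0, while roughly 8.6% of the parameters are active for any one token.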