Llama 3.2 Vision 90B
by Meta · llama-3 family
90B parameters
text-generation vision reasoning multilingual summarization creative-writing
Llama 3.2 Vision 90B is the largest model in Meta's Llama 3.2 vision lineup, combining text generation with image understanding. It targets visual reasoning, document analysis, chart understanding, and image captioning. With 90 billion parameters and a 128K-token context window, it offers stronger visual comprehension and reasoning than the smaller 11B variant.
Quick Start with Ollama
ollama run llama3.2-vision:90b

| Field | Value |
|---|---|
| Creator | Meta |
| Parameters | 90B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Sep 25, 2024 |
| License | Llama 3.2 Community License |
| Ollama | llama3.2-vision:90b |
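
Beyond the interactive `ollama run` command, the same tag can be called from a script. The following is a minimal sketch, assuming an Ollama server is already running locally on its default port (11434) and that the model has been pulled. The helper names `build_request` and `describe_image` are illustrative and not part of any library. The request shape follows Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` list.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Tag taken from the spec table on this page.
MODEL_TAG = "llama3.2-vision:90b"


def build_request(prompt: str, image_bytes: bytes) -> dict:
    """Build a non-streaming generate request carrying one base64-encoded image."""
    return {
        "model": MODEL_TAG,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str, prompt: str = "Describe this image.") -> str:
    """Send the image at `path` to the local Ollama server and return its reply text."""
    with open(path, "rb") as f:
        payload = build_request(prompt, f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Supplying `data` makes urllib issue a POST request.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `describe_image("chart.png", "Summarize this chart.")` would return the model's answer as a string; the file name here is a placeholder.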
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 45 GB | 50 GB | | 90b-q4_K_M |
| Q8_0 | 90 GB | 96 GB | | 90b-q8_0 |
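
The file sizes in the table line up with a simple rule of thumb: parameters times bits per weight, divided by 8. The sketch below reproduces that arithmetic, assuming nominal widths of 4 and 8 bits for Q4_K_M and Q8_0. Real quantized files mix block formats and carry metadata, so actual sizes may differ. The function names are illustrative.

```python
PARAMS_BILLION = 90  # from the spec table

# Nominal bits per weight (assumption) and VRAM requirement (from the table).
QUANTS = {
    "Q4_K_M": {"bits": 4, "vram_gb": 50},
    "Q8_0": {"bits": 8, "vram_gb": 96},
}


def nominal_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: billions of params * bits / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


def headroom_gb(quant: str) -> float:
    """VRAM listed in the table minus the nominal weight size."""
    q = QUANTS[quant]
    return q["vram_gb"] - nominal_size_gb(PARAMS_BILLION, q["bits"])


for name in QUANTS:
    size = nominal_size_gb(PARAMS_BILLION, QUANTS[name]["bits"])
    print(f"{name}: ~{size:.0f} GB weights, ~{headroom_gb(name):.0f} GB headroom")
```

The gap between file size and VRAM required (5 to 6 GB in the table) is the room left for things like the KV cache and activations; longer contexts would need more.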
Compatible Hardware
Q4_K_M requires 50 GB VRAM
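
As a rough way to check a given machine against these requirements, the sketch below sums the VRAM of the available GPUs and compares it with the figure from the table. It assumes the runtime can split the model across GPUs and ignores per-device overhead, so a borderline pass may still fail in practice. The GPU sizes in the example are illustrative, not a list of tested hardware.

```python
# VRAM requirements in GB, taken from the quantization table.
VRAM_REQUIRED_GB = {"Q4_K_M": 50, "Q8_0": 96}


def fits(quant: str, gpu_vram_gb: list[float]) -> bool:
    """Return True if the combined VRAM of all GPUs meets the listed requirement."""
    return sum(gpu_vram_gb) >= VRAM_REQUIRED_GB[quant]


# Illustrative configurations (hypothetical machines).
print(fits("Q4_K_M", [24, 24]))  # 48 GB total, below the 50 GB requirement
print(fits("Q4_K_M", [80]))      # a single 80 GB device
print(fits("Q8_0", [48, 48]))    # 96 GB total, exactly at the requirement
```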
Benchmark Scores
MMLU: 86.0