Nemotron 3 Nano 8B
by NVIDIA · nemotron family
8B
parameters
text-generation code-generation reasoning math tool-use
Nemotron 3 Nano 8B is NVIDIA's compact language model optimized for efficient inference with built-in tool-use capabilities. It delivers strong performance on reasoning, code generation, and mathematical tasks while supporting function calling out of the box. Designed for practical deployment scenarios, Nemotron 3 Nano combines a 131K context window with an 8B parameter count, making it suitable for running locally on consumer GPUs while retaining the ability to interact with external tools and APIs.
Quick Start with Ollama
ollama run 8b-q4_K_M | Creator | NVIDIA |
| Parameters | 8B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Mar 18, 2025 |
| License | NVIDIA Open Model License |
| Ollama | nemotron-3-nano:8b |
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M rec | 5 GB | 7.5 GB | | 8b-q4_K_M |
| Q8_0 | 8.5 GB | 11 GB | | 8b-q8_0 |
| F16 | 16 GB | 19.5 GB | | 8b-fp16 |
Compatible Hardware
Q4_K_M requires 7.5 GB VRAM