Nemotron 3 Nano 8B

Name: Nemotron 3 Nano 8B
Author: NVIDIA

NVIDIA Open Model License

NVIDIA · 8B · transformer-decoder

🤗 HuggingFace Ollama Official

2025-03-18 131K context 8B params

Use Cases

chat code reasoning math tools

Quantization Options

Quant	Bits	VRAM	Quality	Status
Q4_K_Mrec	4	7.5 GB	Good	—
Q8_0	8	11.0 GB	Good	—
F16	16	19.5 GB	Excellent	—

About this model

Nemotron 3 Nano 8B is NVIDIA's compact language model optimized for efficient inference with built-in tool-use capabilities. It delivers strong performance on reasoning, code generation, and mathematical tasks while supporting function calling out of the box. Designed for practical deployment scenarios, Nemotron 3 Nano combines a 131K context window with an 8B parameter count, making it suitable for running locally on consumer GPUs while retaining the ability to interact with external tools and APIs.

Your Hardware

DevicePick…

VRAM—

Bandwidth—

Detecting…

Install

Ollama

ollama run nemotron-3-nano:8b-q4_K_M

llama.cpp / GGUF

Download GGUF from HuggingFace

Specs

Parameters: 8B
Architecture: transformer-decoder
Context: 131K tokens
Min VRAM: 7.5 GB
Recommended: 7.5 GB
Family: Nemotron
Released: 2025-03-18
License: NVIDIA Open Model License