Nemotron

by NVIDIA

NVIDIA's Nemotron family ranges from efficient edge models to frontier-scale reasoning systems. The lineup includes the compact Nemotron 3 Nano, built on hybrid Mamba-Transformer blocks, and Nemotron Ultra, derived from Llama 3.1 405B via Neural Architecture Search (NAS). All models are optimized for inference efficiency, with strong performance on coding, math, reasoning, and tool-use tasks.

Variants (2)

Smallest: Nemotron 3 Nano 8B (8B)

Largest: Nemotron Ultra 253B (253B)

Nemotron 3 Nano 8B

8B

NVIDIA

Min 7.5 GB

text-generation code-generation reasoning

Nemotron Ultra 253B

253B

NVIDIA

Min 155 GB

text-generation code-generation reasoning
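
The minimum-memory figures listed for each variant can be sanity-checked against the parameter counts with simple arithmetic. The sketch below is illustrative only: it assumes 1 GB = 10^9 bytes and that the whole listed budget holds model weights (no KV cache, activations, or runtime overhead), which the listing does not state. The function names are hypothetical helpers, not part of any NVIDIA tooling.

```python
def implied_bits_per_param(min_memory_gb: float, params_billions: float) -> float:
    """Bits available per parameter if the listed memory held only weights.

    Assumption: 1 GB = 1e9 bytes, and no memory is reserved for anything else.
    """
    total_bits = min_memory_gb * 1e9 * 8
    total_params = params_billions * 1e9
    return total_bits / total_params


def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Memory in GB needed to store the weights alone at a given precision."""
    total_bits = params_billions * 1e9 * bits_per_param
    return total_bits / 8 / 1e9


# (parameters in billions, listed minimum memory in GB), taken from the cards above
variants = {
    "Nemotron 3 Nano 8B": (8, 7.5),
    "Nemotron Ultra 253B": (253, 155),
}

if __name__ == "__main__":
    for name, (params_b, min_gb) in variants.items():
        bits = implied_bits_per_param(min_gb, params_b)
        fp16_gb = weight_memory_gb(params_b, 16)
        print(f"{name}: ~{bits:.1f} bits/param implied; 16-bit weights alone need {fp16_gb:.0f} GB")
```

Under these assumptions the listed minimums work out to 7.5 bits per parameter for the 8B variant and roughly 4.9 bits per parameter for the 253B variant. Both are well below the 16 bits per parameter of half-precision weights, so the minimums presumably assume quantized weights; the listing itself does not say which format.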