Gemma 2 9B

Name: Gemma 2 9B
Author: Google

Gemma Terms of Use

Google · 9B · transformer-decoder

🤗 HuggingFace Ollama Official Paper

2024-06-27 8K context 9B params

Use Cases

chat code reasoning multilingual summary

Quantization Options

Quant	Bits	VRAM	Quality	Status
Q4_K_M	4	6.9 GB	Moderate	—
Q8_0rec	8	11.0 GB	Good	—
F16	16	20.0 GB	Excellent	—

About this model

Gemma 2 9B is Google's mid-range open model that punches well above its weight, outperforming many larger models on key benchmarks. It incorporates knowledge distillation from larger Gemma models, resulting in exceptional quality for its parameter count. The model excels at reasoning, text generation, and multilingual tasks. With its 8K context window and moderate resource requirements, it is an excellent choice for users seeking strong general-purpose performance on consumer-grade hardware.

Benchmarks

71.3

mmlu

Your Hardware

DevicePick…

VRAM—

Bandwidth—

Detecting…

Install

Ollama

ollama run gemma2:9b-instruct-q8_0

llama.cpp / GGUF

Download GGUF from HuggingFace

Specs

Parameters: 9B
Architecture: transformer-decoder
Context: 8K tokens
Min VRAM: 6.9 GB
Recommended: 11.0 GB
Family: Gemma 2
Released: 2024-06-27
License: Gemma Terms of Use