Model Families

15 model families, each available in multiple parameter sizes.

Cogito

Deep Cogito

Deep Cogito's hybrid reasoning models that can dynamically switch between fast direct responses and deep chain-of-thought...

3 variants 8B — 70B

Command R

Cohere

Cohere's Command R is a family of models optimized for retrieval-augmented generation (RAG) and enterprise use cases. Co...

3 variants 35B — 111B

DeepSeek R1

DeepSeek

DeepSeek's R1 family of reasoning-focused open-weight models, trained with reinforcement learning to excel at complex mu...

7 variants 1.5B — 671B

DeepSeek V3

DeepSeek

DeepSeek's V3 series of mixture-of-experts models with 671B total parameters and 37B active per token. Among the most capable...

2 variants 671B
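
The total-vs-active parameter split above is what makes mixture-of-experts inference economical: only a fraction of the weights participate in each forward pass. A rough sketch of that arithmetic, using only the figures quoted in this entry (the FLOPs-per-parameter rule of thumb is a first-order assumption, not a published benchmark):

```python
# MoE cost sketch from the catalog's figures:
# 671B total parameters, 37B active per token.
total_params = 671e9
active_params = 37e9

# Fraction of weights touched on each forward pass.
active_fraction = active_params / total_params
print(f"active fraction per token: {active_fraction:.1%}")  # ~5.5%

# To first order, per-token compute scales with active parameters
# (~2 FLOPs per active parameter for a forward pass), so relative
# to a dense model of the same total size:
reduction = total_params / active_params
print(f"rough per-token compute reduction: {reduction:.0f}x")  # ~18x
```

Memory is the flip side of this trade-off: all 671B parameters must still be stored and served, even though each token only exercises ~5.5% of them.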

Falcon 3

TII

The third generation of TII's Falcon models, offering efficient 7B and 10B parameter variants. Designed for strong gener...

2 variants 7B — 10B

Gemma 2

Google

Google's Gemma 2 is a family of lightweight, open-weight models built from the same research and technology used to create...

3 variants 2B — 27B

Gemma 3

Google

Google's Gemma 3 is a major upgrade over Gemma 2, featuring native multimodal support (text + image input) starting at 4B...

6 variants 1B — 27B

Llama 3

Meta

Meta's Llama 3 is one of the most capable and widely adopted open-weight model families. Spanning from compact 1B parameter...

8 variants 1B — 405B

Llama 4

Meta

Meta's Llama 4 introduces mixture-of-experts architecture and native multimodal support to the Llama family. Scout (109B...

2 variants 109B — 400B

Mistral

Mistral AI

Mistral AI's open-weight model family, known for exceptional efficiency and strong performance relative to model size. I...

10 variants 7B — 141B

Nemotron 3

NVIDIA

NVIDIA's Nemotron 3 family features novel hybrid architectures combining Mamba and Transformer blocks. Optimized for inference...

1 variant 8B

Phi

Microsoft

Microsoft's Phi family of small language models, designed to demonstrate that carefully curated training data can enable...

4 variants 3.8B — 14B

Qwen 2.5

Alibaba

Alibaba's Qwen 2.5 is a comprehensive family of open-weight models spanning from 7B to 72B parameters, with specialized ...

9 variants 7B — 72B

Qwen 3

Alibaba

Alibaba's Qwen 3 is the next generation of the Qwen family, featuring both dense models (0.6B to 32B) and mixture-of-experts...

8 variants 0.6B — 235B

StarCoder

BigCode

BigCode's StarCoder is a family of code-specialized language models developed as part of an open scientific collaboration...

1 variant 15B