DeepSeek V3

by DeepSeek

DeepSeek's V3 series consists of mixture-of-experts models with 671B total parameters, of which 37B are active per token. They are among the most capable open-weight models, with strong results across coding, math, reasoning, and multilingual tasks.

Variants (2)

Smallest: DeepSeek V3-0324 (671B)
Largest: DeepSeek V3 (671B)

DeepSeek V3-0324

671B

DeepSeek

Min 362 GB
text-generation code-generation reasoning

DeepSeek V3

671B

DeepSeek

Min 362 GB
text-generation code-generation reasoning popular
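
The figures on this page can be related with a little arithmetic. The sketch below is illustrative only: it computes the share of parameters active per token (37B of 671B) and the weight storage at a given precision. It also back-computes the bits per parameter implied by the listed 362 GB minimum. That last step rests on two assumptions not stated on this page: that 1 GB means 10^9 bytes, and that the minimum covers weights only (no KV cache, activations, or runtime overhead). The function names are hypothetical helpers, not part of any DeepSeek tooling.

```python
# Figures taken from this page.
TOTAL_PARAMS = 671e9    # total parameters
ACTIVE_PARAMS = 37e9    # parameters active per token
LISTED_MIN_GB = 362     # "Min 362 GB" from the variant cards


def active_fraction(total: float, active: float) -> float:
    """Share of parameters used for each token in the mixture-of-experts model."""
    return active / total


def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Storage for the weights alone, assuming 1 GB = 1e9 bytes."""
    return params * bits_per_param / 8 / 1e9


def implied_bits_per_param(params: float, memory_gb: float) -> float:
    """Average bits per parameter if memory_gb held nothing but weights (assumption)."""
    return memory_gb * 1e9 * 8 / params


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active per token: {frac:.1%} of all parameters")
    for bits in (16, 8, 4):
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"Weights at {bits}-bit: {gb:.0f} GB")
    bits = implied_bits_per_param(TOTAL_PARAMS, LISTED_MIN_GB)
    print(f"Implied by the listed minimum: {bits:.2f} bits per parameter")
```

Note that only about one in eighteen parameters is active for a given token, yet all 671B must still be resident in memory, which is why the memory requirement tracks the total count and not the active count.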