DeepSeek V3
by DeepSeek
DeepSeek's V3 series consists of mixture-of-experts (MoE) models with 671B total parameters, of which 37B are active per token. They are among the most capable open-weight models, with strong results across coding, math, reasoning, and multilingual tasks.
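To put the two parameter counts in proportion, the short sketch below computes the share of parameters that is active for each token. It uses only the 671B and 37B figures quoted above; the function and constant names are illustrative and do not come from any DeepSeek tooling.

```python
# Parameter counts as quoted in the description (in billions).
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used per token (hypothetical helper)."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 5.5%"
```

In other words, roughly one parameter in eighteen takes part in processing any given token, which is the usual reason a mixture-of-experts model is cheaper per token than a dense model of the same total size.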
Variants (2)
Smallest: DeepSeek V3-0324 (671B)
Largest: DeepSeek V3 (671B)