Dolphin Mixtral 8x7B

Name: Dolphin Mixtral 8x7B
Author: Cognitive Computations

by Cognitive Computations · dolphin family

47B parameters

text-generation code-generation reasoning multilingual creative-writing

Dolphin Mixtral 8x7B is an uncensored, instruction-tuned mixture-of-experts (MoE) model built on Mistral AI's Mixtral architecture. It has 47 billion total parameters, of which roughly 13 billion are active for any given token, and it performs well on text generation, coding, and multilingual tasks without alignment restrictions. Pairing the Mixtral MoE architecture with Dolphin's uncensored fine-tuning has made it a popular choice for researchers and developers who want output without built-in content filtering. It needs substantial VRAM (26 GB at Q4_K_M) and in return offers strong reasoning and creative-writing capabilities.
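The split between total and active parameters can be approximated from the layer shapes. The sketch below is a rough count, not an official figure: every dimension in it is an assumption taken from the publicly documented Mixtral 8x7B configuration (8 experts per layer, 2 routed per token), and none of these numbers appear on this page. Dolphin's fine-tune may differ slightly, for example in vocabulary size.

```python
# Rough parameter count for a Mixtral-8x7B-style sparse MoE transformer.
# All dimensions are assumptions from the public Mixtral 8x7B configuration.
HIDDEN = 4096          # model (embedding) width
FFN = 14336            # feed-forward width inside each expert
LAYERS = 32
HEADS = 32
KV_HEADS = 8           # grouped-query attention uses fewer key/value heads
HEAD_DIM = HIDDEN // HEADS
VOCAB = 32000
EXPERTS = 8            # experts stored per layer
TOP_K = 2              # experts the router selects per token


def count_params(experts_used: int) -> int:
    """Count weights when `experts_used` experts per layer are included."""
    attention = (
        HIDDEN * HEADS * HEAD_DIM            # query projection
        + 2 * HIDDEN * KV_HEADS * HEAD_DIM   # key and value projections
        + HEADS * HEAD_DIM * HIDDEN          # output projection
    )
    expert = 3 * HIDDEN * FFN                # gate, up, and down projections
    router = HIDDEN * EXPERTS                # the router always runs in full
    norms = 2 * HIDDEN                       # two RMSNorm weight vectors
    per_layer = attention + experts_used * expert + router + norms
    embeddings = 2 * VOCAB * HIDDEN          # input embeddings + output head
    return LAYERS * per_layer + embeddings + HIDDEN  # + final norm


total_params = count_params(EXPERTS)   # every expert is stored in memory
active_params = count_params(TOP_K)    # only the routed experts do work
print(f"total:  {total_params / 1e9:.1f}B")            # total:  46.7B
print(f"active: {active_params / 1e9:.1f}B per token")  # active: 12.9B per token
```

The total drives memory use, since every expert must be loaded, while the active count drives the compute spent on each token.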

Quick Start with Ollama


ollama run dolphin-mixtral:8x7b-q4_K_M
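Once the model is pulled, it can also be called programmatically. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model is available under the name `dolphin-mixtral:8x7b` listed on this page. The helper names `build_request` and `generate` are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumptions: default local Ollama endpoint, model already pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "dolphin-mixtral:8x7b"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request without sending it."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(generate("Explain mixture-of-experts in two sentences."))` should print a single completion. Setting `"stream": False` asks for one JSON object rather than a stream of partial responses.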

Creator	Cognitive Computations
Parameters	47B
Architecture	mixture-of-experts
Context	32K tokens
Released	Jan 20, 2024
License	Apache 2.0
Ollama	dolphin-mixtral:8x7b

Quantization Options

Format	File Size	VRAM Required	Ollama Tag
Q4_K_M (recommended)	24 GB	26 GB	`8x7b-q4_K_M`
Q8_0	47 GB	49 GB	`8x7b-q8_0`
F16	90 GB	94 GB	`8x7b-fp16`
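Choosing a format comes down to comparing available VRAM against the "VRAM Required" column. As a small illustration, the sketch below picks the highest-precision format that fits a given budget, using only the figures from the table above. The function name `best_quant` is hypothetical, and the sketch assumes the whole model must fit in VRAM (no CPU offload).

```python
# (format, file size in GB, VRAM required in GB), copied from the table above.
QUANTS = [
    ("Q4_K_M", 24, 26),
    ("Q8_0", 47, 49),
    ("F16", 90, 94),
]


def best_quant(vram_gb: float):
    """Return the highest-precision format that fits in VRAM, or None."""
    fitting = [q for q in QUANTS if q[2] <= vram_gb]
    # A larger VRAM requirement corresponds to higher precision in this table.
    return max(fitting, key=lambda q: q[2])[0] if fitting else None


for vram in (24, 32, 48, 96):
    print(f"{vram} GB -> {best_quant(vram)}")
# 24 GB -> None
# 32 GB -> Q4_K_M
# 48 GB -> Q4_K_M
# 96 GB -> F16
```

Note that a 48 GB card falls just short of the 49 GB needed for Q8_0, so it stays on Q4_K_M under this rule.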

Compatible Hardware

Q4_K_M requires 26 GB VRAM

Hardware	VRAM	Type	Fit	Est. Speed
Mac Studio M4 Ultra 512GB	512 GB	mac	Runs	~32 tok/s
Mac Pro M2 Ultra 192GB	192 GB	mac	Runs	~31 tok/s
Mac Studio M4 Ultra 192GB	192 GB	mac	Runs	~32 tok/s
Mac Studio M4 Max 128GB	128 GB	mac	Runs	~21 tok/s
MacBook Pro M4 Max 128GB	128 GB	mac	Runs	~21 tok/s
MacBook Pro M5 Max 128GB	128 GB	mac	Runs	~21 tok/s
NVIDIA RTX PRO 6000 Blackwell	96 GB	gpu	Runs	~74 tok/s
MacBook Pro M3 Max 96GB	96 GB	mac	Runs	~15 tok/s
Mac mini M4 Pro 64GB	64 GB	mac	Runs	~11 tok/s
Mac Studio M4 Max 64GB	64 GB	mac	Runs	~21 tok/s
MacBook Pro M4 Max 64GB	64 GB	mac	Runs	~21 tok/s
MacBook Pro M5 Max 64GB	64 GB	mac	Runs	~21 tok/s
NVIDIA RTX 6000 Ada Generation	48 GB	gpu	Runs	~37 tok/s
NVIDIA RTX A6000	48 GB	gpu	Runs	~30 tok/s
NVIDIA RTX PRO 5000 Blackwell	48 GB	gpu	Runs	~37 tok/s
Mac mini M4 Pro 48GB	48 GB	mac	Runs	~11 tok/s
MacBook Pro M3 Max 48GB	48 GB	mac	Runs	~15 tok/s
MacBook Pro M4 Max 48GB	48 GB	mac	Runs	~21 tok/s
MacBook Pro M4 Pro 48GB	48 GB	mac	Runs	~11 tok/s
MacBook Pro M5 Max 48GB	48 GB	mac	Runs	~16 tok/s
MacBook Pro M5 Pro 48GB	48 GB	mac	Runs	~11 tok/s
Mac Studio M4 Max 36GB	36 GB	mac	Runs	~21 tok/s
MacBook Pro M3 Pro 36GB	36 GB	mac	Runs	~6 tok/s
MacBook Pro M5 Max 36GB	36 GB	mac	Runs	~16 tok/s
NVIDIA RTX 5000 Ada Generation	32 GB	gpu	Runs	~28 tok/s
NVIDIA GeForce RTX 5090	32 GB	gpu	Runs	~69 tok/s
iMac M4 32GB	32 GB	mac	Runs	~5 tok/s
Mac mini M4 32GB	32 GB	mac	Runs	~5 tok/s
MacBook Air M5 32GB	32 GB	mac	Runs	~5 tok/s
MacBook Air M4 32GB	32 GB	mac	Runs	~5 tok/s
MacBook Pro M5 32GB	32 GB	mac	Runs	~5 tok/s
AMD Radeon RX 7900 XTX	24 GB	gpu	CPU Offload	~11 tok/s
NVIDIA GeForce RTX 3090 Ti	24 GB	gpu	CPU Offload	~12 tok/s
NVIDIA GeForce RTX 3090	24 GB	gpu	CPU Offload	~11 tok/s
NVIDIA GeForce RTX 4090	24 GB	gpu	CPU Offload	~12 tok/s
NVIDIA RTX A5000	24 GB	gpu	CPU Offload	~9 tok/s
iMac M3 24GB	24 GB	mac	CPU Offload	~1 tok/s
Mac mini M2 24GB	24 GB	mac	CPU Offload	~1 tok/s
Mac mini M4 Pro 24GB	24 GB	mac	CPU Offload	~3 tok/s
MacBook Air M2 24GB	24 GB	mac	CPU Offload	~1 tok/s
MacBook Air M4 24GB	24 GB	mac	CPU Offload	~2 tok/s
MacBook Air M5 24GB	24 GB	mac	CPU Offload	~2 tok/s
MacBook Pro M4 Pro 24GB	24 GB	mac	CPU Offload	~3 tok/s
MacBook Pro M5 24GB	24 GB	mac	CPU Offload	~2 tok/s
MacBook Pro M5 Pro 24GB	24 GB	mac	CPU Offload	~3 tok/s
AMD Radeon RX 7900 XT	20 GB	gpu	CPU Offload	~9 tok/s
NVIDIA RTX 4000 Ada Generation	20 GB	gpu	CPU Offload	~4 tok/s
MacBook Pro M3 Pro 18GB	18 GB	mac	CPU Offload	~2 tok/s

59 hardware devices cannot run this model at Q4_K_M.

Benchmark Scores

MMLU: 70.0