Sakana Fugu: New Model from Sakana AI

Zuri O'Brien — Mon, 22 Jun 2026 12:25:51 +0000

Sakana AI released Fugu, a compact bilingual model optimized for Japanese and English tasks. The project first gained traction on Hacker News with 142 points and 83 comments.

Model: Sakana Fugu | Parameters: 7B | Speed: 38 tokens/s | License: Apache 2.0

What It Is and How It Works

Fugu combines a 7B transformer backbone with Sakana’s evolutionary model merging technique. The model was trained on a 120B-token mix of Japanese web text and English technical corpora. It handles both text generation and lightweight instruction following without separate fine-tunes.

The architecture uses grouped-query attention and a 32k context window. No external retrieval is required for standard prompts.
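To see why grouped-query attention matters at a 32k context, here is a rough back-of-the-envelope sketch of key/value cache size. The layer count, head counts, and head dimension are assumptions typical of 7B-class models; the article does not publish Fugu's actual shape, so treat the numbers as illustrative only.

```python
# Rough KV-cache size for a decoder-only transformer.
# All shape values are hypothetical: Fugu's real layer and head counts
# are not given in the article.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence (fp16 by default)."""
    # The leading factor of 2 covers the separate key and value tensors.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

SEQ_LEN = 32 * 1024  # the 32k context window mentioned above

# Assumed baseline: 32 layers, 32 query heads, each with its own KV head.
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=SEQ_LEN)

# Grouped-query attention: 4 query heads share one KV head (assumed), so 8 KV heads.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=SEQ_LEN)

print(f"Full multi-head cache: {mha / 2**30:.1f} GiB")
print(f"Grouped-query cache:   {gqa / 2**30:.1f} GiB")
```

Under these assumed shapes the cache shrinks in proportion to the number of KV heads. Any such cache comes on top of the weight footprint, so long prompts need more memory than the weights alone.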

Benchmarks and Performance Numbers

Early testers report 38 tokens per second on an RTX 4090 at 4-bit quantization, with a VRAM footprint of 4.1 GB. On Japanese-to-English translation, Fugu scores 41.2 BLEU on the JESC test set.
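The reported footprint and throughput can be sanity-checked with simple arithmetic. The sketch below assumes "7B" means exactly 7e9 parameters and that GB is decimal (1e9 bytes); both are simplifications, and the split between weights and overhead is an estimate rather than a measured value.

```python
# Sanity check of the reported 4-bit footprint and throughput.
# Assumptions: exactly 7e9 parameters, decimal gigabytes.

def quantized_weight_gb(n_params, bits_per_weight):
    """Size of the raw quantized weights in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9

REPORTED_VRAM_GB = 4.1      # from the article
REPORTED_TOKENS_PER_S = 38  # from the article

weights_gb = quantized_weight_gb(7e9, 4)
# Whatever is left would cover quantization scales, buffers, and runtime state.
overhead_gb = REPORTED_VRAM_GB - weights_gb

# Time to generate a 500-token reply at the reported speed.
seconds_for_500 = 500 / REPORTED_TOKENS_PER_S

print(f"Raw 4-bit weights: {weights_gb:.1f} GB")
print(f"Implied overhead:  {overhead_gb:.1f} GB")
print(f"500 tokens take about {seconds_for_500:.1f} s")
```

The raw weights account for most of the reported figure, which makes the 4.1 GB number plausible for a 7B model at 4 bits.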

Metric	Sakana Fugu	Llama-3-8B	Qwen2-7B
Tokens/s (RTX 4090)	38	31	34
Japanese-to-English BLEU (JESC)	41.2	28.7	37.9
VRAM (4-bit)	4.1 GB	5.2 GB	4.8 GB
License	Apache 2.0	Llama 3 Community License	Apache 2.0
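The gaps in the table are easier to compare as relative differences. This short sketch uses only the figures from the table above; the dictionary layout and field names are made up for illustration.

```python
# Relative differences between Fugu and the two comparison models,
# computed from the table figures only.

results = {
    "Sakana Fugu": {"tokens_per_s": 38, "bleu": 41.2, "vram_gb": 4.1},
    "Llama-3-8B": {"tokens_per_s": 31, "bleu": 28.7, "vram_gb": 5.2},
    "Qwen2-7B": {"tokens_per_s": 34, "bleu": 37.9, "vram_gb": 4.8},
}

def percent_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100

fugu = results["Sakana Fugu"]
deltas = {}
for name, row in results.items():
    if name == "Sakana Fugu":
        continue
    deltas[name] = {
        "speed_pct": percent_change(fugu["tokens_per_s"], row["tokens_per_s"]),
        "bleu_gap": fugu["bleu"] - row["bleu"],  # absolute BLEU points
        "vram_pct": percent_change(fugu["vram_gb"], row["vram_gb"]),
    }

for name, d in deltas.items():
    print(f"vs {name}: {d['speed_pct']:+.1f}% speed, "
          f"{d['bleu_gap']:+.1f} BLEU, {d['vram_pct']:+.1f}% VRAM")
```

BLEU is reported as an absolute point gap rather than a percentage, since BLEU differences are usually quoted in points.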

How to Try It

Download the weights from the official repository and run them with llama.cpp or vLLM. The repository also ships its own chat module:

git clone https://github.com/sakana-ai/fugu
cd fugu && pip install -r requirements.txt
python -m fugu.chat --model fugu-7b-q4

An Ollama tag is also available: ollama run sakana/fugu.
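Once the Ollama tag is pulled, the model can also be called from a script through Ollama's local REST API. The sketch below is a minimal example, assuming an Ollama server is running on its default port (11434) and that the sakana/fugu tag named above is installed; the helper names build_request and generate are made up for illustration.

```python
# Minimal client for Ollama's local /api/generate endpoint, stdlib only.
# Assumes a local Ollama server with the "sakana/fugu" tag already pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default address

def build_request(prompt, model="sakana/fugu"):
    """Build the POST request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, model="sakana/fugu", timeout=120):
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With streaming disabled, the full completion is in the "response" field.
    return body["response"]
```

With the server running, a call such as generate("Translate into English: 吾輩は猫である。") should return the model's reply as a string. Disabling streaming keeps the example short; interactive tools would usually stream tokens instead.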

Pros and Cons

Pros:
Strong Japanese performance at small size
Apache 2.0 license allows commercial use
Runs on consumer GPUs with low VRAM

Cons:
Limited English reasoning compared with larger models
No built-in tool-calling or agent scaffolding yet

Alternatives and Comparisons

Llama-3-8B and Qwen2-7B remain the main local alternatives. Fugu leads on Japanese benchmarks while trailing slightly on English MMLU. Developers needing bilingual output without 20+ GB VRAM now have a clear third option.

Who Should Use This

Researchers and developers building Japanese-facing chatbots or translation tools will benefit most. Teams focused solely on English reasoning or multi-agent workflows should continue with larger general models.

Bottom Line / Verdict

Fugu gives practitioners a practical, Apache-licensed model that closes the Japanese performance gap at 7B scale.

Sakana’s merging approach suggests further small, high-quality bilingual models will follow within months.

PromptZone - Leading AI Community for Prompt Engineering and AI Enthusiasts: Zuri O'Brien