Google's Gemma 4 Leaks on Arena: Multimodal Open-Weights Model Appears Imminent
Google DeepMind's next open-weights model is nearly ready. Over the weekend, a model calling itself "Gemma 4" appeared on the LM Arena blind-testing leaderboard under the codename "significant-otter" and confirmed its own identity unprompted.
When users queried it, the model responded: "I am Gemma 4, a large language model developed by Google DeepMind. I am an open weights model designed to process text and images." Screenshots circulated widely on X.
What's been spotted
The expected lineup: a 2B, a 4B, and a 120B MoE variant with roughly 15B active parameters per pass. Multimodal support (text and images) is new for the Gemma line, which previously covered text only.
Additional confirmation came earlier in March when Google's internal automation bot "Copybara-Service" submitted a GitHub pull request titled "Add NPU support for AICore for Gemma4 model", a harder signal than a leaked screenshot.
Gemma 3, released in March 2025, became a go-to for local and fine-tuned deployments. The 120B MoE architecture would offer competitive performance while keeping inference costs manageable, similar to what has made MoE-based models efficient for on-device and cloud deployment.
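The economics behind that claim come down to active versus total parameters: an MoE model stores all experts but routes each token through only a few, so per-token compute scales with the active count. A back-of-the-envelope sketch, using the leaked (unconfirmed) figures of 120B total and ~15B active parameters:

```python
# Rough per-token inference cost: dense model vs. MoE.
# The 120B / ~15B-active figures are the leaked estimates, not official specs.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass cost: ~2 FLOPs per active parameter per token."""
    return 2.0 * active_params

dense_total = 120e9   # hypothetical dense model of the same total size
moe_active = 15e9     # leaked estimate of active parameters per pass

dense_cost = flops_per_token(dense_total)
moe_cost = flops_per_token(moe_active)

print(f"dense 120B: {dense_cost:.1e} FLOPs/token")
print(f"MoE active: {moe_cost:.1e} FLOPs/token")
print(f"compute ratio: {dense_cost / moe_cost:.0f}x")  # ~8x cheaper per token
```

Memory footprint is a separate matter: all 120B parameters must still be resident, which is why MoE models trade cheap inference compute for higher VRAM or offloading requirements.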
Why it matters
Open-weights multimodal models capable of matching closed-source APIs reduce developer lock-in. If the 120B MoE performs at the level the Arena testing implies, it would become a strong alternative for building agents, fine-tuned assistants, and research tooling. The 2B and 4B variants extend that access to edge devices and consumer hardware.
Google has not officially announced a release date, but the combination of Arena testing and internal GitHub activity suggests a launch is close.