Cloudflare has a complete agent infrastructure story: persistent state via Durable Objects, long-running tasks via Workflows, sandboxed execution. But until today it was missing one piece: a capable model to run on top of all of it. That gap is now closed.

What Launched

Workers AI now hosts frontier-scale open-source models. The first is Kimi K2.5, built by Moonshot AI. The model brings a 256k context window, multi-turn tool calling, vision inputs, and structured outputs, making it well-suited for the kind of long-horizon, multi-step work that agentic tasks demand.
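As a rough sketch of how such a model might be invoked, the snippet below assembles a chat-completion request with a tool declaration for Workers AI's OpenAI-compatible REST endpoint. The model slug `@cf/moonshotai/kimi-k2.5`, the `lookup_repo` tool, and the placeholder account ID are illustrative assumptions, not confirmed identifiers; the actual request is left unsent.

```python
# Hypothetical Workers AI model slug; the real identifier may differ.
MODEL = "@cf/moonshotai/kimi-k2.5"

def build_chat_request(account_id: str, prompt: str) -> tuple[str, dict]:
    """Assemble (url, payload) for Workers AI's OpenAI-compatible
    chat completions endpoint; sending the request is up to the caller."""
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/v1/chat/completions"
    )
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Multi-turn tool calling: declare a tool the model may choose to invoke.
        "tools": [{
            "type": "function",
            "function": {
                "name": "lookup_repo",  # illustrative tool, not a real API
                "description": "Fetch metadata for a GitHub repository",
                "parameters": {
                    "type": "object",
                    "properties": {"repo": {"type": "string"}},
                    "required": ["repo"],
                },
            },
        }],
    }
    return url, payload

url, payload = build_chat_request(
    "YOUR_ACCOUNT_ID", "Review this diff for security issues."
)
```

In a real deployment the same call would typically be made from inside a Worker via the `AI` binding rather than over raw HTTP, and the response would be checked for `tool_calls` before looping back with tool results.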

From Experiment to Production

Cloudflare didn't just announce the integration; they've been running it internally. Engineers use Kimi as their daily coding agent inside OpenCode. The company also deployed it in Bonk, a public automated code review agent active on Cloudflare's GitHub repos.

The starkest data point: a security review agent that processes 7 billion tokens per day using Kimi. The same workload on a mid-tier proprietary model would cost roughly $2.4 million per year. With Kimi on Workers AI, they cut that by 77%, while still catching over 15 confirmed issues in a single codebase.
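The cost claim can be sanity-checked with back-of-the-envelope arithmetic. The per-token price below is implied by the article's own figures (7B tokens/day, ~$2.4M/yr, a 77% reduction), not a quoted rate.

```python
TOKENS_PER_DAY = 7_000_000_000   # security review agent: 7B tokens/day
DAYS_PER_YEAR = 365

annual_tokens = TOKENS_PER_DAY * DAYS_PER_YEAR   # ~2.56 trillion tokens/yr
proprietary_annual_cost = 2_400_000.0            # ~$2.4M/yr, mid-tier proprietary model

# Implied blended price on the proprietary model, per million tokens.
price_per_mtok = proprietary_annual_cost / (annual_tokens / 1_000_000)

# A 77% reduction leaves 23% of the original bill.
kimi_annual_cost = proprietary_annual_cost * (1 - 0.77)

print(f"implied proprietary price: ${price_per_mtok:.2f} per 1M tokens")
print(f"estimated annual cost on Workers AI: ${kimi_annual_cost:,.0f}")
```

The implied blended rate works out to just under $1 per million tokens on the proprietary model, and roughly $550K/yr after the 77% cut: a savings of about $1.85M annually at this volume.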

Why It Matters

As personal and coding agents proliferate, with tools like OpenClaw or Cursor running 24/7 across organizations, inference volume is skyrocketing and cost becomes the primary blocker. Cloudflare's argument is that open-source frontier models running on their global edge can handle that volume at a fraction of the proprietary price.

The bet: as inference costs fall, the platform that runs the full agent stack wins.