Cloudflare has a complete agent infrastructure story: persistent state via Durable Objects, long-running tasks via Workflows, sandboxed execution. But until today it was missing one piece: a capable model to run on top of all of it. That gap is now closed.

What Launched

Workers AI now hosts frontier-scale open-source models. The first is Kimi K2.5, built by Moonshot AI. The model brings a 256k context window, multi-turn tool calling, vision inputs, and structured outputs, making it well-suited for the kind of long-horizon, multi-step work that agentic tasks demand.
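As a rough sketch of how such a model might be invoked, the snippet below assembles a chat-completion request with a tool declaration for Workers AI's OpenAI-compatible REST endpoint. The model slug `@cf/moonshotai/kimi-k2.5`, the `lookup_repo` tool, and the placeholder account ID are illustrative assumptions, not confirmed identifiers; the actual request is left unsent.

```python
# Hypothetical Workers AI model slug; the real identifier may differ.
MODEL = "@cf/moonshotai/kimi-k2.5"

def build_chat_request(account_id: str, prompt: str) -> tuple[str, dict]:
    """Assemble (url, payload) for Workers AI's OpenAI-compatible
    chat completions endpoint; sending the request is up to the caller."""
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/v1/chat/completions"
    )
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        # Multi-turn tool calling: declare a tool the model may choose to invoke.
        "tools": [{
            "type": "function",
            "function": {
                "name": "lookup_repo",  # illustrative tool, not a real API
                "description": "Fetch metadata for a GitHub repository",
                "parameters": {
                    "type": "object",
                    "properties": {"repo": {"type": "string"}},
                    "required": ["repo"],
                },
            },
        }],
    }
    return url, payload

url, payload = build_chat_request(
    "YOUR_ACCOUNT_ID", "Review this diff for security issues."
)
```

In a real deployment the same call would typically be made from inside a Worker via the `AI` binding rather than over raw HTTP, and the response would be checked for `tool_calls` before looping back with tool results.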

From Experiment to Production

Cloudflare didn't just announce the integration; they've been running it internally. Engineers use Kimi as their daily coding agent inside OpenCode. The company also deployed it in Bonk, a public automated code review agent active on Cloudflare's GitHub repos.

The starkest data point: a security review agent that processes 7 billion tokens per day using Kimi. The same workload on a mid-tier proprietary model would cost roughly $2.4 million per year. With Kimi on Workers AI, they cut that by 77%, while still catching over 15 confirmed issues in a single codebase.
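The cost claim can be sanity-checked with back-of-the-envelope arithmetic. The per-token price below is implied by the article's own figures (7B tokens/day, ~$2.4M/yr, a 77% reduction), not a quoted rate.

```python
TOKENS_PER_DAY = 7_000_000_000   # security review agent: 7B tokens/day
DAYS_PER_YEAR = 365

annual_tokens = TOKENS_PER_DAY * DAYS_PER_YEAR   # ~2.56 trillion tokens/yr
proprietary_annual_cost = 2_400_000.0            # ~$2.4M/yr, mid-tier proprietary model

# Implied blended price on the proprietary model, per million tokens.
price_per_mtok = proprietary_annual_cost / (annual_tokens / 1_000_000)

# A 77% reduction leaves 23% of the original bill.
kimi_annual_cost = proprietary_annual_cost * (1 - 0.77)

print(f"implied proprietary price: ${price_per_mtok:.2f} per 1M tokens")
print(f"estimated annual cost on Workers AI: ${kimi_annual_cost:,.0f}")
```

The implied blended rate works out to just under $1 per million tokens on the proprietary model, and roughly $550K/yr after the 77% cut: a savings of about $1.85M annually at this volume.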

Why It Matters

As personal and coding agents proliferate, with tools like OpenClaw or Cursor running 24/7 across organizations, inference volume is skyrocketing and cost becomes the primary blocker. Cloudflare's argument is that open-source frontier models running on their global edge can handle that volume at a fraction of the proprietary price.

The bet: as inference costs fall, the platform that runs the full agent stack wins.