Compressor is adaptive model routing for OpenClaw. It classifies workload in real time and automatically routes simple requests to cost-efficient models. Same agents. Same quality. 40–65% less spend.
Every request in your fleet hits the same model. Status checks, structured lookups, simple drafts — all routed to Opus because that's what's configured. The model doesn't know it's overpowered for the job. It just runs the meter.
The math is brutal. Opus costs 40–67% more per token than Sonnet. And when you look at real workloads, 60%+ of requests are structured, repetitive, or straightforward. They'd get the same result on a cheaper model.
The problem isn't that you picked the wrong model. It's that one-size-fits-all routing doesn't work. Every request is different. Manual tiering doesn't scale. So you pay premium on everything and absorb the waste, because the alternative was worse.
Until now.
An EWMA classifier scores every request by complexity — token patterns, instruction structure, reasoning demands. It adapts in approximately 3 requests, not minutes. Workload shifts, routing shifts.
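For a feel of why an EWMA can react within about three requests, here is a minimal sketch. The smoothing factor `alpha = 0.5` and the 0-to-1 complexity scale are assumptions for illustration, not Compressor's documented values, and `EwmaScore` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class EwmaScore:
    """Exponentially weighted moving average of per-request complexity."""

    alpha: float = 0.5   # assumed smoothing factor, not a documented default
    value: float = 0.0   # current smoothed complexity, assumed to lie in [0, 1]
    seen: int = 0        # number of requests folded in so far

    def update(self, complexity: float) -> float:
        """Fold one request's complexity score into the running average."""
        if self.seen == 0:
            self.value = complexity  # seed with the first observation
        else:
            self.value = self.alpha * complexity + (1 - self.alpha) * self.value
        self.seen += 1
        return self.value


score = EwmaScore()
for c in (0.1, 0.1, 0.1):   # a run of simple requests
    score.update(c)
low = score.value

history = []
for c in (0.9, 0.9, 0.9):   # the workload shifts to complex requests
    history.append(score.update(c))
```

With `alpha = 0.5`, each update halves the remaining gap to the new level, so after three requests the old workload carries only 0.5³ = 12.5% of the weight. A smaller `alpha` would adapt more slowly and smooth out noise more.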
Simple and structured requests go to Sonnet. Complex reasoning, ambiguous instructions, and creative work stay on Opus. Each agent builds its own profile. An agent that only does email triage routes differently than one running multi-step code review.
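Per-agent profiles can be pictured as one smoothed score per agent, compared against a cut-off. This is a sketch only: `THRESHOLD`, `ALPHA`, `profiles`, and `route` are hypothetical names and values, and the real classifier's inputs are not shown here.

```python
THRESHOLD = 0.6  # hypothetical cut-off between tiers
ALPHA = 0.5      # hypothetical smoothing factor

# One smoothed complexity score per agent, created on first use.
profiles: dict[str, float] = {}


def route(agent: str, complexity: float) -> str:
    """Update the agent's profile and return the tier for this request."""
    prev = profiles.get(agent, complexity)  # a new agent starts at its first score
    profiles[agent] = ALPHA * complexity + (1 - ALPHA) * prev
    return "opus" if profiles[agent] >= THRESHOLD else "sonnet"


route("email-triage", 0.2)           # seeds a low profile
route("code-review", 0.9)            # seeds a high profile
triage = route("email-triage", 0.7)  # profile becomes 0.45, below the cut-off
review = route("code-review", 0.7)   # profile becomes 0.80, above the cut-off
```

The same 0.7-complexity request lands on Sonnet for the triage agent and on Opus for the review agent, because each decision is made against that agent's own history.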
Prompt-level detection catches complexity spikes mid-conversation. Pin any agent to premium when you need guaranteed horsepower. And if anything feels off — one command kills routing and reverts to full Opus. No restart. Instant.
openclaw compressor disable # back to Opus, immediately
openclaw compressor enable # re-enable when ready
Enable Compressor in dry-run mode and it logs every routing decision without changing a single request. See exactly what would route where, and what you'd save, before flipping the switch.
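The way pinning, the kill switch, and dry-run mode interact can be sketched as a small precedence rule. The `Router` class and its fields below are hypothetical and only model the behavior described above; they are not Compressor's internals.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    enabled: bool = True    # False models `openclaw compressor disable`
    dry_run: bool = False   # log decisions, but keep sending everything to Opus
    pinned: set[str] = field(default_factory=set)   # agents forced to premium
    log: list[tuple[str, str, str]] = field(default_factory=list)

    def choose(self, agent: str, suggested: str) -> str:
        """Return the model actually used, given the classifier's suggestion."""
        if not self.enabled or self.dry_run or agent in self.pinned:
            used = "opus"       # any override wins over the suggestion
        else:
            used = suggested
        self.log.append((agent, suggested, used))  # record (agent, suggested, used)
        return used


router = Router(dry_run=True)
a = router.choose("email-triage", "sonnet")   # dry run: still runs on opus
router.dry_run = False
b = router.choose("email-triage", "sonnet")   # routing live: runs on sonnet
router.pinned.add("code-review")
c = router.choose("code-review", "sonnet")    # pinned: stays on opus
router.enabled = False
d = router.choose("email-triage", "sonnet")   # kill switch: back to opus
```

Every override falls back to Opus, so the worst case of a misconfiguration in this sketch is paying the old price, never a downgraded response.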
Every request logged: agent name, token count, model used, tier selected, cost. You get the data whether or not you activate routing. The visibility alone tells you where the money goes.
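A dry-run log with those fields is enough to project savings offline. In this sketch the `Decision` record mirrors the logged fields, while `PRICE_PER_MTOK` holds placeholder prices chosen for illustration (not real pricing) and `projected_savings` is a hypothetical helper.

```python
from dataclasses import dataclass

# Placeholder prices per million tokens, for illustration only.
PRICE_PER_MTOK = {"opus": 10.0, "sonnet": 6.0}


@dataclass
class Decision:
    agent: str
    tokens: int
    model_used: str      # what the request actually ran on
    tier_selected: str   # what the classifier would have routed it to


def cost(tokens: int, model: str) -> float:
    """Cost of a request under the placeholder price table."""
    return tokens / 1_000_000 * PRICE_PER_MTOK[model]


def projected_savings(log: list[Decision]) -> float:
    """Fraction of spend saved if every request ran on its selected tier."""
    actual = sum(cost(d.tokens, d.model_used) for d in log)
    routed = sum(cost(d.tokens, d.tier_selected) for d in log)
    return 0.0 if actual == 0 else 1 - routed / actual


sample = [
    Decision("status-bot", 1_000, "opus", "sonnet"),
    Decision("status-bot", 1_000, "opus", "sonnet"),
    Decision("code-review", 1_000, "opus", "opus"),
]
saving = projected_savings(sample)
```

The result depends entirely on the price table and on how many tokens, not just how many requests, fall into the cheaper tier, so plug in your own rates and your own log before drawing conclusions.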
Not every agent is the same. Compressor shows you complexity distribution per agent — which ones are Opus-heavy, which ones are 90% Sonnet candidates. Tune routing where it matters.
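A per-agent breakdown like this can be derived from the same decision log. The sketch below assumes the log can be reduced to `(agent, tier)` pairs; `tier_share` is a hypothetical helper, not part of Compressor.

```python
from collections import Counter, defaultdict


def tier_share(decisions: list[tuple[str, str]]) -> dict[str, dict[str, float]]:
    """Map each agent to the share of its requests selected for each tier."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for agent, tier in decisions:
        counts[agent][tier] += 1
    return {
        agent: {tier: n / sum(c.values()) for tier, n in c.items()}
        for agent, c in counts.items()
    }


decisions = (
    [("email-triage", "sonnet")] * 9 + [("email-triage", "opus")]
    + [("code-review", "opus")] * 3 + [("code-review", "sonnet")]
)
shares = tier_share(decisions)
```

Here `shares["email-triage"]["sonnet"]` comes out at 0.9 and `shares["code-review"]["opus"]` at 0.75, which is the kind of split that tells you which agents to tune first.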
In-memory, in-process. ~240 bytes per agent. Microsecond overhead on each request. No external services, no cron jobs, no state files. Runs inside your existing gateway.
What doesn't change: agent behavior, output quality on complex tasks, your existing config, uptime.
Run it for half a day. The numbers speak for themselves — you'll know by lunch whether Compressor works for your fleet.
Everything you need to know before installing.
Start with dry-run. See the data. Flip the switch when you're ready. One command to enable, one command to revert. Nothing changes until you say so.