
Your AI fleet runs on Opus.
Most of those requests don't need it.

Compressor is adaptive model routing for OpenClaw. It classifies workload in real time and automatically routes simple requests to cost-efficient models. Same agents. Same quality. 40–65% less spend.

  • 40–65% cost reduction
  • ~3 requests to adapt
  • ~240 bytes per agent
  • 0 external services

You're paying Opus prices for Sonnet work

Every request in your fleet hits the same model. Status checks, structured lookups, simple drafts — all routed to Opus because that's what's configured. The model doesn't know it's overpowered for the job. It just runs the meter.

The math is brutal. Opus costs 40–67% more per token than Sonnet. And when you look at real workloads, 60%+ of requests are structured, repetitive, or straightforward. They'd get the same result on a cheaper model.
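As a rough check on that claim, the arithmetic can be sketched as follows. This is an illustrative calculation, not Compressor's own accounting: the function name and parameters are hypothetical, it assumes routed and unrouted requests carry similar token volumes, and the result depends heavily on the price ratio you plug in.

```python
def blended_savings(routed_fraction: float, cheap_relative_cost: float) -> float:
    """Fractional spend reduction versus sending every request to the premium model.

    routed_fraction:     share of requests sent to the cheaper model (0.0 to 1.0)
    cheap_relative_cost: cheaper model's price as a fraction of the premium price
    """
    if not 0.0 <= routed_fraction <= 1.0:
        raise ValueError("routed_fraction must be between 0 and 1")
    if cheap_relative_cost < 0.0:
        raise ValueError("cheap_relative_cost must be non-negative")
    # Premium cost is normalised to 1.0 per request.
    blended = routed_fraction * cheap_relative_cost + (1.0 - routed_fraction) * 1.0
    return 1.0 - blended
```

With 60% of requests routed and a cheaper model priced at about one fifth of the premium model (the figure quoted in the FAQ on this page), `blended_savings(0.6, 0.2)` comes to roughly 0.48, or about 48% less spend. A smaller price gap shrinks that number quickly, so it is worth running with your own rates.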

The problem isn't that you picked the wrong model. It's that one-size-fits-all routing doesn't work. Every request is different. Manual tiering doesn't scale. So you pay premium on everything and absorb the waste, because the alternative was worse.

Until now.

Three-layer routing in real time

Classify

An EWMA classifier scores every request by complexity — token patterns, instruction structure, reasoning demands. It adapts in approximately 3 requests, not minutes. Workload shifts, routing shifts.
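For readers unfamiliar with the term, an exponentially weighted moving average (EWMA) can be sketched in a few lines. This is a minimal illustration, not Compressor's implementation: the class name, the smoothing factor `alpha = 0.5`, and the idea of a per-request complexity score between 0 and 1 are all assumptions, and the feature extraction (token patterns, instruction structure, reasoning demands) is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class EwmaScore:
    """Running complexity score for one agent (hypothetical sketch)."""

    alpha: float = 0.5  # assumed smoothing factor; higher reacts faster
    value: float = 0.0  # current smoothed score

    def __post_init__(self) -> None:
        if not 0.0 < self.alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")

    def update(self, complexity: float) -> float:
        """Blend the newest per-request complexity into the running score."""
        self.value = self.alpha * complexity + (1.0 - self.alpha) * self.value
        return self.value
```

Under that assumed `alpha`, an agent starting at 0.0 that receives three maximally complex requests in a row moves to 0.5, then 0.75, then 0.875, which is one way a score could settle in roughly three requests.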

Route

Simple and structured requests go to Sonnet. Complex reasoning, ambiguous instructions, and creative work stay on Opus. Each agent builds its own profile. An agent that only does email triage routes differently than one running multi-step code review.

Override

Prompt-level detection catches complexity spikes mid-conversation. Pin any agent to premium when you need guaranteed horsepower. And if anything feels off — one command kills routing and reverts to full Opus. No restart. Instant.

openclaw compressor disable   # back to Opus, immediately
openclaw compressor enable    # re-enable when ready

Built for operators, not dashboards

Zero-risk dry run

Enable Compressor in dry-run mode and it logs every routing decision without changing a single request. See exactly what would route where — and what you'd save — before flipping the switch.

Full spend visibility

Every request logged: agent name, token count, model used, tier selected, cost. You get the data whether or not you activate routing. The visibility alone tells you where the money goes.

Per-agent diagnostics

Not every agent is the same. Compressor shows you complexity distribution per agent — which ones are Opus-heavy, which ones are 90% Sonnet candidates. Tune routing where it matters.
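To make the per-agent view concrete, here is a small sketch of how logged requests could be rolled up by agent. The record fields mirror the ones listed under "Full spend visibility" (agent name, token count, model used, tier selected, cost), but the class, field names, and summary shape are hypothetical and do not describe Compressor's actual log format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RequestLog:
    """One logged request (hypothetical field names)."""

    agent: str
    tokens: int
    model_used: str
    tier_selected: str
    cost: float


def summarize(logs: Iterable[RequestLog]) -> Dict[str, dict]:
    """Per-agent request count, total cost, and number of Sonnet-tier requests."""
    summary: Dict[str, dict] = defaultdict(
        lambda: {"requests": 0, "cost": 0.0, "sonnet_candidates": 0}
    )
    for entry in logs:
        row = summary[entry.agent]
        row["requests"] += 1
        row["cost"] += entry.cost
        if entry.tier_selected == "sonnet":
            row["sonnet_candidates"] += 1
    return dict(summary)
```

In dry-run mode, a record's `model_used` would stay on Opus while `tier_selected` shows where the request would have gone, so the `sonnet_candidates` count gives a quick read on which agents are worth tuning.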

No infrastructure

In-memory, in-process. ~240 bytes per agent. Microsecond overhead on each request. No external services, no cron jobs, no state files. Runs inside your existing gateway.

What changes

Without Compressor

  • Every request → Opus
  • Flat spend regardless of workload
  • No visibility into request complexity
  • Manual model switching per agent
  • All-or-nothing model choice

With Compressor

  • Complex requests → Opus, simple → Sonnet
  • Spend scales with actual complexity
  • Every request classified and logged
  • Automatic, adaptive, per-agent routing
  • Granular control with override and pin

What doesn't change: agent behavior, output quality on complex tasks, your existing config, uptime.

  • 40–65% cost reduction. On real workloads with mixed complexity.
  • < 4 hours to real results. Half-day test gives you hard numbers.
  • 0 quality loss. On complex and creative work.
  • 1 command to control everything. Enable, disable, or check status.

Run it for half a day. The numbers speak for themselves — you'll know by lunch whether Compressor works for your fleet.

Common Questions

Everything you need to know before installing.

What does Compressor do?

Right now, every agent runs on Opus, the smartest and most expensive model. Compressor watches every conversation in real time and asks: “Does this message actually need the expensive model, or can the cheaper one handle it?”

Think of it like a school with two teachers:

  • Opus = The AP teacher. Brilliant. Expensive. You want them for the hard stuff.
  • Sonnet = The solid regular teacher. Handles straightforward work just fine, at about 1/5 the cost.

Compressor routes easy tasks to Sonnet and keeps hard tasks on Opus — automatically, in real time.

How does it decide which model to use?

It tracks a running difficulty score for each agent's recent conversations. Simple messages keep the score low; complex ones push it high.

To prevent flip-flopping, it uses a Schmitt trigger — like a thermostat that doesn't switch at every tiny temperature change. Once an agent gets bumped up to Opus, it stays there until things clearly calm down. And once it drops to Sonnet, it stays until things clearly get hard again.
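The thermostat behaviour described above is hysteresis with two thresholds. The following is a minimal sketch under stated assumptions: the class name, the threshold values 0.7 and 0.3, and the initial tier are hypothetical, chosen only to show the mechanism.

```python
class HysteresisRouter:
    """Two-threshold (Schmitt-trigger style) tier switch for one agent."""

    def __init__(self, upper: float = 0.7, lower: float = 0.3, tier: str = "opus"):
        if lower >= upper:
            raise ValueError("lower threshold must be below upper threshold")
        self.upper = upper  # assumed score at which Sonnet escalates to Opus
        self.lower = lower  # assumed score at which Opus relaxes to Sonnet
        self.tier = tier    # assumed starting tier

    def route(self, score: float) -> str:
        """Return the tier for this score, switching only past a threshold."""
        if self.tier == "sonnet" and score >= self.upper:
            self.tier = "opus"
        elif self.tier == "opus" and score <= self.lower:
            self.tier = "sonnet"
        # Scores between the thresholds leave the current tier unchanged.
        return self.tier
```

Feeding the scores 0.2, 0.5, 0.8, 0.5, 0.4, 0.2, 0.5 through one router yields sonnet, sonnet, opus, opus, opus, sonnet, sonnet. The middling scores of 0.5 and 0.4 never cause a switch on their own, which is what prevents flip-flopping.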

Will my agents respond differently?

No. Responses come back the same way. The only thing that changes is your bill goes down, because easy conversations stop burning Opus tokens.

Can I keep certain agents on Opus?

Yes. Some agents are VIPs, and you can pin them to always run on Opus no matter what. By default, your main agent, Twitter agent, and config agent are pinned. You can customize which agents are pinned and which are adaptive.
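Conceptually, pinning is a check that runs before the adaptive decision is applied. The sketch below is illustrative only: the function name and the agent identifiers `main`, `twitter`, and `config` are hypothetical stand-ins for the default pinned agents mentioned above, not Compressor's real configuration keys.

```python
from typing import FrozenSet

# Hypothetical identifiers for the default pinned agents.
DEFAULT_PINNED: FrozenSet[str] = frozenset({"main", "twitter", "config"})


def select_model(
    agent: str,
    adaptive_tier: str,
    pinned: FrozenSet[str] = DEFAULT_PINNED,
) -> str:
    """Pinned agents always get Opus; everyone else follows the adaptive tier."""
    return "opus" if agent in pinned else adaptive_tier
```

With these defaults, `select_model("main", "sonnet")` returns `"opus"` regardless of the adaptive decision, while an unpinned agent simply receives whatever tier routing selected.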

Does it slow anything down?

Each message gets scored in about 20 microseconds. That's 0.00002 seconds. You'd never notice.

What happens when I first install it?

When you first install Compressor, it starts in dry-run mode by default. That means it watches and logs what it would do, but doesn't actually switch any models.

You run it like that for a day or two, look at the logs, and make sure it's making smart decisions before flipping it to active.

How do I turn it off if something goes wrong?

There are three levels, all fast, all safe:

Level 1: Soft Kill (~10 seconds)
Flip it back to dry-run mode in the config and restart the gateway. Every agent instantly goes back to Opus. No data lost.

Level 2: Disable the Plugin (~30 seconds)
Remove or comment out the compressor entry in your config file and restart. The plugin never loads. Everything runs exactly like it did before installation.

Level 3: Full Removal (~60 seconds)
Remove the config entry, delete the plugin folder, restart. Zero trace, and you are back to your pre-Compressor setup.

The key thing: Compressor doesn't change any of your existing config. It sits in front of the model decision and intercepts it on the fly. Remove it, and the system underneath is untouched — like removing a traffic light from an intersection. The intersection still works.

Can I see what it's doing?

Yes. Run the /compressor command anytime to see per-agent diagnostics: which tier each agent is on, what their difficulty scores look like, and how many requests have been routed.

Your agents don't all need Opus.
Prove it in four hours.

Start with dry-run. See the data. Flip the switch when you're ready. One command to enable, one command to revert. Nothing changes until you say so.
