Benchmark: measured token reduction

Run it yourself, no API key required. The benchmark uses token estimation (chars ÷ 4), not Anthropic API calls.

Run the benchmark

$ npx tsx scripts/benchmark.ts

Sample output

[D] Standard session ............ 20% reduction (37k → 29k tokens)

[D] Heavy agentic session ....... 38% reduction (50k → 31k tokens)

[D] Monorepo exploration ........ 74% reduction (40k → 10k tokens)

[D] Long session trim ........... 99% reduction (26k → 211 tokens)

[D] Test-debug loop ............. 97% reduction (46k → 1.4k tokens)
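As a rough illustration of how each percentage relates to its before and after token counts, here is a minimal sketch. The helper name `reductionPercent` and the rounding to whole percent are assumptions made for this example; they are not taken from scripts/benchmark.ts.

```typescript
// Hypothetical helper, not part of scripts/benchmark.ts.
// Returns the percentage reduction from `before` to `after`,
// rounded to the nearest whole percent (rounding is an assumption).
function reductionPercent(before: number, after: number): number {
  if (before <= 0) {
    throw new RangeError("before must be a positive token count");
  }
  return Math.round(((before - after) / before) * 100);
}

// Figures from the sample output above, using the displayed (rounded) values.
console.log(reductionPercent(50_000, 31_000)); // 38
console.log(reductionPercent(26_000, 211));    // 99
```

Because the sample output shows token counts rounded to the nearest thousand, recomputing a percentage from the displayed values can differ by a point or two from the printed figure (for example, 37k → 29k gives 22 here, not 20).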

Methodology

Token counts are estimated as character count ÷ 4, a common rule of thumb for English text, so the figures are approximations rather than exact tokenizer counts. No Anthropic API calls are made during the benchmark. Model: claude-sonnet-4-5.
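The chars ÷ 4 estimate can be sketched in a few lines. This is an illustration under stated assumptions, not the benchmark's actual code: the function name `estimateTokens` is hypothetical, and rounding up to a whole token is a choice made for this example.

```typescript
// Hypothetical helper illustrating the chars ÷ 4 heuristic.
// Assumptions: the result is rounded up, and "chars" means the string's
// length in UTF-16 code units (what String.prototype.length returns),
// which can exceed the visible character count for emoji and some scripts.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens("hello world")); // 3  (11 chars ÷ 4, rounded up)
console.log(estimateTokens(""));            // 0
```

An estimate like this needs no network access or API key, which is why it can run offline. The trade-off is accuracy: real tokenizers vary with language and content, so code-heavy or non-English text may deviate noticeably from the estimate.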

Raw data: /benchmark.json

Install and measure your own usage

Install in 30 seconds →