Benchmark: measured token reduction
Run it yourself, no API key required. The benchmark uses token estimation (chars ÷ 4), not Anthropic API calls.
Run the benchmark
$ npx tsx scripts/benchmark.ts
Sample output
[D] Standard session ............ 20% reduction (37k → 29k tokens)
[D] Heavy agentic session ....... 38% reduction (50k → 31k tokens)
[D] Monorepo exploration ........ 74% reduction (40k → 10k tokens)
[D] Long session trim ........... 99% reduction (26k → 211 tokens)
[D] Test-debug loop ............. 97% reduction (46k → 1.4k tokens)
Methodology
Token counts use chars ÷ 4 estimation, the same method Anthropic uses for pricing previews. No Anthropic API calls are made during the benchmark. Model: claude-sonnet-4-5.
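As a rough illustration of the chars ÷ 4 estimate, the sketch below shows how a token count and a reduction percentage could be derived from two strings. The helper names `estimateTokens` and `reductionPercent` are hypothetical, and the rounding choices (round up for tokens, nearest integer for the percentage) are assumptions. The actual scripts/benchmark.ts may be implemented differently.

```typescript
// Minimal sketch of chars ÷ 4 token estimation.
// Function names and rounding behaviour are assumptions for illustration,
// not taken from scripts/benchmark.ts.

/** Estimate tokens as string length divided by 4, rounded up. */
function estimateTokens(text: string): number {
  // text.length counts UTF-16 code units, which equals the character
  // count for plain ASCII input.
  return Math.ceil(text.length / 4);
}

/** Percentage reduction from `before` to `after`, rounded to the nearest integer. */
function reductionPercent(before: number, after: number): number {
  if (before <= 0) return 0; // avoid division by zero on empty input
  return Math.round(((before - after) / before) * 100);
}

// Synthetic example: a 4,000-character context trimmed to 1,000 characters.
const original = "x".repeat(4000);
const trimmed = "x".repeat(1000);

const before = estimateTokens(original); // 1000
const after = estimateTokens(trimmed); // 250

console.log(`${reductionPercent(before, after)}% reduction (${before} → ${after} tokens)`);
// prints "75% reduction (1000 → 250 tokens)"
```

Because the estimate depends only on string length, it runs offline and is deterministic, but it will not match a real tokenizer exactly.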
Raw data: /benchmark.json
Install and measure your own usage
Install in 30 seconds →