The Distillery

Practical notes on token cost.

Benchmarks and guides from the team building The Distillery. Measured claims, reproducible numbers.

Archive

10 posts

Claude Code vs GitHub Copilot: Total Cost of Ownership

Copilot: $120-228/year flat. Claude Code: $540-4,800/year metered. 12-month costs compared at light, moderate, and heavy usage levels.

5 May · 9 min

DISTILLERY_BYPASS: The One Env Var That Gives You Full Control

DISTILLERY_BYPASS turns off all optimization with no warning. The #1 reason developers see no savings. How to check, unset, and verify.

5 May · 9 min
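The check itself fits in a few lines. A minimal sketch, not The Distillery's own tooling: it assumes the proxy treats any non-empty value of the variable as "bypass on", which may differ from the exact parsing the post describes.

```typescript
// check-bypass.ts: sketch only. Assumes any non-empty DISTILLERY_BYPASS
// value disables optimization; see the post for the proxy's real behavior.
const bypass = process.env.DISTILLERY_BYPASS;

if (bypass) {
  console.warn(`DISTILLERY_BYPASS is set to "${bypass}": optimization is off.`);
  console.warn("Unset it (`unset DISTILLERY_BYPASS`) and restart Claude Code.");
  process.exit(1);
}

console.log("DISTILLERY_BYPASS is not set; distillers are active.");
```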

How The Distillery Reduces Claude Code Bills: A Technical Deep Dive

Local Fastify proxy intercepts Claude Code, runs five distillers, forwards smaller payload to Anthropic. 30-60% cost cut, source on GitHub.

5 May · 11 min
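The shape of that proxy fits on one page. A minimal sketch of the intercept, distill, forward loop; the distill() stub and the port number are stand-ins for illustration, not The Distillery's actual implementation (that's in the GitHub source).

```typescript
// proxy-sketch.ts: the intercept-and-forward pattern the post describes.
import Fastify from "fastify";

const app = Fastify();

// Hypothetical stand-in for the five distillers; a real one would
// deduplicate repeated file reads, drop stale turns, and so on.
function distill(body: unknown): unknown {
  return body;
}

app.post("/v1/messages", async (request, reply) => {
  const smaller = distill(request.body);

  // Forward the reduced payload upstream, relaying the caller's auth headers.
  const upstream = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": String(request.headers["x-api-key"] ?? ""),
      "anthropic-version": String(request.headers["anthropic-version"] ?? ""),
    },
    body: JSON.stringify(smaller),
  });

  reply.code(upstream.status);
  return upstream.json();
});

// Claude Code gets pointed at this local port instead of api.anthropic.com.
// 8484 is an arbitrary choice for the sketch, not The Distillery's default.
await app.listen({ port: 8484 });
```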

How to Benchmark Your Claude Code Token Savings

Measure your Claude Code token savings in under a minute. The benchmark corpus, the two numbers that matter, and how to read the result.

5 May · 9 min
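If the two numbers are the input tokens sent without and with the proxy over the same corpus (an assumption here; the post defines them precisely), the readout is one division:

```typescript
// Assumed definition: savings measured on input tokens over the same
// benchmark corpus, once direct and once through the proxy.
function percentSaved(baselineTokens: number, distilledTokens: number): number {
  return (1 - distilledTokens / baselineTokens) * 100;
}

// 1.2M baseline tokens distilled to 700k is a ~41.7% saving, inside the
// 30-60% range the other posts cite.
console.log(`${percentSaved(1_200_000, 700_000).toFixed(1)}% saved`);
```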

How to Read Your Claude Code Bill: What You're Actually Paying For

80-90% of your Claude Code bill is input tokens, not output. What each line on the Anthropic invoice actually means and where the cost goes.

5 May · 9 min
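To see why input dominates, run the split with Anthropic's published Claude Sonnet rates ($3 in, $15 out per million tokens; check the current price sheet) and a made-up but typical session shape:

```typescript
// bill-sketch.ts: why the input line dwarfs the output line.
const INPUT_PER_MTOK = 3;   // USD per million input tokens (published rate)
const OUTPUT_PER_MTOK = 15; // USD per million output tokens (published rate)

const inputTokens = 2_000_000; // context re-sent across a day's turns
const outputTokens = 60_000;   // the replies you actually read

const inputCost = (inputTokens / 1e6) * INPUT_PER_MTOK;    // $6.00
const outputCost = (outputTokens / 1e6) * OUTPUT_PER_MTOK; // $0.90
const share = inputCost / (inputCost + outputCost);        // ~87% input

console.log({ inputCost, outputCost, inputShare: `${(share * 100).toFixed(0)}%` });
```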

How to Reduce Claude Code Costs for Your Team (Without Changing Anything)

A 5-dev team on Claude Code burns $2,400/month in API tokens. The exact math and how to cut it 30-60% with no workflow change.

5 May · 9 min
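The headline figures reproduce in a few lines. Only the $2,400 bill and the 30-60% range come from the post; the rest is arithmetic.

```typescript
// team-math.ts: the post's numbers, worked out.
const devs = 5;
const monthlyBill = 2_400;           // USD/month, from the post
const perDev = monthlyBill / devs;   // $480 per dev per month
const savedLow = monthlyBill * 0.3;  // $720/month saved at 30%
const savedHigh = monthlyBill * 0.6; // $1,440/month saved at 60%

console.log({ perDev, savedLow, savedHigh });
```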

The Same File Read 4 Times: How Claude Code Compounds Token Costs

Claude re-reads package.json 4x per session. A 3,000-token file becomes 12,000 tokens billed, then compounds further. Why, and how to fix it.

5 May · 9 min
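The compounding step is the part worth internalizing: every duplicate copy sits in the context and is re-billed on each later turn. An illustrative calculation (the turn count is made up, not measured):

```typescript
// compounding-sketch.ts: the arithmetic behind the post's example.
const fileTokens = 3_000; // package.json, per the post
const reads = 4;
const duplicated = fileTokens * reads; // 12,000 tokens now in the context

// Those duplicate copies are re-sent, and re-billed, on every later turn.
const remainingTurns = 20; // illustrative
const billedForDuplicates = duplicated * remainingTurns; // 240,000 tokens

console.log({ duplicated, billedForDuplicates });
```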

What Is Context Engineering? (Why It's the Biggest Driver of Your AI Bill)

80-90% of your Claude Code bill is context you didn't need to send. Context engineering is how you fix that: four levels, only one is hands-off.

5 May · 9 min

Why Claude Code Is So Expensive

Why Claude Code costs $10-50/day: every API turn re-sends the full session context. Where the money goes and how to cut it 30-60% at the proxy.

5 May · 10 min
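The re-send mechanic is easy to simulate. With illustrative numbers (not measurements), a 50-turn session that ends at a 108k-token context bills nearly 3M input tokens in total:

```typescript
// resend-sketch.ts: why billed input grows faster than the conversation.
const startContext = 10_000; // tokens: system prompt, project files, etc.
const growthPerTurn = 2_000; // tokens added per turn: replies, new reads

let totalInputBilled = 0;
for (let turn = 1; turn <= 50; turn++) {
  const contextThisTurn = startContext + growthPerTurn * (turn - 1);
  totalInputBilled += contextThisTurn; // the FULL context is re-sent each turn
}

// Final context: 108,000 tokens. Total input billed: 2,950,000 tokens.
console.log({ totalInputBilled });
```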

Why LLMs Waste Tokens (And What You Can Do About It)

Your LLM has no memory between requests, so every API call re-transmits everything. Why the waste compounds and the three practical fixes.

5 May · 9 min