Skip to content
The Distillery
Docs / Clients / Hermes
SETUP · hermes-agentverified

~/distillery setup → hermes-agent

Configure Hermes to route Claude requests through The Distillery via the OpenAI adapter.

vendorAnthropic (OpenAI compat)
authAPI key
methodenv var
portlocalhost:3080
compoundingagent · highest ROI

Overview

Hermes sends requests to `http://localhost:3080/v1/chat/completions`. The Distillery records token counts, distils context, then forwards to Anthropic. Your Anthropic API key is set in Hermes directly; The Distillery never sees it.

Two separate credentials: your Anthropic API key (sent to Anthropic for billing) and your Distillery login token (stays local, used for cost tracking only). Mixing these up is the most common setup mistake.

Why agents save more

A 10-step agentic session sends context forward on every step. At 40% distillation that's 20k tokens saved per run; multiply by daily runs and savings compound fast.

Multi-step agent loops are the highest-ROI case for The Distillery: context grows linearly, distillation compounds across every step.

Prerequisites

You have The Distillery installed and running locally on port 3080. See the main install guide if you have not done this yet.

$ configure

step 01 / 02

Set the OPENAI_BASE_URL environment variable before starting Hermes:

macOS · Linux
export OPENAI_BASE_URL=http://localhost:3080/v1
hermes start
Windows
set OPENAI_BASE_URL=http://localhost:3080/v1
hermes start

$ verify

step 02 / 02
01Trigger any Hermes agent request. In a separate terminal, run:
thedistillery stats

$ troubleshoot

No response / connection refused
The Distillery is not running. Start it with `thedistillery start` in a separate terminal before launching Hermes.
Authentication errors
Hermes is sending its own API key as x-api-key. The Distillery expects the Anthropic key there, not the Distillery login token. Check Hermes provider configuration.
Model not found
Hermes may be requesting a model string the OpenAI adapter does not recognise. Confirm the model name matches a Claude model available via the Anthropic API (e.g. `claude-sonnet-4-5`).
Session appears fragmented
Hermes reconnects frequently in agent loops. Sessions auto-flush after 30 seconds of inactivity; send an `x-distillery-session-id` header with a fixed UUID v4 to pin all requests to one session.
$ comparing tools

Evaluating other Claude Code cost-reduction approaches alongside Hermes? See how The Distillery compares to Claude Code Router (model-routing approach), Compresr (YC W26), or browse alternatives to Claude Code Router for the full landscape.

Pricing breakdown: thedistillery.com/pricing.