Context optimization for Claude Code

Your agent is wasting context.

Claude Code stuffs every tool output into context and never throws anything away. By step 10, your agent is still reading step 1's bash output, and making worse decisions because of it. Optexai ranks and filters context in real time. Same results. Up to 49% fewer tokens than RTK in our SWE-bench trials.

Currently in private beta.
[Chart: context per agent turn, 103 baseline vs 65 gateway SWE-bench trials (benchmark data). Series: Without Optexai, With Optexai.]
Get started

Two commands. Zero config.

No API keys to rotate. No code changes. No workflow changes. From install to optimized in about 10 seconds.

01

Install & auto-configure

terminal
$ npx optexai setup
Proxy installed on 127.0.0.1:8787
CA certificate trusted
Claude Code configured
Ready — no restart required.
  • Detects Claude Code automatically
  • Injects trusted CA + proxy env
  • Idempotent — safe to re-run
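
For readers who want a picture of what "injects trusted CA + proxy env" could amount to, here is a minimal Python sketch. It assumes the conventional `HTTPS_PROXY` variable and Node.js's `NODE_EXTRA_CA_CERTS`; the certificate path and the `proxy_env` helper are hypothetical, and the variables that `optexai setup` actually writes may differ.

```python
import os

PROXY_ADDR = "127.0.0.1:8787"  # address reported by `npx optexai setup` above
CA_CERT_PATH = os.path.expanduser("~/.optexai/ca.pem")  # hypothetical location


def proxy_env(base_env, proxy_addr=PROXY_ADDR, ca_path=CA_CERT_PATH):
    """Return a copy of base_env that routes HTTPS traffic through a local proxy."""
    env = dict(base_env)
    # Conventional variable that many HTTP clients read to find an HTTPS proxy.
    env["HTTPS_PROXY"] = f"http://{proxy_addr}"
    # Node.js reads this variable to trust additional CA certificates.
    env["NODE_EXTRA_CA_CERTS"] = ca_path
    return env


if __name__ == "__main__":
    env = proxy_env(os.environ)
    for key in ("HTTPS_PROXY", "NODE_EXTRA_CA_CERTS"):
        print(f"{key}={env[key]}")
```

Applying `proxy_env` to its own output changes nothing, which is the same property the "safe to re-run" bullet describes.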
02

Use Claude Code normally

terminal
$ claude
⏵ optexai · active -62% input tokens
⏵ ranking context (step 4) …
stripped 12.4k tokens of stale output
Every request — optimized, transparently.
  • Same commands you already run
  • Works with every tool call & subagent
  • Live dashboard for saved tokens

Free for individuals. MIT-licensed CLI.

The problem

Context grows.
Relevance doesn't.

Claude Code appends every tool output and result to the context window with no pruning. By step 10, the model is still reading your step 1 bash output, and every one of those stale tokens counts against your token budget.

01

Every step adds more

Each agent iteration appends tool outputs, results, and reasoning — without any pruning or relevance check.

02

Old context stays forever

Early steps that no longer matter remain in the prompt, consuming tokens that could go toward what's relevant now.

03

Signal gets buried

Long-context benchmarks show accuracy drops as irrelevant history accumulates. You're not just paying for noise — it's actively making your agent worse at the current step.

04

You pay for the noise

Every redundant token costs money and burns rate-limit capacity — regardless of whether it contributes to the output.
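
To put rough numbers on the four points above, here is a small Python sketch of an append-only context. It assumes every step adds the same number of tokens (2,000 is an illustrative figure, not a measurement) and ignores the system prompt and any prompt caching.

```python
def context_growth(steps, tokens_per_step):
    """Return (per-turn context sizes, cumulative input tokens) for an append-only agent."""
    # Turn i re-sends everything from turns 1..i, so its context is i * tokens_per_step.
    per_turn = [tokens_per_step * i for i in range(1, steps + 1)]
    return per_turn, sum(per_turn)


if __name__ == "__main__":
    per_turn, total = context_growth(steps=10, tokens_per_step=2_000)
    print(per_turn[-1])  # context sent on turn 10: 20000
    print(total)         # input tokens across all 10 turns: 110000
```

Per-turn context grows linearly, but the total sent across a run grows with the square of the step count, which is why long runs get expensive quickly.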

How it works

Three steps. Zero config.

01

Intercept

Optexai runs as a lightweight HTTPS proxy alongside Claude Code. No changes to your code or workflows.

02

Rank & Filter

Each request is analyzed in real time. Prior context is ranked by relevance to the current step, and low-signal history is compressed out.

03

Forward

A focused, high-signal prompt reaches Anthropic's API. Context stays bounded across the entire agent run — automatically.
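
The rank-and-filter step can be pictured with a minimal Python sketch. This is not Optexai's algorithm, which the page does not describe: the word-overlap score, the crude token count, and the `filter_history` helper are all illustrative assumptions.

```python
import re


def tokens(text):
    """Crude token estimate: count of word-like chunks (assumption, not a real tokenizer)."""
    return len(re.findall(r"\w+", text))


def relevance(entry, current_step):
    """Fraction of the current step's words that also appear in the entry."""
    step_words = set(re.findall(r"\w+", current_step.lower()))
    entry_words = set(re.findall(r"\w+", entry.lower()))
    return len(step_words & entry_words) / max(len(step_words), 1)


def filter_history(history, current_step, budget, keep_recent=1):
    """Keep the most recent entries, then the most relevant older ones that fit the budget."""
    recent_start = max(len(history) - keep_recent, 0)
    keep = set(range(recent_start, len(history)))
    used = sum(tokens(history[i]) for i in keep)
    # Visit older entries from most to least relevant to the current step.
    older = sorted(
        range(recent_start),
        key=lambda i: relevance(history[i], current_step),
        reverse=True,
    )
    for i in older:
        cost = tokens(history[i])
        if used + cost <= budget:
            keep.add(i)
            used += cost
    # Preserve the original order; dropped entries become a short placeholder.
    return [
        history[i] if i in keep else "[omitted: low relevance]"
        for i in range(len(history))
    ]


if __name__ == "__main__":
    history = [
        "ls output: src tests README.md Cargo.toml",
        "cargo test output: test parser::parses_empty_input FAILED",
        "git log output: 3 commits on main",
    ]
    for entry in filter_history(history, "fix the failing parser test", budget=15):
        print(entry)
```

With a budget of 15, the stale `ls` listing is replaced by a placeholder while the failing test output and the latest entry are kept.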

Real Data

Actual savings
by command type.

Per-command savings measured across 13,000+ real compactions. Every bar is a real tool call — left is what Claude Code sent, right is what Optexai forwarded.

Without Optexai — full output sent
cargo test
go test ./...
pip install
git log --oneline
find . -name *.py
git diff
  • Context bloat grows unbounded
  • No relevance ranking — stale history stays
  • Model accuracy drops as noise accumulates
  • You pay for tokens that hurt, not help
With Optexai — signal only
cargo test           -99%
go test ./...        -97%
pip install          -99%
git log --oneline    -72%
find . -name *.py    -58%
git diff             -21%
  • Context stays bounded and relevant
  • Only high-signal steps are included
  • Stable efficiency across long runs
  • Predictable, controlled token usage
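
As an illustration of why verbose test output can shrink so much, here is a minimal Python sketch that keeps only failure and summary lines. The marker list and the `compact_test_output` helper are hypothetical, the sample output is made up for the example, and none of this is Optexai's actual filter.

```python
def compact_test_output(output, markers=("FAILED", "panicked", "error", "test result:")):
    """Keep only lines that contain one of the markers and note how many were dropped."""
    lines = output.splitlines()
    kept = [line for line in lines if any(m in line for m in markers)]
    dropped = len(lines) - len(kept)
    if dropped:
        kept.append(f"[{dropped} passing/other lines omitted]")
    return "\n".join(kept)


if __name__ == "__main__":
    # Made-up sample in the general shape of `cargo test` output.
    sample = "\n".join([
        "running 3 tests",
        "test lexer::reads_ident ... ok",
        "test lexer::reads_number ... ok",
        "test parser::parses_empty_input ... FAILED",
        "test result: FAILED. 2 passed; 1 failed",
    ])
    print(compact_test_output(sample))
```

On a real suite with thousands of passing tests, nearly every line is an `... ok` line, which would be consistent with the large reductions shown for test commands and the small one for `git diff`, where most lines carry information.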
Benchmark

RTK compresses commands.
Optexai compresses context.

RTK rewrites your CLI commands before they reach Claude — a useful trick, but it only touches Bash tool outputs and misses the conversation history, file reads, and thinking blocks that dominate real agent sessions. Worse: aggressive command filtering throws away context the model actually needed. Our SWE-bench trials show RTK can cut task reward scores by up to 30%. Optexai ranks context by relevance to the current step — so it strips noise, not signal.

Task          RTK v0.35.0                  Baseline                     Optexai                      Optexai vs RTK
astropy       1.98M tokens, reward 0.60    1.84M tokens, reward 0.85    1.06M tokens, reward 0.85    -47%
django        2.71M tokens, reward 0.95    2.21M tokens, reward 0.91    1.38M tokens, reward 0.91    -49%
matplotlib    1.68M tokens, reward 0.95    1.65M tokens, reward 1.00    0.93M tokens, reward 1.00    -45%

Reward = SWE-bench normalized patch correctness score (higher is better). In these trials RTK used more total tokens than the unmodified baseline on all three tasks and scored a lower reward on astropy and matplotlib. Optexai matches baseline reward on every task while using the fewest tokens. ~20 trials per task on SWE-bench Verified.
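
The percentage column can be checked from the token totals in the table. The Python sketch below uses only the rounded figures shown on this page; the `pct_change` helper is an illustrative name.

```python
# Token totals in millions, copied from the benchmark table.
RESULTS = {
    "astropy": {"rtk": 1.98, "baseline": 1.84, "optexai": 1.06},
    "django": {"rtk": 2.71, "baseline": 2.21, "optexai": 1.38},
    "matplotlib": {"rtk": 1.68, "baseline": 1.65, "optexai": 0.93},
}


def pct_change(new, old):
    """Percentage change from old to new, rounded to the nearest whole percent."""
    return round((new - old) / old * 100)


if __name__ == "__main__":
    for task, t in RESULTS.items():
        vs_rtk = pct_change(t["optexai"], t["rtk"])
        vs_base = pct_change(t["optexai"], t["baseline"])
        print(f"{task}: {vs_rtk}% vs RTK, {vs_base}% vs baseline")
```

From the rounded figures this gives -46%, -49%, and -45% against RTK, and -42%, -38%, and -44% against baseline. The one-point gap on astropy (-46% here, -47% in the table) presumably comes from the published percentage being computed on unrounded token counts.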

384M characters of noise stripped: roughly $288 saved across beta users so far
60% average savings per compacted tool call: ~$60–100/month per heavy Claude Code user
13K compactions processed across 8,860 agent sessions
Who it's for

Built for every
Claude Code workflow.

Individual

Developer

Stay focused during long Claude Code runs. Plug in once and immediately get more from every session — no configuration needed.

  • Plug-and-play, zero config
  • Better efficiency in everyday workflows
  • Free tier available
Teams

Engineering Teams

When 10 engineers each run 3 Claude Code sessions a day, context bloat compounds fast. One runaway subagent can balloon a $0.02 task to $2.00. Optexai caps context growth across the whole team: predictably, automatically, and without touching a single line of code.

  • Predictable token usage at scale
  • Observability into context selection
  • Designed for production workflows
  • Consistent behavior across team runs
  • Centralized dashboard for team-wide token analytics
Trust & Privacy

Your code stays yours.
We just count the signal.

No code stored

Your source code and context content are never stored. Only anonymized per-command-type statistics are retained temporarily for your dashboard.

Stats with a short shelf life

Aggregated usage statistics — which command types waste context, how much was saved — are kept for a limited time to power your dashboard, then discarded.

Zero code changes

Works as a transparent proxy alongside Claude Code. No modifications to your codebase, workflows, or tooling.

Open proxy model

Yes, Optexai intercepts HTTPS traffic via a local proxy; that's how it reads and filters context before it reaches Anthropic. All filtering happens in-process on your machine, so your source code goes nowhere except the Anthropic API requests Claude Code was already making. The CA cert is trusted locally. Full source available for audit.

Send less.
Get more.

Start optimizing your Claude Code context today. Free for individuals.