

Mukesh Kumar
How a single Rust binary is quietly revolutionizing the way developers work with Claude Code, Cursor, and other AI coding agents - by cutting LLM token usage by 60-90% with zero friction.
If you've been using AI coding tools seriously - whether that's Claude Code, Cursor, GitHub Copilot, or any agentic LLM environment - you've probably run into a frustrating wall. You ask the AI to run a test suite, check git status, or scan a directory, and suddenly your context window starts to look like a firehose. Thousands of tokens vanish in seconds. The AI gets confused, loses track of earlier context, or you simply hit rate limits faster than you'd like.
This isn't a niche problem. It's something every developer using AI agents at scale deals with. And for a long time, the only fix was manual: carefully crafting prompts, using --quiet flags, or just accepting the inefficiency.
RTK (Rust Token Killer) is a CLI proxy written in Rust that sits between your terminal commands and your AI coding agent, silently compressing output before the LLM ever sees it. The results are striking: a typical 30-minute Claude Code session that would consume ~118,000 tokens gets trimmed down to ~23,900 - an 80% reduction across the board, with less than 10 milliseconds of overhead per command.

To understand why RTK matters, you need to understand how AI coding agents actually consume context.
When Claude Code (or any similar agent) runs a shell command, the entire output of that command gets injected into the model's context window. Every line. Every boilerplate message. Every progress bar character. Every "Counting objects: 100% (5/5), done." from git push.
This creates a compounding problem. LLMs have finite context windows. The more of that window you fill with low-information noise, including redundant log lines, verbose package manager output, and full test runner boilerplate, the less room you have for the things that actually matter: your code, your architecture decisions, and the specific errors you're trying to fix.
Let's make it concrete. Here's what git push normally outputs:
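A representative example (your object counts and hashes will differ):

```
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 328 bytes | 328.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:user/repo.git
   a1b2c3d..e4f5a6b  main -> main
```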
That's roughly 200 tokens. The only information the AI actually needed? That the push succeeded and which branch it pushed to. RTK compresses that to:
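Something on the order of (illustrative - RTK's exact output format may differ):

```
push ok: main -> origin/main
```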
Ten tokens. Same semantic content. Zero information loss from the AI's perspective.
Now multiply that across every terminal command your AI agent runs over a 30-minute session (git operations, test runs, directory listings, grep results, and build outputs) and you start to see why the token burn adds up so fast, and why shaving 80% off the total can meaningfully change how long your sessions last, how much they cost, and how effectively the AI maintains context.
RTK (Rust Token Killer) is an open-source, single-binary CLI proxy that intercepts the output of common developer commands and filters, compresses, and reformats them before they get fed into your AI coding agent's context.
The project lives at github.com/rtk-ai/rtk, is written almost entirely in Rust (92.3% of the codebase), and ships as a zero-dependency binary. You install it once and it runs everywhere with sub-10ms overhead.
The key insight behind RTK is elegant: AI agents don't need raw terminal output. They need information. A test runner that ran 200 tests and failed 3 of them doesn't need to show you all 200 passing test names. The AI needs to know: which 3 failed, and why. RTK knows this, and acts accordingly.
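The idea can be sketched in a few lines of Python (illustrative only - RTK's real filters are command-aware Rust, and the sample test output below is invented):

```python
# Sketch of the core insight: from a pytest-style run, keep only the
# failures and the summary line; drop every passing-test line.
raw = """\
test_auth.py::test_login PASSED
test_auth.py::test_logout PASSED
test_api.py::test_timeout FAILED
test_db.py::test_rollback FAILED
==== 198 passed, 2 failed in 4.21s ====
"""

# A failure line contains "FAILED"; the summary line contains "passed".
keep = [line for line in raw.splitlines()
        if "FAILED" in line or "passed" in line]
print("\n".join(keep))
```

The AI still learns exactly which tests failed and the overall tally, at a fraction of the token cost.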
RTK supports over 100 commands out of the box, from git, cargo, and pytest to docker, kubectl, aws, and beyond. For each one, it applies intelligent, command-aware filtering rather than naive truncation.
RTK's architecture is beautifully simple in concept, even if the implementation is sophisticated.
When you run a command through RTK (either explicitly, like rtk git status, or automatically via the hook system), RTK executes the underlying command unchanged, captures its output, runs it through the filtering pipeline, and hands the cleaned-up result back to the agent.
The filtering pipeline uses four core strategies, applied intelligently based on the command type. Among them is stripping decorative noise: ASCII art (like cargo's crab and prisma generate's multi-line art), comments, and whitespace padding.

The failure recovery system is particularly clever. When RTK's compressed output isn't enough context - say, for a particularly gnarly build error - RTK saves the full raw output to ~/.local/share/rtk/tee/. The AI can then access that complete log without re-running the command, which would cost more tokens and time.
This gives you the best of both worlds: compressed default output for the AI's context, full output available on demand.
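The tee pattern itself is simple. A minimal Python sketch (the directory matches the docs, but the file naming and function here are assumed, not RTK's actual code):

```python
# Sketch of the "tee" safety net: persist the full raw output to disk,
# return only the compressed view for the LLM's context.
import pathlib
import tempfile

def run_with_tee(raw_output: str, compressed: str, tee_dir: pathlib.Path) -> str:
    tee_dir.mkdir(parents=True, exist_ok=True)
    log = tee_dir / "last-command.log"   # hypothetical name; RTK names files its own way
    log.write_text(raw_output)           # full output preserved on disk
    return compressed                    # only the compressed form reaches the agent

tee_dir = pathlib.Path(tempfile.mkdtemp())
print(run_with_tee("…long raw build log…", "error[E0308]: mismatched types", tee_dir))
```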

The thing that makes RTK genuinely seamless - rather than just a useful but manual tool - is the auto-rewrite hook system.
When you run rtk init -g, RTK installs a PreToolUse hook into your AI coding agent's configuration. For Claude Code, this hooks into the Bash tool execution pipeline. Before Claude Code runs any shell command, the hook transparently rewrites it to the RTK equivalent:
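Conceptually, the rewrite is a prefix transformation. A hedged Python sketch (the command list and matching logic here are illustrative - the real hook ships with RTK):

```python
# Illustrative PreToolUse rewrite: prefix supported commands with "rtk"
# before the agent's Bash tool executes them.
SUPPORTED = ("git", "cargo", "pytest", "docker", "kubectl", "aws")

def rewrite(command: str) -> str:
    parts = command.split()
    head = parts[0] if parts else ""
    return f"rtk {command}" if head in SUPPORTED else command

print(rewrite("git status"))   # -> rtk git status
print(rewrite("echo hello"))   # unsupported command passes through unchanged
```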
Claude never knows the rewrite happened. It just gets cleaner output. Your workflow doesn't change. You don't have to remember to type rtk before every command. It just works.
This also means 100% RTK adoption across all subagents and all conversations from the moment you restart your AI tool - no configuration drift, no forgetting to enable it on a new project.
One important nuance worth knowing: the hook only intercepts Bash tool calls. Claude Code's built-in Read, Grep, and Glob tools bypass the hook. For those, you either use shell equivalents (cat, rg, find) or call RTK commands directly. This is a reasonable tradeoff given how the hook architecture works.
RTK is engineered to be extremely lightweight, offering intelligent command filtering across multiple categories with zero workflow disruption. Getting started is remarkably simple, and RTK integrates seamlessly with modern AI agents using global hooks and project-scoped rules. For the full setup instructions, custom configurations, and contributor details, visit the official RTK GitHub repository.
The project publishes detailed benchmark data for a typical 30-minute Claude Code session on a medium-sized TypeScript/Rust project:
| Operation | Frequency | Without RTK (tokens) | With RTK (tokens) | Savings |
|---|---|---|---|---|
| ls / tree | 10x | 2,000 | 400 | -80% |
| cat / read | 20x | 40,000 | 12,000 | -70% |
| grep / rg | 8x | 16,000 | 3,200 | -80% |
| git status | 10x | 3,000 | 600 | -80% |
| git diff | 5x | 10,000 | 2,500 | -75% |
| git log | 5x | 2,500 | 500 | -80% |
| git add/commit/push | 8x | 1,600 | 120 | -92% |
| cargo test / npm test | 5x | 25,000 | 2,500 | -90% |
| pytest | 4x | 8,000 | 800 | -90% |
| docker ps | 3x | 900 | 180 | -80% |
| Total | | ~118,000 | ~23,900 | -80% |
To translate that into real-world impact: at typical Claude API pricing (claude-sonnet-4 tier), 118,000 input tokens per 30-minute session adds up fast during a full workday of agentic coding. An 80% reduction means roughly 5x more sessions for the same cost - or equivalently, your context window lasts 5x longer before the AI starts losing track of earlier context.
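A quick back-of-envelope check, assuming roughly $3 per million input tokens for the Sonnet tier (verify against current Anthropic pricing) and sixteen 30-minute sessions in a workday:

```python
# Rough daily input-token cost, with and without RTK.
# Assumptions: $3.00 per million input tokens, 16 half-hour sessions/day.
PRICE_PER_MTOK = 3.00
SESSIONS_PER_DAY = 16

without_rtk = 118_000 * SESSIONS_PER_DAY / 1_000_000 * PRICE_PER_MTOK
with_rtk = 23_900 * SESSIONS_PER_DAY / 1_000_000 * PRICE_PER_MTOK

print(f"without RTK: ${without_rtk:.2f}/day, with RTK: ${with_rtk:.2f}/day")
```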

Transparency is the biggest win: once the hook is installed, you literally forget RTK is there. Your workflow doesn't change. You don't modify how you prompt Claude, and you don't add any flags to your commands. The compression just happens.
The failure safety net is thoughtful design: when RTK compresses output from a failing command, it saves the full output locally. The AI always has an escape hatch to get complete information without re-running anything. This prevents the rare case where compression loses important context from an error.
The supported command list is comprehensive: over 100 commands across git, test runners, build tools, AWS, Docker, Kubernetes, and more. Most developer workflows are covered from day one.
Zero dependencies, single binary: this is a significant operational advantage. No Node.js runtime, no Python, no framework dependencies. You copy one binary and it runs. This matters for CI environments, Docker containers, and machines where you don't want to install a toolchain just to run a developer productivity tool.
Built-in analytics: knowing exactly how many tokens you've saved, with daily breakdowns and historical graphs, is genuinely useful for justifying the tool to your team or understanding the ROI of AI coding investments.
The AI coding tools space is moving fast, but a lot of the focus has been on the AI itself (smarter models, better context handling, and improved code generation). RTK reminds us that the interface layer matters too. The plumbing matters.
For developers who take AI coding seriously, RTK is quickly becoming an essential part of the stack, not because it's flashy, but because it quietly makes everything else work better. Your sessions last longer. Your context stays coherent. Your API costs go down. And you didn't have to change how you work to get any of it.
If you're using Claude Code, Cursor, or any AI coding agent and you haven't installed RTK yet, the 2 minutes it takes to set up is probably the highest ROI developer productivity action you can take today.
That's it. Your AI sessions just got roughly 80% lighter on tokens.

RTK only compresses the *output* of commands, meaning it doesn't modify how commands execute. Your files, git history, and code are completely untouched. The tee system also preserves full raw output for failed commands, so you can always access complete information if needed. The project is open-source, so you can audit exactly what it does to any command's output.
