23 Smart Tips to Save Claude Code Tokens

Using Claude Code on large projects can drive up token costs. A 2025 Stanford study reveals that developers waste thousands of tokens every day, draining budgets as unchecked context piles up. By setting strict limits from the start, teams can reduce costs without compromising code quality. Optimizing token usage and context window size early keeps projects efficient and on track. In this article, we break down the key steps you can take to save Claude Code tokens and manage your API costs.
Core Concept
As your chat context grows, so does the cost of tokens. The context includes not only file reads and command output but also system prompts and chat history. According to Anthropic, token cost rises as the context size grows. To avoid unnecessary expense, keep your working context lean. By optimizing context window usage from the start, you can better manage token usage and keep costs consistent across projects.
High-Impact Context Management Strategies
1. Clear Chat Between Tasks
Clear your chat when you switch tasks. Type /clear to start a new session. This prevents stale debug logs from wasting tokens. Starting fresh reduces your Claude Code costs.
Use:
/rename auth-debug-apr30
/clear 
Resume later:
/resume
2. Compact Context to Continue
Use the /compact command for long tasks. It summarizes the conversation, keeping the thread but dropping old data. This helps you save Claude Code tokens.
Add custom compact instructions to CLAUDE.md:
# Compact instructions
When compacting, preserve:
- current task goal
- files changed
- commands already run
- failing tests and exact errors
- decisions made
- next action list
Drop:
- old exploration paths
- repeated logs
- irrelevant discussion
Then, in Claude Code, run:
/compact

3. Lower the Auto-Compact Threshold
Compact the chat earlier than the default limit. Claude auto-compacts when the context is near 95 percent capacity. Set the override to 70 for normal work.
export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=70
Use 50 for a noisy workflow.
export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50
This strategy helps you manage token usage.
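To see what a percentage override means in absolute terms, the sketch below converts it into a token count. The 200,000-token window is an assumed figure for illustration only, and compact_trigger_tokens is a hypothetical helper, not part of Claude Code.

```python
def compact_trigger_tokens(window_tokens: int, pct: int) -> int:
    """Return the token count at which auto-compaction would start."""
    if not 0 < pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    return window_tokens * pct // 100


# Assumed context window size, for illustration only.
WINDOW = 200_000

for pct in (95, 70, 50):
    tokens = compact_trigger_tokens(WINDOW, pct)
    print(f"{pct}% -> compacts near {tokens:,} tokens")
```

Under that assumption, lowering the override from 95 to 70 moves the compaction point tens of thousands of tokens earlier in a session.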
4. Monitor Usage Metrics
Check your limits with dedicated commands. Type /context to see what is taking up space. Type /usage to track your session cost. Run these before larger tasks so you can free up context window space first.


5. Add a Live Status Line
Add a status line to your terminal. It can show the live context percentage and the cost of the model, which prevents unexpected token spikes. This improves your AI coding assistant experience.
Use this JSON configuration in your ~/.claude/settings.json file:
{
"statusLine": {
"type": "command",
"command": "jq -r '\"[\\(.model.display_name)] \\(.context_window.used_percentage // 0)% context\"'"
}
}
Or you can have Claude Code create this for you automatically by running this command inside Claude Code:
/statusline show model name and context percentage
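If jq is not installed, the same line could be produced by a small script. The sketch below is one possible version: the field names are taken from the jq filter above, while format_status, the sample payload, and the "unknown" fallback are assumptions made for illustration.

```python
def format_status(payload: dict) -> str:
    """Build a '[model] N% context' line from a session payload."""
    # Field names mirror the jq filter; missing or null values fall back to defaults.
    model = (payload.get("model") or {}).get("display_name") or "unknown"
    used = (payload.get("context_window") or {}).get("used_percentage") or 0
    return f"[{model}] {used}% context"


# Sample payload for illustration; real values would arrive as JSON on stdin.
sample = {
    "model": {"display_name": "Sonnet"},
    "context_window": {"used_percentage": 42},
}
print(format_status(sample))
```

To use it as a status line command, the script would read the payload with json.load(sys.stdin) instead of the sample and be referenced from the "command" field.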
Instruction and File Optimization
6. Minimize Your Global Instructions
Keep your main instruction file short. Anthropic suggests keeping CLAUDE.md under 200 lines. Large files cost tokens in every session. Keep only the essential facts there. This improves your Claude Code token savings.
# Project essentials
- Package manager: pnpm
- Test command: pnpm test
- Typecheck: pnpm typecheck
- Main app code: src/
- API handlers: src/api/
- Do not edit generated files in src/generated/
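A quick check can flag when the file drifts past the 200-line guideline. This is a minimal sketch: at_or_over_limit is a hypothetical helper, and it assumes CLAUDE.md sits in the current working directory.

```python
from pathlib import Path


def at_or_over_limit(text: str, limit: int = 200) -> bool:
    """Return True when the text is not under `limit` lines."""
    return len(text.splitlines()) >= limit


# Check a CLAUDE.md in the current directory, if one exists.
path = Path("CLAUDE.md")
if path.exists():
    too_long = at_or_over_limit(path.read_text(encoding="utf-8"))
    print(f"{path}: {'too long, trim it' if too_long else 'ok'}")
```

Running it in a pre-commit hook or CI step is one way to keep the file from growing unnoticed.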
7. Use Path-Scoped Rules
Use path-scoped rules instead of global ones. Set specific rules for specific folders. A rule loads only when Claude works on matching files. Keeping irrelevant instructions out of context reduces your Claude Code costs.
---
paths:
- "src/api/**/*.ts"
---
# API rules
- Validate all request inputs.
- Use the standard error response shape.
- Add tests for authorization failures.
To apply path-scoped rules in Claude Code, add them to a Markdown file inside the .claude/rules/ directory of your project.
Create a new .md file inside the rules folder. A common convention is to name it after the subsystem it governs:
.claude/rules/api-validation.md (any name ending in .md).
8. Move Specialized Workflows into Skills
Move specialized workflows into separate skills. Skills load only when needed. Add the disable-model-invocation flag to keep a skill hidden until you invoke it. This keeps the context clean and helps you manage token usage.
You can add a Claude skill under .claude/skills/:
---
name: fix-issue
description: Fix a GitHub issue by number
disable-model-invocation: true
allowed-tools: Bash(gh *) Bash(pnpm test *) Read Grep Edit
---
Fix GitHub issue $ARGUMENTS.
Steps:
1. Use gh issue view to read the issue.
2. Identify the smallest relevant files.
3. Write or update tests first.
4. Implement the fix.
5. Run the targeted test.
6. Summarize files changed.
Invoke it using:
/fix-issue 123
9. Select CLI Tools
Prefer CLI tools over MCP server tools. Anthropic favors standard CLI tools over MCP servers. CLI tools add only a small amount of context. Disable unused MCP servers as well. This keeps your AI coding assistant focused.
Good request:
Use gh to inspect PR 42 and return only the failing test names.
10. Cap Server Output
Limit the size of your MCP tool output. Tool results fill your conversation context. Set the maximum to 8000 tokens. This makes better use of context window space.
export MAX_MCP_OUTPUT_TOKENS=8000
11. Cap Terminal Output
Cap the output of your terminal commands. Long test logs burn through tokens quickly. Set the bash output length to 20000. This protects your Claude Code token savings.
export BASH_MAX_OUTPUT_LENGTH=20000
12. Filter Logs
Filter log output before Claude sees it. Don’t feed passing (green) logs into the conversation. Use basic commands to extract the error lines. This step helps reduce your Claude Code costs.
pnpm test 2>&1 | grep -A 5 -E "FAIL|ERROR|Error|failed" | head -120
If you want to run a full session with the filtered logs preloaded into the context, pipe the output into the standard claude command.
Start Claude Code with the following command:
pnpm test 2>&1 | grep -A 5 -E "FAIL|ERROR|Error|failed" | head -120 | claude
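When a shell pipeline is awkward, for example inside a test wrapper script, the same filtering can be approximated in a few lines. The sketch below mirrors the pattern, the five lines of trailing context, and the 120-line cap from the command above; filter_failures is a hypothetical helper, and unlike grep it does not print "--" separators between groups.

```python
import re
from typing import Iterable, List

# Same alternation as the grep -E pattern in the shell pipeline.
FAILURE = re.compile(r"FAIL|ERROR|Error|failed")


def filter_failures(lines: Iterable[str], after: int = 5, limit: int = 120) -> List[str]:
    """Keep matching lines plus `after` trailing lines, capped at `limit`."""
    kept: List[str] = []
    remaining = 0  # context lines still owed after the latest match
    for line in lines:
        if FAILURE.search(line):
            remaining = after
            kept.append(line)
        elif remaining > 0:
            remaining -= 1
            kept.append(line)
        if len(kept) >= limit:
            break
    return kept


# Tiny made-up log for illustration.
log = ["PASS a.test.ts", "FAIL b.test.ts", "  expected 1", "  received 2"] + ["noise"] * 10
print("\n".join(filter_failures(log, after=2)))
```

Only the failing test and its two context lines survive, so the passing output never reaches the conversation.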
Model Techniques and Agents
13. Use Subagents
Use subagents for verbose research tasks. Subagents handle heavy reading in an isolated context and return concise summaries to the main conversation. This helps you control token usage.
Use a subagent to inspect the failing auth tests and logs. Return only:
1. failing test names
2. likely root cause
3. files that need edits
4. shortest fix plan
If you do this kind of investigative work often, you can define a permanent subagent in a Markdown file such as .claude/agents/investigator.md
After saving, you can just type /investigator "auth tests are failing" to trigger the workflow.
Alternatively, let Claude create it for you.
Run /agents in Claude Code.
Press the left arrow key to go to Library and choose to create a new agent.
Then choose the Personal or Project scope and start with Claude.
14. Choose Cheaper Models
Choose cheaper models for general work. Sonnet handles most day-to-day coding tasks and costs less than Opus. Reserve Opus for deep architectural reasoning. This suits a smart AI coding assistant workflow.
claude --model haiku 
15. Lower the Effort Level
Lower the effort level for simple tasks. Low effort runs faster and costs less. Use medium effort for normal coding. Avoid high settings. This supports your Claude Code token savings.
/effort low 
16. Disable Extended Thinking
Disable extended thinking for simple edits. Thinking tokens count as output tokens. Set a hard token cap for basic operations. This can reduce your Claude Code costs considerably.
export CLAUDE_CODE_DISABLE_THINKING=1
17. Use Code Plugins
Install code intelligence plugins for typed languages. These plugins provide precise symbol navigation, so Claude can skip reading irrelevant files. This makes better use of the context window.
File Access and Workflow Control
18. Deny Noisy Files
Deny access to noisy project files. Edit your local settings file and block access to log and build folders. Claude cannot read files that are denied. This protects your AI coding assistant workflow.
Open ~/.claude/settings.json and merge this JSON into your existing file:
{
"permissions": {
"deny": [
"Read(./.env)",
"Read(./.env.*)",
"Read(./secrets/**)",
"Read(./node_modules/**)",
"Read(./dist/**)",
"Read(./build/**)",
"Read(./coverage/**)",
"Read(./.next/**)",
"Read(./tmp/**)",
"Read(./logs/**)",
"Read(./*.log)"
]
}
}
19. Avoid Broad Scans
Don’t ask Claude to read the entire repository. Vague prompts trigger scans of large numbers of files. Give the exact file names instead. This simple rule helps manage token usage.
Good request:
Login redirect failed. Start with src/auth/session.ts. Read related files only.
20. Provide Verification Targets
Provide verification targets up front. Tell Claude how to check its work. Provide the expected output and specific test names. This prevents repeated fix loops and helps save Claude Code tokens.
21. Course-Correct the Model
Course-correct the model early in the process. Interrupt Claude if it is reading irrelevant files. Rewind the session to a safe point. Stopping wrong paths early reduces your Claude Code costs.
22. Use a Short System Prompt
Use the short system prompt with Opus 4.7. Enable this hidden setting carefully. It drops the long tool descriptions. This trick helps optimize context window space.
export CLAUDE_CODE_SIMPLE_SYSTEM_PROMPT=1
23. Remove Git Instructions
Remove the built-in git instructions if needed. This disables the default git workflows. Do this only if you are using a custom workflow. It trims the baseline instructions of your AI coding assistant.
export CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS=1
Recommended Configuration
Use these local settings for everyday coding tasks:
{
"permissions": {
"deny": [
"Read(./.env)",
"Read(./.env.*)",
"Read(./secrets/**)",
"Read(./node_modules/**)",
"Read(./dist/**)",
"Read(./build/**)",
"Read(./coverage/**)",
"Read(./.next/**)",
"Read(./tmp/**)",
"Read(./logs/**)",
"Read(./*.log)"
]
},
"env": {
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "70",
"BASH_MAX_OUTPUT_LENGTH": "20000",
"MAX_MCP_OUTPUT_TOKENS": "8000",
"CLAUDE_CODE_EFFORT_LEVEL": "medium"
}
}
Use this setup for aggressive savings:
{
"env": {
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
"BASH_MAX_OUTPUT_LENGTH": "12000",
"MAX_MCP_OUTPUT_TOKENS": "5000",
"CLAUDE_CODE_EFFORT_LEVEL": "low"
}
}
Good Request Template
Follow this template format to save tokens:
Task: Fix [specific bug] in [specific files].
Scope:
- Start with: [file1], [file2]
- Do not scan the whole repo.
- Only read additional files if they are imported.
Token discipline:
- Keep command output short.
- Filter test output to failures only.
- Summarize findings before editing.
- If context exceeds 70%, compact the chat.
Verification:
- Add or update targeted tests.
- Run only the relevant test file first.
- Run broader tests after the targeted test passes.
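If you reuse the template often, it can be filled in programmatically. The sketch below is one way to do that: build_request and its parameters are hypothetical names, and the example file path is borrowed from the request shown in tip 19.

```python
from typing import List


def build_request(bug: str, files: List[str], compact_pct: int = 70) -> str:
    """Fill the request template with a bug description and starting files."""
    if not files:
        raise ValueError("name at least one starting file")
    joined = ", ".join(files)
    return "\n".join([
        f"Task: Fix {bug} in {joined}.",
        "Scope:",
        f"- Start with: {joined}",
        "- Do not scan the whole repo.",
        "- Only read additional files if they are imported.",
        "Token discipline:",
        "- Keep command output short.",
        "- Filter test output to failures only.",
        "- Summarize findings before editing.",
        f"- If context exceeds {compact_pct}%, compact the chat.",
        "Verification:",
        "- Add or update targeted tests.",
        "- Run only the relevant test file first.",
        "- Run broader tests after the targeted test passes.",
    ])


print(build_request("the login redirect failure", ["src/auth/session.ts"]))
```

Requiring at least one file is a deliberate choice here: a request with no starting point invites the broad scan the template is meant to prevent.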
Things to Avoid
- Don’t rely on files that have become outdated. The system overrides those old settings. Use the deny permission settings instead.
- Don’t install every available plugin. New plugins appear regularly. Disable unused tools to maintain speed.
- Don’t default to the most expensive model. Use Opus for complex tasks. Rely on Sonnet for daily work.
Conclusion
Controlling your tools builds confidence in your project and helps protect your budget. Managing token usage fine-tunes your AI assistant and makes development more efficient and cost-effective. Teams that optimize context window space can reduce API costs significantly. Setting clear limits, such as clearing conversations, limiting file access, and writing concise prompts, leads to real savings. By applying these strategies to your next project, you’ll improve both your budget and your code quality.
Frequently Asked Questions
A. Type the /clear command in your terminal. This drops all previous context and starts fresh.
A. Vague prompts cause a large codebase scan. Provide precise file names to limit the scope of the search.
A. Set a BASH_MAX_OUTPUT_LENGTH limit in your environment. Filter test results with standard shell tools.