23 Smart Tips to Save Claude Code Tokens

admin 2 hours ago


Using Claude Code on large projects can drive up token costs. A 2025 Stanford study reveals that developers waste thousands of tokens every day, draining budgets as unchecked context piles up. By setting strict limits from the start, teams can reduce costs without compromising code quality. Optimizing token usage and context window size early keeps projects efficient and on track. In this article, we break down the key steps you should take to save Claude Code tokens and manage your API costs.

Core Concept

As your chat context grows, so does the cost in tokens. The context includes not only file reads and command output but also system instructions and chat history. According to Anthropic, token costs increase as the context size increases. To avoid unnecessary expense, keep your working context compact. By managing your context window from the start, you can control token usage and keep costs consistent across projects.
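To see why a growing context matters, here is a minimal sketch of the arithmetic. It assumes a simplified model in which every request resends the whole conversation so far, and it ignores prompt caching, compaction, and pricing. The function name and the numbers are illustrative assumptions.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, base_context: int = 0) -> int:
    """Total input tokens sent over a session when each request carries
    the full conversation so far (simplified: no caching, no compaction)."""
    total = 0
    context = base_context
    for _ in range(turns):
        context += tokens_per_turn  # the conversation grows every turn
        total += context            # the whole context is sent as input
    return total


# Hypothetical numbers: one 20-turn session versus two 10-turn sessions
# separated by a fresh start, each turn adding 2,000 tokens.
one_long_session = cumulative_input_tokens(turns=20, tokens_per_turn=2_000)
two_short_sessions = 2 * cumulative_input_tokens(turns=10, tokens_per_turn=2_000)
print(one_long_session, two_short_sessions)
```

Under these assumptions the single long session sends roughly twice as many input tokens as the two short ones, which is the intuition behind clearing and compacting early.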

High-Impact Context Management Strategies

1. Clear Chat Between Tasks

Clear your chat when you switch tasks. Type /clear to start a new session. This prevents stale debug logs from wasting tokens. Starting fresh reduces your Claude Code costs.

Use:

/rename auth-debug-apr30
/clear

Resume later:

/resume

2. Gather Content to Continue

Use the /compact command for long tasks. It summarizes the conversation, keeping the thread while dropping old detail. This helps you save Claude Code tokens.

Add custom compaction instructions to CLAUDE.md:

# Compact instructions

When compacting, preserve:
- current task goal
- files changed
- commands already run
- failing tests and exact errors
- decisions made
- next action list

Drop:
- old exploration paths
- repeated logs
- irrelevant discussion

Then run this in Claude Code:

/compact

3. Lower the Auto-Compact Threshold

Compact the chat earlier than the default limit. Claude auto-compacts near 95 percent capacity. Set the override to 70 for normal work.

export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=70

Use 50 for a noisy workflow.

export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50

This strategy helps you manage token usage.
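To make the threshold concrete, here is a minimal sketch of the arithmetic, assuming the override is a percentage of the context window in use. The 200,000-token window and the function name are illustrative assumptions, not values taken from Claude Code.

```python
def compaction_trigger(window_tokens: int, pct: int) -> int:
    """Token count at which compaction would start for a given percentage."""
    if not 0 < pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    return window_tokens * pct // 100


WINDOW = 200_000  # hypothetical context window size, for illustration only

for pct in (95, 70, 50):
    trigger = compaction_trigger(WINDOW, pct)
    print(f"{pct}% -> compact at {trigger} tokens, {WINDOW - trigger} left as headroom")
```

A lower percentage compacts sooner, so less stale history is carried through each later request.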

4. Monitor Usage Metrics

Check your usage with dedicated commands. Type /context to see what is taking up space. Type /usage to track your session cost. Run these before larger tasks so you can free up context window space first.

5. Add a Live Status Line

Add a status line to your terminal. It shows the live context percentage and model cost. This helps you catch unexpected token spikes and improves your AI coding assistant experience.

Use this JSON configuration in your ~/.claude/settings.json file:

{
  "statusLine": {
    "type": "command",
    "command": "jq -r '\"[\\(.model.display_name)] \\(.context_window.used_percentage // 0)% context\"'"
  }
}

Or you can have Claude Code create this for you automatically by running this command within a Claude Code session:

/statusline show model name and context percentage


Instruction and File Optimization

6. Minimize Your Global Instructions

Keep your main instruction file short. Anthropic suggests keeping CLAUDE.md under 200 lines. Large files cost tokens in every session. Keep only the essential facts there. This strategy improves your Claude Code token savings.

# Project essentials

- Package manager: pnpm
- Test command: pnpm test
- Typecheck: pnpm typecheck
- Main app code: src/
- API handlers: src/api/
- Do not edit generated files in src/generated/
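A quick way to keep the file within that guideline is to count its lines as part of a routine check. The sketch below assumes the instruction file is named CLAUDE.md in the current directory. The helper names and the report format are illustrative assumptions.

```python
from pathlib import Path

MAX_LINES = 200  # guideline quoted above; adjust to taste


def count_lines(text: str) -> int:
    """Number of lines in the given text."""
    return len(text.splitlines())


def check_instruction_file(path: str, max_lines: int = MAX_LINES) -> str:
    """Return a one-line report on whether the file stays under max_lines."""
    file = Path(path)
    if not file.exists():
        return f"{path}: not found"
    lines = count_lines(file.read_text(encoding="utf-8"))
    status = "OK" if lines < max_lines else "too long"
    return f"{path}: {lines} lines ({status})"


if __name__ == "__main__":
    print(check_instruction_file("CLAUDE.md"))
```

Running it from the project root prints a single status line, which makes it easy to add to a pre-commit hook or CI step.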

7. Use Path-Scoped Rules

Use path-scoped rules instead of global ones. Set specific rules for specific folders. They load only when Claude works on matching files. This reduces Claude Code costs by keeping irrelevant instructions out of context.

---
paths:
- "src/api/**/*.ts"
---

# API rules

- Validate all request inputs.
- Use the standard error response shape.
- Add tests for authorization failures.

To apply path-scoped rules in Claude Code, add them to a Markdown file within the .claude/rules/ directory of your project.

Create a new .md file inside the rules folder. A common convention is to name it after the subsystem it governs:

.claude/rules/api-validation.md (any name ending in .md).

8. Separate Specialized Workflows

Move specialized workflows into separate skills. Skills load on demand. Add the disable-model-invocation flag to keep a skill hidden until you invoke it. This keeps the context clean and helps you manage token usage.

You can add a Claude skill at .claude/skills//SKILL.md (in the root of your project), or make it available globally from your global .claude/ folder.

---
name: fix-issue
description: Fix a GitHub issue by number
disable-model-invocation: true
allowed-tools: Bash(gh *) Bash(pnpm test *) Read Grep Edit
---

Fix GitHub issue $ARGUMENTS.

Steps:
1. Use gh issue view to read the issue.
2. Identify the smallest relevant files.
3. Write or update tests first.
4. Implement the fix.
5. Run the targeted test.
6. Summarize files changed.

Invoke it with:

/fix-issue 123

9. Prefer CLI Tools

Prefer CLI tools over MCP servers. Anthropic favors standard CLI tools over MCP servers. CLI tools add only a small amount of context. Disable unused MCP servers as well. This keeps your AI coding assistant focused.

Good request:

Use gh to check PR 42 and return only the names of failing tests.

10. Cap MCP Server Output

Limit the size of your MCP tool output. Tool results fill your conversation context. Set the maximum to 8000 tokens. This conserves context window space.

export MAX_MCP_OUTPUT_TOKENS=8000

11. Cap Terminal Output

Cap the output of your terminal commands. Long test logs consume tokens quickly. Set the maximum bash output length to 20000. This protects your Claude Code token savings.

export BASH_MAX_OUTPUT_LENGTH=20000

12. Filter Logs

Filter log output before Claude sees it. Don’t feed raw logs into the conversation. Use basic shell commands to extract the error lines. This step helps reduce your Claude Code costs.

pnpm test 2>&1 | grep -A 5 -E "FAIL|ERROR|Error|failed" | head -120

If you want to start a full session with the filtered logs preloaded into the context, pipe the output into the standard claude command.

Start Claude Code with the following command:

pnpm test 2>&1 | grep -A 5 -E "FAIL|ERROR|Error|failed" | head -120 | claude
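If the shell pipeline is not available, for example inside a test runner script, the same idea can be sketched in Python. This is an approximation under stated assumptions: it mirrors the pattern, the five lines of trailing context, and the 120-line cap from the command above, but it does not print the "--" group separators that grep adds. The function name is hypothetical.

```python
import re
from typing import Iterable, List

# Same alternatives as the grep pattern above.
PATTERN = re.compile(r"FAIL|ERROR|Error|failed")


def filter_failures(lines: Iterable[str], after: int = 5, limit: int = 120) -> List[str]:
    """Keep matching lines plus `after` lines of trailing context,
    stopping once `limit` lines have been collected."""
    kept: List[str] = []
    remaining = 0  # context lines still to keep after the latest match
    for line in lines:
        if len(kept) >= limit:
            break
        if PATTERN.search(line):
            remaining = after
            kept.append(line)
        elif remaining > 0:
            remaining -= 1
            kept.append(line)
    return kept


# Hypothetical test output, for illustration only.
sample = ["ok test_login", "FAIL test_logout", "  expected 302", "ok test_signup"]
print("\n".join(filter_failures(sample, after=1)))
```

Only the failing test and its first context line survive, so the passing tests never reach the conversation.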

Model Strategies and Agents

13. Use Subagents

Use subagents for verbose research tasks. Subagents handle heavy reading in an isolated context. They return concise summaries to the main conversation. This helps you control token usage.

Use a subagent to inspect the failing auth tests and logs. Return only:
1. failing test names
2. likely root cause
3. files that need edits
4. shortest fix plan

If you do this kind of investigative work often, you can define a permanent subagent by creating a Markdown file at .claude/agents/investigator.md

After saving, you can ask Claude to use it by name, for example: Use the investigator subagent to find out why the auth tests are failing.

Or you can have Claude create the subagent for you:

Run /agents in Claude Code.

Press the left arrow key to go to Library and choose to create a new agent.

Then choose Personal or Project scope and generate the agent with Claude.

14. Choose Cheap Models

Choose cheaper models for general work. Sonnet handles most day-to-day coding tasks and costs less than Opus. Reserve Opus for deep architectural thinking. This suits the workflow of a smart AI coding assistant.

claude --model sonnet

15. Lower the Level of Effort

Reduce the effort level for simple tasks. Low effort works faster and costs less. Use medium effort for normal coding. Avoid high settings. This supports your Claude Code token savings.

/effort low

16. Disable Extended Thinking

Disable extended thinking for easy edits. Thinking tokens count as output tokens. Set a hard token cap for basic operations. This reduces your Claude Code costs considerably.

export CLAUDE_CODE_DISABLE_THINKING=1

17. Use Code Intelligence Plugins

Install code intelligence plugins for typed languages. These plugins provide precise symbol navigation, so Claude can skip reading irrelevant files. This makes better use of the context window.

File Access and Workflow Control

18. Deny Noisy Files

Deny access to noisy project files. Edit your settings file. Block access to log and build folders. Claude cannot read these denied files. This protects your AI coding assistant workflow.

Open ~/.claude/settings.json and merge this JSON into your existing file:

{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./secrets/**)",
      "Read(./node_modules/**)",
      "Read(./dist/**)",
      "Read(./build/**)",
      "Read(./coverage/**)",
      "Read(./.next/**)",
      "Read(./tmp/**)",
      "Read(./logs/**)",
      "Read(./*.log)"
    ]
  }
}
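Merging by hand risks overwriting rules you already have. As a minimal sketch, the helper below adds deny rules to an existing settings dictionary without duplicating entries or touching other keys. The function name and the sample "allow" entry are hypothetical, and reading or writing the actual settings file is left out.

```python
import json
from typing import Any, Dict, List


def merge_deny_rules(settings: Dict[str, Any], new_rules: List[str]) -> Dict[str, Any]:
    """Return a copy of `settings` with `new_rules` appended to
    permissions.deny, skipping rules that are already present."""
    merged = json.loads(json.dumps(settings))  # deep copy via JSON round-trip
    deny = merged.setdefault("permissions", {}).setdefault("deny", [])
    for rule in new_rules:
        if rule not in deny:
            deny.append(rule)
    return merged


# Hypothetical existing settings, for illustration only.
existing = {"permissions": {"deny": ["Read(./.env)"], "allow": ["Bash(pnpm test *)"]}}
updated = merge_deny_rules(existing, ["Read(./.env)", "Read(./logs/**)"])
print(json.dumps(updated, indent=2))
```

The original dictionary is left unchanged, so you can compare the two before writing anything back to disk.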

19. Avoid Broad Scans

Don’t ask Claude to read the entire repository. Vague prompts trigger scans across many files. Give exact file names instead. This simple rule helps manage token usage.

Good request:

Login redirect failed. Start with src/auth/session.ts. Read related files only.

20. Provide Verification Targets

Provide verification targets up front. Tell Claude how to check its work. Give the expected output and specific test names. This prevents repeated fix loops and helps save Claude Code tokens.

21. Course-Correct the Model

Course-correct the model early in the process. Interrupt Claude if it is reading irrelevant files. Rewind the session to a safe point. Stopping bad paths early reduces your Claude Code costs.

22. Use the Short System Prompt

Use the short system prompt with Opus 4.7. Enable this hidden setting carefully. It drops long tool descriptions. This trick helps conserve context window space.

export CLAUDE_CODE_SIMPLE_SYSTEM_PROMPT=1

23. Remove Git Instructions

Remove the built-in git instructions if needed. This disables the default git workflow guidance. Do this only if you use a custom workflow. It trims the baseline context of your AI coding assistant.

export CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS=1

Recommended Configuration

Use these settings for everyday coding tasks:

{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./secrets/**)",
      "Read(./node_modules/**)",
      "Read(./dist/**)",
      "Read(./build/**)",
      "Read(./coverage/**)",
      "Read(./.next/**)",
      "Read(./tmp/**)",
      "Read(./logs/**)",
      "Read(./*.log)"
    ]
  },
  "env": {
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "70",
    "BASH_MAX_OUTPUT_LENGTH": "20000",
    "MAX_MCP_OUTPUT_TOKENS": "8000",
    "CLAUDE_CODE_EFFORT_LEVEL": "medium"
  }
}

Use this setup for aggressive savings:

{
  "env": {
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "BASH_MAX_OUTPUT_LENGTH": "12000",
    "MAX_MCP_OUTPUT_TOKENS": "5000",
    "CLAUDE_CODE_EFFORT_LEVEL": "low"
  }
}
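If you switch between the two setups often, a small helper can generate either form from one table. The sketch below reuses the values from the two configurations above. The profile names "balanced" and "aggressive" and the function names are illustrative assumptions.

```python
import json
from typing import Dict

# Values copied from the two configurations above; profile names are hypothetical.
PROFILES: Dict[str, Dict[str, object]] = {
    "balanced": {
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": 70,
        "BASH_MAX_OUTPUT_LENGTH": 20000,
        "MAX_MCP_OUTPUT_TOKENS": 8000,
        "CLAUDE_CODE_EFFORT_LEVEL": "medium",
    },
    "aggressive": {
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": 50,
        "BASH_MAX_OUTPUT_LENGTH": 12000,
        "MAX_MCP_OUTPUT_TOKENS": 5000,
        "CLAUDE_CODE_EFFORT_LEVEL": "low",
    },
}


def env_block(profile: str) -> Dict[str, Dict[str, str]]:
    """Build the "env" section for a settings file, with every value as a string."""
    return {"env": {name: str(value) for name, value in PROFILES[profile].items()}}


def export_lines(profile: str) -> str:
    """Render the same profile as shell export statements."""
    return "\n".join(f"export {name}={value}" for name, value in env_block(profile)["env"].items())


print(json.dumps(env_block("aggressive"), indent=2))
print(export_lines("balanced"))
```

Keeping both profiles in one place makes it harder for the JSON settings and the shell exports to drift apart.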

Effective Request Template

Follow this template format to save tokens:

Task: Fix [specific bug] in [specific files].

Scope:
- Start with: [file1], [file2]
- Do not scan the whole repo.
- Only read additional files if they are imported.

Token discipline:
- Keep command output short.
- Filter test output to failures only.
- Summarize findings before editing.
- If context exceeds 70%, compact the chat.

Verification:
- Add or update targeted tests.
- Run only the relevant test file first.
- Run broader tests after the targeted test passes.
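To avoid retyping the template, it can be filled in programmatically. The following is a minimal sketch that reproduces the template text above and substitutes the bug description and starting files. The function name and its parameters are hypothetical.

```python
from typing import List


def build_request(bug: str, files: List[str], compact_at_pct: int = 70) -> str:
    """Fill the request template with a bug description and starting files."""
    if not files:
        raise ValueError("name at least one starting file")
    file_list = ", ".join(files)
    return "\n".join([
        f"Task: Fix {bug} in {file_list}.",
        "",
        "Scope:",
        f"- Start with: {file_list}",
        "- Do not scan the whole repo.",
        "- Only read additional files if they are imported.",
        "",
        "Token discipline:",
        "- Keep command output short.",
        "- Filter test output to failures only.",
        "- Summarize findings before editing.",
        f"- If context exceeds {compact_at_pct}%, compact the chat.",
        "",
        "Verification:",
        "- Add or update targeted tests.",
        "- Run only the relevant test file first.",
        "- Run broader tests after the targeted test passes.",
    ])


print(build_request("the failing login redirect", ["src/auth/session.ts"]))
```

Requiring at least one file keeps the habit of naming a starting point instead of inviting a broad scan.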

Things to Avoid

Don’t rely on outdated configuration files. The system overrides these old settings. Use the permissions deny settings instead.
Don’t install every available plugin. New plugins appear regularly. Disable unused tools to maintain speed.
Don’t default to the most expensive model. Reserve Opus for complex tasks. Rely on Sonnet for daily work.


Conclusion

Controlling your tools builds confidence in your project and helps protect your budget. Managing token usage keeps your AI assistant focused and makes development more efficient and cost-effective. Teams that manage their context window space can reduce API costs significantly. Setting clear limits, such as clearing conversations, restricting file access, and writing concise prompts, leads to real savings. By applying these strategies to your next project, you’ll improve both your budget and code quality.

Frequently Asked Questions

Q1. How do I start a new conversation thread?

A. Type the /clear command in your terminal. This drops all the previous context and starts anew.

Q2. Why is Claude reading so many files?

A. Vague prompts cause large codebase scans. Provide precise file names to limit the scope of the search.

Q3. How do I handle large test logs?

A. Set a BASH_MAX_OUTPUT_LENGTH limit in your environment. Filter test results with standard bash tools.

Harsh Mishra is an AI/ML Engineer who spends more time talking to Large Language Models than to real people. He is interested in GenAI, NLP, and making machines smarter (so they don’t replace him just yet). When he isn’t optimizing models, he is probably optimizing his coffee intake. 🚀☕