The Miser, grumpy, pinching two glowing tokens between thumb and forefinger — representing the cost saved by placing shared content in the right cache tier.

Skills vs Rules vs Reads: The Claude Code Subagent Caching Cheatsheet


Accessibility rules. Coding standards. The list of behaviors that will get your API key revoked. Every Claude Code setup has a pile of shared content and a decision: where does it live?

Four candidates.

  1. The agent body (markdown in .claude/agents/xxx.md)
  2. .claude/rules/*.md
  3. A skill at .claude/skills/<name>/SKILL.md, listed in the agent’s skills: frontmatter
  4. A Read call in the agent’s startup checklist

From the outside, interchangeable. They are not.

Two reach subagents. Two don’t. One caches across every agent spawn for the price of a single write. One re-bills on every spawn. Assume wrong and you get a pipeline that quietly ignores your standards, hands you a larger bill, and the bill and the standards-violation arrive uncorrelated enough that you blame the model.

What a subagent actually receives

Straight from the docs: “Subagents receive only this system prompt (plus basic environment details like working directory), not the full Claude Code system prompt” [1].

Unpacked:

Subagent context:
  System prompt:
    - Subagent harness
    - Agent-specific memory file (if `memory:` is set)
    - Skills content (if `skills:` is set)
    - Agent body (markdown below the frontmatter)
  Project context:
    - CLAUDE.md (auto-loaded)
  Messages:
    - Task prompt from the parent
    - [conversation]

The official docs confirm what is injected [1][2]. Layer ordering inside the system prompt is not specified there; the sequence above is the community-observed one [3].

What is not on that list: .claude/rules/*.md. Rules files are a main-session feature. They do not propagate. If subagents have been ignoring the coding standards you wrote in .claude/rules/, that is not the model misbehaving. The rules never arrived.

Also missing: inherited MCP tool definitions. Those sit at position zero of the cache prefix and have their own cost story [see The MCP Tool Prefix Trap].

The placement table

LocationReaches subagents?Caches across agent types?
Agent body (markdown below frontmatter)YesNo (unique per agent type)
.claude/rules/*.mdNoN/A (main session only)
.claude/skills/<name>/SKILL.md via skills:YesYes, if same skill is listed
Read call in startup checklistYesNo (re-read per spawn, lands in messages)
CLAUDE.mdYes (auto-loaded)Yes (same per project)
memory: frontmatterYesNo (memory varies per agent)

First column decides whether your content shows up at all. Second decides whether you pay once or on every spawn.

Caching mechanics (the 10% cached-read rate, write premium of input × 1.25 for 5-minute TTL or input × 2 for 1-hour, Sonnet 4.6 pricing as of 2026-04-14, minimum-prefix thresholds, the full invalidation list) are covered in Prompt Caching: Read the usage Block. Those pricing rates are Anthropic’s posted numbers and do not depend on the CLI version. Observed cache_creation_input_tokens measurements on Claude Code 2.1.100–2.1.107 were subject to two concurrent regressions: a telemetry-gated TTL downgrade and #46917’s cache_creation inflation [5]. Both appear resolved in v2.1.108. Re-measure on 2.1.108 or later if you are reproducing these numbers from your own traces.

The decision tree

“Every agent should know this.”

Skill, listed in each agent’s skills: frontmatter.

# .claude/skills/shared-rules/SKILL.md
---
name: shared-rules
description: Cross-project rules every agent should follow.
user-invocable: false
---

# Shared Rules

## Accessibility
- Color alone must never convey meaning (symbol plus text label).
- No yellow/green adjacent status indicators.

## Python tooling
- uv/uvx for all Python package management. Never pip.
# .claude/agents/web-dev.md
---
name: web-dev
description: Front-end development agent
tools: Read, Write, Edit, Bash, Glob, Grep
skills:
  - shared-rules
---

Full skill content injects into the subagent’s system prompt at startup, per the docs: “not just made available for invocation” [2]. Set user-invocable: false and it stays background knowledge instead of cluttering the slash-command list.

First spawn pays a cache write at input × 1.25. Every spawn after that reads the same bytes at input × 0.1 until the TTL runs out. The second read is the break-even point on the write premium. Everything after the second read is money off.

“This is project context (architecture, conventions, tech stack).”

CLAUDE.md. Auto-loaded in the main session and in subagents via settingSources in the SDK, via the normal message flow for file-based agents. Caches across all spawns. Keep it lean: a product summary and pointers, not a full wiki.

“This is one agent’s role.”

Agent body. Caches across spawns of that type only. Role identity is not shared knowledge.

“This is per-task data (a ticket ID, a file path, a log blob).”

Read call. Lands in messages. Doesn’t cache across spawns. Dynamic content should be dynamic.

“What about .claude/rules/?”

Main-session working agreements. Session guides. Things a human wants enforced in the chat window. Not for anything subagents need to know. They will not see it.

The startup checklist anti-pattern

First drafts often look like this:

# .claude/agents/code-reviewer.md
---
name: code-reviewer
---

# Code Reviewer

## Startup Checklist
1. Read `.claude/rules/coding-standards.md`
2. Read `.claude/rules/security-checklist.md`
3. Read `CLAUDE.md`

Three problems on three lines.

Lines 1 and 2: .claude/rules/ does not reach subagents. The Read call may succeed (the file exists on disk, the agent has Read in its toolset), but the rules were never going to appear in this subagent’s prompt automatically. A Read call is not the fix. The actual bug: standards placed in a rules directory that subagents cannot access.

Even if the Read succeeded, results land in the messages layer, not the system prompt. Messages don’t share cache across spawns. Five code-reviewer subagents each reading 6,000 tokens of standards from scratch: five message-tier reads at full input rate. A third of a cent per spawn becomes $0.30 across fifteen dispatches.

Line 3: CLAUDE.md is auto-injected. Reading it explicitly tells the agent to re-do work the harness already finished.

The fix:

# .claude/agents/code-reviewer.md
---
name: code-reviewer
description: Reviews code for quality, security, and standards adherence.
tools: Read, Grep, Glob
skills:
  - shared-rules
  - coding-standards
---

# Code Reviewer

You review code for quality, security vulnerabilities, and team-standards
adherence. Standards are preloaded via skills (no startup Reads needed).
# .claude/skills/coding-standards/SKILL.md
---
name: coding-standards
description: Team coding standards and security checklist.
user-invocable: false
---

# Coding Standards

## Style
- Functions under 40 lines. Extract if longer.
- No magic numbers. Named constants only.

## Security
- Validate user input at the boundary.
- No secrets in source. Environment variables or a secrets manager.
- Parameterized queries only. No string concatenation in SQL.

Skill content lands in the system prompt. System prompt caches. Second spawn onward, those standards read at a tenth of input price until the TTL runs out.

Same checklist, three placements, three bills

Option A: in each agent body. Copy the rules into ux-reviewer.md, web-dev.md, qa.md. Each body is unique, so the cache diverges before reaching this content. Three agents, three writes. Cold starts multiply that.

Option B: in .claude/rules/accessibility.md. The main session sees it. Subagents do not. The standards exist on disk and are not enforced in the agents doing the actual review. The pipeline looks configured, produces reviews that silently skip half the checklist, and returns no errors.

Option C: as a skill listed in three agents.

# .claude/skills/accessibility/SKILL.md
---
name: accessibility
description: Accessibility standards all agents must follow.
user-invocable: false
---

## Accessibility Standards
- Color alone must never convey meaning (symbol plus text).
- Minimum contrast ratio 4.5:1 for text.
- Interactive elements need keyboard focus states.
- No yellow/green adjacent status indicators.
# In each agent's frontmatter:
skills:
  - accessibility

Content lives once. One cache write per agent type on first spawn. Every subsequent spawn reads at input × 0.1. If the agents share enough prefix before their bodies diverge (harness, CLAUDE.md, preceding skills in the same order) the skill content caches across agent types entirely: one write total.

Two frontmatter fields worth knowing

skills: takes a list:

---
name: web-dev
tools: Read, Write, Edit, Bash, Glob, Grep
skills:
  - shared-rules
  - accessibility
  - coding-standards
---

Subagents do not inherit skills from the parent session. Loading a skill via /skill in your main session does nothing for the agents it spawns. List the skill in each agent’s frontmatter or it is not there [1].

memory: is per-agent, not shared. The main session’s auto-memory (MEMORY.md and its pointer files) stops at the subagent boundary. A saved preference (“use uv not pip,” “no green/yellow adjacent status indicators”) is invisible to subagents unless you:

  1. Put it in a skill (cached, in the system prompt).
  2. State it in the task prompt when spawning.
  3. Give the subagent its own memory: field (separate file per agent type, not shared with the parent).

With memory: set, the agent’s MEMORY.md loads into its system prompt, first 200 lines or 25KB, whichever hits first [1]. Useful for agents that accumulate context across conversations. Not a substitute for shared rules. Memory is for what that agent has learned; skills are for what every agent must know.

TL;DR

Content typePut it hereWhy
Rules every agent must followSkill, via skills: frontmatterSystem prompt, cached, reaches subagents
Project context, conventions, tech stackCLAUDE.mdAuto-injected, reaches subagents
Role identity, judgment heuristicsAgent bodySystem prompt, cached per agent type
Per-task dynamic dataRead callMessages layer (appropriate for ephemeral data)
Main-session-only context.claude/rules/Works for the human; not for subagents

The audit

Open .claude/agents/. For each file with a startup checklist full of Read calls:

  1. File read from .claude/rules/? Move the content to a skill if subagents need it. Otherwise treat it as main-session-only and move on.
  2. Content static and shared across agents? Skill. List it in every agent’s skills: frontmatter.
  3. Content static and specific to one role? Agent body.
  4. CLAUDE.md in the checklist? Delete the line. Auto-injection already did the work.
  5. Content genuinely per-task? Keep the Read. Dynamic data belongs in messages.

Five subagents each doing three Read calls against the same rules file: fifteen message-tier reads on every pipeline run, at full input price. Moving that content to a skill costs one cache write on first spawn (Sonnet 4.6, input × 1.25, $3.75 per million tokens as of 2026-04-14) and every spawn after that reads at input × 0.1 ($0.30 per million) until the TTL runs out.

To confirm skill content is caching across spawns, check cache_read_tokens on each subagent’s first claude_code.api_request event via Claude Code’s OpenTelemetry traces. A high read count means the shared prefix is in the cache. Zero means something upstream is busting it.

The configuration change is one line per agent.

Fact Check
  1. Create custom subagents — Claude Code Docs — what subagents receive, skills: injection behavior, memory: field spec and size limits.
  2. Subagents in the SDK — Claude Code DocsCLAUDE.md injection via settingSources, the skills: field as a list, subagents do not inherit parent session skills.
  3. How Claude Code Builds a System Prompt — dbreunig.com — community analysis of system prompt architecture and layer ordering.
  4. Prompt caching — Anthropic API docs — cache tiers, pricing, TTL multipliers, minimum-prefix thresholds. Fetched 2026-04-14.
  5. Claude Code v2.1.108: two cache regressions fixed — coverage of the telemetry-gated TTL downgrade and anthropics/claude-code#46917, the cache_creation inflation on v2.1.100–2.1.107.