Choosing between skills, subagents, and MCP servers in Claude Code
A reference I can come back to
As of December 3, 2025
Claude Code practitioners face a critical architectural decision: when to use skills, subagents, or MCP servers for extending Claude’s capabilities. Skills excel at packaging reusable expertise with minimal token overhead; subagents provide context isolation for complex multi-step workflows; MCP servers connect to external APIs and real-time data sources. The choice depends on whether you need portable knowledge (skills), parallel execution with clean context (subagents), or external system access (MCP). Most production teams use all three in combination, with skills handling procedural knowledge, subagents orchestrating complex workflows, and MCP providing the external connections.
This guide reflects the current state as of late 2025, following major updates including the plugins system (October 2025), sandboxing (October 2025), and MCP’s one-year anniversary specification release (November 2025).
How each approach works under the hood
The three mechanisms differ fundamentally in their relationship to Claude’s context window and execution model.
Skills operate through progressive disclosure, loading content on-demand rather than all at once. A skill exists as a directory containing a SKILL.md file with YAML frontmatter plus optional scripts, templates, and reference materials. At session start, Claude receives only the skill’s name and description (~100 tokens). When a task matches the description, Claude reads the full instructions; if those reference other files, Claude reads those too. Scripts execute via bash, with only their output entering the context—source code never consumes tokens. This architecture makes skills effectively unbounded in bundled content since files don’t consume context until accessed.
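The two-stage loading described above can be sketched in a few lines of Python. This is an illustration of the idea only, not Claude Code's actual loader: the function names are hypothetical, and the frontmatter parser handles only flat `key: value` pairs.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Return the flat key: value pairs between the leading '---' lines."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def skill_index(skills_root: Path) -> list:
    """Stage 1: collect only name and description for every skill directory."""
    index = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        index.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_md),
        })
    return index


def load_instructions(skill_md: Path) -> str:
    """Stage 2: read the full body only once a task matches the description."""
    text = skill_md.read_text(encoding="utf-8")
    parts = text.split("---", 2)  # ["", frontmatter, body] when frontmatter exists
    return parts[2].strip() if len(parts) == 3 else text
```

Only the output of `skill_index` would sit in context at session start; `load_instructions` runs later and only for the skill that matched.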
Subagents spawn as parallel Claude instances with isolated context windows. When the main agent delegates to a subagent, the subagent receives only task-relevant context (~500 tokens) rather than the full conversation history. Each subagent works independently, then returns a summarized result rather than a full transcript. This isolation prevents “context pollution,” where exploration and research tokens overwhelm the main conversation. Since version 1.0.60, subagents use a native implementation rather than subprocess spawning, reducing spawn time from 10+ seconds to sub-second latency. Up to 10 subagents can run simultaneously.
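The fan-out pattern can be pictured with a small asyncio simulation. It does not call Claude at all: `run_subagent` is a hypothetical stand-in that sees only its brief, keeps its working notes local, and hands back a one-line summary, with a semaphore enforcing the ceiling of 10 quoted above.

```python
import asyncio

MAX_PARALLEL = 10  # concurrency ceiling quoted in the text


async def run_subagent(brief: str, limiter: asyncio.Semaphore) -> str:
    """Stand-in for a subagent: receives only its brief, returns only a summary."""
    async with limiter:
        # Working notes stay inside this call and never reach the caller.
        transcript = [f"explored: {brief}", f"notes on: {brief}"]
        await asyncio.sleep(0)  # placeholder for real work
        return f"summary({brief}): {len(transcript)} steps"


async def delegate(briefs: list) -> list:
    """Run all briefs concurrently and collect summaries in input order."""
    limiter = asyncio.Semaphore(MAX_PARALLEL)
    return await asyncio.gather(*(run_subagent(b, limiter) for b in briefs))


summaries = asyncio.run(delegate(["audit auth module", "map API routes"]))
print(summaries)
```

The main context grows by the length of `summaries`, not by the length of each `transcript`, which is the whole point of the isolation.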
MCP (Model Context Protocol) uses JSON-RPC 2.0 for client-server communication between Claude Code and external systems. MCP servers expose three primitives: tools (actions Claude decides to take), resources (data sources similar to GET endpoints), and prompts (reusable templates that become slash commands). Each enabled server adds tool definitions to the system prompt at session start, consuming context before any work begins. Output limits default to 25,000 tokens per tool call, configurable via the MAX_MCP_OUTPUT_TOKENS environment variable.
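As a rough illustration of the wire format, the sketch below builds two JSON-RPC 2.0 requests of the kind an MCP client sends: one listing a server's tools and one calling a tool. The tool name `query_database` and its arguments are hypothetical; transport, initialization, and response handling are left out.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; "query_database" is a made-up example tool.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "query_database", "arguments": {"sql": "SELECT 1"}},
)

print(list_tools)
print(call_tool)
```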
Token economics reveal different cost profiles
The token consumption patterns of each approach have significant implications for long sessions and cost management.
Skills achieve 70-90% token savings compared to loading full context upfront. The progressive disclosure model means a skill with 50,000 tokens of reference material might only cost 100 tokens if never invoked, or 3,000 tokens if only the main instructions are needed. This makes skills ideal for packaging extensive domain knowledge without paying the context cost until necessary.
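The arithmetic behind that example is easy to check. The snippet below uses the figures from the paragraph above (50,000 bundled tokens, 100 for metadata, 3,000 for the main instructions) and assumes, for simplicity, that each figure is the total context cost at that stage.

```python
def savings_vs_upfront(loaded_tokens: int, bundled_tokens: int) -> float:
    """Fraction of context saved relative to loading the whole bundle up front."""
    return 1 - loaded_tokens / bundled_tokens


BUNDLED = 50_000  # total reference material in the skill

for label, loaded in [("never invoked", 100), ("main instructions only", 3_000)]:
    saved = savings_vs_upfront(loaded, BUNDLED)
    print(f"{label}: {loaded} tokens in context, {saved:.1%} saved")
```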
Subagents provide ~70% token reduction for complex tasks compared to handling everything in the main context. The key efficiency gain comes from context filtering—subagents receive task-specific context rather than inheriting the full conversation. Results compression further reduces overhead since subagents return summaries rather than transcripts. However, each subagent invocation counts toward usage limits, and there’s overhead in spawn time and context re-establishment.
MCP servers carry the highest token overhead of the three approaches. Every enabled server loads its complete tool schema into the context at session start, consuming space before work begins. Tool outputs can easily reach the 10,000-token warning threshold. Performance recommendation from practitioners: limit active servers to 2-3 targeted MCPs to maintain optimal startup performance. Use /context to visualize MCP consumption and /mcp to disable unused servers.
| Approach | Metadata cost | Full invocation | Best practice limit |
| --- | --- | --- | --- |
| Skills | ~100 tokens | ~5,000 tokens typical | No practical limit on bundled content |
| Subagents | None until spawned | ~500 token context + isolated execution | 10 simultaneous |
| MCP | Full schema at start | Up to 25,000 tokens output | 2-3 active servers |
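The MCP output figures quoted in this guide (a 10,000-token warning threshold and a 25,000-token default limit, adjustable through MAX_MCP_OUTPUT_TOKENS) can be turned into a quick sanity check for tool authors. This is a sketch with hypothetical function names; it only classifies a token count and makes no claim about what Claude Code does with output that exceeds the limit.

```python
import os
from typing import Mapping, Optional

WARNING_THRESHOLD = 10_000  # warning threshold quoted in the text
DEFAULT_LIMIT = 25_000      # default per-call output limit quoted in the text


def output_limit(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the limit from MAX_MCP_OUTPUT_TOKENS, falling back to the default."""
    env = os.environ if env is None else env
    return int(env.get("MAX_MCP_OUTPUT_TOKENS", DEFAULT_LIMIT))


def classify_output(tokens: int, env: Optional[Mapping[str, str]] = None) -> str:
    """Label a tool output size as 'ok', 'warning', or 'over limit'."""
    if tokens > output_limit(env):
        return "over limit"
    if tokens >= WARNING_THRESHOLD:
        return "warning"
    return "ok"


for size in (5_000, 12_000, 30_000):
    print(size, classify_output(size, env={}))
```

Running a check like this against a tool's typical responses is one way to decide whether the tool should paginate or summarize before returning.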
Capability boundaries define when each is necessary
Each mechanism has clear boundaries that make it technically necessary—or impossible—for certain tasks.
Skills are necessary when you need capabilities that work across Claude products (Claude Code, API, claude.ai web). They’re the only mechanism providing portable expertise. Pre-built Anthropic skills handle document formats—PowerPoint, Excel, Word, PDF—that require code execution. Skills cannot spawn isolated contexts, access external APIs directly, or execute in sandboxed environments separate from the main VM.
Subagents are necessary when you need context isolation to keep the main conversation focused on high-level objectives while delegating deep research or complex multi-step tasks. They’re essential for parallel workflow execution—something skills cannot do. Critical limitation: subagents cannot spawn other subagents (prevents infinite nesting), cannot run in “thinking” mode with visible intermediate output, and cannot share information directly between sibling subagents.
MCP is necessary when you need access to external systems—databases, APIs, third-party services, real-time data. No other mechanism can query a live Postgres database, fetch current GitHub PRs, or integrate with Jira. MCP provides the standardized connectivity that works across AI applications (OpenAI and Google have also adopted the protocol). Limitation: MCP cannot enforce authentication at the protocol level; security researchers in April 2025 identified prompt injection vulnerabilities and tool permission escalation risks.
The decision framework for practitioners
Use this flowchart logic when deciding which approach to reach for:
Step 1: Does the task require external data or API access?
Yes → MCP server required. Configure via claude mcp add or .mcp.json.
No → Continue to Step 2.
Step 2: Does the task involve deep research, complex multi-step workflows, or extensive exploration that would pollute the main context?
Yes → Subagent appropriate. Spawn via a .claude/agents/ definition or built-in agents.
No → Continue to Step 3.
Step 3: Do you have reusable procedures, domain expertise, or document handling needs that should work across Claude products?
Yes → Skill appropriate. Package in the .claude/skills/ directory.
No → Handle directly in main agent context.
Step 4: Consider combinations. Production teams often combine all three:
Main Agent → Spawns Subagent → Subagent loads relevant Skill → Calls MCP for data → Returns summary
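The four steps above can be written down as a small function, which makes the ordering explicit. The `Task` fields are hypothetical names for the three questions; `first_match` follows Steps 1-3 literally, while `all_matches` reflects Step 4, where more than one mechanism applies.

```python
from dataclasses import dataclass


@dataclass
class Task:
    needs_external_access: bool = False   # Step 1
    would_pollute_context: bool = False   # Step 2
    reusable_expertise: bool = False      # Step 3


def first_match(task: Task) -> str:
    """Steps 1-3: return the first mechanism whose question is answered yes."""
    if task.needs_external_access:
        return "MCP server"
    if task.would_pollute_context:
        return "subagent"
    if task.reusable_expertise:
        return "skill"
    return "main agent"


def all_matches(task: Task) -> list:
    """Step 4: return every mechanism that applies, for combined setups."""
    checks = [
        (task.needs_external_access, "MCP server"),
        (task.would_pollute_context, "subagent"),
        (task.reusable_expertise, "skill"),
    ]
    return [name for applies, name in checks if applies] or ["main agent"]


print(first_match(Task(would_pollute_context=True, reusable_expertise=True)))
print(all_matches(Task(True, True, True)))
```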
Concrete examples of appropriate choices
| Scenario | Best choice | Why |
| --- | --- | --- |
| Code review with security analysis | Subagent | Context isolation prevents review details polluting main conversation |
| Brand guidelines for presentations | Skill | Reusable across all Claude instances, auto-loads when relevant |
| Querying Postgres database | MCP | Requires external system connection |
| Multi-repository refactoring | Subagent | Parallel execution across repos |
| Company coding standards | Skill | Portable expertise any agent can apply |
| GitHub PR management | MCP | Real-time API access required |
| Document format conversion | Skill | Pre-built Anthropic skills handle Word/Excel/PDF |
| Research task with multiple strategies | Subagent | Own context window prevents pollution |
Setup complexity and maintenance burden differ significantly
Skills have the lowest setup complexity. Create a directory with a SKILL.md file containing YAML frontmatter and instructions:
---
name: my-skill
description: When this skill should activate
---
# Instructions Claude follows when skill is active
Store in .claude/skills/ for project scope or ~/.claude/skills/ for user scope. The “skill-creator” skill provides interactive guidance for first-time creators.
Subagents require moderate setup and an understanding of tool permissions and context isolation. Define them in .claude/agents/ as Markdown files with YAML frontmatter specifying name, description, and tools, followed by the system prompt. The /agents command provides a comprehensive management interface. Key pitfall: omitting the tools field implicitly grants access to all available tools, including MCP tools, so be intentional about permissions.
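To make the tools pitfall concrete, here is a small generator for a subagent definition that refuses to produce a file without an explicit tool list. The helper, the agent name, and the prompt text are all hypothetical; the tool names shown are examples, and the comma-separated `tools` line is an assumption about the frontmatter layout rather than a guaranteed format.

```python
def render_agent(name: str, description: str, tools: list, system_prompt: str) -> str:
    """Render a subagent definition: YAML frontmatter followed by the system prompt."""
    if not tools:
        # Leaving tools out would implicitly grant every available tool.
        raise ValueError("list tools explicitly; an omitted field inherits every tool")
    frontmatter = [
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tools: {', '.join(tools)}",
        "---",
    ]
    return "\n".join(frontmatter) + "\n\n" + system_prompt.strip() + "\n"


agent_md = render_agent(
    name="code-reviewer",
    description="Reviews diffs for security issues",
    tools=["Read", "Grep", "Glob"],  # read-only example set
    system_prompt="You are a security-focused code reviewer. Report findings as a short list.",
)
print(agent_md)
```

The rendered text would be saved under .claude/agents/ (for example as code-reviewer.md) and checked in like any other project file.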
MCP servers carry the highest setup complexity. Practitioners recommend direct config file editing over the CLI wizard, which “forces you to enter everything perfectly on the first try or start over.” Configuration requires JSON in .mcp.json (project scope) or ~/.claude.json (user scope), including commands, arguments, and environment variables with API credentials. Three transport options exist: stdio for local processes (fastest), HTTP for remote services, and SSE for streaming (deprecated).
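For orientation, the snippet below generates a minimal project-scoped configuration for one stdio server. It assumes the common `mcpServers` layout with `command`, `args`, and `env` keys; the server name `example-db`, the package `example-mcp-server`, and the variable `EXAMPLE_API_KEY` are placeholders, not real packages or settings.

```python
import json


def mcp_config(servers: dict) -> str:
    """Serialize server entries under the top-level mcpServers key."""
    return json.dumps({"mcpServers": servers}, indent=2)


config = mcp_config({
    "example-db": {                                  # hypothetical server name
        "command": "npx",                            # stdio: launched as a local process
        "args": ["-y", "example-mcp-server"],        # hypothetical package
        # Left as a ${...} reference so each developer's own credential is
        # substituted at load time instead of being committed to the repo.
        "env": {"EXAMPLE_API_KEY": "${EXAMPLE_API_KEY}"},
    }
})
print(config)  # contents for .mcp.json at the project root
```

Editing the resulting file directly, rather than re-running a wizard, is the workflow the practitioners quoted above prefer.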
Debugging each approach requires different techniques
For skills, check activation by making sure the description field clearly matches the intended use cases. Use /skills to list loaded skills. If Claude doesn’t invoke a skill, test by prompting with phrases that explicitly match the description. Monitor how Claude uses the skill in real scenarios and iterate.
For subagents, common issues include hook output not appearing (must print to STDOUT and register appropriate hooks), hooks not loading (invalid settings JSON), and tool sprawl from implicit inheritance. Debug subagent transitions using hooks that log state changes. The agent_id and agent_transcript_path fields added in November 2025 help track subagent execution.
For MCP, verify connections with claude mcp list and check status with /mcp inside Claude Code. Launch with claude --mcp-debug for detailed logging. Common issues include mixing up npx and uvx package managers, incorrect environment variables, and servers showing “failed” status. The .mcp.json format supports environment variable expansion for team flexibility.
Common anti-patterns practitioners encounter
Over-verbose skill files waste tokens. Practitioners recommend keeping instructions concise and action-oriented. Instead of “When implementing authentication, always ensure you follow security best practices including input validation, proper error handling, secure token storage...” write “Auth code: validate inputs, handle errors securely, follow auth/ patterns.”
Over-specialized subagents gatekeep context inappropriately. Creating a PythonTests subagent hides all testing context from the main agent and forces human-defined workflows. Some practitioners recommend using the built-in Task(...) to spawn clones of the general agent rather than over-specialized custom agents. Blurred responsibilities, such as one subagent handling both testing and review, also confuse workflows.
Skipping MCP security review poses real risks. Only install servers from trusted sources and review code first. Be especially careful with MCP servers that fetch untrusted content due to prompt injection risks. The cybersecurity incident disclosed in November 2025, in which Chinese state-sponsored actors manipulated Claude Code for automated attacks, highlighted the need for careful MCP security practices.
Context pollution from not using subagents when appropriate causes Claude to “become dumber after compacting”—the most frequently cited frustration among practitioners. Use subagents early in conversations for complex problems to preserve context availability.
Team collaboration patterns that work
For skills, check them into the .claude/skills/ directory in your repository. The October 2025 plugins system enables bundling skills with hooks, subagents, and MCP servers as complete, versioned packages with automated installation.
For subagents, store in .claude/agents/ at project level for team sharing. Version control enables consistent team behavior. One production pipeline pattern (PubNub): pm-spec → writes spec and sets status READY_FOR_ARCH → architect-review → validates and produces ADR, sets READY_FOR_BUILD → implementer-tester → implements and flips status DONE.
For MCP, use .mcp.json with environment variable expansion so individual developers can use their own credentials while sharing server configurations. As the official documentation notes, “you can add Puppeteer and Sentry servers to your .mcp.json, so that every engineer working on your repo can use these out of the box.”
Human-in-the-loop pattern: Hooks suggest next steps, humans approve. The hook prints the recommended command; a human pastes it to proceed. This prevents runaway chains and forces quick verification—essential for responsible autonomous operation.
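A hook script for this pattern can be very small. The sketch below maps the pipeline statuses from the PubNub example to the next subagent and prints a suggestion to STDOUT for a human to paste. The wording of the suggestion and the `spec-42` identifier are hypothetical, and how the current status reaches the script (hook input, a status file) is deliberately left open.

```python
from typing import Optional

# Status -> subagent that should run next, following the pipeline described above.
NEXT_AGENT = {
    "READY_FOR_ARCH": "architect-review",
    "READY_FOR_BUILD": "implementer-tester",
}


def suggest_next(status: str, spec_id: str) -> Optional[str]:
    """Return a suggested instruction, or None when nothing should follow."""
    agent = NEXT_AGENT.get(status)
    if agent is None:
        return None  # DONE or an unknown status: suggest nothing
    return f"Use the {agent} subagent on {spec_id}"


suggestion = suggest_next("READY_FOR_ARCH", "spec-42")
if suggestion:
    # Print only; the human decides whether to paste and run it.
    print(f"Suggested next step (paste to proceed): {suggestion}")
```

Because the script never triggers the next agent itself, a bad status value produces at worst a wrong suggestion, not a runaway chain.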
Recent changes affecting these decisions
September 29, 2025 brought the Claude Agent SDK with native subagent support, hooks, background tasks, and checkpoints. Subagents now enable true parallel development workflows (e.g., backend API while main agent builds frontend).
October 9, 2025 introduced the plugins system for bundling skills, subagents, MCP servers, and hooks as shareable packages. Plugins are now “the standard way to bundle and share Claude Code customizations.”
October 20, 2025 launched sandboxing with filesystem and network isolation, reducing permission prompts by 84% in internal usage. Strongly recommended for safer autonomous operation.
November 24, 2025 brought Claude Opus 4.5, along with the removal of Opus-specific caps and reported 50-75% reductions in tool-calling errors and build/lint errors.
November 25, 2025 brought the MCP one-year anniversary specification with task-based workflows, simplified authorization, agentic servers (MCP servers running their own loops), and enterprise features. The includeContext parameter was soft-deprecated in favor of explicit capability declarations.
MCP scope naming changed: previous “project” scope is now “local”; previous “global” scope is now “user”; new “project” scope allows committing MCP servers to .mcp.json in repositories.
Where official guidance conflicts with practitioner consensus
On subagent specialization: Official documentation encourages “focused subagents with single, clear responsibilities.” Some practitioners argue this over-specializes and gatekeeps context, preferring to spawn clones of the general agent via Task(...) for most workflows.
On skill verbosity: Anthropic’s guidance frames skills as “onboarding guides for new hires,” implying comprehensive documentation. Practitioners emphasize brevity: “Keep it scannable. Under 500 words in the main SKILL.md.”
On MCP server count: No official limit stated, but practitioner consensus strongly recommends 2-3 active servers maximum for performance. Official documentation focuses on capability rather than operational constraints.
On when to use subagents: Official guidance suggests using subagents for complex tasks. Practitioners increasingly recommend using them “early and often” even for moderately complex tasks to preserve context availability—the cost of spawning is low, and the benefit of isolation is high.
Conclusion
The choice between skills, subagents, and MCP servers maps directly to your architectural needs: skills for portable expertise, subagents for context isolation, MCP for external access. Most production workflows combine all three, using skills for domain knowledge, subagents for workflow orchestration, and MCP for system integration.
The October 2025 plugins system makes this combination easier by bundling all three into shareable packages. Sandboxing reduces the overhead of autonomous operation. The November 2025 MCP specification advances enable more sophisticated server capabilities including agentic loops.
Key tactical guidance: start with skills for knowledge packaging since they have the lowest overhead and highest portability. Graduate to subagents when workflows involve multi-step exploration or would pollute the main context. Add MCP servers only when external system access is required, and limit active servers to 2-3 for optimal performance. Use the /context command regularly to monitor consumption, and leverage plugins for team standardization.
This guidance will evolve: the MCP specification continues to be revised (most recently on November 25, 2025), and the pace of Claude Code development suggests significant further changes. Monitor the official changelog and MCP blog for updates that may affect these architectural decisions.

