jsmanifest logojsmanifest

The Claude Code Full Stack: When to Use CLAUDE.md, Skills, MCP, Subagents, and Hooks

The Claude Code Full Stack: When to Use CLAUDE.md, Skills, MCP, Subagents, and Hooks

Most Claude Code integration failures stem from misunderstanding the four-layer extension stack. Learn when to use CLAUDE.md, skills, MCP servers, subagents, and hooks—and why the distinction matters in production.

The Claude Code Full Stack: When to Use CLAUDE.md, Skills, MCP, Subagents, and Hooks

Most Claude Code integration failures stem from treating the extension as a monolithic tool. Teams reach for MCP servers when a simple skill would suffice, or they embed critical context in chat when it belongs in CLAUDE.md. The result: brittle workflows that break when context shifts or dependencies change.

The Claude Code extension operates as a four-layer stack: CLAUDE.md for project memory, skills for in-process automation, MCP servers for external system integration, and hooks/subagents for orchestration. Each layer solves a specific class of problem. Mixing these layers—or choosing the wrong one—introduces failure modes that compound as codebases scale.

This post maps the entire Claude Code extension stack. The mental model here eliminates the guesswork that causes most integration bugs.

Understanding the Claude Code Extension Stack: A Mental Model

The Claude Code extension architecture separates concerns across four distinct layers. CLAUDE.md provides persistent project context that survives across sessions. Skills execute synchronous, in-process operations like file manipulation or code generation. MCP servers connect to external systems through a protocol-based interface. Hooks and subagents orchestrate multi-step workflows by delegating to specialized agents.

The failure mode here is subtle but expensive: teams conflate these layers and end up with context bleeding into skills, or MCP servers performing tasks that belong in CLAUDE.md. The distinction matters because each layer has different guarantees around state, execution model, and failure handling.

CLAUDE.md: Your Project's Single Source of Truth

CLAUDE.md functions as the project's persistent memory layer. Every time Claude Code starts a new conversation, it reads this file first. The content shapes how the agent interprets requests, understands the codebase structure, and makes decisions about implementation patterns.

Claude Code CLAUDE.md architecture

Effective CLAUDE.md files follow a four-section structure: project overview, architecture decisions, coding standards, and current focus. The project overview documents the tech stack, directory structure, and primary dependencies. Architecture decisions capture patterns like state management approach or API design conventions. Coding standards define style rules and testing requirements. Current focus directs the agent's attention to active work streams.

// Example CLAUDE.md structure
# Project: E-commerce Platform
 
## Tech Stack
- Next.js 14 (App Router)
- TypeScript 5.3
- Prisma ORM with PostgreSQL
- TanStack Query for data fetching
 
## Architecture Decisions
- Server Actions for mutations, Route Handlers for queries
- Parallel data fetching at layout boundaries
- Optimistic updates with TanStack Query's useMutation
- Form validation with Zod schemas
 
## Coding Standards
- Collocate feature code in src/features/[feature-name]
- Extract reusable UI to src/components/ui
- Keep Server Components async, Client Components synchronous
- All database queries through src/lib/db module
 
## Current Focus
- Implementing product search with Elasticsearch
- Migrating payment flow to Stripe Checkout
- Adding OpenTelemetry instrumentation

The implication here is that CLAUDE.md replaces the need to repeat context in every conversation. When the agent knows the project structure and conventions upfront, it generates code that matches existing patterns without explicit instruction. This distinction is critical: context in CLAUDE.md persists across sessions, while chat context expires when the conversation ends.

%% alt: CLAUDE.md feeds project context to every Claude Code session
flowchart TD
    Start["New Claude Code Session"] --> Read["Read CLAUDE.md"]
    Read --> Context["Load Project Context"]
    Context --> Decisions["Architecture Patterns"]
    Context --> Standards["Coding Standards"]
    Context --> Focus["Current Work"]
    Decisions --> Agent["Claude Code Agent"]
    Standards --> Agent
    Focus --> Agent
    Agent --> Generate["Generate Code Following Patterns"]

Skills vs MCP Servers: When to Use Each

The choice between skills and MCP servers determines execution model and failure boundaries. Skills run synchronously within the Claude Code process and have direct access to the workspace. MCP servers run as separate processes and communicate through a JSON-RPC protocol.

Skills excel at workspace-local operations: reading files, generating code, running linters, executing tests. The skill invocation is deterministic—Claude Code calls the skill function, receives a result, and continues. There's no network boundary, no external process lifecycle, and no protocol negotiation.

MCP servers handle external system integration: querying databases, calling APIs, reading cloud resources, interacting with third-party services. The protocol boundary provides isolation—if the database is down, the MCP server fails without crashing Claude Code. The tradeoff: latency increases due to IPC overhead and JSON serialization.

%% alt: Skills run in-process while MCP servers isolate external systems
flowchart LR
    subgraph SkillPattern["Skills: In-Process Execution"]
        SkillRequest["Skill Request"] --> SkillExec["Execute Function"]
        SkillExec --> SkillResult["Return Result"]
    end
    
    subgraph MCPPattern["MCP: External Process"]
        MCPRequest["MCP Request"] --> Protocol["JSON-RPC Protocol"]
        Protocol --> Server["MCP Server Process"]
        Server --> External["External System"]
        External --> Response["Response"]
    end

The practical guideline: use skills for operations that manipulate workspace state, use MCP servers for operations that query or modify external systems. Mixing these patterns introduces unnecessary complexity. A skill that shells out to curl is fighting the abstraction—that operation belongs in an MCP server.

Implementing Claude Code Skills with Auto-Invocation

Skills provide a mechanism for Claude Code to invoke developer-defined functions automatically when specific conditions arise. The skill system uses tool calling under the hood—Claude recognizes when a request matches a skill's description and invokes it without explicit prompting.

A skill consists of three components: a TypeScript function that performs the operation, a JSON schema that describes the function signature, and activation rules that determine when Claude should consider the skill. The activation rules support pattern matching on file extensions, directory paths, and request content.

// Example: Code review skill with auto-invocation
import { Skill } from '@anthropic-ai/claude-code';
 
export const codeReviewSkill: Skill = {
  name: 'review_code',
  description: 'Performs static analysis and suggests improvements for code files',
  
  parameters: {
    type: 'object',
    properties: {
      filePath: {
        type: 'string',
        description: 'Path to the file to review'
      },
      focus: {
        type: 'string',
        enum: ['performance', 'security', 'maintainability', 'all'],
        description: 'Aspect to focus the review on'
      }
    },
    required: ['filePath']
  },
 
  activation: {
    filePatterns: ['**/*.ts', '**/*.tsx', '**/*.js', '**/*.jsx'],
    requestPatterns: ['review', 'analyze', 'check', 'improve'],
    autoInvoke: true
  },
 
  async execute({ filePath, focus = 'all' }) {
    const content = await readFile(filePath, 'utf-8');
    const ast = parseTypeScript(content);
    
    const issues: Issue[] = [];
    
    if (focus === 'performance' || focus === 'all') {
      issues.push(...detectPerformanceIssues(ast));
    }
    
    if (focus === 'security' || focus === 'all') {
      issues.push(...detectSecurityIssues(ast));
    }
    
    if (focus === 'maintainability' || focus === 'all') {
      issues.push(...detectMaintainabilityIssues(ast));
    }
 
    return {
      filePath,
      issueCount: issues.length,
      issues: issues.map(i => ({
        line: i.location.start.line,
        severity: i.severity,
        message: i.message,
        suggestion: i.suggestion
      }))
    };
  }
};

The auto-invocation flag enables Claude Code to call the skill without waiting for explicit approval. This pattern works for low-risk operations like linting or formatting. High-risk operations—file deletion, database writes, API calls—should set autoInvoke: false and require confirmation.

Claude Code skills execution flow

Hooks, Subagents, and Delegation Patterns

Hooks provide lifecycle interception points where developers can inject custom behavior into Claude Code's execution flow. The system supports hooks at conversation start, before and after skill execution, on file changes, and at conversation end.

Subagents implement delegation through specialized agent instances. When the primary Claude Code agent encounters a task outside its expertise, it spawns a subagent with domain-specific context. The subagent completes the task and returns control to the primary agent.

The delegation pattern appears in multi-step workflows where different steps require different expertise. A refactoring task might spawn a test-writing subagent, a documentation subagent, and a migration subagent—each with focused context that eliminates irrelevant information.

%% alt: Hook and subagent orchestration for complex workflows
flowchart TD
    Request["Developer Request"] --> Hook["Pre-Execution Hook"]
    Hook --> Primary["Primary Claude Agent"]
    Primary --> Evaluate["Evaluate Task Complexity"]
    Evaluate -->|Simple| Direct["Execute Directly"]
    Evaluate -->|Complex| Delegate["Spawn Subagent"]
    Delegate --> Test["Test Writing Subagent"]
    Delegate --> Docs["Documentation Subagent"]
    Delegate --> Migration["Migration Subagent"]
    Test --> Aggregate["Aggregate Results"]
    Docs --> Aggregate
    Migration --> Aggregate
    Aggregate --> PostHook["Post-Execution Hook"]
    Direct --> PostHook
    PostHook --> Complete["Return to Developer"]

The hook system enables observability without modifying Claude Code internals. A pre-execution hook can log the request and skill parameters. A post-execution hook can validate output, update external systems, or trigger notifications. The separation of concerns here prevents coupling between Claude Code and custom tooling.

Building Your 4-Layer Extension Stack in Production

Production Claude Code setups layer CLAUDE.md, skills, MCP servers, and hooks into a coherent system. The stack architecture determines which layer handles each category of operation.

Layer 1 (CLAUDE.md) contains project-wide context: architecture patterns, coding standards, dependency constraints, and current priorities. This layer updates infrequently—typically during onboarding, after major refactors, or when adopting new patterns.

Layer 2 (Skills) implements workspace operations: code generation, refactoring, testing, linting. Skills have direct filesystem access and execute synchronously. The skill registry grows as teams identify repetitive operations that benefit from automation.

Layer 3 (MCP Servers) bridges external systems: databases, APIs, cloud services, monitoring platforms. Each MCP server isolates a specific integration and handles connection management, authentication, and error recovery.

Layer 4 (Hooks + Subagents) orchestrates complex workflows: multi-step refactors, parallel code generation, cross-file analysis. Hooks inject observability and validation. Subagents delegate specialized tasks to focused agent instances.

// Example: Production Claude Code configuration
export const claudeCodeConfig = {
  // Layer 1: Project context
  claudeMd: './CLAUDE.md',
  
  // Layer 2: Skills registry
  skills: [
    codeReviewSkill,
    testGenerationSkill,
    refactorSkill,
    migrationSkill
  ],
  
  // Layer 3: MCP servers
  mcpServers: {
    database: {
      command: 'npx',
      args: ['@anthropic-ai/mcp-server-postgres'],
      env: {
        DATABASE_URL: process.env.DATABASE_URL
      }
    },
    monitoring: {
      command: 'node',
      args: ['./mcp-servers/datadog.js'],
      env: {
        DD_API_KEY: process.env.DD_API_KEY
      }
    }
  },
  
  // Layer 4: Hooks and orchestration
  hooks: {
    preExecution: async (context) => {
      await logRequest(context);
      await checkRateLimit(context);
    },
    
    postExecution: async (context, result) => {
      await validateOutput(result);
      await updateMetrics(context, result);
      await notifyIfError(result);
    },
    
    onFileChange: async (file) => {
      await runLinter(file);
      await updateDependencyGraph(file);
    }
  },
  
  // Subagent delegation rules
  subagents: {
    maxConcurrent: 3,
    timeout: 300000, // 5 minutes
    specializations: [
      'testing',
      'documentation',
      'migration',
      'performance'
    ]
  }
};

Decision Framework: Choosing the Right Tool for Each Task

The decision tree for Claude Code extensions starts with two questions: Does this operation modify external state? Does this operation require specialized context?

If the operation modifies external state (database writes, API calls, cloud resource changes), use an MCP server. The protocol boundary provides isolation and enables independent failure handling. If the operation reads external state but doesn't modify it, MCP servers still apply—the abstraction keeps authentication and connection management separate from Claude Code.

If the operation works entirely within the workspace (file reads, code generation, local testing), use a skill. Skills avoid the serialization overhead of the MCP protocol and execute synchronously within Claude Code's process.

If the operation requires specialized context that differs from the primary task, spawn a subagent. Subagents receive focused prompts that eliminate irrelevant information. A test-writing subagent doesn't need to know about deployment pipelines. A documentation subagent doesn't need to understand database schemas.

If the operation needs to observe or modify Claude Code's behavior at specific lifecycle points, implement a hook. Hooks inject custom logic without modifying Claude Code internals. Pre-execution hooks validate inputs. Post-execution hooks enforce constraints or trigger side effects.

%% alt: Decision tree for selecting the right Claude Code extension pattern
flowchart TD
    Task["New Task"] --> External["Modifies External State?"]
    External -->|Yes| MCP["Use MCP Server"]
    External -->|No| Workspace["Workspace Operation?"]
    Workspace -->|Yes| Skill["Use Skill"]
    Workspace -->|No| Context["Needs Specialized Context?"]
    Context -->|Yes| Subagent["Spawn Subagent"]
    Context -->|No| Lifecycle["Lifecycle Interception?"]
    Lifecycle -->|Yes| Hook["Implement Hook"]
    Lifecycle -->|No| Claude["Standard Claude Code"]
    
    style MCP fill:#064e3b,stroke:#34d399,color:#6ee7b7
    style Skill fill:#1e3a8a,stroke:#60a5fa,color:#e0eaff
    style Subagent fill:#3b0764,stroke:#a855f7,color:#e9d5ff
    style Hook fill:#1e293b,stroke:#64ffda,color:#e2e8f0

Real-World Patterns: Memory, Context, and Parallel Execution

Production Claude Code setups handle three recurring patterns: long-term memory across sessions, context management for large codebases, and parallel execution for independent operations.

Long-term memory emerges from disciplined CLAUDE.md maintenance. Teams update the file after completing major features, making architectural decisions, or identifying recurring issues. The document evolves as the project grows. The alternative—relying on chat history—fails when conversations reset or when new team members join.

Context management for large codebases requires strategic skill design. Instead of loading entire files, skills extract relevant sections using AST parsing. Instead of analyzing the entire dependency graph, skills focus on changed files and their direct dependencies. The context window is finite—effective skills maximize information density.

Parallel execution happens through subagent delegation. When a refactor affects multiple independent modules, the primary agent spawns one subagent per module. Each subagent completes its work without waiting for the others. The primary agent aggregates results and handles conflicts. This pattern scales to dozens of parallel operations without overwhelming the context window.

That covers the essential patterns for the Claude Code extension stack. Apply these in production and the distinction between brittle chat-driven workflows and robust agent-driven automation becomes immediate. The four-layer architecture—CLAUDE.md, skills, MCP servers, hooks/subagents—maps directly to the separation of concerns that makes complex systems maintainable.