K3.6.3 Task 3.6

CLAUDE.md Is Your CI Agent's Only Source of Project Standards

claude -p "Review this PR" loads CLAUDE.md automatically. The CI agent gets the same project context as a developer running Claude locally. No extra flags, no environment variables, no configuration files.
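In a pipeline, that is a single step. The sketch below is illustrative, not taken from this section: it assumes the Claude Code CLI is installed on the runner, an ANTHROPIC_API_KEY is provided by the CI environment, and the diff is piped in on stdin.

```shell
# Minimal CI review step (sketch). CLAUDE.md at the repository root
# loads automatically -- no flag or environment variable points at it.
git diff origin/main...HEAD | claude -p "Review this PR diff against the project standards"
```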

If CLAUDE.md contains only a brief project description and no review criteria, the CI agent reviews against generic best practices. If it contains explicit testing standards, security patterns, and review criteria, the agent applies those specific checks.

The difference between generic and project-specific CI reviews is what you put in CLAUDE.md.

The Data

A team measured CI review quality before and after enriching CLAUDE.md:

Stage                             "Useful" rating    False positive rate
Before enrichment                 30%                40%
After adding test standards       72%                —
After adding security criteria    85%                12%

Each addition to CLAUDE.md produced measurable improvement. Testing conventions made the agent flag missing tests and incorrect fixtures. Security criteria made it catch payment handling issues. The false positive rate fell from 40% to 12% because the agent stopped flagging practices that were intentional choices.

Two Teams, Same Pipeline, Different Reviews

Two teams share a monorepo with identical CI configurations. Team A’s reviews consistently flag missing error handling and incomplete tests. Team B’s reviews are generic.

The difference: Team A’s CLAUDE.md includes error handling standards and test coverage criteria. Team B’s CLAUDE.md does not. Same pipeline, same model — CLAUDE.md is the variable.

What Belongs in CLAUDE.md for CI

Testing conventions. Test framework details, fixture patterns, coverage expectations, integration vs unit test guidance. Without these, the agent cannot evaluate whether tests are adequate.

Review criteria. What the team considers a blocking issue vs a suggestion. Named severity levels. Required checks for different file types.

Security standards. Sensitive patterns specific to the project — payment handling rules, auth token management, data sanitization requirements.

Code examples. Acceptable and unacceptable patterns. These ground the agent’s understanding more effectively than text descriptions.
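A compact sketch of how those four pieces might read in a CLAUDE.md. Every name, threshold, and pattern below is invented for illustration, not taken from a real team's file:

```markdown
## Review criteria (CI)
- Blocking: missing tests for changed payment code, hardcoded secrets,
  unparameterized SQL.
- Suggestion: naming, comment style, minor duplication.

## Testing conventions
- pytest with fixtures from tests/conftest.py; new endpoints need an
  integration test, not only unit tests.
- Coverage for changed lines must not drop below 80%.

## Security
- All card data goes through the tokenization service; never log PANs.

## Patterns
Bad:  db.execute(f"SELECT * FROM users WHERE id = {uid}")
Good: db.execute("SELECT * FROM users WHERE id = %s", (uid,))
```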

Not in CLAUDE.md for CI: ephemeral information (sprint goals, current incidents), information that changes per-run (diff context — that comes from the prompt), or standards that belong in .claude/rules/ with path-specific scoping.

CLAUDE.md vs Dynamic Wiki Injection

A DevOps lead proposes fetching review standards from a wiki API before each CI run and passing them as inline context.

This adds:

  • External dependency — wiki API outage breaks CI for all 50 daily runs
  • No version control — standard changes have no review history, no rollback
  • Custom scripting — fetching, caching, error handling for the API

CLAUDE.md in version control avoids all three. Standards change through PRs with review. They are always available without network calls. They load without custom scripts.

Claude Code in -p mode can process inline context passed as arguments — the dynamic approach is not technically impossible. It is operationally inferior to a version-controlled file that loads automatically.
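For comparison, the dynamic approach looks roughly like this sketch (the wiki URL and response format are hypothetical); each line is a failure mode the version-controlled file avoids:

```shell
# Dynamic wiki injection (sketch) -- a network dependency, no review
# history for standard changes, and custom error handling, per the
# tradeoffs above. Assumes a hypothetical internal wiki endpoint.
STANDARDS=$(curl -sf https://wiki.internal/api/review-standards) || {
  echo "wiki unreachable: CI review blocked" >&2
  exit 1
}
git diff origin/main...HEAD | claude -p "Review this diff. Standards: $STANDARDS"
```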

Multi-Service Pattern

Five microservices with shared security standards and unique testing patterns:

Each service’s CLAUDE.md contains:

  1. Shared security standards (via @import from a common file, or included directly)
  2. Service-specific test patterns and conventions

The CI agent for each service loads its CLAUDE.md and gets both layers automatically. No centralized injection service, no MCP server for standards delivery, no manual copying.
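CLAUDE.md files can pull in another file with an @path reference, which is how the shared layer stays in one place. A sketch of one service's file, with invented paths:

```markdown
# payments-service CLAUDE.md

@../shared/security-standards.md

## Testing conventions (this service only)
- Contract tests against the ledger-service stub must pass before merge.
- Fixtures live in tests/fixtures/; regenerate with make fixtures.
```

Editing shared/security-standards.md in one PR updates the security layer for all five services on their next CI run.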

First Step for New CI Integration

Document the team’s coding standards, test expectations, and review criteria in CLAUDE.md before the first CI run. Not after — before.

Running claude -p without project context produces generic output. Iterating on prompt arguments to improve quality is slower and less maintainable than establishing standards in CLAUDE.md upfront.


One-liner: CLAUDE.md auto-loads in CI just as it does locally — enriching it with review standards, testing conventions, and security criteria transforms generic reviews into project-specific ones (30% useful → 85% useful).