S1.6.1 Task 1.6

Without Exploration First, 80% of Effort Goes to Trivial Code

An agent asked to “add tests to this legacy codebase” starts with utility helper functions, spending 80% of its time on trivial code while the payment module remains untested. Root cause: it skipped the exploration phase and started executing on whatever it found first.

The exploration-first pattern

  1. Map the terrain — codebase structure, module boundaries, dependency graph
  2. Identify priorities — high-risk areas (most used, most changed, most complex, least tested)
  3. Plan based on findings — not assumptions about what the code looks like
  4. Execute and adapt — adjust as execution reveals undocumented dependencies, implicit patterns

Without steps 1 and 2, the agent has no basis for prioritization. It tests files in encounter order, usually alphabetical or directory-traversal order, and neither correlates with code importance.
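Step 2 can be sketched as a simple scoring pass over whatever signals exploration produced. This is a minimal illustration, not a prescribed method: the `ModuleStats` fields, the equal weighting, and the example numbers are all assumptions made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class ModuleStats:
    """Hypothetical signals gathered during exploration, each scaled to 0..1."""
    name: str
    usage: float       # how heavily the module is called
    churn: float       # how often it changes
    complexity: float  # how hard it is to reason about
    coverage: float    # fraction already covered by tests


def risk_score(m: ModuleStats) -> float:
    # Equal weights are an assumption; tune them to the project.
    return (m.usage + m.churn + m.complexity + (1.0 - m.coverage)) / 4


def prioritize(modules: list[ModuleStats]) -> list[ModuleStats]:
    """Return modules ordered from highest to lowest risk."""
    return sorted(modules, key=risk_score, reverse=True)


# Listed in alphabetical (encounter) order, with made-up values.
modules = [
    ModuleStats("date_helpers", usage=0.3, churn=0.1, complexity=0.2, coverage=0.5),
    ModuleStats("payment", usage=0.8, churn=0.7, complexity=0.9, coverage=0.0),
    ModuleStats("string_utils", usage=0.9, churn=0.1, complexity=0.1, coverage=0.0),
]

ranked = [m.name for m in prioritize(modules)]
print(ranked)  # payment comes first, ahead of both helper modules
```

Encounter order would have started with `date_helpers`; the scored order starts with `payment`.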

Plan revision on discovery

Initial plan: refactor modules A → B → C → D → E. While refactoring A, the agent discovers C has a tight, undocumented dependency on A. Refactoring A without simultaneously addressing C would break the system.

Correct: revise the plan. Refactor A and C together, then proceed with B → D → E. This is the core of dynamic decomposition — the plan adapts to discoveries.

Wrong: continue with the original plan and “fix C later.” You know it will break. Adapting now prevents the breakage.
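The A/C revision above can be sketched as a small plan-rewriting step. This assumes a plan is represented as an ordered list of steps, where each step is a list of modules refactored together; `revise_plan` is a hypothetical helper, not part of any library.

```python
def revise_plan(plan, coupled):
    """Merge the steps containing two coupled modules into the earlier step.

    plan:    ordered list of steps; each step is a list of module names
    coupled: pair of module names discovered to require simultaneous changes
    """
    a, b = coupled
    merged = None
    revised = []
    for step in plan:
        if a in step or b in step:
            if merged is None:
                # First step touching either module: keep its position.
                merged = list(step)
                revised.append(merged)
            else:
                # Later step touching the other module: fold it into the first.
                merged.extend(m for m in step if m not in merged)
        else:
            revised.append(step)
    return revised


plan = [["A"], ["B"], ["C"], ["D"], ["E"]]
revised = revise_plan(plan, ("A", "C"))
print(revised)  # [['A', 'C'], ['B'], ['D'], ['E']]
```

The original `plan` is left unchanged, so the revision can be logged or compared against the initial plan.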

The waterfall anti-pattern

“Phase 1: explore everything. Phase 2: create the complete plan. Phase 3: execute. No changes after phase 2.”

This assumes complete knowledge can be gathered before execution — false for open-ended tasks. Legacy codebases reveal their true complexity during modification, not during exploration. A locked plan becomes outdated as soon as execution begins. Progressive explore-plan-execute cycles are more effective.

Bounded exploration under time constraints

Exploring everything takes too long. Consider a codebase with three modules of varying complexity:

  • Module A: 3 files, simple, documented → 5 min full exploration
  • Module B: 15 files, complex, no docs → 45 min full exploration
  • Module C: 8 files, moderate → 20 min full exploration
  • Total: 70 min of full exploration out of a 120 min overall budget, leaving only 50 min for refactoring.

Bounded exploration: skim Module A (3 min), explore Module C moderately (10 min), and explore Module B only enough to identify priorities (20 min). Total: 33 min, leaving 87 min for targeted refactoring. Revisit Module B incrementally as refactoring reveals its structure.

Allocate exploration time proportional to uncertainty and complexity, not uniformly.
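The budget arithmetic above can be checked in a few lines, together with one possible way to derive a bounded split. The per-module times come from the example; the `allocate` helper and its weights are assumptions chosen for illustration, standing in for a combined uncertainty and complexity score.

```python
TOTAL_BUDGET_MIN = 120

full = {"A": 5, "B": 45, "C": 20}      # full exploration times from the example
bounded = {"A": 3, "B": 20, "C": 10}   # bounded exploration times from the example


def time_left(total_min, exploration):
    """Return (minutes spent exploring, minutes left for refactoring)."""
    spent = sum(exploration.values())
    return spent, total_min - spent


def allocate(cap_min, weights):
    """Split an exploration cap in proportion to per-module weights."""
    total = sum(weights.values())
    return {name: cap_min * w / total for name, w in weights.items()}


full_spent, full_left = time_left(TOTAL_BUDGET_MIN, full)
bounded_spent, bounded_left = time_left(TOTAL_BUDGET_MIN, bounded)
print(full_spent, full_left)        # 70 50
print(bounded_spent, bounded_left)  # 33 87

# Hypothetical weights: B is complex and undocumented, A is simple and documented.
split = allocate(33, {"A": 1, "B": 7, "C": 3})
print(split)  # {'A': 3.0, 'B': 21.0, 'C': 9.0}
```

With these made-up weights the proportional split lands close to the 3/20/10 split used in the example, which was set by judgment rather than by a formula.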

Time-constrained open-ended tasks

30 minutes to add tests to an unfamiliar codebase. The 5/25 split:

  • 5 minutes: rapid exploration — identify top 3 critical modules based on structure, naming, and complexity signals
  • 25 minutes: focused test writing on those modules

The brief exploration ensures effort targets the right code. Even 5 minutes of exploration dramatically improves targeting compared with writing tests for whatever comes first.

Progressive deepening under constraints

500 files, zero existing tests, one session. Can’t test everything.

  1. Quick exploration: map structure, identify critical paths
  2. Write tests for highest-risk modules first
  3. Expand to medium-risk if time permits
  4. Each test written informs what to test next

If time runs out at any point, the most critical code already has coverage. Random, alphabetical, or uniform approaches give no such guarantee: they leave coverage of the critical code to chance.
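The risk-ordered loop above can be sketched with a simulated time budget. The module names, risk scores, and time estimates are invented for illustration, and the sketch omits step 4: a fuller version would re-rank the remaining modules after each test is written.

```python
def progressive_deepening(modules, budget_min):
    """Cover modules in descending risk order until the budget runs out.

    modules:    list of (name, risk, estimated_minutes) tuples
    budget_min: total minutes available for writing tests
    Returns the names of the modules covered, in the order they were covered.
    """
    covered = []
    remaining = budget_min
    for name, _risk, cost in sorted(modules, key=lambda m: m[1], reverse=True):
        if cost > remaining:
            break  # time is up; everything covered so far outranks what is left
        covered.append(name)
        remaining -= cost
    return covered


# Hypothetical exploration output: (name, risk 0..1, estimated minutes).
modules = [
    ("utils", 0.2, 10),
    ("payment", 0.9, 15),
    ("auth", 0.8, 10),
    ("reports", 0.5, 20),
]

covered = progressive_deepening(modules, budget_min=30)
print(covered)  # ['payment', 'auth']
```

Stopping at the first module that does not fit keeps the guarantee simple: whatever was covered is always the highest-risk prefix of the ranking.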


One-liner: Without exploration, 80% of effort targets trivial code. Explore first to prioritize, bound exploration time in proportion to uncertainty and complexity, revise plans when discoveries reveal undocumented dependencies, and under time constraints use rapid exploration (5 min) before focused execution (25 min).