S1.3.3 Task 1.3

Goals + Standards Beat Step-by-Step: 82 vs 68 Quality Score

The coordinator’s system prompt shapes how it orchestrates. Step-by-step procedures produce predictable but rigid behavior. Goal-oriented prompts with quality standards produce adaptive behavior that scores 14 points higher on average — because 45% of research queries benefit from strategy changes that rigid procedures can’t make.

The data: procedural vs goal-oriented

Metric             Procedural prompt   Goal-oriented prompt
Average quality    68/100              82/100
Adaptation rate    0%                  45%
Quality variance   Low (uniform)       Higher (scales with complexity)

The 14-point quality improvement comes from the coordinator’s ability to adapt when sub-agent results reveal unexpected findings or gaps. The higher variance in goal-oriented output is healthy — simple queries produce simple reports, complex queries produce complex reports. Forcing uniform output means over-investing on simple queries or under-investing on complex ones.

A/B test: 19-point gap, fixable edge case

500-query A/B test:

  • Procedural: 100% completion, 65/100 quality, 35% suboptimal execution
  • Goal-oriented: 95% completion, 84/100 quality, 40% adaptive strategy changes

The 5% incompletion is a convergence issue fixable with iteration guards. The 19-point quality gap is fundamental. The procedural coordinator "completes" by mechanically following its steps, but in 35% of cases those steps were the wrong approach. A complete but wrong report is worse than one that iterates toward quality.
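An iteration guard is just a hard cap on refinement rounds, so the goal-oriented coordinator always terminates. A minimal sketch, where `coordinate` is the guard logic and the `toy_*` functions are hypothetical stubs standing in for real sub-agent orchestration and report scoring:

```python
MAX_ROUNDS = 4          # hard cap: fixes the 5% non-convergence cases
QUALITY_TARGET = 80     # accept the report once it clears this bar

def coordinate(query, run_round, score):
    """Iterate research rounds until the quality target or round cap is hit."""
    report = None
    for _ in range(MAX_ROUNDS):
        report = run_round(query, previous=report)
        if score(report) >= QUALITY_TARGET:
            break                     # converged: goal met
    return report                     # cap tripped: return best effort

# Toy stubs: each round adds one finding; score grows with findings.
def toy_round(query, previous):
    return (previous or []) + [f"finding about {query}"]

def toy_score(report):
    return 30 * len(report)           # clears the 80 bar on round 3

report = coordinate("quantum error correction", toy_round, toy_score)
```

The cap preserves the quality gains of iteration while guaranteeing the completion rate matches the procedural baseline.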

The rigid procedure trap: literal execution

A coordinator prompt says: “Step 1: Use search agent to find 5 papers.” During a quantum computing query, the search agent finds 2 papers and 1 key government report. The coordinator ignores the government report — the prompt says “papers,” not “reports.”

A goal-oriented prompt (“produce comprehensive analysis with diverse sources”) would include the valuable report. The coordinator’s job is to pursue the research goal, not to match a prescribed format for source types.
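The difference between the two readings can be shown in miniature. The source records below are illustrative; the point is that a literal "papers" filter drops the most valuable item:

```python
sources = [
    {"title": "Quantum computing survey",   "type": "paper"},
    {"title": "Error-correction study",     "type": "paper"},
    {"title": "Government quantum report",  "type": "report"},  # the key source
]

# Procedural reading of "find 5 papers": filter by type, silently drop the report.
procedural = [s for s in sources if s["type"] == "paper"]

# Goal-oriented reading ("comprehensive analysis, diverse sources"):
# keep everything relevant, regardless of source type.
goal_oriented = sources
```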

The balanced approach: goals + optional hints

Neither extreme is optimal:

  • Pure goals, no guidance: the coordinator can drift aimlessly on unfamiliar tasks
  • Pure procedure, no goals: the coordinator can’t adapt when reality doesn’t match the plan

The effective middle: goals and quality standards as the primary framework, with optional procedural hints for common patterns. “Typically start with broad search before deep analysis” is a hint. “Always start with broad search” is a mandate. Hints bootstrap strategy without constraining it.
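A sketch of what this looks like as a system prompt, with hints explicitly marked optional. The wording below is illustrative, not a tested prompt:

```python
COORDINATOR_PROMPT = """\
Goal: produce a comprehensive, well-sourced analysis of the user's query.

Quality standards:
- Every claim is backed by a cited source.
- Coverage spans all major subtopics the query implies.

Hints (optional, not mandates):
- Typically start with a broad search before deep analysis.
- Typically dispatch one sub-agent per distinct subtopic.

Adapt freely when intermediate results suggest a better strategy.
"""
```

The "typically" phrasing is the load-bearing word: it bootstraps a default strategy while leaving the coordinator free to deviate.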

Complexity-adaptive coordinator

The same coordinator handles both simple fact lookups and complex multi-domain analyses. The prompt should enable proportional effort:

“Assess query complexity first. For simple factual queries, a single agent pass suffices. For complex multi-domain queries, use multiple agents and iterate until coverage criteria are met.”

This scales naturally: quick resolution for simple queries, thorough investigation for complex ones. One coordinator, adapted strategy.
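Complexity-adaptive routing can be sketched as a cheap classifier feeding a plan. The heuristic and thresholds below are assumptions for illustration, not a calibrated classifier:

```python
def assess_complexity(query: str) -> str:
    """Crude proxy: multi-domain markers or long queries count as complex."""
    words = query.lower().split()
    multi_domain = any(kw in words for kw in ("and", "compare", "across", "vs"))
    return "complex" if multi_domain or len(words) > 12 else "simple"

def plan(query: str) -> dict:
    if assess_complexity(query) == "simple":
        return {"agents": 1, "max_rounds": 1}   # single pass suffices
    return {"agents": 3, "max_rounds": 4}       # iterate until coverage is met

plan("capital of France")                        # simple: one agent, one pass
plan("compare quantum roadmaps across US and EU")  # complex: multi-agent
```

In practice the assessment would itself be an LLM call, but the routing shape stays the same: one coordinator, effort proportional to the query.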

Strategy vs format: separate the concerns

A CI coordinator needs two things that seem contradictory: adaptive strategy (skip security scan for docs-only PRs) and predictable output (consistent report format for auditors).

The fix: separate them.

  • Strategy (which agents, how many passes, what depth) → goal-oriented, adaptive per PR
  • Format (report structure, severity ratings, file references) → structured, predictable always

The coordinator adapts HOW it reviews while always producing WHAT auditors expect. Adaptability applies to the investigation process; predictability applies to the output structure.
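One way to enforce this split in code: strategy is a function that adapts per PR, while the report is a fixed dataclass auditors can rely on. The PR fields and agent names are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewReport:                     # format: always the same structure
    summary: str
    severity: str                       # e.g. "low" / "medium" / "high"
    file_references: list[str] = field(default_factory=list)

def choose_agents(changed_files: list[str]) -> list[str]:
    """Strategy: adapt the investigation to the PR's contents."""
    agents = ["style"]
    docs_only = all(f.endswith((".md", ".rst")) for f in changed_files)
    if not docs_only:
        agents += ["security", "performance"]   # skip these for docs-only PRs
    return agents

choose_agents(["README.md"])            # docs-only: style check only
choose_agents(["api/server.py"])        # code change: full agent roster
```

Whichever agents run, every path ends by constructing a `ReviewReport`, so the output contract never varies.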

What to include in coordinator prompts

  1. Research goals: what to achieve, how broad/deep
  2. Quality standards: what “good” looks like (cited sources, specific findings, coverage completeness)
  3. Evaluation criteria: when to iterate vs accept (coverage gaps? depth insufficient?)
  4. Optional hints: common patterns without mandating them (“typically start with…”)
  5. Output format requirements: consistent structure regardless of adaptive strategy
  6. Coordination meta-instructions: “evaluate coverage after each round,” “pass only relevant context to sub-agents,” “preserve both perspectives when sources conflict”
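The six elements compose naturally into one prompt. A hypothetical builder, with placeholder wording you would replace with your own standards:

```python
SECTIONS = {
    "Research goals":       "Produce a comprehensive analysis of the query.",
    "Quality standards":    "Cite sources; be specific; cover all major subtopics.",
    "Evaluation criteria":  "Iterate if coverage gaps or shallow findings remain.",
    "Hints (optional)":     "Typically start broad, then go deep.",
    "Output format":        "Always emit: Summary, Findings, Sources.",
    "Coordination":         "Evaluate coverage after each round; pass only "
                            "relevant context to sub-agents.",
}

def build_prompt(sections: dict) -> str:
    """Join the sections into a single system prompt."""
    return "\n\n".join(f"## {name}\n{text}" for name, text in sections.items())

prompt = build_prompt(SECTIONS)
```

Keeping the sections as data makes it easy to A/B test one element (say, the hints) without touching the rest.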

What NOT to include

  • Step-by-step procedures as primary structure (prevents adaptation)
  • Decision trees for every scenario (unmaintainable, still misses edge cases)
  • Minimal instructions without quality standards (no basis for self-evaluation)
  • Prescribed agent order without flexibility (“always run security, then performance, then style”)

One-liner: Goal-oriented prompts with quality standards score 82 vs 68 (procedural) because 45% of queries benefit from adaptive strategy — use goals for what to achieve, optional hints for common patterns, and separate adaptive strategy from predictable output format.