Add a `detected_pattern` field to each finding. When developers dismiss false positives, aggregate which patterns trigger the most dismissals. Add targeted SKIP few-shot examples for the top patterns. Measure. Repeat.
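A minimal sketch of what the capture step could look like. The `Finding` dataclass and its fields are illustrative assumptions, not a prescribed schema; the only requirement is that every finding carries a machine-readable `detected_pattern`.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    """One code-review finding. Field names here are hypothetical."""
    file: str
    line: int
    message: str
    detected_pattern: str  # stable pattern id, e.g. "empty_catch_block"

# Example finding as it would be stored for later aggregation
f = Finding(
    file="app.py",
    line=42,
    message="Empty except block swallows errors",
    detected_pattern="empty_catch_block",
)
```

Keeping the pattern id stable across runs is what makes dismissals aggregatable later.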
## The Feedback Loop
Over 3 months:
| Round | False positive rate | True positive detection |
|---|---|---|
| Start | 42% | 94% |
| After round 1 | 23% | 95% |
| After round 2 | 11% | 96% |
The top 3 patterns covered 80% of all dismissals (Pareto distribution). Fixing those 3 patterns with SKIP examples produced the largest improvement.
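The Pareto check is a one-liner over the dismissal log. This is a sketch with made-up counts chosen to mirror the 80% figure; in practice the inputs would be the `detected_pattern` values of dismissed findings.

```python
from collections import Counter

# Hypothetical dismissal log: one pattern id per dismissed finding
dismissals = (
    ["empty_catch_block"] * 40
    + ["unused_import"] * 25
    + ["missing_docstring"] * 15
    + ["magic_number"] * 10
    + ["long_function"] * 10
)

counts = Counter(dismissals)
total = sum(counts.values())

# Share of all dismissals covered by the 3 most-dismissed patterns
top3 = counts.most_common(3)
top3_share = sum(n for _, n in top3) / total  # 0.8 for this data
```

If `top3_share` is high, fixing just those patterns with SKIP examples yields most of the available FP reduction.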
## The Process
- Capture — Each finding includes a `detected_pattern` field (e.g., "empty_catch_block", "unused_import")
- Aggregate — Track developer dismissals by pattern over 2-4 weeks
- Prioritize — Rank patterns by dismissal frequency
- Fix — Add paired SKIP examples for the highest-dismissed patterns
- Measure — Verify FP reduction without losing true positives
- Repeat — The loop is continuous, not one-shot
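The Measure step above can be sketched as a small metrics helper. This assumes dismissals are used as a proxy for false positives and that missed real issues are counted separately (both assumptions, not part of the original process description).

```python
def review_rates(findings, missed_issues):
    """Compute FP rate and detection rate for one measurement window.

    findings: list of dicts with a 'dismissed' bool (dismissed ~= false positive)
    missed_issues: count of real issues the review failed to flag
    """
    fp = sum(1 for f in findings if f["dismissed"])
    tp = len(findings) - fp
    fp_rate = fp / len(findings)
    detection = tp / (tp + missed_issues)
    return fp_rate, detection

# Illustrative window: 100 findings, 11 dismissed, 4 real issues missed
findings = [{"dismissed": True}] * 11 + [{"dismissed": False}] * 89
fp_rate, detection = review_rates(findings, missed_issues=4)
```

Comparing these two numbers before and after adding SKIP examples is what verifies FP reduction without lost true positives.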
## Anti-Patterns
Discarding dismissals without tracking which patterns cause them. The dismiss action is valuable feedback — wasting it means the same false positives persist indefinitely.
Auto-suppressing patterns post-generation. A filter that hides “empty_catch_block” findings treats the symptom. SKIP examples fix the prompt so Claude stops generating false positives at the source.
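A paired SKIP example might look like the following. The pairing is the point: one FLAG case and one SKIP case for the same pattern teach the model the boundary rather than a blanket suppression. The specific code snippets and wording are hypothetical, not taken from any real prompt.

```python
# Hypothetical paired few-shot block for the "empty_catch_block" pattern,
# to be appended to the review prompt. FLAG and SKIP cases are deliberately
# similar so the model learns the distinction, not a blanket rule.
EMPTY_CATCH_EXAMPLES = """\
Example (FLAG):
    try:
        charge_card(order)
    except Exception:
        pass
-> FLAG empty_catch_block: silently swallows payment errors.

Example (SKIP):
    try:
        os.remove(tmp_path)
    except FileNotFoundError:
        pass
-> SKIP: best-effort cleanup; a missing temp file is expected and harmless.
"""
```

Because the fix lives in the prompt, the false positives stop being generated rather than being hidden after the fact.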
One-shot improvement. Fixing the top 3 patterns and stopping leaves the next 3 patterns untouched. The distribution shifts as top patterns are fixed — new patterns become the leading cause.
One-liner: Build a continuous feedback loop: capture `detected_pattern` per finding → aggregate developer dismissals → add SKIP examples for top patterns → FP rate drops from 42% to 11% while detection stays at 94-96%.