Response Analyzer
The ResponseAnalyzer class examines iteration outputs to detect completion, stuck loops, and quality signals.
Interface
interface ResponseAnalyzer {
  analyze(response: string, context: AnalysisContext): AnalysisResult
  detectCompletion(response: string): CompletionSignal
  detectStuckLoop(errorHistory: string[]): StuckSignal
  extractWorkSummary(response: string): string
  calculateConfidence(response: string, context: AnalysisContext): number
}

interface AnalysisContext {
  iteration: number
  maxIterations: number
  previousResponses: string[]
  errorHistory: string[]
  taskDescription: string
}

interface AnalysisResult {
  shouldExit: boolean
  exitReason: string | null
  confidence: number // 0–100
  completionSignal: CompletionSignal
  stuckSignal: StuckSignal
  workSummary: string
}

interface CompletionSignal {
  detected: boolean
  keywords: string[]
  testOnly: boolean
  confidence: number
}

interface StuckSignal {
  detected: boolean
  repeatedPatterns: string[]
  errorFrequency: Record<string, number>
  confidence: number
}

Completion Detection
The analyzer scans for completion keywords in the agent's response:
| Keyword | Weight |
|---|---|
| done | High |
| complete | High |
| finished | High |
| all tests passing | Very high |
| no remaining issues | Very high |
| ready for review | High |
A single High-weight keyword isn't enough. The analyzer triggers exit only when it finds at least two completion signals, or a single match on a Very high weight keyword. This prevents premature termination when the agent uses completion words conversationally.
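The exit rule can be sketched as follows. This is a minimal illustration, not the analyzer's actual implementation: the names `COMPLETION_KEYWORDS`, `matchCompletionKeywords`, and `KeywordMatchResult` are hypothetical, the weights are reduced to two tiers taken from the table, and the result covers only the `detected` and `keywords` fields of `CompletionSignal`.

```typescript
// Hypothetical weight tiers mirroring the keyword table in this section.
type KeywordWeight = "high" | "veryHigh";

const COMPLETION_KEYWORDS: ReadonlyArray<{ phrase: string; weight: KeywordWeight }> = [
  { phrase: "done", weight: "high" },
  { phrase: "complete", weight: "high" },
  { phrase: "finished", weight: "high" },
  { phrase: "all tests passing", weight: "veryHigh" },
  { phrase: "no remaining issues", weight: "veryHigh" },
  { phrase: "ready for review", weight: "high" },
];

// Hypothetical partial view of CompletionSignal (detected + keywords only).
interface KeywordMatchResult {
  detected: boolean;
  keywords: string[];
}

// Escape regex metacharacters so each phrase is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function matchCompletionKeywords(response: string): KeywordMatchResult {
  const matched = COMPLETION_KEYWORDS.filter(({ phrase }) =>
    // Word boundaries keep "done" from matching inside "abandoned".
    new RegExp(`\\b${escapeRegExp(phrase)}\\b`, "i").test(response)
  );
  const hasVeryHigh = matched.some((k) => k.weight === "veryHigh");
  // Rule from the text: two or more signals, or one very-high keyword.
  return {
    detected: matched.length >= 2 || hasVeryHigh,
    keywords: matched.map((k) => k.phrase),
  };
}
```

With this sketch, "I'm done with the parser." yields one High match and no exit, while "Implementation finished and ready for review." yields two matches and triggers detection.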
Test-Only Detection
When the response contains only test output without implementation changes, the analyzer flags it as testOnly: true. This indicates the agent may be running tests without making progress on the actual task.
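One possible reading of "contains only test output" is that every non-blank line of the response looks like test-runner output. The sketch below takes that reading. The function `isTestOnlyResponse` and the exact pattern list are assumptions for illustration, modeled on the patterns this section lists; the patterns omit the `g` flag so that `RegExp.prototype.test` stays stateless across calls.

```typescript
// Hypothetical line-level patterns in the spirit of the ones this section lists.
const TEST_LINE_PATTERNS: RegExp[] = [
  /^\s*(PASS|FAIL|SKIP)\s+/,
  /\d+ (tests?|specs?) (passed|failed)/i,
  /^\s*Test Suites?:/i,
  /^\s*Tests?:\s+\d+/i,
];

// Returns true when every non-blank line looks like test-runner output.
function isTestOnlyResponse(response: string): boolean {
  const lines = response.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return false; // an empty response is not test output
  return lines.every((line) => TEST_LINE_PATTERNS.some((p) => p.test(line)));
}
```

A response that mixes a prose line such as "Fixed the null check in parser.ts" with runner output would not be flagged under this reading, because at least one line describes an implementation change.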
// Pattern: response is entirely test output
const testPatterns = [
  /^\s*(PASS|FAIL|SKIP)\s+/gm,
  /\d+ (tests?|specs?) (passed|failed)/i,
  /Test Suites?:/i
]

Stuck Loop Detection
The analyzer maintains a frequency map of error patterns from the last N iterations. When the same error pattern appears at least three times in that window, it signals a stuck loop.
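The detection snippet in this section relies on two helpers, `extractErrorPattern` and `countBy`, that the page does not define. A minimal sketch of what they might look like follows. The normalization rules, the `findRepeatedPatterns` wrapper, and the default window of 5 iterations are all assumptions; only the threshold of 3 comes from the snippet.

```typescript
// Hypothetical helper: reduce an error message to a stable pattern by
// masking details that vary between iterations (paths, hex ids, numbers).
function extractErrorPattern(error: string): string {
  return error
    .replace(/(\/[\w.-]+)+/g, "<path>")
    .replace(/0x[0-9a-f]+/gi, "<hex>")
    .replace(/\d+/g, "<n>")
    .trim();
}

// Hypothetical helper: count occurrences of each string.
function countBy(items: string[]): Record<string, number> {
  // A null-prototype object avoids clashes with keys like "constructor".
  const counts: Record<string, number> = Object.create(null);
  for (const item of items) {
    counts[item] = (counts[item] ?? 0) + 1;
  }
  return counts;
}

// windowSize stands in for the unspecified "last N iterations" (assumed 5).
function findRepeatedPatterns(
  errorHistory: string[],
  windowSize = 5,
  threshold = 3
): string[] {
  const recent = errorHistory.slice(-windowSize);
  const frequency = countBy(recent.map(extractErrorPattern));
  return Object.entries(frequency)
    .filter(([, count]) => count >= threshold)
    .map(([pattern]) => pattern);
}
```

Masking paths and line numbers matters here: without it, the same `TypeError` raised at three different line numbers would count as three distinct patterns and never reach the threshold.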
// Detect repeated error patterns
const patterns = errorHistory.map(extractErrorPattern)
const frequency = countBy(patterns)
const repeated = Object.entries(frequency)
  .filter(([_, count]) => count >= 3)
  .map(([pattern]) => pattern)

Confidence Scoring
Confidence is calculated from multiple signals:
| Signal | Weight |
|---|---|
| Completion keywords present | 35% |
| Tests passing | 25% |
| Output quality (length, specificity) | 20% |
| No errors in response | 20% |
confidence = (keywordScore × 0.35) + (testScore × 0.25)
           + (qualityScore × 0.20) + (errorScore × 0.20)

A confidence score of 40+ is the threshold for the done_signals exit strategy.
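As a worked illustration of the weighted sum, the sketch below assumes each component score is already on a 0 to 100 scale, which the page does not state explicitly. `SignalScores`, `calculateWeightedConfidence`, and `meetsDoneSignalsThreshold` are hypothetical names; the weights and the threshold of 40 come from the table and text above. Weights are kept as integer percentages to avoid floating-point drift.

```typescript
// Hypothetical container for the four component scores (assumed 0-100 each).
interface SignalScores {
  keywordScore: number;
  testScore: number;
  qualityScore: number;
  errorScore: number;
}

// Weights from the table, expressed as integer percentages.
const WEIGHT_PERCENT = {
  keywordScore: 35,
  testScore: 25,
  qualityScore: 20,
  errorScore: 20,
} as const;

const DONE_SIGNALS_THRESHOLD = 40;

// Keep out-of-range inputs from pushing the result outside 0-100.
function clampScore(value: number): number {
  return Math.min(100, Math.max(0, value));
}

function calculateWeightedConfidence(scores: SignalScores): number {
  const weightedSum =
    clampScore(scores.keywordScore) * WEIGHT_PERCENT.keywordScore +
    clampScore(scores.testScore) * WEIGHT_PERCENT.testScore +
    clampScore(scores.qualityScore) * WEIGHT_PERCENT.qualityScore +
    clampScore(scores.errorScore) * WEIGHT_PERCENT.errorScore;
  return weightedSum / 100;
}

function meetsDoneSignalsThreshold(scores: SignalScores): boolean {
  return calculateWeightedConfidence(scores) >= DONE_SIGNALS_THRESHOLD;
}
```

Under these assumptions, a perfect keyword score alone contributes 35 points and stays under the threshold, so at least one other signal has to contribute before done_signals can fire.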
False Positive Handling
The analyzer applies several guards against false positives:
- Conversational completion: "That's done!" in a conversational aside doesn't trigger exit. The completion signal must be in the context of reporting work results.
- Partial completion: "Tests done but build failing" is not treated as full completion. The analyzer checks for qualifying language.
- Test-only loops: When the agent repeatedly runs tests without code changes, the analyzer detects this pattern and reports it as stuck rather than complete.
- Echo detection: If the agent echoes back a completion keyword from the task description, it's discounted.
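Two of these guards, partial completion and echo detection, lend themselves to a short sketch. The word lists and the function names `isPartialCompletion` and `discountEchoedKeywords` are assumptions made for illustration. In particular, the echo check here simply drops any matched keyword that also appears in the task description, which is a cruder rule than the analyzer may actually apply.

```typescript
// Hypothetical qualifier words suggesting that work is only partly finished.
const QUALIFIER_PATTERN = /\b(but|however|except|although|still)\b/i;
// Hypothetical failure words that a qualifier typically introduces.
const FAILURE_PATTERN = /\b(fail(s|ed|ing)?|broken|errors?|not working)\b/i;

// Treat a response as partial when it pairs a qualifier with a failure word.
function isPartialCompletion(response: string): boolean {
  return QUALIFIER_PATTERN.test(response) && FAILURE_PATTERN.test(response);
}

// Drop matched keywords that also occur in the task description, since the
// agent may just be repeating the task wording (plain substring comparison).
function discountEchoedKeywords(
  matchedKeywords: string[],
  taskDescription: string
): string[] {
  const task = taskDescription.toLowerCase();
  return matchedKeywords.filter(
    (keyword) => !task.includes(keyword.toLowerCase())
  );
}
```

For the example in the list, `isPartialCompletion("Tests done but build failing")` returns `true`, so the "done" keyword would not count toward exit.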
Work Summary Extraction
The analyzer extracts a structured summary of what work was performed:
// Extract: files changed, tests run, errors fixed
const summary = extractWorkSummary(response)
// → "Modified 3 files, 12 tests passing, 2 errors fixed"

This summary is included in the episode recording and progress stream.
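The page does not describe how the summary is derived from the response text. The sketch below shows one plausible regex-based approach that produces a string in the same shape as the example. `summarizeWork` is a hypothetical stand-in for `extractWorkSummary`, and the file extensions, phrasing patterns, and "No work detected" fallback are all assumptions.

```typescript
// Hypothetical stand-in for extractWorkSummary, based on simple regexes.
function summarizeWork(response: string): string {
  // Assumed convention: changed files are mentioned by path with a source extension.
  const filePattern = /\b[\w./-]+\.(?:ts|tsx|js|jsx|py|go|rs)\b/g;
  const files = new Set(response.match(filePattern) ?? []);

  // Assumed phrasings: "12 tests passed" / "12 tests passing", "fixed 2 errors".
  const testsMatch = response.match(/(\d+)\s+tests?\s+pass(?:ed|ing)/i);
  const fixedMatch = response.match(/fixed\s+(\d+)\s+errors?/i);

  const parts: string[] = [];
  if (files.size > 0) {
    parts.push(`Modified ${files.size} file${files.size === 1 ? "" : "s"}`);
  }
  if (testsMatch) parts.push(`${testsMatch[1]} tests passing`);
  if (fixedMatch) parts.push(`${fixedMatch[1]} errors fixed`);

  // Fallback text is an assumption; the page does not specify one.
  return parts.length > 0 ? parts.join(", ") : "No work detected";
}

const exampleSummary = summarizeWork(
  "Updated src/parser.ts, src/lexer.ts and src/index.ts. Fixed 2 errors. 12 tests passed."
);
// exampleSummary === "Modified 3 files, 12 tests passing, 2 errors fixed"
```

Collecting paths into a `Set` means a file mentioned several times in the response is counted once.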