Ralph Universal
The RalphUniversal class tracks adapter effectiveness across task categories and generates data-driven recommendations for adapter selection.
Interface
```typescript
interface RalphUniversal {
  recordUsage(taskCategory: string, adapter: string, success: boolean): void
  getRecommendation(taskCategory: string): AdapterRecommendation
  getEffectiveness(adapter: string): AdapterEffectiveness
  getReport(): RalphReport
}

interface AdapterRecommendation {
  adapter: string
  confidence: number // 0–100
  reason: string
  alternatives: Array<{ adapter: string; confidence: number }>
}

interface AdapterEffectiveness {
  adapter: string
  totalUses: number
  successRate: number
  byCategory: Record<string, { uses: number; successRate: number }>
  trend: 'improving' | 'stable' | 'declining'
}

interface RalphReport {
  totalEpisodes: number
  adapters: AdapterEffectiveness[]
  categories: Record<string, { bestAdapter: string; sampleSize: number }>
  generatedAt: Date
}
```
How It Works
Ralph maintains a rolling effectiveness matrix: rows are task categories, columns are adapters, and each value is a success rate weighted by recency.
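The recency weighting is not specified in detail here, so the following is only a minimal sketch of one way a single cell could be computed. It assumes an exponential decay per episode; the `EffectivenessMatrix` class, its `successRate` method, and the `decay` parameter are hypothetical names for illustration, not part of Ralph's documented API.

```typescript
// Hypothetical sketch: recency-weighted success rates per (category, adapter) cell.
// The exponential decay scheme is an assumption, not Ralph's documented behavior.
class EffectivenessMatrix {
  // category -> adapter -> outcomes in chronological order (oldest first)
  private readonly cells = new Map<string, Map<string, boolean[]>>();
  private readonly decay: number;

  constructor(decay = 0.9) {
    this.decay = decay;
  }

  recordUsage(taskCategory: string, adapter: string, success: boolean): void {
    let row = this.cells.get(taskCategory);
    if (!row) {
      row = new Map<string, boolean[]>();
      this.cells.set(taskCategory, row);
    }
    const outcomes = row.get(adapter) ?? [];
    outcomes.push(success);
    row.set(adapter, outcomes);
  }

  // The newest episode gets weight 1, the one before it `decay`, then `decay^2`, ...
  // Returns undefined when the cell has no episodes yet.
  successRate(taskCategory: string, adapter: string): number | undefined {
    const outcomes = this.cells.get(taskCategory)?.get(adapter);
    if (!outcomes || outcomes.length === 0) return undefined;
    let weightedSuccesses = 0;
    let totalWeight = 0;
    let weight = 1;
    for (let i = outcomes.length - 1; i >= 0; i--) {
      if (outcomes[i]) weightedSuccesses += weight;
      totalWeight += weight;
      weight *= this.decay;
    }
    return weightedSuccesses / totalWeight;
  }
}

const matrix = new EffectivenessMatrix(0.5);
matrix.recordUsage("code", "claude-sonnet", false); // oldest episode
matrix.recordUsage("code", "claude-sonnet", true);
matrix.recordUsage("code", "claude-sonnet", true); // newest episode
// Weights are 1, 0.5, 0.25, so the rate is 1.5 / 1.75 (about 0.857),
// higher than the unweighted 2/3 because the failure is the oldest episode.
console.log(matrix.successRate("code", "claude-sonnet"));
```

A full matrix holds one such value per category and adapter pair, for example: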
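| Category | claude-sonnet | claude-haiku | gpt-4o | local-7b |
|---|---|---|---|---|
| code | 0.91 | 0.72 | 0.85 | 0.41 |
| test | 0.88 | 0.69 | 0.82 | 0.38 |
| refactor | 0.93 | 0.65 | 0.79 | 0.35 |
| docs | 0.85 | 0.81 | 0.87 | 0.52 |
| research | 0.89 | 0.74 | 0.91 | 0.44 |
adapter_recommend Tool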
The recommendation engine is exposed as a tool that Nity calls during strategy planning:
```json
{
  "name": "adapter_recommend",
  "description": "Get adapter recommendation for a task category",
  "parameters": {
    "taskCategory": {
      "type": "string",
      "enum": ["code", "test", "refactor", "docs", "research"],
      "description": "The category of task to execute"
    }
  }
}
```
Response
```json
{
  "adapter": "claude-sonnet",
  "confidence": 87,
  "reason": "Highest success rate for code tasks (91% across 47 episodes)",
  "alternatives": [
    { "adapter": "gpt-4o", "confidence": 72 },
    { "adapter": "claude-haiku", "confidence": 58 }
  ]
}
```
Ralph recommendations improve over time. Early sessions rely on heuristics; as the episode count grows, recommendations become data-driven. The confidence score accounts for sample size: fewer than 5 episodes yield lower confidence regardless of success rate.
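A response of the shape shown above could be assembled by ranking adapters by confidence and keeping the runners-up as alternatives. The sketch below assumes the per-adapter confidence scores are already computed; `buildRecommendation` and its `maxAlternatives` parameter are hypothetical, and the generated `reason` string is a placeholder rather than Ralph's actual wording.

```typescript
// Same shape as the AdapterRecommendation interface defined earlier on this page.
interface AdapterRecommendation {
  adapter: string;
  confidence: number; // 0–100
  reason: string;
  alternatives: Array<{ adapter: string; confidence: number }>;
}

// Hypothetical helper: picks the highest-confidence adapter and lists the rest.
function buildRecommendation(
  taskCategory: string,
  scores: Record<string, number>,
  maxAlternatives = 2,
): AdapterRecommendation {
  const ranked = Object.entries(scores)
    .map(([adapter, confidence]) => ({ adapter, confidence }))
    .sort((a, b) => b.confidence - a.confidence); // highest confidence first

  const best = ranked[0];
  if (!best) {
    throw new Error(`No adapters scored for category "${taskCategory}"`);
  }

  return {
    adapter: best.adapter,
    confidence: best.confidence,
    reason: `Highest confidence for ${taskCategory} tasks`, // placeholder text
    alternatives: ranked.slice(1, 1 + maxAlternatives),
  };
}

// Scores taken from the sample response above.
const recommendation = buildRecommendation("code", {
  "claude-haiku": 58,
  "claude-sonnet": 87,
  "gpt-4o": 72,
});
console.log(recommendation.adapter); // claude-sonnet
console.log(recommendation.alternatives.map((a) => a.adapter)); // gpt-4o, then claude-haiku
```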
Confidence Scoring
Confidence is calculated from three factors:
| Factor | Weight | Description |
|---|---|---|
| Sample size | 40% | More episodes = higher confidence |
| Success rate | 40% | Higher rate = higher confidence |
| Recency | 20% | Recent episodes weighted more heavily |
```
confidence = (sampleFactor × 0.4) + (successFactor × 0.4) + (recencyFactor × 0.2)
```
where each factor is normalized to 0–100.
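The weighted sum can be sketched in a few lines. The weights come from the table above, but how each factor is normalized is not documented, so the normalizations here are assumptions: `FULL_CONFIDENCE_SAMPLES` (the episode count at which the sample factor saturates) and the `confidenceScore` function are hypothetical names chosen for illustration.

```typescript
// Assumed saturation point for the sample-size factor; not specified by the source.
const FULL_CONFIDENCE_SAMPLES = 50;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// episodes:      number of recorded episodes for this adapter and category
// successRate:   fraction in [0, 1]
// recencyFactor: already normalized to 0–100 (its derivation is left open here)
function confidenceScore(episodes: number, successRate: number, recencyFactor: number): number {
  const sampleFactor = clamp((episodes / FULL_CONFIDENCE_SAMPLES) * 100, 0, 100);
  const successFactor = clamp(successRate * 100, 0, 100);
  const recency = clamp(recencyFactor, 0, 100);
  // Weights from the table: 40% sample size, 40% success rate, 20% recency.
  return sampleFactor * 0.4 + successFactor * 0.4 + recency * 0.2;
}

// Many episodes with a strong rate: roughly 90 under these assumptions.
console.log(confidenceScore(47, 0.91, 80));
// Only 3 episodes, all successful and recent: roughly 62.4, held down by sample size.
console.log(confidenceScore(3, 1.0, 100));
```

Under this normalization a perfect success rate on a handful of episodes still scores lower than a slightly worse rate backed by many episodes, which matches the behavior described for small sample sizes.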
Cross-Adapter Learning
Ralph detects when a new adapter outperforms the current recommendation and updates the effectiveness matrix. It also identifies task categories where adapter choice doesn't matter (flat effectiveness across adapters) and reports these as "adapter-agnostic" categories to save compute.
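How "flat" is measured is not defined above. One plausible reading, sketched below, treats a category as adapter-agnostic when the gap between its best and worst adapter stays under a threshold. The `adapterAgnosticCategories` function and the `AGNOSTIC_SPREAD` value are assumptions for illustration, and the `lint` row is made-up sample data.

```typescript
// Assumed threshold: max allowed gap between best and worst success rate.
const AGNOSTIC_SPREAD = 0.05;

// category -> adapter -> success rate in [0, 1]
type SuccessMatrix = Record<string, Record<string, number>>;

// Returns categories whose success rates are nearly identical across adapters.
function adapterAgnosticCategories(
  matrix: SuccessMatrix,
  maxSpread: number = AGNOSTIC_SPREAD,
): string[] {
  return Object.entries(matrix)
    .filter(([, row]) => {
      const rates = Object.values(row);
      if (rates.length < 2) return false; // nothing to compare with a single adapter
      return Math.max(...rates) - Math.min(...rates) <= maxSpread;
    })
    .map(([category]) => category);
}

const sample: SuccessMatrix = {
  // Row taken from the effectiveness matrix on this page (spread 0.35).
  docs: { "claude-sonnet": 0.85, "claude-haiku": 0.81, "gpt-4o": 0.87, "local-7b": 0.52 },
  // Hypothetical row with made-up values, included only to show a flat category.
  lint: { "claude-sonnet": 0.9, "claude-haiku": 0.89, "gpt-4o": 0.9, "local-7b": 0.88 },
};
console.log(adapterAgnosticCategories(sample)); // only "lint" qualifies
```

For a category flagged this way, a planner could pick the cheapest adapter without an expected loss in success rate, which is where the compute saving would come from.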