Quality Gates
Nity's quality_gates tool validates output across 7 dimensions. Every task result passes through quality gates before being accepted.
The 7 Dimensions
1. Functional Correctness
Does the output do what it's supposed to do?
| Check | Method |
|---|---|
| Tests pass | Run test suite, verify green |
| Build succeeds | Compile/build without errors |
| Output matches intent | Semantic comparison against task description |
2. Determinism & Reproducibility
Can this result be reproduced?
| Check | Method |
|---|---|
| No nondeterministic inputs | Scan for Math.random(), Date.now(), and similar calls |
| Seed-based behavior | Verify seeded randomness where applicable |
| Consistent output | Same input produces same output |
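As a rough illustration of the scan in the first row, the sketch below flags calls to Math.random() and Date.now() in a piece of source text. The names `findNondeterministicCalls` and `NondeterminismFinding` are hypothetical and are not part of the quality_gates tool. A plain text scan like this will also flag matches inside comments and string literals.

```typescript
// Hypothetical helper, not the tool's actual implementation.
interface NondeterminismFinding {
  line: number; // 1-based line number in the scanned text
  call: string; // the matched function, e.g. "Math.random"
}

function findNondeterministicCalls(source: string): NondeterminismFinding[] {
  const findings: NondeterminismFinding[] = [];
  source.split("\n").forEach((text, index) => {
    // A fresh global regex per line, so lastIndex always starts at 0.
    const pattern = /\b(Math\.random|Date\.now)\s*\(/g;
    let match: RegExpExecArray | null = pattern.exec(text);
    while (match !== null) {
      findings.push({ line: index + 1, call: match[1] });
      match = pattern.exec(text);
    }
  });
  return findings;
}
```

An empty result means only that these two calls were not found. It does not prove the code is deterministic.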
3. Observability
Can you see what's happening?
| Check | Method |
|---|---|
| Logging present | Verify meaningful log statements |
| Error messages clear | Check error output quality |
| Events emitted | Verify progress/status events |
4. Security & Access Control
Is it safe?
| Check | Method |
|---|---|
| No hardcoded secrets | Scan for API keys, passwords, tokens |
| Input validation | Check for unvalidated user input |
| Dependency audit | Verify no known vulnerabilities |
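The hardcoded-secrets check could look something like the sketch below. It assumes a secret appears as a quoted literal assigned to a credential-like name. The pattern, the 8-character minimum, and the names `findHardcodedSecrets` and `SecretFinding` are illustrative assumptions, not the tool's actual rules.

```typescript
// Hypothetical helper, not the tool's actual implementation.
interface SecretFinding {
  line: number; // 1-based line number in the scanned text
  name: string; // identifier that looks like a credential
}

function findHardcodedSecrets(source: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  source.split("\n").forEach((text, index) => {
    // Credential-like name, then "=" or ":", then a quoted literal of 8+ characters.
    const pattern = /(\w*(?:api[_-]?key|password|secret|token)\w*)\s*[:=]\s*["'][^"']{8,}["']/i;
    const match = pattern.exec(text);
    if (match !== null) {
      findings.push({ line: index + 1, name: match[1] });
    }
  });
  return findings;
}
```

Values read from the environment, such as `process.env.PASSWORD`, are not flagged because no quoted literal follows the assignment.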
5. Documentation & Handoff
Can someone else understand this?
| Check | Method |
|---|---|
| README updated | Verify docs reflect changes |
| Code comments | Check complex sections are explained |
| Interface documented | Verify public API is typed/documented |
6. Regression Protection
Does this break anything?
| Check | Method |
|---|---|
| Existing tests pass | Run full test suite, not just new tests |
| No breaking changes | Verify API compatibility |
| Integration tests | Check cross-component interactions |
7. Property-Based Validation
Are the invariants maintained?
| Check | Method |
|---|---|
| Type safety | Verify TypeScript strictness |
| Data integrity | Check state consistency |
| Boundary conditions | Test edge cases |
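One way to test an invariant across many inputs, including boundary values, is a small seeded property check. The sketch below is a minimal hand-rolled version. It assumes a simple linear congruential generator for reproducibility, and every name in it (`seededRandom`, `findCounterexample`, `clampScore`) is hypothetical.

```typescript
// Minimal seeded generator (a linear congruential generator); an illustrative choice.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

// Runs `property` against `runs` generated inputs.
// Returns the first input that violates the property, or null if none does.
function findCounterexample<T>(
  generate: (random: () => number) => T,
  property: (input: T) => boolean,
  runs: number,
  seed: number
): T | null {
  const random = seededRandom(seed);
  for (let i = 0; i < runs; i++) {
    const input = generate(random);
    if (!property(input)) return input;
  }
  return null;
}

// Example invariant under test: a clamped score always lies within 0..100.
function clampScore(value: number): number {
  return Math.min(100, Math.max(0, value));
}
```

Because the generator is seeded, a failing run can be replayed with the same seed. This also satisfies the seed-based behavior check under Determinism & Reproducibility.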
Scoring
Each dimension is scored from 0 to 100. The overall quality gate result is the minimum dimension score, so the weakest dimension determines the outcome.
interface QualityGateResult {
  overall: number // min of all dimensions
  dimensions: {
    functionalCorrectness: number
    determinism: number
    observability: number
    security: number
    documentation: number
    regressionProtection: number
    propertyBasedValidation: number
  }
  passed: boolean // overall >= 70
  blockers: string[] // dimensions below 50
  warnings: string[] // dimensions 50-69
}

Thresholds:
| Score | Status | Action |
|---|---|---|
| 70–100 | Pass | Accept output |
| 50–69 | Warning | Accept with warning, log for review |
| 0–49 | Fail | Reject, send back to loop |
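The scoring rules can be sketched as a small function that derives `overall`, `passed`, `blockers`, and `warnings` from the seven dimension scores. The field meanings follow the comments in the interface above. The function name `buildResult`, the two constants, and the helper types are assumptions made for illustration.

```typescript
type DimensionName =
  | "functionalCorrectness"
  | "determinism"
  | "observability"
  | "security"
  | "documentation"
  | "regressionProtection"
  | "propertyBasedValidation";

type DimensionScores = Record<DimensionName, number>;

interface QualityGateResult {
  overall: number;
  dimensions: DimensionScores;
  passed: boolean;
  blockers: string[];
  warnings: string[];
}

const PASS_THRESHOLD = 70;    // overall must reach this to pass
const WARNING_THRESHOLD = 50; // dimensions below this are blockers

// Hypothetical helper: applies the weakest-link rule and the thresholds.
function buildResult(dimensions: DimensionScores): QualityGateResult {
  const names = Object.keys(dimensions) as DimensionName[];
  const overall = Math.min(...names.map((name) => dimensions[name]));
  return {
    overall,
    dimensions,
    passed: overall >= PASS_THRESHOLD,
    blockers: names.filter((name) => dimensions[name] < WARNING_THRESHOLD),
    warnings: names.filter(
      (name) => dimensions[name] >= WARNING_THRESHOLD && dimensions[name] < PASS_THRESHOLD
    ),
  };
}
```

For example, six dimensions at 90 and `security` at 40 give `overall` 40, `passed` false, and `blockers` of `["security"]`. High scores elsewhere do not offset a single weak dimension.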
The quality gate pass threshold is 70, not 0. An output that scores 65 is not "almost passing"; it has not passed. The threshold exists because low-quality outputs create technical debt that compounds over time.
Integration
Quality gates run after execution and before episode recording:
execution complete
  → quality_gates.validate(output)
    → pass → episode.record(success) → bridge.stream(result)
    → fail → loop continues (or circuit breaker opens)

The quality gate result is included in the episode recording for Ralph's effectiveness tracking: adapters that consistently produce lower quality outputs get downgraded in recommendations.
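The flow above might be wired together roughly as follows. This is a synchronous sketch under stated assumptions: the `Pipeline` interface, `runWithQualityGate`, and the use of a fixed `maxAttempts` as a stand-in for the circuit breaker are all hypothetical, and the real episode and bridge APIs may differ.

```typescript
interface GateResult {
  passed: boolean;
  overall: number;
}

// Hypothetical wiring; each member stands in for a real component.
interface Pipeline<T> {
  execute: (attempt: number) => T;           // produces a candidate output
  validate: (output: T) => GateResult;       // stands in for quality_gates.validate
  recordEpisode: (gate: GateResult) => void; // stands in for episode.record(success)
  stream: (output: T) => void;               // stands in for bridge.stream
}

function runWithQualityGate<T>(pipeline: Pipeline<T>, maxAttempts: number): T | null {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = pipeline.execute(attempt);
    const gate = pipeline.validate(output);
    if (gate.passed) {
      // pass: record the episode with the gate result, then stream the output
      pipeline.recordEpisode(gate);
      pipeline.stream(output);
      return output;
    }
    // fail: the loop continues with another attempt
  }
  // Attempts exhausted: treated here as the circuit breaker opening.
  return null;
}
```

Passing the gate result to `recordEpisode` mirrors the note that the result is stored with the episode for effectiveness tracking.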