Value of Information Escalation
The Problem: ESCALATE as Investment, Not Fallback
Traditional governance frameworks treat ESCALATE as a binary safety fallback: when the system is uncertain, ask a human. This creates two failure modes. Over-escalation fatigues reviewers until they rubber-stamp everything (the "alert fatigue" problem). Under-escalation misses genuinely ambiguous cases where human judgment would improve the outcome.
Value of Information (VOI) reframes escalation as an investment decision. The system pays a measurable cost — latency, reviewer attention, process interruption — in exchange for expected information gain. VOI quantifies this tradeoff: escalate only when the expected improvement in verdict quality exceeds the cost of obtaining it.
Three VOI Signals
1. Dimension Score Entropy
The 13 governance dimensions each produce a score between 0.0 and 1.0. When dimensions agree (low variance), the system's model is internally consistent and escalation adds little value. When dimensions disagree (high variance), the system's model is uncertain — a human reviewer might resolve the conflict.
Entropy is computed as weighted variance across dimension scores. Dimensions with higher governance significance — ethical_alignment, scope_compliance, behavioral_consistency — carry configurable disagreement weights so that conflict in these dimensions contributes more to entropy than conflict in lower-stakes dimensions. The top 2-3 dimensions contributing to entropy are tracked as dominant_disagreement for diagnostic transparency.
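A minimal sketch of this computation, assuming illustrative disagreement weights; the helper name `dimension_entropy` and the `DISAGREEMENT_WEIGHTS` values are not the real API:

```python
# Sketch only: weighted variance across dimension scores as the entropy signal.
# DISAGREEMENT_WEIGHTS values are illustrative assumptions, not documented defaults.
DISAGREEMENT_WEIGHTS = {
    "ethical_alignment": 2.0,       # high-significance dimensions weigh more
    "scope_compliance": 2.0,
    "behavioral_consistency": 1.5,
    # all other dimensions default to 1.0
}

def dimension_entropy(scores: dict[str, float]) -> tuple[float, list[str]]:
    """Return weighted variance of dimension scores plus the top-3 contributors."""
    weights = {d: DISAGREEMENT_WEIGHTS.get(d, 1.0) for d in scores}
    total_w = sum(weights.values())
    mean = sum(weights[d] * s for d, s in scores.items()) / total_w
    # Each dimension's contribution to the weighted variance
    contrib = {d: weights[d] * (s - mean) ** 2 / total_w for d, s in scores.items()}
    entropy = sum(contrib.values())
    dominant = sorted(contrib, key=contrib.get, reverse=True)[:3]  # dominant_disagreement
    return entropy, dominant
```

When all dimensions agree the variance collapses to zero, and the `dominant` list surfaces which high-weight dimensions drove any disagreement.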
2. Trust-Weighted Uncertainty
Trust calibration (TrustProfile.overall_trust) reflects the system's confidence in its model of the agent. Low trust means the model may be unreliable, so human review provides more marginal value. Trust uncertainty is 1.0 - overall_trust, scaled upward when the trust estimate is based on few observations (low successful_actions + violation_count), since a trust score derived from 3 actions is less reliable than one derived from 300.
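One way to sketch this scaling; the `full_confidence_at` cutoff and the blending factor are illustrative assumptions, not documented behavior:

```python
def trust_uncertainty(overall_trust: float, observations: int,
                      full_confidence_at: int = 50) -> float:
    """Sketch: 1 - trust, scaled upward when few observations back the estimate.

    observations ~ successful_actions + violation_count. The full_confidence_at
    cutoff and the 0.5 blending factor below are illustrative assumptions.
    """
    base = 1.0 - overall_trust
    reliability = min(observations / full_confidence_at, 1.0)
    # Fewer observations -> less reliable trust estimate -> inflate uncertainty
    return min(1.0, base + (1.0 - reliability) * (1.0 - base) * 0.5)
```

A trust score of 0.5 backed by 300 actions yields uncertainty 0.5, while the same score backed by 3 actions yields a higher value, reflecting the weaker evidence.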
3. Cost-Profile Sensitivity
The CostProfile specifies how costly each type of governance error is. In high-stakes domains (financial transactions, medical decisions), the cost of a wrong autonomous decision is high, so even moderate uncertainty justifies escalation. In low-stakes domains (content suggestions, routine queries), the threshold for escalation is higher because autonomous errors are cheap to correct.
Cost sensitivity is derived as max(false_allow_cost, false_deny_cost) from the active cost profile.
Composite VOI Score
The three signals are combined with configurable weights:
Default weights: entropy 0.40, trust 0.25, cost 0.35. These can be tuned per organization through VOIConfig.
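The composite is a plain weighted sum, which can be sketched as follows (the function name is illustrative; the default weights are the documented ones):

```python
def voi_score(entropy: float, trust_unc: float, cost_sens: float,
              w_entropy: float = 0.40, w_trust: float = 0.25,
              w_cost: float = 0.35) -> float:
    """Weighted sum of the three VOI signals (defaults are the documented weights)."""
    return w_entropy * entropy + w_trust * trust_unc + w_cost * cost_sens
```

Plugging in the first worked example from this page, `voi_score(0.6, 0.5, 1.0)` reproduces 0.715.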
Escalation Cost Computation
Escalation is not free. The VOI engine computes escalation cost from:
- Base costs: base_latency_cost (delay to the agent) + base_attention_cost (reviewer cognitive load).
- Profile costs: if a CostProfile is provided, adds escalation_latency_cost + escalation_interruption_cost.
- Reviewer fatigue feedback: the HumanDriftMonitor tracks reviewer behavioral drift. When a reviewer's approval rate exceeds the fatigue threshold (default 0.85), the escalation cost is inflated by a configurable multiplier (default 2.0×). This reflects the reality that a fatigued reviewer provides less information gain, so the effective cost of escalating to them is higher.
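The computation above can be sketched as follows; the 0.10/0.05 split of the 0.15 base cost is an assumption (only the sum appears in the worked examples), and the function name is illustrative:

```python
from typing import Optional

def escalation_cost(base_latency: float = 0.10, base_attention: float = 0.05,
                    profile_latency: float = 0.0, profile_interruption: float = 0.0,
                    approval_rate: Optional[float] = None,
                    fatigue_threshold: float = 0.85,
                    fatigue_multiplier: float = 2.0) -> float:
    """Base + profile costs, inflated when the reviewer pool looks fatigued."""
    cost = base_latency + base_attention + profile_latency + profile_interruption
    if approval_rate is not None and approval_rate > fatigue_threshold:
        # A fatigued reviewer yields less information gain, so escalating
        # to them effectively costs more.
        cost *= fatigue_multiplier
    return cost
```

With no profile costs and a fresh reviewer this gives the 0.15 base cost used in the worked examples; a reviewer at a 0.92 approval rate doubles it to 0.30.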
Decision Rule
Escalation is recommended when the VOI score exceeds both the escalation cost and a minimum threshold. The trust_floor ensures that extremely low-trust agents never bypass human review regardless of VOI arithmetic, and min_voi_to_escalate (default 0.3) provides a conservative minimum threshold.
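Putting the pieces together, the rule can be sketched as follows; the trust_floor default of 0.2 is an assumption, while the rest follows the documented thresholds and this page's worked examples:

```python
def should_escalate(voi: float, cost: float, trust: float,
                    trust_floor: float = 0.2, min_voi: float = 0.3) -> bool:
    """Sketch of the decision rule as inferred from this page's worked examples."""
    if trust < trust_floor:
        return True       # extremely low trust never bypasses human review
    if voi < min_voi:
        return False      # conservative minimum threshold (default 0.3)
    return voi > cost     # escalate only when information gain exceeds its cost
```

This reproduces both worked examples below: 0.715 > 0.15 escalates, while 0.2475 falls under the 0.3 minimum and does not.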
Expected Verdict Shift Probability
As a diagnostic signal, the engine estimates how likely a human reviewer is to change the verdict:
High entropy combined with a UCS near the decision boundary (0.5) produces the highest shift probability. A UCS near 0.0 or 1.0 suggests the verdict is clear regardless of disagreement.
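An illustrative form of this estimate (the exact formula is not documented; this sketch only encodes "disagreement matters most near the boundary"):

```python
def shift_probability(entropy: float, ucs: float) -> float:
    """Illustrative: disagreement matters most near the 0.5 decision boundary."""
    boundary_proximity = 1.0 - 2.0 * abs(ucs - 0.5)  # 1.0 at UCS=0.5, 0.0 at 0.0/1.0
    return max(0.0, min(1.0, entropy * boundary_proximity))
```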
Tiered Escalation Paths
Not all escalations require full human review. The VOI engine recommends a graduated path based on VOI magnitude:
| VOI magnitude | Path | Action |
| --- | --- | --- |
| No escalation | none | Decide autonomously |
| Low (< 0.5) | request_context | Ask for additional context before deciding |
| Medium (0.5-0.7) | secondary_check | Route to automated secondary evaluation |
| High (> 0.7) | human_review | Full human review required |
This tiering reduces reviewer load by reserving human attention for cases where it provides the most value.
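The tier mapping can be sketched as follows, using the thresholds from the table; how the "no escalation" row interacts with min_voi_to_escalate and the escalation cost is an assumption:

```python
def escalation_path(voi: float, cost: float, min_voi: float = 0.3) -> str:
    """Map VOI magnitude to a tier, using the thresholds from the table."""
    if voi < min_voi or voi <= cost:
        return "none"                 # decide autonomously
    if voi < 0.5:
        return "request_context"      # ask for more context first
    if voi <= 0.7:
        return "secondary_check"      # automated secondary evaluation
    return "human_review"             # full human review
```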
The EscalationManager implements this tiering as a concrete resolution pipeline: when Tier 3 determines that escalation is warranted, it walks through an ordered sequence of escalation steps (cheapest first), using VOI to gate each step. Only steps whose cost is justified by the current VOI score are attempted. See Escalation Paths for the full architecture, step resolution logic, fallback verdicts, and custom path configuration via Behavioral Contracts.
Worked Examples
High-Stakes Disagreement → Escalate
An agent requests to modify a financial database. Dimensions disagree: ethical_alignment scores 0.9 (no ethical concern) but scope_compliance scores 0.2 (outside declared scope). The cost profile is CONSERVATIVE (false_allow_cost = 1.0). Trust is moderate at 0.5.
dimension_entropy: ~0.6 (high, weighted by scope_compliance)
trust_uncertainty: 0.5
cost_sensitivity: 1.0 (CRITICAL false_allow)
voi_score: 0.40×0.6 + 0.25×0.5 + 0.35×1.0 = 0.715
escalation_cost: 0.15 (base)
Decision: Escalate → human_review
Low-Stakes Consensus → Decide Autonomously
An agent requests to generate a content suggestion. All dimensions agree around 0.75. The cost profile is PERMISSIVE (false_allow_cost = 0.1). Trust is high at 0.85.
dimension_entropy: ~0.0
trust_uncertainty: 0.15
cost_sensitivity: 0.6 (false_deny is HIGH at 0.6)
voi_score: 0.40×0.0 + 0.25×0.15 + 0.35×0.6 = 0.2475
escalation_cost: 0.15
min_voi_to_escalate: 0.3
Decision: Do not escalate (VOI < min threshold)
Fatigued Reviewer Pool → Suppress Low-VOI Escalations
Same scenario as the first example, but the assigned reviewer has an approval_rate of 0.92 (above the 0.85 fatigue threshold).
voi_score: 0.715 (unchanged)
escalation_cost: 0.15 × 2.0 = 0.30 (fatigue multiplier applied)
Decision: Still escalate (VOI 0.715 > cost 0.30)
The margin narrows for borderline cases, however: a voi_score of 0.35 clears the inflated cost of 0.30 only barely, where it would comfortably clear the base cost of 0.15, and any case scoring at or below 0.30 is now suppressed. The fatigued reviewer pool raises the bar for escalation, concentrating human attention on the cases with the highest information value.
Runtime Integration
The VOI engine plugs into Tier 3 deliberation via enable_voi_escalation in RuntimeConfig. When enabled:
1. The GovernanceRuntime creates a VOIEngine (with an optional VOIConfig from voi_config).
2. The engine is passed to TierThreeDeliberator along with accessors for cost profiles and reviewer drift.
3. During Tier 3 evaluation, VOI is computed after custom deliberators but before any trust-based fallback logic.

The cost_profile_accessor fetches the CostProfile from the agent's active BehavioralContract; the reviewer_drift_accessor fetches the latest HumanDriftResult from the HumanDriftMonitor (if available).
When VOI is not enabled (enable_voi_escalation=False, the default), Tier 3 behavior is completely unchanged — the original trust-threshold logic applies.
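Enabling VOI might look like the following sketch; only enable_voi_escalation and voi_config are named on this page, so treat the exact constructor signatures as assumptions:

```python
# Sketch of enabling VOI; constructor signatures beyond enable_voi_escalation
# and voi_config are assumptions based on the names used on this page.
runtime = GovernanceRuntime(
    config=RuntimeConfig(
        enable_voi_escalation=True,   # default is False: Tier 3 behavior unchanged
        voi_config=VOIConfig(entropy_weight=0.40, trust_weight=0.25, cost_weight=0.35),
    )
)
```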
Pipeline Tracing
When the BehaviorLedger is enabled, the VOI computation is recorded as a voi_computation tracer step before the tier3 step. This provides full decision reconstruction including:
- Input: dimension_entropy, trust_uncertainty, cost_sensitivity
- Output: voi_score, should_escalate, shift_probability
Verdict Metadata
All VOI-driven verdicts include verdict.metadata["voi_result"] containing the serialized VOIResult. This allows downstream systems to inspect the VOI computation that informed the decision.
Configuration
All VOI parameters are configurable through VOIConfig:
- Signal weights: entropy_weight, trust_weight, cost_weight
- Dimension disagreement weights: per-dimension multipliers for entropy
- Escalation cost: base_latency_cost, base_attention_cost
- Fatigue parameters: reviewer_fatigue_threshold, reviewer_fatigue_multiplier
- Safety: trust_floor, min_voi_to_escalate
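An illustrative construction touching each parameter group; field names follow this page, values shown are the documented defaults where given and assumptions otherwise:

```python
# Illustrative VOIConfig; values are this page's stated defaults where given,
# assumptions otherwise (marked inline).
voi_config = VOIConfig(
    entropy_weight=0.40, trust_weight=0.25, cost_weight=0.35,   # signal weights
    dimension_disagreement_weights={"scope_compliance": 2.0},   # entropy multipliers (value assumed)
    base_latency_cost=0.10, base_attention_cost=0.05,           # escalation cost (split assumed)
    reviewer_fatigue_threshold=0.85, reviewer_fatigue_multiplier=2.0,
    trust_floor=0.2, min_voi_to_escalate=0.3,                   # trust_floor value assumed
)
```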