Three-Tier Evaluation

Nomotic evaluates actions through a three-tier cascade. Each tier handles a different class of decision. If a tier makes a final decision, evaluation stops. Only ambiguous cases proceed to the next tier.

Tier 1: Deterministic Gate

Fast, rule-based checks. Runs in microseconds. No scoring, no weighing — binary pass or fail.

Tier 1 handles hard boundaries:

- Scope violations (action outside agent's permissions)
- Authority failures (agent lacks specific authority)
- Resource limit breaches
- Isolation boundary violations
- Temporal constraint violations
- Ethical hard constraints
- Human override requirements

If any veto dimension fails its check (recorded as a score of 0.0), Tier 1 issues an immediate DENY and evaluation stops.

Exception: if human_override is the only vetoing dimension, Tier 1 issues ESCALATE instead of DENY — the action is queued for human review, not permanently rejected.
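The gate logic described above can be sketched as follows. This is a minimal illustration, not Nomotic's implementation: the `Verdict` enum here is a local stand-in, `tier1_gate` is a hypothetical helper, and the dimension name `scope` is assumed for the example (only `human_override` appears in this page).

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """Local stand-in for the library's Verdict enum (assumption)."""
    DENY = "deny"
    ESCALATE = "escalate"


def tier1_gate(veto_scores: dict[str, float]) -> tuple[Optional[Verdict], list[str]]:
    """Return (verdict, vetoed_by); verdict is None when Tier 2 should run."""
    # A veto dimension that scores exactly 0.0 has failed its hard check.
    vetoed_by = [name for name, score in veto_scores.items() if score == 0.0]
    if not vetoed_by:
        return None, []
    # human_override alone means "ask a human", not "reject permanently".
    if vetoed_by == ["human_override"]:
        return Verdict.ESCALATE, vetoed_by
    return Verdict.DENY, vetoed_by


# Hypothetical inputs: one clean pass, one override-only veto, one mixed veto.
passed = tier1_gate({"scope": 1.0, "human_override": 1.0})
override_only = tier1_gate({"scope": 1.0, "human_override": 0.0})
mixed = tier1_gate({"scope": 0.0, "human_override": 0.0})
```

Note that a mixed veto still results in DENY: the ESCALATE exception applies only when `human_override` is the sole vetoing dimension.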

Tier 2: Weighted Evaluation

The UCS engine. Combines all 14 dimension scores into a Unified Confidence Score using configured weights. Most actions are decided here.

- UCS ≥ allow threshold → ALLOW
- UCS ≤ deny threshold → DENY
- UCS in the ambiguity zone → pass to Tier 3

Default thresholds: allow at 0.70, deny at 0.30. Configurable per preset (strict uses 0.75/0.35, ultra strict uses 0.85/0.45).
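To make the threshold rules concrete, here is a small sketch of the Tier 2 decision. The threshold values come from the presets listed above; everything else is an assumption for illustration: the `Thresholds` class, the preset key names, the `tier2_decide` helper, and in particular the weight-normalized mean used in `weighted_ucs`, since this page does not specify how the UCS engine aggregates scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    allow: float
    deny: float


# Values from this page; the dictionary keys are hypothetical names.
PRESETS = {
    "default": Thresholds(allow=0.70, deny=0.30),
    "strict": Thresholds(allow=0.75, deny=0.35),
    "ultra_strict": Thresholds(allow=0.85, deny=0.45),
}


def weighted_ucs(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Assumed aggregation: weight-normalized mean of dimension scores."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def tier2_decide(ucs: float, thresholds: Thresholds) -> str:
    """Apply the allow/deny thresholds; anything in between goes to Tier 3."""
    if ucs >= thresholds.allow:
        return "ALLOW"
    if ucs <= thresholds.deny:
        return "DENY"
    return "TIER_3"


# The same score can be decided under one preset and ambiguous under another.
under_default = tier2_decide(0.72, PRESETS["default"])
under_strict = tier2_decide(0.72, PRESETS["strict"])
```

A stricter preset raises both thresholds, so it shrinks the ALLOW region and widens the DENY region at the same time.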

Cost-Sensitive Thresholds

Tier 2 thresholds can be static (from RuntimeConfig) or dynamic (derived from a CostProfile on the agent's BehavioralContract). When enable_cost_sensitive=True, the ThresholdEngine computes per-evaluation thresholds using signal detection theory, and passes them to TierTwoEvaluator as dynamic_allow / dynamic_deny.

Dynamic thresholds are recorded in behavioral provenance for auditability. The verdict's modifications dict includes a "derived_thresholds" entry with the allow/deny values, source, and cost profile summary.

When dynamic thresholds are not available (no cost profile or feature disabled), Tier 2 falls back to the static thresholds — behavior is identical to the default configuration.
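The fallback behavior can be pictured with a short sketch. It does not reproduce the ThresholdEngine's signal-detection computation, which this page does not detail; it only shows the selection between dynamic and static values. The function name `resolve_thresholds` and the keys inside the returned dictionary are assumptions.

```python
from typing import Optional

# Static defaults from this page (allow at 0.70, deny at 0.30).
STATIC_ALLOW = 0.70
STATIC_DENY = 0.30


def resolve_thresholds(
    enable_cost_sensitive: bool,
    dynamic_allow: Optional[float] = None,
    dynamic_deny: Optional[float] = None,
) -> dict:
    """Pick dynamic thresholds when available, otherwise the static ones."""
    have_dynamic = dynamic_allow is not None and dynamic_deny is not None
    if enable_cost_sensitive and have_dynamic:
        # "source" is a hypothetical label for the provenance record.
        return {"allow": dynamic_allow, "deny": dynamic_deny, "source": "cost_profile"}
    return {"allow": STATIC_ALLOW, "deny": STATIC_DENY, "source": "static"}


# Feature disabled: dynamic values are ignored even if supplied.
disabled = resolve_thresholds(False, dynamic_allow=0.8, dynamic_deny=0.4)
# Feature enabled but no cost profile produced values: static fallback.
no_profile = resolve_thresholds(True)
# Feature enabled with derived values.
dynamic = resolve_thresholds(True, dynamic_allow=0.8, dynamic_deny=0.4)
```

Recording which branch was taken alongside the values is what makes a later audit able to tell a static decision from a cost-derived one.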

Tier 3: Deliberative Review

Handles ambiguous cases that Tier 2 couldn't resolve.

VOI-Driven Flow (when `enable_voi_escalation=True`)

When VOI is enabled, Tier 3 replaces the simple trust-threshold escalation with a decision-theoretic approach. The flow is:

1. Custom deliberators run first (application-specific logic). If any returns a verdict, it's final.
2. VOI computation: The VOIEngine computes a Value of Information score from dimension entropy, trust uncertainty, and cost sensitivity.
3. Trust floor (default 0.2): Trust below this always escalates regardless of VOI, as a safety override.
4. VOI > escalation cost: If the expected information gain from human review exceeds the cost, escalate.
5. VOI-informed autonomous decision: When VOI is too low to justify escalation:
   - UCS > 0.5 → ALLOW
   - UCS ≤ 0.5 and high false-allow cost (≥ 0.6) → DENY
   - UCS ≤ 0.5 and low false-allow cost (< 0.6) → ALLOW (low stakes)

All VOI-driven verdicts include metadata["voi_result"] with the full VOI computation for auditability.
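The decision order in the VOI-driven flow can be sketched as a single function. This is an illustration under stated assumptions: `voi_tier3` is a hypothetical name, the VOI score and escalation cost are taken as already-computed inputs (the VOIEngine's formula is not modeled), and the custom-deliberator step is omitted.

```python
def voi_tier3(
    ucs: float,
    trust: float,
    voi: float,
    escalation_cost: float,
    false_allow_cost: float,
    trust_floor: float = 0.2,
) -> str:
    """Apply the VOI-driven rules in the order listed above."""
    # Safety override: very low trust always goes to a human.
    if trust < trust_floor:
        return "ESCALATE"
    # Human review is worth it when its expected value exceeds its cost.
    if voi > escalation_cost:
        return "ESCALATE"
    # Otherwise decide autonomously from UCS and the stakes.
    if ucs > 0.5:
        return "ALLOW"
    if false_allow_cost >= 0.6:
        return "DENY"
    return "ALLOW"  # low stakes


# Same UCS (0.45) and trust, different stakes and VOI (illustrative numbers).
high_stakes = voi_tier3(0.45, 0.5, voi=0.1, escalation_cost=0.3, false_allow_cost=0.8)
low_stakes = voi_tier3(0.45, 0.5, voi=0.1, escalation_cost=0.3, false_allow_cost=0.2)
high_voi = voi_tier3(0.45, 0.5, voi=0.6, escalation_cost=0.3, false_allow_cost=0.2)
```

The trust floor is checked before VOI on purpose: an agent with almost no track record is escalated even when the VOI score suggests that review would add little.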

Enabling VOI Escalation

from nomotic.runtime import GovernanceRuntime, RuntimeConfig

runtime = GovernanceRuntime(RuntimeConfig(
    enable_voi_escalation=True,
    enable_contracts=True,  # Required for cost profile access
))

Example: Same UCS, Different Outcomes

An ambiguous action with UCS 0.45:

- Without VOI (default): Trust 0.5 → falls through to critical dimension check or default ALLOW.
- With VOI + CONSERVATIVE profile: High false-allow cost → DENY (VOI too low to escalate, but stakes are high).
- With VOI + PERMISSIVE profile: Low false-allow cost → ALLOW (stakes are low, autonomous decision is fine).
- With VOI + high dimension disagreement: Dimensions conflict → ESCALATE (human review has high expected value).

Legacy Flow (when `enable_voi_escalation=False`)

When VOI is disabled, Tier 3 uses the original trust-threshold logic:

1. Custom deliberators run first (application-specific logic). If any returns a verdict, it's final.
2. High trust (> 0.7) + UCS > 0.5 → ALLOW. An agent with a strong track record gets the benefit of the doubt.
3. Low trust (< 0.4) → ESCALATE. An untrusted agent gets human review.
4. Critical dimension (weight >= 1.3) scoring < 0.4 → MODIFY with reduce_scope and require_confirmation.
5. Default → ALLOW. If nothing is clearly wrong, the action proceeds.
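The legacy rules can be sketched the same way. The sketch assumes the rules are checked in the order listed, omits the custom-deliberator step, and represents dimensions as hypothetical `(weight, score)` pairs. The shape of the returned modifications dictionary is also an assumption; only the names `reduce_scope` and `require_confirmation` come from this page.

```python
def legacy_tier3(
    ucs: float,
    trust: float,
    dimensions: list[tuple[float, float]],
) -> tuple[str, dict]:
    """Return (verdict, modifications) for the trust-threshold flow."""
    # A strong track record earns the benefit of the doubt.
    if trust > 0.7 and ucs > 0.5:
        return "ALLOW", {}
    # An untrusted agent gets human review.
    if trust < 0.4:
        return "ESCALATE", {}
    # A weak score on a heavily weighted dimension constrains the action.
    if any(weight >= 1.3 and score < 0.4 for weight, score in dimensions):
        return "MODIFY", {"reduce_scope": True, "require_confirmation": True}
    # Nothing is clearly wrong.
    return "ALLOW", {}


# UCS 0.45 with trust 0.5 falls through to the critical-dimension check.
constrained = legacy_tier3(0.45, 0.5, [(1.5, 0.3), (1.0, 0.9)])
defaulted = legacy_tier3(0.45, 0.5, [(1.5, 0.8), (1.0, 0.2)])
```

In the second call the low score belongs to a dimension with weight 1.0, which is below the 1.3 cutoff, so the default ALLOW applies.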

Custom Deliberators

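# Assumes Verdict is already imported and `runtime` is the
# GovernanceRuntime instance created earlier.

def my_deliberator(action, context, scores, ucs):
    # Return a Verdict to end deliberation, or None to pass to the next deliberator
    if action.action_type == "delete" and ucs < 0.6:
        return Verdict.ESCALATE
    return None

# Register the deliberator with Tier 3
runtime.tier3.add_deliberator(my_deliberator)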
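The "first verdict wins" behavior of custom deliberators can be illustrated with a standalone sketch. It does not use Nomotic's classes: `run_deliberators` is a hypothetical helper, verdicts are plain strings, actions are plain dictionaries, and it assumes deliberators run in the order they were registered.

```python
from typing import Callable, Optional

# Hypothetical signature mirroring (action, context, scores, ucs).
Deliberator = Callable[[dict, dict, list, float], Optional[str]]


def run_deliberators(
    deliberators: list[Deliberator],
    action: dict,
    context: dict,
    scores: list,
    ucs: float,
) -> Optional[str]:
    """Return the first non-None verdict, or None if every deliberator passes."""
    for deliberate in deliberators:
        verdict = deliberate(action, context, scores, ucs)
        if verdict is not None:
            return verdict  # final: later deliberators are not consulted
    return None  # fall through to the built-in Tier 3 logic


def escalate_risky_deletes(action, context, scores, ucs):
    if action["action_type"] == "delete" and ucs < 0.6:
        return "ESCALATE"
    return None


def deny_everything(action, context, scores, ucs):
    return "DENY"


chain = [escalate_risky_deletes, deny_everything]
delete_result = run_deliberators(chain, {"action_type": "delete"}, {}, [], 0.55)
read_result = run_deliberators(chain, {"action_type": "read"}, {}, [], 0.55)
```

Because the first verdict is final, broad catch-all deliberators are best registered last under this assumption.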

Performance

| Tier | Typical Latency |
| --- | --- |
| Tier 1 (veto cases) | < 100 microseconds |
| Tier 2 (most cases) | < 1 ms at p99 |
| Tier 3 (edge cases) | 1–5 ms |

:::note
Governance adds less than a millisecond to the vast majority of agent decisions. Tier 3 is only invoked when Tier 2 produces an ambiguous result.
:::

Verdicts

Every governance evaluation produces a GovernanceVerdict — the output of the cascade. The verdict determines what happens to the agent's action.

Verdict Types

| Verdict | Meaning | What Happens |
| --- | --- | --- |
| ALLOW | Action is approved | Execution proceeds. Trust increases (+0.01). |
| DENY | Action is rejected | Execution blocked. Trust decreases (-0.05). Violation recorded. |
| MODIFY | Action approved with constraints | Execution proceeds with reduced scope and/or confirmation required. |
| ESCALATE | Human review required | Execution paused. Action queued for human approval. |
| SUSPEND | Agent suspended | All agent activity halted pending investigation. |

:::warning
A SUSPEND verdict means the agent cannot take any further actions until a human reviews and reinstates it via `nomotic inspect <agent-id>`.
:::

GovernanceVerdict Structure

@dataclass
class GovernanceVerdict:
    action_id: str                          # Which action was evaluated
    verdict: Verdict                        # ALLOW, DENY, MODIFY, ESCALATE, SUSPEND
    ucs: float                              # Unified Confidence Score (0.0–1.0)
    dimension_scores: list[DimensionScore]  # All 14 individual scores
    tier: int                               # Which tier (1, 2, or 3) decided
    vetoed_by: list[str]                    # Dimension names that vetoed (if any)
    modifications: dict[str, Any]           # Constraints for MODIFY verdicts
    reasoning: str                          # Human-readable explanation
    timestamp: float                        # When the verdict was issued
    evaluation_time_ms: float               # How long evaluation took

Working with Verdicts

verdict = runtime.evaluate(action, context)

if verdict.verdict == Verdict.ALLOW:
    execute(action)
elif verdict.verdict == Verdict.MODIFY:
    execute_with_constraints(action, verdict.modifications)
elif verdict.verdict == Verdict.ESCALATE:
    queue_for_human_review(action, verdict)
elif verdict.verdict == Verdict.DENY:
    log_denial(action, verdict.reasoning)

Accessing Verdict Details

print(verdict.verdict)            # The verdict enum
print(verdict.ucs)                # 0.0–1.0
print(verdict.tier)               # Which tier decided (1, 2, or 3)
print(verdict.reasoning)          # Human-readable explanation
print(verdict.vetoed_by)          # List of dimensions that vetoed
print(verdict.dimension_scores)   # All 14 dimension scores

Trust Impact

Every verdict feeds back into the trust calibrator:

| Event | Trust Change |
| --- | --- |
| ALLOW verdict | +0.01 |
| DENY verdict | -0.05 |
| Successful completion | +0.005 |
| Interrupted during execution | -0.03 |

Per-dimension trust is also updated. If a specific dimension vetoed or scored below 0.3, that dimension's trust decreases independently, even if overall trust remains stable.
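As a rough illustration of how these deltas accumulate, the sketch below applies them to an overall trust value. The deltas are the ones listed above; the event names, the `apply_trust_event` helper, and the clamping of trust to the range 0.0 to 1.0 are assumptions, and per-dimension trust is not modeled.

```python
# Deltas from the table above; the event keys are hypothetical names.
TRUST_DELTAS = {
    "allow": 0.01,
    "deny": -0.05,
    "completed": 0.005,
    "interrupted": -0.03,
}


def apply_trust_event(trust: float, event: str) -> float:
    """Apply one event's delta, keeping trust within [0.0, 1.0] (assumed bounds)."""
    return min(1.0, max(0.0, trust + TRUST_DELTAS[event]))


def replay(trust: float, events: list[str]) -> float:
    """Apply a sequence of events in order."""
    for event in events:
        trust = apply_trust_event(trust, event)
    return trust


# One denial costs as much trust as five allowed actions earn.
after_denial = apply_trust_event(0.5, "deny")
after_recovery = replay(after_denial, ["allow"] * 5)
```

The asymmetry between the ALLOW and DENY deltas means trust is lost faster than it is regained.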

API

GET /v1/audit/{agent_id}                    # Verdict history for an agent
GET /v1/ui/approval-queue                   # Pending ESCALATE verdicts
POST /v1/ui/approval-queue/{id}/approve     # Approve an escalated action
POST /v1/ui/approval-queue/{id}/deny        # Deny an escalated action

CLI

nomotic audit <agent-id>                   # View verdict history
nomotic audit <agent-id> --filter denied   # Filter by verdict type
