Three-Tier Evaluation

Nomotic evaluates actions through a three-tier cascade. Each tier handles a different class of decision. If a tier makes a final decision, evaluation stops. Only ambiguous cases proceed to the next tier.

Tier 1: Deterministic Gate

Fast, rule-based checks. Runs in microseconds. No scoring, no weighing — binary pass or fail.

Tier 1 handles hard boundaries:

  • Scope violations (action outside agent's permissions)

  • Authority failures (agent lacks specific authority)

  • Resource limit breaches

  • Isolation boundary violations

  • Temporal constraint violations

  • Ethical hard constraints

  • Human override requirements

If any veto dimension scores 0.0, Tier 1 issues an immediate DENY and evaluation stops.

Exception: if human_override is the only vetoing dimension, Tier 1 issues ESCALATE instead of DENY — the action is queued for human review, not permanently rejected.
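The gate logic above can be sketched in a few lines. This is an illustrative sketch only, not Nomotic's implementation: the `Verdict` enum, the `tier_one_gate` function, and every dimension name other than `human_override` are hypothetical.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    DENY = "DENY"
    ESCALATE = "ESCALATE"


def tier_one_gate(veto_scores: dict) -> Optional[Verdict]:
    """Return a final verdict if a veto dimension scores 0.0, else None.

    `veto_scores` maps a veto dimension name to its score (hypothetical shape).
    """
    vetoing = [name for name, score in veto_scores.items() if score == 0.0]
    if not vetoing:
        return None  # no hard boundary hit, so evaluation continues to Tier 2
    if vetoing == ["human_override"]:
        return Verdict.ESCALATE  # queued for human review, not rejected
    return Verdict.DENY  # any other veto is final
```

Returning `None` models the "not decided here" case, so a caller can hand the action to the next tier.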

Tier 2: Weighted Evaluation

The UCS engine. Combines all 14 dimension scores into a Unified Confidence Score using configured weights. Most actions are decided here.

  • UCS ≥ allow threshold → ALLOW

  • UCS ≤ deny threshold → DENY

  • UCS in the ambiguity zone → pass to Tier 3

Default thresholds: allow at 0.70, deny at 0.30. Configurable per preset (strict uses 0.75/0.35, ultra strict uses 0.85/0.45).
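As a rough illustration of the threshold rules, the sketch below assumes the UCS is a weighted average of dimension scores. The exact combination formula is not stated here, so treat that as an assumption. The function names and the "AMBIGUOUS" marker are hypothetical.

```python
def unified_confidence_score(scores: dict, weights: dict) -> float:
    """Weighted average of dimension scores (assumed formula, for illustration)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def tier_two_decision(ucs: float, allow_threshold: float = 0.70,
                      deny_threshold: float = 0.30) -> str:
    """Apply the Tier 2 rules using the documented default thresholds."""
    if ucs >= allow_threshold:
        return "ALLOW"
    if ucs <= deny_threshold:
        return "DENY"
    return "AMBIGUOUS"  # hypothetical marker: the case is passed to Tier 3
```

With the strict preset values (0.75/0.35), a UCS of 0.72 that would be allowed under the defaults falls into the ambiguity zone instead.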

Cost-Sensitive Thresholds

Tier 2 thresholds can be static (from RuntimeConfig) or dynamic (derived from a CostProfile on the agent's BehavioralContract). When enable_cost_sensitive=True, the ThresholdEngine computes per-evaluation thresholds using signal detection theory, and passes them to TierTwoEvaluator as dynamic_allow / dynamic_deny.

Dynamic thresholds are recorded in behavioral provenance for auditability. The verdict's modifications dict includes a "derived_thresholds" entry with the allow/deny values, source, and cost profile summary.

When dynamic thresholds are not available (no cost profile or feature disabled), Tier 2 falls back to the static thresholds — behavior is identical to the default configuration.
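The fallback behavior can be pictured as follows. This sketch does not reproduce the ThresholdEngine's signal-detection computation. It only shows the selection between dynamic and static values, and the `resolve_thresholds` helper and the "source" labels are hypothetical.

```python
from typing import Optional

# Documented default static thresholds.
STATIC_ALLOW = 0.70
STATIC_DENY = 0.30


def resolve_thresholds(dynamic_allow: Optional[float],
                       dynamic_deny: Optional[float],
                       enable_cost_sensitive: bool) -> dict:
    """Pick dynamic thresholds when available, otherwise fall back to static."""
    have_dynamic = dynamic_allow is not None and dynamic_deny is not None
    if enable_cost_sensitive and have_dynamic:
        return {"allow": dynamic_allow, "deny": dynamic_deny,
                "source": "cost_profile"}  # hypothetical source label
    return {"allow": STATIC_ALLOW, "deny": STATIC_DENY, "source": "static"}
```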

Tier 3: Deliberative Review

Handles ambiguous cases that Tier 2 couldn't resolve.

VOI-Driven Flow (when enable_voi_escalation=True)

When VOI (Value of Information) escalation is enabled, Tier 3 replaces the simple trust-threshold escalation with a decision-theoretic approach. The flow is:

  1. Custom deliberators run first (application-specific logic). If any returns a verdict, it's final.

  2. VOI computation: The VOIEngine computes a Value of Information score from dimension entropy, trust uncertainty, and cost sensitivity.

  3. Trust floor (default 0.2): Trust below this always escalates regardless of VOI — a safety override.

  4. VOI > escalation cost: If the expected information gain from human review exceeds the cost, escalate.

  5. VOI-informed autonomous decision: When VOI is too low to justify escalation:

    • UCS > 0.5 → ALLOW

    • UCS ≤ 0.5 + high false-allow cost (≥ 0.6) → DENY

    • UCS ≤ 0.5 + low false-allow cost → ALLOW (low stakes)

All VOI-driven verdicts include metadata["voi_result"] with the full VOI computation for auditability.
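Steps 3 to 5 of the flow reduce to a short decision function. The sketch below is a simplified illustration under stated assumptions: custom deliberators (step 1) and the VOI computation itself (step 2) are omitted, the VOI score and escalation cost are taken as inputs, and the `voi_decision` name and its parameters are hypothetical.

```python
def voi_decision(ucs: float, trust: float, voi: float,
                 escalation_cost: float, false_allow_cost: float,
                 trust_floor: float = 0.2,
                 high_cost: float = 0.6) -> str:
    """Sketch of steps 3-5 of the VOI-driven flow."""
    if trust < trust_floor:
        return "ESCALATE"  # step 3: safety override, regardless of VOI
    if voi > escalation_cost:
        return "ESCALATE"  # step 4: human review is worth its cost
    # Step 5: VOI is too low to justify escalation, so decide autonomously.
    if ucs > 0.5:
        return "ALLOW"
    if false_allow_cost >= high_cost:
        return "DENY"  # high stakes with a weak score
    return "ALLOW"  # low stakes
```

For the same UCS of 0.45 and trust of 0.5, this returns DENY when the false-allow cost is high, ALLOW when it is low, and ESCALATE when the VOI exceeds the escalation cost.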

Enabling VOI Escalation

Example: Same UCS, Different Outcomes

An ambiguous action with UCS 0.45:

  • Without VOI (default): Trust 0.5 → falls through to critical dimension check or default ALLOW.

  • With VOI + CONSERVATIVE profile: High false-allow cost → DENY (VOI too low to escalate, but stakes are high).

  • With VOI + PERMISSIVE profile: Low false-allow cost → ALLOW (stakes are low, autonomous decision is fine).

  • With VOI + high dimension disagreement: Dimensions conflict → ESCALATE (human review has high expected value).

Legacy Flow (when enable_voi_escalation=False)

When VOI is disabled, Tier 3 uses the original trust-threshold logic:

  1. Custom deliberators run first (application-specific logic). If any returns a verdict, it's final.

  2. High trust (> 0.7) + UCS > 0.5 → ALLOW. An agent with a strong track record gets the benefit of the doubt.

  3. Low trust (< 0.4) → ESCALATE. An untrusted agent gets human review.

  4. Critical dimension (weight ≥ 1.3) scoring < 0.4 → MODIFY with reduce_scope and require_confirmation.

  5. Default → ALLOW. If nothing is clearly wrong, the action proceeds.
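The legacy rules are evaluated in order, and the first match wins. A minimal sketch of steps 2 to 5 follows, with custom deliberators omitted. The `legacy_tier_three` function and the `(weight, score)` tuple shape are hypothetical.

```python
def legacy_tier_three(ucs: float, trust: float, dimensions: dict) -> tuple:
    """Sketch of the legacy trust-threshold flow.

    `dimensions` maps a dimension name to a (weight, score) pair.
    Returns (verdict, modifications).
    """
    if trust > 0.7 and ucs > 0.5:
        return ("ALLOW", [])  # strong track record gets the benefit of the doubt
    if trust < 0.4:
        return ("ESCALATE", [])  # untrusted agent gets human review
    for weight, score in dimensions.values():
        if weight >= 1.3 and score < 0.4:
            # A critical dimension scored poorly: constrain the action.
            return ("MODIFY", ["reduce_scope", "require_confirmation"])
    return ("ALLOW", [])  # default: nothing is clearly wrong
```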

Custom Deliberators

Performance

| Tier | Typical Latency |
| --- | --- |
| Tier 1 (veto cases) | < 100 microseconds |
| Tier 2 (most cases) | < 1ms at p99 |
| Tier 3 (edge cases) | 1–5ms |

:::note
Governance adds less than a millisecond to the vast majority of agent decisions. Tier 3 is only invoked when Tier 2 produces an ambiguous result.
:::


Verdicts

Every governance evaluation produces a GovernanceVerdict — the output of the cascade. The verdict determines what happens to the agent's action.

Verdict Types

| Verdict | Meaning | What Happens |
| --- | --- | --- |
| ALLOW | Action is approved | Execution proceeds. Trust increases (+0.01). |
| DENY | Action is rejected | Execution blocked. Trust decreases (-0.05). Violation recorded. |
| MODIFY | Action approved with constraints | Execution proceeds with reduced scope and/or confirmation required. |
| ESCALATE | Human review required | Execution paused. Action queued for human approval. |
| SUSPEND | Agent suspended | All agent activity halted pending investigation. |

:::warning
A SUSPEND verdict means the agent cannot take any further actions until a human reviews and reinstates it via nomotic inspect <agent-id>.
:::

GovernanceVerdict Structure

Working with Verdicts

Accessing Verdict Details

Trust Impact

Every verdict feeds back into the trust calibrator:

| Event | Trust Change |
| --- | --- |
| ALLOW verdict | +0.01 |
| DENY verdict | -0.05 |
| Successful completion | +0.005 |
| Interrupted during execution | -0.03 |

Per-dimension trust is also updated. If a specific dimension vetoed or scored below 0.3, that dimension's trust decreases independently, even if overall trust remains stable.
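The overall trust deltas can be applied as a simple lookup. The sketch below is illustrative only: the event keys and the `apply_trust_event` function are hypothetical, clamping trust to the range 0.0 to 1.0 is an assumption, and per-dimension trust is not modeled.

```python
# Documented trust deltas, keyed by hypothetical event names.
TRUST_DELTAS = {
    "ALLOW": 0.01,
    "DENY": -0.05,
    "COMPLETED": 0.005,
    "INTERRUPTED": -0.03,
}


def apply_trust_event(trust: float, event: str) -> float:
    """Apply one trust delta, clamped to [0.0, 1.0] (clamping is assumed)."""
    return min(1.0, max(0.0, trust + TRUST_DELTAS[event]))
```

Because a DENY costs five times what an ALLOW earns, an agent needs five approved actions to recover the trust lost to a single denial.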

API

CLI
