"Trust Reliability Layer (SSOT)"

This document defines the current trust/reliability architecture used by orchestrator routing, Socrates telemetry, endpoint reliability, and downstream analytics.

Why this exists

The codebase historically had multiple trust-like signals that were useful but partially disconnected:

agent_reliability (Laplace-smoothed task outcomes)
in-memory AgentTrustScore (attention/approval behavior)
endpoint EWMA metrics (endpoint_reliability)
Socrates turn telemetry (socrates_surface)
file-based MENS/eval artifacts

The unified trust layer adds a common vocabulary and persistence model so these signals can be queried and used together.

Canonical trust vocabulary

Trust observations are recorded as:

entity_type: agent, endpoint, model, skill, workflow, repository, evidence_bundle
entity_id: stable identifier for the entity
dimension: e.g. task_completion, factuality, contradiction_rate, refusal_propensity, latency_reliability
scope: domain, task_class, provider, model_id, repository_id
value + confidence: observation_value, confidence_weight, sample_size
provenance: source_kind, artifact_ref, metadata_json, created_at_ms
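As an illustration of this vocabulary, the record could be modelled as follows. This is a minimal sketch, not the project's actual type: the field names come from the list above, but the Python types, the defaults, the optional scope fields, and the validation rules (including the assumption that observation_value lies in [0, 1]) are all hypothetical.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Optional

# Entity types taken from the vocabulary above.
ENTITY_TYPES = {
    "agent", "endpoint", "model", "skill",
    "workflow", "repository", "evidence_bundle",
}


@dataclass(frozen=True)
class TrustObservation:
    # Identity
    entity_type: str
    entity_id: str
    dimension: str
    # Value + confidence (defaults are assumptions)
    observation_value: float
    confidence_weight: float = 1.0
    sample_size: int = 1
    # Scope (assumed optional; None means "unscoped")
    domain: Optional[str] = None
    task_class: Optional[str] = None
    provider: Optional[str] = None
    model_id: Optional[str] = None
    repository_id: Optional[str] = None
    # Provenance
    source_kind: str = "unknown"
    artifact_ref: Optional[str] = None
    metadata_json: str = "{}"
    created_at_ms: int = field(default_factory=lambda: int(time.time() * 1000))

    def __post_init__(self) -> None:
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity_type: {self.entity_type!r}")
        # Assumption: values are already normalized to [0, 1].
        if not 0.0 <= self.observation_value <= 1.0:
            raise ValueError("observation_value must be in [0, 1]")
        # Fail early if metadata_json is not valid JSON.
        json.loads(self.metadata_json)
```

A frozen dataclass is used here because observations are described as append-only evidence, so a record should not change after it is created.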

Storage model

Two database tables are the SSOT:

trust_observations: append-only evidence log for replay/audit.
trust_rollups: materialized scoped rollups keyed by (entity_type, entity_id, dimension, scope...).

Current implementation:

each observation is inserted into trust_observations
each insert updates trust_rollups.score with an exponentially weighted moving average (EWMA)
rollups retain sample_size, ewma_alpha, and updated_at_ms
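The rollup update can be sketched as a standard EWMA step. This is an illustrative sketch only: the Rollup type, the apply_observation helper, the alpha value of 0.2, and the choice to seed the score with the first observation are assumptions, and the sketch ignores confidence_weight.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rollup:
    score: float
    sample_size: int
    ewma_alpha: float
    updated_at_ms: int


def apply_observation(
    rollup: Optional[Rollup],
    value: float,
    now_ms: int,
    alpha: float = 0.2,  # illustrative default, not the project's value
) -> Rollup:
    """Fold one observation into a rollup using an EWMA update."""
    if rollup is None:
        # Assumption: the first observation seeds the score directly.
        return Rollup(score=value, sample_size=1, ewma_alpha=alpha, updated_at_ms=now_ms)
    a = rollup.ewma_alpha
    # Standard EWMA: new = alpha * observation + (1 - alpha) * previous
    new_score = a * value + (1.0 - a) * rollup.score
    return Rollup(
        score=new_score,
        sample_size=rollup.sample_size + 1,
        ewma_alpha=a,
        updated_at_ms=now_ms,
    )
```

With alpha = 0.2, a success (1.0) followed by a failure (0.0) moves the score from 1.0 to 0.8, so a single bad outcome dents the score without erasing prior history.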

Runtime producers

Current producers that write into the trust layer:

orchestrator task completion/failure writes agent + task_completion observations
endpoint reliability writes endpoint observations for factuality/contradiction/infra dimensions
Socrates surface telemetry writes model observations for factuality/contradiction/refusal dimensions

When persistence writes fail in the task completion/failure paths, the orchestrator emits explicit degradation signals under these shared context keys:

orchestrator/persistence_health/trust/reliability_observation
orchestrator/persistence_health/trust/observation
orchestrator/persistence_health/lineage/task_completed
orchestrator/persistence_health/lineage/task_failed

Each key carries status, degraded_count, last_error, and last_error_unix_ms so operators can detect silent durability regressions.
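One way to picture the per-key health record is sketched below. The four field names come from the text above; the PersistenceHealth type, the record_write_failure helper, the "ok" and "degraded" status strings, and the use of a plain dict as the shared context are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PersistenceHealth:
    status: str = "ok"  # assumed values: "ok" or "degraded"
    degraded_count: int = 0
    last_error: Optional[str] = None
    last_error_unix_ms: Optional[int] = None


def record_write_failure(
    context: Dict[str, PersistenceHealth],
    key: str,
    error: str,
    now_ms: int,
) -> PersistenceHealth:
    """Mark a persistence path as degraded after a failed write."""
    health = context.setdefault(key, PersistenceHealth())
    health.status = "degraded"
    health.degraded_count += 1
    health.last_error = error
    health.last_error_unix_ms = now_ms
    return health


# Usage: a failed trust observation write in the task-completion path.
shared_context: Dict[str, PersistenceHealth] = {}
record_write_failure(
    shared_context,
    "orchestrator/persistence_health/trust/observation",
    "database is locked",  # hypothetical error text
    now_ms=1_700_000_000_000,
)
```

Keeping a running degraded_count alongside the last error lets an operator distinguish a one-off failure from a path that fails on every write.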

The orchestrator also writes outbox lifecycle health to orchestrator/persistence_outbox_lifecycle with queued, pruned_last_run, retried_last_run, replayed_last_run, and last_run_unix_ms. Replay diagnostics now include replay_failed_last_run (count of replay attempts that failed in the latest tick) and replay_failed_by_op (map keyed by replay operation label, usually replay.op, with unknown fallback) so operators can identify stuck replay classes without inspecting raw queue payloads.
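The two replay diagnostics can be derived from the failed replay attempts of a tick roughly as follows. This is a sketch under assumptions: the entry shape (a dict with an optional nested "replay" object holding "op") and the summarize_replay_failures helper are hypothetical; only the output field names and the unknown fallback come from the description above.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_replay_failures(failed_entries: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count failed replay attempts in one tick, grouped by operation label."""
    by_op: Counter = Counter()
    for entry in failed_entries:
        replay = entry.get("replay") or {}
        # Fall back to "unknown" when the entry carries no operation label.
        label = replay.get("op") or "unknown"
        by_op[label] += 1
    return {
        "replay_failed_last_run": sum(by_op.values()),
        "replay_failed_by_op": dict(by_op),
    }
```

Grouping by label means a single stuck operation class shows up as one large count instead of being hidden inside the total.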

Runtime consumers

Current consumers:

routing uses scoped agent task_completion trust rollups both as a minimum-score floor and as a weighted utility term
vox db reliability-list --domain trust shows trust rollups for operators
MCP trust tools:
- vox_db_trust_rollups lists scoped rollup rows
- vox_db_trust_summary returns grouped aggregates (by dimension, domain, entity type, or combined keys)
- vox_db_trust_drift compares recent vs prior window means on raw observations
- vox_db_trust_propagate runs domain-clique affinity smoothing over model rollups (optionally persisting to *_propagated dimensions)
vox_db_trust_drift can now include forensic payloads when requested:
- include_raw_observations: true returns raw trust_observations rows (optionally filtered by task_id/since_ms/raw_limit)
- include_lineage_for_task: true with task_id and repository scope returns task lineage rows for trust/lineage correlation
vox ci mens-scorecard ingest-trust --summary <path> ingests a validated vox_mens_scorecard_summary_v1 summary.json into trust_observations / rollups for the workspace repository id
vox_scientia_worthiness_evaluate with with_live_trust: true attaches live_trust_rollups summaries for the workspace repository when VoxDb is connected
MCP vox_orchestrator_status now includes persistence_outbox_lifecycle so clients can read outbox replay health (replayed_last_run, replay_failed_last_run, replay_failed_by_op) without direct context-store access
MCP also provides dedicated outbox inspection tools: vox_orchestrator_persistence_outbox_lifecycle (typed lifecycle snapshot) and vox_orchestrator_persistence_outbox_queue (queued lane entries with optional lane filter and replay redaction)
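The recent-vs-prior comparison that vox_db_trust_drift performs on raw observations can be illustrated with a small sketch. It is not the tool's implementation: the window_mean_drift helper, the two equal-length adjacent windows, the half-open boundaries, and the (created_at_ms, value) tuple shape are all assumptions.

```python
from statistics import fmean
from typing import Iterable, Optional, Tuple


def window_mean_drift(
    observations: Iterable[Tuple[int, float]],  # (created_at_ms, observation_value)
    now_ms: int,
    window_ms: int,
) -> Optional[float]:
    """Return mean(recent window) - mean(prior window), or None if either is empty."""
    recent_start = now_ms - window_ms
    prior_start = now_ms - 2 * window_ms
    recent, prior = [], []
    for created_at_ms, value in observations:
        if recent_start < created_at_ms <= now_ms:
            recent.append(value)
        elif prior_start < created_at_ms <= recent_start:
            prior.append(value)
    if not recent or not prior:
        # Not enough evidence on one side to compare.
        return None
    return fmean(recent) - fmean(prior)
```

A negative result means the recent window scores lower than the prior one. Returning None for an empty window avoids reporting drift that is really just missing data.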

Notes on score semantics

trust_rollups.score is normalized to [0, 1] and interpreted as “higher is better”.

For inverse-risk metrics, writers invert before recording (1 - risk).
dimension names can represent the source signal, but stored score remains normalized-goodness.
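The inversion rule can be sketched as a small normalization helper applied by writers before recording. The to_goodness name and its inverse_risk flag are hypothetical; only the 1 - risk transform and the [0, 1] range come from the notes above.

```python
def to_goodness(value: float, inverse_risk: bool = False) -> float:
    """Normalize a raw signal so that higher always means better."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be in [0, 1]")
    # Risk-style signals (e.g. contradiction_rate) are flipped before storage.
    return 1.0 - value if inverse_risk else value


# A contradiction rate of 0.25 is recorded as a goodness score of 0.75.
stored = to_goodness(0.25, inverse_risk=True)
```

Doing the flip at write time keeps every consumer free of per-dimension direction logic, at the cost that a dimension name such as contradiction_rate no longer matches the direction of the stored number.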

Known gaps (next iterations)

extend domain tagging and policy-profile attribution beyond primary MCP chat/plan/edit surfaces
automated calibration transforms (e.g. isotonic) on top of drift reports, going beyond the current windowed mean comparison
richer graph propagation than same-domain clique affinity (explicit trust edges, provider graphs)
per-validation-failure-class dimensions (schema_conformance, semantic_policy, repair_exhaustion), proposed in research-llm-output-mediation-validation-2026.md §8.4 as part of the unified LLM Mediation Layer (LML) design. Today, trust signals capture per-task outcomes but not per-inference-call validation failure modes.
