Socrates protocol — single source of truth
The Socrates protocol is Vox’s unified anti-hallucination pipeline: retrieve evidence, verify claims, calibrate confidence, gate outputs, and persist telemetry. Implementation spans vox-socrates-policy, vox-orchestrator, vox-toestub (review), vox-mcp, and Codex schema extensions.
Questioning strategy (when to ask, what question type to ask, and when to stop) is specified in the companion questioning SSOT.
Protocol states
- Retrieve — Hybrid lexical + vector retrieval; every factual claim should bind to EvidenceItem records. Pure fusion helpers in crates/vox-db/src/retrieval.rs (RetrievalResult, fuse_hybrid_results) preserve evidence_source, timestamps, optional query_id, supporting_claim_ids, and contradiction_hints across modality merge. In-process memory search uses HybridSearchHit (potential_contradiction) in vox-orchestrator.
- Verify — Claims checked against evidence; contradictions increase contradiction_ratio.
- Calibrate — Produce ConfidenceSignal (score, coverage, contradiction ratio).
- Gate — RiskDecision: Answer, Ask, or Abstain via ConfidencePolicy::evaluate_risk_decision in crate vox-socrates-policy.
- Persist — Log outcomes to research_metrics / eval_runs / reliability tables; update routing weights.
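The Verify and Calibrate states can be illustrated with a minimal sketch. The field names follow the ConfidenceSignal description above, but the Verdict enum, the definition of coverage as the supported fraction of claims, and the scoring rule are all assumptions made for illustration; they are not taken from vox-socrates-policy.

```rust
/// Hypothetical verdict for one claim after checking it against evidence.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Verdict {
    Supported,
    Contradicted,
    Unverified,
}

/// Illustrative mirror of the ConfidenceSignal fields named in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ConfidenceSignal {
    pub score: f64,
    pub coverage: f64,
    pub contradiction_ratio: f64,
}

/// Derive a signal from per-claim verdicts.
pub fn calibrate(verdicts: &[Verdict]) -> ConfidenceSignal {
    if verdicts.is_empty() {
        // Nothing was verified, so report zero confidence rather than divide by zero.
        return ConfidenceSignal { score: 0.0, coverage: 0.0, contradiction_ratio: 0.0 };
    }
    let total = verdicts.len() as f64;
    let supported = verdicts.iter().filter(|v| **v == Verdict::Supported).count() as f64;
    let contradicted = verdicts.iter().filter(|v| **v == Verdict::Contradicted).count() as f64;

    // Assumption: coverage is the fraction of claims backed by evidence.
    let coverage = supported / total;
    // Each contradicted claim raises the contradiction ratio.
    let contradiction_ratio = contradicted / total;
    // Assumed scoring rule: coverage discounted by the contradiction ratio.
    let score = (coverage * (1.0 - contradiction_ratio)).clamp(0.0, 1.0);

    ConfidenceSignal { score, coverage, contradiction_ratio }
}
```

With three supported claims and one contradicted claim, this sketch yields a coverage of 0.75 and a contradiction ratio of 0.25, which the Gate state would then compare against policy thresholds.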
Telemetry and hallucination-risk proxies
- MCP tools (vox_chat_message, vox_plan, vox_replan, vox_plan_status, vox_inline_edit, vox_ghost_text): when Codex is attached, each successful turn appends research_metrics with metric_type = socrates_surface, session_id = mcp:<repository_id>, metric_value = hallucination_risk_proxy(...), and JSON metadata SocratesSurfaceTelemetry in crates/vox-db/src/socrates_telemetry.rs (re-exported from vox_db). Logs also emit target vox_socrates_telemetry. Effective thresholds follow OrchestratorConfig::effective_socrates_policy() (merges vox-socrates-policy with optional config overrides).
- vox_plan adequacy (Codex): when plan_telemetry_session_id is set, plan_sessions.iterative_loop_metadata_json may include adequacy_before, adequacy_after (and/or legacy adequacy), adequacy_improved_heuristic, task_count_before_refine / task_count_after_refine, aggregate_unresolved_risk, plan_depth, and initial_plan_max_output_tokens. The tool response adds plan_adequacy_score, plan_too_thin, adequacy_reason_codes, and plan_depth_effective. See plan adequacy.
- Hybrid memory retrieval (vox_search::MemorySearchEngine::hybrid_search): used by MCP unified retrieval triggers (vox_chat_message autonomous preamble and vox_memory_search) via vox_search, appends memory_hybrid_fusion under session socrates:retrieval with contradiction-rate metadata.
- Rollups — VoxDb::aggregate_socrates_surface_metrics, VoxDb::record_socrates_eval_summary (writes eval_runs with answer/abstain rates and a quality proxy derived from mean risk proxy).
- CLI — vox codex socrates-metrics prints the aggregate JSON; vox codex socrates-eval-snapshot --eval-id <stable-id> appends an eval_runs row (same DB resolution as other vox codex commands). Fails if there are zero socrates_surface rows in the scan window (prevents bogus “perfect” scores). For a nightly job: set VOX_DB_* (or local path), then e.g. vox codex socrates-eval-snapshot --eval-id nightly-$(date +%F) (POSIX) or a CI step with a unique eval_id per run.
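The rollup behavior described above (answer/abstain rates, a quality proxy derived from the mean risk proxy, and a hard failure on an empty scan window) can be sketched as follows. The SurfaceRow and EvalSummary types and the rule quality = 1 - mean risk are hypothetical; the actual VoxDb rollup functions may compute the proxy differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Decision {
    Answer,
    Ask,
    Abstain,
}

/// Hypothetical in-memory view of one socrates_surface row.
#[derive(Debug, Clone, Copy)]
pub struct SurfaceRow {
    pub decision: Decision,
    pub risk_proxy: f64,
}

/// Hypothetical summary shape for an eval_runs row.
#[derive(Debug, PartialEq)]
pub struct EvalSummary {
    pub rows: usize,
    pub answer_rate: f64,
    pub abstain_rate: f64,
    pub mean_risk_proxy: f64,
    pub quality_proxy: f64,
}

/// Aggregate rows from a scan window into a summary.
pub fn summarize(rows: &[SurfaceRow]) -> Result<EvalSummary, String> {
    // An empty window would otherwise produce a bogus "perfect" score.
    if rows.is_empty() {
        return Err("no socrates_surface rows in scan window".to_string());
    }
    let n = rows.len() as f64;
    let answers = rows.iter().filter(|r| r.decision == Decision::Answer).count() as f64;
    let abstains = rows.iter().filter(|r| r.decision == Decision::Abstain).count() as f64;
    let mean_risk = rows.iter().map(|r| r.risk_proxy).sum::<f64>() / n;

    Ok(EvalSummary {
        rows: rows.len(),
        answer_rate: answers / n,
        abstain_rate: abstains / n,
        mean_risk_proxy: mean_risk,
        // Assumed rule: lower mean risk means higher quality.
        quality_proxy: (1.0 - mean_risk).clamp(0.0, 1.0),
    })
}
```

Returning an error for the empty case mirrors the CLI behavior: a nightly job that finds no telemetry fails loudly instead of recording a misleading snapshot.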
Canonical JSON shapes (orchestrator / MCP)
Input (task or turn context)
{
  "risk_budget": "normal",
  "factual_mode": true,
  "required_citations": 1
}
Output envelope (optional socrates on MCP chat / plan / inline / ghost tools)
{
  "risk_decision": "answer",
  "confidence_estimate": 0.82,
  "contradiction_ratio": 0.05
}
(risk_decision is serialized from vox_socrates_policy::RiskDecision.)
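As a rough illustration of the output envelope, the sketch below builds the JSON by hand using only the standard library. The SocratesEnvelope type and its to_json method are hypothetical, and the lowercase spellings "ask" and "abstain" are assumed by analogy with the "answer" value shown above; the real crate may use a serialization library instead.

```rust
/// Illustrative stand-in for vox_socrates_policy::RiskDecision.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RiskDecision {
    Answer,
    Ask,
    Abstain,
}

impl RiskDecision {
    /// Wire spelling; only "answer" is confirmed by the example above.
    pub fn as_str(self) -> &'static str {
        match self {
            RiskDecision::Answer => "answer",
            RiskDecision::Ask => "ask",
            RiskDecision::Abstain => "abstain",
        }
    }
}

/// Hypothetical envelope carrying the three fields of the output shape.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SocratesEnvelope {
    pub risk_decision: RiskDecision,
    pub confidence_estimate: f64,
    pub contradiction_ratio: f64,
}

impl SocratesEnvelope {
    /// Render a compact JSON object. Callers must pass finite numbers,
    /// because NaN and infinity have no JSON representation.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"risk_decision\":\"{}\",\"confidence_estimate\":{},\"contradiction_ratio\":{}}}",
            self.risk_decision.as_str(),
            self.confidence_estimate,
            self.contradiction_ratio
        )
    }
}
```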
Handoff extension (HandoffPayload)
confidence_signal, unresolved_claims, required_checks; see crates/vox-orchestrator/src/handoff.rs in the repo.
Invariants
- No high-confidence factual assertion without linked evidence when factual_mode is true.
- Abstain when normalized confidence is below ConfidencePolicy::abstain_threshold or contradiction ratio exceeds max_contradiction_ratio_for_answer.
- Unresolved contradictions block Answer; gate returns Abstain or Ask per policy.
- Ask decisions should follow information-theoretic question selection and stop rules from the questioning SSOT.
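The invariants above can be read as an ordered decision procedure. The sketch below is one possible reading, not the body of ConfidencePolicy::evaluate_risk_decision: the GateInput type, the ask_threshold field, and the choice to return Ask (rather than Abstain) for missing evidence or unresolved contradictions are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RiskDecision {
    Answer,
    Ask,
    Abstain,
}

/// Illustrative policy; ask_threshold is a hypothetical extra field.
#[derive(Debug, Clone, Copy)]
pub struct ConfidencePolicy {
    pub abstain_threshold: f64,
    pub ask_threshold: f64,
    pub max_contradiction_ratio_for_answer: f64,
}

/// Hypothetical bundle of everything the gate needs for one turn.
#[derive(Debug, Clone, Copy)]
pub struct GateInput {
    pub confidence: f64,
    pub contradiction_ratio: f64,
    pub factual_mode: bool,
    pub linked_evidence: usize,
    pub required_citations: usize,
    pub unresolved_contradictions: usize,
}

pub fn evaluate(policy: &ConfidencePolicy, input: &GateInput) -> RiskDecision {
    // Abstain on low confidence or an excessive contradiction ratio.
    if input.confidence < policy.abstain_threshold
        || input.contradiction_ratio > policy.max_contradiction_ratio_for_answer
    {
        return RiskDecision::Abstain;
    }
    // Unresolved contradictions block Answer (assumed here to trigger Ask).
    if input.unresolved_contradictions > 0 {
        return RiskDecision::Ask;
    }
    // Factual mode requires linked evidence before asserting anything.
    if input.factual_mode && input.linked_evidence < input.required_citations {
        return RiskDecision::Ask;
    }
    // Middling confidence: ask a clarifying question instead of answering.
    if input.confidence < policy.ask_threshold {
        return RiskDecision::Ask;
    }
    RiskDecision::Answer
}
```

Checking the abstain conditions first means that no amount of linked evidence can rescue a turn whose confidence or contradiction ratio already violates policy.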
Shared policy crate
Numeric defaults and risk classification live in vox-socrates-policy — do not duplicate magic thresholds in prompts or filters; import or configure via ConfidencePolicy and ConfidencePolicyOverride merge in the orchestrator. Reputation routing: blend weight for Socrates reputation signals is configurable via OrchestratorConfig::socrates_reputation_weight and env VOX_ORCHESTRATOR_SOCRATES_REPUTATION_WEIGHT (see vox-orchestrator config.rs).
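A minimal sketch of the override merge and the reputation-weight setting, under stated assumptions: the two policy fields are the ones named in the invariants, the override uses optional fields that fall back to the base policy, and the weight is clamped to [0, 1]. The function names merge and parse_reputation_weight are hypothetical, and the real ConfidencePolicyOverride may carry more fields.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ConfidencePolicy {
    pub abstain_threshold: f64,
    pub max_contradiction_ratio_for_answer: f64,
}

/// Each field is optional; None keeps the base policy's value.
#[derive(Debug, Clone, Copy, Default)]
pub struct ConfidencePolicyOverride {
    pub abstain_threshold: Option<f64>,
    pub max_contradiction_ratio_for_answer: Option<f64>,
}

/// Produce the effective policy from crate defaults plus config overrides.
pub fn merge(base: ConfidencePolicy, ov: ConfidencePolicyOverride) -> ConfidencePolicy {
    ConfidencePolicy {
        abstain_threshold: ov.abstain_threshold.unwrap_or(base.abstain_threshold),
        max_contradiction_ratio_for_answer: ov
            .max_contradiction_ratio_for_answer
            .unwrap_or(base.max_contradiction_ratio_for_answer),
    }
}

/// Parse a raw value such as the contents of
/// VOX_ORCHESTRATOR_SOCRATES_REPUTATION_WEIGHT. Unset, unparsable, or
/// non-finite input falls back to the default (assumed behavior).
pub fn parse_reputation_weight(raw: Option<&str>, default: f64) -> f64 {
    raw.and_then(|s| s.trim().parse::<f64>().ok())
        .filter(|w| w.is_finite())
        .map(|w| w.clamp(0.0, 1.0))
        .unwrap_or(default)
}
```

Keeping the thresholds in one struct and merging overrides on top is what allows prompts and filters to avoid hard-coded numbers: they read the effective policy instead.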
Rollout
- Shadow — OrchestratorConfig.socrates_gate_shadow: compute and log SocratesOutcome without blocking completion.
- Enforce — OrchestratorConfig.socrates_gate_enforce: failed gate requeues task with structured remediation (when task carries SocratesTaskContext).
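The two rollout modes can be sketched as a small decision function. The flag names match the OrchestratorConfig fields above; the RolloutFlags and GateAction types are hypothetical, as are two behaviors the text leaves open: enforce taking precedence when both flags are set, and a failed gate without SocratesTaskContext falling back to log-only.

```rust
/// Illustrative subset of the orchestrator config.
#[derive(Debug, Clone, Copy)]
pub struct RolloutFlags {
    pub socrates_gate_shadow: bool,
    pub socrates_gate_enforce: bool,
}

/// Hypothetical outcome of applying the gate to a finished task.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GateAction {
    /// Gate disabled: complete without evaluating.
    Complete,
    /// Evaluate and log the outcome, but do not block completion.
    CompleteAndLog,
    /// Send the task back with structured remediation.
    Requeue,
}

pub fn apply_gate(flags: RolloutFlags, gate_passed: bool, has_task_context: bool) -> GateAction {
    // Enforce mode only requeues failed tasks that carry Socrates context.
    if flags.socrates_gate_enforce && !gate_passed && has_task_context {
        return GateAction::Requeue;
    }
    // Either mode computes and logs the outcome.
    if flags.socrates_gate_shadow || flags.socrates_gate_enforce {
        return GateAction::CompleteAndLog;
    }
    GateAction::Complete
}
```

Running in shadow mode first makes it possible to inspect how often the gate would have fired before any task is actually requeued.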