Information-theoretic questioning protocol
This document is the SSOT for clarification strategy across chat, planning, and agent-to-agent handoffs.
Goals
- Minimize user effort while maximizing uncertainty reduction.
- Prefer high-diagnostic prompts over broad or redundant questions.
- Stop asking as soon as confidence and risk thresholds are met.
- Preserve auditability: each question has reason, expected gain, and stop rationale.
Question trigger policy
Ask a question only when at least one of these conditions is true:
- Ambiguous intent: multiple plausible actions exist with materially different outcomes.
- High consequence uncertainty: action is costly, irreversible, or policy-sensitive.
- Missing hard constraint: a required parameter is absent (target, scope, risk tolerance, deadline, etc.).
- Socrates medium-risk band: confidence is in the ask range and the contradiction is non-blocking.
Do not ask when:
- the request is unambiguous and low risk,
- additional questions are expected to provide negligible information gain,
- maximum clarification turns or user-time budget is reached.
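The trigger and suppression conditions above can be sketched as a single predicate. This is an illustrative sketch only; the structure and field names are assumptions, not the runtime's actual API:

```python
from dataclasses import dataclass

@dataclass
class ClarificationContext:
    # Illustrative fields; names are assumptions, not the runtime's API.
    plausible_actions: int          # materially different candidate actions
    high_consequence: bool          # costly, irreversible, or policy-sensitive
    missing_hard_constraint: bool   # required parameter absent
    in_ask_band: bool               # Socrates medium-risk confidence band
    expected_gain_bits: float       # projected entropy reduction
    min_gain_bits: float            # configured gain floor
    turns_used: int
    max_turns: int

def should_ask(ctx: ClarificationContext) -> bool:
    """Ask only when at least one trigger fires AND no suppression holds."""
    trigger = (
        ctx.plausible_actions >= 2
        or ctx.high_consequence
        or ctx.missing_hard_constraint
        or ctx.in_ask_band
    )
    suppress = (
        ctx.expected_gain_bits < ctx.min_gain_bits
        or ctx.turns_used >= ctx.max_turns
    )
    return trigger and not suppress
```

Note that suppression always wins: even a firing trigger yields no question once the gain floor or turn cap is hit.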
Question type selection
Use the smallest interaction that resolves the highest-value uncertainty.
Multiple-choice (multiple_choice)
Prefer when hypothesis space is known and bounded.
- Use 2-5 options (3 default).
- Options must be mutually exclusive when possible.
- Include a deliberate "other / none of the above" only when genuinely needed.
- Design unselected options to remain diagnostically useful (infer constraints/preferences).
Assumption-confirm (assumption_confirm)
Prefer when agent confidence in its inferred value is ≥ 0.80 and the value is not policy-sensitive or destructive.
- State the assumed value explicitly: "I'm assuming X. Correct me if wrong; otherwise I'll proceed."
- Include a default timeout: how long the agent waits before proceeding with the assumption.
- Include a brief impact note: what changes if the assumption is wrong.
- Do not use when acting on the assumption is irreversible; use multiple_choice or entry instead.
- Anti-pattern: stating the assumption confidently without a clear correction mechanism (obsequiousness trap).
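Under these rules, one possible shape for an assumption-confirm prompt is the sketch below; the wording and parameter names are illustrative assumptions, not a prescribed template:

```python
def assumption_confirm_prompt(assumed_value: str, impact_if_wrong: str,
                              timeout_s: int = 120) -> str:
    """Build an assumption-confirm prompt that states the assumption,
    offers an explicit correction mechanism, declares the default
    timeout, and notes the impact if the assumption is wrong."""
    return (
        f"I'm assuming {assumed_value}. Correct me if wrong; "
        f"otherwise I'll proceed in {timeout_s}s. "
        f"If this is wrong: {impact_if_wrong}."
    )
```

All three required elements (correction mechanism, timeout, impact note) appear in one short utterance, which keeps user cost low.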
Open-ended (open_ended)
Prefer when user intent space is broad or unknown.
- Ask exactly one targeted free-form prompt.
- Include a short frame to reduce interpretation variance.
- Follow with one narrow multiple-choice if remaining ambiguity persists.
Entry (entry)
Prefer for scalar/structured fields (IDs, ranges, dates, file paths, thresholds).
- Validate format immediately.
- Echo parsed value before execution.
- Re-ask only for invalid/unsafe values.
Information-theoretic scoring
Each candidate question is scored by expected value:
score = expected_information_gain_bits / expected_user_cost
Where:
- expected_information_gain_bits is the entropy reduction over active hypotheses.
- expected_user_cost approximates user burden (time, complexity, interruption).
Choose the highest-scoring candidate that passes policy constraints:
- expected_information_gain_bits >= min_information_gain_bits
- expected_user_cost <= max_expected_user_cost
- clarification_turn_index < max_clarification_turns
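A sketch of the scoring rule, assuming expected information gain is computed as prior entropy minus the answer-weighted posterior entropy over the active hypothesis set (function and variable names are illustrative):

```python
import math

def entropy_bits(probs):
    """Shannon entropy (bits) of a hypothesis distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def question_score(prior, posterior_by_answer, answer_probs, user_cost):
    """score = expected_information_gain_bits / expected_user_cost.

    `posterior_by_answer` holds the hypothesis distribution expected
    after each possible answer; gain is prior entropy minus the
    answer-weighted posterior entropy."""
    expected_posterior = sum(
        pa * entropy_bits(post)
        for pa, post in zip(answer_probs, posterior_by_answer)
    )
    gain = entropy_bits(prior) - expected_posterior
    return gain / user_cost, gain
```

For example, a binary question that splits four equally likely hypotheses into two equally likely pairs yields 1 bit of expected gain; at a user cost of 0.5 the score is 2.0.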
Structural question funnel
High-diagnostic questioning follows a three-stage funnel. Each stage runs only if the previous left material ambiguity.
- Intent: resolves the plan branch (open_ended or binary). Most tasks resolve here.
- Scope/constraint: resolves the execution envelope (multiple_choice or entry).
- Parameter confirm: confirms specifics for high-stakes or highly parameterized actions (assumption_confirm or entry).
For planning specifically:
- Is the goal unambiguous with clear scope? → Plan without asking.
- Does the goal map to N≥2 materially different plan shapes AND does EVPI exceed the threshold? → Ask ONE disambiguating question. See planning-meta/12-question-gate-standard.md.
- Is any high-risk step irreversible? → Confirm with assumption_confirm before that step executes.
- Is the plan thin but the missing detail specification-level (not intent-level)? → Auto-expand via auto_expand_thin_plan; ask only for genuine intent gaps.
Stopping rules
Stop clarification when any condition is met:
- confidence >= target_confidence
- marginal_information_gain_bits < min_information_gain_bits
- clarification_turn_index >= max_clarification_turns
- expected_user_cost > max_expected_user_cost
- contradiction/risk forces abstention or escalation
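Evaluated in order, the stopping rules can be sketched as a function that returns the stop reason to persist, or None to continue. The parameter and reason names mirror the rules above but are illustrative assumptions, not the runtime's actual API:

```python
def stop_reason(confidence, target_confidence, marginal_gain_bits,
                min_gain_bits, turn_index, max_turns,
                user_cost, max_user_cost, risk_abstain=False):
    """Return the first matching stop reason (persisted for telemetry
    and audit), or None when clarification should continue."""
    if confidence >= target_confidence:
        return "confidence_met"
    if marginal_gain_bits < min_gain_bits:
        return "negligible_gain"
    if turn_index >= max_turns:
        return "turn_cap"
    if user_cost > max_user_cost:
        return "cost_cap"
    if risk_abstain:
        return "risk_abstain"
    return None
```

Returning a named reason rather than a bare boolean is what makes the stop auditable downstream.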
Persist stop reason explicitly for telemetry and audit.
Attention and time-respect constraints
Questioning must be cost-aware with attention budget coupling:
- Penalize long clarification loops under high interrupt load.
- Raise gain threshold when attention budget is near exhaustion.
- Prefer concise multiple-choice in high temporal demand contexts.
Attention budget → EIG threshold table
The EIG threshold for question approval scales with focus depth and budget state:
| Budget / focus state | EIG threshold adjustment | Permitted question types |
|---|---|---|
| FocusDepth::Ambient, spend < 50% | None (use configured baseline) | All types |
| FocusDepth::Focused, spend 50–80% | +20% | All types; prefer multiple_choice |
| FocusDepth::Deep, spend > 80% | +50% | binary, assumption_confirm only |
| BudgetSignal::Critical | Questions suppressed | None; proceed on best inference |
| BudgetSignal::CostExceeded | Questions suppressed | None; proceed on safe default |
| interrupt_ewma > 0.8 | +50% (backlog penalty) | Defer non-critical; batch with next checkpoint |
MCP records estimated wall-time per session_id and can mirror those debits into the orchestrator's global attention budget. Cap override and mirror toggle: VOX_QUESTIONING_MAX_ATTENTION_MS and VOX_QUESTIONING_MIRROR_GLOBAL_ATTENTION; see Environment variables (SSOT).
Dynamic interruption control (runtime)
When VOX_ORCHESTRATOR_ATTENTION_ENABLED=true, MCP does not emit every model-proposed question immediately. Instead, the orchestrator runs evaluate_interruption using:
- information gain vs. normalized user cost (same SSOT ratio),
- live AttentionBudget (spent ratio, focus depth, interrupt EWMA),
- trust, contradiction, risk band, open session hints, and turn caps.
Outcomes: interrupt now (persist the question plus an AttentionEvent), defer, batch with an existing prompt, or proceed autonomously (metric-only). High-risk / abstain-band cases can still require a human before continuing. Answered clarifications append ClarificationAnswered attention rows via vox_questioning_submit_answer. VOX_ORCHESTRATOR_ATTENTION_ENABLED=false keeps the prior behavior (no dynamic deferral on this path).
Runtime now records policy-only outcomes (PolicyDeferred, PolicyProceedAuto) as first-class attention events, so calibration can learn from suppressed interruptions too (not only displayed prompts).
Vox.toml [orchestrator] can tune channel calibration via interruption_calibration (gain offsets, backlog penalty, trust-adjustment scale) without changing policy code.
Surface behavior differs:
- vox_submit_task: defer/proceed-auto record telemetry and continue the submit; require-human blocks unless the description carries an explicit marker ([approval:confirm], [approval:reviewed], [human-approved]).
- vox_a2a_send (pilot-visible escalation types): defer suppresses the send and returns decision=DeferUntilCheckpoint with deferred=true; proceed-auto suppresses the send and returns decision=ProceedAutonomously with deferred=false; require-human blocks.
- vox_plan / vox_replan / vox_plan_status: defer/proceed-auto suppress only the questioning trace; plan output still returns.
A2A clarification contract
For agent-to-agent clarification, persist these payload fields in a2a_messages.payload:
- clarification_intent (why clarification is needed)
- hypothesis_set_id
- question_kind
- expected_information_gain_bits
- expected_user_cost
- requested_evidence_dimensions
- urgency
- stop_policy
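A clarification_request payload carrying these fields might look like the following sketch. Every value is an illustrative example, not normative; the contract schemas referenced in this document define the authoritative shape:

```python
import json

# Illustrative a2a_messages.payload for a clarification_request;
# all values are examples, not normative.
payload = {
    "clarification_intent": "disambiguate deployment target",
    "hypothesis_set_id": "hs-example",
    "question_kind": "multiple_choice",
    "expected_information_gain_bits": 1.3,
    "expected_user_cost": 0.4,
    "requested_evidence_dimensions": ["environment", "risk_tolerance"],
    "urgency": "normal",
    "stop_policy": {"max_clarification_turns": 2},
}
encoded = json.dumps(payload)
```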
Recommended msg_type values:
- clarification_request
- clarification_response
- clarification_stop
Contract schemas:
- contracts/communication/a2a-clarification-payload.schema.json
- contracts/communication/interruption-decision.schema.json
Metrics (minimum set)
- Clarification trigger rate.
- Mean clarification turns per resolved task.
- Mean realized information gain per question.
- Gain-per-cost ratio.
- Multiple-choice option diagnostic power (selected + unselected).
- Clarification abandonment rate.
- Resolution latency after first clarification.
- A2A clarification round-trip latency.
Persistence requirements
Policy and telemetry must be persisted in dual-write form:
- Canonical publication artifact (publication_manifests).
- Searchable mirror (search_documents + search_document_chunks).
Question-level runtime telemetry must be queryable in VoxDB via dedicated questioning tables.
MCP (clients and agents): vox_questioning_pending returns open sessions, unanswered assistant prompts, and structured multiple-choice options (plus parsed belief_state_json). vox_questioning_submit_answer persists free-text answers and an optional selected_option_id; for multiple-choice questions, posteriors in belief_state_json and question_options.posterior_probability are updated. Env vars for attention caps, global budget mirroring, and task-gate bypass are listed under MCP / Socrates questioning in env-vars.md.
Related SSOTs
- docs/src/reference/socrates-protocol.md: confidence gate and Ask decision
- docs/src/reference/scientia-publication-worthiness-rules.md
- docs/src/reference/orchestration-unified.md
- docs/src/architecture/research-diagnostic-questioning-2026.md: full research grounding (POMDP, EVPI, gap analysis, implementation roadmap)
- docs/src/architecture/planning-meta/12-question-gate-standard.md: Tier 1 normative rules for planning-mode questioning