"Telemetry taxonomy and contracts SSOT"

Status

This document is a roadmap: it defines the target taxonomy and contract layering for a unified telemetry system. Shipped behavior today remains authoritative in code and in telemetry-metric-contract.

Goals

  • One vocabulary for event families, sensitivity, retention class, and transmission across CLI, MCP, orchestrator, Populi, CI, and clients.
  • No duplicate schema primaries: extend contracts/index.yaml rather than adding ad-hoc JSON schemas in scattered folders.
  • Keep content-bearing payloads out of the usage-telemetry namespace (see telemetry-trust-ssot).

Event family model (target)

Each logical event SHALL declare:

Field | Description
family | Stable grouping: benchmark, syntax_k, mcp_surface, mesh_control, questioning, workflow_journal, completion_ci, context_lifecycle_trace, mens_training_jsonl, …
metric_type | Value written to research_metrics.metric_type where applicable, or parallel column in domain tables
session_id_convention | Prefix per telemetry-metric-contract
schema_ref | URI or repo path to JSON Schema (or SQL comment + generated schema)
sensitivity_class | S0 coarse / S1 operational / S2 workspace-adjacent / S3 content-bearing
transmission_class | local_only | explicit_operator_export | approved_usage_upload (future)
owner_crate | Primary Rust owner for writes
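
As a rough illustration of the declaration above, the fields could be modeled as a plain Rust descriptor. This is a minimal sketch, not code from any shipped crate: the type names (EventFamilyDecl, SensitivityClass, TransmissionClass), the missing_fields helper, and the example_syntax_k constructor are all hypothetical, and the sensitivity and transmission values in the example are placeholders rather than values taken from a contract.

```rust
// Hypothetical descriptor types; only the field names come from the table above.

/// Sensitivity classes S0..S3, ordered from least to most sensitive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SensitivityClass {
    S0Coarse,
    S1Operational,
    S2WorkspaceAdjacent,
    S3ContentBearing,
}

/// Transmission classes; ApprovedUsageUpload is marked "future" in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransmissionClass {
    LocalOnly,
    ExplicitOperatorExport,
    ApprovedUsageUpload,
}

/// One logical event family declaration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EventFamilyDecl {
    pub family: &'static str,
    /// None when the family uses a parallel column in a domain table instead.
    pub metric_type: Option<&'static str>,
    pub session_id_convention: &'static str,
    pub schema_ref: &'static str,
    pub sensitivity_class: SensitivityClass,
    pub transmission_class: TransmissionClass,
    pub owner_crate: &'static str,
}

impl EventFamilyDecl {
    /// Returns the names of required string fields that were left empty.
    pub fn missing_fields(&self) -> Vec<&'static str> {
        let required = [
            ("family", self.family),
            ("session_id_convention", self.session_id_convention),
            ("schema_ref", self.schema_ref),
            ("owner_crate", self.owner_crate),
        ];
        required
            .iter()
            .filter(|(_, value)| value.is_empty())
            .map(|(name, _)| *name)
            .collect()
    }
}

/// Example declaration for the syntax_k family.
/// The sensitivity and transmission values are illustrative placeholders.
pub fn example_syntax_k() -> EventFamilyDecl {
    EventFamilyDecl {
        family: "syntax_k",
        metric_type: Some("syntax_k_event"),
        session_id_convention: "syntaxk:<repository_id>",
        schema_ref: "contracts/eval/syntax-k-event.schema.json",
        sensitivity_class: SensitivityClass::S1Operational, // placeholder
        transmission_class: TransmissionClass::LocalOnly,   // placeholder
        owner_crate: "vox-cli",
    }
}
```

Deriving Ord on the sensitivity enum lets callers write comparisons such as "at least S2" without a separate ranking table; whether the real implementation wants that ordering is an open design choice.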

Shipped metric_type constants (today)

From research_metrics_contract.rs (METRIC_TYPE_*). CI (vox ci data-ssot-guards) requires each literal to appear on this page or in telemetry-metric-contract.

metric_type | Typical session_id | Primary owner crate(s)
benchmark_event | bench:<repository_id> | vox-cli, vox-db
syntax_k_event | syntaxk:<repository_id> | vox-cli, vox-db
socrates_surface | mcp:<repository_id> | vox-mcp, vox-db
workflow_journal_entry | workflow:<repository_id> | vox-workflow-runtime, vox-db
populi_control_event | mens:<repository_id> | vox-cli, vox-mcp, vox-db
questioning_event | (linked session keys) | vox-mcp, vox-db
memory_hybrid_fusion | socrates:retrieval | vox-search, vox-ludus, vox-db
agent_exec_time | (no prefix, agent_exec_history) | vox-db
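
The prefix convention in the table can be sketched as a small lookup. The function names session_prefix and session_id are hypothetical and are not taken from research_metrics_contract.rs; the sketch covers only the rows whose session_id has the form prefix:<repository_id> and returns None for the rest.

```rust
/// Hypothetical helper: maps a metric_type literal to its session_id prefix,
/// for the rows in the table that follow the `<prefix>:<repository_id>` form.
pub fn session_prefix(metric_type: &str) -> Option<&'static str> {
    match metric_type {
        "benchmark_event" => Some("bench"),
        "syntax_k_event" => Some("syntaxk"),
        "socrates_surface" => Some("mcp"),
        "workflow_journal_entry" => Some("workflow"),
        "populi_control_event" => Some("mens"),
        // questioning_event, memory_hybrid_fusion and agent_exec_time use
        // other conventions in the table and are deliberately not covered.
        _ => None,
    }
}

/// Builds a session_id such as "bench:<repository_id>" when a prefix is known.
pub fn session_id(metric_type: &str, repository_id: &str) -> Option<String> {
    session_prefix(metric_type).map(|prefix| format!("{}:{}", prefix, repository_id))
}
```

Under these assumptions, session_id("benchmark_event", "repo1") yields Some("bench:repo1"), while session_id("agent_exec_time", "repo1") yields None.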

Contract inventory (machine)

Area | Contract path | Notes
Completion CI | contracts/telemetry/completion-*.v1.schema.json | Ingest → ci_completion_*
Context lifecycle tracing | contracts/orchestration/context-lifecycle-telemetry.schema.json | Tracing fields, not necessarily DB rows
Syntax-K payload | contracts/eval/syntax-k-event.schema.json | metadata_json for syntax_k_event rows (metric_type above)
Interruption / attention | contracts/communication/interruption-decision.schema.json | Attention / interruption plane; normalized decision envelope
(planned) Usage telemetry | contracts/telemetry/usage-event-*.schema.json | Not shipped yet. Add files and contracts/index.yaml rows before wiring producers; see implementation blueprint.

Target: single telemetry contract registry row pattern

Future work SHOULD register each family in contracts/index.yaml with:

  • description
  • enforced_by including at least one of: vox ci command-compliance, vox ci data-ssot-guards, crate tests
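
The row pattern above could be checked mechanically along these lines. This is a sketch under stated assumptions: the actual shape of contracts/index.yaml is not shown in this document, so RegistryRow and validate_row are hypothetical names, and the row is assumed to have already been parsed into plain strings.

```rust
/// Enforcers named in this document; a row must list at least one of them.
pub const KNOWN_ENFORCERS: [&str; 3] = [
    "vox ci command-compliance",
    "vox ci data-ssot-guards",
    "crate tests",
];

/// Hypothetical in-memory view of one registry row (YAML parsing omitted).
pub struct RegistryRow {
    pub description: String,
    pub enforced_by: Vec<String>,
}

/// Checks that the row has a non-empty description and at least one known enforcer.
pub fn validate_row(row: &RegistryRow) -> Result<(), String> {
    if row.description.trim().is_empty() {
        return Err("description is empty".to_string());
    }
    let has_known_enforcer = row
        .enforced_by
        .iter()
        .any(|enforcer| KNOWN_ENFORCERS.contains(&enforcer.as_str()));
    if !has_known_enforcer {
        return Err("enforced_by lists no known enforcer".to_string());
    }
    Ok(())
}
```

A row may list additional enforcers beyond the three known ones; the sketch only requires that at least one of them matches.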

Transmission classes (normative definitions)

  • local_only: never leaves the machine unless the user performs an explicit export (file copy, support bundle). Includes default structured tracing and local DB rows.
  • explicit_operator_export: gated by CLI/MCP action and documented in telemetry-client-disclosure-ssot.
  • approved_usage_upload: reserved for a future central sink; requires separate policy doc, Clavis-backed credentials per AGENTS.md, and CHANGELOG entry per release.
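
One possible reading of these definitions, expressed as a gate function. The Trigger enum and may_leave_machine are hypothetical names introduced for illustration, and the strict pairing of class to trigger is an interpretation of the text above rather than shipped policy.

```rust
/// Transmission classes as defined above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransmissionClass {
    LocalOnly,
    ExplicitOperatorExport,
    ApprovedUsageUpload,
}

/// Hypothetical description of what initiated an attempt to move data off the machine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    /// Automatic or background activity with no human action.
    Background,
    /// The user copies a file or builds a support bundle.
    ExplicitUserExport,
    /// A documented CLI/MCP export action.
    OperatorAction,
}

/// Returns true only when the trigger matches what the class permits.
pub fn may_leave_machine(class: TransmissionClass, trigger: Trigger) -> bool {
    match (class, trigger) {
        (TransmissionClass::LocalOnly, Trigger::ExplicitUserExport) => true,
        (TransmissionClass::ExplicitOperatorExport, Trigger::OperatorAction) => true,
        // Reserved for a future central sink: nothing is allowed out today.
        (TransmissionClass::ApprovedUsageUpload, _) => false,
        // Every other combination, including all background activity, is denied.
        _ => false,
    }
}
```

The deny-by-default final arm means any class or trigger added later stays blocked until someone writes an explicit rule for it.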

Forbidden in usage-telemetry schemas

The following MUST NOT appear in approved_usage_upload or default local_only usage events without S3 classification and a separate consent path:

  • raw source text, prompts, completions
  • full MCP tool arguments_json (use hash/omit patterns from mcp_privacy.rs)
  • absolute paths, repository remotes, user home segments in stack traces
  • retrieval query text and document bodies
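
The rule above can be sketched as a key-level check on a usage event before it is written. The key names in FORBIDDEN_KEYS are invented for illustration (only arguments_json appears in this document), and the functions below are hypothetical; they do not reproduce the hash/omit patterns in mcp_privacy.rs.

```rust
/// Illustrative key names standing in for the forbidden categories above.
/// Only `arguments_json` is named in this document; the rest are assumptions.
pub const FORBIDDEN_KEYS: [&str; 8] = [
    "source_text",
    "prompt",
    "completion",
    "arguments_json",
    "absolute_path",
    "repository_remote",
    "query_text",
    "document_body",
];

/// Returns the forbidden keys that appear in the event, in input order.
pub fn forbidden_keys_present(keys: &[&str]) -> Vec<String> {
    keys.iter()
        .filter(|key| FORBIDDEN_KEYS.contains(*key))
        .map(|key| (*key).to_string())
        .collect()
}

/// Accepts the event when it carries no forbidden keys, or when it is both
/// classified S3 and covered by a separate consent path.
pub fn check_usage_event(
    keys: &[&str],
    is_s3: bool,
    has_consent: bool,
) -> Result<(), Vec<String>> {
    let offending = forbidden_keys_present(keys);
    if offending.is_empty() || (is_s3 && has_consent) {
        Ok(())
    } else {
        Err(offending)
    }
}
```

A key-name check like this catches only obvious cases; content embedded in otherwise allowed fields, such as paths inside stack traces, would still need value-level scrubbing.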