Telemetry retention and sensitivity SSOT

Status

Roadmap: the sensitivity classes below are normative for future implementation. The authoritative sources for current TTLs are retention-policy.yaml and db_retention.

Sensitivity classes

| Class | Definition | Examples |
|---|---|---|
| S0 | Coarse counters, version strings, bucketed timings | Aggregated benchmark names, build timing buckets |
| S1 | Operational metadata without user content | repository_id labels, mesh event names, model ids |
| S2 | Workspace-adjacent: can infer project shape | Relative paths in CI findings, repo-scoped session keys, cross-repo query metadata (see telemetry-metric-contract) |
| S3 | Content-bearing | Chat text, prompts, tool args (full), retrieval hits, transcripts |

Rule: centralized “usage telemetry” MUST stay at S0–S1 unless explicitly classified as S2 with user/org opt-in and documented re-identification risk.
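
The rule above can be read as a small gate function. The following is a minimal sketch, not the project's implementation: the `Sensitivity` enum, the `allowed_in_usage_telemetry` function, and its parameter names are hypothetical and exist only to illustrate the S0–S1 default and the S2 exception.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Sensitivity classes from this document, ordered least to most sensitive."""
    S0 = 0  # coarse counters, version strings, bucketed timings
    S1 = 1  # operational metadata without user content
    S2 = 2  # workspace-adjacent: can infer project shape
    S3 = 3  # content-bearing


def allowed_in_usage_telemetry(
    cls: Sensitivity,
    *,
    opt_in: bool = False,
    reid_risk_documented: bool = False,
) -> bool:
    """Hypothetical gate for centralized usage telemetry."""
    if cls <= Sensitivity.S1:
        return True
    if cls == Sensitivity.S2:
        # S2 needs both a user/org opt-in and a documented re-identification risk.
        return opt_in and reid_risk_documented
    # S3 (content-bearing) never qualifies under the rule as written.
    return False
```

Under this reading, an S2 field with opt-in but no documented re-identification risk is still rejected, because the rule requires both conditions.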

Retention alignment

Today: research_metrics

retention-policy.yaml lists research_metrics with 365 days (days relative to created_at). Prune is operator-driven via vox db prune-plan / prune-apply.
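
To illustrate the difference between planning and applying a prune, here is a hedged sketch against an in-memory SQLite database. It does not reproduce what vox db prune-plan or prune-apply actually do; the `prune_plan` and `prune_apply` functions and the two-column schema are assumptions made for the example. Only the table name, the created_at column, and the 365-day horizon come from the text above.

```python
import sqlite3

RETENTION_DAYS = 365  # horizon stated for research_metrics


def prune_plan(conn: sqlite3.Connection, days: int = RETENTION_DAYS) -> int:
    """Count rows a prune would delete, without deleting anything (hypothetical)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM research_metrics WHERE created_at < datetime('now', ?)",
        (f"-{days} days",),
    ).fetchone()
    return row[0]


def prune_apply(conn: sqlite3.Connection, days: int = RETENTION_DAYS) -> int:
    """Delete rows older than the horizon and return how many were removed (hypothetical)."""
    cur = conn.execute(
        "DELETE FROM research_metrics WHERE created_at < datetime('now', ?)",
        (f"-{days} days",),
    )
    conn.commit()
    return cur.rowcount


# Assumed minimal schema: the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE research_metrics (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO research_metrics (created_at) VALUES (datetime('now', ?))",
    [("-400 days",), ("-10 days",)],
)

planned = prune_plan(conn)   # rows past the 365-day horizon
deleted = prune_apply(conn)  # same rows, now removed
remaining = conn.execute("SELECT COUNT(*) FROM research_metrics").fetchone()[0]
```

Splitting the count from the delete lets an operator review the impact before committing to it, which matches the operator-driven framing of the two commands.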

Today: build_run* telemetry tables

The vox ci build-timings --deep command persists structured build telemetry in build_run plus child tables (build_crate_sample, build_warning, build_run_dependency_shape). Retention follows retention-policy.yaml:

| Table | Prune rule | Notes |
|---|---|---|
| build_run | days / 365 / recorded_at | Parent run cadence aligned with benchmark retention horizon. |
| build_crate_sample, build_warning, build_run_dependency_shape | (via FK) | ON DELETE CASCADE from build_run; no separate policy rows needed. |
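
The cascade behaviour can be checked with a small SQLite sketch. The column names run_id and crate_name are assumptions for illustration, and only one child table is shown; the real schema is not reproduced here. Note that SQLite enforces foreign keys only when PRAGMA foreign_keys is enabled on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Without this pragma, SQLite ignores ON DELETE CASCADE.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE build_run (
        id INTEGER PRIMARY KEY,
        recorded_at TEXT NOT NULL
    );
    CREATE TABLE build_crate_sample (
        id INTEGER PRIMARY KEY,
        run_id INTEGER NOT NULL REFERENCES build_run(id) ON DELETE CASCADE,
        crate_name TEXT NOT NULL  -- hypothetical column
    );
    """
)

# One run past the 365-day horizon, one recent run, each with a child row.
conn.execute("INSERT INTO build_run (id, recorded_at) VALUES (1, datetime('now', '-400 days'))")
conn.execute("INSERT INTO build_run (id, recorded_at) VALUES (2, datetime('now', '-1 days'))")
conn.executemany(
    "INSERT INTO build_crate_sample (run_id, crate_name) VALUES (?, ?)",
    [(1, "old_crate"), (2, "new_crate")],
)

# Prune only the parent table; children follow through the foreign key.
conn.execute("DELETE FROM build_run WHERE recorded_at < datetime('now', '-365 days')")
conn.commit()

surviving_children = [
    row[0] for row in conn.execute("SELECT crate_name FROM build_crate_sample")
]
```

Because the child rows disappear with their parent, the policy file needs a rule for build_run only.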

Today: ci_completion_*

Completion ingest persists workspace-adjacent rows (ci_completion.rs), classified S2 (paths, fingerprints). retention-policy.yaml defines:

| Table | Prune rule | Notes |
|---|---|---|
| ci_completion_run | days / 365 / finished_at | Same default horizon as research_metrics for comparable org-local telemetry. |
| ci_completion_finding, ci_completion_detector_snapshot | (via FK) | ON DELETE CASCADE from ci_completion_run; no separate policy rows. |
| ci_completion_suppression | expires_lt_now / expires_at | TTL suppressions auto-prune when expires_at is set and past datetime('now'); expires_at NULL stays until manual change or a future policy decision. |
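
The expires_lt_now rule for suppressions can be sketched as a single DELETE. This is an illustration under assumptions: the label column is hypothetical, and the statement is one plausible reading of the rule rather than the project's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ci_completion_suppression ("
    "id INTEGER PRIMARY KEY, "
    "label TEXT NOT NULL, "  # hypothetical column, used to identify rows below
    "expires_at TEXT)"       # NULL means no expiry
)
conn.executemany(
    "INSERT INTO ci_completion_suppression (label, expires_at) "
    "VALUES (?, datetime('now', ?))",
    [("expired", "-1 days"), ("active", "+30 days")],
)
conn.execute(
    "INSERT INTO ci_completion_suppression (label, expires_at) VALUES ('no_expiry', NULL)"
)

# Only rows with a set expiry that is already in the past are removed.
conn.execute(
    "DELETE FROM ci_completion_suppression "
    "WHERE expires_at IS NOT NULL AND expires_at < datetime('now')"
)
conn.commit()

labels = sorted(
    row[0] for row in conn.execute("SELECT label FROM ci_completion_suppression")
)
```

In SQL, comparing NULL with `<` yields NULL rather than true, so rows without an expiry would survive even without the IS NOT NULL check; the explicit check makes the intent visible.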

Policy alignment: runs have no separate “manual vs automated” conflict. Automated prune-apply ages out old runs (and their cascaded children) on the same 365-day calendar basis as research_metrics. Suppressions without an expiry remain operator-visible for governance until they are edited or a stricter rule is adopted.

Other adjacent tables

Tables such as conversation_messages, agent_events, behavior_events, llm_interactions (see agents.rs schema) are content or behavior stores. They MUST NOT be folded into “telemetry” naming without a separate data-class chapter in telemetry-trust-ssot.

Today: agent_exec_history

agent_exec_history stores execution-time telemetry records for agentic budgeting (exec_time_telemetry). The records are classified S1 (tool names, IDs, durations, costs). Retention is set to 90 days in retention-policy.yaml because budgeting models need only a recent trailing window to detect anomalies; stale execution timings quickly become irrelevant.
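
To make the contrast between the 90-day and 365-day horizons concrete, the sketch below compares the same row age against both. The `RETENTION_DAYS` mapping and the helper functions are hypothetical; only the table names and day counts come from this document, and retention-policy.yaml remains the authoritative source.

```python
from datetime import datetime, timedelta, timezone

# Horizons as stated in this document (hypothetical in-code mirror of the policy file).
RETENTION_DAYS = {
    "research_metrics": 365,
    "build_run": 365,
    "ci_completion_run": 365,
    "agent_exec_history": 90,
}


def cutoff(table: str, now: datetime) -> datetime:
    """Oldest timestamp still retained for the given table."""
    return now - timedelta(days=RETENTION_DAYS[table])


def is_expired(table: str, row_timestamp: datetime, now: datetime) -> bool:
    """True when a row is older than the table's retention horizon."""
    return row_timestamp < cutoff(table, now)


# A fixed reference time keeps the example deterministic.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
row_timestamp = now - timedelta(days=120)

expired_in_exec_history = is_expired("agent_exec_history", row_timestamp, now)
expired_in_build_run = is_expired("build_run", row_timestamp, now)
```

A 120-day-old row falls outside the trailing window for agent_exec_history but is still well inside the horizon for build_run.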

Orchestrator and Populi sidecars

  • Memory / log retention in orchestrator (for example local log retention knobs) is separate from SQL TTL; document any future alignment in this file.
  • Populi privacy_class on envelopes (a2a/envelope.rs) MUST be referenced when classifying mesh-visible events.

Controls linkage