Research index
This page groups the research-oriented documentation in docs/src/architecture/ so it is easier to discover without mistaking it for the current shipped architecture.
Research classes
| Pattern | Typical status | Meaning |
|---|---|---|
| `*-research-2026.md` | research | investigation, evidence gathering, constraints, and trade-offs |
| `*-findings-2026.md` | research | synthesized results or conclusions from a research wave |
| `*-implementation-plan-2026.md` | roadmap | ordered implementation proposal |
| `*-implementation-blueprint.md` | roadmap or experimental | intended technical design for a future or in-progress path |
| `planning-meta/*` | current process docs or roadmap planning docs | contributor planning governance, not public product narrative |
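As a minimal sketch of how the table above could be applied mechanically, the following Python maps a path to its typical status. The function name `typical_status` and the first-match-wins ordering are assumptions made for illustration; nothing here is part of the Vox tooling.

```python
from fnmatch import fnmatchcase
from typing import Optional

# (glob pattern, typical status) pairs copied from the table above.
RESEARCH_CLASSES = [
    ("*-research-2026.md", "research"),
    ("*-findings-2026.md", "research"),
    ("*-implementation-plan-2026.md", "roadmap"),
    ("*-implementation-blueprint.md", "roadmap or experimental"),
    ("planning-meta/*", "current process docs or roadmap planning docs"),
]


def typical_status(path: str) -> Optional[str]:
    """Return the typical status for a docs path, or None if no pattern matches.

    Hypothetical helper: patterns are tried in table order and the first
    match wins. fnmatchcase keeps matching case-sensitive on every platform.
    """
    for pattern, status in RESEARCH_CLASSES:
        if fnmatchcase(path, pattern):
            return status
    return None


if __name__ == "__main__":
    for name in (
        "clavis-secrets-env-research-2026.md",
        "context-management-research-findings-2026.md",
        "scientia-gap-analysis-2026.md",
    ):
        print(name, "->", typical_status(name))
```

A `None` result only means the filename does not follow one of the listed patterns. Per the labeling rule at the end of this page, the page itself still has to state its status.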
Pipeline and corpus SSOT (implementation)
- Vox source → Mens pipeline SSOT — single map from `.vox` on disk to Mens training inputs (lexer vs HF tokenizer).
- Populi data pipeline — disambiguates mesh runtime data from training JSONL.
Corpus lab, vision, and Qwen family (research, April 2026)
- Vox corpus lab: mass examples, metrics, and eval harness (research 2026) — Tier A/B/C layout, compiler lanes vs golden parity, Syntax-K and WebIR aggregates, optional UI and vision rubrics, Mens `validate-batch` integration sketch.
- Mens vision and multimodal inputs (research 2026) — `TrainingPair` limits, orchestrator hints vs attachments, screenshot-to-JSON pipeline, Candle text-only vs remote VLMs.
- Mens Qwen family migration and native stack (research 2026) — Qwen2 vs Qwen3.5 retention tiers, operator runbook vs code removal, external QwenLM and Hugging Face references.
- GUI, v0/islands, vision, and Mens Qwen — virtuous-cycle implementation plan (2026) — 50+ tracked ideas with repo anchors: WebIR, `vox island`, Playwright/MCP screenshots, orchestrator vision, Mens Qwen3.5 text vs optional VL rubric lane, execution waves W0–W5.
- Orchestrator `attachment_manifest` RFC (2026) — MIME+hash task attachments and vision routing without substring-only hints (spec ahead of types).
Suggested reading paths
Deep Research Clusters (April 2026)
- Research Synthesis: Grand Strategy Seed 2026 — the master framework connecting these discoveries.
LLM Hallucination & Type System Impact (Wave 1)
- LLM-Native Language Design — cluster overview with Vox implications
- Cognitive Science of LLM Hallucinations
- Empirical Evidence for Type Systems
- Frontier Model Challenges
- K-Complexity Reduction Strategies
- Zero-Shot Invariants Validation
- Works Cited: Hallucination & Type Systems
Continual Learning & Flywheel Risks (Wave 2)
- Continual Learning Flywheel Risks — cluster overview with risk taxonomy
- MAD and Mode Collapse
- The Compile-Pass Oracle and Semantic Drift
- Catastrophic Forgetting in QLoRA
- Schola / Scientia Typicality Bias & Slop
- Minimum Viable Corpus for QLoRA
- Negative Examples via DPO/NAT
- Risk Taxonomy and Telemetry Mitigations
- Works Cited: Continual Learning Flywheel
- MENS Synthetic Corpus: Limitations and Mitigation Strategies (research 2026) — maps all active synthetic corpus strategies to their known failure modes and proposes 8 concrete mitigations (AST mutation, DPO wiring, anchor floor, curator LLM, CURLoRA, fictional knowledge graphs, automated flywheel, Rust cross-pollination).
- MENS Corpus: Full Implementation Plan (2026) — 4-wave execution plan grounded in mix-report audit (97.3% synthetic monoculture confirmed). Specifies W0 emergency corpus bootstrap, W1 DPO lane wiring and missing mix-config creation, W2 AST mutation + Rust→Vox corpus expansion, W3 semantic quality gates, W4 automated flywheel. Includes exact CLI commands, file specs, dependency graph, and volume projections.
- TOESTUB Line Limit & MENS Corpus Size Research (2026) — Investigation into Vox's actual TOESTUB God Object limits (1700 lines) vs documentation (500 lines) and an analysis on optimal LLM chunking/file sizes for SFT pipelines using modern models like Qwen3-4B.
GRPO Reward Shaping for Code LLMs (Wave 3)
- GRPO Reward Shaping for Code LLMs — cluster overview with architectural adjustments
- Efficacy of Binary Parse-Rate Signalling
- GRPO VRAM Efficiency and Small-Batch Dynamics
- AST Coverage Scoring and Reward Hacking
- Empirical Justification for Reward Weights
- Optimization Landscape of Positive-Only Loops
- Gap Analysis and Adjustments
- Works Cited: GRPO Reward Shaping
AI Agent Context and Handoff Continuity (Wave 4)
- Empirical Evidence for Context Compaction
- Context Bleed and Identity Confusion
- SOTA Context-Aware Protocols
- Context Retrieval Policies
- A2A Protocol Evidence Sharing
- Context Truncation Failure Modes
- Production Failure Catalog
- Design Pattern Recommendations
- Implementation Checklist
- Works Cited: Agent Handoff Continuity
Autonomous Research Localization & MENS Research Lane (Wave 6)
- Local autonomous research findings 2026 — SearXNG meta-search integration, native Rust scraping stack (`vox-scraper`), DuckDuckGo fallback, and performance tiering.
- MENS Research Track Blueprint 2026 — Lane G (`research-expert`) spec, GRPO+RLVR reward functions, synthetic fact-chain generator, and Socrates integration.
- GraphRAG Iterative Retrieval Research 2026 — Multi-hop retrieve-reason-retrieve loops, stopping heuristics, and C2RAG constraint checking.
Scientia distribution, discovery, and publication surfaces
- SCIENTIA multi-platform ranking, discovery, and anti-slop SSOT (research 2026) — Tiered citations for social and scholarly ranking surfaces; ingest vs syndicate posture; manifest-centered projection profiles; operator KPI sketches for signal vs noise. Complements external discovery and impact / readership.
- Syndication Ecosystem & Multi-Platform Publishing Research 2026 — Analysis and adoption strategy for third-party Rust SDKs (`atrium`, `megalodon`, `twapi-v2`) to reduce maintenance burden and eliminate manual `reqwest` manipulation for social publishing channels.
- Scientia Community Publishing Playbook 2026 — Operational playbook for multi-platform community management with minimal overhead. Covers Discord webhook setup, Reddit OAuth + anti-spam rules, GitHub Discussions GraphQL API, `vox-publisher` data model extension requirements, Clavis secret registration needs, and subreddit policy pack templates. Companion to the multi-platform ranking research above.
- 🔬 Scientia Publication Endpoints — Ground-Truth Research & Implementation Policy (April 2026) — v2. Comprehensive code audit + web research across all 18 publication targets. Adds: ResearchGate full policy (no API exists; passive via DOI; do not implement), ORCID member API (highest-leverage new scholarly target), Figshare REST API (datasets/supplementary). Corrects v1 errors: Reddit User-Agent WAS correct; `social_retry.rs` has zero call sites (dead code); `bluesky`/`mastodon`/`discord`/`linkedin` are absent from `switching.rs` allowlist and retry infrastructure. Defines formal implementation policy: channel classification taxonomy (ActivePush/ScholarlyDeposit/ManualAssist/PassiveDiscovery/Deferred), gate requirements per class, 13-column hallucination inventory, and 8-wave task backlog with ~50 EP-NNN gap IDs. Last verified: 2026-04-13.
Multi-Repository Context Isolation (Wave 5)
- Multi-repo context isolation: research findings 2026 — `.voxignore` SSOT policy, scope guard architecture, agent instruction file hierarchy, IDE workspace isolation, Git worktree patterns, security threats (IDPI, slopsquatting, scope escalation), context engineering guidelines, monorepo/polyrepo AI-readiness analysis, and `vox repo init` scaffold specification. Directly actionable: gaps table, implementation priorities, and cross-references to `cross-repo-query-observability.md` and `context-management-research-findings-2026.md`.
Independent Deep Research Tracks
- Agent Trust Reliability Evaluation
- AI Plan Adequacy Heuristics
- AI-Augmented Testing & Hourglass Architecture Research
- Compiler Testing Research
- Multi-Agent Mesh Economics
- Grammar-Constrained Decoding for Code LLMs
- LLM Output Mediation and Programmatic Validator Generation — Proposes a unified `LlmMediator<T>` architecture connecting `vox-constrained-gen` (Tier 1), `vox-jsonschema-util` (Tier 2), Socrates confidence (Tier 3), and the trust layer into a single composable seam. Covers dynamic finite-response-set schema derivation, MCP reduction strategy, RLVR training alignment, and a four-wave implementation roadmap. Cross-references grammar-constrained decoding, trust reliability, HITL doubt loop, and capability registry.
- Clavis as a one-stop secrets manager: research findings 2026 — Comprehensive gap analysis for evolving Vox Clavis into a full-lifecycle secrets management platform. Covers: complete env-var taxonomy across 9 secret classes, user-facing feature requirements, OWASP NHI Top 10 alignment, AI-agent credential isolation boundaries, MCP OAuth 2.1 target model, A2A credential delegation via RFC 8693 Token Exchange, runtime secret redaction pipeline, KEK/DEK envelope encryption model, competitive feature gap table vs. Doppler/Infisical/Pulumi ESC/Vault. Extends clavis-secrets-env-research-2026.md.
- Clavis V2: Full Implementation Plan (2026) — Codebase-verified, code-grounded implementation plan for the full Clavis V2 platform. Anchored in the live codebase (spec.rs, vox_vault.rs, resolver.rs, clavis.rs CLI). Defines: single canonical data structure for all ~580 secrets (TaxonomyClass + LifecycleMeta + scope_description on SecretSpec, 3 new ResolutionStatus variants, 4 new SecretMaterialKind variants); 4 new VoxDB tables (version history, audit log, profile overrides, A2A delegations); updated write path with atomic multi-table transactions; 12 new/updated CLI subcommands (set-secret, rotate, rollback, history, list, diff, run, audit-log, delegate, revoke-delegation); runtime secret scrubber (redact.rs + aho-corasick); consumer wiring for all 8 platform crates; 8-wave execution plan with verification steps per wave; 5 new security invariants extending the V1 threat model.
- Cryptography Research Findings 2026 — ZIG/AEGIS eradication and AES performance evaluation.
Documentation
- Orphan surface inventory
- Architecture index
- planning-meta documents when you need contributor process detail
Packaging and portability
- Vox Docker-backed portability research 2026
- Vox Docker-backed portability implementation plan 2026
- Vox packaging research findings 2026
- Vox packaging implementation blueprint
Language and architecture direction
- AI IDE feature research findings 2026
- Prompt engineering, system prompts, document-skills, and SCIENTIA (research 2026)
- Terminal execution policy research findings 2026 — PowerShell-first shells, IDE allow/deny limits, future unified contract
- Telemetry unification research findings 2026
- Telemetry implementation blueprint 2026 — roadmap implementation plan
- Telemetry implementation backlog 2026 — executable checklist
- Protocol convergence research 2026
- Populi GPU network research 2026
- Populi GPU mesh implementation plan 2026 — paired decision docs: ADR 017, ADR 018, ADR 020, placement matrix; probe SSOT: GPU truth probe spec, node lifecycle / hotplug
- Mobile/Desktop Convergence & Language Extension Research 2026 — unified browser view, std.mobile namespace, agent/environment parser gaps, Web API vs Capacitor strategy, maintainability quantification
- Vox bell-curve strategy
- Feature growth boundaries
- Interop tier policy
Hygiene and maintenance
- Dependency Sprawl Audit and Resolution (2026) — Records the workspace-wide audit of sprawling Cargo dependencies, centralization into the root `[workspace.dependencies]`, and implementation of TOESTUB CI-CD enforcement rules.
Agentic planning and orchestration
- Research Synthesis: Symphony Conduction vs. Agent Orchestration 2026 — Extensive structural mapping of real-world conduction (Ictus, DAGs, HITL) to `vox-dei`
- Claude Code Ultraplan research 2026 — architecture deep-dive, cost model, failure modes, and actionable Vox recommendations
- Unified Agentic Control Surface Research 2026 — Tri-state pilot console, "Second Pass" validation, and Doubt metaphor unification.
- Dynamic agentic planning 2026 — earlier research seed for planning-mode architecture
- Orchestrator multi-agent groundwork 2026
- Context management research findings 2026
- Context management implementation blueprint
- Vox agentic loop and MENS plan
- VCS for agent state and artifact snapshotting research 2026 — Using Jujutsu to automate artifact persistence and reversibility over Vox DEI.
SCIENTIA novelty / publication ledger (contracts)
- Finding-candidate and novelty-evidence v1 JSON Schemas live under `contracts/scientia/` (`finding-candidate.v1.schema.json`, `novelty-evidence-bundle.v1.schema.json`); example fixtures under `contracts/reports/scientia-*.example.v1.json`. CI: `vox ci scientia-novelty-ledger-contracts` (also nested in `vox ci ssot-drift`). CLI spot-check: `vox scientia finding-candidate-validate`, `vox scientia novelty-evidence-bundle-validate`.
- 🔴 PRIMARY IMPLEMENTATION SSOT (use this for all implementation work): scientia-pipeline-ssot-2026.md — unified inbound + outbound gap remediation specification. Code-verified against real sources. 28 implementation tasks (G1–G28) organized into 9 dependency-ordered execution groups. Includes canonical data model, DB schema changes, env var registry, Clavis secret registry, and LLM-executor verification ritual. Supersedes gap analysis and wave playbook for implementation decisions.
- Impact / readership / citation-adjacent signals (research seed): scientia-impact-readership-research-2026.md and tunable weights in `contracts/scientia/impact-readership-projection.seed.v1.yaml` (orthogonal to novelty; no default publish gate).
- Multi-platform ranking, discovery, and anti-slop SSOT (research 2026): scientia-multi-platform-ranking-discovery-research-2026.md — social and scholarly feed mechanics (tiered sources), ingest vs syndicate, projection profiles, anti-slop metrics; bridges outbound `vox-publisher` syndication and inbound external discovery.
- Publication-worthiness + SSOT unification research plan: scientia-publication-worthiness-ssot-unification-research-2026.md (standards-to-signals matrix, canonical metadata graph proposal, detection calibration protocol, Codex research snapshot persistence blueprint, automation boundary ledger).
- Implementation wave playbook (historical context): scientia-implementation-wave-playbook-2026.md (232-task execution map, wave outputs, first-30 lock order, and contract inventory).
- Comprehensive gap analysis (historical context): scientia-gap-analysis-2026.md — 45 identified problems with solutions, severity ratings, and a 7-wave execution order.
- Scientia Worthiness × Socrates Unification (research 2026): scientia-socrates-unification-research-2026.md — deep structural analysis of isomorphisms between the Worthiness publication gate and the Socrates real-time confidence protocol. 38+ integration ideas organized into 8 themes (shared numeric language, inbound pipeline, A2A communication, MENS training, etc.), explicit separation-of-concerns boundaries, risk map, and wave-gated implementation roadmap.
- Scientia Publisher & Orchestrator Hardening Plan (roadmap 2026): scientia-publisher-hardening-implementation-plan-2026.md — ordered execution plan for de-factoring God Objects across vox-publisher, vox-orchestrator, and vox-cli to adhere to the 500-line TOESTUB policy.
- 🔴 PRIMARY IMPLEMENTATION TASK LIST v2 (use this to execute work): scientia-publication-pipeline-implementation-plan-2026.md — 31 explicit tasks (T-001 to T-031) across 8 waves. v2 corrects 13 factual errors from v1 including: Bluesky XRPC URL had wrong method path AND wrong request field conflation; `SyndicationResult` already had bluesky/mastodon/linkedin/discord fields; `social_retry` was already wired (not dead code); Zenodo adapter is fully complete (564L, create+upload+publish+retry); Mastodon API accepts JSON body; Discord resolves its own Clavis webhook; LinkedIn REST endpoint is `/rest/posts` not `/v2/posts`; all four social Clavis SecretIds already exist. Includes exact Rust code patterns, per-task verification commands, wave-gated dependency ordering, and a permanent Do-Not-Implement registry.
Labeling rule
If a page is primarily research or a roadmap, say so in the title, frontmatter, or first paragraph. Do not rely on filenames alone.
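
One way to check the rule above is a small lint over page text. The sketch below is illustrative only: the helper name `is_labeled`, the `---`-delimited frontmatter convention, and the keyword list are assumptions, not an existing Vox check.

```python
import re

# Assumed keywords; the rule only says the page must "say so".
LABEL_RE = re.compile(r"\b(research|roadmap)\b", re.IGNORECASE)


def is_labeled(page_text: str) -> bool:
    """Return True if the frontmatter, title, or first paragraph mentions
    research or roadmap. Filenames are deliberately not consulted."""
    lines = page_text.splitlines()
    checked = []
    body_start = 0
    # Assumed frontmatter shape: a leading '---' line up to the next '---'.
    if lines and lines[0].strip() == "---":
        for idx in range(1, len(lines)):
            if lines[idx].strip() == "---":
                checked.extend(lines[1:idx])
                body_start = idx + 1
                break
    body = "\n".join(lines[body_start:])
    # Blocks are separated by blank lines; the first two non-empty blocks
    # are treated as the title and the first paragraph.
    blocks = [b for b in re.split(r"\n\s*\n", body) if b.strip()]
    checked.extend(blocks[:2])
    return any(LABEL_RE.search(part) for part in checked)


if __name__ == "__main__":
    page = "# Telemetry plan\n\nThis is a roadmap, not shipped behavior.\n"
    print(is_labeled(page))
```

If a title is not followed by a blank line, the title and first paragraph land in the same block and the second paragraph is also scanned, so the check errs on the permissive side.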