Vox source → compiler → Mens training (pipeline SSOT)

This page is the persistent crosswalk for contributors: where .vox files are enforced, how they relate to documentation, and how they reach Mens fine-tuning. It deliberately separates compile-time lexing from training-time tokenization.

1. Authoritative .vox layout

| Tree | Role | Enforcement |
| --- | --- | --- |
| examples/golden/**/*.vox | Canonical, training-eligible demos | cargo test -p vox-compiler --test golden_vox_examples (parse → HIR → WebIR validate → Syntax-K metrics) |
| examples/parser-inventory/**/*.vox | Negative / recovery fixtures | Must not be mixed into Mens goldens; excluded by SSOT |
| Policy file | Declares golden roots, negative roots, doc scan roots | examples/examples.ssot.v1.yaml |
| mdBook includes | Hash-include paths under docs/src must resolve to existing .vox under examples/golden/ (see Golden Examples corpus) | cargo test -p vox-compiler --test examples_ssot |
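
The role split in the table can be sketched as a small path classifier plus an include scanner. This is a hypothetical, std-only illustration, not the logic of the `examples_ssot` test: the `Role` enum and the `classify` and `include_targets` helpers are invented names, the roots are hard-coded rather than read from `examples/examples.ssot.v1.yaml`, and the `{{#include …}}` form is mdBook's standard include syntax, assumed here to be what "hash-include paths" refers to.

```rust
/// Role of a `.vox` file, mirroring the table rows (hypothetical enum).
#[derive(Debug, PartialEq)]
enum Role {
    Golden,   // canonical, training-eligible
    Negative, // parser recovery fixture, never training data
    Other,    // outside both roots, or not a `.vox` file
}

/// Classify a repo-relative path by its root prefix.
/// Assumption: roots are hard-coded; the real policy lives in the SSOT YAML.
fn classify(path: &str) -> Role {
    if !path.ends_with(".vox") {
        return Role::Other;
    }
    if path.starts_with("examples/golden/") {
        Role::Golden
    } else if path.starts_with("examples/parser-inventory/") {
        Role::Negative
    } else {
        Role::Other
    }
}

/// Extract the file part of every `{{#include <path>[:anchor]}}` directive.
fn include_targets(markdown: &str) -> Vec<String> {
    const OPEN: &str = "{{#include ";
    let mut out = Vec::new();
    let mut rest = markdown;
    while let Some(start) = rest.find(OPEN) {
        let after = &rest[start + OPEN.len()..];
        let Some(end) = after.find("}}") else { break };
        // Drop an optional `:anchor` or `:from:to` suffix after the path.
        let spec = after[..end].trim();
        let path = spec.split(':').next().unwrap_or(spec);
        out.push(path.to_string());
        rest = &after[end + 2..];
    }
    out
}
```

A check in this spirit would call `include_targets` on each Markdown file under docs/src, resolve each target relative to that file, and fail unless `classify` returns `Role::Golden` and the file exists.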

Operator entry: examples/README.md.

2. Lexer and parser (language surface)

The lexer’s keyword inventory is the source of truth for which characters become which tokens before AST construction. It does not define the Mens vocabulary.

Lexing note: lex currently skips spans that do not match a token (logos errors are dropped). Prefer adding explicit #[token("@…")] entries for documented decorators so source is not silently altered.
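
To make the risk concrete, here is a toy, std-only lexer that mimics the behavior described in the note. It is not the Vox lexer and does not use logos; the `Tok` enum, the `lex` function, and the `known_decorators` parameter are all hypothetical. It only illustrates how dropping unmatched spans changes what the parser sees.

```rust
/// Token kinds for the toy lexer (hypothetical, far smaller than a real inventory).
#[derive(Debug, PartialEq)]
enum Tok {
    Decorator(String), // e.g. "@scheduled"
    Ident(String),
}

/// Split on whitespace; keep identifiers and *known* decorators,
/// and silently drop everything else (the behavior the note warns about).
fn lex(src: &str, known_decorators: &[&str]) -> Vec<Tok> {
    src.split_whitespace()
        .filter_map(|word| {
            if known_decorators.contains(&word) {
                Some(Tok::Decorator(word.to_string()))
            } else if word.chars().all(|c| c.is_alphanumeric() || c == '_') {
                Some(Tok::Ident(word.to_string()))
            } else {
                None // unmatched span: dropped without any diagnostic
            }
        })
        .collect()
}
```

With an empty decorator list, the arbitrary input string `@scheduled fn tick` lexes to two identifiers and the decorator vanishes without an error. With `@scheduled` registered, it survives as a `Tok::Decorator`, which is the effect an explicit token entry is meant to have.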

3. Documentation corpus

4. Mens training path (model input)

  1. Golden / codegen pairs: vox_corpus walks examples/golden/**/*.vox (and other configured roots) to build instruction–response rows.
  2. Mix + validate: mens/config/mix.yaml, vox mens corpus validate, etc. See Native ML pipeline and Mens native training.
  3. QLoRA default: vox mens train uses the Hugging Face tokenizer for the chosen base model, not VoxTokenizer and not the compile lexer. The lab VoxTokenizer in vox-tensor is a small Burn/dogfood path only.
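
The first step of this path can be sketched as a filter over candidate files. This is an assumption-laden illustration, not the `vox_corpus` implementation: the `Row` struct, its field names, the `build_rows` function, and the instruction wording are all hypothetical. It only shows the invariant that negative fixtures never become training rows.

```rust
/// One instruction-response training row (hypothetical shape).
#[derive(Debug, PartialEq)]
struct Row {
    instruction: String,
    response: String,
}

/// Build rows from (repo-relative path, source text) pairs,
/// keeping only `.vox` files under the golden root.
fn build_rows(files: &[(&str, &str)]) -> Vec<Row> {
    files
        .iter()
        .filter(|(path, _)| {
            path.starts_with("examples/golden/") && path.ends_with(".vox")
        })
        .map(|(path, source)| Row {
            // Placeholder prompt; the real instruction text is not specified here.
            instruction: format!("Write the Vox program stored at {path}"),
            response: source.to_string(),
        })
        .collect()
}
```

Tokenization is intentionally absent from this sketch: rows stay plain text, and the base model's tokenizer is applied later, at training time.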

5. Gap checklist (goldens vs journeys)

Use this when adding files under examples/golden/:

| Journey / capability | Golden coverage (Apr 2026) | Suggested follow-up |
| --- | --- | --- |
| Script / CLI vox run | mesh/noop.vox, hello.vox, std_http_wrappers.vox | Optional: dedicated golden/script_args.vox if CLI argv story grows |
| Reactive UI | reactive_counter.vox, dashboard_ui.vox, web_routing_fullstack.vox | Expand when layout_groups grammar lands (see backlog docs) |
| Data + HTTP API | crud_api.vox, blog_fullstack.vox | |
| Actors / workflows / MCP | counter_actor.vox, checkout_workflow.vox, mcp_tools.vox | |
| @scheduled decorator | scheduled_tick.vox | WebIrModule.scheduled_jobs carries name + interval from HIR |
| @pure / @require / @deprecated | ref_effects.vox (regions wired in mdBook API pages) | HTTP Result / Error mapping: http_error_mapping.vox |
| Error / Result patterns | http_error_mapping.vox, type_system.vox (partial) | |