Mens Qwen family migration and native stack (research 2026)
Executive summary
- Product default in this repository is already Qwen3.5-class text bases (`DEFAULT_MODEL_ID` in vox-populi `mens/mod.rs`, nightly workflow `qwen35-native-nightly.yml`, Mens training reference).
- Qwen2 remains in-tree as `HfArchitecture::Qwen2`, `InferenceModel::Qwen2`, HF keymap tables, and unit test fixtures using `"model_type": "qwen2"` JSON snippets. That is intentional compatibility and regression surface, not legacy neglect.
- The public ecosystem still ships many Qwen2-named weights and LoRA adapters; “delete Qwen2 from Candle” is a semver-scale decision, not a documentation tweak.
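The `"model_type": "qwen2"` fixtures imply a classification step somewhere in config loading. A minimal sketch of that idea, with hypothetical names (`ArchClass`, `classify_model_type`) and a deliberately naive substring scan instead of the serde-based parsing the repository presumably uses:

```rust
// Classify a HF config.json snippet by its "model_type" field.
// ArchClass and classify_model_type are illustrative names, not
// identifiers from the repository.

#[derive(Debug, PartialEq)]
enum ArchClass {
    Qwen2,   // legacy compatibility path (HfArchitecture::Qwen2 in-tree)
    Qwen35,  // current default lineage
    Unknown,
}

fn classify_model_type(config_json: &str) -> ArchClass {
    // Crude scan: find the "model_type" key, then look at what follows.
    let Some(idx) = config_json.find("\"model_type\"") else {
        return ArchClass::Unknown;
    };
    let rest = &config_json[idx..];
    if rest.contains("qwen3_5") || rest.contains("qwen35") {
        ArchClass::Qwen35
    } else if rest.contains("qwen2") {
        ArchClass::Qwen2
    } else {
        ArchClass::Unknown
    }
}

fn main() {
    let legacy = r#"{ "model_type": "qwen2", "hidden_size": 4096 }"#;
    let current = r#"{ "model_type": "qwen3_5", "hidden_size": 4096 }"#;
    println!("{:?}", classify_model_type(legacy));   // Qwen2
    println!("{:?}", classify_model_type(current));  // Qwen35
}
```

A real implementation would parse the JSON properly; the point is only that the legacy and current families are distinguished at one well-tested seam.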
This document defines deprecation tiers, a migration story split (runbook vs weight surgery vs code removal), and external references to re-check before any removal milestone.
External references (April 2026 snapshot)
Re-verify URLs and claims before release-blocking decisions.
| Source | Use |
|---|---|
| QwenLM: Qwen3 — Think Deeper, Act Faster | Product positioning: thinking vs non-thinking modes, multi-size lineup. |
| QwenLM: Qwen2.5-Coder family | Code-specialized line; still a credible baseline for comparisons. |
| airank.dev: Qwen2.5-Coder-32B vs Qwen3 Coder Next | Third-party benchmark/cost framing (non-authoritative). |
| Hugging Face Transformers: Qwen3_5 model doc | `text_config` / `vision_config`, multimodal token ids; upstream pages may still contain scaffolding — treat as evolving. |
Migration story: three layers of difficulty
| Layer | Meaning | Effort band |
|---|---|---|
| A — Operator runbook | New work uses `Qwen/Qwen3.5-*`; refresh `tokenizer.json`; train or merge QLoRA; serve via the Schola path in the Mens serving SSOT; re-run eval on fixed JSONL. | Small (documentation + checklist + one dry run). |
| B — Adapter continuity | Same LoRA directory must run on a new base without retrain — may require out-of-tree conversion or may be unsupported; document honestly. | Medium to large if promised automatically. |
| C — Code removal | Delete Qwen2 branches in Candle and tests. | Large; requires audit, CI matrix, release notes. |
Narrative for contributors: default new recipes to Qwen3.5; keep Qwen2 paths until an explicit audit shows zero product dependency; prefer “retrain recommended” over silent weight conversion.
Deprecation tiers (proposal)
| Tier | Qwen2 native path | Qwen3.5 |
|---|---|---|
| Supported | Load + inference + tests maintained | Default for new training and docs. |
| Frozen | Bugfixes only; no new Qwen2-only features | Active development. |
| Removed | Delete after migration guide + major boundary | Single text architecture path (names TBD). |
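The tier table can be made executable as a load-time gate. A sketch under assumed names — `Tier`, `Arch`, and `check_loadable` are hypothetical, not the repo's `HfArchitecture` / `InferenceModel` types:

```rust
// Tier-gated model loading for the deprecation proposal above.

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier { Supported, Frozen, Removed }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Arch { Qwen2, Qwen35 }

/// Current proposal: both families Supported; Qwen2 moves to Frozen only
/// after the repository audit checklist passes.
fn tier_for(arch: Arch) -> Tier {
    match arch {
        Arch::Qwen2 => Tier::Supported,
        Arch::Qwen35 => Tier::Supported,
    }
}

/// Removed architectures refuse to load with a migration pointer instead of
/// failing deep inside weight layout discovery.
fn check_loadable(arch: Arch) -> Result<(), String> {
    match tier_for(arch) {
        Tier::Supported | Tier::Frozen => Ok(()),
        Tier::Removed => Err(format!(
            "{arch:?} was removed; see the migration guide for the Qwen3.5 path"
        )),
    }
}

fn main() {
    assert!(check_loadable(Arch::Qwen2).is_ok());
    assert!(check_loadable(Arch::Qwen35).is_ok());
    println!("load gates ok");
}
```

Encoding the tier in one function keeps the eventual Frozen and Removed transitions to single-line diffs with an obvious audit trail.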
Repository audit checklist (for tier movement)
Execute before Frozen or Removed:
- `rg` search: `Qwen2`, `qwen2`, `HfArchitecture::Qwen2`, `InferenceModel::Qwen2` across `crates/vox-populi`, `crates/vox-cli`, workflows, and `contracts/mens/`.
- Confirm no operator-facing doc promises Qwen2 as default.
- Confirm `training-presets` and `DEFAULT_MODEL_ID` stay aligned (`vox-populi` test `training_presets_yaml_contract.rs` in the workspace crate).
- Update Mens training reference cross-links if the serve or merge matrix changes.
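The preset-alignment item in the checklist might look roughly like this as plain functions; the real contract lives in `training_presets_yaml_contract.rs`, and the `Preset` struct, helper names, and model ids below are illustrative only:

```rust
// Audit helpers for the preset/default alignment contract (sketch).

#[allow(dead_code)]
const DEFAULT_MODEL_ID: &str = "Qwen/Qwen3.5-example"; // placeholder, not the repo's real default

struct Preset {
    name: &'static str,
    base_model: &'static str,
}

/// All presets point at the default base (the "stay aligned" condition).
fn presets_aligned(presets: &[Preset]) -> bool {
    presets.iter().all(|p| p.base_model == DEFAULT_MODEL_ID)
}

/// First preset still pinning a Qwen2-family base, if any (tier-audit helper).
fn first_legacy_preset(presets: &[Preset]) -> Option<&'static str> {
    presets
        .iter()
        .find(|p| p.base_model.to_lowercase().contains("qwen2"))
        .map(|p| p.name)
}

fn main() {
    let presets = [
        Preset { name: "default-qlora", base_model: DEFAULT_MODEL_ID },
        Preset { name: "legacy-compat", base_model: "Qwen/Qwen2-7B" },
    ];
    // This illustrative set would fail the audit: one preset pins a Qwen2 base.
    assert!(!presets_aligned(&presets));
    assert_eq!(first_legacy_preset(&presets), Some("legacy-compat"));
    println!("audit helpers ran");
}
```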
Qwen3.5-specific technical notes (native stack)
- Linear / hybrid attention blocks — `hf_keymap.rs` branches on `HfArchitecture::Qwen35` and layer type (`linear_attention` vs full attention). Changes to upstream `config.json` naming must be reflected here.
- RoPE and preflight — `qlora_preflight.rs` includes Qwen3.5-specific RoPE key warnings; keep the tests when touching layout discovery.
- Thinking-mode tokens — if training data includes chain-of-thought, define whether Mens supervised spans strip it for `vox_codegen` lanes (Mens training data contract lane policy).
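To illustrate the layer-type branching described for `hf_keymap.rs`, here is a toy key-mapping function; the `LayerKind` enum and the tensor name patterns are hypothetical, not the repository's actual keymap:

```rust
// Per-layer key mapping that branches on the hybrid layer kind (sketch).

#[derive(Debug, Clone, Copy, PartialEq)]
enum LayerKind {
    LinearAttention,
    FullAttention,
}

/// Map (layer index, layer kind, tensor suffix) to a native parameter name.
/// A renamed field in upstream config.json would have to be mirrored here.
fn native_key(layer_idx: usize, kind: LayerKind, suffix: &str) -> String {
    match kind {
        LayerKind::LinearAttention => format!("model.layers.{layer_idx}.linear_attn.{suffix}"),
        LayerKind::FullAttention => format!("model.layers.{layer_idx}.self_attn.{suffix}"),
    }
}

fn main() {
    // A hybrid stack might interleave kinds; the pattern is illustrative.
    let kinds = [LayerKind::LinearAttention, LayerKind::FullAttention];
    for (i, kind) in kinds.iter().enumerate() {
        println!("{}", native_key(i, *kind, "q_proj.weight"));
    }
}
```

The value of funnelling every mapping through one function is that an upstream naming change breaks one match arm and its tests, not scattered string literals.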
Multimodal (HF) vs native Candle
Hugging Face `Qwen3_5Config` documents `vision_config` and image placeholder token ids. Native Candle QLoRA in this repo remains text-only until a separate ADR and execution-planner workstream adds a vision encoder and training contract. Until then, multimodal serving belongs in external runtimes (vLLM, Ollama, HF) as already described in the Mens training reference external serving section.
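A text-only preflight could refuse multimodal configs explicitly rather than failing later in layout discovery. A minimal sketch, assuming a naive substring check; `reject_multimodal` is a hypothetical name, not a function from the repository:

```rust
// Refuse configs carrying a vision_config section on the text-only native path.

fn reject_multimodal(config_json: &str) -> Result<(), String> {
    if config_json.contains("\"vision_config\"") {
        Err("vision_config present: native Candle QLoRA is text-only; \
             serve multimodal weights via an external runtime (vLLM, Ollama, HF)"
            .to_string())
    } else {
        Ok(())
    }
}

fn main() {
    let text_only = r#"{ "model_type": "qwen3_5", "text_config": {} }"#;
    let multimodal = r#"{ "model_type": "qwen3_5", "vision_config": {} }"#;
    assert!(reject_multimodal(text_only).is_ok());
    assert!(reject_multimodal(multimodal).is_err());
    println!("preflight gate ok");
}
```

Failing early with a pointer to external runtimes matches the document's policy of honest refusal over silent partial support.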
See also
- Mens vision and multimodal inputs (research 2026)
- Vox corpus lab (research 2026)
- Candle full graph feasibility and ADR 006 / 007 linked from Mens docs
- Mens training reference
- Vox source → Mens pipeline SSOT
Open questions
- Minimum Qwen2 fixture set to keep permanently in `vox-populi` tests after tier Frozen.
- Whether to publish a single `external_serving_handoff` extension field for `base_family` when VL is used only for eval, not training.
- Official policy on community weight migration scripts (license, no vendoring without review).