Continual Learning Flywheel Risks

Executive Summary

Deploying an autonomous dogfooding or self-play training flywheel—in which a model continuously fine-tunes itself on its own generated outputs—carries a critical baseline risk of systemic degradation. Three interacting failure modes threaten the Vox MENS architecture:

  1. Recursive ingestion of synthetic data drives Model Autophagy Disorder (MAD), leading to irreversible variance loss and mode collapse.
  2. Reliance on a binary compile-pass oracle without semantic execution checks exposes the system to reward hacking and severe semantic drift.
  3. Repeated QLoRA fine-tuning cycles on limited data volumes induce catastrophic forgetting, mechanically overwriting the base model's generalized reasoning and natural language capabilities.
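The second failure mode above can be made concrete with a minimal sketch (hypothetical function and test-case names): a binary compile-pass oracle accepts any syntactically valid generation, so a semantically wrong program still earns full reward, while an execution-based check catches the drift.

```python
def compile_pass(src: str) -> bool:
    """Binary oracle: does the source merely compile?"""
    try:
        compile(src, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

def execution_check(src: str, func_name: str, cases) -> bool:
    """Semantic oracle: execute the function against reference cases."""
    ns: dict = {}
    try:
        exec(compile(src, "<generated>", "exec"), ns)
        fn = ns[func_name]
        return all(fn(*args) == expected for args, expected in cases)
    except Exception:
        return False

# Syntactically valid but semantically wrong generation:
bad = "def add(a, b):\n    return a - b\n"
cases = [((2, 3), 5), ((0, 0), 0)]

compile_pass(bad)                    # True  -> oracle is reward-hackable
execution_check(bad, "add", cases)   # False -> semantic drift is caught
```

A reward signal built on `execution_check` closes the loophole that lets the model satisfy the oracle without preserving program semantics.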

Contemporary research offers empirically validated countermeasures: transitioning from a "replace" to an "accumulate" synthetic data strategy; integrating execution-based verification or oracle-less proxy metrics; and deploying advanced PEFT stabilization techniques such as CURLoRA, O-LoRA, or FAPM. Agent-generated prose (Schola/Scientia) remains the most volatile element and requires stringent external filtering.
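The replace-versus-accumulate distinction can be sketched as follows (a simplified illustration with hypothetical names, not the report's implementation): under "replace," each cycle trains only on the newest synthetic batch, so real data vanishes after one generation; under "accumulate," the pool retains real data plus every prior synthetic batch, which is the regime shown to slow autophagic collapse.

```python
def replace_strategy(pool, new_synthetic):
    """Each cycle, discard the pool and train only on the newest batch."""
    return list(new_synthetic)

def accumulate_strategy(pool, new_synthetic):
    """Each cycle, append the newest batch to the growing pool."""
    return list(pool) + list(new_synthetic)

real_data = ["real_1", "real_2"]
pool_replace = pool_accumulate = real_data
for gen in range(3):
    batch = [f"syn_{gen}"]  # stand-in for model-generated samples
    pool_replace = replace_strategy(pool_replace, batch)
    pool_accumulate = accumulate_strategy(pool_accumulate, batch)

pool_replace     # ["syn_2"] -> real data gone after one cycle
pool_accumulate  # ["real_1", "real_2", "syn_0", "syn_1", "syn_2"]
```

The accumulate pool's preserved real-data fraction shrinks each cycle but never reaches zero, which is what keeps the variance of the training distribution anchored.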

Detailed Research Pages