Clavis secrets, env vars, and API key strategy research 2026
See also: Clavis as a one-stop secrets manager: research findings 2026 — extends this document with a complete env-var taxonomy, user-facing feature requirements, AI-agent credential isolation design, A2A delegation via RFC 8693, competitive gap analysis, and an 8-wave implementation roadmap.
Implementation plan: Clavis V2: Full Implementation Plan (2026) — codebase-verified plan translating the research into concrete data structures, SQL schema, CLI surface, and 8-wave execution order.
Implementation support docs:
- Clavis Cloudless Threat Model V1
- Clavis Cloudless Implementation Catalog
- Clavis Cloudless Ops Runbook
- Clavis Break-Glass Runbook
Purpose
This document is a research dossier for evolving vox-clavis from a strong environment-variable-first baseline into a more durable, auditable, and AI-era-safe secret management system.
It is intentionally research-only. It does not define migrations, schema diffs, rollout sequencing, or implementation commits.
Scope and non-goals
In scope
- The most persistent friction points with environment-variable and API key management in modern teams.
- AI-agent-era risks (prompt injection and context leakage) that change secret-handling assumptions.
- Key-sprawl reduction strategies that preserve capability.
- Maintainability and SSOT improvements for Clavis and adjacent Vox surfaces.
- VoxDB account-level persistence considerations and trust boundaries.
- Candidate Rust ecosystem dependencies for optional backend support.
Out of scope
- Immediate code changes to resolver precedence, SecretId inventory, or backend wiring.
- A final architecture decision on cloud-vault vs local-only storage policy.
- Concrete policy enforcement changes in vox ci beyond current guards.
Executive summary
Vox already has a healthy Clavis foundation:
- Canonical metadata in crates/vox-clavis/src/lib.rs.
- Clear resolution precedence and compatibility tiers.
- CI enforcement (secret-env-guard, clavis-parity) for drift prevention.
The main strategic risk is no longer "missing secret support." It is fragmentation and leakage pressure across an expanding AI + automation surface:
- Too many static credentials across domains (LLM, GPU providers, publication adapters, mesh, telemetry, DB, webhooks).
- AI toolchains increase the chance that resolved secrets can leak into prompts, tool output, traces, and logs.
- Environment variables remain useful but weak for lifecycle controls (rotation, auditability, and cross-machine consistency).
The recommended direction is a layered model:
- Keep Clavis as metadata and lookup SSOT.
- Reduce key count where possible via gateway and workload identity patterns.
- Distinguish irreducible domains where multiple credentials remain necessary.
- Add explicit redaction and secret-boundary rules for agent-facing data paths.
- Define account-scoped persistence policy for VoxDB with envelope encryption and role-scoped access semantics.
As-built Vox Clavis baseline (code-grounded)
These files form the current architecture baseline:
- crates/vox-clavis/src/lib.rs defines SecretId/SecretSpec, canonical env names, aliases, deprecation, and requirement bundles.
- crates/vox-clavis/src/resolver.rs implements precedence (env -> backend -> secure/compat stores) and status reporting.
- crates/vox-clavis/src/lib.rs controls backend mode selection (Auto, EnvOnly, Infisical, Vault, VoxCloud).
- crates/vox-clavis/src/backend/vox_vault.rs provides encrypted vault behavior backed by local file or Turso remote connection.
- crates/vox-clavis/src/sources/auth_json.rs manages ~/.vox/auth.json and secure keyring-backed token indirection.
- crates/vox-cli/src/commands/ci/run_body_helpers/guards.rs enforces secret-env-guard and clavis-parity.
- crates/vox-db/src/secrets.rs exposes a parallel keyring API surface that should be kept in explicit contract with Clavis boundaries.
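The precedence order attributed to resolver.rs (env -> backend -> secure/compat stores) can be illustrated with a minimal sketch. Every name here (Source, Layer, resolve, layer) is hypothetical and chosen for illustration; none is taken from vox-clavis, and each layer is a plain in-memory map standing in for a real lookup.

```rust
use std::collections::HashMap;

/// Where a value was found. Hypothetical; not the vox-clavis type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Env,
    Backend,
    CompatStore,
}

/// One lookup layer: a source label plus a name -> value map standing in
/// for a real environment, backend, or compatibility store.
pub struct Layer {
    pub source: Source,
    pub values: HashMap<String, String>,
}

/// Walk the layers in slice order and return the first hit with its source.
/// Determinism comes from the caller fixing the order of `layers`.
pub fn resolve(layers: &[Layer], name: &str) -> Option<(Source, String)> {
    layers
        .iter()
        .find_map(|l| l.values.get(name).map(|v| (l.source, v.clone())))
}

/// Build a layer from literal pairs (helper for examples and tests).
pub fn layer(source: Source, pairs: &[(&str, &str)]) -> Layer {
    Layer {
        source,
        values: pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}
```

Returning the source alongside the value is what makes status reporting possible: a caller can tell whether a secret came from the environment or from a lower-priority store.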
Current SSOT documentation is docs/src/reference/clavis-ssot.md.
C-L-A-V-I-S working mnemonic (research lens)
The codebase does not define this acronym formally. For this dossier, use it as an analytical lens:
- C - Canonical metadata: SecretId and canonical/alias naming policy.
- L - Lookup precedence: deterministic resolver order and compatibility semantics.
- A - Auth sources: backend + keyring + auth file + compatibility stores.
- V - Vault backends: local encrypted store and remote secret systems.
- I - Integration boundaries: CLI/MCP/runtime/database/publication/tooling surfaces.
- S - SSOT governance: docs parity, deprecation lifecycle, CI guardrails.
Industry pain points: why env-var secrets remain annoying
Lifecycle and auditability limitations
Environment variables are still simple and portable, but they do not natively provide:
- Read audit trails ("who accessed which secret, when").
- Rotation orchestration and expiry policy.
- Versioning and rollback of secret values.
- Drift detection across local, CI, and deployed environments.
Sources:
- OWASP Secrets Management Cheat Sheet
- Twelve-Factor config guidance
- Doppler analysis on env-var production limits
Exposure surface
- Env vars can leak via process inspection, crash dumps, shell history, and accidental logs.
- Repository leaks remain frequent; push-time scanning has become a baseline requirement.
Sources:
- OWASP NHI Top 10: Secret leakage
- GitHub push protection docs
- GitHub changelog: configurable push-protection patterns (GA)
Config-vs-credentials confusion
The classic guidance ("config in env vars") remains valid for non-sensitive deployment tuning, but modern practice increasingly separates credentials from generic config and applies stricter controls to credentials.
Source:
- Beyond Twelve-Factor: configuration, credentials, and code
2026 AI-era threat model deltas
Prompt injection + tool access multiplies blast radius
In agentic systems, untrusted content can influence tool calls and retrieval chains. This changes secret assumptions:
- It is not enough to "store securely"; the system must also prevent secret propagation into model-visible context.
- Capability metadata should be separated from secret material.
- Any accidental secret inclusion in prompt context may propagate to third-party model logs.
Sources:
- OWASP LLM Prompt Injection Prevention Cheat Sheet
- OpenClaw issue discussing API key exposure in model context
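One way to keep secret material apart from capability metadata, as the points above suggest, is a wrapper type whose formatted output never contains the value. The sketch below is an illustration under assumptions: Redacted, Capability, ToolCredential, and model_visible_summary are hypothetical names, not vox-clavis APIs. The secrecy crate listed in the Rust ecosystem appendix serves a similar purpose in production code.

```rust
use std::fmt;

/// Wrapper that keeps secret material out of Debug/Display output.
/// Hypothetical type for illustration only.
pub struct Redacted(String);

impl Redacted {
    pub fn new(value: impl Into<String>) -> Self {
        Redacted(value.into())
    }

    /// Explicit, greppable access point for the raw value.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

impl fmt::Display for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

/// Capability metadata that is safe to place in model-visible context.
#[derive(Debug)]
pub struct Capability {
    pub name: String,
    pub available: bool,
}

/// What a tool handler holds: metadata plus the secret, kept apart.
#[derive(Debug)]
pub struct ToolCredential {
    pub capability: Capability,
    pub secret: Redacted,
}

/// Build the string that would be shown to a model: metadata only.
pub fn model_visible_summary(cred: &ToolCredential) -> String {
    format!(
        "{}: available={}",
        cred.capability.name, cred.capability.available
    )
}
```

Because the derived Debug on ToolCredential delegates to the Redacted implementation, an accidental `{:?}` in a trace or prompt template prints the placeholder rather than the key.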
MCP local vs remote implications
- Local stdio MCP has an implicit trust boundary (host process owner).
- Remote MCP should favor OAuth 2.1 + PKCE and avoid query-parameter secrets.
Sources:
- MCP specification
- MCP registry authentication
- Zuplo: securing MCP server auth
Secret inventory stress-test: what can be reduced vs what is irreducible
Domains currently represented in Clavis inventory
- LLM provider keys and compatibility aliases.
- Cloud GPU provider keys.
- Publication/syndication adapters (GitHub, Zenodo, OpenReview, Crossref, social APIs).
- Vox platform tokens (mesh roles/JWT/HMAC/runtime ingress).
- VoxDB/Turso credentials.
- Telemetry upload secrets.
- Webhook verification/authentication secrets.
Reduction opportunities
- Inference routing consolidation
  - Keep OpenRouter-first as default cloud gate where suitable.
  - Optionally add self-hosted unified gateway pattern for enterprises requiring stronger governance.
- Identity-first cloud auth
  - Prefer workload identity and short-lived credentials where available.
- Token class simplification
  - Split "operator bootstrap tokens" from "runtime service credentials" from "per-account user BYOK material" so each class has clear lifecycle and storage expectations.
Likely irreducible categories
- Publication adapters using platform-specific OAuth/token contracts.
- GPU providers where no common broker fully replaces provider-native credentials.
- Cross-boundary webhook verification material.
- Mesh/routing auth when role-specific isolation is required.
Strategy to reduce key count while preserving power
1) Multi-provider gateway as default abstraction layer
- Use one Clavis-managed gateway credential for common LLM workloads.
- Keep direct provider keys optional for advanced use cases, fallback, or compliance constraints.
- Gate direct-provider mode behind explicit profile/capability flags.
Supporting references:
- LiteLLM gateway pattern
- AWS multi-provider generative AI gateway guidance
- LLM traffic governance concepts
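The gateway-by-default idea can be sketched as a small selection function. This is an illustration, not a proposed API: Profile, CredentialChoice, choose, and the allow_direct flag are hypothetical names, and the rule that an enabled direct key takes priority over the gateway is an assumption made for the example.

```rust
/// Which credential a request should use. Hypothetical names throughout.
#[derive(Debug, PartialEq, Eq)]
pub enum CredentialChoice<'a> {
    Gateway(&'a str),
    Direct { provider: &'a str, key: &'a str },
}

/// Inputs to the decision, borrowed from wherever secrets were resolved.
pub struct Profile<'a> {
    /// The single gateway credential, if configured.
    pub gateway_key: Option<&'a str>,
    /// Explicit opt-in flag for direct-provider mode.
    pub allow_direct: bool,
    /// Optional (provider, key) pairs for direct access.
    pub direct_keys: &'a [(&'a str, &'a str)],
}

/// Use a direct key only when the profile opts in and one exists for the
/// provider; otherwise fall back to the gateway credential.
pub fn choose<'a>(profile: &Profile<'a>, provider: &str) -> Option<CredentialChoice<'a>> {
    if profile.allow_direct {
        if let Some(&(p, k)) = profile.direct_keys.iter().find(|(p, _)| *p == provider) {
            return Some(CredentialChoice::Direct { provider: p, key: k });
        }
    }
    profile.gateway_key.map(CredentialChoice::Gateway)
}
```

With allow_direct left false, the presence of a direct key in the environment changes nothing, which keeps the common path on one credential.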
2) Move from static keys to short-lived identity where possible
- AWS: IAM Roles Anywhere or workload identity for non-AWS runtimes.
- Azure: Managed Identity where workloads run on Azure.
- GCP: Workload Identity Federation replacing service account keys.
Supporting references:
- AWS IAM Roles Anywhere
- Azure Managed Identity overview
- Google Workload Identity Federation
3) Dynamic secrets for databases and high-value services
- Prefer generated, short-TTL credentials from a vault backend for DB-like integrations.
- Use static long-lived credentials only when dynamic issuance is unavailable.
Supporting reference:
- HashiCorp Vault dynamic database credentials tutorial
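A short-TTL credential differs from a static key mainly in that the consumer must track its lifetime. The sketch below models that bookkeeping only; Lease and its methods are hypothetical, issuance by a vault backend is not shown, and the two-thirds renewal threshold is an arbitrary assumption.

```rust
use std::time::{Duration, Instant};

/// A short-lived credential with an explicit lifetime.
/// Hypothetical type; a real lease would come from the issuing backend.
pub struct Lease {
    pub username: String,
    issued_at: Instant,
    ttl: Duration,
}

impl Lease {
    pub fn new(username: &str, issued_at: Instant, ttl: Duration) -> Self {
        Lease {
            username: username.to_string(),
            issued_at,
            ttl,
        }
    }

    /// Time since issuance, clamped to zero if `now` precedes it.
    fn elapsed(&self, now: Instant) -> Duration {
        now.saturating_duration_since(self.issued_at)
    }

    /// True once the full TTL has passed; the credential must not be used.
    pub fn is_expired(&self, now: Instant) -> bool {
        self.elapsed(now) >= self.ttl
    }

    /// True once two thirds of the TTL has passed (assumed threshold),
    /// leaving headroom to fetch a replacement before expiry.
    pub fn should_renew(&self, now: Instant) -> bool {
        self.elapsed(now) >= self.ttl * 2 / 3
    }
}
```

Passing `now` in as a parameter rather than reading the clock inside the methods keeps the expiry logic testable without sleeping.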
Maintainability and SSOT improvements for Clavis
Keep one contract, many adapters
Maintain SecretSpec as the canonical control plane and treat backends as pluggable retrieval adapters. This keeps naming policy, required/optional semantics, deprecation windows, and docs parity centralized.
Clarify the vox-db::secrets boundary
Document and enforce one of two explicit outcomes:
- vox-db::secrets is a narrow low-level primitive and all product secret policy remains in Clavis; or
- vox-db::secrets callsites migrate behind Clavis APIs to avoid dual behavior surfaces.
Unowned overlap should be considered an SSOT risk.
Expand CI checks from parity to data-flow safety
Current checks already prevent direct env reads and docs drift. Future enforcement candidates:
- Secret value redaction checks in structured logs and telemetry.
- Guardrails preventing ResolvedSecret serialization to user/model-visible channels.
- Additional policy checks for deprecated alias removal readiness.
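A redaction check of the kind listed above could start as a plain substring scan over emitted log lines, run in tests with known fixture secrets. This is a minimal sketch under that assumption; contains_secret and redact are hypothetical helpers, and exact-match scanning would miss encoded or truncated values.

```rust
/// Return true if any known secret value appears verbatim in the line.
/// Empty strings are skipped so they never match everything.
pub fn contains_secret(line: &str, secrets: &[&str]) -> bool {
    secrets.iter().any(|&s| !s.is_empty() && line.contains(s))
}

/// Replace every verbatim occurrence of a known secret with a placeholder.
pub fn redact(line: &str, secrets: &[&str]) -> String {
    let mut out = line.to_string();
    for &s in secrets {
        if !s.is_empty() {
            out = out.replace(s, "[REDACTED]");
        }
    }
    out
}
```

A CI test could resolve fixture secrets, exercise a code path, capture its structured log output, and assert that contains_secret is false for every line.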
VoxDB account-level persistence: research directions
Account-level persistence should start with explicit threat-model choices:
- Device-local trust only (keyring-backed, optional cloud sync disabled).
- Account-synced encrypted vault (VoxDB/Turso stores ciphertext only; master key outside DB rows).
- Hybrid (local default; optional account sync for selected secrets/classes).
Research criteria:
- Secret classification by blast radius.
- Key hierarchy and envelope encryption design.
- Rotation semantics and credential version tracking.
- Access controls per account/workspace/profile.
- Incident response path (revoke, rotate, invalidate, replay-safe propagation).
Rust ecosystem options (appendix for future implementation)
These are candidates, not commitments:
- Existing baseline in vox-clavis: secrecy, keyring, aes-gcm, blake3, turso.
- HashiCorp Vault client: vaultrs.
- AWS Secrets Manager: aws-sdk-secretsmanager.
- Google Secret Manager: google-cloud-secretmanager-v1.
- Linux secret service internals: secret-service.
- Memory hygiene support: secrecy docs, zeroize docs.
Guidance:
- Keep backend crates behind optional features to control compile and MSRV impact.
- Preserve deterministic fallback behavior when optional backends are not enabled.
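The two guidance points can be combined with conditional compilation: the backend function always exists, but returns a distinct "disabled" result when its feature is off. The sketch below assumes a Cargo feature named backend-vault, which is hypothetical, as are BackendLookup, vault_lookup, and resolve; no real Vault client call is shown.

```rust
/// Outcome of asking an optional backend for a value.
#[derive(Debug, PartialEq, Eq)]
pub enum BackendLookup {
    Found(String),
    NotFound,
    /// Backend support was not compiled in.
    Disabled,
}

/// Compiled only when the (hypothetical) `backend-vault` feature is on.
#[cfg(feature = "backend-vault")]
pub fn vault_lookup(name: &str) -> BackendLookup {
    // Placeholder: a real implementation would query a Vault client here.
    let _ = name;
    BackendLookup::NotFound
}

/// Fallback with the same signature when the feature is off.
#[cfg(not(feature = "backend-vault"))]
pub fn vault_lookup(_name: &str) -> BackendLookup {
    BackendLookup::Disabled
}

/// Env value wins; otherwise consult the backend. A disabled backend
/// behaves exactly like a miss, so results do not depend on build flags
/// for secrets that are present in the environment.
pub fn resolve(name: &str, env_value: Option<String>) -> Option<String> {
    if let Some(v) = env_value {
        return Some(v);
    }
    match vault_lookup(name) {
        BackendLookup::Found(v) => Some(v),
        BackendLookup::NotFound | BackendLookup::Disabled => None,
    }
}
```

Keeping Disabled separate from NotFound lets status reporting explain why a lookup missed without changing the resolution result.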
Security issues to address explicitly
- Secret-in-context leaks for AI paths (prompt/tool serialization boundaries).
- Secret-in-log leaks (including debug, telemetry, panic messages).
- Static key overuse where identity federation is available.
- Dual-storage ambiguity (vox-db keyring helpers vs Clavis-managed surfaces).
- Rotation gaps for optional integrations (social/publisher/provider keys with long lifetimes).
- Insufficient metadata on secret lifecycle state (age, source, rotation status, owner, scope).
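The lifecycle fields named in the last item (age, source, rotation status, owner, scope) could be carried as a small record next to each secret's metadata. The sketch below is one possible shape, not a schema proposal: all type and field names are hypothetical, and the 75% "due" threshold is an arbitrary assumption.

```rust
use std::time::Duration;

/// Where the value currently lives. Hypothetical enumeration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SecretSource {
    Env,
    Keyring,
    Vault,
}

/// Rotation state derived from age and policy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RotationStatus {
    Fresh,
    Due,
    Overdue,
}

/// Lifecycle metadata only; the secret value itself is not stored here.
pub struct SecretLifecycle {
    pub owner: String,
    pub scope: String,
    pub source: SecretSource,
    pub age: Duration,
    pub max_age: Duration,
}

impl SecretLifecycle {
    /// Overdue at or past max_age; Due from 75% of max_age (assumed).
    pub fn rotation_status(&self) -> RotationStatus {
        if self.age >= self.max_age {
            RotationStatus::Overdue
        } else if self.age >= self.max_age * 3 / 4 {
            RotationStatus::Due
        } else {
            RotationStatus::Fresh
        }
    }
}
```

Because the record holds no secret material, it can be logged or surfaced in status output without redaction.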
Greenfield feasibility proof (code-evidenced)
Conclusion
Yes, greenfield cutover is feasible, but only with explicit compatibility cuts accepted up front.
If compatibility aliases and parallel env paths are not preserved, current users relying on those paths will break immediately by design.
Evidence: where secret-like env reads still bypass Clavis
- Clavis itself is env-first by design
  - crates/vox-clavis/src/lib.rs (resolve_secret) auto-selects backend based on env probes (VOX_TURSO_URL, INFISICAL_*, VAULT_*) before fallback.
  - crates/vox-clavis/src/sources/env.rs resolves canonical env, aliases, and deprecated aliases.
- DB credential path remains parallel
  - crates/vox-db/src/config.rs reads VOX_DB_* and compatibility aliases (VOX_TURSO_*, TURSO_*) directly.
- MCP HTTP gateway tokens are env-only today
  - crates/vox-orchestrator/src/mcp_tools/http_gateway.rs reads VOX_MCP_HTTP_BEARER_TOKEN and VOX_MCP_HTTP_READ_BEARER_TOKEN.
- Runtime model registry can read arbitrary api_key env names
  - crates/vox-runtime/src/llm/types.rs checks api_key_env via std::env::var before provider-specific Clavis fallback.
- Publisher OpenReview path is mixed
  - crates/vox-publisher/src/publication_preflight.rs reads OPENREVIEW_ACCESS_TOKEN / VOX_OPENREVIEW_ACCESS_TOKEN directly while also using Clavis for email/password.
- Orchestrator still reads social credentials directly
  - crates/vox-orchestrator/src/config/impl_env.rs reads VOX_SOCIAL_REDDIT_* and VOX_SOCIAL_YOUTUBE_*.
- CI already enforces a partial boundary
  - crates/vox-cli/src/commands/ci/run_body_helpers/guards.rs has secret-env-guard and clavis-parity, proving policy intent but not total migration completion.
Breakpoints if compatibility is intentionally skipped
- Existing env-only deployments using Turso legacy aliases fail immediately.
- MCP HTTP deployments expecting VOX_MCP_HTTP_*TOKEN envs fail auth startup if not remapped.
- Runtime registry entries that rely on api_key_env fail provider auth unless replaced.
- OpenReview token-only paths fail unless a Clavis-native equivalent is introduced.
- Orchestrator social integrations fail unless Clavis-backed loading is wired consistently.
Minimal guardrails required even in greenfield mode
- Keep one documented "hard cut" release boundary and reject legacy secret names at startup.
- Fail-closed secret resolution for production profiles (missing/invalid secret must stop action).
- Enforce no-secret-in-context/no-secret-in-logs checks in CI for MCP/runtime/tool outputs.
- Require explicit source annotation for each secret read path (Clavis, keyring, vault, none).
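Three of these guardrails (rejecting legacy names at startup, fail-closed resolution in production, and source annotation) can be sketched together. All names below are hypothetical, including the Profile, ReadSource, and ResolveError types and the check_legacy and require functions; the allowance for a missing secret in a Dev profile is an assumption of the example.

```rust
/// Runtime profile; only Production is fail-closed in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Dev,
    Production,
}

/// Source annotation for a secret read path.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReadSource {
    Clavis,
    Keyring,
    Vault,
    None,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    Missing(String),
    LegacyName(String),
}

/// Startup check: refuse to run if any legacy secret name is present.
pub fn check_legacy(present: &[&str], legacy: &[&str]) -> Result<(), ResolveError> {
    for &name in present {
        if legacy.contains(&name) {
            return Err(ResolveError::LegacyName(name.to_string()));
        }
    }
    Ok(())
}

/// Fail closed in Production when nothing was found; in Dev, report the
/// miss with an explicit `ReadSource::None` annotation instead.
pub fn require(
    profile: Profile,
    name: &str,
    found: Option<(ReadSource, String)>,
) -> Result<(ReadSource, Option<String>), ResolveError> {
    match (found, profile) {
        (Some((source, value)), _) => Ok((source, Some(value))),
        (None, Profile::Production) => Err(ResolveError::Missing(name.to_string())),
        (None, Profile::Dev) => Ok((ReadSource::None, None)),
    }
}
```

Returning an error rather than an empty string for a missing production secret forces the caller to stop the action instead of proceeding unauthenticated.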
2026 platform decision matrix for Vox Cloudless
Compliance and liability notes below are technical risk framing, not legal advice.
| Platform | Capability depth | Rust integration path | Lock-in | Operational burden | Compliance/liability posture | Cloudless fit | AI-agent leakage risk profile |
|---|---|---|---|---|---|---|---|
| HashiCorp Vault | Very high (dynamic secrets, PKI, transit, policy) | HTTP API / optional vaultrs | Medium-high | High (HA, unseal, policy ops) | Strong control if operated well; ops failures are your liability | High (self-host) | Low-moderate if strict policy/redaction; high if broad token scopes |
| OpenBao (Vault-compatible fork) | High (Vault-style model) | HTTP API / Vault-compatible clients | Medium | High | Similar to Vault; self-host governance burden remains | High (self-host) | Similar to Vault; depends on policy discipline |
| Infisical (self-host/cloud) | High for app secrets and team workflows | HTTP API / existing Clavis backend direction | Medium | Medium | Better DX; self-host shifts liability to operator, SaaS shifts trust to vendor | High for self-host, medium for SaaS | Moderate; strong if centralized policy + short-lived access tokens |
| AWS Secrets Manager | High in AWS-centric estates | AWS SDK / HTTP + IAM | High | Low-medium (in AWS) | Strong cloud-native controls; vendor + IAM misconfig risk | Low-medium (not cloudless-first) | Moderate; strong server-side controls, but cross-env copying remains risk |
| Azure Key Vault | High in Azure-centric estates | Azure SDK / HTTP + Entra ID | High | Low-medium (in Azure) | Strong enterprise posture in Azure; identity/RBAC hygiene required | Low-medium | Moderate; similar to AWS pattern |
| GCP Secret Manager | High in GCP-centric estates | GCP SDK / HTTP + IAM | High | Low-medium (in GCP) | Strong in GCP compliance envelope; IAM complexity remains | Low-medium | Moderate; similar to AWS/Azure pattern |
| Doppler | Medium-high (excellent env distribution workflow) | CLI/API integration | High | Low | Vendor-managed security posture; contractual/vendor dependency | Low for strict cloudless | Moderate; centralization helps, but downstream prompt/log boundaries still yours |
| 1Password Secrets Automation | Medium (strong team secret workflows, less dynamic infra auth) | CLI/API/Connect server | Medium-high | Low-medium | Strong for org workflows; vendor dependence and service-account model | Medium | Moderate; good human+machine hygiene, still needs output redaction controls |
| SOPS + age | Medium (great static secret files, weaker dynamic issuance) | CLI-driven workflow (not runtime API-first) | Low-medium | Medium (process-heavy) | Strong Git history controls if managed well; key custody risk on operator | High | Moderate-high if decrypted artifacts leak in CI/tool logs |
| OS keyring only | Low-medium (device-local only) | Existing keyring crate usage | Medium (OS APIs) | Low | Good local boundary; weak central audit/revocation | High local-only | Moderate; local safety good, team-scale governance weak |
Sources for platform matrix
- HashiCorp Vault docs
- OpenBao
- Infisical and Infisical GitHub
- AWS Secrets Manager
- Azure Key Vault
- Google Secret Manager
- Doppler pricing/product
- 1Password Secrets Automation
- SOPS
- age
Vox Cloudless operating models
flowchart LR
localFirst[LocalFirst_KeyringOnly] --> hybrid[Hybrid_KeyringPlusVoxDBCiphertext]
hybrid --> managedSelfHost[ManagedSelfHost_VaultOrInfisical]
hybrid --> managedCloud[ManagedCloud_SM]
Local-first (KeyringOnly)
- Secret classes owned: local developer/provider keys, short-lived sandbox credentials.
- Blast radius: device compromise + local process leakage.
- Operator burden: low.
- Developer ergonomics: high for single-user/dev machines; weak for team sharing/rotation/audit.
Hybrid (Keyring + VoxDB ciphertext)
- Secret classes owned: account-scoped keys, cross-device sync classes, policy metadata.
- Blast radius: account compromise can expose encrypted corpus if key hierarchy is weak.
- Operator burden: medium.
- Developer ergonomics: strong balance; one control plane with local bootstrap.
Managed self-host (Vault/Infisical backend)
- Secret classes owned: production/system secrets requiring policy and audit controls.
- Blast radius: backend compromise can be broad without segmentation.
- Operator burden: high (especially Vault-class operations).
- Developer ergonomics: medium-high after setup; high policy power.
Managed cloud secret manager
- Secret classes owned: cloud-native runtime credentials in a single cloud boundary.
- Blast radius: IAM/policy mistakes can cross workloads quickly.
- Operator burden: low-medium.
- Developer ergonomics: high in one cloud, lower in multi-cloud/cloudless narratives.
In-house vs vendor boundary (technical and liability lens)
Potential gains from in-house Cloudless model
- Unified SSOT semantics under Clavis across all providers/services.
- Lower long-term vendor lock-in pressure for core secret logic.
- Better control over agent-specific no-leak constraints and audit model.
- Ability to optimize for VoxDB account-level workflow directly.
Costs and liabilities of in-house model
- You own incident response, key hierarchy mistakes, and rotation failures.
- You own secure defaults, audit retention correctness, and operational uptime.
- Compliance claims become implementation-dependent on your controls and evidence.
What should usually remain external
- Hardware-rooted key custody and cloud identity federation primitives.
- Commodity secret scanning and provider-specific security telemetry.
- High-assurance compliance attestations that require dedicated governance staffing.
Research gates (implementation readiness)
- Gate A: surface proof complete
  - Direct env + Clavis + parallel secret stores fully enumerated and source-linked.
- Gate B: platform decision matrix complete
  - Candidate platforms scored against Cloudless objectives and constraints.
- Gate C: liability/ops boundary complete
  - Explicit split of in-house vs vendor responsibilities.
- Gate D: implementation input package complete
  - Non-negotiables, constraints, and success criteria ready for engineering plan.
Open research questions (feeding a later implementation plan)
- What is the canonical account-scoped secret object in VoxDB (shape, encryption envelope, audit metadata)?
- How should Clavis represent short-lived federated credentials vs static API keys in one model?
- Which secrets can be fully abstracted behind one gateway credential, and which must remain explicit?
- What minimum policy guarantees should apply to all MCP tool outputs and traces regarding secret redaction?
- Which hard-cut release boundary should enforce greenfield compatibility removal, and how is it validated in CI?
Research bibliography
- OWASP Secrets Management Cheat Sheet
- OWASP NHI Top 10 - Secret Leakage
- OWASP LLM Prompt Injection Prevention Cheat Sheet
- NIST SP 800-57 Part 1 Rev. 6 (IPD)
- RFC 8693 OAuth 2.0 Token Exchange
- The Twelve-Factor App - Config
- Beyond Twelve-Factor: configuration, credentials, and code
- GitHub secret scanning and push protection docs
- GitHub changelog: push protection pattern configuration
- MCP specification
- MCP registry authentication
- Zuplo: securing MCP server auth
- LiteLLM repository
- AWS multi-provider generative AI gateway guidance
- Solo.io LLM traffic governance topic
- AWS IAM Roles Anywhere
- Azure Managed Identity overview
- Google Workload Identity Federation
- HashiCorp Vault dynamic database credentials tutorial
- secrecy crate docs
- zeroize crate docs
- vaultrs crate
- aws-sdk-secretsmanager crate
- google-cloud-secretmanager-v1 crate
- secret-service crate docs