KERNEL
Signals → Eval → Reflect → Patch → A/B → Promote
CYCLES · 30D
184
loops completed
PROMOTIONS · 30D
41
patches that won
ROLLBACKS · 30D
7
auto-reverted in window
ACTIVE EXPERIMENTS
4
in eval / A-B / shadow
EVAL HARNESS · CASES
1,420
frozen + auto-extending
The Kernel Loop
~4-hour mean cycle time · 6 stages · gated promotion
SIGNALS
production decisions, rejections, escalations
EVAL
frozen suite + new failure cases
REFLECT
agent self-critique on misses
PROPOSE
patch: prompt · skill · LoRA · policy
A / B TEST
shadow → 1% → 5% → 25%
PROMOTE
win? promote. lose? auto-rollback.
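The ramp and the promote-or-rollback rule above can be sketched roughly as follows. This is a minimal illustration, not the kernel's actual code: `RAMP`, `Outcome`, `run_ramp`, and the `patch_wins_at` callback are hypothetical names, and shadow mode is assumed to mean 0% live traffic.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Traffic ramp from the A/B stage; "shadow" is assumed to carry 0% live traffic.
RAMP: List[Tuple[str, float]] = [
    ("shadow", 0.0),
    ("1%", 0.01),
    ("5%", 0.05),
    ("25%", 0.25),
]


@dataclass
class Outcome:
    promoted: bool
    stopped_at: str  # name of the last ramp stage that was evaluated


def run_ramp(patch_wins_at: Callable[[str], bool]) -> Outcome:
    """Walk the ramp in order; roll back at the first stage the patch loses."""
    for name, _fraction in RAMP:
        if not patch_wins_at(name):
            # Lose at any stage: stop here and revert (auto-rollback).
            return Outcome(promoted=False, stopped_at=name)
    # Won at every stage: promote.
    return Outcome(promoted=True, stopped_at=RAMP[-1][0])
```

A caller would supply its own comparison against the baseline version as `patch_wins_at`; the sketch only fixes the order of stages and the stop-on-loss behaviour.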
The promise: when a customer accepts a proposal, that's a positive example. When they reject one, that's a hard negative. When they edit one, that's a directional gradient. The kernel never stops learning, and it never stops doing so safely.
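One way to picture the mapping from customer actions to training signals is the sketch below. All names (`Feedback`, `TrainingSignal`, `to_signal`) are hypothetical, and treating an edit as a preference pair (edited text preferred over the original proposal) is an assumption about what "directional gradient" means here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Feedback(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    EDIT = "edit"


@dataclass
class TrainingSignal:
    kind: str                        # "positive", "hard_negative", or "preference_pair"
    proposal: str                    # what the agent originally proposed
    preferred: Optional[str] = None  # customer's edited version, if any


def to_signal(feedback: Feedback, proposal: str,
              edited: Optional[str] = None) -> TrainingSignal:
    """Map one customer action to one training signal."""
    if feedback is Feedback.ACCEPT:
        return TrainingSignal("positive", proposal)
    if feedback is Feedback.REJECT:
        return TrainingSignal("hard_negative", proposal)
    # EDIT: assumed to mean "the edited text is better than the proposal".
    if edited is None:
        raise ValueError("an EDIT needs the edited text")
    return TrainingSignal("preference_pair", proposal, preferred=edited)
```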
Active experiments
4 in flight
AB · +2.1 F1 · traffic: 5% · vs v24 · promote in 4h
AB · −0.2 PR · traffic: 1% · vs v15 · rollback in 1h
PROPOSE · — · traffic: 0% · vs v9 · shadow eval
REFLECT · — · traffic: 0% · vs v18 · patch drafting
Auto-rollbacks
regression-gated
Maintenance Optimizer · predictive PM
v22 → v21
Regressed on stamping line; eval F1 dropped 0.04 vs baseline.
5h ago
Safety rails
policy contracts
EVAL GATE THRESHOLD
≥ baseline + 1σ
AUTO-ROLLBACK WINDOW
60 min
MAX TRAFFIC ON UNVERIFIED
5%
SHADOW EVAL CYCLE
every 4h
FROZEN-EVAL SUITE SIZE
1,420 cases
PRIVACY BOUND
rule-only · per-tenant LoRA
No ungated change has reached production in the last 184 cycles.
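The safety rails above could be expressed as a small policy object with three checks, as in the sketch below. Everything here is illustrative: `SafetyRails` and the helper functions are hypothetical names, and σ is assumed to be the sample standard deviation of the baseline's scores over repeated eval runs, which the dashboard does not specify.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Sequence


@dataclass(frozen=True)
class SafetyRails:
    gate_sigmas: float = 1.0              # eval gate: >= baseline + 1 sigma
    rollback_window_min: int = 60         # auto-rollback window
    max_unverified_traffic: float = 0.05  # max traffic on unverified patches


RAILS = SafetyRails()


def passes_eval_gate(candidate_score: float,
                     baseline_scores: Sequence[float],
                     rails: SafetyRails = RAILS) -> bool:
    """True if the candidate clears baseline mean + k * sigma.

    Assumption: sigma is the sample standard deviation of repeated
    baseline eval runs (needs at least two scores).
    """
    threshold = mean(baseline_scores) + rails.gate_sigmas * stdev(baseline_scores)
    return candidate_score >= threshold


def clamp_traffic(requested: float, verified: bool,
                  rails: SafetyRails = RAILS) -> float:
    """Cap the traffic share of a patch that has not yet been verified."""
    return requested if verified else min(requested, rails.max_unverified_traffic)


def should_auto_rollback(minutes_since_rollout: float, regressed: bool,
                         rails: SafetyRails = RAILS) -> bool:
    """Revert automatically only if a regression shows up inside the window."""
    return regressed and minutes_since_rollout <= rails.rollback_window_min
```

For example, with baseline runs of 0.80, 0.82, and 0.84 the threshold under these assumptions is about 0.84, so a candidate scoring 0.85 passes the gate and one scoring 0.83 does not.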