Design a model upgrade migration runbook
AI & Automation
Updated 4/17/2026
Description
A new model version just dropped and your team wants to upgrade tomorrow. This prompt designs a migration runbook (eval comparison, cost diff, cohort rollout, rollback criteria) so you can ship the upgrade without breaking the production behavior users depend on.
Example Usage
You are writing a model upgrade runbook for {{current_model}} → {{new_model}} on {{ai_feature}}.
## Pre-migration checklist
### 1. Eval comparison
Run full eval suite on both models:
- Accuracy per task category
- Latency (p50, p99)
- Cost per call
- Safety / refusal behavior
- User-facing quality (if available)
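As an illustration of the latency line in the list above, here is a minimal sketch of computing p50 and p99 from raw per-call latencies. The function names are hypothetical, and the nearest-rank definition of a percentile is an assumption; your eval harness may use an interpolating definition instead.

```python
def percentile(samples, p):
    """Nearest-rank percentile; p is an integer from 1 to 100 (assumed definition)."""
    ordered = sorted(samples)
    # Integer form of ceil(p * n / 100), which avoids float rounding surprises.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_summary(latencies_ms):
    """Summarize one model's latencies so two models can be compared side by side."""
    return {
        "p50": percentile(latencies_ms, 50),
        "p99": percentile(latencies_ms, 99),
    }
```

Run it once per model on the same eval inputs and compare the two summaries.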
### 2. Regression analysis
- Which tasks got worse?
- Which tasks got better?
- Which tasks are noisy (no clear direction)?
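One way to answer the three questions above is to bucket tasks by their accuracy change. This is a sketch under stated assumptions: accuracies are fractions between 0 and 1, and `noise_band` is a hypothetical threshold (here one percentage point) that you would normally derive from run-to-run variance of your own eval suite.

```python
def classify_tasks(current, new, noise_band=0.01):
    """Bucket each task as worse, better, or noisy by its accuracy delta.

    current / new: dicts mapping task name -> accuracy (0..1) per model.
    noise_band: assumed threshold below which a change counts as noise.
    """
    buckets = {"worse": [], "better": [], "noisy": []}
    for task in sorted(current):
        delta = new[task] - current[task]
        if delta < -noise_band:
            buckets["worse"].append(task)
        elif delta > noise_band:
            buckets["better"].append(task)
        else:
            buckets["noisy"].append(task)
    return buckets
```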
### 3. Cost impact
- Total cost delta at current volume
- Cost per user delta
- Sensitivity at 2x volume
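The cost arithmetic above can be sketched as follows. All names and parameters are illustrative. The sketch assumes cost scales linearly with call volume and that the user count stays fixed when volume doubles, so the per-user delta doubles too; if growth comes from new users instead, the per-user figure would stay flat.

```python
def cost_impact(calls_per_month, users, old_cost_per_call, new_cost_per_call,
                volume_multiplier=1.0):
    """Return the monthly cost delta of switching models.

    Assumes linear scaling with volume and a fixed user count.
    """
    calls = calls_per_month * volume_multiplier
    total_delta = calls * (new_cost_per_call - old_cost_per_call)
    return {
        "total_delta": total_delta,
        "per_user_delta": total_delta / users,
    }
```

Calling it once with `volume_multiplier=1.0` and once with `volume_multiplier=2.0` gives the current-volume delta and the 2x sensitivity figure.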
## Rollout plan
### Cohort 1 — Internal (48h)
- Employee accounts, low stakes
- Track error rate and run a manual quality check
### Cohort 2 — Canary 5% (72h)
- Random 5% of users
- Compare against control group
- Key metrics: activation, satisfaction, support tickets
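A sketch of one way to pick the canary group. It swaps pure random sampling for deterministic hashing of the user ID, so a given user stays in the same arm for the whole 72 hours. The function name, the `salt` value, and the string user IDs are assumptions for illustration.

```python
import hashlib


def in_canary(user_id: str, percent: float = 5.0, salt: str = "model-upgrade") -> bool:
    """Deterministically assign roughly `percent` of users to the canary arm."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # stable bucket in 0..9999
    return bucket < percent * 100  # e.g. 5% -> buckets 0..499
```

Changing the salt reshuffles assignments, which helps avoid reusing the same canary users across unrelated experiments.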
### Cohort 3 — 50% (1 week)
- A/B split for statistical significance
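For a binary metric such as activation, significance in the A/B split could be checked with a pooled two-proportion z-test. This is a minimal sketch, not the only valid test; the function name is hypothetical, and it assumes independent users and samples large enough for the normal approximation.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the rate in arm B versus arm A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both arms are all-success or all-failure
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value
```

Decide the significance threshold before the cohort starts rather than after looking at the data.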
### Cohort 4 — 100% + monitoring
- Full rollout with enhanced monitoring for 2 weeks
## Rollback criteria
- Accuracy regression >5% on any load-bearing task
- Latency regression >2x
- Support tickets >2x baseline
- Cost >2x the projected figure (unless the increase was planned and approved in advance)
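The criteria above can be sketched as an automated check. Several details are assumptions: the `Snapshot` fields are hypothetical, "5%" is read as five absolute percentage points of accuracy, latency is compared at p99, and cost is compared per day against a projected figure.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    accuracy_by_task: dict          # task name -> accuracy (0..1)
    p99_latency_ms: float
    support_tickets_per_day: float
    cost_per_day: float


def rollback_reasons(baseline, live, expected_cost_per_day, load_bearing_tasks,
                     cost_increase_approved=False):
    """Return the list of tripped rollback criteria (empty means keep going)."""
    reasons = []
    for task in load_bearing_tasks:
        drop = baseline.accuracy_by_task[task] - live.accuracy_by_task[task]
        if drop > 0.05:  # assumed: 5 absolute percentage points
            reasons.append(f"accuracy:{task}")
    if live.p99_latency_ms > 2 * baseline.p99_latency_ms:
        reasons.append("latency")
    if live.support_tickets_per_day > 2 * baseline.support_tickets_per_day:
        reasons.append("support_tickets")
    if live.cost_per_day > 2 * expected_cost_per_day and not cost_increase_approved:
        reasons.append("cost")
    return reasons
```

A monitoring job could call this on a schedule and page the on-call owner, or flip the feature flag, whenever the returned list is non-empty.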
## Output
1. Eval comparison report
2. Cohort rollout schedule with success criteria
3. Rollback criteria with automated triggers where possible
4. The one task we'd watch most carefully during rollout