Failover chain
AI providers go down. Rate limits trigger. Keys exhaust their credit. Pendraic's failover chain is a 4-slot ladder that walks past failures so a writer mid-flow doesn't lose their AI.
The four slots
- Slot 1, preferred model. Your default. Set in the Preferred Model picker.
- Slot 2, first backup. Usually a similar model from a different provider (Sonnet → GPT-4o), or a cheaper model from the same provider when cost savings matter more (Sonnet → Haiku).
- Slot 3, second backup. A cheaper model from any provider, such as Haiku, GPT-4o-mini, or Gemini Flash.
- Slot 4, last resort. Anything that'll keep the writer moving. Often the cheapest available model you have a key for.
When the chain walks
The runtime moves to the next slot when the current one returns:
- HTTP 429 (rate limited): try the next slot immediately.
- HTTP 5xx (provider down): try the next slot.
- Provider credit exhausted: try the next slot.
- Auth failure on the slot's key: try the next slot.
On a successful generation, the chain returns the result with a small badge indicating which slot served the call, so you can tell when your primary was skipped.
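The walk described above can be sketched in a few lines of Python. This is a minimal illustration, not Pendraic's runtime: `Slot`, `SlotError`, `should_walk`, `generate`, and the reason strings `"credit_exhausted"` and `"auth_failure"` are all hypothetical names. The sketch also assumes that errors outside the four listed triggers are not walked past, and that an error is raised when every slot fails; the text above does not specify either case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class SlotError(Exception):
    """Hypothetical error raised by a slot's call (not a Pendraic API)."""

    def __init__(self, reason: str, status: Optional[int] = None):
        super().__init__(reason)
        self.reason = reason
        self.status = status


@dataclass
class Slot:
    number: int                 # position in the chain, 1 to 4
    model: str
    call: Callable[[str], str]  # hypothetical: prompt in, generated text out


def should_walk(err: SlotError) -> bool:
    """True for the four failure kinds that trigger a walk."""
    if err.status == 429:  # rate limited
        return True
    if err.status is not None and 500 <= err.status <= 599:  # provider down
        return True
    # Reason strings are illustrative placeholders.
    return err.reason in {"credit_exhausted", "auth_failure"}


def generate(chain: List[Slot], prompt: str) -> Tuple[str, int]:
    """Return (text, number of the slot that served the call)."""
    last_error: Optional[SlotError] = None
    for slot in chain:
        try:
            return slot.call(prompt), slot.number
        except SlotError as err:
            if not should_walk(err):
                raise  # assumption: other errors are not failover triggers
            last_error = err
    # Assumption: the call fails once every slot has been tried.
    raise RuntimeError("every slot in the chain failed") from last_error


# Usage: slot 1 is rate limited, so slot 2 serves the call.
def rate_limited(prompt: str) -> str:
    raise SlotError("rate_limited", status=429)


def healthy(prompt: str) -> str:
    return "draft text"


chain = [Slot(1, "sonnet", rate_limited), Slot(2, "gpt-4o", healthy)]
text, served_by = generate(chain, "Continue the scene")
# served_by is 2 here, which is what the badge would report.
```

Returning the slot number alongside the text is what makes the badge possible: the caller never has to guess which model answered.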
Designing a chain
A defensible chain has two properties: provider diversity (don't put all four slots on one provider, because when that provider goes down, every slot goes down with it) and cost gradient (later slots should be cheaper so you don't pay frontier prices when the runtime falls back). A reasonable example: Sonnet → GPT-4o → Haiku → GPT-4o-mini.
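Both properties are easy to check mechanically. The sketch below is one way to do it, under stated assumptions: `ChainEntry` and `check_chain` are hypothetical names, and `cost_rank` is a placeholder ordering (higher means more expensive), not real pricing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChainEntry:
    model: str
    provider: str
    cost_rank: int  # placeholder ordering, higher = more expensive


def check_chain(chain: List[ChainEntry]) -> List[str]:
    """Return warnings; an empty list means both properties hold."""
    warnings: List[str] = []

    # Provider diversity: flag a chain that sits entirely on one provider.
    providers = {entry.provider for entry in chain}
    if len(chain) > 1 and len(providers) == 1:
        warnings.append("all slots use one provider")

    # Cost gradient: each later slot should cost no more than the one before.
    for earlier, later in zip(chain, chain[1:]):
        if later.cost_rank > earlier.cost_rank:
            warnings.append(f"{later.model} costs more than {earlier.model}")

    return warnings


# The example chain from the text, with illustrative cost ranks.
example = [
    ChainEntry("Sonnet", "Anthropic", 4),
    ChainEntry("GPT-4o", "OpenAI", 3),
    ChainEntry("Haiku", "Anthropic", 2),
    ChainEntry("GPT-4o-mini", "OpenAI", 1),
]
problems = check_chain(example)  # no warnings for this chain
```

Note that the check only rejects a chain that uses a single provider throughout. The example chain repeats providers across slots and still passes, because an outage at either one leaves two slots standing.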
The master switch
The Preferred Model surface has a master toggle for failover behavior. Off = strict (the call fails when slot 1 fails). On = graceful (the chain walks). Most writers want it on; a few specialists want strict so they always know which model wrote which paragraph.
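In effect, the toggle decides how much of the chain the runtime is allowed to try. A minimal sketch, assuming a hypothetical helper `slots_to_try` and a chain represented as a list of model names:

```python
from typing import List, Sequence


def slots_to_try(chain: Sequence[str], failover_on: bool) -> List[str]:
    """Graceful mode tries the whole chain; strict mode tries slot 1 only."""
    return list(chain) if failover_on else list(chain[:1])


chain = ["Sonnet", "GPT-4o", "Haiku", "GPT-4o-mini"]
graceful = slots_to_try(chain, failover_on=True)   # all four slots
strict = slots_to_try(chain, failover_on=False)    # ["Sonnet"]
```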

