Team Modes: Delivery Mode vs. Rescue Mode

As engineering leaders, we are constantly juggling competing priorities. We want to deliver value to our customers, but we also need to maintain the health of our systems and the wellbeing of our teams. It’s a delicate balancing act, and it’s easy to feel like we’re always falling short in one area or another. I decided to break down patterns I’ve encountered over the years building teams, and I’ve realized that there are two distinct modes that teams operate in: Delivery Mode and Rescue Mode.

The idea of splitting team modes into two came from reading The Hard Thing About Hard Things by Ben Horowitz. In the book, Horowitz talks about the difference between running a company during good times and running a company during bad times. He argues that management techniques must flip entirely depending on the context.

Peacetime CEO: Focuses on culture, fostering creativity, and minimizing conflict. They have the luxury of time to build consensus.
Wartime CEO: Focuses on the prime directive (market survival). They care little about feelings or consensus; they care about the strict adherence to the path that leads to safety.

An engineering manager must similarly toggle their style, being an empathetic coach in Peace (Delivery Mode) and a directive commander in War (Rescue Mode).

Trying to apply the same tools, processes, and leadership style to every situation leads to frustration: sprint planning feels bureaucratic during a production outage, and incident-style urgency burns people out when applied to routine feature work. The reality is that each mode demands a different point of view and a different set of levers.

By explicitly naming Delivery Mode and Rescue Mode, you gain clarity:

Right tool for the job: Instead of forcing one process to cover all scenarios, you can adopt tools and rituals purpose-built for each context: OKRs and sprint planning for Delivery, incident command and war rooms for Rescue.
Faster mode recognition: Once you learn to identify which mode you’re in, you stop wasting time applying the wrong playbook. The question shifts from “why isn’t this working?” to “which mode are we in, and what does that mode require?”
Reduced cognitive load: Your team can relax into the current mode’s expectations instead of constantly renegotiating how to work.

The goal isn’t to eliminate Rescue Mode, since some level of crisis is inevitable. The goal is to know which mode you’re in, apply the right approach, and transition cleanly when circumstances change. This document is designed to help you do exactly that.

In the sections that follow, we’ll define each mode in detail: what triggers it, what success looks like, and what leadership behaviors it demands. We’ll also explore the “burst capacity” model (how much Rescue Mode a team can absorb before burning out), dependency management (why other teams’ crises become yours), and practical checklists for transitioning between modes. Whether you’re currently firefighting or enjoying a calm sprint, this framework will give you a shared vocabulary and a playbook for both.

Core Concepts

Delivery Mode: Sustainable Growth

The default state. Validating the path, optimizing the machine, and growing the team.

In Delivery Mode, the backlog is clear, and the environment is predictable. Leadership focuses on long-term sustainability, career growth, and system hygiene.

Aspect	Description
Primary Objective	Deliver roadmap value sustainably
Mindset	Thoughtful, predictable, optimize for long-term throughput
Operating Constraints	Keep work predictable; protect focus time
Typical Signals	Stable on-call, known backlog, low “unknown unknowns”
Success Metrics	Velocity, capacity, quality, predictability

Key Activities:

Delivery & Planning: Sprint execution, OKRs, capacity/velocity management
People & Org: Onboarding, coaching/mentoring, performance cycles, hiring
Technical Excellence: System design reviews, refactoring, dev lifecycle improvements, TDD
Learning & Leverage: Stretch goals, deliberate learning, platform investments that compound
Artifacts: Plans/OKRs, design reviews, career plans, tech debt strategy, hiring plan

Rescue Mode: Crisis Response

The exception state. High operational intensity, unknown territories, and immediate survival.

In Rescue Mode, the team faces perceived chaos due to incidents, security threats, or existential business deadlines. The path is unclear or obstructed, and the standard process is often too slow.

Aspect	Description
Primary Objective	Restore service/safety fast, reduce blast radius, return to Delivery Mode
Mindset	Alert, decisive, action-oriented
Operating Constraints	Context switching is expensive; comms must be tight
Typical Signals	Sev-0/1, active exploit/CVE, outage, major regression, urgent escalations
Success Metrics	MTTD, MTTR, customer impact containment

Key Activities:

Reliability Incidents: Sev-0/1 response, on-call escalations, detection + recovery
Security Response: CVEs, active threats, emergency patching/mitigations
Operational Interrupts: Urgent support escalations, after-hours stabilization
Business-Critical Crunch: “Do-or-die” moments, funding deadlines, existential customer impact
Artifacts: Incident timeline, decision log, status updates, postmortem + follow-ups

The Burst Capacity Model

Every team has a burst capacity for Rescue Mode, similar to AWS EBS burst credits: capacity accrues during quiet periods and is spent during bursts of intense work.

Key Principles

Finite Resource: Teams cannot sustain Rescue Mode indefinitely without burning out
Recovery Time Required: After Rescue Mode, teams need Delivery time to recover and recharge
Predictability Trade-off: Too much Rescue Mode erodes stability, morale, and predictability
Cannot Be Cut: Rescue Mode work takes as long as it takes—you cannot artificially constrain it
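To make the burst-credit analogy concrete, here is a toy model of the principles above. It is only a sketch: the class name `BurstBudget`, its fields, and every number in it are illustrative assumptions, not measured values and not anything from the AWS API.

```python
from dataclasses import dataclass


@dataclass
class BurstBudget:
    """Toy model of a team's Rescue Mode capacity (all numbers are made up)."""
    max_credits: float = 10.0                # hypothetical ceiling on stored capacity
    credits: float = 10.0                    # current balance
    rescue_cost_per_week: float = 3.0        # credits burned by one week of Rescue Mode
    recovery_per_delivery_week: float = 1.0  # credits regained by one week of Delivery Mode

    def week(self, mode: str) -> None:
        """Advance one week in either 'rescue' or 'delivery' mode."""
        if mode == "rescue":
            # Rescue work cannot be cut short, so the balance is allowed to
            # go negative; a negative balance stands in for burnout risk.
            self.credits -= self.rescue_cost_per_week
        elif mode == "delivery":
            # Recovery is slower than spending and capped at the ceiling.
            self.credits = min(self.max_credits,
                               self.credits + self.recovery_per_delivery_week)
        else:
            raise ValueError(f"unknown mode: {mode}")

    @property
    def at_risk(self) -> bool:
        return self.credits <= 0


budget = BurstBudget()
for mode in ["rescue", "rescue", "delivery", "rescue", "rescue"]:
    budget.week(mode)
print(budget.credits, budget.at_risk)  # -1.0 True
```

In this sketch a single Delivery week between crises does not restore what two Rescue weeks consumed, which is the asymmetry the principles above describe.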

Warning Signs of Burnout

Decreased velocity in Delivery Mode
Increased employee turnover
Declining code quality
Reduced engagement in planning activities
More incidents (creating a vicious cycle)

Critical Success Factors

For Rescue Mode Success

Factor	Description
Trust	Team members must trust each other’s technical judgment under pressure
Tooling	Automated detection, deployment, rollback, and monitoring reduce MTTR
Onboarding	Every team member must be fully onboarded to contribute during crises
Focus	Eliminate all distractions outside the critical task
Clear Communication	Defined incident commander roles and communication protocols

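Key Metrics for Rescue Mode:

MTTR (Mean Time to Recovery): How quickly can you restore service? This corresponds to the DORA “time to restore service” metric.
MTTD (Mean Time to Detection): How quickly do you identify issues? This is not one of the four DORA metrics, but recovery cannot start until detection happens.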
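As a rough sketch, both metrics can be computed from three timestamps per incident. The `Incident` record and its field names are assumptions made for illustration. Definitions also vary between teams: here MTTR is measured from the start of impact, while some teams measure it from detection.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started_at: datetime   # when customer impact began
    detected_at: datetime  # when the team became aware of it
    resolved_at: datetime  # when service was restored


def mean_delta(deltas: list[timedelta]) -> timedelta:
    """Arithmetic mean of a non-empty list of durations."""
    if not deltas:
        raise ValueError("need at least one incident")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: list[Incident]) -> timedelta:
    return mean_delta([i.detected_at - i.started_at for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    return mean_delta([i.resolved_at - i.started_at for i in incidents])


# Two made-up incidents on the same day.
day = datetime(2024, 1, 15)
incidents = [
    Incident(day.replace(hour=10), day.replace(hour=10, minute=10), day.replace(hour=11)),
    Incident(day.replace(hour=14), day.replace(hour=14, minute=20), day.replace(hour=14, minute=40)),
]
print(mttd(incidents))  # 0:15:00
print(mttr(incidents))  # 0:50:00
```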

For Delivery Mode Success

Factor	Description
Rhythm and Ritual	Establish sustainable work patterns and ceremonies
Automation	Automate toil work to maximize time for business value
Capacity Planning	Realistic estimation accounting for potential Rescue Mode interruptions
Career Investment	Dedicate time to mentoring, feedback, and development
Technical Excellence	Pay down debt proactively to prevent future incidents

Dependency Management and System Resilience

Critical Principle: A system is only as strong as its weakest link (internal or external).

The Dependency Problem

Your team’s Rescue Mode frequency is affected by:

Number of dependencies on external services
Tightness of coupling (hard vs. soft dependencies)
Resilience patterns implemented (circuit breakers, timeouts, graceful degradation)

Example Scenario

Your service displays customer data including “last 3 orders” from an external Orders service:

Dependency Type	Behavior	Your Mode
Hard Dependency	Orders service down → Your service appears down	Enter Rescue Mode
Soft Dependency	Orders service down → Your service shows data without orders	Remain in Delivery Mode
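The difference between the two rows can be sketched in a few lines. Everything here is hypothetical: `fetch_last_orders` stands in for a network call to the Orders service and is hard-wired to fail so the example can run on its own.

```python
from dataclasses import dataclass
from typing import Optional


class OrdersUnavailable(Exception):
    """Raised by the (hypothetical) Orders client when the service is down."""


@dataclass
class CustomerView:
    name: str
    last_orders: Optional[list[str]] = None  # None means "orders unavailable"


def fetch_last_orders(customer_id: str) -> list[str]:
    """Stand-in for a call to the Orders service; it always fails in this sketch."""
    raise OrdersUnavailable("orders service down")


def customer_view_hard(customer_id: str, name: str) -> CustomerView:
    # Hard dependency: the exception propagates, so the whole page fails.
    return CustomerView(name, fetch_last_orders(customer_id))


def customer_view_soft(customer_id: str, name: str) -> CustomerView:
    # Soft dependency: orders are treated as optional and the page still renders.
    try:
        orders = fetch_last_orders(customer_id)
    except OrdersUnavailable:
        orders = None
    return CustomerView(name, orders)


print(customer_view_soft("c-1", "Ada"))
```

With the soft variant the caller gets a view without orders and can show a placeholder, while the hard variant turns the Orders outage into an outage of your own service.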

Mitigation Strategies

Make non-critical attributes optional
Implement circuit breakers and fallbacks
Cache data for resilience
Design for graceful degradation
Establish clear SLA expectations with dependency owners
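For the circuit breaker and fallback items, a minimal sketch might look like the following. This is not a production implementation or any particular library's API; the class name, the thresholds, and the injectable `clock` parameter are all assumptions chosen to keep the example small and testable.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after N consecutive failures, then allows a trial call after a cooldown."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after  # cooldown in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: skip the dependency entirely and degrade gracefully.
                return fallback()
            # Cooldown elapsed: fall through and allow one trial call.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            return fallback()
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def orders_down() -> list[str]:
    raise ConnectionError("orders service down")


breaker = CircuitBreaker(max_failures=2)
for _ in range(3):
    print(breaker.call(orders_down, fallback=lambda: []))  # fallback value each time
```

After the second failure the breaker stops calling the dependency at all until the cooldown passes, which keeps a struggling downstream service from dragging your own latency down with it.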

Incident Command Structure

Use an incident command structure so the team can execute without everyone trying to lead at once:

Role	Responsibility
Incident Commander (IC)	Owns decisions + keeps the incident moving
Comms Lead	Stakeholder/customer updates, keeps noise off responders
Ops/Tech Lead(s)	Directs technical investigation and mitigations
Scribe	Timeline + decisions + actions captured live

Exit Criteria (“Back to Delivery” Checklist)

Customer impact stopped or bounded
Monitoring confirms stability trend
Short-term mitigation in place (even if root cause isn’t fully fixed)
Postmortem scheduled and owners assigned

Leadership Responsibilities by Mode

Delivery Mode Leadership

✓ Focus on career development and growth opportunities
✓ Maintain predictable velocity and capacity planning
✓ Work on team alignment and priority setting
✓ Invest in automation and toil reduction
✓ Build resilience through architecture and tooling
✓ Create psychological safety and trust
✓ Hire and onboard effectively
✓ Review and optimize processes

Rescue Mode Leadership

✓ Define clear success criteria for resolution
✓ Identify steps to return to Delivery Mode
✓ Remove obstacles and distractions
✓ Communicate status to stakeholders
✓ Ensure proper incident command structure
✓ Monitor team energy and prevent burnout
✓ Document decisions for the After-Action Review (AAR)
✓ Shield team from external pressure

Post-Rescue Mode Leadership

✓ Conduct After-Action Reviews (AAR) and postmortems
✓ Identify root causes and prevention measures
✓ Update runbooks and tooling
✓ Give team recovery time
✓ Recognize contributions and effort
✓ Implement improvements to reduce future occurrence

Communicating Mode Transitions

Clear communication during mode transitions prevents confusion and sets expectations. Ambiguity about “are we in incident mode or not?” leads to half-measures and delayed responses.

Entering Rescue Mode (Delivery → Rescue)

To the Team:

Explicit declaration: “We are now in Rescue Mode. [Brief description of the trigger].”
Assign roles immediately: “IC is [name], Comms is [name].”
State the success criteria: “We’re done when [X metric is restored / customer impact is stopped].”
Clear expectation shift: “All sprint work is paused. Focus only on this.”

To Stakeholders:

Use a standing incident communication template:
“We have identified [issue]. Impact: [scope]. Current status: [investigating/mitigating]. Next update in [30 min].”
Over-communicate cadence, under-communicate speculation.
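If your team posts updates through a bot or script, the standing template can be captured in a small helper so that every update has the same shape. The function name, parameters, and allowed status values below are assumptions taken from the template wording, not an existing tool.

```python
ALLOWED_STATUSES = ("investigating", "mitigating")


def incident_update(issue: str, impact: str, status: str,
                    next_update_min: int = 30) -> str:
    """Render the stakeholder update template with the given fields."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {ALLOWED_STATUSES}")
    return (
        f"We have identified {issue}. Impact: {impact}. "
        f"Current status: {status}. Next update in {next_update_min} min."
    )


print(incident_update(
    issue="elevated error rates on checkout",
    impact="about 5% of requests failing",
    status="mitigating",
))
```

Restricting `status` to a fixed set is a deliberate choice in this sketch: it keeps speculation out of the update and leaves the cadence as the only free-form commitment.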

Exiting Rescue Mode (Rescue → Delivery)

To the Team:

Explicit stand-down: “We are exiting Rescue Mode. The immediate threat is contained.”
Acknowledge the work: Quick recognition of effort before moving on.
Set recovery expectations: “We’re resuming sprint work tomorrow. Today is buffer/decompression.”
Schedule the postmortem: “Postmortem is [date]. Action items will flow into the backlog.”

To Stakeholders:

Final incident summary:
“Issue resolved at [time]. Root cause: [brief]. Customer impact: [duration/scope]. Follow-up postmortem scheduled.”

Rituals That Reinforce Mode Awareness

Ritual	Purpose
Status channel	A single #team-status channel where mode is always visible (“🟢 Delivery” or “🔴 Rescue: [incident link]”)
Standup framing	Start every standup with “We are in [Mode].” Reinforces shared awareness.
Calendar block	When entering Rescue Mode, IC or manager cancels/postpones non-essential meetings for the team.

Post-Incident Follow-Through

A postmortem without owners and dates becomes a diary. The value of Rescue Mode isn’t just surviving the incident—it’s learning from it so it doesn’t happen again.

The Postmortem Checklist

Every postmortem should produce:

Output	Description
Timeline	What happened, when, and who did what. Facts only, no blame.
Root Cause(s)	The underlying issue(s), not just the trigger. Use “5 Whys” or similar.
Contributing Factors	What made detection slower or impact larger?
Action Items	Concrete tasks with owners and due dates.
What Went Well	Reinforce behaviors you want repeated.

Tying Actions to the Backlog

Postmortem action items must flow into your tracked work system (Jira, Linear, etc.):

Create tickets immediately — Don’t wait. Do it in the postmortem meeting.
Tag them consistently — Use a label like postmortem or incident follow up for visibility.
Assign owners, not teams — “Team X will fix this” means no one will fix this.
Set realistic due dates — Balance urgency against current Delivery Mode capacity.
Review in sprint planning — Postmortem items compete for capacity like any other work.
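The rules above are easy to check mechanically before a postmortem meeting ends. The sketch below assumes a hypothetical `ActionItem` record and a `postmortem` label; it does not call Jira, Linear, or any other tracker's API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ActionItem:
    title: str
    owner: Optional[str] = None   # should be a person, not a team
    due: Optional[date] = None
    labels: tuple[str, ...] = ()


def problems(item: ActionItem, team_names: frozenset = frozenset()) -> list[str]:
    """Return the follow-through rules this action item violates."""
    issues = []
    if not item.owner:
        issues.append("no owner")
    elif item.owner in team_names:
        issues.append("owner is a team, not a person")
    if item.due is None:
        issues.append("no due date")
    if "postmortem" not in item.labels:
        issues.append("missing 'postmortem' label")
    return issues


teams = frozenset({"Team X"})
vague = ActionItem("Add alert on queue depth", owner="Team X")
ready = ActionItem("Add alert on queue depth", owner="dana",
                   due=date(2024, 2, 1), labels=("postmortem",))
print(problems(vague, teams))
print(problems(ready, teams))  # []
```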

Follow-Through Rituals

Ritual	Purpose
Postmortem review meeting	48-72 hours after incident. Fresh enough to remember, calm enough to think.
Action item standup tag	Weekly check: “Any postmortem items blocked or at risk?”
Monthly incident review	Leadership reviews open postmortem actions. Escalates stale items.
Quarterly retro	Are we seeing repeat patterns? Is our prevention investment working?

Anti-Patterns to Avoid

Postmortem delayed indefinitely — Schedule it before closing the incident.
Actions without owners — If no one owns it, it won’t happen.
“We’ll be more careful” — Not an action. What system change prevents recurrence?
100-item action list — Pick the 3-5 highest-leverage items. The rest goes to tech debt backlog.
Blame culture — If people fear punishment, they’ll hide information. You’ll learn nothing.

Team Maturity Assessment

How do you know if your team is actually good at operating in both modes? Maturity isn’t about never entering Rescue Mode but about how smoothly you operate in each mode and how gracefully you transition between them.

Use this framework to assess where your team stands and where to invest.

Rescue Mode Effectiveness

A mature team in Rescue Mode doesn’t panic. They’ve built the muscle memory and tooling to respond quickly without chaos. Assess your team on:

Dimension	What to Look For
Speed of detection and response	How long between “something is wrong” and “we’re actively working on it”? MTTD and MTTR are your primary signals.
Quality of communication	Are stakeholders informed without responders being interrupted? Is there a single source of truth?
Ability to execute under pressure	Does the team fragment or unite? Do people know their roles, or is there confusion about who decides what?
Post-incident learning	Do postmortems happen? Do actions get completed? Are you seeing fewer repeat incidents over time?

Red flags: Finger-pointing during incidents, no clear IC, stakeholders pinging individual engineers for status, repeat incidents on the same root cause.

Delivery Mode Effectiveness

A mature team in Delivery Mode operates predictably and sustainably. They’re not just shipping—they’re building capability for the future. Assess your team on:

Dimension	What to Look For
Consistent velocity and predictability	Can you forecast what will ship this sprint with reasonable accuracy? Or is every sprint a surprise?
Proactive debt management	Is tech debt addressed intentionally, or only when it causes outages? Do you have a debt backlog?
Career development culture	Are 1:1s happening? Are people growing? Do individual contributors have visibility into their path forward?
Low unplanned work percentage	What fraction of your sprint is reactive vs. planned? High unplanned work erodes trust and predictability.

Red flags: “We never have time for refactoring,” cancelled 1:1s, engineers surprised by their performance reviews, constant scope creep.
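The unplanned work percentage from the table above is simple arithmetic once each ticket is marked as planned or reactive. The sketch below assumes tickets are sized in story points and represented as `(points, planned)` pairs. Both the representation and the sample sprint are made up for illustration.

```python
def unplanned_fraction(tickets: list[tuple[float, bool]]) -> float:
    """Share of completed points that were not in the sprint plan."""
    total = sum(points for points, _ in tickets)
    if total == 0:
        return 0.0
    unplanned = sum(points for points, planned in tickets if not planned)
    return unplanned / total


# (points, planned?) for each ticket closed in a hypothetical sprint
sprint = [(5, True), (3, True), (2, False), (5, False), (5, True)]
print(f"{unplanned_fraction(sprint):.0%}")  # 35%
```

Tracking this number sprint over sprint gives a trend line for how much Rescue Mode work is leaking into Delivery Mode.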

What Mature Teams Look Like

Rescue Mode is a calm workflow, not a panic. Roles are clear, runbooks exist, and people trust each other.
Most time is spent in Delivery Mode. Rescue Mode is the exception, not the norm.
Rescue Mode frequency is decreasing over time. Prevention investments are paying off.
DORA metrics trend positively. Deployment frequency up, lead time down, change failure rate down, MTTR down.
The team can articulate which mode they’re in. Shared vocabulary exists and is used.

Common Pitfalls and Anti-Patterns

Attempting to Budget Rescue Mode Work

“Ops days” or “on-call effort” percentages are estimates at best
Rescue Mode work cannot be cut short to fit a time budget
Attempting to time-box critical incidents leads to poor outcomes

Treating All Work as Delivery Mode

Applying standard planning processes during crises creates delays
Requiring extensive documentation before action during incidents
Not recognizing when to shift gears

Operating in Permanent Rescue Mode

Treating every task as urgent undermines true priorities
Team burnout and turnover
No time for strategic improvements
Accumulating technical debt

Ignoring Post-Rescue Recovery

Immediately returning to normal sprint work without reflection
No AAR or postmortem process
Missing opportunities for prevention
Team exhaustion accumulates

Final Tips

Engineering leadership requires fluency in both Delivery and Rescue modes. Success comes from:

Building trust and capability during Delivery Mode to enable effective Rescue Mode response
Investing in tooling and automation to reduce toil and improve MTTR/MTTD
Managing dependencies to prevent cascading failures
Creating sustainable rhythms that allow for recovery and growth
Learning continuously from incidents to prevent recurrence

The most mature teams minimize time in Rescue Mode through prevention while maintaining the capability to execute effectively when crises occur. Your role as a leader is to create the conditions for both modes to succeed without burning out your team.
