A DR runbook that has never been tested is a hypothesis, and the first real incident will test it conclusively and painfully. But most organizations treat DR testing as a full failover or nothing, which means it rarely happens. Tabletop exercises fill that gap: they test the people, not the systems, and they cost nothing but two to three hours in a conference room.
This is the facilitation guide we use at Shift7 Consulting. It includes ready-to-use scenario scripts, a timed inject timeline, the four questions you ask at every runbook step, and a post-exercise action framework. Whether you're running your first tabletop or your tenth, this guide works.

Why Tabletop Exercises Matter
They test people, not systems. A failover test validates that SRM works and VMs power on. A tabletop validates whether your team knows what to do, who to call, in what order, and who has the authority to make decisions. Human readiness is the gap that actually kills you during a real event.
They find gaps nothing else does. Missing escalation contacts. Unclear declaration authority. A runbook step that references a server decommissioned six months ago. Two people who both think they're in charge. These only surface when you talk through the scenario step by step.
They're cheap and low-risk. No systems touched. No production impact. Two to three hours. Compare that to the cost of discovering these gaps during a real incident — where every minute of confusion is a minute of extended downtime.
They build muscle memory. The team that has practiced together handles pressure differently than the team seeing the runbook for the first time at 2 AM.
Before the Exercise (1-2 Weeks Out)
Preparation makes or breaks a tabletop. Here's the checklist:
- Pick a scenario and scope. Total site loss? Ransomware? Partial failure? Choose one. Don't try to cover everything in one session.
- Invite the right people. Facilitator, scribe, incident commander (IC), VMware admin, network admin, and at least one business application owner. For ransomware scenarios, add the security lead and optionally comms/legal.
- Distribute the runbook in advance. Everyone should read it before they walk in. Most won't. That's a finding too.
- Prepare your inject timeline. Write 4-6 complications you'll introduce at specific points. These are the pressure tests.
- Assign the facilitator and scribe. These are separate roles. The facilitator runs the exercise. The scribe documents findings. Neither participates in the recovery discussion.
The scribe is non-negotiable. Without a scribe, you have no findings. Without findings, you have no follow-up. Without follow-up, you had a meeting, not an exercise.
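The checklist above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of our standard toolkit: the role identifiers, the `ExercisePlan` structure, and the `readiness_gaps` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Roles taken from the checklist; the identifier spellings are illustrative.
CORE_ROLES = {
    "facilitator", "scribe", "incident_commander",
    "vmware_admin", "network_admin", "app_owner",
}
RANSOMWARE_EXTRA_ROLES = {"security_lead"}


@dataclass
class ExercisePlan:
    scenario: str                      # e.g. "total_site_loss" or "ransomware"
    attendees: dict[str, str]          # role -> person's name
    injects: list[str] = field(default_factory=list)
    runbook_sent: bool = False


def readiness_gaps(plan: ExercisePlan) -> list[str]:
    """Return human-readable gaps; an empty list means the plan is ready."""
    gaps = []
    required = set(CORE_ROLES)
    if plan.scenario == "ransomware":
        required |= RANSOMWARE_EXTRA_ROLES
    for role in sorted(required - set(plan.attendees)):
        gaps.append(f"missing role: {role}")
    # Facilitator and scribe are separate roles, so they need separate people.
    facilitator = plan.attendees.get("facilitator")
    scribe = plan.attendees.get("scribe")
    if facilitator is not None and facilitator == scribe:
        gaps.append("facilitator and scribe are the same person")
    if not 4 <= len(plan.injects) <= 6:
        gaps.append(f"expected 4-6 injects, found {len(plan.injects)}")
    if not plan.runbook_sent:
        gaps.append("runbook not distributed")
    return gaps
```

Running `readiness_gaps` a few days before the session gives the facilitator a short list to close out, rather than discovering a missing scribe in the first ten minutes.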
Setting Ground Rules (First 10 Minutes)
The first ten minutes set the tone. Here's the exact script we read at the start of every exercise:

The key messages to land: no wrong answers (if you don't know, say it — that's a finding), stay in your role (respond as you would at 2 AM, not with hindsight), and nothing leaves the room unowned.
Scenario 1: Total Site Loss
Facilitator reads:
"It's Tuesday at 9:15 AM. Facilities reports a major cooling system failure at our Phoenix data center. Multiple ESXi hosts are thermal-shutting down. vCenter is unresponsive. The MPLS link is still up but there's nothing to connect to on the other end. Your phone is ringing — the VP of Finance can't access SAP, and the customer portal is returning 502 errors. What do you do first?"

Then walk through the runbook step by step. As you reach the relevant steps, ask:
- Who makes the call to declare? Is that person available right now?
- Who gets notified first? Do you have current phone numbers?
- Can you reach DR vCenter? What's the URL? Who has the password?
- Which SRM plan do you run first? Why that one?
- How do you verify AD is healthy before starting databases?
Injects for Scenario 1
Introduce these complications at the appropriate points:
Scenario 2: Ransomware Attack
Facilitator reads:
"It's Wednesday at 2:30 AM. Your monitoring system fires an alert: CPU and disk I/O have spiked to 100% across multiple production VMs simultaneously. Within minutes, the helpdesk starts receiving calls — file shares are showing encrypted files with a .locked extension. A ransom note appears on the domain controller console. The note demands 50 BTC and threatens to publish exfiltrated customer data within 72 hours. What do you do first?"

The ransomware scenario changes every assumption. The questions are fundamentally different:
- Who do you call FIRST? The IC? The CISO? Both? It's 2:30 AM.
- What's your FIRST technical action? The answer should be ISOLATE — disconnect affected systems from the network. Not "restore from backup."
- Do you power off affected systems? No — you lose forensic evidence in memory. This catches people every time.
- Can you confirm your backups are intact? Right now? How?
- When was the last verified clean backup? Not the last backup — the last one that predates the breach.
- Who contacts the cyber insurance provider? What information do they need?
Ransomware Injects
The 45-minute inject is the gut punch. If the attacker's dwell time was 8 days and your backup retention is 7 days, you have no clean copy. This single finding has changed the backup architecture at every organization where we've run this scenario.
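The retention-versus-dwell-time arithmetic is easy to check ahead of the exercise. The sketch below assumes daily backups and uses a hypothetical detection date; `newest_clean_backup` is an illustrative helper, not a feature of any backup product.

```python
from datetime import date, timedelta


def newest_clean_backup(backup_dates, breach_start):
    """Return the most recent backup taken strictly before the breach began, or None."""
    clean = [d for d in backup_dates if d < breach_start]
    return max(clean) if clean else None


# Numbers from the text: 7-day retention, 8-day dwell time.
# The detection date and the daily backup schedule are assumptions.
today = date(2024, 5, 15)
retention_days = 7
dwell_days = 8

backups = [today - timedelta(days=n) for n in range(1, retention_days + 1)]
breach_start = today - timedelta(days=dwell_days)

print(newest_clean_backup(backups, breach_start))  # prints None
```

Every retained backup was taken after the attacker was already inside, so the function finds nothing to restore from. Extending retention to 14 days under the same assumptions would leave the backup from 9 days ago as the newest clean copy.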
The Four Questions
These are the engine of the tabletop. Ask them at every single step of the runbook:

After the Exercise: The 48-Hour Rule
If the after-action doesn't happen within 48 hours, it won't happen at all. Urgency fades fast. Here's the framework:
- Compile findings report — clean up the scribe's notes, group by severity, send to all participants. Day 1-2.
- Assign remediation owners — every finding gets a named person and a due date. No exceptions.
- Update the runbook immediately — wrong IP, wrong server name, wrong contact? Fix it today, not in 30 days.
- Log in the DR Test Log — date, type, scope, findings count, action items.
- Track remediation at 15 and 30 days — anything still open at 30 gets escalated to the IC.
- Feed into next quarter — unresolved findings become the focus of the next tabletop.
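The follow-up steps above can be tracked with something as small as the sketch below. It is one possible shape, assuming findings live in a simple list; the `Finding` fields, the severity labels, and the status strings are illustrative, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Finding:
    description: str
    severity: str                      # e.g. "high", "medium", "low" (illustrative labels)
    owner: str                         # every finding gets a named person
    due: date                          # and a due date
    closed_on: Optional[date] = None


def follow_up_status(finding: Finding, exercise_date: date, today: date) -> str:
    """Classify a finding against the 15- and 30-day checkpoints."""
    if finding.closed_on is not None:
        return "closed"
    age_days = (today - exercise_date).days
    if age_days >= 30:
        return "escalate to IC"        # still open at 30 days
    if age_days >= 15:
        return "open at 15-day check"
    return "open"
```

Anything that comes back as "escalate to IC" at the 30-day check, and anything still unresolved at quarter end, is a natural candidate for the next tabletop's focus.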
Annual Scenario Rotation
Don't run the same scenario twice in a row. Here's the recommended annual rotation:

- Q1: Total Site Loss. The classic DR scenario. Tests full failover procedures.
- Q2: Ransomware Attack. Tests isolation, forensics, clean room recovery. This consistently reveals the most gaps.
- Q3: Partial Failure. Storage controller outage, half the datastores. Tests gray-area decision making.
- Q4: Wildcard. Ransomware during a DR exercise, WAN failure plus insider threat. Keeps the team from getting complacent.
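If the exercise schedule is kept in a script or calendar tool, the rotation can be encoded directly. This is a small sketch under the assumption that quarters follow the calendar year; `scenario_for` and `has_back_to_back_repeat` are hypothetical helper names.

```python
from datetime import date

# Quarter -> scenario, as listed in the rotation above.
ROTATION = {
    1: "Total Site Loss",
    2: "Ransomware Attack",
    3: "Partial Failure",
    4: "Wildcard",
}


def scenario_for(exercise_date: date) -> str:
    """Map a date to its calendar quarter's scenario."""
    quarter = (exercise_date.month - 1) // 3 + 1
    return ROTATION[quarter]


def has_back_to_back_repeat(history: list[str]) -> bool:
    """True if any scenario was run twice in a row."""
    return any(a == b for a, b in zip(history, history[1:]))
```

Running `has_back_to_back_repeat` against the scenario column of the DR Test Log is a cheap way to confirm the rotation is actually being followed.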
Common Mistakes
Six mistakes we see in every other tabletop:
- No scribe assigned. Without a scribe, you have no findings. Non-negotiable.
- No app owners invited. If only IT is in the room, you're testing half the plan.
- Same scenario every time. Teams memorize responses instead of thinking.
- Skipping the after-action. A tabletop without follow-up is a meeting.
- Too gentle on the team. Don't softball the injects. Real incidents are chaotic.
- Facilitator participates. The facilitator runs the exercise, not the recovery.
Need Help Facilitating?
We run tabletop exercises for enterprise clients — including ransomware scenarios with realistic injects.
One session. Real findings. Named owners. 30-day remediation tracking.