
Simulation Catalog

Browse our library of "War Game" scenarios, from beginner troubleshooting to nightmare SRE drills.

The Rogue Migration

Beginner

A database migration script has locked the production table during peak hours. Replication lag is increasing.

#Database #DMS #Performance
Mission Objective: Cancel the blocking query, restore read replicas, and reschedule the migration safely.

Kubernetes CrashLoop

Intermediate

The checkout service pods are constantly restarting. Logs are silent. Customers can't buy.

#K8s #Dig #OOMKilled
Mission Objective: Debug the startup probe, identify the memory leak, and adjust resource limits.
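
When application logs are silent, the pod status itself often records why a container died. The sketch below assumes the pod description has already been exported as JSON (for example with `kubectl get pod <name> -o json`) and checks the standard `status.containerStatuses[].lastState.terminated.reason` field for `OOMKilled`. The function name and the sample pod are hypothetical and only illustrate the idea.

```python
from typing import List


def oom_killed_containers(pod: dict) -> List[str]:
    """Return names of containers whose last termination reason was OOMKilled."""
    names = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for container_status in statuses:
        # lastState.terminated is absent if the container has never been restarted.
        terminated = container_status.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(container_status.get("name", "<unnamed>"))
    return names


# Hypothetical, trimmed-down pod status for illustration only.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "checkout",
                "restartCount": 7,
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "sidecar", "restartCount": 0, "lastState": {}},
        ]
    }
}

print(oom_killed_containers(sample_pod))
```

A non-empty result suggests the memory limit is being hit, which points toward the leak and the resource limits rather than the probe configuration.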

Ransomware First Response

Advanced

An unknown IP is exfiltrating data from an EC2 instance. The root volume has been encrypted with an unrecognized key.

#Security #Forensics #IAM
Mission Objective: Isolate the instance (security group change), capture a snapshot for forensics, and identify the compromised credential.

The $10k Bill Spike

Intermediate

A Lambda function is re-triggering itself in an endless loop through its S3 trigger. The bill is growing by $500/hour.

#Cost #Serverless #Loops
Mission Objective: Stop the bleeding (break the loop), clean up the bucket, and implement loop-safe triggers.
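
One common way such a loop arises is a function that writes its output back into the same bucket that triggers it. The scenario does not say this is the cause, so the sketch below is only an illustration under that assumption: it filters an S3 event so the handler skips objects it produced itself. The `processed/` prefix and the function name are hypothetical. Object keys arrive URL-encoded in S3 event notifications, hence the `unquote_plus` call.

```python
from urllib.parse import unquote_plus

# Hypothetical prefix under which the function writes its own output.
OUTPUT_PREFIX = "processed/"


def keys_to_process(event: dict, output_prefix: str = OUTPUT_PREFIX) -> list:
    """Return decoded object keys from an S3 event, skipping our own output."""
    keys = []
    for record in event.get("Records", []):
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(output_prefix):
            # This object was written by the function itself; handling it
            # again would re-trigger the function and continue the loop.
            continue
        keys.append(key)
    return keys


# Hypothetical event with one fresh upload and one self-generated object.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "demo"}, "object": {"key": "uploads/cat+photo.jpg"}}},
        {"s3": {"bucket": {"name": "demo"}, "object": {"key": "processed/cat+photo.jpg"}}},
    ]
}

print(keys_to_process(sample_event))
```

An in-code guard like this still pays for every skipped invocation, so it is a stopgap. Restricting the trigger to an input prefix, or writing output to a separate bucket, keeps the function from being invoked by its own writes at all.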