Site crashed during a flash sale with 1,000+ customers shopping. Lost sales every minute, angry customers, reputation damage in motion.
On-call engineer on the call inside 4 minutes, root-caused a queue overflow, restarted the application tier, scaled the load balancer.
Site restored in 20 minutes, the sale ran to completion.
