AWS Outage Chaos (Solved)

Participant
Discussion
3 days ago

So… yesterday AWS decided to take a surprise nap, and suddenly half the internet panicked. Fortnite froze mid-match, my Ring camera was streaming static like a horror movie, and Alexa just stared at me. Anyone else feel personally victimized?

Replies (5)

Participant
3 days ago

Oh yes. Here’s the full scoop. AWS’s US-EAST-1 region ran into a DNS resolution problem caused by a subsystem that monitors the health of their network load balancers. This subsystem is basically a watchdog that checks whether the load balancers routing traffic and service requests across AWS are healthy.

On that day, the subsystem glitched. Imagine a traffic controller in a huge city suddenly falling asleep: traffic lights stop updating, intersections jam, and cars don’t know where to go. Requests to DynamoDB, SQS, and Amazon Connect couldn’t be routed properly.

Because these services rely on accurate DNS to connect to other components, the glitch caused a ripple effect. Many apps and platforms like Fortnite, Canva, Duolingo, Ring, Alexa, WhatsApp, and even Venmo and Coinbase couldn’t communicate with their backend servers. Some apps failed completely, others slowed to a crawl. The outage lasted roughly 15 hours, but once AWS fixed the subsystem, DNS started resolving again and services gradually came back online. 
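For anyone curious what "DNS stopped resolving" looks like from an app's point of view, here is a rough Python sketch, not how any of the affected apps actually work. The helper name `resolve_with_fallback` and the idea of falling back to a second region are my own assumptions for illustration, and a fallback region only helps if your data is actually replicated there.

```python
import socket


def resolve_with_fallback(hostnames, port=443, resolver=socket.getaddrinfo):
    """Return (hostname, addresses) for the first hostname that resolves.

    Hypothetical helper: tries each candidate endpoint in order and
    skips any whose DNS lookup fails.
    """
    errors = {}
    for host in hostnames:
        try:
            infos = resolver(host, port, proto=socket.IPPROTO_TCP)
        except socket.gaierror as exc:
            # This is the failure apps hit during the outage: the name
            # simply would not resolve to an address.
            errors[host] = exc
            continue
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address.
        addresses = sorted({info[4][0] for info in infos})
        return host, addresses
    raise RuntimeError(f"no endpoint resolved: {errors}")


if __name__ == "__main__":
    # Illustrative endpoint list; the second region is an assumed fallback.
    candidates = [
        "dynamodb.us-east-1.amazonaws.com",
        "dynamodb.us-west-2.amazonaws.com",
    ]
    host, addresses = resolve_with_fallback(candidates)
    print(f"using {host} -> {addresses}")
```

If every candidate fails, the helper raises instead of retrying forever, which is roughly the "failed completely" case. Apps that kept retrying against the same broken name are the ones that slowed to a crawl.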

Participant
3 days ago

Ahhh, that makes sense now. Thanks for the “explain like I’m five” version. So basically, one internal monitoring error cascaded into chaos across tons of apps. And the memes about this were hilarious. My favourite one had a sign saying, “Internet closed, AWS on lunch break.”

Participant
3 days ago

I saw that too, and another one showing a skeleton at a computer saying, “Me waiting 10 hours for AWS to wake up.” And one more of a cat sitting on a keyboard with the caption, “AWS engineers trying to fix US-EAST-1 like…” Absolute gold.

Participant
2 days ago

And don’t forget Pokémon Go players who couldn’t catch anything and McDonald’s app users who couldn’t even order a burger. People were tweeting about standing outside restaurants hungry while their apps glitched. I was watching my coworkers freak out on Slack, hitting refresh every 5 seconds. Meanwhile, I was like, “Sit back, grab popcorn, and enjoy the show.” Honestly, this outage might go down as one of the funniest tech events of the year.

Participant
2 days ago

And the human side of it… everyone refreshing apps like crazy thinking it would magically fix the outage. But yeah, seeing everything gradually come back online was such a relief. Definitely a day to remember. 
