Resilience Testing at Amazon, Etsy, and Google

Submitted by
CowboyRobot writes "Kripa Krishnan and Tom Limoncelli at Google take a detailed look at Google's GameDay resilience exercise, which they call DiRT (Disaster Recovery Testing). In related pieces, Etsy's John Allspaw makes the case for resilience testing, and the three continue with a roundtable discussion with Amazon's Jesse Robbins on lessons learned from these kinds of exercises.

Among other insights and anecdotes: 'We simulated a long-term power outage at a data center. This test challenged the facility to run on backup generator power for an extended period, which in turn required the purchase of considerable amounts of diesel fuel without access to the usual chain of approvers at HQ. We expected someone in the facility to invoke our documented emergency spending process, but since they didn't know where that was, the test takers creatively found an employee who offered to put the entire six-digit charge on his personal credit card. Copious documentation on how something should work doesn't mean anyone will use it, or that it will work if they do. The only way to make sure is through testing.'"
