This is fine

In 2017 I was working at AWS, and AWS went down… and went down hard. It was a tough night to be an engineer, and when the dust settled hours later, it turned out that the outage was caused by a typo. The amount of money lost that day by AWS and, more importantly, by their customers was huge, and one person was to blame.

What would you do if you were Andy Jassy (CEO of AWS at the time) with that employee? Fire them? Demote them? No… nothing like that! That engineer was first supported, and then they led the work in their team to ensure it never happened again. Their work then rolled out to the rest of the organisation over the next few months. Ultimately the entire organisation improved from that.

This absolutely speaks to building an organisation where it is safe to make mistakes, which leads to success. There is more coming in this series on how to build safety in teams, so I won’t cover that here.

Rather, what I want to share today is how to build knowledge, and unfortunately, building knowledge takes work. I worked with a tech lead who told me “failing is the best way to learn”… and he was entirely wrong. Failure is not learning, failing is failing… so how do we build knowledge from failure or any other situation? Obviously, the safety of the environment matters, but it also takes time to step back with people who hold diverse views, discuss what happened, and find the learnings.

What practical tools do we have to help with this?

  • Do your sprint retros!
  • Norm Kerth’s wonderful retrospective prime directive is something to say every time you review a sprint or an incident.
  • When you have an incident, take time to do a blameless correction of error (CoE), also known as a post-incident review or post-mortem. The best I’ve ever seen is the AWS model. That article is so detailed and covers an amazing tool which you can use not just in a CoE, but in many situations: the 5 Whys.

In short, failing in a safe environment, where learning is an action and not just a statement in a company’s values, can lead to powerful improvements. You can help by building blameless cultures and ensuring that learning happens!