
Resilience (Part 2): aircraft carriers on the lawn

Posted by Andy Garlick on 30 April 2013 in Articles, Featured

In the previous article [2] I explored resilience in the way it is described in the WEF global risk report.  It was hard to find much that distinguished it from a conventional risk management approach – listing actions against risks – apart from:

  • a recognition that things look different at different scales – the global uncertainty of climate change compared with the local risk of flooding, for example
  • a recognition that you need to be agile, adaptive, or whatever; you can’t write the whole plan down in advance, you have to make some of it up as you go along.

It’s this second point that is potentially a real departure from classical (?!) risk management and it’s worth exploring a bit more, or actually a lot more.

One of the more interesting aspects of my job is seeing what happens when engineering projects get into trouble.  There is a project management fantasy that you map out the activities, resource them up, baseline the plan, draw bar charts and EVA graphs, put a risk register together, put some funding against it and set boldly out.  If anything goes off course, you accelerate, reschedule or whatever, paid for from the risk pot.  Basically you plod along implementing your plan.  What could go wrong when you have all that process supporting you?

Inevitably, reality and fantasy sometimes don’t coincide and an activity called ‘project recovery’ or something similar kicks off.  At this point you are totally off piste, all process has stopped and you are thrown back on the imagination and creativity of the managers.  The interesting thing to me is the lack of process, the transition from process-bound constipation to solutions-orientated free thinking.  There is plenty of scope to formalise the way in which projects can adapt and recover (and no doubt there is a substantial literature on it out there; I get regular marketing emails from one company that pushes this, which I’ll read more attentively next time), but the point is that it doesn’t seem to be a routine skill in which project managers are trained.  Possibly they welcome this for the reasons I just hinted at: it gives them an opportunity to show their skills at fighting fires and be creative (though of course one knee-jerk milestone on the ‘project recovery’ plan is firing the previous incumbent).

Which is a rather lengthy preamble to making the obvious point that when it comes to adaptive management you are moving from an environment where the prime driver is process to one where it is culture.  So one primary feature of the recent upsurge in interest in risk culture ought to be how organisations can hone their adaptive skills and improve their resilience as a result.

One area where this has been studied and understood is resilience engineering.  This focusses on how safety is achieved by the organisations responsible for managing complex systems, sometimes called High Reliability Organisations (HROs).  I’m going to do a brief review of some of the principles which research shows seem to work for HROs and then see if the concepts are useful for more general risk management – upside as well as downside, strategy risk as well as preventable risk in the Kaplan and Mikes terminology [3].

Mention of complex systems raises an immediate issue.  Tightly coupled complex systems are sometimes regarded as creating unmanageable risk which leads to ‘normal accidents’.  I’ve discussed Perrow’s ideas about this [4] in the context of the financial system.  The management of aircraft operations from a carrier is often regarded as the prototypical HRO; another is nuclear power plant operation.  The latter is the prototypical source of normal accidents, so there is an open question of whether even the highest of HROs is up to the job.  But this should not detract from the good practices which have been identified to deal with ‘risks that emerge from the system’, as the resilience engineers say.  You can read Perrow’s ideas about the alternative approach of de-concentrating – cockroach-ising, if you like – in his chilling book on how risk has become enhanced through corporate influence on governance, especially in the US.

There are several sets of principles which can be applied to create HROs.  The one which appeals most to me is the idea of ‘mindfulness’ as set out by Weick and Sutcliffe in their very readable little book [5].  This has five elements:

  • don’t simplify – this runs directly counter to much management wisdom (often associated with the alleged inability of managers to understand anything complicated); the devil is in the detail
  • attend to operations – pay constant attention to what is going on, realising that just because things have gone OK up to now, this doesn’t mean they will continue that way, in fact recognise the potential for overconfidence and complacency
  • focus on failure – likewise keep a careful eye out for things going off track so that risks can be recognised and controlled early
  • build resilience capability – in terms both of investing in capacity and making sure people will make the right decisions when the time arises
  • defer to expertise – not formal superiority; the idea that decisions can migrate through the organisation to find the person best placed to make them.

The five principles (which I have reordered and restated in a way which makes sense to me) are pretty well established as a way to improve the safety of complex systems.  But do they have much to offer organisational risk management, especially at the strategy risk end?

First, it’s helpful to consider the idea of mindfulness itself, which is very similar to risk awareness in that you are constantly alert to the possibility of risk – and opportunity, for that matter – and work to reduce it.  It’s very important that risk becomes the daily subject of managerial conversations and not just something that sits in a register to be processed every month or whatever.  Managers need to set the example, inculcating a culture of challenge and risk awareness, and bringing up risk and the way it is to be dealt with in every discussion.  This is part of dealing with risk in a natural way.

The idea of defining and monitoring key risk indicators (KRIs) is quite well established in the operational risk management processes of the financial industry, and this could be extended to project and enterprise risk.  However, in my experience it’s quite difficult to do this, and to keep the KRIs separate from key performance indicators (KPIs).
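To make the distinction concrete, here is a minimal sketch in Python of how a KRI might sit alongside a KPI on the same dashboard.  The indicator names and thresholds are invented for illustration; the point is that the KPI reports performance already delivered, while the KRIs report leading signals of exposure that has not yet crystallised.

    from dataclasses import dataclass

    @dataclass
    class Indicator:
        name: str
        kind: str      # "KPI" reports delivered performance; "KRI" reports exposure
        value: float
        amber: float   # early-warning threshold
        red: float     # escalation threshold

        def status(self) -> str:
            # Higher values are worse for all the illustrative indicators here
            if self.value >= self.red:
                return "RED"
            if self.value >= self.amber:
                return "AMBER"
            return "GREEN"

    # Hypothetical indicators for an engineering project
    indicators = [
        Indicator("schedule slippage to date (%)", "KPI", 4.0, 5.0, 10.0),
        Indicator("unreviewed design changes (count)", "KRI", 12.0, 8.0, 15.0),
        Indicator("key-staff vacancies (count)", "KRI", 2.0, 3.0, 5.0),
    ]

    for ind in indicators:
        print(f"{ind.kind:3} | {ind.name:35} | {ind.value:5.1f} | {ind.status()}")

The difficulty described above shows up immediately: it is tempting to put the slippage figure in the KRI column, even though by the time it moves the risk has already crystallised.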

Incidentally, one fallacy which Weick and Sutcliffe bring out is very important.  Apparently the Columbia shuttle launch was permitted because previous strikes from detached debris had been survived.  These near misses were interpreted as proof of the robustness of the shuttle rather than as an indication of a new hazard.  Humans are quite good at using survival of previous crises as evidence that we will survive the next one – look no further than climate change – but it will always be a bad argument.
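A toy calculation shows why.  Suppose each debris strike destroys the vehicle with some unknown probability, and that we start from a uniform prior on that probability and update it each time a strike is survived – purely illustrative assumptions, nothing to do with the actual shuttle engineering:

    from fractions import Fraction

    # Toy Bayesian reading of the near-miss fallacy.  With a uniform
    # (Beta(1, 1)) prior on the per-strike loss probability p, surviving
    # n strikes gives a Beta(1, 1 + n) posterior, whose mean is 1 / (n + 2).
    def expected_loss_probability(n_survived: int) -> Fraction:
        return Fraction(1, n_survived + 2)

    for n in (1, 5, 10, 50):
        p = expected_loss_probability(n)
        print(f"survived {n:2d} strikes -> expected loss per strike = {float(p):.3f}")

Even after ten survived strikes the expected loss probability is still about 8% per strike – survival moves the estimate only slowly, and never to a level at which the hazard could honestly be called benign.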

Another helpful anecdote from Weick and Sutcliffe concerns the bottle of champagne Wernher von Braun sent to an engineer who had caused a rocket failure through a maintenance error but owned up to it.  That was something that really helped the programme.  A great lesson for our own NHS and its whistleblowers.

Second, refusing to oversimplify (whilst not over-complicating for the sake of it) is also a good discipline.  It is often quite difficult to articulate how risks will play out, but it’s necessary to do this, and if we are good at it we will get fewer surprises.  What’s more, it’s clearly important to develop a culture in which assumptions are questioned.  This was one area which was also flagged up by the WEF report across several of its risk cases.

Third, I’ve discussed elsewhere the importance of investing in antifragility [6]: giving yourself options and spare capacity.  That’s what stops you being a JIT, lean turkey.  The message here is just the same: you need to invest in safety; you need to invest in resilience; you need to invest in risk management.
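The value of spare capacity is easy to demonstrate with a toy simulation.  The figures below – demand of around 100 units, a rare shock that knocks out 30 units of capacity – are invented for illustration; the point is only that the ‘lean’ operation sized for normal variation fails far more often than the one carrying deliberate slack:

    import random

    # Toy Monte Carlo: an operation 'fails' in any week it cannot meet demand.
    # Demand fluctuates around 100 units; with small probability a shock
    # temporarily removes 30 units of capacity.
    def survival_rate(capacity: float, trials: int = 10_000, weeks: int = 52) -> float:
        survived = 0
        for _ in range(trials):
            ok = True
            for _ in range(weeks):
                demand = random.gauss(100, 10)
                shock = 30 if random.random() < 0.02 else 0
                if capacity - shock < demand:
                    ok = False
                    break
            survived += ok
        return survived / trials

    random.seed(1)
    print(f"lean (capacity 130):     {survival_rate(130):.1%} survive the year")
    print(f"buffered (capacity 160): {survival_rate(160):.1%} survive the year")

The buffer looks like pure cost in a good year, which is exactly why lean programmes strip it out – and exactly why the turkey is surprised.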

Finally, the idea of deferring to expertise is an interesting one.  I’d broaden out the concept of passing decisions round the organisation to the more general idea of communication.  It’s important for the risk profile to be broadly shared and understood.  For example, I mentioned Kaplan and Mikes, who in their follow-up on JP Morgan [7] tentatively attribute the losses experienced to a failure to implement an otherwise effective risk management strategy throughout the organisation.  More generally, while it may be difficult for many firms to buy into the idea that it is not the managers who make the decisions – and of course, it’s important that we recognise that it’s managers, not risk managers, who decide how to manage risk – organisations need to be alive to the risks of ignoring expert advice, or being closed to it.  Again, that was a major feature of both shuttle disasters.

Kaplan and Mikes are unequivocal in recommending that risk management advice needs to be a non-embedded, separate thread: ‘effective risk management is also costly, because it has to be separate from existing strategy-oriented functions.’  The resilience engineering discipline would not accept this.  It seems there is a fundamental disagreement on whether you can embed risk management properly in enterprising organisations.

The study of resilience engineering, safety culture and HROs has generated considerably more literature than the five bullet points of mindfulness.  You can see more at the Resilience Engineering Association website [8], and Ashgate has published a compilation of articles [9] in which these principles are applied to the banking crisis.

The raison d’être of this site is my belief that organisational risk management has a considerable way to go before its principles and practices are properly worked out.  This article has highlighted three controversial issues where there is fundamental disagreement:

  • are normal accidents inevitable until we become cockroaches?
  • can risk management be embedded?
  • is it better not to plan for disaster?

Until we can agree on these fundamental issues – or at least understand the circumstances in which each answer is right – we are wasting our time trying to encapsulate good practice in standards.  No wonder our attempts to date are so vapid.


URLs in this post:

[2] previous article: http://riskagenda.com/cv/?p=358

[3] Kaplan and Mikes terminology: http://riskagenda.com/cv/?p=360

[4] Perrow’s ideas about this: http://riskagenda.com/cv/?p=30

[5] little book: http://www.amazon.co.uk/Managing-Unexpected-Resilient-Performance-Uncertainty/dp/0787996491

[6] importance of investing in antifragility: http://riskagenda.com/cv/?p=318

[7] follow-up on JP Morgan: http://blogs.hbr.org/cs/2012/05/jp_morgans_loss_bigger_than_ri.html

[8] Resilience Engineering Association website: http://www.resilience-engineering-association.org/

[9] compilation of articles: http://www.ashgate.com/isbn/9781409429661

