Why operational resilience strategies must be holistic
Professor Gianluca Pescaroli, Assistant Professor in Business Continuity and Organisational Resilience at University College London’s Institute for Risk and Disaster Reduction, talks through the steps organisations must take to ensure that their operations are truly resilient
Sara Benwell POSTED ON 7/20/2020 1:56:55 PM
Sara Benwell: What qualifies as Important Business Services that require the most attention in operational resilience strategies?
Professor Gianluca Pescaroli: Each organisation is unique, and every decision must take its operational context into account.
The common assumption for operational resilience is the availability of technology, and growing attention is being paid to cyber threats in order to mitigate disruptions.
However, these are just one of many possible triggers of disruption.
Operational resilience strategies should give more attention to understanding and assessing which invisible utilities can be the lifeline of other core business services.
One example is satellite infrastructure and its implications for both transactions and information and communications technology (ICT).
It is therefore essential to address the common vulnerabilities to the partial or total disruption of these utilities.
Sara: Looking at assigning accountability for multi-jurisdictional and worldwide groups, what key questions do we need parent companies to answer, and how can we best hold them to account on deliverables?
Gianluca: Organisations should constantly monitor and verify that they are not missing the basics, which in a fast-changing operational context may be taken for granted and become visible only in times of crisis.
There are two key questions we need to ask to ensure consistency across the whole group, and they may not be rocket science:
- To what extent are all the basics really done consistently? For example, are you really speaking the same operational language across the group and its departments?
- To what extent are the cascading effects of other infrastructure failures considered consistently and integrated into the existing procedures?
These have now been included in the latest ISO 22301:2019 and NFPA 1600:19, but they are often treated as add-ons rather than considered in their practical implications.
For example, do you only have a contract for your generators, or do you have them in place, with staff trained to use them and a gasoline reserve? Are the critical third-party providers doing the same?
Sara: How should we determine “severe but plausible” scenarios now and how can we measure such risks?
Gianluca: I do not believe in “severe but plausible scenarios”, because they help only in a limited way to address uncertainty.
Moreover, “low probability, high impact” does not mean “it won’t happen tomorrow”. In science and statistics this point is very clear, but it is often missed when translated into decision making.
I suggest starting by identifying the vulnerabilities and cascading effects that are common to different threats, then assessing the common paths of possible escalation that we may need to address.
This is done to increase the capacity to take decisions in situations of high uncertainty.
For example, think about an extreme space weather event and a targeted cyber-attack: the global navigation satellite system may be affected, and the operational implications may be very similar. This could be common to other triggers and threats.
Sara: What are the new lessons for drafting service level agreements, and how has this changed the way suppliers are held to account?
Gianluca: The new lessons are associated with the increased relevance of cascading effects of technological failures, which can affect suppliers both directly and indirectly and must be considered for operational resilience.
For example, direct impacts on suppliers may be caused by targeted cyber-attacks, while indirect impacts may include the consequences of disruptions in other infrastructure sectors, such as a blackout.
The complex networked system in which we operate increases the value of common scenarios and exercises with key suppliers, as well as of defining realistic service level agreements derived from joint stress testing.
Sara: Given the global dynamic of the recent pandemic, has the case grown for removing centralisation of operational support functions, or has the case for centralisation grown?
Gianluca: The case has grown for a balance between the two solutions.
The situation is still evolving, and we need to achieve maximum flexibility in the system, which should be able to compensate and adapt depending on the scenarios that arise in the coming months.
This includes the possibility of concurrent events, e.g. storms, flooding or extreme space weather happening during the pandemic, which could further strain the available resources.
Sara: According to an FCA report into Cyber and Technological Resilience from November 2018, 91% of disruptive events came in periods of change management. How can we best minimise these threats?
Gianluca: Change is an essential part of a dynamic organisational environment and is a constant praxis in the global networked society.
We want to maintain flexibility and resilience instead of losing it in the process.
This is possible by investing in three aspects:
- Minimise the loss of organisational knowledge that can come with role changes and staff turnover, including by standardising training for new personnel to enable and update stress-testing practices
- Identify, train for and exercise the common vulnerabilities to different threats, which could become even weaker spots during transitional periods
- Integrate bottom-up perspectives and feedback from front-line personnel, who may be aware of invisible points of failure and bottlenecks that need to be considered during change.