Bend but don’t break: Operational Resilience

The term resilience, in its psychological sense, was popularized by Emmy Werner in the 1970s. She studied a cohort of children from Hawaii, most of whom came from very poor families. Among the children who grew up in these adverse conditions, only one third did not exhibit destructive behavior in later life, and Werner deemed this group to be “resilient”.

Being resilient means having “the capacity to recover from difficulties”, but unlike in the movies, we cannot wait for miracles to happen. In the real world we anticipate, prevent, recover and/or adapt. These four pillars make up the institution that is “operational resilience”.

Resilience can be viewed along two axes. The first is the agenda (Y axis): a defensive agenda is about stopping bad things from happening, whereas a progressive agenda is about making things happen. The second is the approach (X axis): some organizations drive for resilience through consistency, while others take a flexible approach, bringing together people with a variety of ideas, beliefs, outlooks, and practices, which makes them much more agile.

This creates four core quadrants. In the bottom left is preventive control: a defensive agenda pursued through consistency, which is about building defenses in depth to stop bad things from happening. In the bottom right is mindful action: having people who notice problems and raise concerns, whose concerns are listened to, and who are empowered to act to stop a problem from escalating further.

In the progressive agenda, which is about making things happen, the top left quadrant is performance optimization: the organization keeps doing what it does now but does it better, to drive competitive advantage. Michael Porter described three types of strategy for gaining competitive advantage: cost leadership, differentiation, and focus (market segmentation). PepsiCo, for example, worked on cutting its operating costs so that it could offer its products at a lower price than its competitors, using the cost leadership strategy to gain competitive advantage. Opposite the performance optimization quadrant is the top right quadrant, which is about being the disruptor in the marketplace by innovating and adapting to the situation. Fintech disruption has led to a shift from traditional payment methods such as cash and credit cards to contactless transactions. Another classic example of disruptive innovation is Netflix, which disrupted the existing entertainment market.

Adaptation means rising to the occasion even when your strategy may have failed. Adapting is not easy, and it has its own traps. One is the success trap, where the organization sees the signals but ignores the warnings. It is also termed “hubris”: organizations tend to settle into a comfort zone after reaching a certain level. The other is the failure trap, where fear of failure arises from a lack of skills and tools. As a result the organization may fail or withdraw at a premature stage. This is also known as “active inertia”: many thriving businesses fail in the face of change not because of inaction but because of an inability to take appropriate action. The best-known example is Kodak, which filed for bankruptcy in 2012 despite having invented the digital camera and being the market leader.

To avoid these traps of hubris and inertia, an organization needs to develop an adaptive culture. An adaptive culture is made up of a set of attitudes and behaviors: learning from the past, being resourceful, adhering to best practices, integrating across the organization, being robust, and rehearsing resilience. Rehearsing resilience is a never-ending process. It is more than recovery after a crisis; it is persistence in the face of threat, and changing before change becomes a necessity.

Financial sector:

Like any other industry, the financial sector has its highs and lows. Incidents such as the collapse of Barings Bank and of Lehman Brothers (the latter at the height of the 2008 financial crisis) not only affected the organizations concerned but also had a contagion effect. The financial crisis of 2008 led the Basel Committee on Banking Supervision (BCBS) to introduce the Basel III reforms.

The root causes of the crisis were excess liquidity, insufficient good quality capital, increased leverage, and lack of transparency. The financial institutions had to turn to their central banks for liquidity support and some to their government for capital injection.

The BIS paper of June 2011 was titled “Basel III: A global framework for more resilient banks and banking systems”. The reforms focused on improving the quality of regulatory capital, increasing the level of capital, specifying a minimum leverage ratio, and introducing the liquidity coverage ratio (LCR) and the net stable funding ratio (NSFR). The focus here was on building financial resilience.
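As a rough illustration of how these ratios work, the sketch below computes the LCR, the NSFR and the leverage ratio and compares them with the Basel III minimums (100%, 100% and 3% respectively). The function names, the `example_bank` figures and the `shortfalls` helper are assumptions made for illustration only; the real calculations apply detailed regulatory weights and haircuts that are omitted here.

```python
# Simplified sketch of the Basel III resilience ratios.
# All names and figures below are hypothetical; regulatory weightings,
# haircuts and caps are deliberately left out.

# Minimum requirements once fully phased in.
MINIMUMS = {"lcr": 1.00, "nsfr": 1.00, "leverage": 0.03}


def liquidity_coverage_ratio(hqla, net_cash_outflows_30d):
    """High-quality liquid assets over net cash outflows in a 30-day stress."""
    return hqla / net_cash_outflows_30d


def net_stable_funding_ratio(available_stable_funding, required_stable_funding):
    """Available stable funding over required stable funding."""
    return available_stable_funding / required_stable_funding


def leverage_ratio(tier1_capital, exposure_measure):
    """Tier 1 capital over the total (non-risk-weighted) exposure measure."""
    return tier1_capital / exposure_measure


def resilience_ratios(bank):
    """Compute all three ratios from a dict of balance-sheet figures."""
    return {
        "lcr": liquidity_coverage_ratio(bank["hqla"], bank["net_outflows_30d"]),
        "nsfr": net_stable_funding_ratio(bank["asf"], bank["rsf"]),
        "leverage": leverage_ratio(bank["tier1"], bank["exposure"]),
    }


def shortfalls(ratios):
    """Return the names of the ratios that fall below their minimum."""
    return sorted(name for name, value in ratios.items() if value < MINIMUMS[name])


# Hypothetical bank, figures in billions of any currency.
example_bank = {
    "hqla": 120.0, "net_outflows_30d": 100.0,  # LCR  = 120%
    "asf": 95.0, "rsf": 100.0,                 # NSFR = 95%
    "tier1": 5.0, "exposure": 100.0,           # leverage ratio = 5%
}

ratios = resilience_ratios(example_bank)
print(ratios)
print("Below minimum:", shortfalls(ratios))
```

In this made-up example the bank holds enough liquid assets and capital but falls short on stable funding, which is the kind of gap the reforms were designed to expose.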

The supervisory role was also broadened with the introduction of the Supervisory Review and Evaluation Process (SREP). In this changed scenario non-financial risk also became important, as supervisors and regulators focused on assessing risk culture, governance, and compliance. The supervisory stance has become stricter globally, and financial institutions have been fined for non-compliance and for weak risk management systems and governance. The latest case is that of Citigroup, which was fined in October 2020 for deficiencies in risk management and internal controls.

Thus, it is important for every organization to note that environmental change and technological development are moving faster than the development of risk management tools, which makes the task of being resilient all the more pertinent.

Current scenario in Financial sector:

To keep pace with technological change, with consumers' increased demand for easy and instantaneous access to services, and with the shift to fintech (financial technology), financial institutions have had to outsource activities and rely more heavily on technology. As a result, the top operational risks are IT disruption and data management.

Some of the major incidents that have had a huge impact, in both financial and non-financial terms, are:

  • Banking information-technology (IT) failures: An example is TSB Bank, which separated from Lloyds Bank and moved to a new platform. Customers were unable to carry out online transactions, and wrong balances were shown in their accounts. Organizations going through mergers and acquisitions can face a similar situation. Even within a single organization, when a new platform is launched to simplify operations or enhance the customer experience, it is very important that it is approved by the appropriate committee and put through a use test so that the transition is smooth.
  • Ransomware attacks: Such attacks have brought the biggest and best of organizations to a standstill. A ransomware attack can lock a computer and encrypt important files, and the attacker then demands a ransom for restoring access to the system and files.
  • Climate change: Climate action failure and extreme weather have been identified as the top global risks. This is a growing concern not only for governments but for businesses too. Unchecked business growth is one of the reasons for the extreme shifts experienced in the climate, and the transition has been far slower than desired. Climate action requires cutting carbon emissions so that warming is limited to 1.5°C, whereas globally we are still on course for about 3°C. In these circumstances, if governments and organizations opt for a sudden transition, certain industries may be affected, but the economic damage would be smaller because the long-term impacts can be limited. The longer it takes to move to a green scenario, however, the more the severity and impact of events such as hurricanes, storms and floods will increase over time, leading to economic damage as well.

Thus, organizations will face either transition risk or physical risk. The Pacific Gas & Electric bankruptcy of 2019 has been described as the first climate change bankruptcy: the company was held liable for not adhering to the prescribed standards, as the fire was caused by the failure of a transmission line for which those standards had not been met. The dam collapse in Brumadinho (Brazil) in 2019 is one of the worst industrial accidents, as tons of toxic mining waste flowed into nearby areas (such mining dams are normally built from the mining waste itself). The company is facing murder charges along with environmental charges, as it failed to report warning signs.

  • Current outbreak of the coronavirus: COVID-19 has reduced in-person interaction, which has accelerated the shift to digital platforms in all areas and increased the demand for digital financial products.

Against this background, it needs to be assessed whether financial firms are prepared and equipped to deal with these changes as they arise. In other words, are they resilient?


Requirement to Build Operational Resilience: 

Every incident should be assessed from two angles: the strengths it revealed as well as the weaknesses. COVID-19 has made it necessary to revisit operational resilience standards, to identify critical/important business services and the employees who support them, and to ensure that those employees can safely resume their duties. In view of the prevailing circumstances, the Basel Committee on Banking Supervision (BCBS) issued a consultative document on Principles for Operational Resilience in August 2020, seeking comments from organizations. The document builds on the Principles for the Sound Management of Operational Risk (PSMOR). The two documents have been designed to work together, and they draw upon existing guidance and current practices. The final guidelines are expected to be issued soon and will serve as an integrated framework. In its document, BCBS defines operational resilience as “the ability to deliver critical operations through disruptions”. The document sets out key features across the following broad areas:

  • Governance
  • Operational Risk Management
  • Business Continuity Planning and Testing
  • Mapping Interconnections and Interdependencies
  • Third-Party Dependency Management
  • Incident Management
  • Resilient Information and Communication Technology, including cyber security

PSMOR: BCBS has also proposed to update PSMOR in the area of operational risk. The proposed changes are based on the review of financial institutions carried out in 2014. One of the highlights of the review was the need for a specific principle on information and communication technology risk management. As per the draft guidelines, the Information and Communication Technology (ICT) principle states: “Banks should implement robust ICT governance that is consistent with their risk appetite and tolerance statement for operational risk and ensure that their ICT fully supports and facilitates their operations. ICT should be subject to proper risk identification, protection, detection, response and recovery programs that are regularly tested. This requires incorporating appropriate situational awareness and conveying relevant information to users on a timely basis.” The updates proposed in these two consultative documents will enhance the clarity of the principles, add guidance on change management, and align the principles with the operational risk framework.

Approach to Build Operational Resilience:

An organization's risk depends on the nature, size and scope of its business. There is no off-the-shelf solution or blanket approach to building operational resilience. Identifying the firm's business risk perspective is very important before starting to develop the approach for building resilience.

  • Identification of critical functions: Critical functions are services provided to external users whose disruption could harm consumers, the firm's safety and soundness, the integrity of the market, or financial stability. To identify risk, it is very important to get to the root cause of the incidents that have come to light. All incidents should therefore be reported and escalated according to the velocity of the incident. External events (incidents reported by other organizations) and failed attempts also need to be considered to get the true picture and trend. Incident reporting shows where we went wrong in the past. In addition, present controls need to be continuously assessed and, if required, monitored through key indicators. Root causes range from change management, third-party failure, software issues, hardware issues, human error and process control failure to capacity management and external factors. To define criticality, the impact needs to be assessed and monitored against the tolerance limits set.
  • Risk Tolerance: This means setting impact tolerances for each important business service, such as the maximum acceptable outage time of the service. When setting the impact tolerance, the firm must assume that the incident has happened and then set the maximum tolerable level and duration of the disruption. Risk tolerance is different from risk appetite. Risk appetite is the level of risk the organization is willing to take; for example, the risk appetite for return on equity would be set higher than the cost of equity.
  • Mapping of systems and processes needed to support the important business services: While mapping systems and processes, it needs to be ensured that the action plan is not complex, that substitute resources are available, and that there is no overreliance on a single resource. The mapping and the plan must be well documented and communicated. Operating staff need to be made aware of the sensitivity and importance of the process.
  • Testing using plausible scenarios: Organizations need to build a library of severe scenarios that takes into account the rapidly changing environment and external incidents. This helps in identifying the low-frequency, high-severity vulnerabilities the organization is likely to be exposed to. The most important point is that no plausible scenario should be rejected on the ground that “it cannot happen to my organization”. An action plan needs to be in place for such scenarios, and the action plan itself should be put to the test. While testing, it is important to verify that the scenarios match the nature, size and scope of the organization's business activities. The action plan should clearly state the people, processes and systems that need to deliver at the time of crisis. A bottom-up approach works better, as resilience needs to be built not only into the design and functionality of systems and processes but also into the culture of the organization.
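The steps above (setting an impact tolerance, mapping dependencies, and testing against scenarios) can be sketched in a few lines of code. This is only a minimal illustration under simplifying assumptions: the class names, the example service, its dependencies and the outage figures are all hypothetical, and a working substitute is assumed to remove the outage entirely, ignoring failover time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BusinessService:
    name: str
    impact_tolerance_hours: float  # maximum tolerable outage for the service
    # Maps each resource the service depends on to its substitute resources.
    dependencies: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Scenario:
    name: str
    # Maps each affected resource to the hours it is unavailable.
    outages: Dict[str, float]


def single_points_of_failure(service):
    """Resources with no substitute, i.e. overreliance on a single resource."""
    return sorted(r for r, subs in service.dependencies.items() if not subs)


def projected_outage_hours(service, scenario):
    """Worst outage among affected resources that have no working substitute."""
    worst = 0.0
    for resource, substitutes in service.dependencies.items():
        hours = scenario.outages.get(resource, 0.0)
        if hours == 0.0:
            continue  # resource is not affected by this scenario
        # Simplification: a substitute that is itself unaffected covers the gap.
        if any(scenario.outages.get(s, 0.0) == 0.0 for s in substitutes):
            continue
        worst = max(worst, hours)
    return worst


def breaches_tolerance(service, scenario):
    """True if the scenario pushes the service beyond its impact tolerance."""
    return projected_outage_hours(service, scenario) > service.impact_tolerance_hours


# Hypothetical important business service and scenario library.
payments = BusinessService(
    name="online payments",
    impact_tolerance_hours=4.0,
    dependencies={
        "core banking platform": [],
        "primary data centre": ["secondary data centre"],
        "card processor (third party)": [],
    },
)
ransomware = Scenario("ransomware on core platform", {"core banking platform": 36.0})
dc_flood = Scenario("flood at primary data centre", {"primary data centre": 72.0})

print("Single points of failure:", single_points_of_failure(payments))
for scenario in (ransomware, dc_flood):
    print(scenario.name, "-> breach:", breaches_tolerance(payments, scenario))
```

In this made-up example the data centre flood stays within tolerance because a substitute exists, while the ransomware scenario breaches it because the core platform has none. Even a crude model like this makes single points of failure visible to operating teams.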



Internal challenges: Financial organizations face certain internal challenges, especially budget constraints and obtaining board approvals. Many institutions may still be using outdated technology systems while at the same time trying to meet market needs by innovating new products.

External challenges: In addition to budget constraints, there are external challenges such as emerging technologies (artificial intelligence, blockchain and distributed ledger technology), the growing sophistication of external cybersecurity threats, demand for crypto-assets, increased scrutiny of value for money from customers who readily switch to new providers, system complexity, and third-party risk. To drive innovation, organizations must balance the concentration risk that comes with economies of scale against spreading the risk of supplier failure.

Thus, the key threats that come out of the challenges and need to be focused on are:

  • Speed of technological change
  • Disruption from less established technologies
  • Increase in the frequency and severity of cyber-attacks
  • Physical risk due to climate change: Resilience will be put to the test by the physical risk arising from climate change and by the disruption caused by efforts to mitigate it

Organizations that lag in developing resilience or that have operational weaknesses will be targeted by fraudsters. These key threat areas are broadly similar for all organizations; it is the approach an organization adopts that will differentiate it in the long run. An organization that adapts and adopts a dynamic risk assessment methodology, one that is proactive, integrated and based on the concept of granularity, will gain a competitive advantage and increase its chances of survival. Going granular helps the organization identify its leading indicators. This not only reduces complexity but is also easy to communicate and implement, as the operational-level team can relate to it.


Way Ahead:

Organization level:

Organizations need to build on their existing governance and risk frameworks and keep pace with innovation. Operational resilience needs to be built into business plans, which requires clarity of purpose, clear roles and responsibilities (individual as well as collective), and skilling and upskilling at all levels (i.e. board, senior management and operating staff).

The approach adopted needs to be continuously reviewed to tackle disruptions, and a routine needs to be developed to address the resilience of critical/important business services. Transparency is also needed in regulatory reporting and in the disclosure of threats to critical/important business services.

Regulator level:

The regulator has an important role to play by setting standards, indicating best practices, and developing stress scenarios that consider the common challenges faced across sectors and geographies. Mapping of sector dependencies is required to reflect on the common challenges, understand the interconnectedness, and come up with a collective solution. Recovery and resolution may need to be carried out across the sector to address issues such as a complete lockdown. If required, a new framework can be developed to bring third parties within the regulatory ambit.


Operational resilience extends beyond business continuity planning, as it covers threats such as cyber-attacks, third-party failures, natural disasters, and geopolitical risks. Resilience needs to be recognized as a separate risk and managed accordingly. It requires not only building capabilities but also embedding systems and behaviors so that the organization can carry out its mission and implement its strategies in the face of any disruption.

