The Digital Operational Resilience Act (DORA) becomes applicable on January 17, 2025, and NetOps teams in the financial services industry must evolve accordingly.
Banks, insurance companies, investment firms, and their third-party ICT providers must now meet an enhanced set of requirements covering risk management, network resilience, incident reporting, and much more. If you need more detail, my colleague Ian Waters has covered the precise requirements of the DORA regulations and how ThousandEyes can help you meet them.
DORA isn’t a simple one-time to-do list that you can tick off and forget about. It’s a set of requirements mandating consistent monitoring and vigilance, not only of your own IT infrastructure but of your third-party partners’ infrastructure too. In effect, it makes financial institutions take full responsibility for their entire service delivery chain, even the parts they don’t directly control themselves.
Consequently, it’s going to require a new mindset for NetOps teams in this industry, which is why we’ve created a DORA checklist of three critical factors that NetOps teams must monitor on an ongoing basis—ideally using automation to minimize the workload and improve reliability.
The Digital Operational Resilience Act (DORA) Checklist
If the new DORA regulations apply to your team, here are the three things that you simply cannot afford to ignore.
1. Ensure your backup is always ready for action
You might argue that this should always have been the case, but now more than ever, backup systems simply must not fail. Your failover environment must be ready to go whenever it’s called upon.
We’ve seen examples from the financial industry where this hasn’t been the case in the recent past. In October 2023, for example, two major Singaporean banks experienced an outage that lasted the best part of a weekend after an upgrade to a chiller facility at one of their data centers didn’t go as planned, leading to the shutdown of IT equipment. Both banks reportedly activated their backup sites, but they were unable to fully recover services until the following day.
DORA not only requires financial institutions to have a secure backup system that allows services to resume quickly in the event of a disruption; it also demands that these systems are regularly tested to ensure their effectiveness. If your backup breaks when you need it, you’re not only going to be dealing with a raft of disgruntled customers, you may also be fined. For institutions relying on third-party ICT service providers, DORA mandates that financial entities evaluate the operational resilience of these providers. This process includes conducting simulations or stress tests to determine how effectively the third party can manage a significant disruption.
As a result, consistent monitoring of your backups continues to be vital, because even the most innocuous change can create unexpected consequences. A small change on a webpage can break authentication. The change of a security policy at a third-party cloud provider could deny access to a backup circuit.
You need to know in real time when these problems occur, with detailed monitoring of the entire service chain. You shouldn’t wait for something to go wrong on a live system and hope for the best.
Automation will play a vital role here, with systems such as ThousandEyes constantly probing the status of both your live and backup operations, flagging when a critical component is not performing as it should be.
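To make the idea of constantly probing both live and backup operations concrete, here is a minimal Python sketch. It is not the ThousandEyes API; the names `check_readiness`, `http_probe`, and the example URLs are hypothetical, and a plain HTTP check is assumed to be a meaningful health signal for your service.

```python
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProbeResult:
    target: str
    healthy: bool
    detail: str


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Hypothetical probe: healthy if the URL answers with a 2xx status."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return 200 <= resp.status < 300


def check_readiness(
    targets: Dict[str, str],
    probe: Callable[[str], bool] = http_probe,
) -> List[ProbeResult]:
    """Probe every target, live and backup alike, and record the outcome."""
    results = []
    for name, url in targets.items():
        try:
            ok = probe(url)
            detail = "ok" if ok else "probe returned unhealthy"
        except Exception as exc:  # a probe that raises counts as a failure
            ok = False
            detail = f"probe error: {exc}"
        results.append(ProbeResult(name, ok, detail))
    return results


def unhealthy_targets(results: List[ProbeResult]) -> List[str]:
    """Names of the targets that need attention."""
    return [r.target for r in results if not r.healthy]
```

Run on a schedule, something like `unhealthy_targets(check_readiness({"primary": "https://primary.example", "backup": "https://backup.example"}))` would flag a backup site that has quietly stopped responding long before a failover depends on it.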
2. Build a comprehensive monitoring system
There are two ways to know when things are going wrong: a robust monitoring system or when customers start complaining. You should definitely use the former rather than the latter. In fact, for financial services institutions serving the EU, DORA demands that you do.
Continuously validating that your alert and monitoring systems are working properly is crucial for detecting security breaches and other reliability issues. You can’t just assume that they’re working; they need to be checked periodically to ensure everything is in order. There’s no point in fitting smoke alarms if you don’t regularly check the batteries.
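One simple way to "check the batteries" is a heartbeat check: each monitor records when it last reported, and anything that has gone quiet for too long is flagged. The sketch below assumes you can obtain those timestamps from your own tooling; the function name and the monitor names in the usage note are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_monitors(
    last_seen: Dict[str, datetime],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the monitors whose last heartbeat is older than max_age."""
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, seen_at in last_seen.items() if now - seen_at > max_age
    )
```

For example, with a 15-minute `max_age`, a monitor called `synthetic-canary` that last reported 45 minutes ago would be returned, while one that reported two minutes ago would not. The timestamps are assumed to be timezone-aware so they can be compared safely.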
Validating the integrity and performance of your monitoring systems should be an ongoing, automated process to ensure they are reliable and ready to provide the necessary alerts and notifications. AI could also play a role here by examining billions of daily measurements from the Internet, cloud, and enterprise networks to identify issues and provide insights into how these affect users.
With ThousandEyes, for example, you can set the thresholds for metrics such as latency, response time, and packet loss and build dashboards that alert you when these KPIs aren’t being met. This information can be combined with other datasets to provide a comprehensive view of current application health and performance.
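As a rough illustration of threshold-based alerting on metrics such as latency, response time, and packet loss, here is a small Python sketch. It does not use the ThousandEyes API; the metric names, units, and threshold values are assumptions chosen purely for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    metric: str       # hypothetical metric key, e.g. "latency_ms"
    max_value: float  # highest acceptable value
    unit: str


def breaches(sample: Dict[str, float], thresholds: List[Threshold]) -> List[str]:
    """Compare one measurement sample against each threshold."""
    alerts = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None:
            # Missing data is itself worth alerting on.
            alerts.append(f"{t.metric}: no data")
        elif value > t.max_value:
            alerts.append(
                f"{t.metric}: {value}{t.unit} exceeds {t.max_value}{t.unit}"
            )
    return alerts


# Illustrative values only; tune these to your own service levels.
EXAMPLE_THRESHOLDS = [
    Threshold("latency_ms", 150.0, "ms"),
    Threshold("response_time_ms", 500.0, "ms"),
    Threshold("packet_loss_pct", 1.0, "%"),
]
```

Treating a missing metric as an alert is a deliberate choice here: a gap in the data can mean the monitor itself has failed, which ties back to validating the monitoring system.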
DORA also imposes stringent requirements on financial entities for reporting incidents to the relevant authorities. Without proper monitoring and reporting, you could once again be leaving your company exposed to regulatory action if an incident occurs that you failed to detect in real time.
3. Actively monitor third-party systems
Under DORA, your responsibility doesn’t end with your own physical infrastructure. The regulation requires financial institutions to continuously validate the integrity and performance of the third-party environments they rely on.
This means not only validating the performance of third-party services, but also checking for potential security risks such as a BGP route hijack or DNS poisoning that could ultimately affect your customers.
That’s no small task when you think about the number of third-party systems involved in service delivery. Cloud providers, CDNs, SaaS providers, DDoS mitigation services, and payment gateways are just some of the third-party systems that your services or applications depend on. And as we talked about in a recent blog post, relying solely on a service status page to discover when there’s a problem with a third-party ICT service provider may not be effective. Instead, teams should consider a variety of data points and have their own comprehensive monitoring in place for both owned and unowned services.
Once you’ve identified that a third-party ICT service provider is responsible for an outage you’re experiencing, it’s also important to have concrete data you can share with them—and internal colleagues—as you discuss the issue and collaborate to resolve the problem.
This also comes back to a topic that my colleague Kemal Sanjta wrote about recently: the importance of traffic engineering. Network engineers are constantly optimizing routing to ensure that traffic takes the optimal path, but that requires comprehensive monitoring of both ingress and egress traffic to make sure that your traffic is following the expected routes and that nothing problematic is occurring with third-party systems. With hundreds of different monitors all over the world, you can be notified rapidly if outages or route leaks occur.
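The BGP and DNS checks mentioned in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any particular product detects hijacks: it assumes you already collect observed route announcements (prefix and origin AS) and DNS answers from somewhere, and the function names, prefixes, and AS numbers are hypothetical documentation values.

```python
import ipaddress
from typing import Dict, Iterable, List, Set, Tuple


def unexpected_origins(
    expected: Dict[str, Set[int]],
    observed: Iterable[Tuple[str, int]],
) -> List[str]:
    """Flag announcements inside our prefixes that come from an unexpected AS.

    More-specific announcements are included, since they are a common
    hijack pattern.
    """
    findings = []
    for prefix, origin_asn in observed:
        seen = ipaddress.ip_network(prefix)
        for owned_prefix, allowed in expected.items():
            owned = ipaddress.ip_network(owned_prefix)
            if seen.version != owned.version:
                continue  # subnet_of() cannot compare IPv4 with IPv6
            if seen.subnet_of(owned) and origin_asn not in allowed:
                findings.append(
                    f"{prefix} announced by AS{origin_asn}, "
                    f"expected one of {sorted(allowed)}"
                )
    return findings


def unexpected_dns_answers(answers: Iterable[str], allowed: Iterable[str]) -> List[str]:
    """Return resolved addresses that are not on the allowlist."""
    return sorted(set(answers) - set(allowed))
```

A static allowlist is the simplest possible baseline. It will need updating whenever you or a provider legitimately change routing or hosting, so findings are best treated as prompts for investigation rather than proof of an attack.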
The DORA Era
These tips are designed to help financial institutions take the right steps in a DORA world, but they’re also simply good business practice in the first place, irrespective of the industry involved or where organizations trade.
Continuously and proactively checking that your backup is ready to step into action if your primary systems fail ensures you’re not wasting money on ineffective failsafes, and could save your company from disaster. Maintaining real-time monitoring systems is crucial for spotting problems before your customers do. And keeping a close eye on the performance of third-party environments ensures your systems are less likely to be dragged down when a partner is having a problem.
In that sense, our DORA checklist could be a worthwhile exercise for any NetOps team, not only those connected to the European financial services industry.