As enterprise investments in digital transformation continue to accelerate, the Internet is becoming increasingly mission-critical to worker productivity and business continuity. Cloud and SaaS consumption have increased reliance on Internet-related infrastructures so significantly that the Internet (and cloud provider networks) have essentially become the new enterprise backbone. Yet, the Internet is notoriously opaque, with IT organizations, historically, having little visibility into its inner workings—putting digital transformation investments at risk and making remote workforce management more difficult.
When vast numbers of employees shifted to working remotely after the COVID-19 pandemic began, enterprise IT professionals and Internet observers shared a collective existential question: “Is the Internet going to break?” In particular, enterprises that had relied on managed network services, such as MPLS, to connect employees to critical apps and data may have been hard-pressed to assess and communicate the Internet’s potential impact on business continuity. Many of these enterprises also had to quickly increase VPN capacity to meet the unexpected demands of an entirely remote workforce and, in some cases, leveraged cloud provider infrastructure to scale, pushing even greater amounts of traffic onto these networks. Content Delivery Network (CDN) providers also reported large-scale increases in usage as Internet users upped their digital consumption.
Could Internet infrastructures truly support and sustain this unforeseen, unplanned-for traffic onslaught without a substantial degradation in performance?
What began as an earnest effort to answer that question soon turned into a weekly podcast, a real-time Internet weather map, and a first-of-its-kind, measurement-based study of global Internet health, specifically tracking the impact of the COVID-19 pandemic on shared network infrastructures. This study has culminated in the report we are pleased to be releasing today, the Internet Performance Report: COVID-19 Impact Edition.
The Internet Performance Report examines the various network infrastructures critical to modern content and application delivery, specifically those belonging to Internet Service Providers (ISPs), as well as cloud, Content Delivery Network (CDN) and Domain Name System (DNS) providers. The report examines the resilience and behavior of these providers’ networks over time (January to July 2020), comparing each under “normal” conditions and under conditions without precedent in the Internet’s history. I encourage you to download your own copy of the report to read the full analysis.
Doomsday Averted; The Internet Is Fine
Despite early fear and speculation immediately following pandemic-related lockdowns, the state of the Internet was and remains healthy, with our network measurements (taken via active probing) showing little evidence of systemic network duress, even when traffic shifts and volumes were at their peak. With few exceptions, Internet-related infrastructures over the last six months have held up well. An initially concerning rise in network disruptions after the pandemic began (~63% increase post-February) was found, upon closer inspection, to have the hallmarks of traffic engineering activity, which reportedly increased amongst providers in order to meet changing service demands. In most regions, however, network disruptions are now at, or near, pre-pandemic levels, suggesting that the temporary increase was the result of a necessary adaptation by providers, enabling them to successfully scale capacity and demonstrate their operational agility in meeting unforeseen conditions.
In the course of examining the network availability and key performance indicators (e.g., packet loss and latency) of providers, we also provide insight into the operational patterns of the various types of providers, particularly how those patterns played out within different geographic regions. The result is a high-level, data-driven survey of networks and their operators over time.
Continue reading for a preview of some of our findings.
Cloud Providers Demonstrated Greater Stability than ISPs
With consistently excellent performance throughout the period examined, cloud provider networks showed themselves to be highly available relative to ISP networks, both before and after February. ISPs experienced roughly 10x the number of outages that cloud providers did (shown in Figure 2), despite similar infrastructure coverage in the data set. Between January and July 2020, cloud providers had ~400 outages versus more than 4,500 in ISP networks (excluding China). The purpose-built, software-defined networks employed by the cloud providers may be at least one of the reasons behind this resiliency advantage.
But fewer outages do not necessarily mean less disruptive outages.
While cloud providers had fewer network disruptions overall, the indiscriminate timing of those disruptions made many of them more disruptive to local users than ISP outages. Many ISP outages, particularly in North America, took place outside of business hours. In contrast, and despite their lower numbers, cloud provider network outages tended to take place more frequently during business hours than during other periods.
CDN and DNS Providers Were Mostly Unscathed
CDN and managed DNS providers experienced few network-related disruptions during the first half of 2020. When disruptions did occur, within CDN providers specifically, their pattern suggested maintenance events or automation-gone-awry rather than a systemic performance issue (e.g., network congestion). For public DNS services, fluctuating patterns of longer and shorter resolution times between weekdays and weekends appeared to be related to changing usage as workers moved from offices to the home environment. Overall, however, it’s important to note that DNS providers performed consistently, with response times staying within reasonable limits for a reliable user experience. As with the CDN providers, managed DNS providers experienced few outages within their networks.
Not All Outages Are Equal, or Equally Impactful to Users
Overall, we observed considerable variation in availability patterns based on the type of provider, the region, and even the specific location. When evaluating providers for your enterprise based on their resilience, it is important to remember that the characteristics and potential user impacts of outages are critical factors, not just the total number of outages. Impact isn’t necessarily determined by whether an outage occurs, but by when it occurs. The context of an outage (such as whether it occurred on a weekday or a weekend, or during business hours or the middle of the night) can mean the difference between an outage that makes headlines and one that goes unnoticed.
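To make the "when, not just whether" point concrete, here is a minimal sketch of how you might tag your own outage records by timing. It is an illustration only, not the methodology used in the report: the `Outage` record, the provider names, the sample timestamps, and the definition of business hours (Monday to Friday, 09:00 to 18:00 local time) are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Sequence


@dataclass
class Outage:
    """Hypothetical outage record; not a schema from the report."""
    provider: str
    start: datetime  # assumed to be local time at the affected location


# Assumed definition of business hours: Monday-Friday, 09:00-18:00 local time.
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)


def in_business_hours(outage: Outage) -> bool:
    """Return True if the outage began on a weekday within business hours."""
    is_weekday = outage.start.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and BUSINESS_START <= outage.start.time() < BUSINESS_END


def business_hours_share(outages: Sequence[Outage]) -> float:
    """Fraction of outages that began during business hours (0.0 if none)."""
    if not outages:
        return 0.0
    hits = sum(in_business_hours(o) for o in outages)
    return hits / len(outages)


# Made-up sample data, for illustration only.
sample = [
    Outage("isp-a", datetime(2020, 4, 6, 2, 30)),    # Monday, 02:30
    Outage("isp-a", datetime(2020, 4, 11, 14, 0)),   # Saturday, 14:00
    Outage("isp-a", datetime(2020, 4, 8, 23, 15)),   # Wednesday, 23:15
    Outage("cloud-b", datetime(2020, 4, 7, 11, 0)),  # Tuesday, 11:00
]

for name in ("isp-a", "cloud-b"):
    subset = [o for o in sample if o.provider == name]
    print(name, len(subset), business_hours_share(subset))
```

In this made-up sample, the provider with three outages has none during business hours, while the provider with a single outage has it land mid-morning on a weekday. A raw outage count would rank them the other way around, which is why timing is worth tracking alongside totals.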
Large-scale working from home is expected to persist for some time, perhaps never fully returning to pre-pandemic levels. Enterprises are also increasingly realizing the benefits of using cloud provider services to operate with greater agility, a trend that also appears to be here to stay. With so much dependence on external networks and services, an understanding of not only the performance but also the availability and operational tendencies of your providers (and their peers) is critical to thriving in a world that has been both digitally and socially transformed.
Vendor and peer selection, collaboration, and assertive management cannot be done with a one-size-fits-all approach. Global businesses should consider the regional, as well as individual, availability and performance patterns of critical providers, so that enterprise IT is able to understand its new operating environment and make more effective planning and management decisions. For greater detail and analysis on the brief summaries provided here, check out the infographic below and be sure to download the full report.