
The Internet Report

How AI Is Transforming the Internet Landscape

By Mike Hicks | 15 min read

Summary

The demands of AI workloads are evolving Internet infrastructure, leading organizations to shift to a more distributed model and creating an increasingly complex service delivery chain that NetOps teams must manage.


This is The Internet Report, where we analyze outages and trends across the Internet through the lens of Cisco ThousandEyes Internet and Cloud Intelligence. I’ll be here every other week, sharing the latest outage numbers and highlighting a few interesting outages. This week, we’re taking a break from our usual programming for a special conversation on how AI workloads are transforming Internet infrastructure. As always, you can read more below or tune in to the podcast for firsthand commentary.

AI’s Impact on Internet Infrastructure

Artificial intelligence (AI) isn't only changing the way people use the Internet; it's changing the plumbing of the Internet itself. The demands of generative AI workloads are prompting even the biggest hyperscalers to rethink connectivity, bringing smaller data centers back closer to users and once again creating a more distributed Internet topology.

Let's dive deeper into how AI is reshaping the Internet landscape and what it means for the network operations (NetOps) teams who are managing it.

The AI Bandwidth Boom

Generative AI workloads are fundamentally reshaping Internet infrastructure. The computational demands of training large language models and serving billions of inference requests require unprecedented levels of bandwidth, specialized interconnects, and distributed computing resources. Traditional data center models—where hyperscalers concentrate compute in massive regional hubs—are proving inadequate for AI's unique requirements: massive parallel processing for training, ultra-low latency for user-facing inference, and enormous data transfer volumes between facilities.

Existing backbone infrastructure is being pushed to its limits, which is why we've seen major telecommunications players investing billions in fiber network companies and long-haul capacity. These investments create the high-speed interconnects needed to link distributed AI infrastructure across geographic regions.

In January 2025, Zayo announced it would be building more than 5,000 long-haul fiber route miles to meet the increasing demands of AI workloads. Lumen Technologies reported securing more than $8 billion worth of deals with companies such as Google and Microsoft to boost their AI services.

One of the key drivers behind these infrastructure investments is that AI workloads are fundamentally different from traditional cloud computing. AI training requires enormous bandwidth between GPUs within data centers, while AI inference—the user-facing applications—demands low-latency connections to edge locations near end users. This dual requirement is incompatible with the centralized hyperscaler model, creating a resurgence in demand for more distributed, localized facilities. Additionally, leaning on third-party network and data center resources enables companies to scale quickly to meet burgeoning AI demands, likely faster than if they built all the infrastructure themselves.

The Distributed Model

The Internet infrastructure is evolving from regionally concentrated hyperscaler hubs to a multi-tier distributed architecture: massive training facilities (neoclouds) where models are built, hyperscaler infrastructure for orchestration and storage, and edge facilities where models serve end-user requests. This resembles the distributed nature of the early Internet, but with specialized tiers optimized for different AI workload types.

The hyperscalers allowed organizations to move applications into the cloud, accelerating and improving what teams could deliver. But AI's unique demands require compute power distributed to the edge, so organizations are deploying AI inference workloads across multiple points of presence, with these new, smaller edge data centers providing the low-latency compute infrastructure necessary for responsive AI applications.

As part of this shift, the telecommunications companies are now repurposing existing infrastructure to support AI inference workloads—the responsive, user-facing applications that need to be geographically close to end users. Buildings that might previously have gone unused or been transitioned out of their real estate portfolio are being transformed into edge compute facilities. Telcos have office buildings and telephone exchanges in towns and cities that can serve as starting points for this new distributed architecture.

Several telecommunications providers have begun evaluating their existing exchanges as landline telephony becomes largely redundant; where suitable, these buildings are being retrofitted as integrated cloud nodes for AI inference workloads. While significant upgrades are typically required—including enhanced power delivery systems, modern cooling infrastructure, and high-density rack configurations—these existing buildings offer substantial advantages over building data centers from scratch.

These facilities are already connected to the power grid and served by existing fiber connections, eliminating the need to dig up roads for new fiber runs. Their urban locations position them close to end users, which is critical for AI inference applications where every millisecond of latency matters to user experience. These edge facilities typically handle inference workloads—the AI applications that respond to user queries—rather than the power-intensive training workloads.

The Rise of the Neoclouds

Alongside edge infrastructure, we're also seeing the emergence of neoclouds—purpose-built AI training facilities located in areas with abundant power and space, often away from major population centers.

AI training workloads are fundamentally different from inference. Training large language models requires thousands of high-performance GPUs working in parallel, consuming megawatts of electricity and generating enormous amounts of heat. These facilities need specialized infrastructure including high-density power delivery, advanced cooling systems (often liquid cooling), and ultra-high-speed GPU interconnects using technologies like NVIDIA NVLink and InfiniBand to enable the massive parallel processing required for training.

Rather than competing for scarce resources in major metropolitan areas, these neoclouds are built where there's available power capacity, space for expansion, and access to cooling resources. The physical location is less critical for training workloads because they're not serving real-time user requests—they're running batch processing jobs that may take days or weeks to complete. The latency consideration for training isn't about millisecond response times, but rather about sustained throughput and the ability to reliably complete data-intensive workflows that may run continuously for extended periods—from days to months depending on model size and complexity.

Organizations can purchase access to these neoclouds from specialized providers, running their AI model training without building the infrastructure themselves. The compute power is available on-demand, but the facilities themselves are permanent installations with the power, cooling, and GPU fabric infrastructure that training workloads demand.

These facilities still need to be linked to the broader Internet ecosystem, of course—to receive training data, store model weights, and deliver trained models to inference infrastructure. This is another driver behind the huge investment in building fiber interconnects, creating a high-speed network connecting training facilities, hyperscaler storage, and edge inference locations.

Even the hyperscalers are shifting to become customers of this ecosystem. Instead of trying to build and own everything themselves, they're looking to partner with neocloud providers and rent GPU capacity for their own or customers' training workloads. When your company is training AI models using a hyperscaler's services, it might actually be running on this remote neocloud infrastructure—which is important for NetOps teams to understand as they map out their end-to-end service delivery chain.

A More Complex Service Delivery Chain

This new multi-tier Internet topology creates a significantly more complex service delivery chain with new potential points of failure that NetOps teams need to understand and plan for.

Consider what happens when an end user interacts with an AI application: The request travels from the user through ISP networks to an edge inference facility, which may need to retrieve or access cached model weights, potentially call additional AI models or services that themselves may be running in different locations, and return results back through the entire chain. Each interaction potentially traverses multiple networks, providers, and infrastructure types—and the path may be different each time depending on load balancing, available resources, and which specific AI models or tools are invoked.
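To make that chain concrete, here's a minimal sketch in Python that models an AI request's path as an ordered list of hops and shows where an end-to-end latency budget is spent. The hop names and per-hop latency figures are hypothetical illustrations, not measured values; in practice each hop would map to a different provider and administrative domain.

```python
# A toy model of the AI service delivery chain described above.
# Hop names and per-hop latency budgets are hypothetical, for illustration only.
service_chain = [
    ("user -> ISP access network", 10),
    ("ISP -> edge inference facility", 15),
    ("edge inference (model execution)", 120),
    ("edge -> hyperscaler (orchestration / cached weights)", 25),
    ("downstream model or tool call", 80),
    ("return path to user", 25),
]

total_ms = sum(latency for _, latency in service_chain)
print(f"End-to-end latency budget: {total_ms} ms")
for hop, latency_ms in service_chain:
    share = 100 * latency_ms / total_ms
    print(f"  {hop:<55} {latency_ms:>4} ms ({share:.0f}%)")
```

Even in this simplified view, a delay or failure at any single hop, several of which belong to third parties, shows up to the end user as a slow or broken AI application.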

We now have:

  • More interconnected systems across multiple providers (telcos, neoclouds, hyperscalers, edge facilities)

  • Greater distribution of compute power across different infrastructure tiers

  • Heterogeneous performance characteristics (edge inference vs. hyperscaler orchestration vs. neocloud training)

  • Dynamic routing where the service path changes based on which AI models are called

  • Dependency chains when AI models invoke other models or external services

  • Multiple administrative domains, each with different service level agreements (SLAs) and performance profiles

Having comprehensive visibility into how all these different components interact is even more critical than before the AI boom. NetOps teams must be able to answer questions like: Where is this specific model being served from? What's the network path to that neocloud provider? Which ISP interconnects are in the critical path for our AI services? How do we detect when performance degradation is occurring at a third-party edge facility versus our own infrastructure?
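As a starting point for the first two questions, here's a small sketch using only the Python standard library that checks where DNS sends a given inference request and roughly how far away that serving location is in network terms. The endpoint hostname is hypothetical; substitute whatever your application actually calls, and treat TCP connect time only as a coarse proxy for network distance.

```python
# Minimal reachability/latency check using only the Python standard library.
import socket
import time

ENDPOINT = "inference.example-edge-provider.net"  # hypothetical hostname
PORT = 443

# Step 1: see which serving location DNS hands back for this client.
t0 = time.monotonic()
addr_info = socket.getaddrinfo(ENDPOINT, PORT, type=socket.SOCK_STREAM)
dns_ms = (time.monotonic() - t0) * 1000
ip = addr_info[0][4][0]
print(f"{ENDPOINT} resolves to {ip} (DNS lookup: {dns_ms:.1f} ms)")

# Step 2: measure TCP connect time as a rough proxy for network distance
# to that serving location.
t0 = time.monotonic()
with socket.create_connection((ip, PORT), timeout=5):
    connect_ms = (time.monotonic() - t0) * 1000
print(f"TCP connect to {ip}:{PORT} took {connect_ms:.1f} ms")
```

Run from multiple vantage points, even a simple check like this can reveal when different user populations are being steered to different edge facilities with very different performance profiles.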

These questions matter not only from a performance perspective, but for data governance, security, and compliance purposes. AI training data, model weights, and inference requests may traverse multiple jurisdictional boundaries across this distributed infrastructure, creating complex regulatory compliance requirements that NetOps teams must help their organizations navigate.

Learning To Live in an AI World

The shape of the Internet is transforming to meet AI's demands. We've evolved from a highly distributed early Internet, through an era of concentrated hyperscaler hubs, to today's emerging multi-tier distributed model where different types of compute and data processing happen in facilities optimized for specific workload types.

This architectural evolution creates new challenges for NetOps teams. The distributed, multi-provider nature of AI infrastructure increases the number of potential failure points. We're not managing a single architecture anymore—we're managing interconnected architectures that must work seamlessly together, often across organizational boundaries. Some of these services and connections are spun up dynamically based on demand, making the topology fluid rather than static.

With more providers involved, more interconnects between facilities, and more specialized infrastructure types, there are more potential areas for service degradation. But this complexity also creates opportunities for optimization—teams that understand the full service delivery chain can make informed decisions about provider selection, traffic routing, and workload placement.

The Internet is getting smarter, but it's also getting more architecturally complex to support those AI capabilities. Organizations prepared with the visibility they need to optimize performance across this distributed model will have a clear advantage.

By the Numbers

Let’s close by taking our usual look at some of the global trends ThousandEyes observed across ISPs, cloud service provider networks, collaboration app networks, and edge networks over recent weeks (October 20 - November 2).

Global Outages

  • From October 20-26, ThousandEyes observed 199 global outages, representing an 86% increase from 107 the prior week (October 13-19).

  • During the week of October 27 - November 2, global outages decreased 52%, dropping to 96.

U.S. Outages

  • The United States mirrored the global trend, with outages increasing to 99 during the week of October 20-26, representing a 52% increase from the previous week's 65.

  • During the week of October 27 - November 2, U.S. outages similarly decreased 55%, dropping to 45, following the broader global pattern of reduced network disruptions.

  • Over the two-week period from October 20 - November 2, the United States accounted for 50% of all observed network outages.

Month-over-month Trends

  • Global network outages decreased 47% from September to October 2025, declining from 1,316 incidents to 701. This reverses the pattern observed in previous years, when outages appeared to increase from September to October.

  • The United States showed a similar pattern with a 45% decrease, with outages dropping from 730 in September to 404 in October, closely tracking the global trend.
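For readers who want to see how these percentages are derived, here's a short Python snippet that recomputes the week-over-week and month-over-month changes from the outage counts cited above; it's purely illustrative arithmetic on the figures already quoted.

```python
# Recompute the percentage changes from the outage counts quoted above.
def pct_change(previous, current):
    return 100 * (current - previous) / previous

figures = {
    "Global, Oct 20-26 vs. Oct 13-19": (107, 199),
    "Global, Oct 27-Nov 2 vs. Oct 20-26": (199, 96),
    "U.S., Oct 20-26 vs. Oct 13-19": (65, 99),
    "U.S., Oct 27-Nov 2 vs. Oct 20-26": (99, 45),
    "Global, Oct 2025 vs. Sep 2025": (1316, 701),
    "U.S., Oct 2025 vs. Sep 2025": (730, 404),
}

for label, (previous, current) in figures.items():
    print(f"{label}: {pct_change(previous, current):+.0f}%")
```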

Figure 1. Global and U.S. network outage trends over eight recent weeks (September 8 - November 2, 2025)

