This blog post was written by Barry Collins, a technology writer and editor who has worked for numerous publications and websites over his 20-year career.
Public cloud services are similar to real clouds: always changing, affected by regional conditions, and moving in unpredictable ways. Yet, with cloud services being used to provide payment gateways, web analytics, back-end databases, and all kinds of other business-critical services, knowing how they operate—and how traffic flows within, between, and into them—is critical for any customer.
This is where the ThousandEyes Cloud Performance Report comes in. It offers a clear, unbiased insight into the behavior of public cloud services, helping customers understand which provider will best suit their individual needs. Ahead of the publication of this year’s Cloud Performance Report, we examine why it’s vital to get maximum visibility over your public cloud services.
Tracking Traffic
One trend that has become apparent in recent Cloud Performance Reports is that it’s becoming harder to keep track of your traffic within public cloud infrastructure. This partly stems from how cloud providers route traffic once it has entered their network and how the cloud impacts performance for some locations, regions, and applications.
Some public cloud providers favor what Mike Hicks, Principal Solutions Analyst at ThousandEyes, refers to as “hot-potato routing,” where a “cloud service provider hands off traffic to the Internet as quickly as possible.” Others prefer to rely on their own backbone networks, an approach sometimes called cold-potato routing, where the provider “carries traffic as far as possible on its own network before handing it off downstream,” said Hicks.
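The distinction can be made concrete with a toy classifier over traceroute-style hop data. This is only an illustrative sketch: the ASNs, IP addresses, and the 50% threshold below are all hypothetical, not how any provider or ThousandEyes actually measures routing behavior.

```python
# Toy sketch: infer hot-potato vs. cold-potato routing from the share of
# path hops that sit inside the provider's own network (ASN).
# All ASNs, addresses, and the threshold are hypothetical.

def classify_routing(hops, provider_asn, threshold=0.5):
    """hops: ordered list of (hop_ip, asn) tuples from source to destination.

    Hot-potato: the provider hands traffic off to the Internet early,
    so few hops belong to its ASN. Cold-potato: the provider carries
    traffic on its own backbone for most of the path."""
    in_provider = sum(1 for _, asn in hops if asn == provider_asn)
    share = in_provider / len(hops)
    return "cold-potato" if share >= threshold else "hot-potato"

# Hypothetical path: the provider (ASN 64500) hands traffic to transit
# networks after only two hops, suggesting a hot-potato strategy.
path = [("10.0.0.1", 64500), ("10.0.0.2", 64500),
        ("203.0.113.1", 64501), ("198.51.100.1", 64502),
        ("192.0.2.1", 64503)]
print(classify_routing(path, 64500))  # hot-potato
```

A real analysis would resolve each hop's ASN via a lookup service and aggregate over many paths, but the underlying question is the same: how long does the provider keep your traffic on its own network?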
The 2022 Cloud Performance Report identified notable differences between the major cloud providers, with Amazon’s AWS leveraging the Internet to a greater extent than rivals Google Cloud and Microsoft Azure. And once traffic enters the cloud provider’s network, it can be difficult to keep tabs on it. The 2022 report highlighted how Google obscured 33% of forward paths once traffic had entered its network, for example.
“Cloud providers use complex rules to determine the routing of traffic, taking into account the changeable conditions of Internet-connected networks,” said Hicks. “However, such routing decisions may not always be apparent or desirable. In some cases, domestic traffic intended for a domestic destination may be routed via a second country due to the way an outsourced solution is architected.”
Although Hicks is keen to stress that the public cloud providers are not acting maliciously when they divert traffic, it can create security or regulatory concerns for customers. “This may open an organization to security or geopolitical risks,” said Hicks. “For sovereignty-conscious organizations, it is critical to know where data is and the path it's taking between two points at all times. Path visualization should be used to make all possible network paths transparent and observable, including the complex peering relationships that underpin these paths.”
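The sovereignty check Hicks describes can be sketched as a simple filter over a geolocated network path. Everything here is hypothetical: the hop-to-country mapping stands in for a real GeoIP lookup, and the approved-country set is an example policy, not a recommendation.

```python
# Toy sovereignty check: flag any hop on a network path whose geolocated
# country falls outside an approved set. The GEO mapping is hypothetical;
# a real tool would use a GeoIP database or a path-visualization service.

APPROVED = {"DE"}  # example policy: domestic traffic must stay in Germany

GEO = {"10.0.0.1": "DE", "10.0.0.2": "DE",
       "203.0.113.1": "FR", "198.51.100.1": "DE"}

def sovereignty_violations(path):
    """Return (hop, country) pairs for hops outside the approved set."""
    return [(ip, GEO[ip]) for ip in path if GEO.get(ip) not in APPROVED]

path = ["10.0.0.1", "10.0.0.2", "203.0.113.1", "198.51.100.1"]
print(sovereignty_violations(path))  # [('203.0.113.1', 'FR')]
```

Run continuously across all observed paths, a check like this is how “domestic traffic routed via a second country” gets caught before it becomes a compliance problem.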
Businesses are used to ensuring that data is stored in the right place when it’s at rest. However, the increasing reliance on public cloud networks is forcing more organizations to think carefully about where their data is going when it’s in transit, too.
“Data in flight is an emerging consideration for many organizations when it comes to data sovereignty,” said Hicks. “While most conversations about data sovereignty tend to focus on data at rest, it is important to acknowledge that data is not a static asset.”
“To create value from data, it is often necessary to move it around from edge-based collection points to a central warehouse or lake, through data pipelines, and in and out of analytics models,” he added. “Furthermore, the distributed nature of organizational structures and IT infrastructure means there is constant data movement between people, nodes, and locations.”
Variable Performance
Data sovereignty is not the only concern when traffic takes an unexpected path through a public cloud network—performance can also be affected.
“Cloud providers have certain preferences and priorities when it comes to resolving issues and optimizing performance,” said Hicks. “These preferences are not based on different service tiers but on the preferential handling of specific traffic categories. The reasons behind this preferred treatment are unclear, but they may be related to traffic or market conditions. Both of these factors are reasonable to consider when managing shared networks.”
However, if customers have a better understanding of these factors after reading the Cloud Performance Report, they can make better-informed choices. “Organizations need to be aware of their position in relation to these preferences and prioritizations and whether they could be affected by them,” said Hicks. “Understanding how and what services are accessed by the user population is key to making the right decision.”
And, while you might think cloud providers that rely on their own backbone more heavily than the more outage-prone public Internet are likely to offer better performance, that’s not always the case. “Just because one provider favors a backbone over an Internet-centric approach doesn't mean it will be the best choice for you,” said Hicks. “One size doesn't fit all. It's not just about where your workloads are located—you must also consider other services, such as DNS, database, etc., offered by that provider.”
“It may be that while performance is very quick to enter a particular provider's backbone, the variable capacity and backbone path for that provider are not as compatible with an organization's requirements as accessing the Internet for that carriage and vice versa,” Hicks added.
“With better knowledge of users and expected performance, it may be that customers can accept a slightly lower level of performance consistency or higher latency when making API calls to an application compared to the way it’s been specified. That could open the door to hosting it in a different availability zone, cloud region, or instance size, or to configuring the application’s underlying infrastructure differently altogether.”
Of course, these factors are always changing, so it’s critical that public cloud customers have a persistent view of what’s happening with their traffic. “Organizations that have continuous visibility into the various cloud ecosystems are able to consistently make infrastructure choices that are aligned with user needs and to ensure these choices remain optimal, even as ecosystems evolve and new options become available,” added Hicks.
Regional Bottlenecks
Even though the major public cloud providers are continuing to build out their own infrastructure, there are parts of the world where they must still rely on shared physical links. As the 2022 Cloud Performance Report highlighted, the primary flow of international traffic into Australia comes through sea cables. Close examination of a latency spike spanning a two-day period in July 2020 showed a nearly identical pattern for both Azure and Google Cloud traffic entering the country.
Pure coincidence? It’s unlikely. Instead, it highlights that even the major cloud providers are vulnerable to unpredictable events such as undersea volcanoes, ships, storms, or even sharks damaging those shared sea cables.
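One way to argue it's not a coincidence is to correlate the two providers' latency series: if spikes line up hour for hour, a shared physical link is the likeliest explanation. The sketch below uses made-up numbers purely to show the technique; the real analysis in the report rests on ThousandEyes measurement data.

```python
# Toy sketch: Pearson correlation between two providers' latency series
# to spot a shared-infrastructure event. All latency values are made up.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Hypothetical hourly latencies (ms) into a region for two providers;
# both spike over the same hours, hinting at a shared subsea cable.
provider_a = [42, 41, 43, 95, 97, 94, 44, 42]
provider_b = [38, 39, 37, 88, 91, 86, 40, 38]
print(round(pearson(provider_a, provider_b), 2))
```

A coefficient near 1.0 across independent providers is a strong hint that the bottleneck sits below both of them, in shared infrastructure rather than in either network.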
Even on a less dramatic, day-to-day level, regional bottlenecks can create problems. “Network performance can differ significantly over time and in different regions,” said Hicks. “As we mentioned, some cloud providers prefer to route traffic via the Internet and only bring it on-net closer to their physical locations, while others attempt to bring traffic into their networks as close to its origin as possible, regardless of its destination.”
“Businesses must have oversight of their connectivity, whether traffic is inside or outside a public cloud provider, taking into account regional performance conditions, route diversity, Internet sovereignty, legal compliance, and organizational policy as appropriate.”
Better Informed Decisions
Whether it’s greater visibility over the international routes that public cloud traffic can take, understanding the factors that can impact performance, or being aware of the regional bottlenecks that can affect any cloud provider, the forthcoming Cloud Performance Report will leave customers better informed when it comes to making critical decisions about their infrastructure.
The report will highlight a new range of potential dependencies or irregularities that customers of the public cloud platforms may face, allowing them to make smarter decisions about optimizing their architecture. That can save companies a lot of wasted effort. As Mike Hicks explains, “While it's easy to say that organizations and their tech teams should architect for the cloud more intelligently, the question is how?”
“The answer is through visualization. Visibility and simulations of these decisions are necessary. Tech and product teams often spend time and resources optimizing the part of the architecture they think is the biggest issue, only to later realize that it wasn't.” Once the significant performance bottlenecks have been identified, “organizations can design their applications and workloads for efficiency and begin a path of continuous improvement gains,” Hicks added.
The report might even highlight areas where a customer can lean on a cloud provider to improve its performance. “In addition to understanding their own cloud provider's performance, enterprises should also determine what good performance looks like by reviewing their peers' performance,” said Hicks.
“This way, they can ensure that their cloud provider is performing in line with other providers and collaborate with them if that is not the case.”