Two years ago, we launched the inaugural edition of the Public Cloud Performance Benchmark Report, a metric-based, unbiased study that compares the network performance and connectivity architectures of the top public cloud providers. We soon realized that, because there is no steady state in the cloud, yearly updates were necessary to reflect its most recent state. That led us to release the 2019-2020 Cloud Performance Benchmark this past fall.
But in “cloud years,” even 12 months can be a long time. With cloud providers continuously updating their networks and services, the “State of the Cloud” warrants a more frequent refresh. That’s why we began a Quarterly Update section within the appendix of the report itself, where we share the most recent changes we have seen. One area that caught our attention recently is the performance enhancement to AWS Global Accelerator, which we detail in this blog post.
A Quick Primer on AWS Global Accelerator
If you are not familiar with AWS Global Accelerator, here is a quick primer on the service offering. In short, it is a pay-as-you-go service delivered by AWS that improves the availability and performance of your AWS-hosted application for global users. To understand how AWS does this, let me provide some context on how the cloud connectivity architecture (how users around the globe access cloud workloads) of AWS works.
By default, traffic destined for AWS-hosted services, irrespective of the region, traverses the public Internet and enters the AWS backbone only at the point closest to the hosting region. Traffic hot-potatoes this way because AWS does not anycast the public IP addresses associated with each of its regions from its global edge locations.
Let’s see what this looks like in ThousandEyes. Figure 2 below shows the end-to-end network path from user vantage points, deployed in Tier 2 and Tier 3 ISPs, on the left to a service hosted in AWS’ us-east-1 region on the right. The path visualization shown below breaks down the connectivity to a hop-by-hop layer 3 view, geolocating every hop along the way. As seen in Figure 2, without Global Accelerator, traffic from the user, irrespective of the user’s location, enters AWS’ backbone closest to the region where the service is hosted.
Introducing AWS Global Accelerator
AWS Global Accelerator is a solution that alters this default behavior. It is a commercially available service that enterprises can pay for to leverage the benefits of AWS’ densely connected backbone network. Instead of using the Internet to carry user traffic, AWS Global Accelerator directs traffic to optimal endpoints on the AWS edge network by anycasting static IP addresses designated for your service. You can find the complete list of AWS edge locations here. As a result, traffic enters the AWS network closest to the user and makes its way to the destination service region through the AWS private backbone, as seen in Figure 3 below.
In 2019, we evaluated the performance of AWS Global Accelerator in terms of round-trip network latency, jitter and standard deviation over a four-week period from 38 global vantage points, and we saw a clear improvement in performance. However, this uplift was not consistent across all global locations. For granular metrics from each of the vantage points to five AWS regions, please refer to the AWS Global Accelerator section under "Findings and Takeaways" in the 2019-2020 Cloud Performance Benchmark.
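To make the three metrics concrete, here is a minimal sketch of how they could be derived from a series of round-trip time samples collected at one vantage point. The function name `summarize_rtt` and the sample values are hypothetical. The jitter definition used here (mean absolute difference between consecutive samples) is an assumption and not necessarily the exact method behind the report.

```python
from statistics import mean, pstdev


def summarize_rtt(samples_ms):
    """Summarize round-trip time samples (in milliseconds) from one vantage point.

    Assumption: jitter is taken as the mean absolute difference between
    consecutive samples; other definitions exist.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to compute jitter")
    # Differences between each sample and the one that follows it.
    deltas = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    return {
        "mean_latency_ms": mean(samples_ms),
        "jitter_ms": mean(deltas),
        "stdev_ms": pstdev(samples_ms),  # population standard deviation
    }


# Hypothetical samples, for illustration only (not data from the report).
example = summarize_rtt([70.0, 72.0, 71.0, 75.0])
print(example)
```

Comparing these summaries for the same vantage point with and without Global Accelerator gives the kind of per-location view the report describes.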
It is important to note that network performance can be influenced by a variety of factors. These include the network in which the user or vantage point is located, AWS’ peering relationships with ISPs (which, by the way, are pretty extensive), routing policies with global ISPs and whether the closest edge location supports Global Accelerator. Given these contributing factors, enterprises should always evaluate the readiness of a new deployment from vantage points that are representative of their customers.
Over the last six months, the AWS Global Accelerator team has been making improvements to its network, such as improving routing for optimized paths to AWS’ edge network, supporting Global Accelerator PoPs in newer cities and introducing performance-enhancing features such as TCP termination. In March 2020, we re-ran the tests to understand the impact of these optimizations. Check out the latest edition of the report for updated performance metrics that include network latency, jitter and standard deviation.
Note: TCP termination was introduced by AWS only recently and was not included in the original report. To maintain parity, it was not included in the quarterly refresh either.
We noticed significant improvements to network performance and network connectivity for a few regions. Table 1 below compares round-trip network latency from user vantage points to a service in us-east-1 (Ashburn). Let’s look at Los Angeles, for instance. In October 2019, a user connecting from Los Angeles through the Peer 1 ISP to Ashburn on AWS Global Accelerator would have seen a negligible improvement in network latency, from 74.92ms to 74.45ms. However, the latest results show that the same user connecting to AWS us-east-1 will see network latency improve by 15ms, approximately a 20% improvement over the past state. The following table highlights a few locations where we saw improved performance. Note that measurements taken from other locations, or from other ISPs in the same location, may yield different results.
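The percentages above can be checked with a short calculation. This is a minimal sketch: the function name `latency_improvement_pct` is hypothetical, and the March 2020 figure assumes the non-accelerated baseline stayed at 74.92ms, since the text gives only the approximate 15ms reduction.

```python
def latency_improvement_pct(baseline_ms: float, accelerated_ms: float) -> float:
    """Percentage reduction in latency relative to the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline latency must be positive")
    return (baseline_ms - accelerated_ms) / baseline_ms * 100.0


# October 2019, Los Angeles to us-east-1: 74.92ms without and 74.45ms with
# Global Accelerator (both figures from the text above).
oct_2019 = latency_improvement_pct(74.92, 74.45)

# March 2020: the text reports a reduction of about 15ms.
# Assumption: the baseline is unchanged at 74.92ms.
mar_2020 = latency_improvement_pct(74.92, 74.92 - 15.0)

print(f"October 2019 improvement: {oct_2019:.1f}%")
print(f"March 2020 improvement: {mar_2020:.1f}%")
```

Under that assumption, the October 2019 gain works out to well under 1%, while a 15ms reduction is roughly 20% of the baseline, matching the figure quoted above.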
ThousandEyes Path Visualization corroborates this improvement in performance from Los Angeles by showing the optimized network path. In October 2019 (Figure 4), despite the use of AWS Global Accelerator, traffic from Los Angeles entered the AWS backbone in Washington, DC. March 2020 (Figure 5) tells a different story: traffic from Los Angeles is handed off to AWS’ backbone at an Internet Exchange point (CoreSite) earlier in the path and closer to the user. Although AWS was peering with the Peer 1 network in October 2019, sub-optimal routing sent traffic to Washington, DC. By working closely with Peer 1 to route correctly to the LA edge location, AWS was able to alter the network path, resulting in a 20% improvement in network latency.
These improvements highlight a key characteristic of the cloud: there is no steady state. Optimizations and enhancements are continuous. Sometimes they improve performance, as witnessed above, and sometimes they don't. While cloud providers are always optimizing their networks to provide the best customer experience, it is imperative that enterprises relying on the cloud continue to monitor proactively from locations that best represent their customer footprint. And when you have data, you can avoid the common problem of finger-pointing and instead collaborate with cloud providers to resolve issues faster.