On June 25, 2016, all 13 DNS root servers suffered from a major distributed denial-of-service (DDoS) attack. This wasn’t the first time in recent memory that attacks were aimed at critical DNS infrastructure — at the end of last year, several root servers came under a DDoS attack, and in mid-May NS1 also experienced a DDoS attack that brought down services like Yelp and Alexa.
The events of June 25 represented a serious, large-scale attack on critical Internet infrastructure. According to a post mortem written by the root server operators, the attackers flooded all DNS root name server letters with both TCP SYN and ICMP traffic. Each DNS root server received traffic volumes of about 10 million packets per second, which in bandwidth terms is 17 Gb/s.
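As a quick sanity check on those two figures, dividing the reported bandwidth by the reported packet rate gives the implied average packet size. This is a back-of-the-envelope sketch, assuming that "Gb/s" means 10^9 bits per second and that both numbers describe the same traffic; the function and constant names are illustrative only.

```python
def average_packet_size_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Return the implied mean packet size in bytes."""
    return bits_per_second / packets_per_second / 8


# Per-letter figures as reported in the post mortem.
BANDWIDTH_BPS = 17e9    # 17 Gb/s, assuming decimal gigabits
PACKET_RATE_PPS = 10e6  # 10 million packets per second

size = average_packet_size_bytes(BANDWIDTH_BPS, PACKET_RATE_PPS)
print(f"Implied average packet size: {size:.1f} bytes")
```

Under those assumptions the average works out to a little over 200 bytes per packet.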
What is interesting about this incident is that while each letter (each of the root servers is denoted by a letter) received roughly equal amounts of traffic, some root servers emerged virtually unscathed, while others experienced much more serious issues. As we dig into the details of the attack, feel free to follow along at this share link.
DDoS Caused 3 Hours of Performance Issues
The effects of the DDoS attack on June 25 lasted for around three hours, from 2:45pm to 5:50pm Pacific, during which availability averaged across all root servers dipped to around 50%.
Average resolution time also saw elevated levels during the attack, peaking at 13 ms, up from a pre-attack baseline of 3.7 ms.
Throughout the attack, we saw varying levels of stability in the routing layer. The operators of A-Root and other root servers began making route changes involving their own networks in response to the DDoS attack, likely as part of a deliberate plan to re-route traffic through scrubbing centers to mitigate the immense traffic volumes.
However, other root servers like M-Root saw continued route instability, and it’s unclear why this happened and whether it was intentional.
On Anycast and Resilience
Once we start digging into the performance of individual root servers during the DDoS attack, we see that the attack’s effects varied widely from letter to letter and, unsurprisingly, correlated with how heavily anycasted a given root server is.
Anycast is a technology used to advertise a single IP address from multiple physical endpoints, so that traffic is delivered to the nearest endpoint. Anycast helps make networks much more resilient than unicast, where each IP address maps to a single endpoint, in this case a single authoritative nameserver location. In this DDoS attack, traffic originated from sources around the globe, so for anycasted root servers it was distributed across multiple locations. Root servers with high numbers of anycast locations saw traffic diluted across the majority of those locations, and as a result the DDoS attack’s impact was similarly weakened.
We can see this correlation at the height of the attack. The root servers that returned the most errors and were unavailable to the most Cloud Agents were H-Root (with two anycast sites) and B-Root (the only unicast root server, with one location). In contrast, and not coincidentally, the root servers with the fewest errors and availability issues were those with the most anycast locations: J-Root (113 sites, operated by Verisign) and L-Root (154 sites, operated by ICANN).
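To make the dilution effect concrete, here is a rough sketch that spreads the reported per-letter attack rate (about 10 million packets per second) evenly over each letter's sites. The even split is an assumption made purely for illustration: real anycast catchments are uneven and depend on where the attack traffic originates, so these numbers are not measurements.

```python
# Per-letter attack rate from the post mortem; site counts as quoted in the text.
ATTACK_PPS_PER_LETTER = 10_000_000
SITES = {"B-Root": 1, "H-Root": 2, "J-Root": 113, "L-Root": 154}


def per_site_load(total_pps: float, site_count: int) -> float:
    """Idealized load per site if traffic were split evenly (an assumption)."""
    if site_count < 1:
        raise ValueError("site_count must be at least 1")
    return total_pps / site_count


loads = {letter: per_site_load(ATTACK_PPS_PER_LETTER, n) for letter, n in SITES.items()}

# Print the most heavily loaded letters first.
for letter, pps in sorted(loads.items(), key=lambda item: item[1], reverse=True):
    print(f"{letter}: ~{pps:,.0f} packets/s per site")
```

Even in this idealized model, a site belonging to a letter with more than a hundred locations would face roughly one hundredth of the load that a single-location letter has to absorb.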
A multi-metric table with availability, resolution time, and packet loss averaged over the duration of the DDoS attack conveys a similar picture of the relative performance of the root servers. For each metric, the table also shows the difference between the current time period and the previous one (before the DDoS attack began).
To see more detailed comparative visualizations, see our report snapshot from the event. If you’re also interested in learning more about the relative performance of the DNS root servers, check out the guest blog post by Mehmet Akcin on comparing root server performance.
To dig into the details of root server performance during the attack, let’s take H-Root as an example. It saw around 100% average loss during the DDoS attack. We can see all path traces terminating either in the network hosting H-Root, Navy Network Information Center, or at its peering points with upstream ISPs DoD Network Information Center, Qwest and Hurricane Electric. Because H-Root has only two anycast locations, traffic volumes were concentrated enough to overwhelm the destination network and its edges.
In contrast, J-Root saw average loss under 15% for the entirety of the attack. The path visualization below, taken at the peak of packet loss, shows only two nodes with loss over 10%, located at the edges of the destination network in the Netherlands and Russia. We can tell that J-Root is heavily anycasted because path traces from each Cloud Agent travel separately to the destination without converging, and the hops right before the destination are spread across disparate locations, from Amsterdam and India to Madison, WI. With so many geographic sites, very little of J-Root’s network infrastructure was overwhelmed.
As a result of the DDoS attack, a number of our customers saw errors in their DNS Trace tests, which query the root servers for “.” at every test interval without consulting any caches. It served as an important reminder that such a large-scale attack on DNS root servers also has the potential to impact requests to all other domains, in other words, everything under the “.” hierarchy.
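For readers curious what a cache-free query for “.” looks like on the wire, the sketch below builds one by hand using the RFC 1035 message format. It is not how the DNS Trace test is implemented; the function names are hypothetical, the query type (NS) is an assumption, and the packet is only constructed, not sent, so the example runs offline.

```python
import struct

QTYPE_NS = 2   # NS record type (RFC 1035)
QCLASS_IN = 1  # Internet class


def encode_name(name: str) -> bytes:
    """Encode a name as length-prefixed labels; "." becomes a single zero byte."""
    labels = [label for label in name.split(".") if label]
    encoded = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return encoded + b"\x00"


def build_query(name: str, query_id: int, qtype: int = QTYPE_NS) -> bytes:
    """Build a standard DNS query with the recursion-desired (RD) bit cleared."""
    flags = 0x0000  # QR=0 (query), opcode 0, RD=0
    # Header fields: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


packet = build_query(".", query_id=0x1234)
print(len(packet), packet.hex())
```

A probe along these lines would send the packet over UDP port 53 directly to a root server address and treat a timeout or error response as a failure for that interval. Because it goes straight to the root servers rather than through a caching resolver, it registers root server unavailability immediately.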
The DDoS attack on all 13 root servers also illuminated the importance of anycast to the resilience of the critical DNS infrastructure underlying the Internet. Without working root servers and the heavy use of anycast to make their continuous availability possible, the Internet as we know it would cease to function.
To run all of the above analyses and monitor the resilience of your own DNS nameservers, anycast or not, sign up for a free trial of ThousandEyes today.