Guest blog post by Mehmet Akcin, Service Strategist and Architect.
In June 2017, we published an update to this post with more recent measurements — read the most up-to-date post on this topic here.
As someone responsible for operating various DNS services, I have always been interested in performance. One performance metric on which we all focus is: How fast is my service? To be more specific, by “fast” we mean that when a DNS query is made, an answer is returned quickly to the DNS client. We call this query resolution time. We are going to be running several tests in upcoming months that focus on DNS performance. This first analysis is about the performance of DNS root servers.
The Domain Name System (DNS) contains two kinds of agents: resolvers, which ask questions, and servers, which answer them. While resolvers can ask about any name, name servers are typically configured to answer only for the domain names for which they are responsible. These domain names are configured on the server as DNS “zones.” In this model, every resolver is immediately faced with a problem: when a resolver needs to ask a question about a name, such as "www.example.com", which name server can provide the answer? Which name servers are authoritative for the zone "example.com"? This information is found in the ".com" zone, so first the resolver must ask the ".com" name servers a question. But which name servers are authoritative for the ".com" zone? That information is in a distinguished zone, called the root zone of the DNS, which is served by the root servers.
DNS resolvers are configured with a list that provides the IP addresses for all the root servers. When it needs to resolve a DNS name, the resolver first asks a root server, and the response directs the resolver to the name servers for the appropriate top-level domain (such as .com), and so on down the hierarchy. Resolvers cache the answers they receive, so subsequent queries for names with a common top-level domain label can be served from the local resolver’s cache without re-querying the root servers each time. However, queries to root servers are common enough that the overall resolution time offered by a DNS resolver depends to an important extent on how far away the resolver lies from the root servers.
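As a rough illustration of the referral-and-cache behavior described above, here is a minimal Python sketch. It does not talk to real DNS servers: the `SERVERS` table, the server labels (`root`, `com-ns`, `example-ns`), the addresses, and the `ToyResolver` class are all hypothetical stand-ins invented for this example.

```python
# Hypothetical, hard-coded delegation data standing in for real name servers.
SERVERS = {
    "root": {"referrals": {"com": "com-ns"}, "answers": {}},
    "com-ns": {"referrals": {"example.com": "example-ns"}, "answers": {}},
    "example-ns": {
        "referrals": {},
        "answers": {
            "www.example.com": "192.0.2.10",
            "mail.example.com": "192.0.2.25",
        },
    },
}


class ToyResolver:
    """Follows referrals from the root and caches which server handles each zone."""

    def __init__(self):
        self.referral_cache = {}  # zone name -> server learned from a referral
        self.queries_sent = []    # log of (server, name) pairs, for inspection

    def _closest_known_server(self, name):
        labels = name.split(".")
        # Prefer the longest cached zone: "example.com" before "com".
        for i in range(len(labels)):
            zone = ".".join(labels[i:])
            if zone in self.referral_cache:
                return self.referral_cache[zone]
        return "root"  # nothing cached yet, so start at the root

    def resolve(self, name):
        server = self._closest_known_server(name)
        while True:
            self.queries_sent.append((server, name))
            data = SERVERS[server]
            if name in data["answers"]:
                return data["answers"][name]
            # No answer here: look for a referral to a more specific zone.
            for zone, target in data["referrals"].items():
                if name == zone or name.endswith("." + zone):
                    self.referral_cache[zone] = target
                    server = target
                    break
            else:
                raise LookupError(f"no answer or referral for {name}")


resolver = ToyResolver()
resolver.resolve("www.example.com")   # walks root -> com-ns -> example-ns
resolver.resolve("mail.example.com")  # served via the cached example.com referral
root_queries = sum(1 for server, _ in resolver.queries_sent if server == "root")
```

In this toy model the second lookup never touches the root, which is the effect of caching. A real resolver also honors record TTLs, so cached referrals expire and the root is consulted again.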
The root servers are a set of thirteen service points that each have a current copy of the DNS root zone. These root servers are operated by twelve different governments, non-profit organizations, and commercial entities.
Each of these thirteen root servers is actually composed of clusters of physical and virtual servers that are typically geographically distributed in order to decrease latency, improve resiliency, and decentralize infrastructure among countries. This geographic distribution is achieved by using anycast, a routing technique in which several destinations share the same IP address and packets are delivered to the nearest one. In the case of the root name servers, each root (except B-root) has a collection of servers with identical IP addresses, and anycast addressing will route DNS queries to the nearest physical server. As of February 2015, there were a total of 446 anycast instances of root name servers.
As previously mentioned, DNS resolvers are typically bootstrapped with a list of root server IP addresses, contained in the “root hints” file. A resolver will select one of the root servers from the root hints file (named.root in BIND) at random when first started. In general a resolver can select any root server, with the exception that E-root and G-root are not IPv6 enabled. Over time, DNS resolvers will prefer the root server(s) with the fastest resolution time.
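The selection behavior described above can be sketched in a few lines of Python. This is a simplified illustration only: the `RootSelector` class, the smoothing factor `alpha`, and the latency figures are assumptions made for this example, and real resolver implementations use their own, more elaborate server-selection algorithms.

```python
import random


class RootSelector:
    """Picks untried servers at random, then prefers the lowest smoothed RTT."""

    def __init__(self, servers, alpha=0.3, seed=None):
        self.srtt = {server: None for server in servers}  # None = not yet measured
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        self.rng = random.Random(seed)

    def pick(self):
        untried = [s for s, rtt in self.srtt.items() if rtt is None]
        if untried:
            return self.rng.choice(untried)
        return min(self.srtt, key=self.srtt.get)

    def record(self, server, rtt_ms):
        previous = self.srtt[server]
        if previous is None:
            self.srtt[server] = float(rtt_ms)
        else:
            # Exponentially weighted moving average of observed round-trip times.
            self.srtt[server] = (1 - self.alpha) * previous + self.alpha * rtt_ms


# Made-up latencies, in milliseconds, for three of the thirteen roots.
selector = RootSelector(["a", "k", "m"], seed=1)
for server, rtt in [("a", 80), ("k", 20), ("m", 150)]:
    selector.record(server, rtt)
preferred = selector.pick()  # "k", the lowest smoothed RTT so far
```

One limitation of this sketch is that it never re-probes a server once it has fallen out of favor, so it would not notice if a slower root later became faster.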
Measuring Root Server Performance
I put together a small research project to understand root server performance. ThousandEyes nodes distributed globally were used to trace network paths to the anycast sites. More than 3100 DNS resolvers, referred to as vantage points, in nearly 70 countries and 800 Autonomous Systems were used to measure root server availability and latency globally and by country. These DNS resolvers are a small subset of the 20+ million open DNS resolvers on the Internet, selected to ensure geographic and network-level diversity, stability and data quality.
Availability of the Root Server System
To understand the overall availability of the root servers, I did a country-by-country comparison. We tested the presence of the root servers in the DNS resolver cache, similar to a dig +trace for the “.” SOA record. In general, we would expect close to 100% availability from the major DNS resolvers that we’re sampling. The map below shows root server availability by country over a two-week period (more than 300 samples per vantage point). There were several intermittent reachability issues, which is normal, but availability was consistently above 95%. A few networks performed persistently worse, returning ‘no mapping’ and ‘servfail’ errors from several vantage points.
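For readers who want to run a similar aggregation on their own measurements, the per-country availability figure can be computed along these lines. This is a sketch under stated assumptions: the `(country, succeeded)` sample format and the function name `availability_by_country` are invented for illustration, and the sample values are made up rather than taken from this research.

```python
from collections import defaultdict


def availability_by_country(samples):
    """Return {country: percentage of samples that succeeded}.

    `samples` is an iterable of (country_code, succeeded) pairs, one per test
    run from a vantage point. This input format is an assumption.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for country, succeeded in samples:
        totals[country] += 1
        if succeeded:
            successes[country] += 1
    return {c: 100.0 * successes[c] / totals[c] for c in totals}


# Made-up samples: three of four tests succeed in one country, all in another.
example_samples = [
    ("TR", True), ("TR", True), ("TR", False), ("TR", True),
    ("US", True), ("US", True),
]
availability = availability_by_country(example_samples)
```

Errors such as ‘servfail’ would simply be recorded as failed samples in this scheme.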
Measuring Latency Across Countries
Next I wanted to understand root server latency across countries. To accomplish this, first I needed to understand which root server has the lowest latency in each country. For places like the United States, with anycast sites of every root server, this is less important; but for countries like Vietnam and Turkey, with few or no root server instances, latency is closely correlated with which root server has the fastest response and how far away that root server is. In Figure 5, you can see the fastest root server per country, as defined by mean latency over a two-week period. You can see some of the regional focus of certain root servers, with C-Root in North America, J-Root in Eastern Europe, K-Root in Central Europe, L-Root in South America and M-Root in Japan.
With information on which root server is fastest in each country, I now wanted to understand what latency that represented. Global average latency to the root servers, as measured by the root server that answers fastest, is around 70ms from the parts of the world we measured in this research. To break this out by country, I took the mean latency to the fastest root server in each country over the course of two weeks. Figure 6 plots the mean latency of the fastest root server in each country.
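The per-country calculation described here (mean latency per root server, then the lowest mean in each country) can be sketched as follows. The `(country, root, latency_ms)` record format and the function name are assumptions for illustration, and the numbers are invented rather than drawn from the measurements in this post.

```python
from collections import defaultdict
from statistics import mean


def fastest_root_by_country(measurements):
    """Return {country: (root_letter, mean_latency_ms)} for the fastest root.

    `measurements` is an iterable of (country, root_letter, latency_ms) tuples.
    This input format is an assumption made for this sketch.
    """
    grouped = defaultdict(list)
    for country, root, latency_ms in measurements:
        grouped[(country, root)].append(latency_ms)

    # Mean latency for every (country, root) pair.
    means = defaultdict(dict)
    for (country, root), values in grouped.items():
        means[country][root] = mean(values)

    # Within each country, keep the root with the lowest mean latency.
    return {
        country: min(roots.items(), key=lambda item: item[1])
        for country, roots in means.items()
    }


# Made-up latency samples in milliseconds.
example = [
    ("JP", "M", 10), ("JP", "M", 14), ("JP", "K", 120),
    ("TR", "K", 60), ("TR", "K", 70), ("TR", "L", 90),
]
fastest = fastest_root_by_country(example)
```

Using the mean makes the result sensitive to occasional slow responses. A median or percentile could be swapped in, although that would be a different metric from the one used in this post.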
Let’s take Turkey, a country with high latency, as an example. From Istanbul, root server requests tend to be served from Frankfurt rather than from inside Turkey. This is likely due to peering policies within these networks preferring routes outside of the country.
Latency by Networks
In addition to country-level comparisons, the data set allowed me to also look at root server latency from hundreds of Autonomous Systems. This data tends to confirm the country-level analysis, with Bharti Airtel in India finding I-Root to be the fastest root server, yet also showing diversity across networks: in China the major providers have lowest latency to different root servers.
Root server performance is crucial for the overall resolution time offered by a DNS resolver. I hope that you find this research useful in optimizing your own DNS infrastructure. If you have any questions regarding the details of the research, you can contact me at mehmet [at] akcin.net or @mhmtkcn.
PS: Many thanks to Geoff Huston and Greg Lindsay for their help writing this post and of course all of the ThousandEyes team.