In past blogs, we’ve explored how you can alert for and diagnose BGP route leaks and hijacks. Route leaks and hijacks — routing events where illegitimate prefixes are wrongly propagated through the Internet — are notoriously difficult to troubleshoot and have the potential to bring down entire swaths of the Internet. While much has been said about how network operators can detect and identify these events, there are few resources on how you can respond to leaks and hijacks to reduce the associated performance and security impacts as much as possible.
In this post, we’ll discuss many of the common methods and best practices that can help mitigate route leaks and hijacks affecting your prefixes. We’ll also discuss longer-term solutions to help prevent the propagation of bad routes, ultimately making the Internet a more secure place for everyone.
Mitigating Route Leaks Affecting Your Prefixes
In the face of an unfolding route leak or hijack, you’ll need to work quickly to resolve the issue before it significantly impacts application delivery and the security of your network. Because BGP on its own is founded on trust and is thus insecure, it can be extremely difficult to quickly resolve a route leak affecting your network, as you’ll need to convince other networks to choose the legitimate route over the incorrect one. While you won’t have complete control in a route leak situation, you do have some options to combat ongoing route leaks. We’ll discuss these from roughly the fastest to slowest time to resolution.
Contact upstream ISPs
The BGP Route Visualization, together with the use of a Private BGP Monitor in your own AS, will give you a bird’s eye view of the routes both into and out of your network, to and from the services and users you care about. With complete BGP-layer visibility, you can rapidly identify the upstream ISPs most likely to have propagated the bad routes during a route leak.
For example, in the major route leak affecting Amazon and many AWS services in June 2015, by isolating the newly appeared ASes common to all of the route monitors observing the route leak, we very quickly saw that traffic had begun to be routed through new networks, Hibernia and AxcelX. AxcelX had leaked Amazon’s prefixes, which Hibernia then accepted and propagated to other ISPs. Feel free to interact with the data at this share link.
In this case, the clear first step for affected services and networks would have been to contact the upstream ISPs that accepted the bad routes (Vocus, Obit, Econet and ClaraNET) and ask them to reject the bogus routes and restore service. Depending on the cause of the route leak, which is often the result of an error, it may also be effective to reach out to the originators and propagators of the illegitimate routes (in this case, AxcelX and Hibernia) and ask them to withdraw those routes.
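The isolation step described above (finding the newly appeared ASes common to every monitor that observes the leak) can be sketched in a few lines. This is a minimal illustration, not how any particular monitoring product works: the function name, the monitor names and the ASNs are all hypothetical placeholders, and AS paths are assumed to be available as simple lists of ASNs per monitor.

```python
def newly_appeared_common_ases(before, during):
    """Return ASNs present in every monitor's path during the event
    but absent from every baseline path.

    `before` and `during` map a monitor name to its AS path (list of ASNs).
    """
    # Every ASN seen by any monitor before the event counts as "known".
    baseline = set()
    for path in before.values():
        baseline.update(path)

    # Intersect the sets of new ASNs across all monitors seeing the event.
    common = None
    for path in during.values():
        new_ases = set(path) - baseline
        common = new_ases if common is None else common & new_ases
    return common or set()


# Hypothetical data: placeholder ASNs, origin AS is 64496.
before = {
    "monitor-a": [64500, 64510, 64496],
    "monitor-b": [64501, 64511, 64496],
}
during = {
    "monitor-a": [64500, 64520, 64521, 64496],
    "monitor-b": [64501, 64520, 64521, 64496],
}
suspects = newly_appeared_common_ases(before, during)
print(sorted(suspects))
```

The ASNs that come back are the candidates to investigate first; the networks immediately upstream of them in each path are the ones to contact about rejecting the routes.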
Announce preferred routes
If it is ineffective or inefficient to reach out to the originators and propagators of a route leak, an affected service can consider countering the illegitimate routes by announcing routes that other networks will prefer over the leaked route.
Because routers forward traffic along the most specific (longest) matching prefix, an effective way to combat route leaks affecting your prefixes is to announce prefixes more specific than those leaked. This is generally only feasible when the leaked prefix is shorter than a /24 (that is, it covers more address space), as prefixes longer than a /24 generally won’t be accepted and propagated by other networks. In the case of the AxcelX route leak, a /17 prefix was leaked, which would have left room for Amazon to announce countering, more specific prefixes of /18 or longer.
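To make the prefix arithmetic concrete, here is a small sketch using Python's standard `ipaddress` module. The function name, the /24 cutoff constant and the example prefixes are assumptions for illustration; `10.20.0.0/17` is a private-range placeholder, not the prefix from the incident.

```python
import ipaddress

# Assumption: IPv4 prefixes longer than /24 are commonly filtered by other networks.
MAX_PROPAGATED_LEN = 24


def countering_prefixes(leaked_prefix, extra_bits=1):
    """Split a leaked prefix into more specific prefixes that could be announced.

    Returns an empty list when the result would be longer than /24 and
    therefore unlikely to propagate.
    """
    leaked = ipaddress.ip_network(leaked_prefix)
    new_len = leaked.prefixlen + extra_bits
    if new_len > MAX_PROPAGATED_LEN:
        return []
    return list(leaked.subnets(new_prefix=new_len))


for net in countering_prefixes("10.20.0.0/17"):
    print(net)

# A leaked /24 leaves no room for a more specific announcement.
print(countering_prefixes("198.51.100.0/24"))
```

Note that splitting a /17 yields two /18s, and both would need to be announced to cover the entire leaked range.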
If announcing a more specific, covered prefix is not possible, you can also try shortening your routes where possible, including removing any AS path prepending from your routes. Because the preference for shorter AS paths is not as strong as the preference for more specific prefixes, this method will generally be less effective.
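The relative strength of these two levers can be illustrated with a deliberately simplified model of route selection. It assumes only two criteria, prefix length first and AS path length second; real BGP implementations evaluate other attributes (such as local preference) before AS path length, which is one more reason path shortening is the weaker tool. All names, prefixes and ASNs below are hypothetical.

```python
import ipaddress


def select_route(destination, routes):
    """Pick the route a simplified router would use for `destination`.

    `routes` is a list of (prefix, as_path) tuples. The longest matching
    prefix wins; AS path length only breaks ties between equal-length prefixes.
    """
    addr = ipaddress.ip_address(destination)
    candidates = []
    for prefix, as_path in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            candidates.append((net, as_path))
    if not candidates:
        return None
    best = max(candidates, key=lambda r: (r[0].prefixlen, -len(r[1])))
    return str(best[0]), best[1]


leaked = ("10.20.0.0/17", [64501, 64520, 64496])
prepended = ("10.20.0.0/17", [64500, 64496, 64496, 64496])
unprepended = ("10.20.0.0/17", [64500, 64496])
more_specific = ("10.20.0.0/18", [64500, 64496, 64496, 64496])

# With prepending in place, the shorter leaked path wins.
print(select_route("10.20.1.1", [prepended, leaked]))
# Removing the prepending makes the legitimate path the shorter one.
print(select_route("10.20.1.1", [unprepended, leaked]))
# A more specific prefix wins regardless of AS path length.
print(select_route("10.20.1.1", [more_specific, leaked]))
```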
Change prefixes with DNS
As a last resort, you can consider moving your services to different prefixes entirely by changing your DNS records. This is feasible only if traffic can be shifted to other locations during the route leak, like alternate data centers or a CDN. This method may need a significant amount of time to take effect, because resolvers can keep serving the cached records until the TTL set on your original DNS records expires.
Publish ROAs
Finally, as a preventative measure to guard against future leaks or hijacks of your prefixes, make sure to publish Route Origin Authorizations (ROAs) with the regional Internet registries (RIRs). ROAs are signed records which state that a given origin autonomous system (AS) is authorized to announce its associated prefixes, as well as the maximum prefix length that the AS can announce. Publishing these records ensures that networks using the Resource Public Key Infrastructure (RPKI) to check prefixes against ROAs are able to validate the origin AS and verify that your routes are legitimate.
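As a rough sketch of how a validating network might use these records, the function below classifies an announcement against a list of ROAs, following the general valid / invalid / not-found scheme of RPKI origin validation. It omits everything cryptographic (fetching and verifying the signed objects), and the ROA tuple layout, function name, prefixes and ASNs are assumptions made for illustration.

```python
import ipaddress


def validate_origin(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'.

    `roas` is a list of (roa_prefix, max_length, authorized_asn) tuples.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, authorized_asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # Skip ROAs of the other IP version or ones that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if authorized_asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: treat as invalid.
    return "invalid" if covered else "not-found"


# Hypothetical ROA: AS 64496 may announce 10.20.0.0/17 down to /18.
roas = [("10.20.0.0/17", 18, 64496)]

print(validate_origin("10.20.0.0/17", 64496, roas))  # legitimate origin
print(validate_origin("10.20.0.0/17", 64520, roas))  # wrong origin AS
print(validate_origin("10.20.0.0/24", 64496, roas))  # longer than max length
print(validate_origin("192.0.2.0/24", 64496, roas))  # no covering ROA
```

The maximum length matters when planning mitigation: if you may need to announce more specific prefixes during a leak, the ROA has to permit them, or validating networks will treat your own countering announcements as invalid.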
Due to the implicit trust built into BGP, there are admittedly few sure-fire solutions for combating ongoing route leaks affecting your prefixes. However, the best practices outlined above will go a long way toward lessening any performance impacts.
Preventing the Propagation of Bad Routes
While mitigation strategies are crucial to putting out the ongoing fires of unfolding leaks and hijacks, they are ultimately band-aids for the deeper issue that BGP is inherently insecure. Solving the problem of BGP insecurity to prevent future route leaks and hijacks requires more widespread coordination in the Internet community to adopt route filtering best practices, deploy BGP security standards and block malicious hijacking activity throughout the Internet.
Each network and its operators are instrumental in fortifying the security of the Internet, and here we’ll discuss how your network can be a good Internet citizen.
Route Filtering
Because BGP relies on routes being shared across networks, network operators must be vigilant in accepting only legitimate routes into their routing tables and propagating only those routes to other autonomous systems (ASes). The first line of defense in preventing leaks is filtering out illegitimate route advertisements, whether accidental or intentionally malicious. Route filtering can be based on the prefix, AS path or community within the route advertisement.
The key to proper route filtering is to build a set of robust heuristics into your network’s filtering rules. Some of these rules should likely include:
- Filter out Bogon prefixes and routes with Bogon ASNs anywhere in the AS path. Bogon prefixes and ASes are located in reserved or unallocated IP space; these should never be advertised.
- Filter out routes with more than two Tier 1 (“transit-free”) networks in the AS path. When there are three or more Tier 1 networks in the path (like the one pictured in Figure 2), at least one of the networks is providing transit to another. This is usually a mistake.
- If you don’t sell transit to large networks (like Tier 1 networks), filter out routes from customers that contain a large network (that you wouldn’t sell transit to) in the AS path. To go a step further, also keep a whitelist of prefixes that each of your customers may announce to you.
- Use peer locking, as described by Job Snijders at NANOG67. Email your peers and ask who all of their possible upstream networks are, and only allow those upstreams to be intermediate networks between you and your peers.
- Use BGP Maximum-Prefix to set the maximum number of prefixes that can be announced from your peers. This acts as a circuit breaker in route leak situations where many prefixes are announced in a short period of time.
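A few of the heuristics above can be sketched as plain functions to show the logic involved. This is an illustration only, not router configuration: the bogon ASN ranges cover commonly filtered reserved, documentation and private-use ranges but are not exhaustive, the Tier 1 set is made up of arbitrary placeholder ASNs that do not refer to real networks, and the class and function names are hypothetical. Bogon prefix checks, customer prefix allowlists and peer locking are not shown.

```python
def is_bogon_asn(asn):
    """Return True for reserved, documentation or private-use ASNs (not exhaustive)."""
    return (
        asn == 0                     # reserved
        or asn == 23456              # AS_TRANS
        or 64496 <= asn <= 65551     # documentation, private use, reserved
        or asn >= 4200000000         # 32-bit private use and reserved
    )


def filter_reason(as_path, tier1_asns, max_tier1=2):
    """Return why a route should be rejected, or None if it passes."""
    if any(is_bogon_asn(asn) for asn in as_path):
        return "bogon ASN in path"
    # Use a set so that prepending (a repeated ASN) is only counted once.
    tier1_seen = {asn for asn in as_path if asn in tier1_asns}
    if len(tier1_seen) > max_tier1:
        return "too many Tier 1 networks in path"
    return None


class MaxPrefixGuard:
    """Circuit breaker: stop accepting routes once a peer exceeds its limit."""

    def __init__(self, limit):
        self.limit = limit
        self.prefixes = set()
        self.tripped = False

    def accept(self, prefix):
        if self.tripped:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.limit:
            self.tripped = True  # a real router would typically tear down the session
            return False
        return True


TIER1 = {1001, 1002, 1003}  # placeholder ASNs, not real Tier 1 networks

print(filter_reason([2001, 1001, 1002, 3001], TIER1))
print(filter_reason([2001, 1001, 1002, 1003, 3001], TIER1))
print(filter_reason([2001, 64512, 3001], TIER1))

guard = MaxPrefixGuard(limit=2)
print([guard.accept(p) for p in ["10.1.0.0/16", "10.2.0.0/16", "10.3.0.0/16", "10.4.0.0/16"]])
```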
As always, use a route monitoring solution to ensure that you’re properly filtering routes and that traffic is being correctly routed to your network.
BGP Security Standards
There are also a number of security standards used to authorize part or all of the AS path in route advertisements, including RPKI, RPSL and BGPSEC.
RPKI only validates the origin AS. It relies on Route Origin Authorizations (ROAs) published through the Regional Internet Registries (RIRs) to confirm that an AS is actually authorized to originate the prefixes it’s announcing.
Make sure to publish ROAs for your own prefixes so that they can be verified by others who have deployed RPKI. Other security standards exist as well: RPSL lets networks publish their intended routing policies in Internet Routing Registries (IRRs), so that others can check announcements against them, while BGPSEC aims to validate the entire AS path in route advertisements.
Unfortunately, adoption of these standards is low, and much wider deployment across the Internet community is needed to effectively prevent large-scale events like route leaks. While the benefits of using security standards are not immediate, don’t let this deter you: the Internet needs networks to begin the hard work of deployment to ensure a more secure future.
Block Malicious Hijacks
While route filtering and origin validation like RPKI can do a lot to prevent accidental route leaks, they are less effective at combating the activities of clever, malicious hijackers. Preventing intentionally malicious events requires an additional set of security mechanisms. One technique is TCP MD5, which uses a shared secret key to compute an MD5 digest over each TCP segment of the BGP session, covering both headers and data. This lets each peer verify that the BGP messages it receives come from the configured neighbor and haven’t been tampered with. Another technique is the Generalized TTL Security Mechanism (GTSM), where your peer sends its BGP packets with the time to live (TTL) set to the maximum of 255, and your router discards any packet that arrives with a lower TTL than expected. Because every router along the way decrements the TTL, attackers more than one hop away from your network can’t deliver a packet that still carries a TTL of 255. As a result, remote attackers are blocked from spoofing BGP session packets and impersonating your peers.
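The GTSM check itself is simple enough to sketch. The function and parameter names here are hypothetical, and `forwarding_hops_allowed` is an assumed way of expressing how many routers may sit between you and a legitimate peer (0 for a directly connected peer); actual router configuration options may count hops differently.

```python
MAX_TTL = 255  # the largest value the 8-bit IP TTL field can hold


def gtsm_accept(received_ttl, forwarding_hops_allowed=0):
    """Accept a packet only if it cannot have crossed more routers than allowed.

    The sender is assumed to transmit with TTL 255, and each router that
    forwards the packet decrements the TTL by one.
    """
    return received_ttl >= MAX_TTL - forwarding_hops_allowed


# Directly connected peer: the packet arrives with the TTL untouched.
print(gtsm_accept(255))
# An attacker three routers away cannot do better than 252, even sending 255.
print(gtsm_accept(252))
# A legitimate peer one router away, with the check relaxed accordingly.
print(gtsm_accept(254, forwarding_hops_allowed=1))
```

The check works because an attacker cannot start above 255: whatever TTL a remote sender chooses, the packet arrives lower than a local peer's would.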
In sum, this overview of preventative techniques isn’t comprehensive, but it will provide a good start for protecting your network and prefixes against both accidental route leaks and malicious hijacks. Because BGP was never meant to be used to knit together the Internet on such a large scale, the Internet community will need to work together to push BGP toward a more secure and reliable future. Combined with the strategies to combat unfolding leaks and hijacks affecting your prefixes, these techniques will leave networks well equipped to weather severe routing events, both local and global.