Reliability matters. Reliability of the Internet matters, but the Internet wasn’t built for how we use it today. Our global user bases, our remote workers, the backbone of our SD-WAN migration: what do these all have in common? The Internet. And we are relying on it more than ever. But have our tools changed? Do we have the requisite visibility, the ability to gather the necessary data and drill down to identify root cause at Internet scale?
For many, the Internet has become a “black box” that is too complex to manage, too big to see into and too vast to monitor. As the humble vision of a network of networks grew into the Internet we know, the balance of visibility and actionability has slowly shifted away from the individual participants. Get back what you’ve lost with Internet Insights™.
Beyond Crowdsourced Sentiment
An authoritative source of holistic Internet measurement data has been missing in action. Backed by real data from the ThousandEyes collective intelligence dataset, Internet Insights™ identifies network outages at Internet scale.
Real data changes everything. You no longer have to hope that enough people complain about the issue you are experiencing on Downdetector or Twitter. Nor do you have to worry that, in the face of so many tweets and reports, the root cause remains elusive. Save time identifying root cause and drive provider escalation and customer notification with ground truth data. Avoid the pain of choosing bad providers. Quickly answer the question: “Is it just me, or am I affected by an Internet outage?”
Real-time and Based on Real Data
Internet Insights™ aims to answer that question and more. First, Internet Insights™ identifies common points where service provider networks terminate. Then, when traffic from a ThousandEyes test stops as it crosses publicly routed provider networks, Internet Insights™ identifies whether that test is part of a greater outage, what the impact on destination visibility is and what other network paths were impacted.
An end-to-end test can tell you whether its target is up or down, as well as what path its traffic took across the network. However, when its traffic stops, it can’t tell you if it’s due to an outage or what other traffic was impacted, but Internet Insights™ can. Provider network outages impact the visibility of your service or application to your customers and employees regardless of actual availability.
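To make the idea concrete, here is a minimal Python sketch of that correlation step. It is not the Internet Insights™ implementation; the `PathResult` record, its field names and the `min_tests` threshold are all hypothetical. It groups tests whose traffic stopped by the provider interface where they stopped, and treats an interface shared by several failing tests as a likely outage rather than an isolated failure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PathResult:
    """One test's path outcome (hypothetical record, for illustration only)."""
    test_id: str
    agent: str                # where the test ran from
    terminal_asn: int         # ASN of the last hop that responded
    terminal_interface: str   # interface of that last hop
    reached_target: bool


def find_outages(results, min_tests=2):
    """Group failed tests by the interface where their traffic stopped.

    An interface where at least `min_tests` distinct tests terminate is
    reported as a likely outage (assumed threshold, not a product setting).
    """
    by_interface = defaultdict(set)
    for r in results:
        if not r.reached_target:
            by_interface[(r.terminal_asn, r.terminal_interface)].add(r.test_id)
    return {
        key: sorted(test_ids)
        for key, test_ids in by_interface.items()
        if len(test_ids) >= min_tests
    }
```

A single failing test that terminates at an interface stays below the threshold, which is the “is it just me?” case. Several tests from different agents terminating at the same interface point to the provider instead.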
It’s Not Just You, It’s the Internet
You might recall that this past June a massive BGP route leak caused a widespread Internet outage. For hours after the initial impact, users felt ripples of the outage across the Internet. Fingers were pointed, incorrectly, at application providers like Discord and platforms like Cloudflare. It didn’t help that, throughout the outage, there were few if any updates from Verizon, the service provider that turned out to be the chief propagator of the leaked BGP routes causing the outage.
For most teams, the response to the June 24th outage would have been a massive fire drill. Service desks fielded complaints that their application was unreachable from a customer’s location. Many response teams were mobilized to troubleshoot an issue they could not resolve. Their tests might have said their application was up, or that a platform like Cloudflare was down: both conclusions would have been wrong. Their monitoring wasn’t broad enough to answer the question posed above, “Is it just me?”
Internet Insights™ identified the outage, its locus and the radius of affected networks in real time. The Outage Timeline shows all outages identified and enables quick scrubbing to the time of a customer-reported issue. Robust filtering by provider, location and affected tests allows for easy event correlation and noise reduction. As shown in Figure 1, the beginnings of the outage appear on the timeline at 10:35 UTC.
The Topology View (Figure 2) helps you cut through the data and shows the root cause network front and center. Network outages can also be viewed on a map and as detailed metadata. We’ve kept Internet Insights™ workflows simple. Internet Insights™ highlights the epicenter of the outage: “MCI Communications Services, Inc. d/b/a Verizon Business,” AS 701. On the left side are agent locations that are the source of the affected test traffic. On the right side are the networks associated with the destination interfaces of the affected test traffic: services and applications, for example. In the middle, you can see the service provider networks traversed as well as the affected interfaces of the terminal network hop.
At the end of the day, the cause of this outage was Verizon forwarding routes from a dual-homed enterprise customer network, due to an unintended leak from a Noction BGP Optimizer: routes that should never have been forwarded to the Internet. In this case, it wasn’t just Cloudflare. Google, Amazon, Facebook and hundreds of other networks were affected. Cloudflare, although widely reported as a cause, was just collateral damage.
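A route leak of this kind breaks the usual export convention between networks: routes learned from a provider or a peer should be announced only to customers. The Python sketch below encodes that rule. The `Rel` enum and the `is_route_leak` function are illustrative names, not part of any ThousandEyes or router API, and real leak detection needs relationship data that is rarely this clean.

```python
from enum import Enum


class Rel(Enum):
    """Business relationship of a neighbor, seen from the exporting network."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def is_route_leak(learned_from: Rel, exported_to: Rel) -> bool:
    """Return True if re-announcing the route violates the export convention.

    Routes learned from a customer may be announced to anyone. Routes learned
    from a peer or provider may be announced only to customers.
    """
    return learned_from is not Rel.CUSTOMER and exported_to is not Rel.CUSTOMER
```

Under this rule, a dual-homed customer re-announcing one upstream’s routes to its other upstream is the provider-to-provider case, and `is_route_leak(Rel.PROVIDER, Rel.PROVIDER)` returns `True`.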
Making Outages Actionable
Outage Alerts: Take action when network outages occur, even where you don’t have the infrastructure needed to install agents or run tests. Outage Alerts can be tightly focused, filtering outage data by severity, location, affected agent tests and specific providers, among other conditions. Outage Alerts support email notification and our other alert integrations, such as Slack. Use them to drive outreach to key customers or automatically drive an improvement conversation with a service provider.
Outage Snapshots: At the core of the ThousandEyes product, snapshot sharing lets you transparently share your view of an issue, and Internet Insights™ is no different. Outage Snapshots let you capture and share the Internet Insights™ view as well as any of your own affected test data within one easy-to-share link.
Outage History: With a full year of outage data, analyze past outages and correlate them by provider to see each provider’s performance over time. See if an outage is a one-off or part of a pattern. Should you change providers or purchase a new circuit? Ground truth data from Internet Insights™ can help.
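As a rough illustration of the filtering and per-provider correlation described above, the following Python sketch works over a list of outage records. The `Outage` record, its fields and both function names are hypothetical and do not reflect the Internet Insights™ data model or API.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Outage:
    """A past outage (hypothetical record, for illustration only)."""
    provider: str
    location: str
    day: date
    affected_tests: int   # how many of your own tests were affected


def matches(outage, providers=None, locations=None, min_affected_tests=1):
    """Alert-style conditions: a condition left as None is not applied."""
    if providers is not None and outage.provider not in providers:
        return False
    if locations is not None and outage.location not in locations:
        return False
    return outage.affected_tests >= min_affected_tests


def outages_per_provider(history, **conditions):
    """Count matching outages per provider to separate one-offs from patterns."""
    return Counter(o.provider for o in history if matches(o, **conditions))
```

Calling `outages_per_provider(history, locations={"London"})` on a year of records would show which providers had repeated outages at that location, which is the kind of evidence a provider conversation or a circuit decision needs.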
Global Internet Health Dashboard
The Internet Insights™ Overview (Figure 3) is a global Internet “weather map” of provider network outages, both recent and in-progress. At a glance, your NOC, your service desk and other teams can respond quickly to customers and employees who may be experiencing the impact of an Internet outage, or work proactively to escalate the issue to service providers. You can easily click through from the Overview to Timeline or Topology views to filter based on the customer’s service provider or drill down to the root cause network.
What’s Next for Internet Insights™?
We’re just at the beginning of the journey for Internet Insights™. The ThousandEyes collective intelligence dataset holds the key to answering even more of your questions about availability, performance, global routing changes, hijacks and leaks.
We believe in a proactive approach: not just monitoring the Internet, but giving you a platform to manage your digital experience at Internet scale. We believe that to drive real improvements to a collective network, like the Internet, you must take a collective approach with real data and transparency.
The Internet Is Your Network
It’s not enough to blame the Internet anymore. Your customers depend on you to deliver their digital experiences, and when they encounter outages, they turn to you. When you run your applications on the Internet, you need to see the Internet like you own it.
With ground truth data, you can quickly resolve outage issues from your point of view, drive escalations or help your customers with theirs, and provide proactive updates to impacted customers, all before the provider updates their status page. Bottom line: you can save time and money on unnecessary troubleshooting and unneeded SEV1 responses.
To adequately respond to an issue like the one we saw on June 24th, you need to see into the gaps in your monitoring: the areas not covered by your own tests. It’s impractical and cost-prohibitive to monitor the Internet alone, but we can do it together. Get back the leverage lost in the black-box Internet: measure, influence and make change. We hope you will join our effort. Together we can improve the Internet we all rely on.
Internet Insights™ is available now. Customers are already using it to solve their most difficult Internet challenges. We want to hear how you would use Internet Insights™ and discuss how we can help.