The Internet is an increasingly vast and constantly changing environment, and one we rely on more than ever to get work done. It’s now a business imperative to be aware of Internet and application outages and how they impact the experiences of customers and workers. Two years ago, we introduced Internet Insights™ Network Outages—the first ever collectively powered global view of Internet health—helping our customers reduce troubleshooting time for complex provider issues from hours to minutes, manage the impact global network conditions are having on the availability of their services, and understand the ability of their workforce to perform.
Today, we are thrilled to unveil the latest addition to the Internet Insights product: Application Outages, which gives IT operations instant insight into SaaS application availability on a global scale. With no tests to set up and no deployment or instrumentation required, IT teams get both near real-time and historic views into the availability of more than 75 of the most important SaaS applications employees rely on.
Now, customers can quickly correlate user-specific issues to broader application issues to proactively alert their workforce that an app is unavailable, while also escalating the issue to the SaaS provider—often before the SaaS provider acknowledges the issue publicly. And, by understanding application provider availability over time, customers are equipped to make more informed vendor decisions and better manage providers.
Watch this on-demand demo to see how Internet Insights: Application Outages provides near real-time insight into outages that impact the apps that power your workforce.
But do provider outages happen often enough that IT teams need this kind of global, always-on visibility? They happen more often than you’d think. In 2019, Internet Insights detected 20 network outages per day (on a 30-day average). By 2020, that same average had grown to 30 outages daily. Today, Internet Insights detects on average over 50 network outages per day, more than double the 2019 average. These outages, despite not making the news cycle, continue to cost operations teams valuable troubleshooting time and can result in lost revenue for the businesses that rely on the affected providers.
Powered by Collective Intelligence: Global Visibility and Real Data
How do we get this data? The answer lies in the power of collective intelligence. For enterprises to confidently take action during an outage, we know the data powering Internet Insights must be both trustworthy and authoritative. It is trustworthy because the insights are based on real network data. It is authoritative because the insights are derived from a massive, collective data set. Internet Insights harnesses the collective intelligence of tens of thousands of ThousandEyes Cloud Agent and Enterprise Agent tests, analyzing billions of daily path measurements to thousands of digital services spanning tens of thousands of viewpoints located in cities around the globe.
Why does collective intelligence from real telemetry matter? Because the Internet is too vast for any individual company to monitor on its own, and crowdsourced outage websites based on unverified user sentiment can be unreliable and can distort the reality of an outage. Case in point, in an August 2021 Newsweek article, Verizon pointed out the challenges it recently experienced when crowdsourced data incorrectly reported a broad-scale outage. It noted that crowdsourced data can result in “widespread misinformation” that is not actionable for service providers and presents yet another challenge to surmount during an outage response.
Now, operations teams can take advantage of both Network and Application Outages detected within the Internet Insights collective data set to quickly get to the bottom of these fundamental questions at the onset of an outage and drastically shorten the mean time to identify (MTTI):
- Is it just them? Or, is there an Internet or provider outage?
- Is it an application outage, a network outage, or both?
- What are the common threads that can indicate the cause? For example:
  - Time: What is co-occurring?
  - Application Providers: Which applications, one or more?
  - Network Providers: Which networks, one or more?
  - Locations: Where is the impact occurring? At the agent, networks (path), or target (servers)?
  - Domains: Which specific properties?
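The common-thread questions above can be sketched in a few lines of Python. This is only an illustration, not how Internet Insights works internally: the `OutageEvent` record, its field names, the `common_threads` helper, and the 80% threshold are all hypothetical, and the events are assumed to come from a single time window already. The AS number is taken from the range reserved for documentation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class OutageEvent:
    """Hypothetical record of one detected outage (not a ThousandEyes schema)."""
    application: str   # application provider affected
    network_asn: int   # network (AS number) the failing traffic traversed
    location: str      # where the impact occurs: "agent", "path", or "target"
    domain: str        # specific property affected


def common_threads(events, min_share=0.8):
    """Return the attribute values shared by at least min_share of the events."""
    threads = {}
    if not events:
        return threads
    for field in ("application", "network_asn", "location", "domain"):
        counts = Counter(getattr(event, field) for event in events)
        value, hits = counts.most_common(1)[0]
        if hits / len(events) >= min_share:
            threads[field] = value
    return threads


# Three different applications failing at the target, all behind one network.
events = [
    OutageEvent("app-a", 64500, "target", "a.example"),
    OutageEvent("app-b", 64500, "target", "b.example"),
    OutageEvent("app-c", 64500, "target", "c.example"),
]
print(common_threads(events))  # {'network_asn': 64500, 'location': 'target'}
```

In this made-up sample, no single application or domain dominates, but every event shares the same network and the same impact location, which points the investigation at the network provider rather than at any one application.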
In Hybrid Work, New Challenges Require New Solutions
And it’s not just operations teams that can benefit from Internet Insights. As enterprises grapple with new hybrid work scenarios, users are becoming more distributed than ever, and the applications they rely on to stay productive are often hosted in the cloud or within a SaaS provider’s network. The growing sophistication of such an ecosystem also means that identifying exactly where the issue lies is more complex than ever. Internet Insights provides distinct value to several parts of the business:
- For the enterprise service desk, which has a growing responsibility for services outside of its control, it means reducing the MTTR (Mean Time To Resolution) of help desk tickets from hybrid workers impacted by network or application issues.
- For executives, it means no more flying blind, with clear external visibility of application availability from the customer’s or employee’s point of view. They also get global, macro insights for strategic governance and reporting, and can make an informed response to outage misinformation stemming from crowdsourced sentiment.
- For service providers and Ops teams, it is no longer enough to maintain general service availability because customers expect service wherever they are. They can see how an application is experienced holistically, isolate regional issues and problem data centers, avoid unneeded SEV1 responses (see this case study: "How a Top Financial Services Firm Gains Visibility Outside the Corporate Perimeter"), and identify platform issues (DNS, SSL, etc.).
- For IT, they can manage response to provider outages, reduce mean time to innocence (MTTI), and accelerate escalations (see this case study: "Money Moving with Internet Insights"). Even better, they can use Outage Snapshot Sharelinks to prove innocence and work with the responsible party to influence the mean time to repair (MTTR). Using one year of historical outage data, they can also enforce vendor SLAs, improve SaaS provider selection, and conduct effective network planning.
- For Application teams and developers responsible for SaaS integrations, they can see the impact that an outage is having on the targets they test using the Affected Tests capability in Internet Insights. This allows them to understand the full picture of how external dependencies are affecting their application’s performance and user experience.
Recent Outages Highlight the Value of Cross-layer Visibility to Application Outages
With the addition of Application Outages to the existing Network Outages in Internet Insights, we can now deliver the same cross-layer visualization you know and love from ThousandEyes test views. This allows you to quickly and easily understand whether a network or application outage—or both—is occurring.
Why is cross-layer visibility important? Let’s look at a recent outage to understand how the visibility that Internet Insights: Application Outages provides can help. The “summer of outages” in 2021 brought us some of the largest and most paralyzing outages to date. Most recently, on October 4th, Facebook went down for more than seven hours when a configuration change took down a critical part of their backbone network, leading to other issues including DNS service disruption. On July 22nd, Akamai’s DNS went down, impacting PlayStation Network, Delta, Costco, and UPS among many others. Earlier in the summer, on July 16th, Akamai had another hiccup when its DDoS mitigation service, Prolexic Routed, left customer websites unreachable.
You also probably remember the Fastly outage back on June 8th. On that day, a latent software bug was triggered by a Fastly customer who updated their own individual CDN configuration. That change, according to Fastly, resulted in customer applications delivered by Fastly failing to be served from their origin servers. The press took note of the most high-profile customers impacted, from Amazon to Reddit, Spotify, eBay, Twitch, and Pinterest. Let’s take a closer look at how Internet Insights: Application Outages identified this outage, and why cross-layer visibility is so important.
At the onset of the outage, we saw 503 service unavailable errors in the application layer while the network layer appeared to be normal. Clicking through the Application Outages timeline, we saw the outage’s impact spread: PayPal, Vimeo, and Target were all affected. From the Internet Insights destination groupings, we learned that the issue was not geographically constrained and not limited to a single application or provider property. And then there was one critical detail: this set of application provider outages all had Fastly’s network, AS 54113, in common. Within minutes, Application Outages gave us a clear picture of the evolving outage and its cause.
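To make the cross-layer reasoning concrete, here is a minimal sketch of how signals from the two layers might be combined into a single verdict. It is an illustration only: the `classify_outage` function, its inputs, and both thresholds are hypothetical and do not describe the detection logic inside Internet Insights.

```python
def classify_outage(http_error_rate, packet_loss,
                    error_threshold=0.5, loss_threshold=0.2):
    """Label an event from two per-layer signals, each a fraction in [0, 1].

    http_error_rate: share of application-layer requests failing (e.g. 503s).
    packet_loss:     share of network-layer probes lost on the path.
    Both thresholds are illustrative assumptions, not product defaults.
    """
    app_down = http_error_rate >= error_threshold
    net_down = packet_loss >= loss_threshold
    if app_down and net_down:
        return "both"
    if app_down:
        return "application"
    if net_down:
        return "network"
    return "none"


# Widespread 503 errors over a healthy network path, as in the Fastly example.
print(classify_outage(http_error_rate=0.9, packet_loss=0.0))  # application
```

Keeping the two signals separate until the last step is the point of the exercise: a high error rate alone cannot say which layer failed, but a high error rate over a clean network path points squarely at the application side.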
Application Outages adds unique value for application and network service providers alike. Looking back at this outage, it’s likely that Internet Insights could have, within minutes, provided Fastly with clear visibility into the external availability of their service during the outage and recovery periods. Application providers that rely on Fastly, for their part, could have benefitted from knowing which other providers were impacted during the same time period, helping them conclude faster that it wasn’t an application issue. Critically, providers would have been able to confirm that service recovery mirrored the customer experience, cross-checking recovery timelines with real network data from ThousandEyes collective intelligence and adding much-needed confidence during the disruption response.
Register for the upcoming webinar to see a demo of Internet Insights and learn how you can monitor digital experience at Internet scale.
Introducing Business-Critical SaaS Outage Detection for the Masses
With the availability of Application Outages, we’re delivering a new layer of application capabilities in Internet Insights. But we haven’t stopped there. Today, we’re also announcing the addition of the Application Outages data set to the Internet Outages Map, bringing near real-time application outage detection to the masses. The map is updated every 5 minutes, and can be referenced to quickly understand if an ongoing outage in a network or application provider you rely on is the source of an issue you are experiencing.
At ThousandEyes, we are committed to delivering on our mission to become “Google Maps for the Internet,” but we’re just scratching the surface of the collective intelligence capabilities on the ThousandEyes platform. ThousandEyes Internet Insights is a critical piece of the puzzle to solve your macro questions about availability, performance, global routing, and planning needs.
Internet Insights: Application Outages is available today. In fact, customers are already using it to address their most critical visibility gaps. Let us know how you would use Internet Insights, and we’d be happy to help.