Application Performance Management (APM) is an important part of many IT organizations’ portfolios because DevOps teams need user experience data to continuously improve application code. Yet as SaaS-based applications and services grow (SaaS handles 11% of application workloads, according to AlphaWise/Morgan Stanley Research), APM is running into limits in its ability to offer insight into digital experience. Furthermore, dependence on externally owned networks, infrastructure, and services, especially across the Internet, means that IT teams need to go beyond what APM offers to deliver superior digital experiences.
The Benefits and Limitations of APM
Many organizations use APM solutions to instrument code for applications that they own or develop. APM enables application discovery, tracing, diagnostics and real user monitoring, giving you a deep view of the health of your application components, so you can discover issues (such as bottlenecks) and use that data to troubleshoot and optimize application performance. It provides essential insight to DevOps and a bridge to production IT teams as they work to maintain continuity of operations. In short, for applications that you own, APM is often critical to your business. Its usefulness is limited, however, to applications that run within your data center or VPC. For apps owned by SaaS providers such as Salesforce, Box, and Microsoft (e.g., Office 365), or external services such as API endpoints from PayPal, Braintree, Twilio and Google Cloud (such as Maps), APM loses most of its insight into availability and performance. APM vendors also lack expertise in the underlying network behavior that impacts end-to-end delivery.
APM is very good for:
- Getting code-level visibility into your applications
- Enabling DevOps to fix/optimize apps
APM is not good for:
- Getting operational visibility into your external dependencies (SaaS applications, APIs, as well as service delivery elements such as DNS, ISPs, etc.)
- Understanding your user experience from different locations, correlated to the many intermediate networks that deliver connectivity
- Enabling your production IT/operations teams to identify and troubleshoot external network, service, and provider dependencies and issues quickly
Increasing External Dependencies
As applications become increasingly atomized and distributed (e.g., across multiple clouds and vendors), a greater proportion of your traffic is shifting outside your enterprise environment. Payment gateways, call forwarding integrations and other external SaaS API endpoint providers are now a critical part of how your application functions. All of this inter-service communication takes place over the Internet, which is made up of many providers and depends on systems such as BGP routing and DNS. Knowing what has gone wrong when issues occur, and who the responsible party is, can be very challenging. Since APM only provides visibility into the applications that you own, you can have significant blind spots in your view of the overall health of your applications.
Even though many aspects of user experience are outside your management domain, you still own user experience and are, ultimately, responsible for ensuring an application is available to customers—regardless of the issue or responsible party.
What about Synthetic APM?
Some APM vendors offer synthetic monitoring, which enables you to perform load and transaction testing and determine page load time from a variety of user vantage points. This level of visibility is useful for determining how user location may impact performance, but it does not provide sufficient contextual data (specifically network and routing data) to identify issues and attribute them to the right party. When Synthetic APM testing detects degraded performance, it can be challenging to determine whether the application or the network is at fault. And if the network is the problem, these tools provide no visibility into which external provider your Operations team should escalate to, or what data to give that provider so the escalation leads to resolution.
Like traditional APM, synthetic monitoring is primarily useful for your DevOps or application development team. For complete visibility that can provide data for both DevOps and Operations, you need cross-layer insight (application, network, routing, and device metrics) for all of your critical dependencies.
Network Intelligence for External Visibility
Network Intelligence fills the visibility gaps that APM and traditional network performance monitoring (NPM) leave in your application and user experience. It provides deep insight into the availability and performance of all of your external dependencies, and it simulates the experience of your users from a variety of vantage points. Unlike APM, which requires code injection into applications that you own, Network Intelligence uses active monitoring techniques that don’t require instrumentation of your monitoring targets. It correlates app-layer visibility with hop-by-hop metrics across the end-to-end network path, plus Internet routing data. Together, this multi-layered intelligence enables your Operations teams to provide a high-quality user experience.
How Network Intelligence and APM Work Together
The high-level goals of both APM and Network Intelligence are the same, specifically:
- Improve mean time to troubleshoot (MTTT)
- Improve user experience and employee productivity
- Improve business outcomes
However, APM and Network Intelligence address different domains, with APM targeting applications that you own and manage, and Network Intelligence targeting end-to-end networks including the Internet and cloud-based services.
To get a complete view of the health of your applications and how your users experience them, you need both APM and application-aware Network Intelligence. APM will provide visibility for your DevOps team, while Network Intelligence will give external visibility, either inside-out or outside-in, for your NetOps and CloudOps teams.
Core Capabilities for Monitoring Cloud-based Applications
For any enterprise that has a digital presence or consumes applications and services from the cloud, there is a core set of capabilities that are critical for gaining performance insight.
Multi-point network path visualization with root cause indicators
Path visualization is an essential aspect of end-to-end network diagnostics, especially given the external Internet dependencies involved in using cloud applications. Viewing one path at a time keeps the data siloed and forces operators to correlate manually. Viewing paths from multiple sources (data centers) to multiple destinations (3rd-party APIs), with the nodes that exhibit anomalies highlighted, provides instant visual "triangulation" of network-based issues.
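The "triangulation" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the path data, hop names, and the `suspect_hops` function are all hypothetical, and real path data would come from traceroute-style measurements. The sketch ranks hops that appear on degraded paths but on no healthy path.

```python
from typing import Dict, List, Set, Tuple

# Key: (source, destination). Value: ordered hop identifiers on that path.
# All names below are hypothetical placeholders.
PathMap = Dict[Tuple[str, str], List[str]]


def suspect_hops(paths: PathMap, degraded: Set[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Rank hops seen on degraded paths but on no healthy path.

    Returns (hop, number_of_degraded_paths_containing_it) pairs, most
    frequently shared hop first, ties broken alphabetically.
    """
    # Hops that carry at least one healthy path are treated as cleared.
    healthy_hops = {
        hop
        for key, path in paths.items()
        if key not in degraded
        for hop in path
    }
    counts: Dict[str, int] = {}
    for key, path in paths.items():
        if key not in degraded:
            continue
        for hop in set(path):  # count each hop once per path
            if hop not in healthy_hops:
                counts[hop] = counts.get(hop, 0) + 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


example_paths: PathMap = {
    ("dc-east", "payments-api"): ["edge-e1", "isp-a-1", "isp-a-2", "pay-gw"],
    ("dc-west", "payments-api"): ["edge-w1", "isp-b-1", "isp-a-2", "pay-gw"],
    ("dc-east", "maps-api"): ["edge-e1", "isp-a-1", "isp-c-1", "maps-gw"],
}
example_degraded = {("dc-east", "payments-api"), ("dc-west", "payments-api")}
example_suspects = suspect_hops(example_paths, example_degraded)
```

In this made-up data set, both degraded paths share `isp-a-2` and `pay-gw`, while the healthy path to `maps-api` clears `edge-e1` and `isp-a-1`. Viewing a single path in isolation would not narrow the fault this way.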
Enterprise applications may connect with a variety of 3rd-party APIs, many of which may be fundamental to the functioning of your application.
Users rely on the Internet to access applications and services. But the Internet is made up of many different best-effort shared networks, any of which could impact experience.
DNS is a critical dependency for all cloud communications, a key component of availability.
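As a rough illustration of treating DNS as a monitored dependency, the following Python sketch times a single name resolution and records whether it succeeded. It is a minimal sketch under stated assumptions: `dns_test` and `DnsTestResult` are hypothetical names, and the lookup goes through the operating system's resolver via `socket.getaddrinfo`, so results may reflect local caching rather than the behavior of an authoritative server.

```python
import socket
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DnsTestResult:
    hostname: str
    resolved: bool                      # True when at least one address came back
    addresses: List[str] = field(default_factory=list)
    elapsed_ms: float = 0.0             # wall-clock time spent in the lookup
    error: Optional[str] = None         # resolver error text when the lookup failed


def dns_test(hostname: str) -> DnsTestResult:
    """Resolve `hostname` once through the system resolver and time it."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        return DnsTestResult(hostname, False, [], elapsed_ms, str(exc))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return DnsTestResult(hostname, bool(addresses), addresses, elapsed_ms)
```

Calling `dns_test` with the hostname of an external dependency on a schedule, and from more than one location, gives a simple availability and latency series for that name.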
Both application providers and users may rely on a Content Delivery Network (CDN) to improve service delivery. Given that the CDN is the “front-line” for service delivery, it’s critical that you’re able to determine its performance for users in various locations.
Security services monitoring
Given the security threats to all enterprises, some now employ cloud-based services to enable automatic mitigation that can scale out to deal with the scope of each attack. To ensure your security provider is operating as expected (and to ensure you’re not collateral damage when another enterprise is targeted), you need visibility into provider availability and performance.
Internet routing visibility
BGP routing changes can have a dramatic impact on fundamental reachability from enterprise sites using SD-WAN/DIA to cloud provider networks. Without this insight, it is difficult to troubleshoot in the moment or to identify an ISP or other provider that is affecting end-to-end behavior, and it is impossible to track the Internet resilience and stability of critical external providers.
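One simple way to reason about routing changes is to compare two snapshots of the AS paths observed for the prefixes you care about. The Python sketch below assumes you already have such snapshots from some route-collection source. The `diff_routes` function is hypothetical, and the prefixes and AS numbers are taken from ranges reserved for documentation.

```python
from typing import Dict, List, Tuple

# An AS path is the ordered sequence of AS numbers toward the origin.
AsPath = Tuple[int, ...]


def diff_routes(before: Dict[str, AsPath], after: Dict[str, AsPath]) -> Dict[str, List[str]]:
    """Classify prefixes as withdrawn, newly announced, or re-routed."""
    withdrawn = sorted(p for p in before if p not in after)
    announced = sorted(p for p in after if p not in before)
    path_changed = sorted(
        p for p in before if p in after and before[p] != after[p]
    )
    return {
        "withdrawn": withdrawn,
        "announced": announced,
        "path_changed": path_changed,
    }


# Hypothetical snapshots: prefix -> AS path seen from one vantage point.
snapshot_before = {
    "192.0.2.0/24": (64496, 64500, 64510),
    "198.51.100.0/24": (64496, 64501, 64511),
}
snapshot_after = {
    "192.0.2.0/24": (64496, 64502, 64510),   # same origin, different transit AS
    "203.0.113.0/24": (64496, 64501, 64511),
}
route_changes = diff_routes(snapshot_before, snapshot_after)
```

A withdrawn prefix points to a reachability problem, while a changed path with the same origin suggests traffic has shifted to a different transit provider, which is useful context when deciding whom to contact.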
Application insights via HTTP, Page Load, Transaction Tests
Variable depth of tests is required for completeness of visibility. HTTP tests provide an overview of availability and performance; Page Load tests offer a breakdown view of the components affecting end-to-end performance; Transaction tests offer visibility into a specific user experience flow.
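The shallowest of these, an HTTP test, can be sketched with the Python standard library alone. This is a simplified illustration, not a description of any vendor's test: `http_test`, `HttpTestResult`, and `summarize` are hypothetical names, and the code measures only overall availability and total response time. Page Load and Transaction tests need a real browser engine to load page components and drive user flows, so they are out of scope here.

```python
import statistics
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HttpTestResult:
    url: str
    available: bool          # True when a response with status < 400 arrived
    status: Optional[int]    # HTTP status code, or None if no response arrived
    elapsed_ms: float        # wall-clock time for the whole request
    error: Optional[str]     # short error description when the request failed


def http_test(url: str, timeout_s: float = 5.0) -> HttpTestResult:
    """Fetch `url` once and record availability and total response time."""
    start = time.perf_counter()
    status: Optional[int] = None
    error: Optional[str] = None
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read()                # include the body download in the timing
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code              # the server answered, but with 4xx/5xx
        error = str(exc.reason)
    except (urllib.error.URLError, OSError) as exc:
        error = str(exc)               # DNS failure, refused connection, timeout
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    available = status is not None and status < 400
    return HttpTestResult(url, available, status, elapsed_ms, error)


def summarize(results: List[HttpTestResult]) -> Dict[str, Optional[float]]:
    """Availability percentage and median response time of successful runs."""
    ok = [r for r in results if r.available]
    availability_pct = 100.0 * len(ok) / len(results) if results else 0.0
    median_ms = statistics.median(r.elapsed_ms for r in ok) if ok else None
    return {"availability_pct": availability_pct, "median_ms": median_ms}
```

Running `http_test` against an endpoint at a fixed interval and passing the collected results to `summarize` yields the overview described above. Explaining why a request was slow requires the deeper test types and the network-layer data.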
Deep-link sharing of full diagnostic data and visuals with external providers
Cloud adoption creates a major change in the troubleshooting process. The "find and fix" approach applies to infrastructure owned and operated by the IT organization. With external providers, the approach becomes "evidence and escalate": identify the external root cause domain (provider) and present sufficient evidence to rapidly garner acknowledgement, collaboration and remediation from the provider's higher support tiers. This is especially important given that the provider in question may need evidence to escalate to one of its own provider dependencies, such as an ISP.
Complete Application Visibility—Inside and Out
Whether or not your organization leverages APM solutions for internally owned application visibility, Network Intelligence can provide the deep insight your IT teams need to understand how external factors impact application health and user experience. Once you have this data, ThousandEyes makes it simple to share it with providers and internal stakeholders, so you can reduce mean time to troubleshoot (MTTT), maintain a high-quality user experience and ensure business continuity.