Over the last few years, I’ve seen a lot of discussions and content online that equate the performance of an application with speed, most notably from the web performance and WPO (web performance optimization) communities. The corollary of this trend is to equate end user experience with the time it takes for a web page to load, as measured by RUM (real-user monitoring). RUM is becoming a commodity feature in several application performance monitoring products since it’s trivial to perform, especially with the new W3C extensions. In fact, there are even open source tools like Yahoo’s Boomerang that have been doing this for years, and Google Analytics added a page load time metric more than 2 years ago. In this blog post I’ll cover some of the shortcomings of this myopic and superficial view of performance.
The very first pillar of performance is service availability. If a service is not available, what’s the point of measuring how fast it is? In fact, most passive measurements (including RUM) are unable to determine whether a service is available, because they only report on page loads that actually happened. Active measurements are often required for this. It’s also not sufficient to determine availability alone; it’s important to pinpoint which component or layer is causing the outage. At the first level, is it a network or application problem?
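To make the "network or application?" question concrete, here is a minimal sketch of an active probe. The function names `classify_failure` and `probe` are my own for illustration, and the mapping is a simplifying assumption: an HTTP error response means the server was reachable (so the application is suspect), while a DNS, connection or timeout error points at the network. A real diagnosis would need far more detail than this.

```python
import urllib.error
import urllib.request


def classify_failure(error):
    """Guess which layer failed from the exception an HTTP probe raised.

    Assumption: any HTTP error status means the network path worked and the
    application is at fault; any OS-level error means a network problem.
    """
    # HTTPError must be checked first because it subclasses URLError.
    if isinstance(error, urllib.error.HTTPError):
        return "application"
    # URLError wraps the underlying cause (DNS failure, refused connection...).
    if isinstance(error, urllib.error.URLError):
        error = error.reason
    if isinstance(error, OSError):
        return "network"
    return "unknown"


def probe(url, timeout=5.0):
    """Actively fetch a URL and report (verdict, http_status)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return ("available", response.status)
    except Exception as error:
        return (classify_failure(error), getattr(error, "code", None))
```

Running `probe` on a schedule from several vantage points is what turns this into an availability measurement; a single probe only tells you about one path at one moment.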
Content correctness is somewhat related to availability, but it measures the accuracy of the content the service delivers. For example, if a site is hacked, it will be considered available, but the content might be defaced. In fact, it might even load faster than before. A critical third party component that fails to load can also be a problem. It’s important to verify that the application behaves as expected and has all the components it needs to function.
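A rough way to picture this kind of check is to compare the fetched page against a list of things that must be there and things that must not. The sketch below is only an illustration: `check_content` and the marker strings are hypothetical, and plain substring matching is a deliberate simplification of what a real content check would do.

```python
def check_content(body, required_markers, forbidden_markers=()):
    """Report whether a page body contains what it should and nothing it shouldn't."""
    missing = [marker for marker in required_markers if marker not in body]
    unexpected = [marker for marker in forbidden_markers if marker in body]
    return {
        "ok": not missing and not unexpected,
        "missing": missing,
        "unexpected": unexpected,
    }


# Hypothetical page: it loads quickly, but the checkout script tag is gone.
page = "<html><title>Example Shop</title><body>Welcome</body></html>"
result = check_content(
    page,
    required_markers=["<title>Example Shop</title>", "checkout.js"],
    forbidden_markers=["hacked by"],
)
print(result)
```

A page like this would count as "available" and "fast" while being broken for anyone trying to buy something.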
Page load data collected from thousands or millions of users is generally not very actionable, since it’s only a coarse indicator of slowness. The data is often very noisy, and the outliers that capture real problems are often buried by the aggregates. Also, because detailed metrics are missing, it is often impossible to drill down into the data to find the root cause of performance degradation (e.g. wireless access is often a culprit). Aggregate page load times are useful for spotting wide scale performance degradation, but the data offers little insight beyond that.
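A small numeric sketch shows how aggregates bury outliers. The load times below are made up for illustration: 95 users see a 2 second page load and 5 users see 30 seconds. The `percentile` helper is my own nearest-rank implementation, not taken from any particular RUM product.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic page load times in seconds (invented data).
load_times = [2.0] * 95 + [30.0] * 5

median = statistics.median(load_times)
mean = statistics.mean(load_times)
p99 = percentile(load_times, 99)

print(f"median={median:.1f}s mean={mean:.1f}s p99={p99:.1f}s")
```

The median says everything is fine and the mean moves only slightly, while 5% of users are waiting 15 times longer than everyone else. Even the tail percentile only says that something is wrong, not why.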
Application Consumer vs. Provider
Consider the case of an IT team inside an enterprise using a third party SaaS service. Knowledge of page load time is not actionable since there’s no control over the application. However, if the IT team finds out that it can optimize its own infrastructure to speed up the delivery of the application, then it can install a WOC (WAN optimization controller) inside its network. Furthermore, if it can correlate application degradation with limitations/bottlenecks inside its network, then it can fix the problem by reconfiguring the network.
Network vs. Application Metrics
Page load time is just the tip of the iceberg. It’s the most obvious and noticeable aspect. However, to understand what’s causing performance degradation, a deeper dive is required. Network and infrastructure problems often slow an application down or render a service unavailable. Without full stack visibility into both application and network, it becomes almost impossible to diagnose problems. It also becomes hard to assign problems to the right owner, since network and application problems often belong to different teams inside IT.
User experience = Speed?
“User experience” has been an over-abused term in the web performance world and is often equated with page load speed. User experience is a much more complex topic that spans interaction design, usability, business utility, value proposition, etc. Speed is just a tiny element of user experience.
At ThousandEyes, we have been developing a product that can take users to the root cause of performance degradation. We strive to make information as actionable as possible. The tip of the iceberg is but a starting point for exploring the depth and richness of what happens under the surface.