Future-Proofing End User Monitoring to Meet Scale Demands

By Derek Mok & César Tron-Lozai
3 min read


The "hockey stick" momentum ThousandEyes experienced for End User Monitoring has required innovative solutions to ensure optimal performance at scale.

The ThousandEyes customer community has grown rapidly since the onset of the pandemic, and our engineering team has been hard at work to ensure that our platform scales to meet their needs. 

From the beginning, our goal has been to help organizations proactively manage performance risks on their global networks. The Endpoint Agent goes a long way toward supporting this objective by delivering both on-demand and real-time visibility into each employee's experience of SaaS (software as a service) and internally hosted applications. In addition, this lightweight service, installed on the laptops and desktops of end users, can monitor the underlying wireless LAN, WAN, Internet connectivity, and system health.

End User Monitoring is our fastest-growing product category, which is fantastic for users. It indicates that companies are taking digital experience and its effects on employee morale and productivity seriously. However, this hockey stick momentum initially required some innovative solutions to ensure optimal performance at scale.

 Figure 1. Endpoint Agent user growth from January 2020 until October 2021. DMY date format. 

Scaling the Endpoint Agent was not as simple as buying more hardware to increase capacity. Instead, it involved fundamentally rearchitecting the Endpoint Agent platform to deliver a future-proof solution fit for our customers.  

In this video produced by Devoxx UK, we dive into our team's scaling journey to deliver an Endpoint Agent poised for exponential growth. Check it out.

In our presentation, we provided a glance at the different techniques the team used to scale the Endpoint Agent platform to the next level, including: 

  • Upfront investment in building a comprehensive suite of load tests to get a performance baseline and to validate the impact of future optimizations

  • Leveraging WebSockets to enable real-time configuration updates for the Endpoint Agent 

  • Implementing different caching strategies  

  • Replicating data into multiple datastores to achieve high performance on specialized workloads: for example, fast key lookups with DynamoDB and efficient searching with Elasticsearch

  • Embracing eventual consistency by balancing data freshness with performance 
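
To make the caching and eventual-consistency points above more concrete, here is a minimal sketch of a read-through cache with a time-to-live (TTL). It is an illustration only, not the ThousandEyes implementation: the `TTLCache` class, the `loader` callback, and the `agent-1` key are hypothetical names chosen for this example. The idea is that a reader may see a value up to `ttl_seconds` old in exchange for skipping most trips to the slower source of truth.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class TTLCache(Generic[K, V]):
    """Hypothetical read-through cache: serves values up to ttl_seconds old."""

    def __init__(
        self,
        loader: Callable[[K], V],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # fetches from the (slow) source of truth
        self._ttl = ttl_seconds    # how much staleness we are willing to accept
        self._clock = clock        # injectable so the demo does not need to sleep
        self._entries: Dict[K, Tuple[float, V]] = {}
        self.loads = 0             # number of times the source was actually hit

    def get(self, key: K) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # fresh enough: skip the source entirely
        value = self._loader(key)  # missing or expired: reload and remember
        self.loads += 1
        self._entries[key] = (now, value)
        return value


# Demo with a fake clock and an in-memory dict standing in for the datastore.
source = {"agent-1": "config-v1"}
fake_now = [0.0]
cache = TTLCache(lambda k: source[k], ttl_seconds=30.0, clock=lambda: fake_now[0])

first = cache.get("agent-1")      # cache miss, loads from the source
source["agent-1"] = "config-v2"   # the source of truth changes
stale = cache.get("agent-1")      # still inside the TTL, old value is served
fake_now[0] = 31.0                # advance the fake clock past the TTL
fresh = cache.get("agent-1")      # entry expired, new value is loaded
```

In this sketch the TTL is the knob that trades data freshness for performance: a longer TTL means fewer loads from the source but a wider window in which readers can see outdated data.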

Does the work ThousandEyes does excite you? Are you looking for a change and a challenge in your career? Come innovate with us!

Our engineering team is seeking bold and talented candidates to help us ensure we deliver unparalleled insights to our customers for improved troubleshooting and optimized user experience. See all open positions here.
