We just got back from New York after hosting our second Connect event of the year. Keeping up with tradition, we welcomed speakers from a variety of industry verticals: healthcare, Internet security and financial institutions. In today's post we'll discuss the presentation by Ivan Shepherd, Senior Manager at AIM Specialty Health, a medical benefits manager for services like radiology, oncology and cardiology. Ivan leads the Network and Technical Security team at AIM Specialty and is responsible for managing data center and campus networks and application delivery across these networks.
Ivan and his team have been working with ThousandEyes for over a year now and use ThousandEyes to gain visibility into service delivery for AIM's Telecommuter program. In his talk Ivan discusses the inception of the Telecommuter program and the parallel evolution of the program alongside ThousandEyes Cloud, Enterprise and Endpoint Agents to monitor AIM's network.
The Requirements
Ivan kicks off the session by recalling how, a little over a year ago, he was tasked with building a network for specialized medical professionals connecting from remote locations in the continental US. The project was named the Telecommuter program, and business requirements mandated a flawless user experience: remote users were to get a data and voice experience comparable to that of a campus user. The data aspect did not worry Ivan; voice, however, was a different beast. Ivan recounts, "With TCP-based web applications we can account for a lot, but voice gets a little trickier."
The now-remote user base also meant that private cloud was no longer a sustainable option. The network architecture at AIM had traditionally relied on the private cloud, but that was going to change with the Telecommuter program. Ivan says, "The increasing need for partners broke down the private cloud model."
The Proof of Concept
Although a bit apprehensive about the requirements, Ivan says his team got right down to building a proof of concept. The team started with a VPN-based solution, employing a leased circuit from a relatively new carrier with an IPSec gateway at the remote location. Users were able to connect their VoIP phones and laptops through the IPSec gateway and were trunked into AIM's data center. Voilà! Problem solved and the Telecommuter network was born.
Well, not really. Ivan remembers the call from the NOC team complaining that the PoC system was breaking down, with poor user quality and overall network instability. Ivan draws an analogy to moldy bread: the PoC, although very successful at first, started getting moldy a couple of months later, once real users were on it. The PoC system had been converted into a soft production system, and Ivan had a new problem at hand.
As it turns out, the carrier that AIM Specialty decided to go with had nationwide BGP issues, with traffic being blackholed at a couple of its backbone locations. Troubleshooting from the data center didn't provide a good vantage point, as in-house tools showed that the leased circuit was underutilized and DNS reachability was stellar. "4.2.2.2 was responding clean and we didn't see any issues from our side," recalls Ivan.
Solution Part 1: Cloud Agents
Faced with a new problem to solve, Ivan says he quickly realized that existing monitoring tools like Ping, Curl, SNMP or IP SLA were not coming to his rescue. Vendor compatibility, lack of historical data and unreadable UX were a few of the challenges. While researching how to overcome the issue of poor visibility, he says technical forums like NANOG and SLAC led him to ThousandEyes. After starting a free trial of ThousandEyes and setting up tests from Cloud Agents, Ivan says, "We very quickly got the information we needed." He points to the ThousandEyes Path Visualization snapshot shown in Figure 3, which shows exactly where packets were getting lost. With ThousandEyes Cloud Agents he was able to gain visibility into service delivery and BGP routing, and to make informed decisions on remediation once a fault was detected.
As his team was getting ready to move the PoC system to a real production system, ThousandEyes became an integral part of the architecture planning. With ThousandEyes his team was able to benchmark service providers and understand where and how remote users transited the network. This allowed his team to engineer traffic around problematic circuits, which was critical in transitioning from PoC to deployment.
Solution Part 2: Enterprise Agents
Ivan moves on to the second use case, where his team adds Enterprise Agents to the mix. AIM Specialty collaborates with multiple partners and healthcare providers, which entails workflow-related tools being hosted in third-party organizations. When things break down, troubleshooting is tricky because it invariably involves finger-pointing. Additionally, the existing set of 15-odd troubleshooting tools like Ping, Curl, etc., all running from the same central location, failed to provide the right visibility. Enterprise Agents provided a meshed vantage across partner sites, helping to monitor the private transit. As Ivan says, "Visibility into delivery was critical and ThousandEyes Enterprise Agents fit rightly into this use case."
Ivan illustrates this with an example where an Enterprise Agent, dropped into one of the partner sites, was able to accurately point out where in the transit ISP packet drops were occurring and also narrow down the exact timeline of the fault. He says, "We were seeing a 40% loss on average everyday between 5PM to 10PM. Path Visualization immediately pinpointed the issue."
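To make the idea of a recurring evening loss window concrete, here is a minimal sketch of how loss measurements from a single vantage point could be bucketed by hour of day. It is not how ThousandEyes computes loss; the function names, the sample format and the numbers are all made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def loss_by_hour(samples):
    """Aggregate packet loss percentage per hour of day.

    samples: iterable of (iso_timestamp, packets_sent, packets_lost).
    This tuple format is a hypothetical one chosen for the sketch.
    """
    sent = defaultdict(int)
    lost = defaultdict(int)
    for ts, n_sent, n_lost in samples:
        hour = datetime.fromisoformat(ts).hour
        sent[hour] += n_sent
        lost[hour] += n_lost
    # Skip hours with no packets sent to avoid dividing by zero.
    return {h: 100.0 * lost[h] / sent[h] for h in sorted(sent) if sent[h] > 0}


def lossy_hours(samples, threshold_pct=10.0):
    """Return the hours of day whose aggregate loss meets the threshold."""
    return [h for h, pct in loss_by_hour(samples).items() if pct >= threshold_pct]


# Made-up probe results spanning two days (not data from the talk).
samples = [
    ("2015-06-01T09:00:00", 50, 0),
    ("2015-06-01T17:30:00", 50, 20),
    ("2015-06-01T21:15:00", 50, 21),
    ("2015-06-02T09:00:00", 50, 1),
    ("2015-06-02T17:45:00", 50, 19),
    ("2015-06-02T22:30:00", 50, 0),
]

print(lossy_hours(samples))  # [17, 21]
```

Grouping by hour of day across several days is what separates a recurring window, like the evening pattern Ivan describes, from a one-off blip. It does not show where along the path the loss occurs, which is the part Path Visualization answered.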
ThousandEyes Path Visualization
In Ivan's exact words: "Love, love, love Path Visualization." With Path Visualization, Ivan's team has been able to short-circuit conversations with partners and ISPs. Before ThousandEyes, ISPs would stonewall issues raised with ICMP traceroute and Curl-based data. The simplicity of Path Visualization and its direct correlation to where loss occurs has elevated the conversation within AIM Specialty and with partners. He jokingly says, "Even my mother can now see why her Internet is not working." Combined with the ThousandEyes dashboard, he says, it has been easy not only to converse with NOC personnel and operations staff but also to show the value to the executive staff. He says, "One look at the dashboard and we can identify what the normal state of operation is and quickly identify where the issue lies and initiate the right escalations."
Road to the Future
As the Telecommuter program expands, Ivan and his team are constantly looking to resolve networking challenges through ThousandEyes. In the near term his team is working on integrating the ThousandEyes API with their internal systems to pull data and store it for long-term visibility. AIM Specialty Health has also been an active participant in our beta program for the Endpoint Agent. The Endpoint Agent extends the perimeter of network monitoring all the way to the remote work-from-home user. Read how the Endpoint Agent complements the Cloud and Enterprise Agents in AIM Specialty Health's Telecommuter network.
If you are interested in listening to Ivan's talk, check out the video below. And stay tuned for more great content from ThousandEyes Connect.