Optimizing WAN to Deliver SharePoint Online Globally

By Young Xu
10 min read


In this post from ThousandEyes Connect New York, we’ll summarize the talk by Scot Clark, Enterprise Infrastructure Solutions Manager at a Forbes Top 150 multinational consumer goods company.

Figure 1: Scot Clark presenting at ThousandEyes Connect New York.

During his talk, Scot described how his company’s WAN has evolved, as well as the business’s continuing migration to the cloud. He then delved into performance issues that employees in India experienced while using the SaaS application SharePoint, and discussed how his team used ThousandEyes to identify the issues and ultimately absolve the network of blame.

Migrating Toward a Hybrid WAN

In the past, Scot’s team managed a traditional internal network with multiple data centers that implemented single or dual MPLS running globally. It was a hub-and-spoke model with Internet access provided via point-to-point VPN links, as shown in Figure 2.

Figure 2: In the past, the company’s network was a traditional internal network that implemented single or dual MPLS with Internet access provided via point-to-point VPN links.

From there, the network has evolved. Scot said, “Over time, we’ve been flattening out into a hybrid network consolidated into two data centers. Most of the sites now have local ISP links that break directly out into the Internet.” In addition, many of their applications have migrated to the cloud, including Office 365 and SharePoint (as shown in Figure 3).

Figure 3: Over time, Scot and his team moved to a hybrid network consolidated into two data centers. Most of the sites now have local ISP links that access the Internet directly.

One of the biggest benefits of this change was that users could now be properly recognized as accessing a service from their actual location. In the past, Canadian employees connecting through the U.S. data center would be seen as Internet users accessing a given service from the U.S. This caused problems when, for instance, websites served ads to the company’s marketing team based on the detected location, even though the team needed to see content targeted to users in their actual location.

Evolution of the Office

Not only are networks changing, but the office environment is changing as well. For Scot’s IT team and much of the rest of the company worldwide, what used to be traditional cubes is now an open floor plan. This is also related to a trend toward flexible working environments, where employees may be working from headquarters, a branch office, customer site or even from home or a cafe.

In addition to a changing work environment, Scot’s team also deals with migrating applications — for example, voice has now moved to UCaaS Skype for Business. Scot also noted that “we have over 100,000 PC users, 70-80% of which are now laptop users, so we’ve started moving everyone to wireless, which introduces its own challenges.” Maintaining wireless network performance has become increasingly important for Scot’s team.

Degraded Performance with SharePoint SaaS Migration

Scot described a specific SharePoint performance problem that his team had to investigate. SharePoint was originally deployed within a data center in the UK, with the expectation that the page load time for SharePoint’s home page from anywhere in the world should be 5 seconds. To achieve this goal, Scot and his team deployed WAN acceleration in the form of Riverbed appliances, which helped speed up page delivery to locations worldwide.

However, Scot’s company then decided to move to SharePoint’s cloud application, but “they completely forgot that the WAN acceleration was put in place. They did very limited testing from one location in the UK, so when the cloud migration was rolled out, employees began experiencing performance issues again.” The 5 second homepage load time increased to 20-30 seconds and the network was blamed, so the network engineers were tasked with figuring out what was wrong.

Scot’s team successfully used ThousandEyes to locate and resolve the issues. They moved the WAN acceleration devices and identified other issues, including a bloated homepage and numerous additional WAN hops to reach the Dublin location where SharePoint was hosted.

Scot noted that his team also had real user monitoring (RUM) in place to monitor the webpage. Unfortunately, due to issues with the JS injection and loading, RUM underreported load times. It was also unable to show the impact of latency and geographical distance on overall user experience.

Using ThousandEyes to Identify the Issues

Scot’s team deployed Enterprise Agents in four global locations, from which they ran Page Load tests. The tests gave them page load timings for each step, including main page access and authentication with ADFS, all from the viewpoint of their four locations.

After setting up a Selenium-based Transaction test that involved visiting SharePoint’s main page, authenticating login information and loading the homepage, Scot’s team discovered that the total transaction time from their Enterprise Agent in India was over 26 seconds.

Figure 4: Scot’s team discovered that the total transaction time from their Enterprise Agent in India was over 26 seconds.
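
To make the idea of a multi-step transaction measurement concrete, here is a minimal sketch of a step timer in Python. It is not the ThousandEyes Transaction test or its Selenium script: the function name `time_transaction`, the step names, and the `time.sleep` placeholders are all assumptions made for illustration. In a real test, each step would drive a browser instead.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_transaction(
    steps: List[Tuple[str, Callable[[], None]]],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run each step in order and return its elapsed time in seconds."""
    timings: Dict[str, float] = {}
    for name, action in steps:
        start = clock()
        action()  # a real test would perform a browser interaction here
        timings[name] = clock() - start
    return timings


if __name__ == "__main__":
    # Placeholder steps: time.sleep stands in for real page interactions.
    demo_steps = [
        ("load sign-in page", lambda: time.sleep(0.02)),
        ("authenticate", lambda: time.sleep(0.01)),
        ("load homepage", lambda: time.sleep(0.03)),
    ]
    results = time_transaction(demo_steps)
    for name, seconds in results.items():
        print(f"{name}: {seconds:.3f}s")
    print(f"total: {sum(results.values()):.3f}s")
```

Keeping per-step timings, rather than only a total, is what makes it possible to see which part of the transaction is slow.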

Digging deeper, they looked at the load times for each step in the transaction. Loading the sign-in page took 6.5s, authentication took 0.5 to 0.9s and the final step of loading the homepage took a whopping 16.8s. High load times for the homepage were clearly an area that needed to be investigated.

Figure 5: Breaking out the total transaction time by step revealed that much of the transaction time came from loading the homepage, which took 16.8 seconds.
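
The per-step breakdown can be turned into a quick calculation of how much each step contributes. The sketch below uses the step times quoted above. The 0.7s authentication value is an assumed midpoint of the 0.5 to 0.9s range, and the function name `step_shares` is hypothetical.

```python
from typing import Dict


def step_shares(step_seconds: Dict[str, float]) -> Dict[str, float]:
    """Return each step's share of the summed step time, as a percentage."""
    total = sum(step_seconds.values())
    if total <= 0:
        raise ValueError("step times must sum to a positive number")
    return {name: 100.0 * secs / total for name, secs in step_seconds.items()}


# Step times quoted in the post; 0.7s is an assumed midpoint of 0.5-0.9s.
india_steps = {
    "load sign-in page": 6.5,
    "authenticate": 0.7,
    "load homepage": 16.8,
}

shares = step_shares(india_steps)
slowest = max(shares, key=shares.get)
print(f"slowest step: {slowest} ({shares[slowest]:.0f}% of itemized time)")
```

Under these assumptions, the homepage load accounts for roughly 70% of the itemized step time. The three steps sum to about 24 seconds, so the shares are relative to that sum rather than to the full transaction time of over 26 seconds.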

Looking at the object-level waterfall, they discovered that a few objects on the homepage had very long wait times. In particular, Scot called out two objects that introduced a long 5.6 second delay.

Figure 6: Two objects on the homepage had very long wait times, introducing a long 5.6 second delay.
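
A similar object-level check can be sketched against any waterfall exported in the HAR format, where each entry records a `timings.wait` value in milliseconds. This is only an illustration, not how ThousandEyes renders its waterfall. The function name `slow_waits`, the threshold, the URLs and the timing values are all made up.

```python
from typing import Any, Dict, List, Tuple


def slow_waits(har: Dict[str, Any], threshold_ms: float) -> List[Tuple[str, float]]:
    """List (url, wait_ms) for entries whose wait exceeds threshold_ms, slowest first."""
    slow = []
    for entry in har["log"]["entries"]:
        wait_ms = entry["timings"]["wait"]
        if wait_ms > threshold_ms:
            slow.append((entry["request"]["url"], wait_ms))
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Made-up sample in HAR shape; URLs and timings are illustrative only.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.com/home.aspx"}, "timings": {"wait": 120}},
            {"request": {"url": "https://example.com/widget-a.js"}, "timings": {"wait": 2400}},
            {"request": {"url": "https://example.com/widget-b.js"}, "timings": {"wait": 3000}},
            {"request": {"url": "https://example.com/logo.png"}, "timings": {"wait": 80}},
        ]
    }
}

for url, wait_ms in slow_waits(sample_har, threshold_ms=1000):
    print(f"{wait_ms / 1000:.1f}s wait: {url}")
```

Sorting the slowest objects first mirrors how one would scan a waterfall for the few requests that dominate the wait time.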

ThousandEyes also helped Scot’s team understand the impact of network latency on application response time. Looking at the Path Visualization from their four Enterprise Agents to the Microsoft data center in Dublin where SharePoint was hosted, Scot saw the importance of geographic distance: the India agent saw a latency of about 150ms, while the UK agent saw 22ms.

Figure 7: Latency also has an impact on application response time: the India agent saw a latency of about 150ms, while the UK agent saw 22ms.
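
A rough back-of-the-envelope model shows why this latency gap matters for page loads. The sketch below treats the quoted latency as a round-trip time and assumes a page needs a fixed number of back-to-back round trips. Both are simplifying assumptions, and the round-trip count of 40 is hypothetical rather than measured.

```python
def latency_cost_ms(rtt_ms: float, sequential_round_trips: int) -> float:
    """Time spent purely waiting on the network for back-to-back round trips."""
    if rtt_ms < 0 or sequential_round_trips < 0:
        raise ValueError("inputs must be non-negative")
    return rtt_ms * sequential_round_trips


ROUND_TRIPS = 40  # hypothetical count, not a measured value

india = latency_cost_ms(150, ROUND_TRIPS)  # latency quoted for the India agent
uk = latency_cost_ms(22, ROUND_TRIPS)      # latency quoted for the UK agent
print(
    f"India: {india / 1000:.2f}s, UK: {uk / 1000:.2f}s, "
    f"gap: {(india - uk) / 1000:.2f}s"
)
```

Under this simple model, every sequential round trip that a page requires costs the India agent about 128ms more than the UK agent, so a page with many dependent requests accumulates seconds of extra wait from distance alone.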

Looking Forward

After Scot and his team used Transaction tests to successfully identify the issues with SharePoint, they worked with Microsoft to redesign the homepage and decrease the bloat. They are also exploring using ExpressRoute to improve network performance, as well as local data centers to host SharePoint in more geographic locations.

Scot then mentioned, “Since this was a success, we began leveraging ThousandEyes to monitor other cloud-based services and applications like JDA and Ariba. As we move more and more to the cloud, these tools have been immensely helpful.” The team has also increased their deployment from 4 to over 25 Enterprise Agents around the world. In addition, “to remove the need for user interaction and for ease of deployment, we’ve been using Intel NUCs — installing them and sending them out to the user to plug into the network to instantly get an agent deployed at a site.” NUCs can be quickly sent to a site or branch office having network issues, and require no technical expertise to plug in for monitoring.

As Scot summarized, “The ThousandEyes team was able to really drill in and explain a lot of the details that our team knows but doesn’t have the time to get into.” Equipped with the right tools, he and his team overcame the perpetual challenge of “showing to the business that it’s not the network, it’s your application. ThousandEyes has been invaluable for us to show the importance that it’s not always the network.”
