AppDynamics Integration Video Tutorial Transcript
My name is Konr Ness, solutions architect with AppDynamics, and in this demonstration, I'm going to show you the new integration between AppDynamics and ThousandEyes and how with this native integration, we have extended the end-to-end visibility that we provide for your distributed cloud applications.
Now, this is a dashboard for a business-critical, revenue-generating insurance application that handles policy applications and customer payments being submitted.
In this view, we have context on the payments being submitted, the performance that our end users are experiencing, but also the performance of the third-party APIs and critical service dependencies that our application relies on to capture those payments.
Now there are some concerning things that we're seeing here in this dashboard.
Yellow indicates that the amount of payments submitted has dropped below normal baselines.
And in fact, earlier today, the payments stopped altogether.
Correlated with our drop in revenue, we see an increase in response time for the “create charge” business transaction.
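To make the idea of "dropping below normal baselines" concrete, here is a minimal sketch of one way such a check could work. It is not AppDynamics' baselining algorithm, which the demo does not describe; the function name, the two-standard-deviation threshold, and the payment counts are all assumptions for illustration.

```python
from statistics import mean, stdev


def below_baseline(history, current, num_stddevs=2.0):
    """Return True if `current` is below mean(history) - num_stddevs * stdev(history)."""
    baseline = mean(history)
    threshold = baseline - num_stddevs * stdev(history)
    return current < threshold


# Hypothetical payments-per-minute counts from a normal period.
normal_payments = [120, 118, 125, 122, 119, 121, 124, 120]

print(below_baseline(normal_payments, 121))  # a typical minute
print(below_baseline(normal_payments, 60))   # payments have dropped sharply
print(below_baseline(normal_payments, 0))    # payments stopped altogether
```

A simple mean-and-deviation threshold like this ignores daily or weekly seasonality, which a production baseline would likely need to account for.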
Let's take a look at how this app works and diagnose what's going on.
This is the flow map for our “create charge” business transaction.
It's the same one we saw in our dashboard that said we had a high response time and a drop in revenue.
This business transaction has multiple microservices that support it, like web tier, core services and payment services.
For this transaction to be successful, this backend microservice also relies on an external payments processing API represented by this gray cloud.
Now, sure enough, we can clearly see a sudden spike in response time.
This is how customers confirm payments for their service.
If customers aren't able to do this, that will cause a lot of unhappy customers and a spike in calls to our call center.
Drilling into one of these slow snapshots tells the story very clearly of what's going on.
This external web service call is taking longer than normal, and the payment service is erroring out with a remote call timeout, which is causing the upstream core services to fail entirely.
Now, clicking "Drill Down" brings us directly to the details of that failing HTTP request, including the call graph of the code that actually executed it.
Now this external web service call to our payments processor is timing out, and this is causing payments to not go through.
The most important thing to understand in a situation like this is isolating the problem domain.
Is this a problem with our app, a problem with the network, a problem with the public Internet or an issue with the external service provider, which in this case is our payment gateway?
Using this snapshot, we've already confirmed that the slowness is not coming from our app or our code.
It's something external to our app.
Now, typically at this point, the app team is absolved of the blame and the issue is punted over to the network team, and this back and forth results in wasteful finger-pointing and lost time while the end-user experience is being impacted.
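The reasoning in this step, attributing the slowness to either our own tiers or the external call, can be sketched as follows. The segment names mirror the flow map, but the timings and the tuple layout are hypothetical; this is not how AppDynamics represents a snapshot.

```python
def time_by_domain(segments):
    """Sum elapsed milliseconds per domain ("app" or "external")."""
    totals = {}
    for _name, domain, elapsed_ms in segments:
        totals[domain] = totals.get(domain, 0) + elapsed_ms
    return totals


def dominant_domain(segments):
    """Return the domain that accounts for the most elapsed time."""
    totals = time_by_domain(segments)
    return max(totals, key=totals.get)


# Hypothetical timings (name, domain, elapsed ms) for one slow snapshot.
slow_snapshot = [
    ("web tier", "app", 15),
    ("core services", "app", 25),
    ("payment services", "app", 40),
    ("payments processing API call", "external", 10000),
]

print(time_by_domain(slow_snapshot))
print(dominant_domain(slow_snapshot))
```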
But what if we could find out exactly where the external environment is impacting connectivity to the payments gateway? Returning to our dashboard, we could actually answer those questions very quickly with the combined visibility of AppDynamics and ThousandEyes.
For those of you not familiar with ThousandEyes, they're an Internet and cloud intelligence platform that provides visibility into how the Internet and external environments impact application performance.
Now we have a ThousandEyes Enterprise Agent running in our application environment, and that allows us to extend the visibility outside of our app environment and understand how that external web service request traverses the public Internet.
In fact, this graph here visually correlates the backend response time from the perspective of our app with the network latency when those transactions leave our data center.
Now, sure enough, at the same time that our revenue dipped and payments processing started to fail, we saw an increase in network latency from ThousandEyes.
Now, this confirms that it is indeed the network and not slowness with that third-party service.
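The visual correlation on that graph can also be expressed as a number. The sketch below computes a Pearson correlation coefficient between two time series; the sample values are made up for illustration and are not taken from the demo, and the dashboard widget itself only overlays the two series.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical samples taken at the same timestamps (milliseconds).
backend_response_ms = [210, 205, 215, 208, 950, 1020, 990, 1005]
network_latency_ms = [70, 68, 72, 71, 240, 255, 250, 248]

r = pearson(backend_response_ms, network_latency_ms)
print(f"correlation: {r:.3f}")
```

A value close to 1 means the two series rise and fall together, which is what the overlaid graph shows at the moment payments started failing.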
With this integration between AppDynamics and ThousandEyes, we help improve the handoff from AppOps to NetOps when troubleshooting by unifying visibility and providing a common operating language.
So we know there's an increase in network latency to our payment gateway, but what exactly is causing it?
Let's seamlessly navigate directly to the ThousandEyes test from this dashboard to continue diagnosing this problem.
In ThousandEyes, we have an HTTP server test making requests to the same URL that our backend payment service or our app depends on.
Now we're testing the payments API from five different vantage points, four of which are from points of presence across the globe, also known as Cloud Agents that are hosted by ThousandEyes.
And the fifth is our Enterprise Agent running in our application environment, so we can understand how transactions that leave our environment to external service endpoints traverse that public Internet.
Now, this network overview tab tells a very important story.
The increase in network latency that is occurring is actually being experienced from a few different locations, mostly in the US. However, it's impacting the app environment the most.
And this helps us narrow the problem down: rather than an issue with our data center network or our upstream ISP, it points to the Internet path to our payment provider.
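The comparison across vantage points can be sketched as a simple before-and-after check per agent. The agent labels, latency values, and the 50 ms threshold below are hypothetical; the demo only states that there are four Cloud Agents and one Enterprise Agent.

```python
def latency_increase(before_ms, after_ms):
    """Return {agent: after - before} for agents present in both samples."""
    return {
        agent: after_ms[agent] - before_ms[agent]
        for agent in before_ms
        if agent in after_ms
    }


def affected_agents(before_ms, after_ms, min_increase_ms=50):
    """List agents whose latency rose by at least min_increase_ms, worst first."""
    increases = latency_increase(before_ms, after_ms)
    affected = [a for a, delta in increases.items() if delta >= min_increase_ms]
    return sorted(affected, key=lambda a: increases[a], reverse=True)


# Hypothetical average latency per vantage point (milliseconds).
before = {
    "Enterprise Agent (app environment)": 70,
    "Cloud Agent (US West)": 65,
    "Cloud Agent (US East)": 10,
    "Cloud Agent (Japan)": 75,
    "Cloud Agent (England)": 170,
}
after = {
    "Enterprise Agent (app environment)": 250,
    "Cloud Agent (US West)": 180,
    "Cloud Agent (US East)": 130,
    "Cloud Agent (Japan)": 76,
    "Cloud Agent (England)": 172,
}

for agent in affected_agents(before, after):
    print(agent)
```

If only the Enterprise Agent were affected, the data center network or upstream ISP would be the likely suspect; several affected agents in one geography point further out on the path.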
Now, how does our connection to this payment provider traverse the Internet?
Well, let's look at path visualization to understand.
First, I'll change the time frame we're looking at to be before the problem started so we can understand what has changed.
Now, our app is running in a data center in the Pacific Northwest, and this path visualization shows how connections are traversing the public Internet. On the left is our Enterprise Agent.
On the right is the destination or the front door of the payment processor's APIs, and each of the lines and circles between them visualizes the path that traffic takes through the Internet to get to its destination.
Before the problems started happening, requests were traveling to the payment processor's US location, hosted in Ashburn, or Amazon's US-East Region.
And other Cloud Agents, like in Japan and England, were actually routing to the payment processor’s Singapore location.
Now, this is a classic GeoDNS implementation.
Based on where we're physically located, DNS is returning a different IP address of the service to provide the lowest latency.
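As a rough illustration of GeoDNS, the sketch below maps a client's region to a service location and returns that location's address. The mapping follows what the path visualization showed before the incident, but the table, the function name, and the IP addresses (taken from the RFC 5737 documentation ranges) are assumptions; the payment processor's real DNS policy is not known from the demo.

```python
# Hypothetical endpoint addresses from the RFC 5737 documentation ranges.
ENDPOINTS = {
    "ashburn": "192.0.2.10",
    "singapore": "198.51.100.10",
}

# Hypothetical region-to-site policy matching the pre-incident behavior.
REGION_TO_SITE = {
    "us": "ashburn",
    "japan": "singapore",
    "england": "singapore",
}

DEFAULT_SITE = "ashburn"  # assumed fallback for unmapped regions


def resolve(client_region):
    """Return (site, ip) that this toy GeoDNS policy hands to a client."""
    site = REGION_TO_SITE.get(client_region, DEFAULT_SITE)
    return site, ENDPOINTS[site]


for region in ("us", "japan", "england"):
    print(region, resolve(region))
```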
All good so far, but when the latency spikes up, let's see what happens.
Wait a minute.
Now it looks like our API requests from our app are being routed all the way to Singapore.
Now that conflicts with the purpose of GeoDNS.
Why would a service endpoint in Singapore serve requests coming from the US?
Clearly, something looks off here, but it explains the root cause of why we're suddenly seeing higher network latency and our application is timing out.
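The before-and-after comparison made in path visualization can be sketched as a diff of two hop lists. The hop names are invented for illustration, and this is not how ThousandEyes stores or compares paths; it only shows the reasoning of finding where the route diverges and whether the destination changed.

```python
def compare_paths(before, after):
    """Return (index of first differing hop or None, whether the destination changed)."""
    if not before or not after:
        raise ValueError("both paths must contain at least one hop")
    first_diff = None
    for i, (hop_before, hop_after) in enumerate(zip(before, after)):
        if hop_before != hop_after:
            first_diff = i
            break
    # Paths that match hop for hop but differ in length diverge where the shorter one ends.
    if first_diff is None and len(before) != len(after):
        first_diff = min(len(before), len(after))
    destination_changed = before[-1] != after[-1]
    return first_diff, destination_changed


# Hypothetical hop lists from the Enterprise Agent to the payments API.
path_before = [
    "enterprise-agent", "dc-gateway", "isp-hop-1",
    "transit-us", "payments-ashburn",
]
path_after = [
    "enterprise-agent", "dc-gateway", "isp-hop-1",
    "transit-pacific-1", "transit-pacific-2", "payments-singapore",
]

print(compare_paths(path_before, path_after))
```

In this made-up data the first hops inside our environment and ISP are unchanged, and the divergence begins beyond them, ending at a different destination.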
Now, this unique view in ThousandEyes contains all the information our application and network teams need to understand the problem.
Having this unified visibility can prevent the need for an all-hands-on-deck war room.
Instead, we create a ShareLink: an always-available, interactive view of this ThousandEyes data.
We send this ShareLink with all the context needed to describe the problem straight to the payments team, who can resolve this issue with the payment vendor.
Now, let me show you how we set up that dashboard in AppDynamics with both application and network context in just one view.
This unique view is thanks to a new integration between AppDynamics and ThousandEyes that we are announcing today.
Putting ThousandEyes metrics on this dashboard was as easy as adding your ThousandEyes credentials into AppDynamics.
And after that, when we select which data we want to show in the widget, not only do we have access to AppDynamics data, like applications and end user monitoring, databases or servers, but now we also have the ThousandEyes data source.
Select which metric you care about and which tests to pull from, and we immediately plot that data without having to set up any complex data integrations.
This makes it simple to extend the end-to-end visibility of your apps to add ThousandEyes network and Internet context to your applications.
Now, in summary, with AppDynamics and ThousandEyes, we provide complete visibility into your applications and networks, with correlated, real-time insights into both that allow you to take quick action to resolve incidents that affect customer experience.
Thanks for your time.