In this week’s episode, Archana and I are joined by a special guest, Christian Koch, who is the head of product, cloud and ecosystem at PacketFabric. Listen in as we cover the current state of global Internet health and dive into network outage numbers across ISPs, public cloud and collaboration platforms.
This week, we saw that the overall number of outages wasn’t particularly concerning, reflecting our “new normal,” but there were a few notable outages that had far-reaching impacts. In particular, an outage at TATA Communications and a fiber cut in CenturyLink’s Level 3 network had significant end-user impacts (which included a disruption to Merrill Lynch’s brokerage business), as did another outage that took down access to GitHub.
After going through these events, we spoke with Koch about a recent move by the FCC to limit China Telecom operations in the U.S., as well as a recently approved undersea cable between Los Angeles and Taiwan that’s shared between Google and other telecom providers. Finally, we looked at how Internet patterns are changing globally and the role of Internet Exchange Points (IXPs).
Give this week’s episode a watch or a listen in the embeds provided, grab our slides on Slideshare, and as always, feel free to read along with the transcript below. We’re also available on iTunes (Apple podcast), Spotify, and Stitcher, so be sure to subscribe and leave us a review on your platform of choice. Finally, don’t forget to leave a comment here or on Twitter, tagging @ThousandEyes and using the hashtag #TheInternetReport.
Catch up on past episodes of The Internet Report here.
Show Links:
- April 21 GitHub outage
- Christian Koch’s Foundations Email
- April 20 TATA outage
- April 21 CenturyLink Fiber Cut
- April 23 GitHub outage
- The Internet Report - Episode 5: Week of April 20 - April 26 on Slideshare
Follow Along with the Transcript
Angelique Medina:
Welcome to The Internet Report, where we recap all of the interesting things that happened in the previous week on the Internet. I'm Angelique Medina, and I'm joined by my co-host, Archana Kesavan.
Archana Kesavan:
Hello.
Angelique Medina:
This week, we're really excited to have Christian Koch on the show. He is the head of product, cloud, and ecosystem at PacketFabric. He also co-founded and is on the board of directors as president of NYNOG, which is the New York Network Operators Group. It's a non-profit organization and one of the goals they have is to connect network operators and technology professionals. They have a lot of interesting events, and they recently held one of their events, I think it was about a week or a couple of weeks ago. We're really excited to have him on the show. What we're going to start with today is just a very brief recap of some of the major outages that occurred. Then we'll talk a little bit about some of the other things that we're seeing.
Angelique Medina:
There were a couple of really interesting outages that happened, one of them was a TATA outage and the other was a Level 3 or CenturyLink outage. As you can see, there was a slight increase from the previous week. A couple of weeks ago it went down to 177 and then it went up to 282, and then last week it was at 313. It was in that 300 range that we were seeing towards the end of March.
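For readers following along with the slides, the week-over-week change in those outage counts can be worked out with a short Python sketch. The counts (177, 282, 313) are the ones quoted above; the helper name `pct_change` is purely illustrative and is not part of any ThousandEyes tooling.

```python
def pct_change(previous: int, current: int) -> float:
    """Return the percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


# Weekly outage counts as quoted in the episode, oldest first.
weekly_outages = [177, 282, 313]

# Pair each week with the following week and compute the change.
week_pairs = list(zip(weekly_outages, weekly_outages[1:]))
changes = [pct_change(prev, curr) for prev, curr in week_pairs]

for (prev, curr), change in zip(week_pairs, changes):
    print(f"{prev} -> {curr}: {change:+.1f}%")
```

On these figures, the jump from 177 to 282 is roughly a 59% increase, while the move from 282 to 313 is about 11%, which fits the description of a slight increase from the previous week.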
Angelique Medina:
Largely, the contributors to this were TATA and, again, Level 3 and CenturyLink who had some issues early in the week.
Angelique Medina:
Again, ISPs went back to that 250 number. But, overall, not really all that interesting.
Angelique Medina:
Cloud service providers are pretty low as usual and then the same with the collaboration app providers. They peaked, there was a huge spike, in late April, but since then it's come down, and it hasn't really gone up substantially since then.
Angelique Medina:
Overall, it looks like we're in this new normal. But there were some pretty significant effects from the CenturyLink outage. We had heard that from a number of folks. This was a fiber cut or multiple fiber cuts and that happened on Tuesday of last week. What was interesting about that was that we had ... I don't know if you have that link handy, Archana.
Archana Kesavan:
Hold on. Yeah. I'm just opening that up. There you go. Let me share my screen really quickly. All right. I think you guys are seeing this right now.
Angelique Medina:
Yeah. This was Level 3 where we saw the outage. This was just one example of something that lasted for a while that morning. It was something where we saw an impact not only on the west coast but also in the south as well. Reportedly, the fiber cut was in southern California. We had a number of folks who had mentioned that they were impacted as a result of this fiber cut.
Archana Kesavan:
Okay.
Angelique Medina:
That was interesting to see.
Archana Kesavan:
Yeah, we also learned in a couple of forums that there were probably a few different fiber cuts that were happening probably centralized around southern California but this was not necessarily a single one. There were multiple cuts that were prevailing around that were affecting the larger part of CenturyLink or Level 3 traffic.
Angelique Medina:
Right. But what's interesting is that even though the fiber cut was apparently in the southern California region, that can have pretty broad impact across, we saw, folks as far away as Atlanta and Raleigh, North Carolina where they had a problem. Definitely the cascading impact of something like this can be pretty broad.
Archana Kesavan:
Right. Also this particular outage lasted for quite a while. It started around 10:00 Eastern on Tuesday, but we were seeing effects of that go on through about 11:00, 11:10...
Angelique Medina:
Right.
Archana Kesavan:
It was a long-lasting outage compared to the others that we must have previously discussed.
Angelique Medina:
Right. Yeah, for sure. We saw that on Tuesday, but on Monday, so before that, there was a pretty big TATA outage that happened. That was one that impacted not just ... That was in the U.K., as well as, we saw some loss in Germany and France.
Archana Kesavan:
That was pretty broad, actually, Germany, France, London... Let me pull that up really quickly, as you're talking. Yeah.
Angelique Medina:
This is the TATA incident here. As you can see, there was a pretty significant number of interfaces that were impacted. We see some interfaces in London, France, and then Germany, so Frankfurt.
Archana Kesavan:
The number there is really an indication of how much of that infrastructure within that particular provider was affected. Right?
Angelique Medina:
That's right. Yeah. I mean...
Archana Kesavan:
One that's pretty big.
Angelique Medina:
More than 80 interfaces that were impacted.
Archana Kesavan:
Yes.
Angelique Medina:
That's a pretty big event. It didn't last very long, though, about 20 minutes, but it was pretty significant in terms of its scope.
Archana Kesavan:
Right. Right.
Angelique Medina:
And then ...
Archana Kesavan:
The fact that we heard about this in the U.S. was limited because of the timing when it occurred. It happened around midday in the U.K., around 11:00 AM in the U.K.
Angelique Medina:
Right.
Archana Kesavan:
So the U.S. didn't hear much about it. Just looking at the blast radius of the outage from the amount of infrastructure that was affected, this looked like a pretty significant one.
Angelique Medina:
Right. And then GitHub had kind of an off week. We saw that on Tuesday... And it was something that lasted like 90 minutes. So you'd go to the website and you're basically getting a server error. So it wasn't a network issue reaching their servers, and they self-host and they also have some instances that are hosted in AWS. But either way, we were seeing that we were getting errors connecting to their site and it was pretty long-lasting, as I mentioned. And it wasn't network related, so we can look at, for example, the path to their PoPs and see that effectively there's no loss here. And then this happened again on the... Was it the Wednesday?
Archana Kesavan:
It was, yeah.
Angelique Medina:
So Thursday, and again it was an issue where their site was periodically down, so it was like 15 minutes and then it would come back up and then you'd have an issue loading it. So not really clear what's going on there, but it wasn't network related. So they just seemed to have maybe just an off week for GitHub, but we haven't heard anything since then, so hopefully all things are clear.
Archana Kesavan:
And when we say network related, the path to GitHub's PoPs ...
Angelique Medina:
To their front door; to their web servers. Now, it's possible that there is a network issue connecting between their application tiers maybe or something on the backend. They're not front-ended by a CDN provider, which is interesting. So it wasn't an issue with any external provider. This was something again, regardless of where they were hosted, whether it was AWS or their own PoPs, they were all impacted the same way. So that was interesting. I think even our own users internally here were having problems that morning. I think that's actually how we became aware of it pretty early on.
Angelique Medina:
So those are the highlights of some of the major incidents that happened last week, but there's always other news that's just happening that maybe isn't outage related or performance-related that is something that maybe will have an impact at some point or maybe it's not something that an average consumer would notice, but it is noteworthy from a network operation standpoint. One of those, because we have Christian here and Christian has this great newsletter that he puts out every week where he compiles a lot of the interesting traffic as well as news that's happened in the previous week. And one of the things that you had in this week's newsletter was around this FCC recommendation for China Telecom to potentially cease operations within the United States. So maybe you can tell us a little bit about that.
Christian Koch:
Yeah, sure. So a couple of weeks ago the Department of Justice in the US recommended to the FCC that they revoke China Telecom's license to do business in the United States as it pertains to international traffic. And just this past week, the FCC took them up on the recommendation and has issued orders to a number of Chinese telecom companies and is ordering them to provide more data and proof that they are not controlled or influenced by the Chinese government.
Angelique Medina:
Yeah, that's really interesting. So we've seen, for example, and I think you had also mentioned earlier, that China Mobile is already kind of under that order, so to speak, so they don't operate within the United States. They had that same ruling by the FCC at some point. And then is China Unicom part of this new umbrella, or had they also previously had this order?
Christian Koch:
China Unicom was mentioned in the recent order by the FCC, yes.
Angelique Medina:
Yeah. And what's interesting about this is that what we've seen in some previous BGP incidents, so there was a route leak last year with the Swiss hosting company where they leaked hundreds of thousands or tens of thousands of routes, and this happened over the course of a few days. And because China Telecom, because of this leak, they ended up in the path for traffic. And specifically, what we were seeing was for traffic, that stuff that was destined for Facebook, that it was getting blackholed. So it was effectively getting dropped at the edge of China Telecom's network, so they were filtering that traffic out and this was happening in Europe and we've seen it elsewhere.
Angelique Medina:
For example, with the Google hijacking by the Nigerian Telecom. It was accidental, but it was again an instance in which China Telecom ended up in the path for Google, that China Telecom basically did the same thing. They filtered out traffic that was destined to Google. So these policies that they have in terms of which traffic they will transit or accept versus maybe dropping is something that appears to be implemented fairly broadly across their network. So it's not just contained to their infrastructure in China, it's also implemented outside in Europe. We've seen this in the US as well and other countries in which they have network infrastructure. So that was interesting.
Archana Kesavan:
Well, what is interesting for me there, Angelique, from what you mentioned, and Christian, what you were saying in terms of these restrictions that are going to be in place or the FCC is taking the advice and looking into it is that even if these providers are restricted from doing business in the US, there can be incidents where traffic involuntarily can go through their networks, right?
Angelique Medina:
Right.
Archana Kesavan:
I mean, because the Internet is so interconnected and sometimes these hijacks or route leaks, intentional or non-intentional, I'm not talking about the malicious nature or not. Irrespective of that, these providers can come into play sometimes and that cannot be stopped because of these policies in place.
Angelique Medina:
Well, I mean, that's interesting because, I mean, I wonder though, when these incidents have occurred, why would they accept routes to something that... an announcement for something that they filter out. So they don't necessarily have to accept routes from their peers or announcements from their peers. And they had, in those particular incidents, they did. So some of it is self-policing and then also their peers have some impact on this as well.
Archana Kesavan:
Right, right.
Angelique Medina:
It's just responsibility. Yeah, yeah. But then also, some other interesting things that you had brought up in your newsletter in the last couple of weeks was this cable between the West Coast of the United States to Taiwan, where initially this was a cable, a Google cable, that was going to be to Hong Kong. And then it changed to Taiwan.
Christian Koch:
Yeah. Yeah, that's correct. So it's a cable that has a number of members of a consortium. It's led by a company called PLCN. And what happened was the FCC kind of blocked it inadvertently and was looking for more information from the parties on the cable due to some information that they found out about business ties and things like that. And what happened was Google applied for a stay order in order to open up a segment of that cable. So the full cable runs from Los Angeles in the United States to Hong Kong and there are multiple branching units, which pull off that cable into Taiwan and the Philippines. So essentially what Google has done was ask the FCC for permission to be able to activate and operate that segment of the cable from Los Angeles to Taiwan.
Angelique Medina:
Interesting. But this is something that's not just a Google cable. I mean, this is a shared cable with Google and I think even China Telecom has a stake in it as well.
Christian Koch:
Correct. Correct. And Facebook has a stake in it as well. And they are looking to complete a segment of the cable that goes to the Philippines. And eventually, when that segment or branching unit is complete, they will look for the same sort of stay order that Google has requested and was granted, depending on how things turn out for the cable project as a whole.
Angelique Medina:
Interesting. And another thing that you also have been doing in addition to the newsletter is you've been compiling traffic statistics covering some period starting before, a lot of activity around COVID-19, and then through to even very recent weeks, maybe even to today. And so why don't you tell us a little bit about what you've seen kind of and maybe some of the more interesting things that you've noted just in looking at this, because this is global. You've compiled this across many, many different regions and cities and there's a lot of variation.
Christian Koch:
Yeah. Yeah. So it's interesting. I think the common theme is that yeah, traffic in 99% or 90% of areas of the world is increasing. Right? But if you look through the data, all traffic increases aren't created equal, which is what's interesting, right? So we came down to a couple of locations like Dusseldorf or Munich in Germany that obviously aren't seeing as large increases as Frankfurt, which is a global hub for interconnection. If we look at Palermo, Italy, which is a smaller interconnection hub where a number of subsea cables land, that traffic also isn't increasing as much.
Angelique Medina:
Yeah. So you mentioned Palermo, but then also I was looking at an exchange in Milan where we saw a pretty significant spike in traffic. So what do you think is accounting for the variation? Is it just that there's an increase in usage in that region versus maybe Palermo where that's more about, as you mentioned, kind of continental interconnection.
Christian Koch:
Yeah. So there's a number of things here, Angelique. When it comes down to it, there are a number of major hubs around the world where interconnection is very prevalent and there are a number of medium-sized and smaller hubs. But a few of the things that we can get out of this, and that really play into how much traffic a hub carries or how popular it is, are who's connected and peering on that Internet exchange, which is the viewer, the maturity of the market, the networks' interconnect and network strategy, and the cost of interconnecting, because there are some regions around the world where networks prefer to interconnect with other networks privately rather than over a public Internet exchange. So while you may see a large increase on one Internet exchange and not the other, that might just mean that most of the networks there are actually interconnecting privately.
Angelique Medina:
Interesting. And so would you say that that's, as you mentioned, kind of market maturity? So is it the more mature markets where private interconnection is more common, or the other way around?
Christian Koch:
Yeah. So mostly in the more mature markets is where you're going to see private interconnection as the prevalent or predominantly popular method or preferred method of interconnection. It's not always the case though because there are mature markets like the UK and other places in London where private interconnection may not be the preferred method because Internet exchanges have been around for a long time in some of those markets and have a completely different model than those in other regions or the US.
Angelique Medina:
Very interesting. Yeah. And then you also had mentioned Verizon had put out some numbers around the composition and the increase in traffic that they were seeing.
Christian Koch:
Yeah. So that's interesting. While I've collected all of this data and all of these graphs on Internet exchange points and a few ISPs, I haven't really looked too much at the mobile networks or wireless networks. And I was reading through the Verizon earnings report this weekend and I noticed that they gave some statistics on how their network is performing and what the increases are that they're seeing from the baseline of before COVID-19. And they said that they're seeing almost 1000% increase in collaboration tool traffic, which is video conferencing like Zoom that we're using right now. And almost 200% or a little bit over 200% increase in gaming traffic on their network. And this is Verizon wireless, so this is not even including the broadband actually.
Angelique Medina:
Wow. Yeah, that is really interesting. Yeah. I had seen some numbers from Comcast as well and they were showing the increase in basically upstream versus downstream utilization. And of course downstream had grown quite a bit more and even beyond upstream, which of course points to video conferencing and things like gaming. But they also said that they were seeing kind of a plateauing of the traffic around the first week of April. So it wasn't going down, but it wasn't at that same kind of slope. It wasn't going up. So is that something that you've noticed as well, maybe more broadly or just in kind of your line of sight that it's kind of starting to hit a bit kind of its new normal, if you will?
Christian Koch:
Yeah. As I was looking through the graphs and updating some graphs and data that I collect over the weekend, I did notice that things seem to be leveling off, maybe adjusting to the new normal. Maybe people are tired of watching Netflix and are playing Scrabble now. Who knows, right? I mean, there's a number of things that I think we can take away from this, but we have seen it or I have seen it level off a little bit.
Archana Kesavan:
I was listening to, because it was the Packet Pushers episode that had BT and Netflix on there discussing how traffic trends have been. And one of the interesting tidbits that BT shared was that the peak, the time it hits peak... Well, let's rephrase this. So the peak's gone up, like it's there throughout the day, but when people fall off, the peak is like much earlier in the night than it used to be before. So there's sort of a fatigue of watching TV or being on calls that hits. So where they used to see before COVID that the peak would go up to like 10:00 PM or 11:00 PM at night, that's dropping off much earlier and also it's starting much earlier in the morning. So people are like maybe sleeping early or stopping the streaming effects early and then they're starting their days early in the COVID era. I just thought it was an interesting trend there with respect to traffic levels.
Christian Koch:
It's very interesting.
Archana Kesavan:
Yeah. So Christian, I had a question with respect to the peering that you were talking about, the private and the public peering in mature markets versus not so, and also just because of the history in some markets. Is there any advantage to public versus private peering that you can think of?
Christian Koch:
Yeah, absolutely. And the number one thing is actually control, and you own a large amount of capacity. So if you were to take a fiber cable in a data center and say, maybe I want to peer privately with you. You've got a business and we have a lot of similarities and there's an advantage for us to do that. And we say, hey, we can connect this piece of fiber between us and then we can actually allocate a 10-gigabit interface on our network router. And we've got 10 gigabits of bandwidth between us, of capacity between us. But if you go to connect to an Internet exchange and you have that same 10-gigabit interface, now it's shared between everybody you want to peer with on there. So there is a big advantage of having that control and more capacity available to you.
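As a rough illustration of the capacity point above, here is a minimal Python sketch comparing a dedicated 10-gigabit private interconnect with a 10-gigabit exchange port shared across several peers. It assumes, purely for illustration, that the port's capacity is divided evenly among peers; real traffic is rarely that uniform, and the peer counts and the helper name `per_peer_capacity_gbps` are hypothetical.

```python
def per_peer_capacity_gbps(port_gbps: float, peer_count: int) -> float:
    """Capacity available per peer if a port is split evenly (an assumption)."""
    if peer_count < 1:
        raise ValueError("peer_count must be at least 1")
    return port_gbps / peer_count


PORT_GBPS = 10.0  # the 10-gigabit interface from the example above

# Private interconnect: the whole interface is dedicated to one peer.
private_gbps = per_peer_capacity_gbps(PORT_GBPS, 1)

# Exchange port: the same interface is shared with every peer on the exchange.
# The peer counts below are made-up values for illustration only.
shared_gbps = {n: per_peer_capacity_gbps(PORT_GBPS, n) for n in (5, 20, 50)}

print(f"private: {private_gbps:.1f} Gbps to a single peer")
for peers, gbps in shared_gbps.items():
    print(f"shared with {peers} peers: {gbps:.1f} Gbps per peer on average")
```

Under that even-split assumption, the same interface that gives one private peer the full 10 gigabits leaves about half a gigabit per peer once 20 networks share it.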
Angelique Medina:
So does that come with a price then? I'm guessing yes.
Christian Koch:
It obviously depends on how you run your network and how large your network is and do you have economies of scale and get certain things cheaper than others do. But it does have advantages in the long run and cross-connect costs vary by data centers and regions. So it's all highly relative.
Angelique Medina:
Interesting.
Archana Kesavan:
Makes sense.
Angelique Medina:
Anything else from what you've observed the last couple of weeks that you think would be interesting to share?
Christian Koch:
Well, at PacketFabric, while I can't reveal any specific metrics, what I can say is that we are seeing an increase in demand. And the great thing about that is that our business can serve that demand faster than most people in the market because that is what we were built for, right? We built our network to look like the cloud and that is turning up and turning down bandwidth as you need it and being able to use what you pay for and pay for what you use. So it's fairly interesting and our customers are in the driver's seat and we are seeing those increases. Maybe after things settle down a little bit we'll be able to share more data.
Angelique Medina:
Yeah, no, I think that would be very interesting to share more broadly with listeners. So I think that's probably all we have time for today if you want to take us out Archana.
Archana Kesavan:
Yeah, so Christian, one of the things, and kind of just closing this out, is we've had people ask us, customers, prospects, is there a forum? Is there a community where we can come together to understand what's happening? I know the Foundations newsletter is a great place to be involved, and for anybody listening here, if you want to subscribe to it, go to foundations.email and you can subscribe to Christian's newsletter. Some of the data that we discussed here, kind of we read about it in his newsletter. Are there any other places, forums that you would recommend people go to stay updated?
Christian Koch:
Yeah, of course it depends on where you're located, but look for the network operator groups like NANOG. There are a lot of more local ones in the US, and most countries have their own operator groups too. So if you're in Japan, for example, there's JANOG, and if you're in Hong Kong there's HKNOG. So just look to what network operator groups are in your community.
Archana Kesavan:
Got it. Cool. Thank you so much, Christian, for that. And if you guys, as always, are interested in following us, sign up for The Internet Report. Angelique, if you want to just bring up the last slide in there. You can email us at internetreport@thousandeyes.com if you want to get one of our cool t-shirts, and again, what we do at ThousandEyes is take a deeper dive into outages that are happening. So if that's of interest to you, follow us on blog.thousandeyes.com as well. All right, with that, we'll close out today. We'll see you guys next week.