Customers of O2 and Be have had a difficult few days, with packet loss affecting many services. Web browsing can generally function with packet loss present, and protocols such as POP3 and SMTP are usually more tolerant of it, but online gaming becomes almost impossible.
Customers are becoming increasingly frustrated, as it seems multiple fixes to the problem have been attempted but, as a status update today shows, the fixes are not always working.
The problem also does not affect all destinations on the internet, or all users of the services. This has resulted in Be support staff having to get customers to collate ping and tracert results to gain an understanding of where the problem lies. Our own Broadband Quality Monitor provides some insight into the issue.
The graphs show the variability of the experience users will receive, with almost no packet loss (red area on graphs) for some, and significant levels for others. The comb-tooth effect with regularly spaced spikes is thought to be due to how the standard Be modems handle ICMP traffic. On the badly affected connection the packet loss was still present at off-peak times, and the connection did not appear to be busy with downloads or uploads, which suggests a network problem. The three BQM graphs on the right show what you should normally see on a connection which is not suffering packet loss.
A more traditional trace route shows the problem clearly too, with each * representing a dropped packet. The non-response from the final server is most likely down to guardian.co.uk not responding to ICMP requests at all, which is an unfriendly but common practice.
1 <1 ms <1 ms <1 ms 192.168.1.254
2 * * * Request timed out.
3 8 ms 22 ms 8 ms 10.1.3.178
4 * * 12 ms Xe7-1-0-0-grtlontl3.red.telefonica-wholesale.net .7.16.84.in-addr.arpa [184.108.40.206]
5 * 72 ms * Xe2-0-0-0-grtmadno1.red.telefonica-wholesale.net [220.127.116.11]
6 41 ms 44 ms 41 ms Level3-1-0-0-grtmadno1.red.telefonica-wholesale.net [18.104.22.168]
7 * 42 ms * ae-0-11.bar1.Madrid2.Level3.net [22.214.171.124]
8 42 ms 50 ms 49 ms ae-5-5.ebr1.Paris1.Level3.net [126.96.36.199]
9 41 ms 42 ms * ae-45-45.ebr1.London1.Level3.net [188.8.131.52]
10 53 ms * * ae-58-113.csw1.London1.Level3.net [184.108.40.206]
11 54 ms 48 ms 110 ms ae-15-51.car5.London1.Level3.net [220.127.116.11]
12 209 ms * 50 ms GUARDIAN-UN.car5.London1.Level3.net [18.104.22.168]
13 * 54 ms * 22.214.171.124
14 59 ms 53 ms 76 ms 126.96.36.199
15 63 ms 53 ms * 188.8.131.52
16 * * * Request timed out.
17 * * * Request timed out.
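For readers collating their own tracert results, the count of dropped probes per hop can be pulled out of output like the above with a short script. The following is a minimal sketch in Python, assuming the Windows tracert layout shown here (hop number, three probes, then the host); the function names parse_hop and loss_by_hop are illustrative and not part of any tool. Bear in mind that loss at an intermediate hop which does not carry through to later hops can simply reflect a router giving low priority to ICMP replies.

```python
import re

# One probe result is either "*" (no reply) or a time such as "8 ms" / "<1 ms".
PROBE = re.compile(r"\s*(\*|<?\d+\s*ms)")
HOP_NUMBER = re.compile(r"\s*(\d+)\s")


def parse_hop(line):
    """Return (hop_number, lost_probes, host) for a tracert line, else None.

    Assumes three probes per hop, as in the default Windows tracert output.
    """
    head = HOP_NUMBER.match(line)
    if not head:
        return None
    pos = head.end()
    lost = 0
    for _ in range(3):
        probe = PROBE.match(line, pos)
        if not probe:
            return None  # not a hop line in the expected layout
        if probe.group(1) == "*":
            lost += 1
        pos = probe.end()
    host = line[pos:].strip()
    return int(head.group(1)), lost, host


def loss_by_hop(trace_text):
    """Parse every recognisable hop line in a block of tracert output."""
    hops = []
    for line in trace_text.splitlines():
        parsed = parse_hop(line)
        if parsed:
            hops.append(parsed)
    return hops


if __name__ == "__main__":
    sample = (
        "1 <1 ms <1 ms <1 ms 192.168.1.254\n"
        "2 * * * Request timed out.\n"
        "3 8 ms 22 ms 8 ms 10.1.3.178\n"
    )
    for hop, lost, host in loss_by_hop(sample):
        print(f"hop {hop}: {lost}/3 probes lost ({host})")
```

Running several traces at different times of day and comparing the per-hop counts gives a rough picture of where loss first appears and whether it persists to the destination.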
A new network was announced in 2011, with construction taking place in 2012, but alas it seems the new network cannot come quickly enough. Be, which started as one of the first ADSL2+ providers in the UK, has built a loyal following over the years, and while many accept the occasional evening of disruption, when problems go on for a period of time people may start voting with their wallets and move to other providers. The assumption at this time is that this is not just an issue with the volumes of traffic, but rather kit actually failing, since if it were traffic volumes then emergency measures such as traffic management could reduce packet loss to an acceptable level.
These problems also show the level of investment that is required to keep a network running smoothly 24/7. There is one large LLU provider that provides redundancy and capacity to the level that a major node can vanish and the remaining network absorbs the traffic without increasing contention significantly. This does not rule out congestion issues at a local exchange, but does help to avoid the widespread issues that Be/O2 appear to have currently.
We will keep an eye on our Broadband Quality Monitoring system, as once a final fix is in place, the reduction in packet loss should be obvious.