On November 9, 1965, a relay tripped on a transmission line at the hydro-electric plant in Queenston, Ontario, near Niagara Falls, setting in motion a 13-hour power blackout that disrupted electric service for more than 30,000,000 people in Ontario, Connecticut, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont. Excess power from that first line migrated to other power lines, which quickly became overloaded, causing their own relays to trip. In a matter of just 10 minutes, the blackout propagated across more than 200,000 square kilometres.
Most telephones (land lines at the time) kept working thanks to emergency generators that were a standard part of central offices.
Following the incident, new monitoring equipment, procedures and systems were introduced, but that didn’t prevent another large-scale power grid failure 38 years later, in August 2003, which knocked out service for 55,000,000 people in Ontario and 8 US states.
The root cause of the 2003 blackout was determined to be a bug in the alarm system software at FirstEnergy in Akron, Ohio. Because of that bug, operators were unaware of the need to redistribute power away from overloaded transmission lines. What should have been at most a manageable local blackout instead led to another collapse of much of the Northeast electric grid.
Those stories may help put last Friday’s Rogers outage into perspective.
We need to be realistic about what happened, and how we can mitigate the consequences of similar network failures. Friday’s network event was unprecedented. In over 40 years of involvement in North American telecom networks, I cannot recall an outage as broad in scale (nation-wide) and scope (spanning mobile and fixed networks, voice and data).
Still, most Canadians did not lose their communications services.
Rogers does not operate a monopoly for any of its services. That should have been self-evident to the people who were tweeting about so-called “CRTC monopolies”. Canada didn’t experience a total communications blackout at any point while Rogers was restoring service.
Access to 9-1-1 services should have kept running. Wireless devices supported by Canadian telephone companies automatically scan for a different network to complete 9-1-1 calls, if the native network is not available. If that didn’t work, device suppliers and telecom carriers need to investigate why.
Some major customer network managers may need to re-examine their communications architectures to ensure sufficient carrier diversity. Some found poetic justice in the CRTC’s tweet that its phone lines were “affected by the Rogers network outage.”
Consumers may decide to re-evaluate the value proposition of bundling, perhaps choosing to pay a little more in order to separate their home connectivity from their mobile service provider.
But let’s be clear about the overall state of Canada’s telecom competition policy.
If anything, last week’s network failure should serve to reaffirm Canada’s policy promoting facilities-based competition. Customers served by alternate facilities-based providers kept operating. A review of the world’s LTE deployments shows that there are 10 LTE networks operating in Canada, compared to 9 in the US and 3 or 4 in most European countries (Russia has 9; Sweden has 6; Denmark has 5).
Wholesale-based service providers did not add any measure of network survivability. While one wholesale provider boasted that its services were 50% down, in reality, half of its customers were 100% out of service. The other half were running on a different facilities-based provider.
Canadians benefit from robust competition among facilities-based carriers, and policies that encourage investment in diverse infrastructures.
Some used the network outage to advocate for structural separation, or for a government-run telecom access network. I can’t imagine how anyone would think that the people responsible for Canada’s airport fiascos, passport backlogs, development of government payroll systems, or delivery of clean drinking water could be entrusted with our telecommunications infrastructure.
The baby boom widely reported nine months after the 1965 blackout, later shown to be a myth, is another important lesson to apply in the wake of last week’s network event. Much misinformation will continue to circulate while technology professionals determine the root causes of the network outage and develop processes to try to avoid similar events in the future.
In the fullness of time, we will understand what caused the networks to fail. Industry-wide, carriers will undertake measures to avoid similar events and mitigate the impacts when future outages inevitably occur.