This past week, an airline company with a reputation for punctuality, and with one of the best reliability records in the industry, was left dumbfounded as over 1,000 flights worldwide were grounded. No, we’re not talking about Southwest Airlines. This time, it’s Delta.

As with Southwest, the cause is IT-related: a power control module failed. Also as with Southwest, backup systems were supposed to kick in and, familiarly, they did not. This created a cascading effect as multiple systems failed in succession. Reservation systems, boarding pass creation, crew and gate assignments… they all went down.
Delta’s CEO, Ed Bastian, was perhaps the most exasperated of all, explaining that “hundreds of millions of dollars” were spent on IT upgrades and backups “to prevent what happened yesterday from occurring.” Mr. Bastian lamented, “This is not who we are,” which may be true but does little to assuage the frustration felt by tens of thousands of stranded passengers in the US, Japan, Italy, and the UK.

In today’s world, air travel is a highly choreographed affair: flights land in faraway destinations, flight crews disembark, rest, and are assigned to subsequent flights. The smooth transportation of international travelers depends on many moving parts. When that dance is interrupted, chaos ensues. A single problem, like Delta’s failed power control module, creates a domino effect of mammoth proportions. At the heart of all this is not the logistics, however, but the passengers who are inconvenienced. Widespread anger, frustration, and helplessness all have a huge impact on future travel decisions and the corporate bottom line.

What’s Your Excuse, Delta?

In the case of Southwest Airlines’ service stoppage, a router was named as the culprit. But, as our CEO Scott Restivo explained, it was about more than a mere piece of computer hardware. A look at the bigger picture reveals a company whose employee unions contend that stock performance takes precedence over IT infrastructure investment, to the detriment of both passengers and the company itself.

In the case of Delta, there seems to be a clear incongruence between spending “hundreds of millions of dollars” on IT infrastructure upgrades and a power control module failure that brought the company to a halt for multiple days. Obviously something is missing in this equation, but what is it?

Aviation analysts have been largely critical of the industry’s mega-mergers and overall ability to keep up with technology, stating that “the meltdown … raises questions about whether a recent wave of four U.S. airline mergers that created four large carriers controlling 85% of domestic capacity has built companies too large and too reliant on IT systems that date from the 1990s.”

Process Inefficiency: Different Sector, Same Story

At this stage we can only speculate as to why Delta’s systems failed so spectacularly. Whether it’s outdated equipment or improperly merged systems, ultimately the real problem goes much deeper: process inefficiency.

In our whitepaper, ITIL Lite: Service Management for SMBs, we discuss how an unstructured approach to IT causes process inefficiency. Its characteristics include reactive fire-fighting policies, informal processes, in-house development, and isolated silos of knowledge. Even though our whitepaper is directed at small to mid-sized businesses, this approach to IT is also present in large corporations. Inefficient companies, whether large or small, can’t see the forest for the trees: they are wholly reactive (replace a part when it’s broken) instead of proactive (find issues before they become problems).

When it comes to colossal IT failures, it’s easy to be distracted by glib headlines about faulty routers and worn-out power control modules. These explanations are simple, digestible, and nicely wrapped for public consumption: “Oh, it was just a faulty router. Just replace it and we’re good to go.”

The real reasons why complex systems fail, however, can only be found by digging deeper: When was the part last serviced? Why wasn’t it replaced before it broke? Which IT employees submitted service requests? And, to cut to the chase, what process was lacking? Why wasn’t a comprehensive process in place that was capable of discovering potential problems?

Embracing the ITSM Methodology

Here at Crow Canyon we have designed our solutions with a strong focus on ITSM and ITIL. These are well-established best-practice methodologies designed to foster collaboration, communication, analytics, and knowledge sharing. We embrace the basic premises of IT Service Management: that IT should be process-centric, proactive, accountable, scalable, and customer-oriented. These premises put the onus on IT management to implement a process-based approach, and our products are designed to facilitate those processes with robust tracking, reporting, asset management, service request, and analytics functionality.

An organization of any size can benefit from embracing an ITSM approach: cost savings, smooth operations, quick recovery from failures, greater overall efficiency, and better-served employees and customers. Both Delta and Southwest, as well as their many passengers, are now well aware of what happens when IT is given short shrift or too many outdated systems conflict with each other. Every company would be smart to adopt the ITSM/ITIL approach, with the right procedures and policies in place and the software to implement them, before IT failures impact the bottom line.