What just happened to Southwest Airlines? A cautionary tale about underfunding key IT technology

Neil Raden, January 3, 2023
Summary:
By now, the world knows about Southwest's massive holiday logistical meltdown, which stranded millions of passengers. Southwest blamed the system-wide breakdown on winter weather, but that doesn't hold up. A deeper look at Southwest's predicament surfaces harsh lessons on the problem of technical debt.

2022 was the year of glowing reports of organizations applying digital technology, especially cloud computing, AI, distributed data architecture and analytics, to drive "digital transformation."

This is a cautionary tale of a legendary company that starved a core IT system of investment while draining cash reserves for stock buy-backs that rewarded executives with seven-figure compensation.

As a result, it ruined the holiday season for perhaps a million stranded passengers and may have permanently damaged the company.

The near-collapse of Southwest Airlines service, which began when winter weather disrupted air travel on Thursday, December 22, 2022, exposed troubling problems at the airline: obsolete technology, tone-deaf executives and a business model that couldn't scale.

The New York Times article What Caused the Chaos at Southwest?, citing industry tracker FlightAware, reported the extent of the debacle:

Southwest Airlines canceled roughly 15,750 flights in the seven days from the 22nd to the 28th, at a rate of 17%, 30%, 35%, 33%, 66%, 72% and 62%, respectively. In comparison, Delta Air Lines, American Airlines and United Airlines each canceled fewer than 40 flights by the 28th. Delta had the fewest with only 15 cancellations. By the 28th, about 87% of all canceled flights in the US were from Southwest alone. More than 2,500 flights, or 62 percent of its planned flights on the 28th, had been canceled. Southwest said in a statement on Wednesday that it planned to fly one-third of its scheduled flights for the next several days as it tried to return to normal operations, meaning it would continue to cancel close to 2,500 flights a day. Some passengers, unable to rebook Southwest flights, rented cars or spent hundreds of dollars buying tickets on other airlines.

Industry experts expressed surprise that Southwest could fail so spectacularly. CBS News Travel Editor Peter Greenberg joined CBS 2 Streaming Anchor Brad Edwards to talk about the mass of Southwest Airlines flight cancellations. Greenberg said:

Of all the airlines that could melt down, I was really surprised it would be Southwest - because think about this. They only fly one type of equipment. They're known for getting their planes in and out in 20 to 25 minutes. They cross-train their staff - that means the guy who does your bag can also push the plane out. They do a great job of turning their planes around – except when what happens? Their communication system breaks down - and that's what happened here on a system-wide basis.

He shouldn't have been surprised. This wasn't the first time. Signs of impending problems, similar but less extreme, occurred in 2021. The problem was more than just the weather. The vast snowstorm exposed other vulnerabilities in Southwest's network.

Transportation Secretary Pete Buttigieg didn't mince words, placing the blame squarely on management:

While weather can disrupt flight schedules, the thousands of cancellations by Southwest in recent days have not been because of the weather. Other airlines that experienced weather related cancellations and delays due to the winter storm recovered relatively quickly, unlike Southwest. Yesterday, (12/28) Southwest canceled 59 percent of its flights, while other major airlines canceled 3 percent. As Southwest acknowledges, the cancellations and significant delays at least since December 24 are due to circumstances within the airline's control.

I recognize that Southwest's employees, from customer service agents to ground staff to flight crews, are working extremely hard, under trying circumstances, to help the airline return to normalcy. These frontline employees are not to blame for mistakes at the leadership level. Inadequate computer systems made it difficult to shift crews to where they were needed most. In addition, Southwest lacks agreements with other airlines and could not rebook passengers on competitors' flights, forcing many people to wait days until Southwest clears its backlog.

In another New York Times report on the situation, we learn that:

In contrast to Southwest's "point to point" model, most airlines use a "hub-and-spoke" system, in which planes typically return to a hub airport after flying out to other cities. Southwest's route model often lets passengers fly directly from smaller cities and regions without stopping at a central hub like Denver or New York. Point-to-point flights cut travel times by eliminating the intermediate stop — typically a big advantage for travelers who are not flying from major metro areas. Inadequate computer systems made it difficult to shift crews to where they were needed most. In addition, Southwest lacks agreements with other airlines and could not rebook passengers on competitors' flights, forcing many people to wait days until Southwest clears its backlog.

In weather emergencies, hub-and-spoke airlines can quickly relocate equipment and crews to the hubs serving affected routes and restart their schedules. For Southwest, that option isn't available: disruptions on some routes leave crews and aircraft out of position across the network, which makes it hard to resume normal operations.

A complicating factor is that Southwest's operations give it significant advantages, as Peter Greenberg noted above. However, its point-to-point model is far more likely to collapse under severe disruption. A standard formula, n(n − 1)/2, gives the number of possible direct routes among n point-to-point cities (destinations). An airline that connected 100 destinations directly to one another would need 4,950 routes to cover every pair in the network.
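To make the growth in routes concrete, here is a minimal sketch of the formula above. The function name and the sample network sizes are illustrative assumptions, not Southwest's actual figures; the cross-check simply enumerates every unordered pair of placeholder cities.

```python
from itertools import combinations


def direct_route_count(n: int) -> int:
    """Number of unordered city pairs among n destinations: n(n - 1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Cross-check the formula by listing every pair of 100 placeholder cities.
cities = [f"city_{i}" for i in range(100)]
assert direct_route_count(100) == len(list(combinations(cities, 2)))

# Illustrative network sizes (assumed, not taken from any airline's schedule).
for n in (10, 50, 100):
    print(n, direct_route_count(n))
# prints:
# 10 45
# 50 1225
# 100 4950
```

The count grows roughly with the square of the number of cities, which is why adding destinations to a point-to-point network adds scheduling complexity much faster than it adds airports.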

Southwest has roughly 6,000 pilots and approximately 10,000 cabin crew, and each flight needs two pilots and three or four flight attendants. Unless these personnel, along with the equipment, ground crew and gate attendants, are in their scheduled places at the scheduled time, Southwest's tight operations can quickly fall apart: if any of those resources is out of position or off-shift, FAA regulations prevent the plane from flying passengers.

Airline scheduling is a complex problem, constrained by union rules, federal regulations and airline policies that govern how crews and pilots are assigned to flights. Southwest's system, however, couldn't track where its crew members and pilots were after so many flights were canceled.

Rescheduling for Southwest involves further complexity. For example, every flight must have a captain, and preferably only one, because putting two captains on the same flight may leave another flight without one. One of the flight attendants must be designated as the Purser (supervisor). Though not strictly required, it is preferred that at least one of the pilots be familiar with the destination airport.
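The rules described above can be sketched as a small validity check for one flight's crew. This is a hypothetical illustration only: the CrewMember type, the role names, the check_crew function and the choice of which rules count as hard requirements versus preferences are all assumptions drawn from the description in this article, not from Southwest's actual software or from any regulation text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrewMember:
    """Hypothetical record for one crew member (illustrative fields only)."""
    name: str
    role: str  # "captain", "first_officer" or "flight_attendant"
    is_purser: bool = False
    familiar_airports: frozenset = frozenset()


def check_crew(crew, destination):
    """Return (hard_violations, soft_warnings) for one flight's crew."""
    hard, soft = [], []
    captains = [c for c in crew if c.role == "captain"]
    pilots = [c for c in crew if c.role in ("captain", "first_officer")]
    attendants = [c for c in crew if c.role == "flight_attendant"]

    # Every flight must have a captain; a second one is allowed but wasteful.
    if not captains:
        hard.append("no captain")
    elif len(captains) > 1:
        soft.append("more than one captain")
    # Two pilots and three or four attendants, per the figures in the article.
    if len(pilots) != 2:
        hard.append("needs exactly two pilots")
    if not 3 <= len(attendants) <= 4:
        hard.append("needs three or four flight attendants")
    # One flight attendant must be the designated purser.
    if not any(a.is_purser for a in attendants):
        hard.append("no purser")
    # Preference only: a pilot who knows the destination airport.
    if not any(destination in p.familiar_airports for p in pilots):
        soft.append("no pilot familiar with destination")
    return hard, soft
```

Even this toy version shows why rescheduling is hard: every swap made to fix one flight has to be re-checked against the same rules on the flight the crew member was taken from.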

As CNN reported in Why Southwest Is Still Melting Down, Southwest Chief Operating Officer Andrew Watterson commented:

Our technology could not handle the process of matching up those crew members with the aircraft. Southwest ended up with planes that were ready to take off with available crew, but the company's scheduling software wasn't able to match them quickly and accurately.
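The matching problem described in the quote can be illustrated with a deliberately simplified sketch: pair each ready aircraft with an available crew that is at the same airport. Everything here is an assumption for illustration, including the function name, the tail numbers, crew identifiers and airport codes. A real crew scheduling system must also respect duty-time limits, union rules and qualifications, none of which are modeled.

```python
from collections import defaultdict, deque


def match_crews_to_aircraft(aircraft_at, crew_at):
    """Greedily pair ready aircraft with available crews at the same airport.

    aircraft_at: dict of tail number -> airport code (hypothetical data)
    crew_at:     dict of crew id -> airport code (hypothetical data)
    Returns (pairs, stranded_aircraft, idle_crews).
    """
    # Index available crews by the airport where they currently are.
    crews_by_airport = defaultdict(deque)
    for crew_id, airport in sorted(crew_at.items()):
        crews_by_airport[airport].append(crew_id)

    pairs, stranded = [], []
    for tail, airport in sorted(aircraft_at.items()):
        if crews_by_airport[airport]:
            pairs.append((tail, crews_by_airport[airport].popleft(), airport))
        else:
            stranded.append(tail)  # ready aircraft, but no crew on site

    # Crews left over are in the wrong place for every waiting aircraft.
    idle = sorted(c for queue in crews_by_airport.values() for c in queue)
    return pairs, stranded, idle


aircraft = {"N101": "DEN", "N102": "DEN", "N103": "MDW"}
crews = {"crew-A": "DEN", "crew-B": "BWI"}
print(match_crews_to_aircraft(aircraft, crews))
# prints: ([('N101', 'crew-A', 'DEN')], ['N102', 'N103'], ['crew-B'])
```

The sketch depends entirely on knowing where each crew actually is. Once that location data is stale, as it reportedly was after the wave of cancellations, even a correct matching routine produces pairings that cannot be flown.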

My take

Management cut corners on modernizing Southwest's routing system, which was developed twenty years ago and patched repeatedly, even as the airline's expanding operations added complexity. Twenty-year-old software was surely designed with a "managing from scarcity" mentality toward computing resources. The scheduling optimization software is so old that it is likely incapable of handling such an enormous disruption. Older, patched software accumulates “technical debt,” the gap between what the system can do and what current requirements and modern technology demand. Airlines, which were early adopters of automated optimization models, are particularly susceptible to technical debt, but Southwest’s current predicament is extreme.

In March, in its open letter to the company, the flight attendants union even placed updating the creaking scheduling technology above its demands for increased pay. No action was taken - and Southwest now finds itself in a very difficult spot, with a wall of technical debt to somehow overcome.
