SpaceX’s Philosophy: Reliability Through Continual Upgrades

Remains of a Falcon 9 rocket fall to Earth.

By Douglas Messier
Managing Editor

To succeed in the launch business, you need to be very, very good and more than a little bit lucky. Eventually, there comes a day when you are neither.

That is what happened to SpaceX on June 28. A string of 18 successful Falcon 9 launches was snapped as the company’s latest rocket broke up in the clear blue skies over the Atlantic Ocean. A Dragon supply ship headed for the International Space Station was lost, SpaceX’s crowded manifest was thrown into confusion, and the company’s reputation for reliability was shattered.

It was quite a nasty little shock. But, in another sense, the timing was the only real surprise. One might have expected an accident to occur much earlier in Falcon 9’s history as SpaceX worked out bugs in the launch vehicle. But failures don’t follow any set schedule. They arrive when they arrive, often with little advance warning.

Even with 19 flights, Falcon 9 has not flown enough times for anyone to gain a real understanding of how the launch vehicle will perform over the long run. You need scores of flights to get a really good handle on reliability. This process is even more difficult in the case of SpaceX, which doesn’t operate like most launch providers.

Hardware as Software

The launch industry tends to be very conservative. A launch provider will build a rocket, test it, and make changes as necessary based on those results. The company matures the design, and then puts it on an assembly line staffed with workers who are skilled at doing the exact same things over and over again, day in and day out, for years on end.

Changes are made very carefully and only after thorough testing. Experience has shown that while upgrades can improve a rocket’s performance, they also can cause problems. Given the high cost of launches, there are not a lot of opportunities to fully test out upgrades.

By contrast, SpaceX has treated Falcon 9 as something akin to software: a system designed to be regularly upgraded as engineers learn from flight experience. The original version of Falcon 9 flew five times before it was retired in favor of the Falcon 9 v.1.1, which included higher-performance engines, longer fuel tanks, landing legs for first stage recovery, and a host of other significant upgrades.

Falcon 9 lifts off from Vandenberg Air Force Base. (Credit: SpaceX)

SpaceX boasted that the Falcon 9 v.1.1 was virtually a new launch vehicle. And, to some degree, it was. The larger rocket could launch communications and military satellites that need to go higher than low Earth orbit (LEO). The earlier version of the rocket performed all its missions in LEO.

Falcon 9 v.1.1 flew successfully 13 times before failing on its 14th flight, giving it a reliability of 92.86 percent. If you include the five launches of the retired version of the rocket, the reliability increases to 94.74 percent. But again, even 19 flights is not a very large number.
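The arithmetic behind those percentages, and the reason 19 flights is a thin sample, can be sketched in a few lines of Python. This is a rough illustration only. The function names are made up for this example, and the Wilson score interval is simply one common way to put error bars on a success rate, not a method attributed to SpaceX or the Air Force.

```python
import math


def success_rate(successes: int, flights: int) -> float:
    """Plain success fraction: successful flights divided by total flights."""
    return successes / flights


def wilson_interval(successes: int, flights: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a success fraction.

    Illustrative assumption: each flight is treated as an independent
    trial with the same chance of success, which an evolving rocket
    does not strictly satisfy.
    """
    p = successes / flights
    z2 = z * z
    denom = 1 + z2 / flights
    center = (p + z2 / (2 * flights)) / denom
    half = z * math.sqrt(p * (1 - p) / flights + z2 / (4 * flights ** 2)) / denom
    return center - half, center + half


# Falcon 9 v.1.1 alone: 13 successes in 14 flights.
print(round(success_rate(13, 14) * 100, 2))  # 92.86

# All Falcon 9 flights: 18 successes in 19 flights.
print(round(success_rate(18, 19) * 100, 2))  # 94.74

# How wide is the uncertainty on that 18-for-19 record?
low, high = wilson_interval(18, 19)
print(f"{low:.1%} to {high:.1%}")
```

Under those assumptions, an 18-for-19 record is consistent with a long-run reliability anywhere from the mid-70s to about 99 percent, which is why scores of flights are needed before the number means much.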

SpaceX even made changes to the launch vehicle during Falcon 9 v.1.1’s brief flight history. Last year, for example, the company decided to bring the production of helium bottles used to pressurize the liquid oxygen (LOX) tanks in house. Previously, the bottles had been supplied by an outside contractor.

It’s not clear why the change was made. Perhaps there were problems with the supplier’s bottles or prices, or SpaceX simply thought it could do a better job at a lower cost. However, it fits a pattern. The company likes to build as much of its rockets in house as possible. SpaceX also is known in the industry to use supplier relationships as a way of identifying and hiring away a company’s best personnel.

Whatever the reason for the change, SpaceX ended up experiencing helium leaks that caused launch delays in 2014 [Orbcomm’s Elusive Falcon 9 Launch Date TBD] and 2015 [SpaceX Puts Off Next Falcon 9 Launch]. The helium problems were one of the reasons SpaceX conducted only six launches in 2014, far short of the 12 the company had hoped to accomplish.

It must be emphasized that the root cause of the “overpressure event” in the upper stage LOX tank that resulted in the Falcon 9’s destruction is yet to be determined. The accident could well, in fact, have no connection to the helium system used to pressurize the tank. It could be unrelated to the decision to bring production in house. The point here is that changes to a launch vehicle can cause unexpected problems.

SpaceX CEO Elon Musk has said the cause of the accident appears to be complex; engineers have spent a lot of time trying to understand exactly what went wrong.

“Obviously, this is a huge blow to SpaceX, and we take these missions incredibly seriously,” Musk said on Tuesday. “Everyone that can engage in the investigation at SpaceX is very, very focused on that. In this case, the data does seem to be quite difficult to interpret. Whatever happened is clearly not a simple, straightforward thing, so we want to spend as much time as possible just reviewing the data.”

Musk said via Twitter that the company will have preliminary results of its investigation by the end of this week. SpaceX will brief the Federal Aviation Administration and key customers before posting the conclusions on its website, he said.

Another Falcon 9 Upgrade

As the investigation continues, SpaceX engineers are working on yet another upgrade to the launch vehicle. The Falcon 9 v.1.2, which had been set to debut late this year, will feature super-chilled propellant and a 10 percent increase in the volume of the second-stage tank. The improvements are designed to increase thrust by 15 percent and help offset the performance hit the rocket took when landing legs and other systems were added to allow for the recovery of the first stage.

Even more changes are likely once SpaceX succeeds in recovering first stage boosters for reuse. Engineers will examine every inch of the booster looking for wear and tear and for any changes that can be made to improve reliability. New versions of the Falcon 9 will undoubtedly emerge.

How one feels about the constant upgrading of the Falcon 9 depends upon where one sits. SpaceX has found plenty of commercial customers willing to take risks on its ever-evolving rocket despite its scant launch history. SpaceX’s low prices are a big factor.

NASA was not that upset by the loss of the Dragon capsule. The space agency entered into commercial cargo agreements with both SpaceX and Orbital ATK fully expecting to lose some supply ships along the way. The payloads the agency places on these vehicles are largely low risk, nothing that can’t be replaced.

Both companies have lost supply ships over the past eight months, with Orbital ATK losing a Cygnus freighter last October when its Antares rocket blew up. The multiple failures — and the loss of a Russian Progress freighter in April — have strained ISS supply lines, but not to the breaking point.

There’s a major difference between the SpaceX and Orbital accidents. Antares is not a major factor in the international launch market. NASA is the rocket’s only user, having booked nine Cygnus supply flights to the space station. Orbital ATK has announced no other launch contracts.

SpaceX, on the other hand, has roughly 50 launches on its manifest. The company is in the critical path for a number of major players: communications satellite fleet operators whose schedules and revenue models have been thrown into uncertainty; NASA, which in addition to cargo flights is expecting the company to launch astronauts to the International Space Station within two years; and the U.S. Air Force, which is looking to bring down its high launch costs by awarding contracts to SpaceX.

The Stakes Get Higher

Lt Gen Ellen Pawlikowski, Space and Missile Systems Center commander, signed agreements with Space-X CEO Elon Musk, Jun 7, 2013 at the Space-X facility in Hawthorne, Calif. (Credit: USAF/Joe Juarez)

The Falcon 9 accident came at a time when the stakes for SpaceX launches had been raised significantly. NASA had just certified SpaceX to launch payloads more crucial than ISS cargo. The company will be launching the space agency’s Jason-3 remote sensing satellite, a mission that had been scheduled for August prior to the accident.

Far more significant is the recently completed U.S. Air Force certification of Falcon 9, which will allow SpaceX to compete with United Launch Alliance (ULA) for defense launch contracts. SpaceX had pushed the Air Force to complete the certification process faster. The company unsuccessfully sued the service in an attempt to void a large launch contract given to ULA even before certification was completed.

Musk’s argument was that SpaceX was able to deliver launches that are just as reliable as ULA’s and much less expensive. The accident doesn’t void the certification, but it certainly raises questions about the reliability claim.

Atlas V liftoff (Credit: ULA)

Gen. William Shelton, who was commander of the U.S. Air Force Space Command until his retirement last August, pointed out ULA’s excellent record in a recent op-ed piece in The Wall Street Journal.

Current U.S. space policy is implemented by buying both the Atlas V and Delta IV rockets from the United Launch Alliance, a joint venture of Lockheed Martin and Boeing. Both rockets have a 100% success record—83 launches without failure.

ULA critics will quibble with that statement; like the Falcon 9, both the Atlas V and Delta IV have experienced anomalies during flights. But the company has not suffered the type of catastrophic failure that SpaceX experienced last month.

Shelton also pointed out that there’s already a problem with the certification the Air Force just awarded for the Falcon 9:

SpaceX is the first company to complete the certification process for its Falcon 9 Version 1.1 rocket—the one that failed on Sunday. But the company is also developing a “Full Thrust” Falcon 9—capable of carrying all but the heaviest satellites—and that is the rocket it intends to use to bid on national-security contracts.

The Falcon 9 Full Thrust version hasn’t gone through certification, indeed it has never been launched. Nevertheless, SpaceX lobbyists last year convinced key congressional leaders that their rocket is ready to launch national-security missions.

In other words, the Air Force will be launching on yet another version of the Falcon 9 with an even shorter launch history than the one that just failed. That can be handled with some additional certification work. However, it’s an unnerving prospect for an organization whose primary focus is on mission assurance, not cost.

The Air Force does not like taking a lot of risks with its launches. And with good reason. The satellites it launches are crucial to national security, and many of them are very costly. That makes any launch accident doubly expensive. If the Air Force saves money on a cheaper launch vehicle but ends up losing a very expensive satellite (or two), exactly what has it gained?

The service has gone through periods during which launch vehicles failed on a regular basis. It decided to revamp its processes to emphasize reliability. The Air Force was closely involved with Boeing and Lockheed Martin when the companies developed the Atlas V and Delta IV boosters.  Those efforts haven’t come cheap, but they have paid off.

It helped that the technology used in the boosters had long histories. The Centaur upper stage is an evolved version of the one that first flew in 1963. Russia’s RD-180 engine, which powers the first stage of the Atlas V, is extremely reliable and can trace its roots back to the 1980s. Falcon 9 just doesn’t have the same legacy yet.

Despite their reliability, both the Atlas V and Delta IV will be phased out. One reason is competition from SpaceX. Both of ULA’s rockets are too expensive to compete on the commercial market; they are almost totally reliant on military and NASA payloads, for which they now have tough competition. The other reason is the decaying relationship between the United States and Russia, which has made continued use of the RD-180 engine unsustainable.

Instead of developing a new engine for the Atlas V, ULA has elected to develop a brand new launch vehicle called Vulcan. The new rocket won’t be ready for flight until 2019, and it won’t be certified to carry defense payloads until several years after its inaugural launch. This will leave SpaceX with the very type of monopoly the company has criticized ULA for having on military launches.

Lives at Risk

Dragon Version 2. (Credit: SpaceX)

Meanwhile, the stakes are about to be raised even higher for SpaceX on the civilian side. The company is in the final stretch of NASA’s commercial crew program, under which it is set to fly astronauts to the International Space Station within the next two years. This is a much more costly and risky endeavor than cargo; the consequences of failure are much higher.

The employees at SpaceX are acutely aware of this reality. As they watched Falcon 9 break up, they were undoubtedly thinking: this time it was only cargo, but what if this happens when there are astronauts aboard? The accident was a very sobering reminder of how much more will soon be at stake.

On the plus side, it’s good to have these failures now before crewed flights begin. The company can learn from them and fix what went wrong. On the other hand, there’s not a whole lot of time remaining with a 2017 deadline for crewed flights looming. And what if the rocket has other problems that haven’t surfaced yet?

There was talk after the Falcon 9 accident about whether the Dragon cargo ship could have been saved if it had been equipped with the abort system planned for the human-rated Dragon V2. The general consensus was that the capsule could have been rocketed away and parachuted to safety. However, the discussion misses a key point.

An abort system is like an inflatable aircraft slide: an essential safety feature that you never, ever want to actually use. It’s an option of last resort when all else has failed and there’s no other way to save the lives of the crew.

The ultimate goal is to build a rocket that is so reliable that you never have to use the escape system. SpaceX believes the way to accomplish that goal is through constant innovation, not by the traditional method of flying the same design over and over again.

The Human Cost

Merlin 1D engines undergoing checks. (Credit: SpaceX)

There’s another area that SpaceX needs to address as the stakes involving its launches become higher: the use of its workforce.

Musk has adopted a Silicon Valley approach to working hours: hire young employees, contractors and interns and work them to the bone. Sixty- to 80-hour weeks are the norm. “The best thing about working at SpaceX is the flexibility,” one intern joked. “You can work whatever 80 hours a week you want.”

There are some real benefits to the SpaceX model: the esprit de corps it produces, the invaluable experience of working on real space hardware, the prestige that comes with working for a world-famous boss, and the sense of mission embodied in Musk’s goal of colonizing Mars and becoming a multi-planet species. Throw in stock options for when the company goes public, an awesome array of food in the cafeteria, and various other perks, and it’s easy to see why a lot of people are eager to work there.

The negatives of this approach are a workforce prone to exhaustion, burnout, high turnover and mistakes. And that should raise some rather serious questions. If you’re sitting in a Dragon spacecraft waiting for tons of propellant to explode underneath you, how comfortable are you going to be knowing the rocket, the capsule, and its escape system were built by people who have been working insane hours for months or even years? Not very.

A related issue is who is on the assembly line building these vehicles. SpaceX tends to attract a lot of very driven, Type A personalities with sharp elbows. Those are not necessarily the folks who are best on an assembly line, where the ability to perform the same tasks over and over again with great precision is an extremely valuable skill.

The solutions to these issues are relatively straightforward. One is to cut back on hours worked and to hire more employees to pick up the slack. The other is an evolution in the workforce with a greater focus on production and (eventually) the refurbishment of recovered first-stage boosters.

Cutting back on hours and hiring more employees could end up raising costs a lot. SpaceX’s launch prices are already the lowest in the industry. They are so low that the company’s competitors can’t figure out how SpaceX is making any money. The Chinese can’t figure it out. Nobody at Orbital ATK can either. Either final costs are much higher when you add in payload processing services and other factors, or SpaceX’s profit margins are razor thin.

It’s possible that reusing the Falcon 9 first stage is more than just a technological breakthrough; it may be essential to the company’s long-term viability. Getting 10 flights out of a stage, even at a reduced launch price, would bring in more money than a single launch, and it would be the most efficient use of the output of the workforce.
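The revenue side of that claim is simple enough to sketch. The snippet below is a back-of-the-envelope illustration, not SpaceX’s pricing model: the function names are hypothetical, revenue is measured in units of one full-price launch, and recovery and refurbishment costs are deliberately left out because no figures for them are available here.

```python
def relative_revenue(flights: int, discount: float) -> float:
    """Total revenue from one first stage, in units of one full-price launch.

    `discount` is the fractional price cut offered on every flight of the
    stage (0.0 = full price, 0.5 = half price). Costs are ignored.
    """
    if flights < 1:
        raise ValueError("flights must be at least 1")
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return flights * (1.0 - discount)


def break_even_discount(flights: int) -> float:
    """Discount at which the reused stage earns exactly one full-price launch."""
    if flights < 1:
        raise ValueError("flights must be at least 1")
    return 1.0 - 1.0 / flights


# Ten flights at half price bring in five times the revenue of one launch.
print(relative_revenue(10, 0.5))  # 5.0

# With ten flights, revenue still beats a single full-price launch
# as long as the price cut stays below this fraction.
print(break_even_discount(10))
```

On revenue alone, a stage flown 10 times comes out ahead of a single expendable launch at any price cut short of about 90 percent. Whether it comes out ahead on profit depends on the refurbishment costs this sketch omits.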

In the wake of the accident, SpaceX is even further behind on a manifest that has consistently slipped to the right for years. The loss of the Falcon 9 has scrambled an already tight schedule. The first demonstration of the Falcon Heavy is likely running about three years behind schedule. NASA wants crew flights in two years. Comsat operators want their satellites launched.

Meanwhile, Musk will want to fix whatever went wrong with the Falcon 9 and resume flights as soon as possible. That means employees are going to have to work harder to recover from the failure and to get caught up on the schedule even as the stakes for what they’re doing get higher.

The Falcon 9 failure was a nasty little wake-up call. How SpaceX recovers from it, and what it does moving forward, will have consequences not only for SpaceX but for the entire American space industry. The global space industry, in fact. It will be interesting to see how things play out.