NASA: Falcon 9 Failure in 2015 Caused by “Design Error”

Dragon capsule separated from Falcon 9 launch vehicle.

by Douglas Messier
Managing Editor

Nearly three years after a SpaceX Falcon 9 failed in flight sending a Dragon resupply ship to the bottom of the Atlantic, NASA has finally released a public summary of its own investigation into the accident. [Public Summary — PDF]

You might recall that SpaceX’s internal accident investigation blamed a defective strut assembly in the second stage liquid oxygen (LOX) tank. The strut, provided by an outside supplier, snapped under launch stresses, causing a helium bottle inside the tank to break free and destroy the LOX tank, the company said.

The NASA investigation found that to be a credible scenario for the accident. However, the space agency blamed a “design error” by SpaceX. The table below shows a summary of the investigation’s technical findings.

The report also questioned the use of other materials in the booster, although they were not directly related to the launch failure.

Investigators also concluded that a new process implemented for the Falcon 9 Full Thrust variant led to “a substantial portion of the anomaly data being lost.”

Why it took NASA nearly three years to release the public summary is an interesting story. You might recall that I pestered NASA back in 2016 asking when the report would be released. In December of that year, I was told to check back in six months.

When I checked back in July 2017, NASA told me the agency had no obligation to release a report; therefore, it would not do so. I chronicled the saga in this story.

About a month later, I discovered a provision in the Senate appropriations bill requiring the FAA to release a report on the accident that would include the findings of NASA’s investigation.

“The report must consolidate all relevant investigations by, or at the request of, the Federal Government that were conducted, including those completed by NASA as part of the FAA report, and must also include a summary suitable for public disclosure,” according to a committee report that accompanies the spending bill.

I haven’t had time to fully digest this report, so I will read it tomorrow and write more if warranted.

  • Great job Doug, bringing truth to power.

  • ThomasLMatula

    Yes, great job!

    In essence all accidents are basically design error, because if something was designed differently it would not have failed, or the failure would not have created the chain of events that caused the accident, or any human “error” that was responsible for the accident would not have occurred. That is why building safe, robust, reliable systems for space or even a safer bike to take you to work, is a learning experience.

    All of NASA’s fatal and near fatal accidents (Apollo 1, Apollo 13, Challenger, Columbia) were basically the result of design errors.

    Hopefully the experience SpaceX gained with Falcon 1 and Falcon 9, along with a well designed test program will minimize the design errors in the BFR. The same with the New Shepard and New Glenn from Blue Origin.

  • Terry Stetler

    And moot now given 1) CRS-7 was F9 v1.1, 2) v1.2 Block 5 is imminent, and 3) as of January, F9’s approval for Category 2 NASA missions.

  • Andrew Tubbiolo

    Er, actually no. There is both human error and design error. Challenger’s SRB joint technology was a perfectly good solution provided the “O” rings were pliable. The unaccounted for accumulation of alumina slag even made up for joint flexure with stiff “O” rings in the Challenger flight, until it did not. It was in many ways a robust design, just not robust enough to operate with the “O” rings cold. The decision to fly with cold “O” rings was a human decision that was made against the known design limits. Look at TR-1 and 2, those are human factors. TR-1 humans need to pay attention … TR-2 is a human procedures issue. TR-3 is technical only in the nature of the packet system used, in that Space X chose a packet system that does not confirm receipt of data on the client side with a resulting loss of data. It’s the old UDP vs TCP jihad that happens on Earth based computers. However, I would argue this is not a technical problem, as SX could choose a packet system standard that closes the loop with a confirmation of receipt. Rather it was a human decision to go with the open loop approach.
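The open-loop versus closed-loop distinction raised in the comment above can be shown with a toy model. This is only a sketch under stated assumptions: it does not represent SpaceX's telemetry system, it uses no real sockets, and every name in it (`make_lossy_channel`, `send_open_loop`, `send_with_ack`, the drop pattern) is hypothetical. It only mirrors the general idea that UDP sends without confirming receipt, while TCP acknowledges data and retransmits what was lost.

```python
from typing import Callable, List


def make_lossy_channel(drop_every: int) -> Callable[[int], bool]:
    """Return a hypothetical channel that drops every `drop_every`-th transmission."""
    attempts = 0

    def send(packet: int) -> bool:
        nonlocal attempts
        attempts += 1
        # True means the packet arrived at the receiver.
        return attempts % drop_every != 0

    return send


def send_open_loop(packets: List[int], channel: Callable[[int], bool]) -> List[int]:
    """Fire-and-forget (UDP-like): each packet is sent once, with no confirmation."""
    return [p for p in packets if channel(p)]


def send_with_ack(
    packets: List[int], channel: Callable[[int], bool], max_retries: int = 5
) -> List[int]:
    """Closed loop (TCP-like): resend a packet until the receiver confirms it."""
    received = []
    for p in packets:
        for _ in range(max_retries):
            if channel(p):  # the return value stands in for an acknowledgement
                received.append(p)
                break
    return received


packets = list(range(10))
open_loop = send_open_loop(packets, make_lossy_channel(3))
acked = send_with_ack(packets, make_lossy_channel(3))

print("open loop received:", open_loop)
print("with acks received:", acked)
```

In this toy model the open-loop sender silently loses whatever the channel drops, while the acknowledged sender pays for extra transmissions but ends up with a complete record.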

  • GiantEnemyGPU

    That’s not correct. The usage of o-rings was incorrect in the first place!

    The original SRB joint flexed away from each other but the o-rings were able to jump its groove and seal the new gap before the seal was completely eroded. With colder o-rings, this saving throw was not possible. So the primary problem isn’t the temperature, it’s the design of the joint.

    After the Challenger disaster, the SRBs were redesigned such that under SRB pressurization, the joint was compressed and allowed the o-rings to work as designed.

  • Andrew Tubbiolo

    A valid retort, however I’d bet that if you looked at the stated specifications for the old SRB vs the new you’d see that the use of the “O” rings was correct per the old specifications. The new design covered shortcomings of the old approach. Pliable “O” rings and zinc chromate putty carried the day in the old design provided operating limitations were followed. It’s plain and simple, management under pressure from politicians and the media pushed for flight in conditions the technicians knew would be an issue. I don’t know offhand, perhaps you do, was anything like the post-Challenger design proposed for the initial design?

  • Michael Halpern

    I wonder if some of this is stretched or exaggerated for political purposes,

  • ThomasLMatula

    True, humans always err. However, a good design engineer takes that into account and makes it hard for them to mess up.

  • ThomasLMatula

    Yes, there always was a problem.

    “For a number of engineers and managers at SRB manufacturer Morton Thiokol and within NASA, however, the cause of the disaster had been identified more than a year before Challenger’s maiden voyage: the primary and secondary O-rings meant to prevent a leakage of hot gases were incapable of properly sealing the gaps between the SRB joints in extremely cold weather. Already, catastrophe had been averted on one previous cold-weather launch in January 1985 and conditions in the hours leading up to 51L’s liftoff were colder still. Moreover, an application of zinc chromate putty, intended as a “thermal barrier” to keep the combustion gas path away from the two O-rings, had been shown as early as 1984 to be susceptible to the formation of “blow holes,” which compromised its effectiveness.”

  • Andrew Tubbiolo

    When you can find them …. 🙂

  • GiantEnemyGPU

    People at NASA knew something was wrong with the joint design before Challenger. Maybe not the big wigs, but definitely some engineers recognized the goof with o-rings.

    The original specifications allowed no erosion of the o-rings at all. After all, why would a sealed joint protected by putty erode at all?

    Feynman said this in his addendum:

    “The O-rings of the Solid Rocket Boosters were not designed to erode. Erosion was a clue that something was wrong. Erosion was not something from which safety can be inferred.”

  • Douglas Messier

    I don’t know. If I were the manufacturer Musk blamed for destroying his rocket and Dragon resupply mission to ISS, I think I’d be pretty happy right now.

    I suspect there’s a reason he never mentioned the supplier’s name. Would have been too easy for the company to fire back that it was being made a scapegoat for a series of bad decisions by Musk and his team.

    Remember that the SpaceX investigation was headed by a high-level company official with 11 employees and a single voting member from the FAA. The FAA official was the only one not to sign the final accident report.

  • windbourne

    In fact, the SRB engineers nixed flying the shuttle until it warmed up.
    That was PURE decision.

  • windbourne

    No, Challenger and Columbia were operational decision errors.
    This was SX choosing to save a few bucks and using a lower grade strut.

    The strut itself did not hold what it should have, but it is possible that had they had the more expensive aviation strut, it would have held.

  • windbourne

    not just fire back, but sue.

  • duheagle

    Most of it seems to be old NASA hands taking an opportunity to flog all their “they don’t do things like we do” peeves in print.

  • Mr Snarky Answer

    I work in a business with suppliers and root cause analysis. We never divulge supplier name in public, whether it was their problem or ours. Our label is on the product so that’s where it starts and ends. We will tell customers that a failure happened of part x from supplier but never say the name. It can be figured out if it is a large enough issue hitting multiple downstream customers (from a supplier) but that is an exception to the rule.

  • Nickolai

    Great digging, really admire how you kept at this.

  • Douglas Messier

    Thank you. It was really Congress that got this released. I didn’t ask anyone to do that. But, I have heard that some folks on the Hill had been following my coverage.

  • Andrew Tubbiolo

    …. And they are right to do so. They don’t do things like the old guard, and lose a flight for it. “A” flight, and maybe “A” booster and spacecraft under test. And the counter punch is the same, they don’t do things like the old guard and fly like mad men with a launch rate that exceeds the launch rate of nation states. The old guys get to grump, their points are valid, however it can be argued that it’s a price worth paying given the overall outcome. Just so long as they learn from past malfunction, Space X is largely doing fine.

  • Douglas Messier

    Then standard procedures in the industry worked to SpaceX’s advantage. No obligation to name supplier. Supplier isn’t going to pop up and call attention to itself. Have a high-level SpaceX employee run the internal investigation, stock the board with company employees and a lone FAA rep. Count on NASA not to release its report.

    This is a small industry. And people talk, especially after a few drinks. So, I’m wondering if word got around and the supplier found itself having to fight claims it was at fault alone.

    If you read the Ashlee Vance book, you find a story of Elon blaming a Falcon 1 failure on an employee not doing his job properly. There was actually a design flaw where they had chosen the wrong part (screws, I think) and they had rusted in the salt air in the months it took to get the rocket off the ground. The employee flew back to Hawthorne, had a furious argument with Musk, and left the company.

  • Mr Snarky Answer

    I read the book and $h1t happens. I’ve been in engineering RCA meetings about customer outage issues where blame is thrown around. I’ve been on both sides of the yelling myself. They did find casting flaws in the parts that pulled at low, 1/5 rated performance. Are you saying SpaceX fabricated that?

    At the end of the day it was SpaceX’s failure one way or another; the underlying cause is irrelevant to the responsibility.

  • envy

    You can say the same for the Shuttle: Challenger and Columbia were ultimately due to NASA deciding to save a few bucks and use segmented SRBs and external foam insulation. It is possible (more like certain) that if they had the one-piece SRBs and internal insulation standard on other launch vehicles, that both would have been fine.

    CRS-7 was an infant mortality failure, which can and will be mitigated by flight tests, recovery, reuse, inspection, and iterative redesign to resolve identified issues. Columbia and Challenger failed due to problems that had been identified previously and were not appropriately fixed (even after Columbia, the foam shedding problems STILL weren’t fixed, and the next flight almost failed in the same way). The problem with Shuttle failures was both the systemic normalization of deviance due to lack of understanding of operational characteristics, AND a fundamentally flawed overall design – sidemount and SRBs were both poor choices, driven by cost.

  • envy

    It was an aluminum nut on a fuel line that corroded.

  • Mr Snarky Answer

    Salty air out on those islands..

  • redneck

    That’s the way I read it as well. It’s not a few bucks as someone upthread suggested. It’s a whole culture of not wasting years and Dirksens on the Good Oldspace Seal Of Approval on every nut and strut.

  • Christopher James Huff

    SpaceX found struts that would have failed even with the 4x factor specified in the report. Not including that margin (IIRC, a factor of 3 was used) may have been a design error, but the struts themselves were still faulty, failing down to 1/5th their rated load.

    I would not be particularly happy to find that specifying that my parts should be derated by a factor of 4 was insufficient.
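The arithmetic behind the comment above can be checked with a short calculation. This is a sketch under assumptions: loads are normalized so the rated load is 1, the strut is assumed to see exactly the full load its safety factor allows, and the function names are hypothetical. The 1/5 failure level and the factors of 3 and 4 come from the comment; the factor of 5 is added only to show where the break-even point lies.

```python
from fractions import Fraction


def allowed_load(rated_load: Fraction, safety_factor: int) -> Fraction:
    """Largest load the design permits on the part after derating."""
    return rated_load / safety_factor


def survives(actual_strength: Fraction, applied_load: Fraction) -> bool:
    """A part holds if its real strength is at least the load applied to it."""
    return actual_strength >= applied_load


rated = Fraction(1)   # normalized rated load
weakest = rated / 5   # defective strut failing at 1/5 of rated load

# Assumption: the strut is loaded right up to the derated limit.
results = {f: survives(weakest, allowed_load(rated, f)) for f in (3, 4, 5)}

for factor, ok in results.items():
    status = "holds" if ok else "fails"
    print(f"factor {factor}: allowed load {allowed_load(rated, factor)}, strut {status}")
```

Under these assumptions a strut that fails at 1/5 of its rated load is weaker than the load permitted by a factor of 3 (1/3) or a factor of 4 (1/4), which is the point the comment makes: the extra margin alone would not have covered the worst parts.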

  • Mr Snarky Answer

    “All of NASA’s fatal and near fatal accidents (Apollo 1, Apollo 13, Challenger, Columbia) were basically the result of design errors.”

    This is the point that most miss. Any number of design changes could have avoided the outcome, any number of testing protocols can accommodate design deficiencies. Sometimes you know the design needs improvement and use testing as a way to ensure the soft spots are covered. Sometimes a soft spot bites you in the a$$.

    The entire F9 program was based on taking the rule book and throwing it out the window, starting from first principles to see what is adequate to get the job done vs what is window dressing that just increases cost. The miss on the part of SpaceX is if you are going to go with commercially sourced parts you need to test and characterize the crap out of them. If they did that in the first place, there would be no CRS-7 failure and “design deficiency.”

  • patb2009

    when the failure of one part can cost $500 Million, it’s a damned big concern.

  • Not Invented Here

    I think we can trust NASA LSP to be non-political in technical matters, they don’t have a dog in the fight (they’re not the ones building SLS).

    But it’s pretty clear the release of this summary is political, since someone in Congress ordered it. I bet you can hear about this in a hearing very soon.

  • redneck

    And when the failure to get things done costs Dirksens per year for decades with zero flights, it’s an even bigger damned concern. Of course, it’s so much easier to throw mud at people doing the work than to actually do any, and powerpoints don’t RUD.

  • Terry Stetler

    Wonder if that supplier was Kobe Steel, who’s been falsifying qualification data for 10 years? They sold bad parts to automakers, aerospace etc.

  • Terry Stetler

    Yup. The Alabama Mafia’s fingerprints are all over Congress getting involved in this.

  • Brainbit

    Are you saying FH is a fundamentally flawed overall design and was a poor choice driven by cost?

  • Lee

    When a manufacturer tells me I should plan on a factor of safety of 4 for their parts, when my calculations clearly show that (for example) a factor of 2 would be more than enough, that tells me one of two things:

    1) They don’t believe my numbers or
    2) They believe my numbers, but know that they can’t produce parts with the required specifications, so they say you need to design to higher factor of safety.

    Not sure which it was in this case, but it’s probably a moot point given the loads under which the part actually failed.

  • Jeff2Space

    Do note that the strength of the tank’s aluminum/lithium alloy increases as temperature drops to LH2/LOX temperatures. Putting the insulation on the inside would have meant the aluminum/lithium would have been at ambient temperatures on the pad and directly exposed to aerodynamic heating during launch. The increased temperatures mean reduced strength of the aluminum/lithium material which would have required thicker tank walls, increasing the mass of the entire tank.

  • Douglas Messier

    Why not release it? Why did Congress have to request it in the first place?

    They released a public summary when Orbital lost Antares & Cygnus. Why should this one be withheld from the public? Why should SpaceX be spared from taking its lumps for a launch failure?

    There was a disagreement over cause. Why should the only narrative be a SpaceX investigation that had 11 employees and 1 FAA rep who didn’t sign the final report?

    When SpaceX supporters aren’t singing the praises of Musk and SpaceX, they’re going into a panic whenever there’s any criticism of them. The defenders act like snowflakes facing a hair dryer. Grow up.

  • Douglas Messier

    There was no problem releasing a summary of the Antares/Cygnus failure. Nobody complained about that.

  • publiusr

    Parts break. More concerning were the rumors of upper stage propellant lines icing up for Space X that ULA knocked out a long time ago. I have to give the steely eyed missile men that one.

  • envy

    Err, what? FH does not use sidemount or SRBs.

  • envy

    Primarily a concern on the intertank due to shockwave impingement from the adjacent SRBs. The intertank was insulated against aero heating even though it contained no cryogens.

    Actually, the acreage SOFI wasn’t even the shedding problem… it was the hand cut and placed foam. The bipod ramps, in Columbia’s case. I’m not sure there was a good fix except for moving completely to top-mount.

  • envy

    It should have been released while it was still relevant. Not when it was more convenient political ammunition to ignore FH and continue SLS. I find it hard to believe that LSP took more than 2 years AFTER approving an LSP flight (Jason-3) to finalize this report.

    But in the end, I don’t think it matters much. SpaceX is focused on surpassing the reliability of all their competition. They could have the most consecutive launch successes by the end of next year.