Liftoff. 37 seconds of flight... boom! Ten years and 7 billion dollars of development gone. Four one-and-a-half-ton satellites of the Cluster science program (built to study the interaction of solar radiation with the Earth’s magnetic field) and the Ariane 5 launch vehicle turned into confetti on June 4, 1996. And the programmers got the blame.
The previous model, the Ariane 4 rocket, had been launched successfully more than 100 times. So what went wrong?
To storm the heavens, you need to know Hell's language well.
Dossier
Ariane 5 is a European expendable launch vehicle, part of the Ariane family (whose first launch took place in 1979). It is used to place medium and heavy spacecraft into Earth orbit and can launch two or three satellites at once, along with up to eight microsatellites.
Project history. Developed in 1984-1995 by the European Space Agency (ESA); the lead developer was the French National Center for Space Research (CNES). Ten European countries took part in the program, and the project cost 7 billion US dollars (France contributed 46.2%).
About a thousand industrial firms took part in building the rocket. The prime contractor is the European company Airbus Defence and Space (a division of Airbus Group, Paris). Ariane 5 is marketed on the launch services market by the French company Arianespace (Evry), with which ESA signed the corresponding agreement on November 25, 1997.
Specifications. Ariane 5 is a heavy-lift two-stage launch vehicle. Length: 52-53 m; maximum diameter: 5.4 m; launch mass: 775-780 tons (depending on configuration).
The first stage is powered by a Vulcain 2 liquid-propellant rocket engine (the original Vulcain was used on the first three versions of the rocket); the second stage uses either an HM7B (on the Ariane 5 ECA version) or an Aestus (on the Ariane 5 ES). Vulcain 2 and HM7B burn hydrogen and oxygen and are produced by the French company Snecma (part of the Safran group, Paris).
Aestus uses storable propellants: nitrogen tetroxide and monomethylhydrazine. The engine was developed by the German company DaimlerChrysler Aerospace AG (DASA, Munich).
In addition, two solid-fuel boosters are attached to the first stage and provide more than 90% of the thrust at liftoff. They are manufactured by Europropulsion (Suresnes, France), a joint venture of the Safran group and the Italian company Avio. In the Ariane 5 ES variant, the second stage may be omitted when the payload is delivered to low Earth orbit.
The day after the disaster, the Director General of the European Space Agency (ESA) and the Chairman of the Board of the French National Center for Space Research (CNES) ordered the formation of an independent Commission to investigate the circumstances and causes of the failure. It included well-known experts and scientists from all the European countries concerned.
On June 13, 1996, the Commission began its work, and as early as July 19 it published its exhaustive report (PDF), which immediately became available on the Web.
The Commission had telemetry data, trajectory data, and recordings of optical observations of the flight. The explosion occurred at an altitude of about 4 km, and the debris was scattered over an area of about 12 square kilometers of savannah and swamp. The Commission heard testimony from numerous specialists and studied the production and operational documentation.
Technical details of the accident
The position and orientation of the launch vehicle in space were measured by the navigation system, the Inertial Reference System (IRS). It includes an embedded computer that calculates angles and velocities from the data of the on-board Inertial Platform, which is equipped with laser gyroscopes and accelerometers. Data from the IRS were transmitted over a dedicated bus to the On-Board Computer (OBC), supplying the information the flight program needed. Through hydraulic and servo actuators, the OBC directly controlled the solid-fuel boosters and the Vulcain cryogenic engine.
To make the Flight Control System reliable, the equipment was duplicated. Two IRS units (one active, the other a hot standby) with identical hardware and software ran in parallel. As soon as the OBC detected that the active IRS had left normal operation, it immediately switched to the other. There were also two on-board computers.
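The switchover logic described above can be sketched in a few lines of Python. This is an illustration only, not the flight software (which was written in Ada): the class InertialUnit, the function select_output, and the fault threshold are hypothetical names and values. The sketch assumes that both units receive the same input, so with identical software an input that stops one unit stops the other as well.

```python
from typing import Optional


class InertialUnit:
    """Hypothetical stand-in for one IRS unit."""

    LIMIT = 32767.0  # illustrative threshold beyond which the unit faults

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True

    def measure(self, reading: float) -> Optional[float]:
        """Return the reading, or None once the unit has left normal mode."""
        if self.healthy and abs(reading) > self.LIMIT:
            self.healthy = False  # a fault is permanent in this sketch
        return reading if self.healthy else None


def select_output(active: InertialUnit, standby: InertialUnit,
                  reading: float) -> Optional[str]:
    """Feed both units the same input and return the name of the unit used."""
    active_out = active.measure(reading)
    standby_out = standby.measure(reading)  # the hot standby runs in parallel
    if active_out is not None:
        return active.name
    if standby_out is not None:
        return standby.name  # switch over to the standby
    return None  # both units are down: no valid navigation data
```

Duplication of this kind guards against a hardware fault in one unit. It offers no protection when both units run the same code on the same data.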
Key phases of the sequence of events
Seven minutes before the scheduled launch, a violation of the “visibility criterion” was recorded, so the launch was postponed by an hour.
At H0 = 9:33:59 local time the launch window was caught again and the launch finally took place. The flight proceeded normally up to H0 + 37 seconds.
In the seconds that followed, the rocket veered sharply off its planned trajectory, which ended in an explosion.
At H0 + 39 seconds, the high aerodynamic load caused by the angle of attack exceeding the critical value of 20 degrees tore the boosters away from the main stage, which triggered the rocket's self-destruct system.
The change in the angle of attack was caused by abnormal deflection of the solid-fuel booster nozzles. That deflection was commanded at H0 + 37 seconds by the On-Board Computer on the basis of information from the active navigation system (IRS 2).
Part of that information was fundamentally wrong: what was interpreted as flight data was in fact diagnostic information from the IRS 2 embedded computer.
The IRS 2 embedded computer transmitted incorrect data because it had diagnosed an abnormal situation after catching an exception raised by one of its software modules.
At the same time, the On-Board Computer could not switch to the backup system, IRS 1, because it had already stopped working during the previous cycle (a cycle lasts 72 milliseconds), for the same reason as IRS 2.
The exception raised by one of the IRS programs resulted from converting a 64-bit floating-point value to a 16-bit signed integer, which produced an “Operand Error”.
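The failing operation can be illustrated with a minimal Python sketch. The original code was written in Ada; the function name to_int16 and the sample values are hypothetical, and Python's OverflowError merely stands in for the “Operand Error”.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of a 16-bit signed integer


def to_int16(value: float) -> int:
    """Convert a 64-bit float to a 16-bit signed integer, raising on overflow."""
    result = int(value)  # drops the fractional part (truncates toward zero)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return result
```

A value such as 1234.9 converts to 1234 without complaint, while 32768.0 is already one step past the largest representable value and raises the exception.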
The error occurred in a software component intended solely for aligning the Inertial Platform. Moreover, this module produces meaningful results only until H0 + 7 seconds, the moment the rocket leaves the launch pad. Once the rocket had lifted off, the module could have no effect on the flight.
In accordance with its requirements, the alignment function did have to keep running for another 50 seconds after “flight mode” was initiated on the navigation system bus (at H0 - 3 seconds), and it did.
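The timing is easy to check with a little arithmetic. The sketch below uses only the figures quoted in the text; the constant and function names are hypothetical.

```python
from typing import Tuple

# All times are in seconds relative to H0, as quoted in the text.
FLIGHT_MODE_START = -3.0    # "flight mode" initiated on Ariane 5
ALIGNMENT_RUN_TIME = 50.0   # the function keeps running this long afterwards
LIFT_OFF = 7.0              # its results are meaningful only up to this moment
FAILURE = 37.0              # approximate moment of the exception


def alignment_window(start: float = FLIGHT_MODE_START,
                     run_time: float = ALIGNMENT_RUN_TIME) -> Tuple[float, float]:
    """Return the (start, end) interval during which the function runs."""
    return start, start + run_time


def is_running(t: float) -> bool:
    """Tell whether the function is still active at time t."""
    start, end = alignment_window()
    return start <= t <= end
```

The window ends at H0 + 47 seconds, so the function was still active at H0 + 37 seconds, about 30 seconds after its results had stopped being useful.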
The “Operand Error” was caused by an unexpectedly large value of BH (Horizontal Bias), computed by an internal function from the “horizontal velocity” measured by the Platform's sensors.
BH served as an indicator of how accurately the Platform was positioned. Its value was much larger than expected because the early part of Ariane 5's trajectory differed significantly from that of Ariane 4 (where this software module had been used before), resulting in a much higher “horizontal velocity”.
The final action, the one with fatal consequences, was shutting down the processor. As a result, the entire navigation system stopped working, and it was technically impossible to restart it.
This chain of events was fully reproduced in computer simulations, which, together with the results of other studies and experiments, led to the conclusion that the causes and circumstances of the disaster had been fully established.
Causes and origins of the accident
The original requirement to keep the alignment running after liftoff was laid down more than 10 years before the fateful event, when earlier models of the Ariane series were being designed. In certain unlikely scenarios the launch could be aborted just seconds before liftoff, for example in the interval between H0 - 9 seconds, when “flight mode” was started on the IRS, and H0 - 5 seconds, when a command was issued to perform certain operations on the launcher equipment.
If the launch was unexpectedly aborted, it had to be possible to return to the countdown quickly, without repeating all the setup operations, including realigning the Inertial Platform (an operation that takes 45 minutes, long enough to lose the launch window).
It was reasoned that, if the launch was aborted, 50 seconds after H0 - 9 would be enough for the ground equipment to regain full control of the Inertial Platform without losing information. In that time the Platform would stop the movement it had begun, and the corresponding software module would record all the information about its state, which would help return it quickly to its initial position (recall that in this case the rocket is still on the launch pad). This feature was used successfully once, in 1989, on Ariane 4 flight 33.
Ariane 5, however, had a fundamentally different sequence of pre-flight operations from the previous model, so different that running the fatal software module after liftoff made no sense at all. Nevertheless, the module was reused without any modification.
The Ada language
The investigation found that as many as seven variables in this software module were involved in type conversions. It turned out that the developers had analyzed every operation that could potentially raise an exception.
They made a fully deliberate decision to add proper protection to four of the variables and leave three, including BH, unprotected. The reasoning behind this decision was the belief that overflow was impossible in principle for these three variables.
This belief was backed by calculations showing that the expected range of the physical flight parameters from which these variables are derived could not lead to an undesirable situation. And that was true, but only for the trajectory calculated for Ariane 4.
The new-generation Ariane 5, however, was launched on a completely different trajectory, for which no such evaluation had been performed. That trajectory (together with the high initial acceleration) was such that the “horizontal velocity” exceeded the value calculated for Ariane 4 by more than a factor of five.
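The range argument can be made concrete with a small Python sketch. The peak values below are invented purely for illustration; only the factor of five comes from the text. Saturation is shown as one common form of protection, since the source does not say which form the four protected variables actually used.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of a 16-bit signed integer


def fits_int16(value: float) -> bool:
    """Tell whether the value survives conversion to a 16-bit signed integer."""
    return INT16_MIN <= int(value) <= INT16_MAX


def to_int16_protected(value: float) -> int:
    """Protected conversion: clamp to the nearest bound instead of raising."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


# Hypothetical peak values of the converted quantity (not real flight data).
ARIANE4_PEAK = 20000.0           # assumed to fit comfortably
ARIANE5_PEAK = 5 * ARIANE4_PEAK  # five times larger, as the text describes
```

With these assumed numbers, fits_int16(ARIANE4_PEAK) holds and fits_int16(ARIANE5_PEAK) does not, while to_int16_protected(ARIANE5_PEAK) clamps to the upper bound instead of raising an exception. The analysis was sound for the envelope it was given; the envelope itself had changed.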
Not all seven variables (BH included) were protected because a maximum workload of 80% had been specified for the IRS computer. The developers had to look for ways to cut unnecessary computation, and they relaxed the protection wherever an undesirable situation could not theoretically arise. When one did arise, an exception handling mechanism kicked in that turned out to be completely inadequate.
This mechanism consisted of the following three main actions.
Information about the failure had to be sent over the bus to the On-Board Computer (OBC).
In parallel, this information, along with the entire context, was written to reprogrammable EEPROM memory (whose contents were recovered and read during the investigation).
The IRS processor had to be halted.
The last action proved fatal: it was carried out in a situation that was in fact normal (apart from the software exception raised by the unprotected overflow), and that is what led to the disaster.
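The three actions listed above can be modeled roughly as follows. This is a hypothetical Python sketch, not the Ada flight code: the class IrsProcessor, its method names, and the message tags are assumptions made for illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of a 16-bit signed integer


class IrsProcessor:
    """Hypothetical model of the IRS exception handling described above."""

    def __init__(self) -> None:
        self.bus = []        # messages sent to the On-Board Computer
        self.eeprom = []     # failure records kept for later analysis
        self.running = True

    def cycle(self, bh: float) -> None:
        """Run one computation cycle with the given Horizontal Bias value."""
        if not self.running:
            return  # a halted processor produces nothing
        value = int(bh)
        if INT16_MIN <= value <= INT16_MAX:
            self.bus.append(("FLIGHT_DATA", value))
        else:
            self.on_operand_error({"bh": bh})

    def on_operand_error(self, context: dict) -> None:
        """Carry out the three actions in the order given in the text."""
        self.bus.append(("DIAGNOSTIC", "Operand Error"))  # 1: report over the bus
        self.eeprom.append(dict(context))                 # 2: save the context
        self.running = False                              # 3: halt the processor
```

In the sketch the messages carry explicit tags. In the actual flight, as described earlier, the On-Board Computer interpreted the diagnostic information as flight data, and the halted unit could not be restarted.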
Findings
The defect on Ariane 5 did not have a single cause. There were many stages in development and testing at which it could have been found.
The software module was reused in a new environment whose operating conditions differed from those the module had been specified for. The requirements were not revisited.
The system detected and recognized the error. Unfortunately, the specification of the error handling mechanism was inappropriate, and it was the handling that led to the final destruction.
The faulty module was never properly tested in the new environment, neither at the equipment level nor at the system integration level. As a result, the design and implementation error went undetected.
From the report of the commission:
The development of Ariane 5 was geared mainly towards mitigating random failures. The exception that occurred was due not to a random failure but to a design error. The exception was detected but handled incorrectly, because the view had been taken that software should be considered correct until it is shown to be at fault. The Commission takes the opposite view: software should be assumed to be faulty until the application of currently accepted best practices demonstrates that it is correct.
A happy ending
Despite the failure, four more satellites, Cluster II, were built and launched into orbit on Soyuz-U/Fregat rockets in 2000.
The launch failure drew the attention of the public, politicians, and heads of organizations to the high risks of relying on complex computing systems, which helped increase investment in research on improving the reliability of safety-critical systems. The subsequent automated analysis of the Ariane code (written in Ada) was the first use of static analysis based on abstract interpretation in a large project.
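To give a flavor of abstract interpretation, here is a toy interval analysis in Python. It is not the analyzer that was applied to the Ariane code. The class Interval, the function conversion_is_safe, and the assumed relation "converted value = gain * velocity" are all hypothetical, as are the numbers used with them. The idea is to compute with ranges of values instead of single values and to check whether a conversion can ever overflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed range [lo, hi] of values a variable may take."""
    lo: float
    hi: float

    def scale(self, k: float) -> "Interval":
        """Range of k * x for every x in this interval."""
        a, b = self.lo * k, self.hi * k
        return Interval(min(a, b), max(a, b))

    def within(self, other: "Interval") -> bool:
        """Tell whether this interval lies entirely inside the other one."""
        return other.lo <= self.lo and self.hi <= other.hi


INT16 = Interval(-32768.0, 32767.0)


def conversion_is_safe(velocity: Interval, gain: float) -> bool:
    """Check that gain * velocity always fits in a 16-bit signed integer."""
    return velocity.scale(gain).within(INT16)
```

With an assumed gain of 200, a velocity range of [-100, 100] is reported safe, while [-500, 500] is flagged. The answer depends entirely on the input range fed to the analysis, which is exactly what changed between Ariane 4 and Ariane 5.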