The live feed buries the only useful information at the very bottom of the article:
> The plane manufacturer says it has found that intense radiation from the Sun could corrupt data crucial to flight controls.
> It’s thought most will be able to undergo a simple software update.
> The issue was discovered after a JetBlue aircraft en-route from Mexico to the United States in October experienced a ‘sudden drop in altitude’.
> The plane made an emergency landing, with reports at the time suggesting 15 to 20 people suffered minor injuries.
> It’s thought the incident was caused by intense solar radiation, which corrupted data in a computer used to help control the aircraft.
On the Qantas 72 flight (2008), the ATSB report showed the same power spike that upset the ADIRU also left tidy 1-word corruptions in the flight data recorder. Those aligned with the clock cycle, shared the same amplitude and were confined to single ARINC words. That is pretty much exactly the signature of a failing solid state relay or contactor on the shared avionics power bus (upstream of both FDR and fly by wire).
Radiation-driven bit flips would be Poisson distributed in time and energy. So that is one way to find out.
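A rough way to run that check on a log of corruption timestamps is the variance-to-mean ratio of event counts per fixed window. This is only a sketch: the function name and inputs are hypothetical, and a real analysis would need far more events than any single incident provides.

```python
def dispersion_index(event_times, window, duration):
    """Variance-to-mean ratio of event counts per fixed window.

    Roughly 1 for a Poisson process, near 0 for events locked to a
    clock, and much larger than 1 for events that arrive in bursts.
    """
    n_windows = int(duration // window)
    counts = [0] * n_windows
    for t in event_times:
        i = int(t // window)
        if 0 <= i < n_windows:
            counts[i] += 1
    mean = sum(counts) / n_windows
    if mean == 0:
        raise ValueError("no events in the observation period")
    variance = sum((c - mean) ** 2 for c in counts) / n_windows
    return variance / mean
```

Clock-aligned corruptions like the ones described above would score close to 0, which is hard to square with random particle strikes.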
Do you think they're using the guise of "it's solar radiation" as cover to do a software update that fixes a more problematic "bug", perhaps with some tangential changes in said update to improve error-correcting code (e.g. related to detecting spurious bit flips)?
Not in aviation.
Counterpoint: Boeing MCAS tho
Does the 737-Max not count as aviation anymore?
It does, but the Max issue was quite different from this one.
No, that would be straight to jail.
Remind me who from Boeing went to jail?
Airbus is in Europe where the Rule of Law still exists
That’s what we naively thought here too.
Look at how US government treats financial behemoths which actively harm whole mankind vs how EU treats them. There is way more to this topic obviously (who wants to harm their local company), but generally US is pro-companies while Europe is pro-people.
Deutsche Bank and HSBC, two major European banks, have repeatedly admitted they have engaged in money laundering activities for Russia, drug cartels and terrorists and have consistently failed to meet their AML obligations. The US is the only entity that’s going after these banks for these issues winning significant judgments and even with that backdrop you don’t see any EU enforcement.
No, because aerospace is not garden-variety Silicon Valley webshittery.
There is a slightly different level of discipline and engineering ethics at play.
Yeah I don't buy it either.
If it was really 'solar radiation' there would be more small details.
Reading the Airbus press release, I wonder if this is what happened:
Solar radiation event led to alpha particle induced data corruption in a flight control computer memory (could be DRAM, SRAM, on-chip cache, registers...). These failures are supposed to be transient (reboot and all is well).
This is an anticipated failure mode. Only one (of three?) computers should be affected by such a failure and therefore the remaining two keep on running the plane.
But what happened is <something> went wrong with the failover/voting mechanism (as often happens with one-off seldom-executed failover code). The result was no flight control computer functionality until the entire system was rebooted. Hence the emergency landing.
The fix is to address that software error, with perhaps a secondary fix TBD to harden the hardware (add some shielding perhaps).
The fact that they talk about data corruption and not just a malfunction suggests alpha bit flip rather than latch-up.
Then send the whole statement through a French to English translator to make it a bit more confusing.
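For reference, the anticipated case in that scenario (one of three channels corrupted, the other two carry on) can be handled by some form of voting. A minimal sketch of mid-value selection, purely illustrative: the function name, channel labels and tolerance are made up here, and nothing in it is taken from the actual ELAC logic.

```python
def vote(a, b, c, tolerance=0.5):
    """Mid-value select across three redundant channels.

    Returns (selected_value, suspect_channels). A channel is suspect
    when it differs from the median by more than `tolerance`.
    """
    values = {"A": a, "B": b, "C": c}
    median = sorted(values.values())[1]
    suspects = [name for name, v in values.items()
                if abs(v - median) > tolerance]
    return median, suspects
```

With `vote(2.0, 2.1, 57.3)` the corrupted channel C is outvoted and flagged. The interesting failures are the ones where this layer itself misbehaves.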
I would say it's pretty detailed: an unknown interference caused a single CRC-protected 32-bit word to be corrupted simultaneously, by timestamp, in both the flight controller hardware and the black box data recorder.
My concern would be what error correction mechanism did or did not catch the corruption in memory and why did it not recover without critical impact to operations?
> corrupted simultaneously
This sounds like a software bug.
Something like - {copy a to b, checksum a--b}
Instead of - {copy a to t, checksum a--t, copy t to b, checksum a--b}
I bet the fix is along these lines, with the caveat of real time systems/etc.
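One possible reading of that second pattern, as a sketch only. It assumes the producer publishes a CRC alongside the data (`expected_crc` is a hypothetical parameter), because comparing a copy only against its source cannot catch corruption that was already in the source.

```python
import zlib

def staged_copy(source: bytes, expected_crc: int) -> bytes:
    """Copy `source` via a temporary buffer, checking the CRC at each hop."""
    if zlib.crc32(source) != expected_crc:
        raise ValueError("source corrupted before copy")
    temp = bytearray(source)            # hop 1: a -> t
    if zlib.crc32(temp) != expected_crc:
        raise ValueError("corruption during copy to temporary")
    dest = bytearray(temp)              # hop 2: t -> b
    if zlib.crc32(dest) != expected_crc:
        raise ValueError("corruption during copy to destination")
    return bytes(dest)
```

Checking against a value computed at the producer is what stops one bad word from being propagated into both consumers and still passing the check.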
My guess is they haven't managed to point to the single memory bit which was flipped to cause this result.
The software update is probably more along the lines of 'let's just introduce a watchdog task which resets the system if the output deviates too far from the input for too long'.
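Something like this, as a toy sketch. The class name, limit and cycle count are invented for illustration, and a real flight control watchdog would have to be far more careful about when a reset is actually safe.

```python
class DeviationWatchdog:
    """Requests a reset when |output - command| stays above a limit too long."""

    def __init__(self, limit, max_cycles):
        self.limit = limit            # allowed deviation per cycle
        self.max_cycles = max_cycles  # consecutive bad cycles tolerated
        self.bad_cycles = 0

    def step(self, command, output):
        """Call once per control cycle; returns True when a reset is due."""
        if abs(output - command) > self.limit:
            self.bad_cycles += 1
        else:
            self.bad_cycles = 0       # a single good cycle clears the count
        return self.bad_cycles >= self.max_cycles
```

A one-cycle glitch is ignored; only a deviation that persists for `max_cycles` in a row trips the reset.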
After reset, it went away. If it was this kind of hw issue, it should still be present.
Considering those units were designed back when they did not have EDAC mandated, I can believe it could have been a bit flip (along with some other stuff they will probably address to take into consideration this failure mode). Nowadays, most MCU's have ECC on them so the time of this excuse is mostly gone now. :)
> Nowadays, most MCU's have ECC on them so the time of this excuse is mostly gone now. :)
That's kind of a misleading statement, assuming you mean planes built nowadays. As we can clearly see, planes still flying today (6K of them at least) still have issues. We don't need hand-wavy comments making it sound like modern aviation is no longer susceptible, especially in a thread on an article showing that's just not true.
I think you and gp may be speaking about different stages. Gp seems to be saying that a plane being designed and specified today would use technologies hardened against this type of error.
Even though they’re in widespread operation today, the aircraft types in question were designed (and certified) many years ago, before ECC was the norm. My impression is that, once their type is certified, new airframes are built to pretty much exactly that specification even all these years later.
> I think you and gp may be speaking about different stages
Yes, that's my point. Just because new aircraft are designed with improved hardware does not automatically mean the issue is resolved industry wide. Existing equipment will still have issues. So the statement is misleading. Is the number of aircraft with ECC "most" of the equipment in the skies?
Ok, I can see how my statement can be confusing. I wanted to say that on newly built things this is mostly gone today, although I'm certain freakish accidents can happen. Yes, if your hardware does not have ECC[1] that is something that can happen. I was initially surprised because I did not expect them to not have error correction, but I guess it makes sense for systems designed a long time ago and still in use, so that was new info to me.
[1] Technically EDAC is the correct name of the whole subsystem, and ECC is the name of the algorithm. But I've only heard it referred to as ECC in my industry. I was even initially confused when I read EDAC, so TIL.
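For anyone who hasn't seen how the correction part works, here is the textbook Hamming(7,4) code, which fixes any single flipped bit in a 7-bit codeword. It is only an illustration: real memory ECC typically uses wider codes with double-error detection, and the function names here are my own.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(c):
    """Return (data_bits, error_position); position 0 means no error seen."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4      # syndrome = 1-based index of the bad bit
    if pos:
        c[pos - 1] ^= 1             # flip it back
    return [c[2], c[4], c[5], c[6]], pos
```

Flip any one bit of an encoded word and `hamming74_decode` returns the original data along with the position it repaired. Two flips in the same word will be miscorrected, which is why the detection half of EDAC matters too.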
I'm very surprised that a plane doesn't have voltage, current and glitch monitoring on every power rail, logged to the data recorder.
You would pretty much be logging, every millisecond, the minimum, maximum and mean voltage for every 1ms period (and the same for current).
Then any failing solid state relay would be obvious in the collected data, far before you start to get word corruption!
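The reduction itself is cheap. A sketch of what I mean, assuming the rail is sampled much faster than 1 kHz and `samples_per_window` samples land in each 1 ms period (both the function and the numbers below are hypothetical):

```python
def window_stats(samples, samples_per_window):
    """Reduce raw rail samples to (min, max, mean) per fixed-size window.

    A trailing partial window is dropped.
    """
    stats = []
    last_start = len(samples) - samples_per_window
    for start in range(0, last_start + 1, samples_per_window):
        w = samples[start:start + samples_per_window]
        stats.append((min(w), max(w), sum(w) / len(w)))
    return stats
```

For example, `window_stats([28.0] * 4 + [28.0, 12.0, 28.0, 28.0], 4)` gives `[(28.0, 28.0, 28.0), (12.0, 28.0, 24.0)]`: the dip barely moves the mean but the minimum gives it away.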
What do you think it could be?
"That is pretty much exactly the signature of a failing solid state relay or contactor on the shared avionics power bus (upstream of both FDR and fly by wire)."
Thanks. I didn’t realise what that meant in context.
The software update is actually a rollback, apparently.
https://www.pprune.org/rumours-news/669424-airbus-a320-recal...
"The ELAC software update (actually a rollback) is the fix for around 4,500 affected aircraft. A further 2,000 or so will require hardware mods."
I'd like to see a more technical article on this. Airbus has triple redundancy in the flight control computers.[1] And they're different CPUs - one AMD, one Intel, one Motorola, all doing the same job. If flight was disrupted, they should have had lots of alarms.
[1] https://www.researchgate.net/publication/26587285_Challenges...
i wonder how definitive that is and how well they were able to reproduce the issue under controlled conditions and how strong the evidence is that there was particularly strong solar radiation in play. it would probably be a good thing if they published technical details for investigations like this that impact public safety.
i believe it could be solar radiation, but i also believe that solar radiation could be a catch-all for unexplained phenomena.
Interesting how radiation issues could be solved in software.
To give you a bit of insight, around the same timeframe (late October/early November) I directly observed two high-accuracy RTK GPS receivers reporting high accuracy (2cm), full 3D DGPS lock with carrier phase, and positions wandering within about a 5m circle horizontally. The altitude was staying pretty consistent (within about 1m, which was outside of the reported accuracy but not bad) until there was a sudden 60m altitude shift. This was all while they were sitting static on the ground, verified both by the crew and the accelerometer, gyro, and RADAR data.
There wasn’t a software fix per se, but we were able to quickly add a check to verify that the Kalman Filter’s position variance estimate was on the same order of magnitude as the accuracy level that the receivers were reporting and put a big red warning up. This wasn’t a flight-critical system, but it is the first time we’d ever seen that behaviour from those receivers and we’ve used them for 5 years.
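Roughly, the check looked like the sketch below. The names and the factor of 10 are illustrative rather than our production code; the idea is just to compare the filter's own 1-sigma estimate with what the receiver claims.

```python
import math

def accuracy_mismatch(filter_variance_m2, reported_accuracy_m, max_ratio=10.0):
    """True when the filter's 1-sigma position estimate and the receiver's
    claimed accuracy differ by more than `max_ratio` in either direction."""
    filter_sigma_m = math.sqrt(filter_variance_m2)
    ratio = filter_sigma_m / reported_accuracy_m
    return ratio > max_ratio or ratio < 1.0 / max_ratio
```

A receiver claiming 2cm while the filter sees about 5m of spread (`accuracy_mismatch(25.0, 0.02)`) trips the warning immediately.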
Not my area at all, but I'm extremely surprised that a fly-by-wire system would use GPS as an altitude reference. Is that really the case?
It’s a combined signal system, using pressure based sensors + gps.
And inertial guidance too?
I don’t know what Airbus uses; I've only looked into the schematics of commercial avionics like Garmin. I doubt it, though: IMU drift and calibration introduce more error than they provide in useful signal, and old-school pressure sensors + GPS, adjusted manually or automatically for regional pressure settings (pilots get these numbers over the radio when they enter a new pressure area), are accurate enough (~1m). I’ll let a real avionics engineer correct me here. I’d be curious whether that signal is worth the hassle, and I can imagine such tiny SMD sensors ARE the biggest victims of radiation hallucination.
i would expect a huge shift like that to violate the gaussian assumption of the kalman filter? (which i guess is what you're checking, sort of?). regardless i would expect the kalman filter to smooth the shift over some substantial time at least?
I think it more likely these receivers fell for a spoof GPS signal or some software bug internal to the receiver than a solar bitflip.
In our case I don’t believe it was solar bitflips but rather wildly changing ionospheric conditions. I was primarily pointing out how you can have a software-only fix for problems like this.
Without going too far into the weeds, the fact that the receivers in question were reporting high accuracy under uncertainty is definitely a software bug in the receivers from my perspective. There was a different receiver with a completely different chipset in it on-site too that was experiencing similar issues but was reporting low accuracy. Without going into too much detail, I’ve got pretty good reasons to believe it wasn’t spoofing.
Perhaps it's improving the checksum algorithm on network packets, or even ... adding one.
Makes you wonder, if/how _passengers_ are directly protected against the radiation
They're not. Excessive high altitude flight increases your chance of developing melanoma.
Ok, I'll take an aisle seat more often now instead of a window seat.
Unless you fly as often as pilots and other onboard staff, it's unlikely to be significant.
If flying were invented today, I bet it wouldn't be allowed due to the radiation. It's more than many medical procedures which guidelines say to only do when the medical benefits outweigh the radiation risk.
I suppose if flying were invented today the plane would have no windows and the pilots would use cameras.
Clearly the benefits of flying outweigh the radiation risks, though.
Passengers flying now and then, it's not a big deal, but aircrews are at increased risk of cancer.
It comes down to voting algorithms and memory persistence. Sometimes there is a threshold before data are "voted out".
I don't work on the A320 but solar radiation is a well-known issue in avionics, generally speaking.
Edit: deleted some speculation
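To illustrate the threshold idea in general terms only: a common pattern is an up/down confirmation counter, so a channel is voted out only after disagreements accumulate, and stays out once latched. The class below is a made-up sketch and does not describe any particular aircraft.

```python
class ChannelMonitor:
    """Votes a channel out once its disagreement score reaches a threshold."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.score = 0
        self.voted_out = False

    def update(self, disagrees):
        """Feed one cycle's comparison result; returns True once voted out."""
        if self.voted_out:
            return True               # latched until maintenance or reset
        if disagrees:
            self.score += 1
        elif self.score > 0:
            self.score -= 1           # good cycles slowly forgive bad ones
        if self.score >= self.threshold:
            self.voted_out = True
        return self.voted_out
```

The flip side is visible in the code: until the score reaches the threshold, the bad data is still being used.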
Finally turning on the ECC RAM option?
About one third require hardware mods.
Maybe there's a range that requires a change?
Now imagine if it were an over-the-air update: maybe there would be no disruption?
Agreed, I expected additional shielding or something physical like that.
s/solved/mitigated/
Note that the software update (it actually looks like a roll-back to an older version?) will only fix 4,500 newer aircraft, another older 2,000 (not sure what these are, they can't be pre-NEO, the ratios seem wrong?) will also need a hardware fix.
I'm amazed airlines haven't put up press releases detailing what is happening with their fleets yet. It has been a few hours so presumably they know and in the US at least this is a crazy busy weekend for travel.
Because the ones that only required a software update on their fleet, like bluejet, have already done it, as stated in the article.
Also: > The radiation corrupted data in the ELAC - a computer used to operate control surfaces on the wings and horizontal stabilizer.
It's unclear to me how a software update is supposed to help this component with radiation shielding
Redundancy.
Unless they had total component failure, it's most likely localized, and if you create redundancy like RAID, you may be able to counter whatever they are seeing as a failure mode. Or at least reduce the likelihood of impact on the flight, giving them time to replace components on the ground.
and
> But EasyJet says it has already completed the required software update and is planning on operating its flights as normal on Saturday