
UK airspace closed?

dublinpilot wrote:

Having found an entry and exit point, with the latter being the duplicate and therefore
geographically incorrect, the software could not extract a valid UK portion of flight plan between
these two points.

This sounds suspiciously like a division by zero or fencepost error.

ESKC (Uppsala/Sundbro), Sweden

They would be fools to tell the truth

So many possible attacks. Even if it didn’t make them look bad, it would enable DoS attacks to be constructed.

Anybody can inject FPL messages into the AFTN and thus to IFPS and to national FPL processing systems. The whole edifice relies on security via obscurity. If it became open-sourced to any extent, it would be like open source software anywhere i.e. getting constantly hacked (in most cases). They need to keep the details secret.

Administrator
Shoreham EGKA, United Kingdom

Maybe something like this happened:

1. The system gets the 4444 FPL. This does not include e.g. all the points along an airway, but is of course still valid.
2. It converts it to ADEXP, thus inserting many more points outside the FIR.
3. It now analyzes the 4444 FPL. It finds the entry point and tries to find the exit point, but fails. Now it tries to find a point in the 4444 (after its last point in the FIR) that is outside the FIR, is present in the ADEXP and is also the nearest. It fails the first time (why?) and moves on to the next point in the 4444. It now finds a point that is outside the FIR and is also the nearest point to the 4444 point, but is the first dup.

There are many questions that cannot be answered because the algorithm’s details are not known.
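To make the duplicate-name problem concrete, here is a minimal sketch. It is not the NATS algorithm, whose details are not known. The route, the identifiers (`ENTRY`, `DUP`, etc.) and the function names are all made up for illustration. It only shows how a lookup by name alone can return the wrong one of two identically named points, so that the "exit" ends up before the entry and no valid segment can be extracted.

```python
# Hypothetical expanded route: identifiers only, in flight order.
# "DUP" appears twice; the filed plan means the second one.
EXPANDED_ROUTE = ["AAA", "DUP", "ENTRY", "INSIDE", "EXIT", "DUP", "ZZZ"]


def find_by_name(route, name):
    """Naive lookup: index of the first point with this name, or None."""
    for i, ident in enumerate(route):
        if ident == name:
            return i
    return None


def find_after(route, name, start):
    """Order-aware lookup: only consider points at or after index `start`."""
    for i in range(start, len(route)):
        if route[i] == name:
            return i
    return None


def extract_segment(route, entry_idx, exit_idx):
    """Return the points from entry to exit, or raise if that makes no sense."""
    if exit_idx is None or exit_idx <= entry_idx:
        raise ValueError("exit point does not follow entry point")
    return route[entry_idx:exit_idx + 1]


entry = find_by_name(EXPANDED_ROUTE, "ENTRY")

# Naive search picks the first DUP, which lies before the entry point.
wrong_exit = find_by_name(EXPANDED_ROUTE, "DUP")

# Searching only from the entry point onwards picks the intended DUP.
right_exit = find_after(EXPANDED_ROUTE, "DUP", entry)
```

With the naive index, `extract_segment` raises; with the order-aware one it returns the four points from `ENTRY` to the second `DUP`.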

LGMT (Mytilene, Lesvos, Greece), Greece

Peter wrote:

NERL_Major_Incident_Investigation_Preliminary_Report_pdf

A comment and analysis on this report by someone involved in software development. It includes some software related discussions.

ESKC (Uppsala/Sundbro), Sweden

AA, you beat me with that link. :-)
Yeah, the classic Swiss cheese model of failure, this time in an (aviation-related) software environment.
One wonders how many of those are lurking elsewhere in the critical infrastructure (including the unforeseen feedback loops in the financial instruments happily employed everywhere).

Slovakia

What I find scary is not so much that there was a bug in the UK route extraction code (stuff happens) but that the reaction to the bug was to bring down the whole system.

My day job is writing network traffic management software. “Everyone knows” that you should use ASSERT statements to check for broken assumptions in the code. But that is a TERRIBLE idea in real time code. Much better to let one packet (or flight plan) be broken, than to bring down the whole system.

Obviously the system should have rejected this flight plan, and carried on operating.
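A minimal sketch of what "reject this one and carry on" could look like, in Python. Everything here is hypothetical: `extract_uk_portion` stands in for whatever processing step fails, and the "BAD" marker is just a way to provoke a failure. The point is only that the failure is caught per item and the item is set aside for a human, rather than an assertion stopping the whole loop.

```python
def extract_uk_portion(plan):
    """Hypothetical processing step that fails on some inputs."""
    if "BAD" in plan:
        raise ValueError("cannot extract UK portion")
    return plan.upper()


def process_all(plans, handler):
    """Process every plan; quarantine the ones that fail instead of stopping."""
    accepted = []
    quarantined = []
    for plan in plans:
        try:
            accepted.append(handler(plan))
        except Exception as exc:
            # One broken plan is set aside for manual handling;
            # the remaining plans are still processed.
            quarantined.append((plan, repr(exc)))
    return accepted, quarantined


accepted, quarantined = process_all(
    ["fpl one", "BAD plan", "fpl three"], extract_uk_portion
)
```

Here the two good plans end up in `accepted` and the broken one in `quarantined`, together with the reason it failed.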

It’s the same lack of forethought at a sort of meta-level that caused the Turkish DC10 crash at Paris.

LFMD, France

johnh wrote:

My day job is writing network traffic management software. “Everyone knows” that you should use ASSERT statements to check for broken assumptions in the code. But that is a TERRIBLE idea in real time code. Much better to let one packet (or flight plan) be broken, than to bring down the whole system.

Or you use a language designed for high-availability real time applications such as Erlang. (Used e.g. in Telecom systems and systems like the WhatsApp server.)

Erlang supports (indeed encourages) a program design style where you can let failing processes crash without bringing down the whole system.
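This is not Erlang, and it is much cruder than an Erlang supervisor, but the idea can be approximated in Python by running each job in its own OS process. The worker source, the "BAD" marker and the function name `supervise` are assumptions made up for this sketch. A worker that dies takes only its own job with it; the parent notes the failure and moves on.

```python
import subprocess
import sys

# Hypothetical worker, run as a separate Python process per flight plan.
WORKER_SOURCE = """
import sys
plan = sys.argv[1]
if "BAD" in plan:
    raise SystemExit(1)   # simulate the worker crashing
print(plan.lower())
"""


def supervise(plans):
    """Run one worker process per plan; a crash affects only that plan."""
    results = []
    failed = []
    for plan in plans:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER_SOURCE, plan],
            capture_output=True,
            text=True,
        )
        if proc.returncode == 0:
            results.append(proc.stdout.strip())
        else:
            # The worker died; record it and keep going.
            failed.append(plan)
    return results, failed
```

Calling `supervise(["FPL A", "BAD FPL", "FPL C"])` processes the first and third plans and reports the second as failed, without the parent ever crashing.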

ESKC (Uppsala/Sundbro), Sweden

Local copy

What strikes me is not the “computer science” commentary but the name Frequentis AG.

They wrote the EAD AIP and approach plate database system. Many years ago I was involved in a project which produced a PC executable which grabbed a copy of that whole database (about 20GB then) and made a local searchable copy. The value at the time to pilots was obvious but it was never launched because of likely legal problems.

But it was easy to write. It was immediately found that the login was totally fake. There was a complicated sequence of cookies and whatnot, designed to p1ss off anyone trying to do it with a machine instead of a human, but once you got a URL to something inside, that would work for days or weeks. There was no security whatsoever. And there still isn’t to this day. The system just tries to frustrate URL passing by inserting a token into the URL and trashing that token every week or two (which is why EAD links posted here are useless; you need to download the PDF and drop it into your post).

Only a complete amateur would have written that sort of code. No idea what they were trying to achieve. Maybe maximise the billable value by satisfying some “security dickhead” at NATS, maybe trying to avoid bots downloading the State secrets in there (the hidden Google reCAPTCHA with a threshold of about 0.6 does that very well), maybe something else. Maybe make it harder for bots to discover some crappy code in there which might make their customer demand a refund (although I doubt NATS would spot a crap product even if it bit them in the bum; the organisation is so top-heavy that nobody has any degree of freedom). They had some other anti-bot code, so when downloading the plates we had to put in random-ish delays to make it look like somebody was keyboarding it.

But why block bots? You fundamentally cannot. EuroGA gets probably 100k bot hits per day and still this is about 1% of the server bandwidth (the cost of hosting is mostly the GB disk space but NATS etc run their own servers anyway so all this is “free”). We block Russia and China to reduce the malicious stuff a bit. No point in implementing a login at all. Just let everybody go in and download stuff, and use firewall rules to block blatant abuse like downloading the LKPR AIP every 100 microseconds.
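In practice this would be a firewall or web server rule, not application code, but the policy itself (let everyone in, refuse only clients that hammer the server) is simple. A minimal sketch as a per-client sliding-window limiter follows. The class name, the limits and the client keys are all made up for illustration; the timestamp is passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        # Per-client queue of timestamps of recently allowed requests.
        self.history = defaultdict(deque)

    def allow(self, client, now):
        """Return True if this request is allowed at time `now` (seconds)."""
        recent = self.history[client]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # blatant abuse: refuse, but leave other clients alone
        recent.append(now)
        return True


limiter = RateLimiter(max_requests=3, window_seconds=1.0)
```

A client making a fourth request within the same second is refused, while a different client, or the same client a second later, is let through.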

Financial software tends to be less critical in that you can pull the plug out, and the people I have known who work there are pretty smart. The sort of people doing ATC software are mostly on LinkedIn and telling everybody they can’t get a job anywhere else.

The whole thing was written in Java by a load of obvious hackers.

“Everyone knows” that you should use ASSERT statements to check for broken assumptions in the code. But that is a TERRIBLE idea in real time code. Much better to let one packet (or flight plan) be broken, than to bring down the whole system.

I agree, except this is not really real time. There is no microsecond or even millisecond precision involved. They have enough time to write debugs to a logfile and indeed they seem to have been doing that, which is how it was eventually sussed.

Or you use a language designed for high-availability real time applications such as Erlang. (Used e.g. in Telecom systems and systems like the WhatsApp server.)

All that does is select for clever programmers who can get their head around these weird languages. And they are scarce, so very expensive, and cannot be fired. Those people could write equally reliable stuff in C, or (much more slowly) in assembler. Or Fortran. I’ve seen so many of these claims. I used to know an old guy who kept saying how Forth is great for safety-critical systems… Well, almost nobody could understand it, so those who could were super clever and produced good code.

The key thing is that each niche in programming has its own employment ecosystem. In embedded / realtime you have mostly C. In ATC and some other areas you have Java. In telecoms you have XYZ… Each one says their language is super safe and super brilliant. But in reality any program longer than about 100 lines (variable, obviously) cannot be properly tested…

Software will always have bugs. The bug here was bringing down the system, instead of isolating that FP and sending it to some human to fix. According to another plausible commentary I read, NATS already have a system looking for FPs which they know will break their software, and this French one is just another one to add to that list. Then the issue can be dealt with separately. Or not, if it happens once every 5 years.

Administrator
Shoreham EGKA, United Kingdom

All that does is select for clever programmers who can get their head around these weird languages.

Couldn’t agree more. All our code (the stuff that matters) is in C++. If you have the right mindset you can write extremely reliable code in C++, thank you very much. With the great advantage that you can fairly easily hire people to work on it. Write your code in Haskell or Erlang and… best of luck with that.

I agree, except this is not really real time. There is no microsecond or even millisecond precision involved.

Depends what you mean by real-time, I agree. What I really meant is that it has to keep on working, no matter what, because bad things will happen if it doesn’t. Imagine if the fly-by-wire avionics says “oh, this shouldn’t be a null pointer, I’d better write a log message and stop”. (Oh wait, it probably does).

LFMD, France

The main issues I have seen in server-level IT have been with logfiles overflowing the server space because the cron job intended to compress them and eventually delete them stopped working.

Usually this happens a year after the person who wrote the code left the company…
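One way to avoid depending on an external cron job at all is to have the application cap its own log size. A minimal sketch using Python's standard `logging.handlers.RotatingFileHandler` follows; the logger name, path and size limits are arbitrary assumptions. Once the current file reaches `max_bytes` it is rotated, and only `backups` old files are kept, so disk usage stays bounded at roughly `(backups + 1) * max_bytes` even if nobody is looking after the server.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_bounded_logger(name, path, max_bytes=1_000_000, backups=3):
    """Return a logger whose files on disk never grow without limit."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Rotate when the file would exceed max_bytes; keep `backups` old files
    # (path.1, path.2, ...) and delete anything older automatically.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger
```

This does not compress old logs the way the cron job would have, but it also cannot silently stop working a year after its author has left.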

Administrator
Shoreham EGKA, United Kingdom