Anomalous track points on flights UK to southern Spain

If you look at trips such as:
uk.flightaware.com/live/flight/E … /EGKK/LXGB
You can see that the track jumps back to near Gatwick while the plane is actually over mountains in Spain.

If you look at the track points:
uk.flightaware.com/live/flight/E … B/tracklog
you can see that, for example, the first rogue point occurs at UTC 18:51:42 where the latitude jumps from 40.3 to 51.0 in ten seconds.
This is due to a switch of ADS-B receiver from LECU/MCV to EGLF/FAB.

Now, it is perfectly reasonable for the Spanish ADS-B receiver in Madrid (if that is what LECU/MCV is) to fail to receive signals from the plane as it passes over the mountains. What is not reasonable is for the ADS-B receiver in Farnborough, UK (if that is what EGLF/FAB is) to claim to be receiving signals from a plane that is actually 1270 kilometres away. Even worse, it should not be claiming that the plane’s position is near Gatwick when it’s actually in Spain.

This problem affects many flights across Spain, and not just those from Gatwick.

Why is this happening?

(My server is monitoring aircraft flights over an English national park. These rogue points manufacture multiple fictitious flights over the park. So it seems if the problem cannot be corrected at source, I need to code detection and elimination of these rogue points.)

There are two possible causes:

  • the Flightaware process that combines data from different datasources gets confused and manages to combine data from two different aircraft into a single flight.
  • data timing problems where the positions are combined out of order

The timing problems are in the process of being fixed; the first problem is also in the process of being fixed I think.

You absolutely want to do your own validity checking, though; transponder data is not particularly trustworthy (!)

Thanks. Looking at the track on Google Earth here:
brisk.org.uk/pics/EZY8905clip.JPG
All the rogue points are exactly on the track of the plane. It is just that their timing is completely wrong.
In the clip shown, the plane takes off from Gatwick and travels in a southwesterly direction. Its 17th ADS-B report is timed at 17:19:47 UTC, and its 18th at 17:20:12. Exactly on track, in between these two points geographically, is the first rogue point, number 137, which is timed at 18:51:42.

I was thinking about this over lunch, with a glass of red wine, and thought that this would happen if the Farnborough ADS-B receiving station were to track the flight correctly, and report its positions correctly, but give them the wrong timestamp. If its time stamps were UTC plus 1 hour 30 minutes, and sent to FA as such, then FA would innocently insert them into the track when 18:51 UTC eventuates, by which time the plane is over Spain.

But, if that hypothesis is correct, then it would mean that ADS-B timestamps are created by ADS-B receivers. I had assumed that the ADS-B transmitter in the plane created the timestamp. It must have accurate UTC time from GPS satellites so it would be sensible for the plane to do it. So I’m thinking that the ADS-B receiver at Farnborough may be anomalous, having its system clock set wrongly, and applying its own timestamps rather than ones in the ADS-B message (if indeed there are timestamps in the transmitted message).

Chasing this up further, I looked at another anomalous flight:
uk.flightaware.com/live/flight/N … D/tracklog
In this case the rogue points come from timings where the plane was over the Bay of Biscay. As before, the rogue points are all exactly on the track of the plane as it left Gatwick, but have the wrong timings. I looked at three rogues:
21:22:22 UTC ADS-B from EGGW/LTN Luton airport, timing error +47 mins
21:29:38 UTC from EGLC/LCY London City Airport, timing error +62 mins
21:31:39 UTC from EGLC/LCY London City Airport, timing error +60 mins

So it cannot be just a rogue receiver at Farnborough, since these other two receivers give bad timings too. The Bay of Biscay is an hour ahead of BST so the LCY timing error of 60-62 minutes suggests possibly FA is confused about time zones when processing these points? The 47 minute error for Luton is more difficult to explain that way.

It’s not timezones, all times use seconds-since-the-epoch.

There is no timestamp carried in the ADS-B message itself.

Historically the timestamp from the receiver would be used, but as you say that causes problems with the system clock being set incorrectly (there were some guards against that, but it’s hard to distinguish a system clock that’s slightly wrong from slightly delayed data etc).

More recently the time of receipt on the FA servers is used; those clocks are NTP-synchronized so should stay correct (and are monitored etc).

However the FAB receiver actually appears to have a correctly set clock; the problem is, as far as I can tell, in large transmission delays: positions arrive on the server side well after they were actually captured/timestamped/forwarded by the receiver. There are some things that can be done to fix that (look at the round-trip to the receiver and throw it out if it exceeds a threshold), which is next on my list.
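
To make the round-trip idea concrete, here is a minimal sketch. It is not FlightAware's code: the `Position` class, the `accept_position` function, the 5-second threshold and the sample round-trip values and longitudes are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold; a real system would tune this.
MAX_ROUND_TRIP_S = 5.0


@dataclass
class Position:
    receiver_id: str
    lat: float
    lon: float
    server_time: float  # seconds since the epoch, stamped on arrival


def accept_position(pos, round_trip_s, max_round_trip_s=MAX_ROUND_TRIP_S):
    """Return True if the receiver's latest measured round-trip is acceptable.

    round_trip_s maps receiver_id -> most recent round-trip time in seconds.
    A receiver with no measurement is rejected, since its delay is unknown.
    """
    rtt = round_trip_s.get(pos.receiver_id)
    return rtt is not None and rtt <= max_round_trip_s


# Illustrative values only: one receiver badly backlogged, one healthy.
round_trips = {"EGLF/FAB": 5400.0, "LECU/MCV": 0.2}
positions = [
    Position("LECU/MCV", 40.3, -3.9, 1000.0),
    Position("EGLF/FAB", 51.0, -0.3, 1010.0),
]
kept = [p for p in positions if accept_position(p, round_trips)]
```

With these sample values only the LECU/MCV position survives, because the backlogged receiver's positions would be stamped far too late to trust.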

Thanks obj. Gosh, that’s v. surprising in two respects. Firstly, the timing delays I’ve observed are in the range 47 to 92 minutes, and surely those are much too long to be caused by internet transit times, especially in a country like UK which has good internet infrastructure. Secondly, if FA is using time of arrival at FA’s servers to do the timestamp, that would explain something else that I’ve noticed: planes on FA’s live screens are about ten minutes behind those on flightradar24.

I’d have thought it would have been better to use a timestamp inserted by the ADS-B receiving station, but to check the difference between that and time of arrival at FA’s servers and if too much then you know there may be a clock problem at the receiver. (But I’m assuming here that the receivers forward to FA all the time rather than batching)

Right, so I’m going to start coding an anomaly-removing algorithm now.

I think it is a combination of CPU overload, unbounded buffers, and possibly limited upstream bandwidth / TCP going slow - if you’re trying to push more data upstream than can fit over a sustained period of time, you gradually drop further and further back - if there is no mechanism to bound this, you could be hours behind at busy times.
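
As a rough illustration of what bounding that buffer could look like, here is a sketch that drops the oldest queued messages once a limit is reached, so the sender can never fall more than a fixed number of messages behind. The class name `BoundedUplinkBuffer` and the sizes are hypothetical; this is not how any actual receiver software is known to work.

```python
from collections import deque


class BoundedUplinkBuffer:
    """Queue of outgoing messages that discards the oldest when full."""

    def __init__(self, max_messages):
        # deque with maxlen silently evicts from the left on append when full.
        self._queue = deque(maxlen=max_messages)
        self.dropped = 0  # count of messages discarded because of the bound

    def push(self, message):
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1
        self._queue.append(message)

    def pop(self):
        """Return the oldest queued message, or None if the queue is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)


# Producer outruns the uplink: ten messages arrive, only three fit.
buf = BoundedUplinkBuffer(max_messages=3)
for n in range(10):
    buf.push(n)
oldest_remaining = buf.pop()
```

Dropping stale positions loses data, but for live tracking a fresh position is usually worth more than an old one delivered an hour late.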

I’d have thought it would have been better to use a timestamp inserted by the ADS-B receiving station, but to check the difference between that and time of arrival at FA’s servers and if too much then you know there may be a clock problem at the receiver. (But I’m assuming here that the receivers forward to FA all the time rather than batching)

This is tricky to get right because you can’t reliably detect relatively small offsets. And yeah, there’s batching and delayed data to deal with. And requiring NTP-synchronized clocks on receivers is another can of worms.

Secondly, if FA is using time of arrival at FA’s servers to do the timestamp, that would explain something else that I’ve noticed: planes on FA’s live screens are about ten minutes behind those on flightradar24.

I don’t think that’s related at all; under normal cases positions are processed in under 30 seconds. Sometimes the downstream FA servers will get backlogged and it can take longer, but the timestamps still reflect the arrival time (before processing). Maybe things were just running slow when you looked.

I tracked this particular one down to slowness in the bit of the processing chain that stamps a receive time on each message. This is hopefully fixed now (by moving the timestamping process much earlier, right after we get the message). It will take a couple of days to filter through to everywhere but if you see similar problems for flights after, say, this Wednesday, let me know.

Many thanks. I’ll keep a look out.

Overnight I had another idea. When adding a new point to an aircraft’s track, you know the position of the aircraft and the ID of the ADS-B station that received the new point. Calculate the distance between the two and if it is more than some limit, say 400 miles, don’t add the point. I think this would completely eliminate the rogue points I’ve seen for planes flying south from UK over Spanish mountains or Bay of Biscay.
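
Something like the following sketch is what I have in mind. The receiver table, its approximate Farnborough coordinates, the function names and the sample aircraft positions are my own assumptions for illustration; only the 400 mile limit comes from the idea above.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_RANGE_KM = 400 * 1.609344  # the 400 mile limit, in kilometres

# Hypothetical lookup table; coordinates are approximate.
RECEIVER_LOCATIONS = {
    "EGLF/FAB": (51.28, -0.78),  # Farnborough, UK
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_receiver_range(receiver_id, lat, lon, max_range_km=MAX_RANGE_KM):
    """Return False if the aircraft is implausibly far from the receiver.

    Receivers missing from the table cannot be checked, so they pass.
    """
    location = RECEIVER_LOCATIONS.get(receiver_id)
    if location is None:
        return True
    return haversine_km(location[0], location[1], lat, lon) <= max_range_km


# Aircraft known to be over central Spain, but the point is credited to Farnborough.
over_spain = within_receiver_range("EGLF/FAB", 40.3, -3.7)
# Aircraft just south of Gatwick, credited to Farnborough.
near_gatwick = within_receiver_range("EGLF/FAB", 51.0, -0.3)
```

Here `over_spain` comes out False and `near_gatwick` True. Note that this needs the aircraft's true position at the stamped time from some other source, since the rogue points themselves carry a perfectly plausible position for the receiver.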

There are some checks like this already present, but it’s not trivial to do it across the whole track: the combined track is built from several independently-processed datasources that might provide positions out of time order and with differing processing / data release delays.

Obj, you asked me to let you know if I saw any more of these anomalies. Well here is a big one:
uk.flightaware.com/live/flight/M … G/tracklog
There are points with late timestamps on Aug 21st at UTC 15:24:02, 15:27:37, 15:28:57 etc. (But the delays are not the same.)

regards

Thanks. This has the same underlying cause (slowness in the first part of the processing chain) as the original problem. I’ll dig into why there was a regression there (the slowness should not be breaking anything in theory now, it should just mean that tracks take a while to update)

Thanks.

I don’t know if this is any help, but the Excel spreadsheet up here
http://www.brisk.org.uk/pics/MON978-20150821-trackcalc.xlsx
shows the ADS-B track points of the above anomalous flight, together with calculated great circle distances between adjacent points and hence calculated speed over ground. I’ve highlighted in yellow the track points that give rise to ridiculously high speeds compared to the reported ADS-B speed.

It seems like this might be the basis for discarding those anomalous points.
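
A minimal sketch of that filter, under some assumptions of mine: the points are `(timestamp, lat, lon)` tuples, they are processed in timestamp order starting from a point taken to be valid, and a fixed 1200 km/h ceiling stands in for the comparison against the reported ADS-B speed that the spreadsheet does. The function names and sample coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 1200.0  # hypothetical ceiling for an airliner's ground speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drop_rogue_points(points, max_speed_kmh=MAX_SPEED_KMH):
    """Keep points whose implied speed from the last kept point is plausible.

    points: iterable of (timestamp_seconds, lat, lon).
    The earliest point is always kept and assumed to be valid.
    """
    kept = []
    for t, lat, lon in sorted(points):
        if not kept:
            kept.append((t, lat, lon))
            continue
        prev_t, prev_lat, prev_lon = kept[-1]
        dt_hours = (t - prev_t) / 3600.0
        if dt_hours <= 0:
            continue  # duplicate timestamp: speed is undefined, so discard
        speed_kmh = haversine_km(prev_lat, prev_lon, lat, lon) / dt_hours
        if speed_kmh <= max_speed_kmh:
            kept.append((t, lat, lon))
    return kept


# Made-up track over Spain with one point that jumps back towards Gatwick.
track = [
    (0, 40.30, -3.70),
    (10, 51.00, -0.30),  # rogue: over 1000 km away, ten seconds later
    (20, 40.27, -3.70),
]
cleaned = drop_rogue_points(track)
```

On this sample the middle point is discarded and the other two are kept. Comparing each new point against the last kept point, rather than the last seen point, stops one rogue point from causing the next good point to be thrown away as well.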

The error is just in the timing. This particular thing affects a group of receivers at a time, so an approach based on discarding outliers is not going to work (how do you know which is the right data?)

The rogue points don’t seem to be in blocks, in the cases I’m looking at. For example, the algorithm works perfectly to remove rogue points from this one:
http://www.brisk.org.uk/pics/EZY8995-20150821.jpg
(it removes all the yellow lines)

Because the algorithm works by looking at the points in timestamp order, and the early points are by definition not delayed, it is bound to start with valid data.

It would be interesting to see an example of a track where it doesn’t work. Perhaps I should look at tracks in the USA where things may be different?

The track points you see are after outlier detection etc. The raw data has many more points from many receivers, and when there is a timing issue we get a block of points that are all self-consistent so outlier detection doesn’t help.

Once the regression is fixed this should just go away.