Timeliness of Data

Is there a guide to how much time passes between when flight-related data is generated and when it becomes available via the API?

We do not intentionally introduce artificial delays into the data, but there is some unavoidable delay in receiving and processing the data and making it available. The amount of the delay varies depending on the data source, so it is difficult to characterize the delay in a single number. Regardless of the transmission delay, the timestamp recorded on each position or departure/arrival time is when the event was recorded.

For example, FAA ground radar data comes to us with at least a 5-minute delay. Data received over ACARS or satellite datacom links is delayed by several minutes at minimum, but can be delayed by an hour or more if the avionics stored it for batch delivery at the aircraft’s next uplink. ADS-B position data generally has less than 15 seconds of delay.

We are trying to write an application that identifies aircraft accurately.

The application can’t identify aircraft in realtime, both because the FAA delays the ASDI data, and because the data from other sources arrives with various other delays. But it is acceptable for the application to delay identifying aircraft for a reasonable period of time to ensure that most data has arrived (except for ACARS/satellite datacom, which we’ll probably just skip because the delay can be an hour or more).

Suppose it delays for 7 minutes, to make sure the FAA data has arrived.

Now the problem is that when we call SearchBirdseyeInFlight, it returns not the flights from 7 minutes ago, but the flights it thinks are in the air now. That obviously can’t be accurate, because the ASDI data for the current time hasn’t even arrived yet.

We don’t qualify for realtime data.

At 600 mph, an aircraft goes 50 miles in 5 minutes, so by the time we get the ASDI data, it is so woefully imprecise that our application is worthless.

What we need is for SearchBirdseyeInFlight to support an asOfTime query parameter. Then our application would just wait 7 minutes, and call SearchBirdseyeInFlight?asOfTime=-7 (or -420 if it’s in terms of seconds instead of minutes).

When the call reaches FlightAware’s servers, the backend code would query for aircraft in the bounding area as it does today, but also within a bounding time.

To ensure that non-exactly-matching timestamps aren’t filtered out, the bounding time would need to be forgiving. For example, if it was +/- 5 seconds, then everything from -415 to -425 would be included. (Probably the allowed fuzziness should be another query parameter.)
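To make the idea concrete, here is a rough sketch of the time-window filter described above. It is only an illustration of the proposed behaviour, not anything the API does today: the Position class, the positions_as_of function, and the fuzz parameter are names made up for this example, and asOfTime is the parameter proposed in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical record for one reported position."""
    flight_id: str
    clock: int    # event timestamp, seconds since the Unix epoch
    lat: float
    lon: float


def positions_as_of(positions, now, as_of_offset=-420, fuzz=5):
    """Keep positions whose clock is within `fuzz` seconds of now + as_of_offset.

    as_of_offset plays the role of the proposed asOfTime parameter (in seconds);
    fuzz is the proposed "allowed fuzziness" parameter.
    """
    target = now + as_of_offset
    return [p for p in positions if abs(p.clock - target) <= fuzz]
```

With the defaults, a position stamped 415, 420, or 425 seconds before now is kept, and one stamped 414 or 426 seconds before now is dropped.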

It gets trickier if there are no data points in the bounding time, but there are data points outside the bounds. In that case, some linear interpolation could be performed by the server. Computers are good at that sort of thing, provided they have access to enough data, which is true for FlightAware’s servers!
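The interpolation step could look something like the sketch below. It assumes each data point is a (clock, lat, lon) tuple and that the two points bracket the requested time; the function name is made up for this example. It interpolates latitude and longitude directly, which is only a reasonable approximation over short gaps and does not handle a flight crossing the 180° meridian.

```python
def interpolate_position(before, after, target_clock):
    """Linearly interpolate (lat, lon) at target_clock between two fixes.

    `before` and `after` are (clock, lat, lon) tuples with
    before.clock <= target_clock <= after.clock.
    """
    t0, lat0, lon0 = before
    t1, lat1, lon1 = after
    if not t0 <= target_clock <= t1:
        raise ValueError("target_clock must lie between the two fixes")
    if t1 == t0:
        # Both fixes share a timestamp, so there is nothing to interpolate.
        return lat0, lon0
    frac = (target_clock - t0) / (t1 - t0)
    return lat0 + frac * (lat1 - lat0), lon0 + frac * (lon1 - lon0)


# Halfway between two fixes 100 seconds apart:
print(interpolate_position((0, 10.0, 20.0), (100, 12.0, 24.0), 50))  # (11.0, 22.0)
```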

Surely we aren’t the first software company to have this requirement? How have other applications solved this problem?

Explaining this problem to other people, I’ve used this analogy, which may or may not make the problem clear(er):

Think of a security camera that tapes everything going on in a convenience store. It records everything with a timestamp. But suppose you can’t look at the tape until 5 minutes later. That’s fine. You still see what happened. You just can’t see it in realtime.

Now, suppose there are, say, 3 security cameras, and each of them plays back with a different delay. One displays the tape 10 seconds after events happen, another displays 30 seconds after, another displays 5 minutes after.

Your task is to see what happened exactly five minutes ago. How would you do that?

With the tapes, it’s obvious. You start the 5-minute-delayed tape and wait until 4:30, then start the 30-second-delayed tape, wait until 4:50, and start the 10-second-delayed tape. When the clock hits 5 minutes, all three tapes are time-aligned, and you get an accurate picture, from all three cameras, of what happened five minutes ago.

I need to do the same thing with flight data.

With the SearchBirdseyePositions function, you can request positions for a specific “fp” (faFlightID) that have a “clock” value that is greater than 10 minutes ago, for example. Each time you call that function, you will receive just positions within the last 10 minutes, some of which you might have already received, but your application can filter out the ones it already knows about. Just ensure that you call that function every 10 minutes to get continuous data.

Note that I intentionally picked a time interval that was at least 5 minutes, because that is how much the FAA delay is. If you wish to poll for positions more frequently, then that’s fine but you’d probably still want to request positions for more than the last 5 minutes to ensure you will be returned the FAA delayed positions when they do become available with the “backdated” timestamps.
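As a rough illustration of the filtering step, the sketch below drops positions your application has already seen when consecutive polls overlap. It assumes each returned position has been turned into a dict with "fp" and "clock" keys and that a flight has at most one position per clock value; the function name and the dict layout are assumptions made for this example, not part of the API.

```python
def new_positions(batch, seen):
    """Return the positions in `batch` not seen before, updating `seen` in place.

    `seen` is a set of (fp, clock) pairs remembered across polls.
    """
    fresh = []
    for pos in batch:
        key = (pos["fp"], pos["clock"])
        if key not in seen:
            seen.add(key)
            fresh.append(pos)
    return fresh


seen = set()
first_poll = [{"fp": "F1", "clock": 100}, {"fp": "F1", "clock": 160}]
# The second poll overlaps the first and includes a late-arriving,
# backdated position (clock 130) alongside a genuinely new one (clock 220).
second_poll = [{"fp": "F1", "clock": 160},
               {"fp": "F1", "clock": 130},
               {"fp": "F1", "clock": 220}]
new_positions(first_poll, seen)
late_and_new = new_positions(second_poll, seen)  # clocks 130 and 220 only
```

Keying on the clock value rather than on arrival order is what lets the backdated position through even though a later one was already processed.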

Alternatively, you can just call SearchBirdseyePositions as before, and remember the “clock” of the last position you received for that flight. When you need to query for more positions for that flight, just request all positions with a clock value that is greater than the last clock that you received (or maybe even 5 minutes prior to that last clock if you want to ensure a little overlap).
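A minimal sketch of that bookkeeping follows. It only computes the lower bound on "clock" to use in the next request and does not build or send the request itself. The function names, the state dict, and the default values (5 minutes of overlap, a 10-minute window on the first poll) are assumptions chosen for this example.

```python
def next_query_clock(last_clock_by_flight, fa_flight_id, now,
                     overlap=300, initial_window=600):
    """Return the clock value to use as the lower bound of the next query."""
    last = last_clock_by_flight.get(fa_flight_id)
    if last is None:
        # Nothing received yet for this flight: start with a fixed window.
        return now - initial_window
    # Back up a little so late-arriving, backdated positions are not missed.
    return last - overlap


def record_batch(last_clock_by_flight, fa_flight_id, batch):
    """Remember the newest clock seen for a flight; an empty batch changes nothing."""
    if not batch:
        return
    newest = max(pos["clock"] for pos in batch)
    previous = last_clock_by_flight.get(fa_flight_id, newest)
    last_clock_by_flight[fa_flight_id] = max(previous, newest)
```

Because of the overlap, some positions will be returned more than once, so the application still needs to discard the ones it already has.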