Calculation of values "nearby sites"

foxhunter · November 18, 2019, 7:43am

Does somebody know for which time frame the values in the table “nearby sites” are calculated/counted?

these values:

EDIT: Found it. It seems to be the median of the last seven days.

SoNic67 · November 18, 2019, 10:12am

In my case there is no median. It just takes one of the previous days (usually 3rd or 4th one back) on “reported” and slaps it in there.

wiedehopf · November 18, 2019, 10:14am

That will probably be the median.
Median - Wikipedia

Note that it’s not an arithmetic average, that’s something different.

SoNic67 · November 18, 2019, 10:19am

There is no median, those are exact numbers.

wiedehopf · November 18, 2019, 10:24am

Yeah, the median is an exact number chosen by a certain principle, which is explained in the article I linked.

Again, the median is not an average.

SoNic67 · November 18, 2019, 10:33am

A continuous probability distribution would make sense in this specific case; a discrete one doesn’t.

But hey, if it’s easier on the server…

obj · November 18, 2019, 10:40am

This is correct.

Not when the underlying measurements are in terms of UTC days.

foxhunter · November 18, 2019, 10:48am

What you have shown in the screenshot is exactly the median of your last seven days.

SoNic67 · November 18, 2019, 2:35pm

IMO the measurements are the numbers on vertical axis, those get “averaged”. The days are just sampling intervals, those don’t get “averaged”.

foxhunter · November 18, 2019, 3:03pm

day 1: 1000
day 2: 1500
day 3: 1200
day 4: 1700
day 5: 1300
day 6: 1100
day 7: 900

These numbers will be sorted by size and then the value which is at position 4 will be reported. That’s how it is explained in the article posted by wiedehopf.

In my example the median would be 1200 because there are three values lower (900/1000/1100) and three values higher (1300/1500/1700)
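The same steps can be written as a minimal Python sketch. It only illustrates the sort-and-pick-the-middle procedure described above, using the seven example values; the variable names are made up for illustration and this is not FlightAware's actual code.

```python
# Example daily values from the list above (day number -> count).
daily_counts = {1: 1000, 2: 1500, 3: 1200, 4: 1700, 5: 1300, 6: 1100, 7: 900}

# Sort the seven values by size.
ordered = sorted(daily_counts.values())

# With seven values, position 4 (index 3 when counting from zero) is the middle.
middle_index = len(ordered) // 2
median_value = ordered[middle_index]

print(ordered)       # [900, 1000, 1100, 1200, 1300, 1500, 1700]
print(median_value)  # 1200
```

Note that the result is always one of the reported daily values, which is why the number in the table matches one specific day exactly.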

wiedehopf · November 18, 2019, 3:18pm

As you illustrated, the median is chosen separately for positions and for aircraft count.

For each of positions and aircraft count you have 7 values.
Sort those 7 values and choose the middle value.

There you have your two medians.

thespeedycab · November 18, 2019, 3:28pm

No.
Both the median and the average (better known as mean) are measures of location of data.
The median is the value such that half the values are smaller and half are larger. For FA this is calculated as the median over the last 7 days (as pointed out by @foxhunter). For an odd number of values the median is exactly the middle value.

The mean or average is a different measure of location.

SoNic67 · November 18, 2019, 3:35pm

I understand that, how it is calculated.
What I am saying is that it’s irrelevant (or badly applied) here, where the granularity of the input data is much higher than what the median is normally used for in statistics. We are talking about thousands of flights per day; those are better represented by a normal average. We have a data set in the 1000-2000 range, which can be averaged and rounded up or down to an integer number of airplanes.
The median is used only when something like that (the arithmetic mean) makes no physical sense, for example for small sets of integer data, maybe under 10, like throwing a die.

caius · November 18, 2019, 3:56pm

That’s not what the median is for. The median is used when you don’t want the data to be skewed by outliers so much - for example, say someone turns off their receiver for a few days or their antenna falls over or something. Using the mean would result in those two unrepresentative days pulling the average down. Using the median is more representative of the normal situation.

foxhunter · November 18, 2019, 4:01pm

There is no “calculation” in the overview of “nearby sites”. It’s simply a value taken from the list of the last seven days. The example on Wikipedia shows the weakness of median values: if you change the highest value from 40 to 400, the median does not change, because that value is still the highest.

So worst case example:

1, 2, 3, 4, 5, 6, 1000
will still give the median of 4

However it can give you a trend because the next day the values for counting have been changed.
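To make the worst-case example concrete, here is a small Python sketch comparing the median and the mean of that list. It uses only the standard `statistics` module; the list names are made up for illustration.

```python
from statistics import mean, median

# Seven ordinary values, and the worst-case list with one extreme value.
typical = [1, 2, 3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 5, 6, 1000]

# The median ignores how large the biggest value is.
print(median(typical))       # 4
print(median(with_outlier))  # 4

# The mean is pulled strongly towards the extreme value.
print(mean(typical))                 # 4
print(round(mean(with_outlier), 2))  # 145.86
```

The same effect works in the other direction: a day or two with unusually low counts would drag the mean down but leave the median where it was.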

thespeedycab · November 18, 2019, 6:32pm

I can only speculate as to what FA’s thinking was, but I can give you my opinion and considerations:

What is a reasonable amount of data to include in the statistic? Here it is a week. It could be a month, a year, or some other amount. Given the currency of the data and its fluctuation, it seems quite reasonable to limit the number of data points included. 1 week seems right, if admittedly somewhat arbitrary (e.g. why not 2 weeks?).
With that amount of data, what is the proper way of summarizing it? There are only 7 data points included. Then we need a measure that is properly robust to small sample size AND will be interpretable. The mean lacks this robustness for small sample sizes (but might have worked well for a month of data, for example).
Another consideration is quality of data. While it is tempting to think of the counts of aircraft and positions as perfect, they are unlikely to be that (e.g. TIS-B overcounting, weather influence, etc.). Some protection against interpretation of bad data is helpful. The median will be more robust than the mean (this would likely be true even with a month’s worth of data).

Arguably, none of these data are really continuous, or fitting into some statistical distribution. Now, that is okay as we are not trying to perform inference on the distribution itself. In that regard, the mean has no real issues.

sonnendeck · August 2, 2023, 7:05pm

Fun fact: if you are a new member and have reported an even number of days smaller than 7, it takes the average of the two middle values.
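For an even number of days this is the standard definition of the median. A minimal Python sketch with four made-up daily values follows; whether the site computes it exactly this way is an assumption based on the observation above.

```python
from statistics import median

# Hypothetical counts for a new site with only four reported days.
four_days = [1000, 1500, 1200, 1700]

ordered = sorted(four_days)  # [1000, 1200, 1500, 1700]

# With an even count there is no single middle value,
# so the two values around the middle are averaged.
lower_middle = ordered[len(ordered) // 2 - 1]  # 1200
upper_middle = ordered[len(ordered) // 2]      # 1500
manual_median = (lower_middle + upper_middle) / 2

print(manual_median)      # 1350.0
print(median(four_days))  # 1350.0
```

In this case the reported number does not have to match any single day.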

astrodeveloper · August 2, 2023, 8:54pm

Thinking as a systems and data administrator in another life (now happily retired), using a median is much simpler and better for representing how a site works. Little math, quick results. Much less thrashing on the servers, and consistent for all users. It just makes sense.
