FlightAware Discussions

MLAT: UDP packet loss

i’ve checked the piaware.log by chance and have identified that there are messages from time to time like this, with different percentage values:

> Aug 16 19:40:45 piaware piaware[643]: NOTICE from adept server: 34% of multilateration messages (UDP) are not reaching the server - check your network?
> Aug 16 20:43:09 piaware piaware[643]: NOTICE from adept server: 20% of multilateration messages (UDP) are not reaching the server - check your network?
> Aug 16 21:00:21 piaware piaware[643]: NOTICE from adept server: 29% of multilateration messages (UDP) are not reaching the server - check your network?
> Aug 16 21:01:15 piaware piaware[643]: NOTICE from adept server: 12% of multilateration messages (UDP) are not reaching the server - check your network?
> Aug 16 21:11:18 piaware piaware[643]: NOTICE from adept server: 33% of multilateration messages (UDP) are not reaching the server - check your network?

Is there anything i need to do? It looks like it started sometime this evening.
My network runs fine without issues. The status on the local webpage and the FA stats page do not show any problems.

@obj
Because this comes up now and then and i also had this problem, a little inquiry:

How does your server handle out of order UDP packets?
Because i believe my problem was a tunnel that combined the bandwidth of a LTE mobile connection and a wired DSL connection.
Once i switched that off, no packet loss anymore.

Note that i didn’t have any packet loss when testing ping etc.

So my guess is, when the packet order is disrupted, you get the notice in regards to packet loss.
Is that a possibility or am i way off?

i’ve checked my log again which was created new on August 12th, so four days ago.

The lines quoted above are the only ones in the whole file. Probably an issue with server reachability for a certain period of time?

I am running a dedicated VDSL connection and did not change anything today, i also left the Pi alone as i was doing other stuff.

I checked the previous logfile and found only these entries with “NOTICE” in them:

> Aug 5 14:50:54 piaware piaware[538]: NOTICE adept server is shutting down.  reason: reconnection request received
> Aug 5 14:52:13 piaware piaware[538]: NOTICE adept server is shutting down.  reason: reconnection request received
> Aug 5 15:01:24 piaware piaware[538]: NOTICE adept server is shutting down.  reason: reconnection request received

Earlier than that, the network messages again:

> Aug 2 01:13:50 piaware piaware[538]: NOTICE from adept server: 12% of multilateration messages (UDP) are not reaching the server - check your network?
> Aug 2 01:14:35 piaware piaware[538]: NOTICE from adept server: 60% of multilateration messages (UDP) are not reaching the server - check your network?

no further entries since install of the SD card (July 23rd)

Or a problem literally anywhere along the way.
This includes bandwidth problems.

reachability in different flavors…

Out-of-order delivery won’t affect it (and, so long as the data is still timely, the mlat server can actually still use out-of-order data). There’s actually not any ordering info in the packets.

The packet loss calculation is really simple: the client counts how many packets it sent and periodically reports this to the server, and the server compares that to how many packets it received since the last report. It’s not going to be 100% correct because a few packets sent around the time of the report may be counted in a different period, but it’s pretty close.
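To make the counting scheme above concrete, here is a minimal sketch. It is not the actual server code; the class and method names (`LossTracker`, `on_datagram`, `on_report`) are made up for illustration, and the real report format is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass
class LossTracker:
    """Hypothetical server-side counter for one client's UDP datagrams."""

    received: int = 0  # datagrams seen since the last client report

    def on_datagram(self) -> None:
        # Called once for every UDP datagram that actually arrives.
        self.received += 1

    def on_report(self, sent_since_last_report: int) -> float:
        """Compare the client's sent count with what arrived, then reset.

        Returns the estimated loss as a fraction between 0.0 and 1.0.
        """
        if sent_since_last_report <= 0:
            loss = 0.0
        else:
            # Datagrams that straddle a report boundary can make the
            # received count exceed the sent count, so clamp at zero.
            lost = max(sent_since_last_report - self.received, 0)
            loss = lost / sent_since_last_report
        self.received = 0
        return loss
```

If the client reports 100 datagrams sent and 66 arrived in that period, `on_report(100)` gives 0.34, which would correspond to a "34%" notice like the ones in the log.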

so as a summary: Nothing to be concerned about as long as the number of these messages is low, correct?

Well, it’s just MLAT and it still works despite some packet loss.

Also unless your WiFi is unstable, there’s nothing you can do about it.

ok, thanks

I have my Raspberry connected via LAN cable, WiFi is disabled

That increases the mystery.

When i had the problem i tried using mtr --udp to find where the packet loss was occurring along the way to the FA servers.
I was completely unsuccessful in finding any hops with packet loss.

Checked the piaware.log this morning again, no new messages about packet loss.

Default mtr datagram size is pretty small, IIRC. mlat will be using ~1.4k datagrams.

Maybe consider reducing that to 1280 bytes.
(Or detecting which packet size arrives unfragmented)

When having the problem i also tried reducing the MTU on the Pi in the hope the packet loss would go away. If i remember correctly it didn’t have any effect.

You have probably already seen this article at some point, but it’s an interesting summary of that issue (mostly related to IPv6):
https://blog.cloudflare.com/increasing-ipv6-mtu/

I suppose i could reactivate the LTE tunnel and do some more testing :wink:
But i don’t get the impression it’s a big problem overall.
MLAT even works reasonably well with relatively high packet loss.

PMTUD is a fair amount of work to implement and frankly it’s not worth it in this case - 1400 is small enough for most connections and for the rest of them, let the IP stack do fragmentation. If fragmentation is broken… you should fix your setup.

Hmm. I tried reducing the MTU on my Raspbian to avoid any possible fragmentation issues down along the line.
(Raspbian should be fragmenting the packets before putting them on the line, right?)
Didn’t seem to have helped.

Oh well maybe that tunnel stack by Deutsche Telekom is just horribly broken and mangles UDP.
Wouldn’t surprise me.

Respecting the local MTU still seems like a good idea in case it’s smaller than 1400.
(and it should just be fragmented by the local IP stack and go down the line just fine anyway, but still, fewer packets don’t hurt.)

What is the average packet loss across all MLAT participants?

Generally the kernel should be aware of the path MTU anyway, because it has an established TCP connection to the same host for ADS-B data transfer. It’ll fragment datagrams that exceed the path MTU without mlat-client needing to do anything special.

I guess mlat-client could set IP_PMTUDISC_WANT (this may be the system-wide default anyway) or IP_PMTUDISC_DO and inspect IP_MTU without a ton of work. I’m not sure if these options are easily visible from Python though. IP_PMTUDISC_PROBE (i.e. a full RFC 4821 implementation) is a lot of work requiring coordination with the server side, and is unlikely to happen given that the only real benefit it gives is when the path MTU isn’t otherwise discoverable - and we always have that TCP connection hanging around to help out with that.
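For anyone curious what the IP_PMTUDISC_DO / IP_MTU idea could look like from Python, here is a rough, Linux-only sketch. It is not what mlat-client does. The function name `cached_path_mtu` is made up, and the numeric fallbacks are the Linux values from `<linux/in.h>`, used on the assumption that the Python build may not export these names.

```python
import socket
import sys

# Linux values from <linux/in.h>; fallback only if Python lacks the names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)
IP_MTU = getattr(socket, "IP_MTU", 14)


def cached_path_mtu(host, port):
    """Return the kernel's current path MTU estimate for host, or None.

    Only meaningful on Linux; returns None elsewhere or on any socket error.
    """
    if not sys.platform.startswith("linux"):
        return None
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # Set the DF bit on outgoing datagrams and never fragment locally.
            sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
            # IP_MTU can only be read on a connected socket. connect() on a
            # UDP socket sends nothing; it just fixes the destination.
            sock.connect((host, port))
            return sock.getsockopt(socket.IPPROTO_IP, IP_MTU)
        except OSError:
            return None
```

The value returned is whatever the kernel currently has cached for that destination, so right after boot it will usually just be the MTU of the outgoing interface until some traffic (such as the TCP connection mentioned above) has triggered path MTU discovery.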

(keep in mind that “respecting the local MTU” does not help here - it is the path MTU which matters, which can vary by destination)

Yes i’m aware.
But if there is a fragmentation issue down the line i should be able to circumvent it by reducing the local MTU below that threshold where the connection down the line “goes bad”.

Thanks for all the explanations :slight_smile:

Sure, and that should just work already with no changes - mlat-client will continue to send 1400-byte datagrams, and the local IP stack will fragment those down to 1280 or whatever you set. It’s not as efficient and you may lose some datagrams to reassembly failure, but it should work. The PMTUD changes are just about avoiding that extra fragmentation.
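As a rough illustration of the fragmentation described above, the arithmetic for IPv4 can be sketched like this. It assumes the 1400 bytes are the UDP payload and that the IP header carries no options; the function name `ipv4_fragment_sizes` is made up for this example.

```python
from typing import List

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ipv4_fragment_sizes(udp_payload: int, mtu: int) -> List[int]:
    """Return the size of each IP packet (header included) on the wire."""
    remaining = udp_payload + UDP_HEADER  # bytes carried inside IP
    if remaining + IPV4_HEADER <= mtu:
        return [remaining + IPV4_HEADER]  # fits, no fragmentation needed
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any data")
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + IPV4_HEADER)
        remaining -= chunk
    return sizes
```

With a local MTU of 1280, `ipv4_fragment_sizes(1400, 1280)` gives two packets of 1276 and 172 bytes, while with a 1500-byte MTU the same datagram goes out as a single 1428-byte packet. Losing either fragment loses the whole datagram, which is the reassembly cost mentioned above.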

Yeah and even doing that didn’t help when i had the problem.

Maybe it was another problem altogether unrelated to MTU.