FlightAware Discussions

FYI New Pi Kernel causes dump1090-fa to run much harder

Same here. It’s very clear when I updated my system (I updated twice).

Looks like it is, indeed, a problem with zerocopy.

Running 5.4.51 on a Pi 4B, dump1090-fa 3.8.1 + librtlsdr c1faae295cb523d34ebfd1e007f244e8ee624ea3:

librtlsdr built with ENABLE_ZEROCOPY=ON (and zerocopy is actually used):

CPU load: 46.3%
  4031 ms for demodulation
  23800 ms for reading from USB
  8 ms for network input and background tasks

librtlsdr built with ENABLE_ZEROCOPY=OFF:

CPU load: 11.8%
  5490 ms for demodulation
  1614 ms for reading from USB
  11 ms for network input and background tasks

On the older kernels, librtlsdr would try to use zerocopy but would bail out and not actually use it because of a kernel bug.

I would speculate that when zerocopy is enabled and actually used, the write cache characteristics of the memory region the buffers lie in interact badly with what dump1090 does with those buffers. It does the initial conversion work in-place, writing the modified data back to the same buffer, and this is where the extra CPU seems to be coming from.


I thought the default was OFF in librtlsdr (i.e. this would be the default for repo installs), as their workaround for the kernel issue. That is why I tried with ENABLE_ZEROCOPY=ON on the latest kernel, and I was still seeing the increased CPU use. Did I have it backwards?

CMAKE:

option(ENABLE_ZEROCOPY "Enable usbfs zero-copy support" OFF)
if (ENABLE_ZEROCOPY)
    message (STATUS "Building with usbfs zero-copy support enabled")
    add_definitions(-DENABLE_ZEROCOPY=1)
else (ENABLE_ZEROCOPY)
    message (STATUS "Building with usbfs zero-copy support disabled, use -DENABLE_ZEROCOPY=ON to enable")
endif (ENABLE_ZEROCOPY)

I probably messed something up…

This is true for current git, but not for the older versions provided by Raspbian. The default changed after the “dies with SIGKILL” problem surfaced, and those changes haven’t made it into Debian stable so far AFAIK.

So for my own education… it basically boils down to a USB stack issue causing excessive buffer/“memcopy” interrupts? Not wanting to confuse the thread, just trying to validate. I would assume the SIGKILL problem was more or less a timeout, or a GCC compilation bug. There are so many memory access layers, I get lost.

There’s a bunch of stuff going on here.

The idea behind zerocopy is to allow userspace to directly map the actual memory buffer used for talking to the USB controller. So the USB controller fills that memory, and then userspace can directly use the data, without the kernel needing to copy the data around further.

There was a cache coherency or mapping problem of some type (I’m not familiar with the exact mechanics) that meant that in older kernels, the userspace zerocopy buffer didn’t actually point at the right place. So userspace would see garbage data, not the expected USB data. librtlsdr built with ENABLE_ZEROCOPY=ON tests for this and disables zerocopy if that kernel bug is present. This is what happens on the 4.x kernels on a Pi.
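To make the "tests for this" step concrete, here is a minimal sketch of one way such a check could work. It assumes the kernel hands out zero-filled memory for a freshly mapped buffer, so any non-zero byte before the first transfer suggests the mapping points somewhere else. The function name is hypothetical and this is not copied from librtlsdr; treat it as an illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every byte of buf[0..len) is zero,
 * 0 otherwise.
 *
 * Assumption: a correctly mapped, freshly allocated kernel buffer reads
 * back as all zeroes before any USB transfer has touched it. Comparing
 * the buffer with itself shifted by one byte checks that all bytes are
 * equal to buf[0]; together with buf[0] == 0 that means all zero. */
static int buffer_is_zeroed(const unsigned char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}
```

A caller would run this once per buffer right after allocation and fall back to ordinary userspace buffers if it returns 0.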

Separately (I’m not sure if it had the same cause or not) there was a kernel bug that meant on some architectures (maybe aarch64? I forget), trying to access a zerocopy buffer at all resulted in a SIGKILL. librtlsdr would trigger this while trying to test for the previous zerocopy bug. I think this was, again, a mapping problem where the mapping pointed to the wrong place, perhaps outside physical memory. The existence of this problem led to librtlsdr changing the default to ENABLE_ZEROCOPY=OFF.

Finally, with both kernel bugs fixed and ENABLE_ZEROCOPY=ON (this is the situation on Raspbian with a 5.x kernel), zerocopy is actually used. But it performs terribly; if I had to guess, given the previous cache coherency issues, this might be due to something like having the memory region underlying the zerocopy buffers set to uncacheable.

AFAIK userspace is not doing anything really wrong here, it’s just that the zerocopy performance is bad.
dump1090 can probably avoid the problem by ensuring that it only reads from the zerocopy buffer, and doesn’t try to do conversion work in-place there.
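A rough sketch of that "read only" approach, for illustration. The callback has the same shape as librtlsdr's rtlsdr_read_async_cb_t, but everything else here (work_buf, WORK_BUF_LEN, convert_in_place, and the XOR conversion itself) is made up for the example and is not dump1090's real conversion code. The point is only that the buffer handed to the callback is read once and never written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORK_BUF_LEN (256 * 1024)      /* hypothetical size */
static uint8_t work_buf[WORK_BUF_LEN]; /* ordinary, cacheable memory */

/* Stand-in for the real conversion work: flip the top bit of each
 * sample (offset binary to two's complement). Runs in place, but only
 * ever on work_buf, not on the USB buffer. */
static void convert_in_place(uint8_t *samples, size_t len)
{
    for (size_t i = 0; i < len; i++)
        samples[i] = (uint8_t)(samples[i] ^ 0x80);
}

/* Same signature as rtlsdr_read_async_cb_t. buf may be a zerocopy
 * buffer, so it is read exactly once (by memcpy) and never modified. */
static void sample_callback(unsigned char *buf, uint32_t len, void *ctx)
{
    (void)ctx; /* unused in this sketch */
    size_t n = len < WORK_BUF_LEN ? len : WORK_BUF_LEN;
    memcpy(work_buf, buf, n);
    convert_in_place(work_buf, n);
}
```

This costs one extra copy per callback, so it only pays off if writes to the zerocopy region really are the expensive part.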


Now that the discussion is ongoing, are there plans on the FA side to fix this? Or is it out of scope and more related to the operating system?

Rolling back to an old kernel should not be the final solution but only a workaround, or am I wrong?

Yes (but it’ll be at least a week before I can start on it)


No worries, at least you keep it alive :slight_smile:
Thanks!

I won’t update my feeder Pis until this is resolved.

I ran the upgrade weeks ago, before the problem was known, but the new kernel was pending until yesterday, when I needed to reboot because the device was not responding (60 days uptime). The trouble began once the new kernel was in use.

I followed the guide to downgrade to the last 4.19 kernel, but this is now a mixed system with an old kernel and new components. That is fine as long as it works, but I would like to have this cleaned up.

So I guess this is why my dump1090 process keeps crashing?

I also have the USB dongle and my dump1090 process is using about 25% - 30% of my Raspberry’s CPU.

Seems like a fix is a while away. Is that correct?

For me, dump1090 is not crashing. It only needs more CPU (Pi 3 B+ with passive cooling); the temperature is around 50°C now, up from about 48°C.

Nope. Different symptoms. Start by looking at syslog / journalctl.

After downgrading to the earlier kernel, the high CPU usage problem went away, but the system seemed unstable (kept losing WiFi and occasionally froze). I reinstalled the current version Pi OS Lite from the Raspberry Pi Imager (version 4.19.118) and then PiAware and the rest of the software.
It seems a lot more stable now (no noticeable freezes or loss of WiFi) for the past few days since the re-install.

I’m not sure the issue is a huge problem in the grand scheme of things (thanks @obj for clarifying). The Pi doesn’t have an issue dealing with the extra work. No reason to panic IMHO.

I have no issues on RPis running only UAT978 (using rtl-sdr dongles), only Mode S Beast, or only airspy.
I think it could be related to dump1090-fa and the rtl-sdr dongles.

Let’s keep things in perspective… My system is a Pi 4 with lots of memory. For me, the CPU increase is certainly acceptable for the few weeks necessary for the FlightAware people (@obj et al) to repair and deploy whatever is necessary. That means that I’ll wait for things to be fixed and keep my eyes open.

But I’m sure there are more than a few people out there that have Pi 3 or lower that may be impacted a lot more. Their CPU jump could cause other hardware flakiness in turn. It may be better for these people to downgrade the kernel but I’m not in a position to tell one way or the other.

I’m just glad that I picked up on this issue early on and decided to post it. I’m not patting my own back; just relieved that what I brought up is a real issue.


I am not saying it’s not acceptable. And as stated earlier you can return to a 4.xx kernel which does not produce this CPU load.

I am operating a 3B outdoors and currently I do not want that higher CPU load, as it also increases the temperature.
But as long as I am able to use a working kernel, I am good until FA has it resolved.

A simple memcpy before doing any work in the rtl-sdr callback makes it more bearable, but performance is still not great.


Edit:

Note the above graphic might have other factors …
Removing zerocopy actually didn’t save many more cycles than the work-around.
But I was testing with readsb … maybe that just has some inefficiencies that dump1090-fa doesn’t have.