Hi everyone,

Ever since I switched to Arch about two months ago, most applications have been segfaulting multiple times a day. There doesn’t seem to be any pattern to the crashes; sometimes they even happen while the system is idle (e.g. while I’m reading a news article).

Things I’ve tried without any luck so far:

  • Running Firefox in safe mode without any extensions
  • Switching from the regular to the LTS kernel
  • Disabling hardware acceleration in Firefox
  • Changing RAM speed and timings
  • Running Memtest (it passed without errors)
  • Replacing the entire RAM with a new certified kit
  • Using only a single RAM slot
  • Applying Ryzen fixes (iommu=soft, limiting C-states)
  • Using only a single CPU core (maxcpus=1)
  • Downgrading the Nvidia driver to 535xx
  • Using Nouveau instead of the nvidia driver
  • Using Openbox instead of KDE
  • Disabling zswap and THP

Here’s full journalctl from a day where both Spotify and Firefox crashed at the end, a few seconds after each other:

https://pastebin.com/BH0LMnD9
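
For anyone digging through a log like this, one way to check whether the crashes really have no pattern is to tally the kernel's segfault lines by process and by the library they occurred in. The following is only a rough sketch: it assumes the short journalctl format where kernel messages contain `kernel: name[pid]: segfault at ...`, and the function name `summarize_segfaults` and the sample lines are made up for illustration, not taken from the pastebin.

```python
import re
from collections import Counter

# Assumed line shape (journalctl short format, kernel messages):
#   <date> <host> kernel: name[pid]: segfault at ADDR ip IP sp SP error N in lib.so[...]
SEGFAULT_RE = re.compile(
    r"kernel: (?P<proc>.+?)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<module>[^\[\s]+)\[)?"
)


def summarize_segfaults(lines):
    """Count segfault lines per process name and per faulting module."""
    by_proc = Counter()
    by_module = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match is None:
            continue  # not a kernel segfault line
        by_proc[match.group("proc")] += 1
        # The "in <module>" part is not always present, so fall back to a label.
        by_module[match.group("module") or "(unknown)"] += 1
    return by_proc, by_module


# Hypothetical sample lines, only to show the expected shape of the input.
sample_lines = [
    "Jan 05 21:14:03 arch kernel: spotify[4321]: segfault at 8 ip 00007f1c2a3b4c5d "
    "sp 00007ffd12345678 error 4 in libcef.so[7f1c28000000+9000000]",
    "Jan 05 21:14:09 arch kernel: Isolated Web Co[5555]: segfault at 0 ip 00007f0011223344 "
    "sp 00007ffe87654321 error 6 in libxul.so[7f0010000000+5000000]",
    "Jan 05 21:14:10 arch systemd[1]: Started something unrelated.",
]

by_proc, by_module = summarize_segfaults(sample_lines)
for name, count in by_proc.most_common():
    print(f"{count:3d}  {name}")
for name, count in by_module.most_common():
    print(f"{count:3d}  {name}")
```

If the counts are spread across unrelated processes and libraries rather than concentrated in one, that would hint at something systemic (hardware or firmware) rather than a single buggy application.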

Some more info about my system:

  • Ryzen 5 3600X
  • MSI B450M PRO-VDH Max
  • 32GB RAM @ 3200MHz
  • Geforce RTX 2070 SUPER (using nvidia-dkms)
  • Plasma 5.27.10 on X11

I’m pretty sure that it’s not hardware related, because I’ve booted up a Debian 12 live image where everything ran for several hours without a crash. But it seems to be Arch related, as I also booted up a fresh EndeavourOS live image (so basically Arch), where applications also randomly segfaulted. Any idea why everything works fine on Debian but not on Arch? Debian uses the 6.1 kernel, which I already tried, so that’s not it.

Let me know if you need any more information that might help solve this issue. Thanks!

Edit [solved]: It looks like disabling PBO in the UEFI/BIOS did the trick. The strange thing is that after enabling it again, the crashes still haven’t come back. Someone suspected that the motherboard’s default/training settings were faulty, so I guess this was a very rare case. That’s probably why it took so long to find a solution. Thanks everyone for helping me out!

  • Possibly linux
    9 months ago

    Do you have an old computer to test the software on? You could pull out the drive and put it in a known-good system to test. I would also run a live system for a while to see if it has problems.

    This sounds like a hardware problem. Maybe the memory controller on your CPU is faulty? Try reseating the CPU and check for bent pins.

    If it’s a software problem, you could also go for the nuclear option and start over. It’s a pain, but it might be worth it. I don’t know about Arch, but with Debian you can reinstall all packages. Doing that should wipe out any corruption.

    • NoisyFlake@lemm.eeOP
      9 months ago

      Starting over probably won’t fix anything, since even the EndeavourOS live image has the segfaults. Of course I could just start over on Debian, but I really like Arch and would only switch as a last resort.

      I have another computer where I can test it, yes. It’s probably enough to run EndeavourOS live for a while, but then again, I’m 99% sure that no crashes are going to happen, otherwise the EndeavourOS forums would be flooded with this issue.

      • Possibly linux
        9 months ago

        If your live system is crashing, it is definitely a hardware problem. Can you post dmesg?

          • Possibly linux
            9 months ago

            There is one line that catches my attention.

            ccp 0000:2b:00.1: ccp: unable to access the device: you might be running a broken BIOS
            

            This theoretically shouldn’t cause crashes, but from my research it looks like AMD CCP can cause system instability in some cases. I would update your BIOS to the latest release, and if that doesn’t fix it, you should try disabling AMD CCP in the BIOS, as I doubt you need it anyway.
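
Spotting a line like that by eye in a long dmesg dump is easy to miss, so a small filter can help. This is just a sketch under assumptions: the keyword list is my own guess at phrases worth flagging (it is not exhaustive and not an official list), the function name `flag_suspect_lines` is made up, and the second sample line is only an illustrative harmless message.

```python
# Phrases (lowercase) that often mark firmware or hardware complaints in kernel logs.
# This list is an assumption for illustration; extend it as needed.
SUSPECT_PATTERNS = ("broken bios", "firmware bug", "hardware error", "machine check")


def flag_suspect_lines(dmesg_text):
    """Return the log lines that contain any of the suspect phrases."""
    hits = []
    for line in dmesg_text.splitlines():
        lowered = line.lower()
        if any(pattern in lowered for pattern in SUSPECT_PATTERNS):
            hits.append(line.strip())
    return hits


# The first line is the one quoted above; the second is a harmless filler line.
sample = """\
ccp 0000:2b:00.1: ccp: unable to access the device: you might be running a broken BIOS
usb 1-1: new high-speed USB device number 2 using xhci_hcd
"""

for hit in flag_suspect_lines(sample):
    print(hit)
```

A hit from a filter like this is only a lead to investigate, not proof of the cause.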