I run an old desktop mainboard as my homelab server. It runs Ubuntu smoothly at loads between 0.2 and 3 (whatever unit that is).

Problem:
Occasionally, the CPU load skyrockets above 400 (yes really), making the machine totally unresponsive. The only solution is the reset button.

Solution:

  • I haven’t found what the cause might be, but I think that a reboot every few days would prevent it from ever happening. That could be done easily with a crontab line.
  • alternatively, I would like to have some dead-simple script running in the background that simply looks at the CPU load and executes a reboot when the load climbs over a given threshold.

–> How could such a cpu-load-triggered reboot be implemented?


edit: I asked ChatGPT to help me create a script that is started by crontab every X minutes. The script has a kill-threshold that does a kill-9 on the top process, and a higher reboot-threshold that … reboots the machine. before doing either, or none of these, it will write a log line. I hope this will keep my system running, and I will review the log file to see how it fares. Or, it might inexplicable break my system. Fun!

  • Possibly linux
    link
    fedilink
    English
    arrow-up
    12
    ·
    8 months ago

    Here’s a better suggestion. Why don’t you see if you can find out what’s causing the issue? It sounds a like a problem occurring in userspace. Try running htop

    • PlutoniumAcid@lemmy.worldOP
      link
      fedilink
      English
      arrow-up
      1
      ·
      8 months ago

      You know you are right, and I’ve tried. I can manually monitor but it doesn’t happen just then. I don’t know yet what causes it, I can only assume it’s one of the Docker containers because the machine is doing nothing else.

      I am doing this to find out how often it happens, how quickly it happens, and what’s at the top when it happens.