I’ve been using a Proxmox home server for quite some time now without many problems.
Recently I got an AMD Navi 10 RX 5700 XT and tried to pass it through to a Windows VM.
I mainly followed the official Proxmox guide but got it running by using some other tutorials too.
For now, it works once after I reboot the host. Starting the VM is then no problem, but after a restart the VM no longer starts and shows this error:
swtpm_setup: Not overwriting existing state file.
kvm: ../hw/pci/pci.c:1637: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
stopping swtpm instance (pid 98348) due to QEMU startup error
TASK ERROR: start failed: QEMU exited with code -1
I tried fixing it using this, but it didn’t change much.
EDIT: link was not shown
Formatted with a code block so it’s more readable:
Dec 19 16:40:45 pve pvedaemon[1590]: end task UPID:pve:00030675:000E7952:6581B96F:vncshell::root@pam: OK
Dec 19 16:40:47 pve kernel: vfio-pci 0000:03:00.0: not ready 16383ms after bus reset; waiting
Dec 19 16:41:03 pve pvedaemon[1590]: starting task UPID:pve:000308EE:000E85EB:6581B98F:qmstart:195:root@pam:
Dec 19 16:41:03 pve pvedaemon[198894]: start VM 195: UPID:pve:000308EE:000E85EB:6581B98F:qmstart:195:root@pam:
Dec 19 16:41:06 pve kernel: vfio-pci 0000:03:00.0: not ready 32767ms after bus reset; waiting
Dec 19 16:41:40 pve kernel: vfio-pci 0000:03:00.0: not ready 65535ms after bus reset; giving up
Dec 19 16:41:41 pve kernel: vfio-pci 0000:03:00.1: Unable to change power state from D0 to D3hot, device inaccessible
Dec 19 16:41:41 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D0 to D3hot, device inaccessible
Dec 19 16:41:41 pve systemd[1]: 195.scope: Deactivated successfully.
Dec 19 16:41:41 pve systemd[1]: 195.scope: Consumed 54min 2.778s CPU time.
Dec 19 16:41:41 pve systemd[1]: Started 195.scope.
Dec 19 16:41:41 pve kernel: tap195i0: entered promiscuous mode
Dec 19 16:41:41 pve kernel: vmbr0: port 4(fwpr195p0) entered blocking state
Dec 19 16:41:41 pve kernel: vmbr0: port 4(fwpr195p0) entered disabled state
Dec 19 16:41:41 pve kernel: fwpr195p0: entered allmulticast mode
Dec 19 16:41:41 pve kernel: fwpr195p0: entered promiscuous mode
Dec 19 16:41:41 pve kernel: vmbr0: port 4(fwpr195p0) entered blocking state
Dec 19 16:41:41 pve kernel: vmbr0: port 4(fwpr195p0) entered forwarding state
Dec 19 16:41:41 pve kernel: fwbr195i0: port 1(fwln195i0) entered blocking state
Dec 19 16:41:41 pve kernel: fwbr195i0: port 1(fwln195i0) entered disabled state
Dec 19 16:41:41 pve kernel: fwln195i0: entered allmulticast mode
Dec 19 16:41:41 pve kernel: fwln195i0: entered promiscuous mode
Dec 19 16:41:41 pve kernel: fwbr195i0: port 1(fwln195i0) entered blocking state
Dec 19 16:41:41 pve kernel: fwbr195i0: port 1(fwln195i0) entered forwarding state
Dec 19 16:41:41 pve kernel: fwbr195i0: port 2(tap195i0) entered blocking state
Dec 19 16:41:41 pve kernel: fwbr195i0: port 2(tap195i0) entered disabled state
Dec 19 16:41:41 pve kernel: tap195i0: entered allmulticast mode
Dec 19 16:41:41 pve kernel: fwbr195i0: port 2(tap195i0) entered blocking state
Dec 19 16:41:41 pve kernel: fwbr195i0: port 2(tap195i0) entered forwarding state
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.1: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:43 pve kernel: vfio-pci 0000:03:00.1: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:41:44 pve kernel: pcieport 0000:02:00.0: broken device, retraining non-functional downstream link at 2.5GT/s
Dec 19 16:41:44 pve pvedaemon[1592]: VM 195 qmp command failed - VM 195 not running
Dec 19 16:41:45 pve kernel: pcieport 0000:02:00.0: retraining failed
Dec 19 16:41:46 pve kernel: pcieport 0000:02:00.0: broken device, retraining non-functional downstream link at 2.5GT/s
Dec 19 16:41:47 pve kernel: pcieport 0000:02:00.0: retraining failed
Dec 19 16:41:47 pve kernel: vfio-pci 0000:03:00.0: not ready 1023ms after bus reset; waiting
Dec 19 16:41:48 pve kernel: vfio-pci 0000:03:00.0: not ready 2047ms after bus reset; waiting
Dec 19 16:41:50 pve kernel: vfio-pci 0000:03:00.0: not ready 4095ms after bus reset; waiting
Dec 19 16:41:54 pve kernel: vfio-pci 0000:03:00.0: not ready 8191ms after bus reset; waiting
Dec 19 16:42:03 pve kernel: vfio-pci 0000:03:00.0: not ready 16383ms after bus reset; waiting
Dec 19 16:42:21 pve kernel: vfio-pci 0000:03:00.0: not ready 32767ms after bus reset; waiting
Dec 19 16:42:56 pve kernel: vfio-pci 0000:03:00.0: not ready 65535ms after bus reset; giving up
Dec 19 16:42:56 pve kernel: vfio-pci 0000:03:00.1: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:42:56 pve kernel: vfio-pci 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
Dec 19 16:42:56 pve kernel: fwbr195i0: port 2(tap195i0) entered disabled state
Dec 19 16:42:56 pve kernel: tap195i0 (unregistering): left allmulticast mode
Dec 19 16:42:56 pve kernel: fwbr195i0: port 2(tap195i0) entered disabled state
Dec 19 16:42:56 pve pvedaemon[199553]: stopping swtpm instance (pid 199561) due to QEMU startup error
Dec 19 16:42:56 pve pvedaemon[198894]: start failed: QEMU exited with code 1
Dec 19 16:42:56 pve pvedaemon[1590]: end task UPID:pve:000308EE:000E85EB:6581B98F:qmstart:195:root@pam: start failed: QEMU exit>
Dec 19 16:42:56 pve systemd[1]: 195.scope: Deactivated successfully.
Dec 19 16:42:56 pve systemd[1]: 195.scope: Consumed 1.736s CPU time.
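If journal output ever gets pasted as one run of text without newlines, it can be split back into entries by cutting at each syslog-style timestamp. The following is only a rough sketch: the function names `split_journal` and `reset_waits` are made up for this example, and the timestamp pattern assumes the "Dec 19 16:41:43" format seen in this log.

```python
import re

# Zero-width match in front of a timestamp such as "Dec 19 16:41:43 ".
# Assumption: every entry starts with this syslog-style timestamp.
ENTRY_START = re.compile(r"(?=[A-Z][a-z]{2} \d{1,2} \d{2}:\d{2}:\d{2} )")


def split_journal(blob: str) -> list[str]:
    """Split flattened journal text into one entry per list item."""
    text = " ".join(blob.split())  # collapse stray newlines and repeated spaces
    pieces = (p.strip() for p in ENTRY_START.split(text))
    # Drop anything that does not begin with a timestamp (e.g. paste debris).
    return [p for p in pieces if ENTRY_START.match(p)]


def reset_waits(entries: list[str]) -> list[tuple[str, int]]:
    """Return (PCI address, milliseconds) for each 'not ready ... after bus reset' entry."""
    pattern = re.compile(r"vfio-pci (\S+): not ready (\d+)ms after bus reset")
    return [(m.group(1), int(m.group(2))) for e in entries if (m := pattern.search(e))]


# Short excerpt taken from the log in this thread, flattened onto one line.
sample = (
    "Dec 19 16:41:47 pve kernel: vfio-pci 0000:03:00.0: not ready 1023ms after bus reset; waiting "
    "Dec 19 16:41:48 pve kernel: vfio-pci 0000:03:00.0: not ready 2047ms after bus reset; waiting "
    "Dec 19 16:42:56 pve pvedaemon[198894]: start failed: QEMU exited with code 1"
)

for entry in split_journal(sample):
    print(entry)
print(reset_waits(split_journal(sample)))
```

Filtering for the "not ready" entries makes it easier to see that the kernel keeps roughly doubling its wait after the bus reset until it gives up at 65535ms.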
It does seem a lot like the reset bug, but then you already tried that. :/ Kernel modules aren’t that easy to install, and if you’re missing the required flags it might just do nothing.
Should show the 6 flags =y
Or maybe some variation of manual reset…
https://forum.proxmox.com/threads/issues-with-intel-arc-a770m-gpu-passthrough-on-nuc12snki72-vfio-pci-not-ready-after-flr-or-bus-reset.130667/
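One common variation of a manual reset is to remove the device from the PCI bus and trigger a rescan through sysfs. This is a minimal sketch, not something confirmed to fix this card: `remove_and_rescan` is a made-up helper name, the `sysfs_root` and `settle_seconds` parameters exist only for this example, and the addresses are the ones from the log above. It must run as root on the host while the VM is stopped.

```python
import time
from pathlib import Path


def remove_and_rescan(address: str, sysfs_root: str = "/sys", settle_seconds: float = 1.0) -> None:
    """Detach one PCI function via sysfs, then ask the kernel to rescan the bus.

    `address` is a full PCI address such as "0000:03:00.0".
    `sysfs_root` is a parameter of this sketch only, so it can be tried on a fake tree.
    """
    pci = Path(sysfs_root) / "bus" / "pci"
    remove_node = pci / "devices" / address / "remove"
    if not remove_node.exists():
        raise FileNotFoundError(f"no PCI device at {address} under {pci}")
    remove_node.write_text("1")        # same effect as: echo 1 > .../remove
    time.sleep(settle_seconds)         # give the kernel a moment before rescanning
    (pci / "rescan").write_text("1")   # same effect as: echo 1 > /sys/bus/pci/rescan


# Example call (commented out on purpose; needs root and a stopped VM):
# for function in ("0000:03:00.1", "0000:03:00.0"):
#     remove_and_rescan(function)
```

Whether the card comes back after the rescan depends on the hardware. If the device is already reported as inaccessible, as in the log above, a full host reboot may still be the only thing that helps.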
Just fyi, the 6 y-flags were shown
It was intended to be a code block, but that way it was just a bunch of text without newlines somehow.