Re: [PVE-User] Dell R350, Proxmox VE 8.2.2, sas-megaraid error and system hang
Alwin Antreich
alwin at antreich.com
Sat Aug 10 09:22:11 CEST 2024
On August 9, 2024 1:30:22 PM GMT+02:00, Andrea Casati <casati at kona.it> wrote:
>Hello
>
>Dell R350 with PERC H755.
>Tried with kernel 6.8.4, 6.8.8 and 6.5.13.
>System hangs (need to physically power off/on the machine) every day during the compressed backup, and sometimes during normal VM usage.
>
>Log with kernel 6.8.4:
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Adapter is OPERATIONAL for scsi:0
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Snap dump wait time : 15
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Reset successful for scsi0.
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: 3296 (774378251s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: 3300 (boot + 5s/0x0020/CRIT) - Controller encountered an error and was reset
>
>Errors on console with kernel 6.5.13:
>kvm_intel: kvm [2225]: vcpu0, guest rIP: 0xfffff80277d68f93 Unhandled WRMSR(0x1d9) = 0x1
>megaraid_sas 0000:01:00.0: FW in FAULT state Fault code:0x10000 subcode:0x0 func:megasas_wait_for_outstanding_fusion
>
>
>iDRAC reports no errors, and Dell support reports no problems.
>
>Has anyone seen something like this before?
Hi Andrea,

I've seen similar issues with other controllers when a faulty disk was present.
Also, do you have the latest firmware on the controller?

Cheers,
Alwin
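
PS: To see whether the controller faults line up with the backup window (or with one particular disk acting up), it may help to pull just the megaraid_sas fault lines out of the kernel log together with their timestamps. Below is a minimal Python sketch of that idea. The function name and the marker list are my own assumptions, taken only from the messages quoted above, so treat it as a starting point rather than a complete catalogue of driver messages.

```python
import re

# Substrings taken from the log lines quoted in this thread. This list is an
# assumption, not an exhaustive set of megaraid_sas fault messages.
FAULT_MARKERS = (
    "Fatal firmware error",
    "FW in FAULT state",
    "Controller encountered an error and was reset",
)

# Matches a syslog-style prefix such as "Jul 15 19:04:45 r350ve kernel: ".
SYSLOG_PREFIX = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
)


def find_controller_faults(lines):
    """Return (timestamp, message) pairs for megaraid_sas fault lines.

    The timestamp is None when a line has no syslog prefix, as is the case
    for messages copied from the console.
    """
    faults = []
    for raw in lines:
        line = raw.strip()
        if "megaraid_sas" not in line:
            continue
        if not any(marker in line for marker in FAULT_MARKERS):
            continue
        match = SYSLOG_PREFIX.match(line)
        if match:
            faults.append((match.group("ts"), line[match.end():]))
        else:
            faults.append((None, line))
    return faults
```

You could feed it the lines of a saved kernel log (for example the output of `journalctl -k` written to a file, assuming the journal is kept across reboots) and compare the timestamps with the start times of the backup job.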