[PVE-User] Debian 11 hard lock issues as VM

FingerlessGloves me at FingerlessGloves.me
Tue Jan 17 09:38:51 CET 2023


Hi Both,

This sounds very similar to an issue I have: on a Debian 11 VM with 
MariaDB and the guest agent installed, the VM has a chance of hanging 
during a backup. The only workaround I have found is to remove the 
guest agent.

I have an issue open for it here: 
https://gitlab.com/qemu-project/qemu/-/issues/881

I haven't tested it recently with the latest Proxmox and Debian updates. 
I may try it at some point, but I don't really like my data getting 
corrupted 😂
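
For anyone who wants to try reproducing this without waiting for a 
backup, here is a rough Python sketch that drives freeze/thaw cycles 
from the PVE host using the same 'qm guest cmd' calls Bryan mentions 
below. It is only an illustration: the helper names (qm_guest_cmd, 
freeze_thaw_cycles), the 30 second timeout and the VMID are 
placeholders of mine, and it has to run on a host where the qm command 
is available.

```python
import subprocess
import time


def qm_guest_cmd(vmid, command, timeout=30):
    """Run `qm guest cmd <vmid> <command>` on the PVE host and return stdout."""
    result = subprocess.run(
        ["qm", "guest", "cmd", str(vmid), command],
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed value; pick whatever suits your setup
        check=True,
    )
    return result.stdout.strip()


def freeze_thaw_cycles(vmid, cycles, run=qm_guest_cmd, pause=1.0):
    """Freeze and thaw the guest up to `cycles` times.

    Returns the number of cycles that completed cleanly and stops at the
    first cycle in which a command fails or times out.
    """
    completed = 0
    for _ in range(cycles):
        try:
            run(vmid, "fsfreeze-freeze")
            run(vmid, "fsfreeze-thaw")
            run(vmid, "ping")  # check that the agent still answers
        except (subprocess.SubprocessError, OSError) as exc:
            print(f"cycle {completed + 1} failed: {exc}")
            break
        completed += 1
        time.sleep(pause)
    return completed
```

Calling freeze_thaw_cycles(152, 50) on the host would attempt 50 cycles 
against VM 152. If a cycle fails, the guest may still be frozen, so run 
'qm guest cmd <vmid> fsfreeze-thaw' by hand before doing anything else.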

---
FingerlessGloves

On 2023-01-17 08:22 AM, Eneko Lacunza wrote:
> Hi Bryan,
> 
> We started to upgrade our cluster from PVE 7.2 to 7.3 yesterday.
> 
> I have enabled the agent in our only VM with Debian 11 running on a 
> 7.3-4 node at the moment, and performed 5 full backups in a row. The 
> VM continues working (no hang).
> 
> You haven't provided details about your setup:
> 
> - Server (especially CPU model). Debian could be suffering from weird 
> BIOS clock issues.
> 
> - Running kernel on PVE 7.3-4. Kernel 5.15.x has been quite bad for 
> us, have you tried kernel 5.13 or 5.19?
> 
> Cheers
> 
> On 17/1/23 at 6:06, Bryan Fields wrote:
>> I am running proxmox 7.3-4 with a now Debian 11 VM.
>> 
>> I have ZFS local storage in each server in the cluster.   Every 15 
>> minutes the VM is replicated to the other server(s).  Recently I've 
>> upgraded a server from Debian 9 to Debian 11 and it started locking 
>> up.  The lockups did not follow a pattern: neither the elapsed time 
>> nor the number of replications before a hang was consistent.
>> 
>> Through some debugging I found this was the qemu-agent not unfreezing 
>> the OS after the replication.  My understanding is that this should 
>> happen in under 100 ms, and from what I could see it worked fine on 
>> all my other VMs with Ubuntu or RHEL.
>> 
>> I compared the agent on the Debian 11 server with the one on the 
>> Ubuntu servers: Debian had 5.2.0 and Ubuntu had 6.2.0.  I compiled the 
>> agent from the 7.2.0 qemu sources (statically too, if anyone wants a 
>> copy) and ran it from screen on a terminal on the Debian 11 VM. This 
>> still locked up hard after 2-4 hours.
>> 
>> Debian is using the stock kernel:
>>> Linux eyes 5.10.0-20-amd64 #1 SMP Debian 5.10.158-2 (2022-12-13) 
>>> x86_64 GNU/Linux
>> 
>> I read some things online and thought it might be related to VirtIO, 
>> and changed that to VirtIO single with no difference.
>> 
>> I've reverted back to the old kernel and am going to let this run.
>> 4.9.0-19-amd64 #1 SMP Debian 4.9.320-2 (2022-06-30) x86_64 GNU/Linux
>> 
>> Complicating this, the box is my observium install and I don't have 
>> another device watching it, so when it locks up, it takes my 
>> monitoring offline :-D
>> 
>> On the working Ubuntu boxes I'm running:
>>> 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64 
>>> x86_64 x86_64 GNU/Linux
>> 
>> Below is the log from when this locks up. There is no more output 
>> after the last line (I have verbose enabled).
>> 
>>> 1673846104.535376: debug: received EOF
>>> 1673846104.635560: debug: received EOF
>>> 1673846104.735735: debug: received EOF
>>> 1673846104.835868: debug: received EOF
>>> 1673846104.936067: debug: read data, count: 104, data: 
>>> {"execute":"guest-sync-delimited","arguments":{"id":371290701}}
>>> {"arguments":{},"execute":"guest-ping"}
>>> 
>>> 1673846104.936136: debug: process_event: called
>>> 1673846104.936144: debug: processing command
>>> 1673846104.936216: debug: sending data, count: 23
>>> 1673846104.936257: debug: process_event: called
>>> 1673846104.936272: debug: processing command
>>> 1673846104.936350: debug: sending data, count: 15
>>> 1673846104.936833: debug: received EOF
>>> 1673846105.37003: debug: received EOF
>>> 1673846105.137190: debug: received EOF
>>> 1673846105.237344: debug: received EOF
>>> 1673846105.337525: debug: received EOF
>>> 1673846105.437693: debug: received EOF
>>> 1673846105.537907: debug: received EOF
>>> 1673846105.638096: debug: received EOF
>>> 1673846105.738307: debug: received EOF
>>> 1673846105.838495: debug: received EOF
>>> 1673846105.938652: debug: received EOF
>>> 1673846106.38813: debug: received EOF
>>> 1673846106.139011: debug: received EOF
>>> 1673846106.239210: debug: received EOF
>>> 1673846106.339403: debug: received EOF
>>> 1673846106.439583: debug: received EOF
>>> 1673846106.539782: debug: received EOF
>>> 1673846106.639990: debug: received EOF
>>> 1673846106.740190: debug: received EOF
>>> 1673846106.840388: debug: read data, count: 115, data: 
>>> {"arguments":{"id":371290702},"execute":"guest-sync-delimited"}
>>> {"execute":"guest-fsfreeze-freeze","arguments":{}}
>>> 
>>> 1673846106.840450: debug: process_event: called
>>> 1673846106.840465: debug: processing command
>>> 1673846106.840497: debug: sending data, count: 23
>>> 1673846106.840545: debug: process_event: called
>>> 1673846106.840563: debug: processing command
>>> 1673846106.841114: debug: disabling command: guest-get-time
>>> 1673846106.841131: debug: disabling command: guest-set-time
>>> 1673846106.841138: debug: disabling command: guest-shutdown
>>> 1673846106.841145: debug: disabling command: guest-file-open
>>> 1673846106.841151: debug: disabling command: guest-file-close
>>> 1673846106.841157: debug: disabling command: guest-file-read
>>> 1673846106.841164: debug: disabling command: guest-file-write
>>> 1673846106.841171: debug: disabling command: guest-file-seek
>>> 1673846106.841179: debug: disabling command: guest-file-flush
>>> 1673846106.841187: debug: disabling command: guest-fsfreeze-freeze
>>> 1673846106.841194: debug: disabling command: 
>>> guest-fsfreeze-freeze-list
>>> 1673846106.841202: debug: disabling command: guest-fstrim
>>> 1673846106.841209: debug: disabling command: guest-suspend-disk
>>> 1673846106.841217: debug: disabling command: guest-suspend-ram
>>> 1673846106.841225: debug: disabling command: guest-suspend-hybrid
>>> 1673846106.841232: debug: disabling command: 
>>> guest-network-get-interfaces
>>> 1673846106.841239: debug: disabling command: guest-get-vcpus
>>> 1673846106.841245: debug: disabling command: guest-set-vcpus
>>> 1673846106.841251: debug: disabling command: guest-get-disks
>>> 1673846106.841257: debug: disabling command: guest-get-fsinfo
>>> 1673846106.841265: debug: disabling command: guest-set-user-password
>>> 1673846106.841272: debug: disabling command: guest-get-memory-blocks
>>> 1673846106.841278: debug: disabling command: guest-set-memory-blocks
>>> 1673846106.841286: debug: disabling command: 
>>> guest-get-memory-block-info
>>> 1673846106.841294: debug: disabling command: guest-exec-status
>>> 1673846106.841303: debug: disabling command: guest-exec
>>> 1673846106.841311: debug: disabling command: guest-get-host-name
>>> 1673846106.841319: debug: disabling command: guest-get-users
>>> 1673846106.841326: debug: disabling command: guest-get-timezone
>>> 1673846106.841334: debug: disabling command: guest-get-osinfo
>>> 1673846106.841343: debug: disabling command: guest-get-devices
>>> 1673846106.841350: debug: disabling command: 
>>> guest-ssh-get-authorized-keys
>>> 1673846106.841356: debug: disabling command: 
>>> guest-ssh-add-authorized-keys
>>> 1673846106.841363: debug: disabling command: 
>>> guest-ssh-remove-authorized-keys
>>> 1673846106.841371: warning: disabling logging due to filesystem 
>>> freeze
>> 
>> 
>> Other than disabling the agent, is there any reason this is happening?  
>> I can't think that Debian 11 is shipping with a broken kernel, but 
>> 'qm guest cmd 152 fsfreeze-freeze' and 'qm guest cmd 152 
>> fsfreeze-thaw' work fine from the host. Could this be something with 
>> the VirtIO pipe/IPC?
>> 
>> Anyone else seeing this or have any ideas?
>> 
> 
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
> 
> Tel. +34 943 569 206 |https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
> 
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
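
One note on reading the quoted agent log: the fractional part of each 
timestamp looks like a microsecond count printed without zero padding, 
so 1673846105.37003 sits between 1673846104.936833 and 
1673846105.137190. That is an inference from the roughly 100 ms spacing 
of the "received EOF" lines, not something I have checked in the agent 
source. Under that assumption, a small Python sketch (the function names 
are mine) to turn the log lines into gaps between entries:

```python
import re

# Matches the "SECONDS.FRACTION:" prefix, skipping any leading "> " quote markers.
_STAMP = re.compile(r"^[>\s]*(\d+)\.(\d+):")


def to_micros(line):
    """Return the line's timestamp in microseconds, or None if it has none.

    Assumes the fraction is an unpadded microsecond count, so ".37003"
    means 37003 microseconds rather than 0.37003 seconds.
    """
    match = _STAMP.match(line)
    if match is None:
        return None
    return int(match.group(1)) * 1_000_000 + int(match.group(2))


def gaps_in_micros(lines):
    """Return the gaps between consecutive timestamped lines, in microseconds."""
    stamps = [t for t in map(to_micros, lines) if t is not None]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


sample = [
    ">>> 1673846104.936833: debug: received EOF",
    ">>> 1673846105.37003: debug: received EOF",
    ">>> 1673846105.137190: debug: received EOF",
]
print(gaps_in_micros(sample))  # [100170, 100187]
```

Lines without a timestamp prefix, such as the wrapped JSON and the 
wrapped command names, are skipped.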


