[PVE-User] Debian 11 hard lock issues as VM
FingerlessGloves
me at FingerlessGloves.me
Tue Jan 17 09:38:51 CET 2023
Hi Both,
This sounds very similar to my issue: with MariaDB and the guest agent
installed on Debian 11, the VM has a chance of hanging during a backup.
The only workaround I have found is to remove the guest agent.
I have an issue open for it here:
https://gitlab.com/qemu-project/qemu/-/issues/881
I haven't tested it recently with the latest Proxmox and Debian updates.
I may try it at some point, but I don't really like my data corrupting 😂
---
FingerlessGloves
On 2023-01-17 08:22 AM, Eneko Lacunza wrote:
> Hi Bryan,
>
> We started to upgrade our cluster from PVE 7.2 to 7.3 yesterday.
>
> I have enabled the agent in our only VM with Debian 11 running on a
> 7.3-4 node at the moment, and performed 5 full backups in a row; the VM
> continues working (no hang).
>
> You haven't provided details about your setup:
>
> - Server (especially CPU model). Debian could be suffering from weird
> BIOS clock issues.
>
> - Running kernel on PVE 7.3-4. Kernel 5.15.x has been quite bad for
> us; have you tried kernel 5.13 or 5.19?
>
> Cheers
>
> On 17/1/23 at 6:06, Bryan Fields wrote:
>> I am running Proxmox 7.3-4 with a VM that is now on Debian 11.
>>
>> I have ZFS local storage in each server in the cluster. Every 15
>> minutes the VM is replicated to the other server(s). I recently
>> upgraded a server from Debian 9 to Debian 11 and it started locking
>> up. The lockups show no pattern: they do not follow a fixed amount of
>> time or a fixed number of replications.
>>
>> Through some debugging I found that the QEMU guest agent was not
>> unfreezing the OS after the replication. My understanding is that the
>> thaw should happen in under 100 ms, and from what I could see it works
>> fine on all my other VMs running Ubuntu or RHEL.
>>
>> I compared the agent on the Debian 11 server with the one on the
>> Ubuntu servers: Debian has 5.2.0 vs. 6.2.0 on Ubuntu. I compiled the
>> agent from the 7.2.0 qemu sources (statically too, if anyone wants a
>> copy) and ran it in screen on the Debian 11 VM. This still locked up
>> hard after 2-4 hours.
>>
>> Debian is using the stock kernel:
>>> Linux eyes 5.10.0-20-amd64 #1 SMP Debian 5.10.158-2 (2022-12-13)
>>> x86_64 GNU/Linux
>>
>> I read some things online and thought it might be related to VirtIO,
>> and changed that to VirtIO single with no difference.
>>
>> I've reverted back to the old kernel and am going to let this run.
>> 4.9.0-19-amd64 #1 SMP Debian 4.9.320-2 (2022-06-30) x86_64 GNU/Linux
>>
>> Complicating this, the box is my Observium install and I don't have
>> another device watching it, so when it locks up, it takes my
>> monitoring offline :-D
>>
>> On the working Ubuntu boxes I'm running:
>>> 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64
>>> x86_64 x86_64 GNU/Linux
>>
>> Below is the log up to the point where it locks up; there's no more
>> output after the last line (I have verbose enabled).
>>
>>> 1673846104.535376: debug: received EOF
>>> 1673846104.635560: debug: received EOF
>>> 1673846104.735735: debug: received EOF
>>> 1673846104.835868: debug: received EOF
>>> 1673846104.936067: debug: read data, count: 104, data:
>>> {"execute":"guest-sync-delimited","arguments":{"id":371290701}}
>>> {"arguments":{},"execute":"guest-ping"}
>>>
>>> 1673846104.936136: debug: process_event: called
>>> 1673846104.936144: debug: processing command
>>> 1673846104.936216: debug: sending data, count: 23
>>> 1673846104.936257: debug: process_event: called
>>> 1673846104.936272: debug: processing command
>>> 1673846104.936350: debug: sending data, count: 15
>>> 1673846104.936833: debug: received EOF
>>> 1673846105.37003: debug: received EOF
>>> 1673846105.137190: debug: received EOF
>>> 1673846105.237344: debug: received EOF
>>> 1673846105.337525: debug: received EOF
>>> 1673846105.437693: debug: received EOF
>>> 1673846105.537907: debug: received EOF
>>> 1673846105.638096: debug: received EOF
>>> 1673846105.738307: debug: received EOF
>>> 1673846105.838495: debug: received EOF
>>> 1673846105.938652: debug: received EOF
>>> 1673846106.38813: debug: received EOF
>>> 1673846106.139011: debug: received EOF
>>> 1673846106.239210: debug: received EOF
>>> 1673846106.339403: debug: received EOF
>>> 1673846106.439583: debug: received EOF
>>> 1673846106.539782: debug: received EOF
>>> 1673846106.639990: debug: received EOF
>>> 1673846106.740190: debug: received EOF
>>> 1673846106.840388: debug: read data, count: 115, data:
>>> {"arguments":{"id":371290702},"execute":"guest-sync-delimited"}
>>> {"execute":"guest-fsfreeze-freeze","arguments":{}}
>>>
>>> 1673846106.840450: debug: process_event: called
>>> 1673846106.840465: debug: processing command
>>> 1673846106.840497: debug: sending data, count: 23
>>> 1673846106.840545: debug: process_event: called
>>> 1673846106.840563: debug: processing command
>>> 1673846106.841114: debug: disabling command: guest-get-time
>>> 1673846106.841131: debug: disabling command: guest-set-time
>>> 1673846106.841138: debug: disabling command: guest-shutdown
>>> 1673846106.841145: debug: disabling command: guest-file-open
>>> 1673846106.841151: debug: disabling command: guest-file-close
>>> 1673846106.841157: debug: disabling command: guest-file-read
>>> 1673846106.841164: debug: disabling command: guest-file-write
>>> 1673846106.841171: debug: disabling command: guest-file-seek
>>> 1673846106.841179: debug: disabling command: guest-file-flush
>>> 1673846106.841187: debug: disabling command: guest-fsfreeze-freeze
>>> 1673846106.841194: debug: disabling command: guest-fsfreeze-freeze-list
>>> 1673846106.841202: debug: disabling command: guest-fstrim
>>> 1673846106.841209: debug: disabling command: guest-suspend-disk
>>> 1673846106.841217: debug: disabling command: guest-suspend-ram
>>> 1673846106.841225: debug: disabling command: guest-suspend-hybrid
>>> 1673846106.841232: debug: disabling command: guest-network-get-interfaces
>>> 1673846106.841239: debug: disabling command: guest-get-vcpus
>>> 1673846106.841245: debug: disabling command: guest-set-vcpus
>>> 1673846106.841251: debug: disabling command: guest-get-disks
>>> 1673846106.841257: debug: disabling command: guest-get-fsinfo
>>> 1673846106.841265: debug: disabling command: guest-set-user-password
>>> 1673846106.841272: debug: disabling command: guest-get-memory-blocks
>>> 1673846106.841278: debug: disabling command: guest-set-memory-blocks
>>> 1673846106.841286: debug: disabling command: guest-get-memory-block-info
>>> 1673846106.841294: debug: disabling command: guest-exec-status
>>> 1673846106.841303: debug: disabling command: guest-exec
>>> 1673846106.841311: debug: disabling command: guest-get-host-name
>>> 1673846106.841319: debug: disabling command: guest-get-users
>>> 1673846106.841326: debug: disabling command: guest-get-timezone
>>> 1673846106.841334: debug: disabling command: guest-get-osinfo
>>> 1673846106.841343: debug: disabling command: guest-get-devices
>>> 1673846106.841350: debug: disabling command: guest-ssh-get-authorized-keys
>>> 1673846106.841356: debug: disabling command: guest-ssh-add-authorized-keys
>>> 1673846106.841363: debug: disabling command: guest-ssh-remove-authorized-keys
>>> 1673846106.841371: warning: disabling logging due to filesystem freeze
>>
>>
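For anyone comparing their own agent logs against the one above, here is a rough Python sketch that finds the last freeze notice and counts how many timestamped lines follow it. It is only an illustration: the function names are made up, and it assumes the fraction after the dot is an unpadded microsecond count (so "1673846105.37003" is read as 1673846105.037003), which is my guess from the roughly 100 ms spacing of the surrounding lines.

```python
import re
from typing import Iterable, Optional, Tuple

# "<seconds>.<microseconds>: <level>: <message>", as in the log quoted above.
LINE_RE = re.compile(r"^(\d+)\.(\d{1,6}): (\w+): (.*)$")


def parse_line(line: str) -> Optional[Tuple[int, str, str]]:
    """Return (timestamp in microseconds, level, message) or None."""
    # Drop mail quote markers (">>> ") and trailing whitespace.
    match = LINE_RE.match(line.lstrip("> ").rstrip())
    if match is None:
        return None  # wrapped continuation line or JSON payload
    sec, usec, level, message = match.groups()
    # Assumption: the fraction is an unpadded microsecond count.
    return int(sec) * 1_000_000 + int(usec), level, message


def last_freeze(lines: Iterable[str]) -> Optional[Tuple[int, int]]:
    """Return (time of last freeze notice, timestamped lines after it)."""
    found = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, _level, message = parsed
        if message.startswith("disabling logging due to filesystem"):
            found = [stamp, 0]
        elif found is not None:
            found[1] += 1
    return None if found is None else (found[0], found[1])
```

If the second value returned by last_freeze is 0, nothing was logged after the freeze notice, which matches what the log above shows.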
>> Is there any known reason this is happening, and any way around it
>> other than disabling the agent? I can't imagine that Debian 11 is
>> shipping with a broken kernel, and 'qm guest cmd 152 fsfreeze-freeze'
>> and 'qm guest cmd 152 fsfreeze-thaw' work fine from the host. Could
>> this be something with the VirtIO pipe/IPC?
>>
>> Anyone else seeing this or have any ideas?
>>
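Since the manual 'qm guest cmd 152 ...' calls quoted above work from the host, one thing that could be tried is a small host-side check that asks the agent for its freeze status and thaws if it reports frozen. The Python sketch below is untested against a real hang and the helper names are hypothetical. It assumes 'qm guest cmd <vmid> fsfreeze-status' prints "frozen" or "thawed" (possibly JSON-quoted), and it will not help if the guest is locked up so hard that the agent no longer answers.

```python
import subprocess
from typing import Callable, List


def qm_guest_cmd(vmid: int, command: str) -> List[str]:
    """Build the host-side command line, e.g. qm guest cmd 152 fsfreeze-thaw."""
    return ["qm", "guest", "cmd", str(vmid), command]


def run_qm(argv: List[str]) -> str:
    """Run a command on the PVE host and return its stdout."""
    done = subprocess.run(
        argv, capture_output=True, text=True, timeout=30, check=True
    )
    return done.stdout


def thaw_if_frozen(vmid: int, run: Callable[[List[str]], str] = run_qm) -> bool:
    """Thaw the guest filesystems if the agent reports them frozen.

    Returns True if a thaw was issued, False otherwise.
    """
    # Assumption: the status is printed as frozen/thawed, maybe in quotes.
    status = run(qm_guest_cmd(vmid, "fsfreeze-status")).strip().strip('"')
    if status != "frozen":
        return False
    run(qm_guest_cmd(vmid, "fsfreeze-thaw"))
    return True
```

The run parameter exists only so the logic can be exercised without a Proxmox host; on a real node, thaw_if_frozen(152) would call qm directly, and an unresponsive agent would surface as an exception from subprocess.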
>
> Eneko Lacunza
> Technical Director
> Binovo IT Human Project
>
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/