[PVE-User] VMs hung after live migration - Intel CPU

Jan Vlach janus at volny.cz
Mon Nov 7 22:59:01 CET 2022


Hi,

For what it’s worth, live migration of Linux VMs running various Debian versions works just fine here. I’m using virtio for networking and virtio scsi for disks. (The only version where I had problems was Debian 6, where the kernel does not support virtio scsi and megaraid sas 8708EM2 needs to be used. I get a kernel panic in mpt_sas on thaw after migration.)

We're running 5.15.60-1-pve on a three-node cluster with AMD EPYC 7551P 32-Core Processors. These are Supermicros with the latest BIOS (latest microcode?) and BMC.

Storage is a local ZFS pool backed by SSDs in striped mirrors (4 devices on each node). Migration has a dedicated 2x 10GigE LACP link and a dedicated VLAN on the switch stack.

I have more nodes with EPYC3/Milan on the way, so I’ll test those later as well.

What does your cluster look like hardware-wise? What problems did you experience with VM migration on 5.13->5.19?

Thanks,
JV


> On 7. 11. 2022, at 14:40, Eneko Lacunza via pve-user <pve-user at lists.proxmox.com> wrote:
> 
> 
> From: Eneko Lacunza <elacunza at binovo.es>
> Subject: Re: [PVE-User] VMs hung after live migration - Intel CPU
> Date: 7 November 2022 14:40:07 CET
> To: Mark Schouten <mark at tuxis.nl>, Proxmox VE user list <pve-user at lists.proxmox.com>
> 
> 
> Hi,
> 
> Sadly I'm not sure what is best. For most of the clusters we admin, I have decided to stay on 5.13 (pinning that version with proxmox-boot-tool) because 5.19 seems likely to receive many more changes and to be more unstable...
> 
> Cheers
> 
> On 7/11/22 at 13:56, Mark Schouten wrote:
>> Hi,
>> 
>> 
>> Thanks. What would you suggest? Downgrading to 5.13 ?
>> 
>> -- 
>> Mark Schouten
>> CTO, Tuxis B.V. | https://www.tuxis.nl/
>> <mark at tuxis.nl> | +31 318 200208
>> 
>> 
>> *From: * Eneko Lacunza <elacunza at binovo.es>
>> *To: * Mark Schouten <mark at tuxis.nl>, Proxmox VE user list <pve-user at lists.proxmox.com>
>> *Sent: * 2022-11-07 9:23
>> *Subject: * Re: [PVE-User] VMs hung after live migration - Intel CPU
>> 
>>    Hi,
>> 
>>    5.15 has been a disaster for us, issues seem to have no end.
>>    Frankly, I don't understand how it can be the officially supported
>>    kernel in PVE 7.2 right now.
>> 
>>    Our tests with 5.19 on a pair of nodes (in another cluster) seem
>>    good, but I don't think 5.13 -> 5.19 migration is working well
>>    either. Since neither kernel is the "official" one, I'm unable to
>>    decide what to do with our clusters...
>> 
>>    This has been ongoing for some months... :-(
>> 
>>    I see 5.15.64 was promoted to the enterprise repo this weekend,
>>    no idea if any attempt to fix live migration issues is included...
>> 
>>    Thanks
>> 
>>    On 6/11/22 at 9:04, Mark Schouten wrote:
>>>    Hi,
>>> 
>>>    I’ve seen the same behavior between two AMD CPUs with the -60 kernel. One of the VMs that ‘crashed’ even started working again after migrating back.
>>> 
>>>    I’m probably going to 5.19; I’ve heard of other issues with 5.15 as well (CephFS client issues).
>>> 
>>>    Mark Schouten
>>> 
>>>>    On 3 Nov 2022 at 17:55, Eneko Lacunza via pve-user <pve-user at lists.proxmox.com> wrote:
>>>> 
>>>>    
>>>> 
>>>>>    _______________________________________________
>>>>>    pve-user mailing list
>>>>>    pve-user at lists.proxmox.com
>>>>>    https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>> 
>>    Eneko Lacunza
>>    Zuzendari teknikoa | Director técnico
>>    Binovo IT Human Project
>> 
>>    Tel. +34 943 569 206 | https://www.binovo.es
>>    Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>> 
>>    https://www.youtube.com/user/CANALBINOVO
>>    https://www.linkedin.com/company/37269706/
>> 
> 
> 
