[PVE-User] VMs With Multiple Interfaces Rebooting

JR Richardson jmr.richardson at gmail.com
Fri Nov 22 07:16:53 CET 2024


Hey Folks,

Just wanted to share an experience I recently had. Cluster parameters:
7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage.
Server Specs:
CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets)
Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z)
Manager Version pve-manager/8.2.4/faa83925c9641325

This has been a super stable environment for many years through software
and hardware upgrades, with few issues to speak of. Then, without warning,
one of my hypervisors in the 3-node group crashed with a memory DIMM error.
Cluster HA took over and restarted the VMs on the other two nodes in the group
as expected. The problem quickly materialized as the VMs started
rebooting repeatedly, with a lot of network issues and notices of pending
migration. I could not pin down exactly what the root cause was. Notably,
these particular VMs all have multiple network interfaces. After
several hours of not being able to get the current VMs stable, I tried
spinning up new VMs, to no avail; the reboots persisted on the new VMs.
This seemed to only affect the VMs that were on the hypervisor that
failed; all other VMs across the cluster were fine.

I have not installed any third-party monitoring software. I found a few
posts in the forum about it, but that was not my issue.

In an act of desperation, I performed a dist-upgrade, and this solved
the issue straight away. Versions after the upgrade:
Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z)
Manager Version pve-manager/8.3.0/c1689ccb1065a83b

Hope this was helpful. If anyone has ideas on why this happened,
I welcome any responses.

Thanks.

JR


