From anders.ostling at gmail.com Tue Nov 5 14:26:15 2024
From: anders.ostling at gmail.com (Anders Östling)
Date: Tue, 5 Nov 2024 14:26:15 +0100
Subject: [PVE-User] Bug reports
Message-ID: <700A4AF8-12A3-4B12-B802-1215E156CBFB@gmail.com>

Hi

I have just configured a PVE host with AD authentication sync. I think that I found a bug. AD allows spaces in usernames (bad idea, but still). My client uses such user names for some accounts.

starting sync for realm YYY-XXX.SE
value 'cad cam2 at XXX-YYY.SE' does not look like a valid user name
value 'cad cam3 at XXX-YYY.SE' does not look like a valid user name
value 'CAD CAM at XXX-YYY.SE' does not look like a valid user name
got data from server, updating users
syncing users (remove-vanished opts: none)
adding user 'Administrator at XXX-XXX.SE'

So this may or may not be a bug in the sync code. IDK

/Anders

From alwin at antreich.com Tue Nov 5 20:04:42 2024
From: alwin at antreich.com (Alwin Antreich)
Date: Tue, 05 Nov 2024 20:04:42 +0100
Subject: [PVE-User] Bug reports
In-Reply-To: <700A4AF8-12A3-4B12-B802-1215E156CBFB@gmail.com>
References: <700A4AF8-12A3-4B12-B802-1215E156CBFB@gmail.com>
Message-ID: 

On November 5, 2024 2:26:15 PM GMT+01:00, "Anders Östling" wrote:
>Hi
>
>I have just configured a PVE host with AD authentication sync. I think that I found a bug. AD allows spaces in usernames (bad idea, but still). My client uses such user names for some accounts.
>
>starting sync for realm YYY-XXX.SE
>value 'cad cam2 at XXX-YYY.SE' does not look like a valid user name
>value 'cad cam3 at XXX-YYY.SE' does not look like a valid user name
>value 'CAD CAM at XXX-YYY.SE' does not look like a valid user name
>got data from server, updating users
>syncing users (remove-vanished opts: none)
>adding user 'Administrator at XXX-XXX.SE'
>
>So this may or may not be a bug in the sync code. IDK
>
>/Anders
>_______________________________________________
>pve-user mailing list
>pve-user at lists.proxmox.com
>https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

Hi Anders,

please report bugs on .

Cheers,
Alwin
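For context on those sync messages: Proxmox VE validates every synced user ID (name at realm) before creating it, and an entry containing whitespace fails that check, so the affected AD accounts are skipped rather than imported. A minimal sketch of that kind of check (the pattern below is an illustrative assumption, not the exact regex used by pve-access-control):

```python
import re

# Illustrative user-ID check, assuming a PVE-like rule that forbids
# whitespace and separator characters in the user part of 'name@realm'.
# This is NOT the exact pattern shipped with pve-access-control.
USERID_RE = re.compile(r'^[^\s:/]+@[A-Za-z][A-Za-z0-9.\-_]*$')

def looks_like_valid_userid(userid: str) -> bool:
    """Return True if 'name@realm' would pass a PVE-style sanity check."""
    return bool(USERID_RE.match(userid))

for candidate in ["cad cam2@XXX-YYY.SE", "Administrator@XXX-XXX.SE"]:
    state = "ok" if looks_like_valid_userid(candidate) else "does not look like a valid user name"
    print(f"value '{candidate}': {state}")
```

Renaming the affected AD accounts, or syncing on an attribute that carries no spaces if one is available, avoids the rejection.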
From t.lamprecht at proxmox.com Thu Nov 21 13:11:42 2024
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Thu, 21 Nov 2024 13:11:42 +0100
Subject: [PVE-User] Proxmox VE 8.3 released!
Message-ID: <640074fa-faf4-410d-a58b-4af5f73e2746@proxmox.com>

Hi All!

We are excited to announce that our latest software version 8.3 for Proxmox Virtual Environment is now available for download. This release is based on Debian 12.8 "Bookworm" but uses a newer Linux kernel 6.8.12-4 and kernel 6.11 as opt-in, QEMU 9.0.2, LXC 6.0.0, and ZFS 2.2.6 (with compatibility patches for Kernel 6.11).

Proxmox VE 8.3 comes full of new features and highlights
- Support for Ceph Reef and Ceph Squid
- Tighter integration of the SDN stack with the firewall
- New webhook notification target
- New view type "Tag View" for the resource tree
- New change detection modes for speeding up container backups to Proxmox Backup Server
- More streamlined guest import from files in OVF and OVA
- and much more

As always, we have included countless bugfixes and improvements in many places; see the release notes for all details.

Release notes
https://pve.proxmox.com/wiki/Roadmap

Press release
https://www.proxmox.com/en/news/press-releases

Video tutorial
https://www.proxmox.com/en/training/video-tutorials/item/what-s-new-in-proxmox-ve-8-3

Download
https://www.proxmox.com/en/downloads
Alternate ISO download: https://enterprise.proxmox.com/iso

Documentation
https://pve.proxmox.com/pve-docs

Community Forum
https://forum.proxmox.com

Bugtracker
https://bugzilla.proxmox.com

Source code
https://git.proxmox.com

There has been a lot of feedback from our community members and customers, and many of you reported bugs, submitted patches and were involved in testing - THANK YOU for your support!

With this release we want to pay tribute to a special member of the community who unfortunately passed away too soon. RIP tteck! tteck was a genuine community member and he helped a lot of users with his Proxmox VE Helper-Scripts. He will be missed. We want to express sincere condolences to his wife and family.

FAQ

Q: Can I upgrade latest Proxmox VE 7 to 8 with apt?
A: Yes, please follow the upgrade instructions on https://pve.proxmox.com/wiki/Upgrade_from_7_to_8

Q: Can I upgrade an 8.0 installation to the stable 8.3 via apt?
A: Yes, upgrading from 8.0 is possible via apt and GUI.

Q: Can I install Proxmox VE 8.3 on top of Debian 12 "Bookworm"?
A: Yes, see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm

Q: Can I upgrade from Ceph Reef to Ceph Squid?
A: Yes, see https://pve.proxmox.com/wiki/Ceph_Reef_to_Squid

Q: Can I upgrade my Proxmox VE 7.4 cluster with Ceph Pacific to Proxmox VE 8.3 and to Ceph Reef?
A: This is a three-step process. First, you have to upgrade Ceph from Pacific to Quincy, and afterwards you can then upgrade Proxmox VE from 7.4 to 8.3. As soon as you run Proxmox VE 8.3, you can upgrade Ceph to Reef. There are a lot of improvements and changes, so please follow the upgrade documentation exactly:
https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy
https://pve.proxmox.com/wiki/Upgrade_from_7_to_8
https://pve.proxmox.com/wiki/Ceph_Quincy_to_Reef

Q: Where can I get more information about feature updates?
A: Check the https://pve.proxmox.com/wiki/Roadmap, https://forum.proxmox.com/, the https://lists.proxmox.com/, and/or subscribe to our https://www.proxmox.com/en/news.

--
Best Regards,
Thomas Lamprecht

From jmr.richardson at gmail.com Fri Nov 22 07:16:53 2024
From: jmr.richardson at gmail.com (JR Richardson)
Date: Fri, 22 Nov 2024 00:16:53 -0600
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
Message-ID: 

Hey Folks,

Just wanted to share an experience I recently had, Cluster parameters: 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage.

Server Specs:
CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets)
Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z)
Manager Version pve-manager/8.2.4/faa83925c9641325

Super stable environment for many years through software and hardware upgrades, few issues to speak of, then without warning one of my hypervisors in 3 node group crashed with a memory DIMM error, cluster HA took over and restarted the VMs on the other two nodes in the group as expected. The problem quickly materialized as the VMs started rebooting quickly, a lot of network issues and notice of migration pending. I could not lock down exactly what the root cause was. Notable was these particular VMs all have multiple network interfaces.
After several hours of not being able to get the current VMs stable, I tried spinning up new VMs on to no avail, reboots persisted on the new VMs. This seemed to only affect the VMs that were on the hypervisor that failed; all other VMs across the cluster were fine.

I have not installed any third-party monitoring software, found a few posts in the forum about it, but was not my issue.

In an act of desperation, I performed a dist-upgrade and this solved the issue straight away.
Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z)
Manager Version pve-manager/8.3.0/c1689ccb1065a83b

Hope this was helpful and if there are any ideas on why this happened, I welcome any responses.

Thanks.

JR

From mark at tuxis.nl Fri Nov 22 08:53:29 2024
From: mark at tuxis.nl (Mark Schouten)
Date: Fri, 22 Nov 2024 08:53:29 +0100
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
In-Reply-To: 
References: 
Message-ID: 

Hi JR,

What do you mean by 'reboot'? Does the vm crash so that it is powered down from a HA point of view and started back up? Or does the VM OS nicely reboot?

Mark Schouten

> Op 22 nov 2024 om 07:18 heeft JR Richardson het volgende geschreven: > > Hey Folks, > > Just wanted to share an experience I recently had, Cluster parameters: > 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage. > Server Specs: > CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets) > Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z) > Manager Version pve-manager/8.2.4/faa83925c9641325 > > Super stable environment for many years through software and hardware > upgrades, few issues to speak of, then without warning one of my > hypervisors in 3 node group crashed with a memory DIMM error, cluster > HA took over and restarted the VMs on the other two nodes in the group > as expected. The problem quickly materialized as the VMs started > rebooting quickly, a lot of network issues and notice of migration > pending. I could not lock down exactly what the root cause was. Notable > was these particular VMs all have multiple network interfaces. After > several hours of not being able to get the current VMs stable, I tried > spinning up new VMs on to no avail, reboots persisted on the new VMs. > This seemed to only affect the VMs that were on the hypervisor that > failed; all other VMs across the cluster were fine. > > I have not installed any third-party monitoring software, found a few > posts in the forum about it, but was not my issue. > > In an act of desperation, I performed a dist-upgrade and this solved > the issue straight away. > Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) > Manager Version pve-manager/8.3.0/c1689ccb1065a83b > > Hope this was helpful and if there are any ideas on why this happened, I welcome any responses. > > Thanks. > > JR > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user >
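Mark's question - HA-triggered restart versus an in-guest reboot - can usually be answered from the node's own logs: the HA local resource manager writes a "starting service vm:<vmid>" line every time it (re)starts a resource, as the log excerpts later in this thread show, while a clean reboot from inside the guest does not. A small sketch that counts those events from the systemd journal (assuming journald and the stock pve-ha-lrm unit name; adjust the time window to the incident):

```python
import collections
import re
import subprocess

# Pull recent pve-ha-lrm entries from the systemd journal (assumes journald
# and the stock unit name used by Proxmox VE's HA local resource manager).
out = subprocess.run(
    ["journalctl", "-u", "pve-ha-lrm", "--since", "-6h", "--no-pager", "-o", "short-iso"],
    capture_output=True, text=True, check=True,
).stdout

# Every HA-driven (re)start is logged as "starting service vm:<vmid>".
starts = collections.Counter(re.findall(r"starting service vm:(\d+)", out))

for vmid, count in starts.most_common():
    print(f"VM {vmid}: started by the HA stack {count} time(s) in the last 6 hours")
```

A VM that shows up here repeatedly is being stopped or killed and then recovered by HA, not rebooting on its own.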
From jmr.richardson at gmail.com Fri Nov 22 17:59:03 2024
From: jmr.richardson at gmail.com (JR Richardson)
Date: Fri, 22 Nov 2024 10:59:03 -0600
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
Message-ID: <000e01db3cff$d6a20130$83e60390$@gmail.com>

Hi Mark,

Found this error during log review:
"vvepve13 pvestatd[1468]: VM 13113 qmp command failed - VM 13113 qmp command 'query-proxmox-support' failed - unable to connect to VM 13113 qmp socket - timeout after 51 retries"

HA was sending shutdown to the VM after not being able to verify the VM was running. I initially thought this was networking related, but as I investigated further, this seems like a bug in 'qm', so strange, been running on this version for months, doing migrations and spinning up new VMs without any issues.

Thanks JR

Hi JR,

What do you mean by 'reboot'? Does the vm crash so that it is powered down from a HA point of view and started back up? Or does the VM OS nicely reboot?

Mark Schouten

> Op 22 nov 2024 om 07:18 heeft JR Richardson het volgende geschreven: > > Hey Folks, > > Just wanted to share an experience I recently had, Cluster parameters: > 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage. > Server Specs: > CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets) > Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z) > Manager Version pve-manager/8.2.4/faa83925c9641325 > > Super stable environment for many years through software and hardware > upgrades, few issues to speak of, then without warning one of my > hypervisors in 3 node group crashed with a memory DIMM error, cluster > HA took over and restarted the VMs on the other two nodes in the group > as expected. The problem quickly materialized as the VMs started > rebooting quickly, a lot of network issues and notice of migration > pending. I could not lock down exactly what the root cause was. Notable > was these particular VMs all have multiple network interfaces. After > several hours of not being able to get the current VMs stable, I tried > spinning up new VMs on to no avail, reboots persisted on the new VMs. > This seemed to only affect the VMs that were on the hypervisor that > failed; all other VMs across the cluster were fine. > > I have not installed any third-party monitoring software, found a few > posts in the forum about it, but was not my issue. > > In an act of desperation, I performed a dist-upgrade and this solved > the issue straight away. > Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) > Manager Version pve-manager/8.3.0/c1689ccb1065a83b > > Hope this was helpful and if there are any ideas on why this happened, > I welcome any responses. > > Thanks. > > JR
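The pvestatd message above means the daemon could not reach the VM's QMP monitor socket, which is how Proxmox VE checks whether a guest is actually alive - consistent with JR's observation that HA cycled the VM when it could not verify it was running. For diagnosis, the same interface can be probed by hand over the per-VM UNIX socket. A minimal sketch (the /var/run/qemu-server/<vmid>.qmp path is the usual PVE convention, assumed here; this is not what qm or pvestatd literally execute):

```python
import json
import socket

def qmp_query_status(vmid: int, timeout: float = 5.0) -> dict:
    """Connect to a VM's QMP socket and ask QEMU for its run state."""
    path = f"/var/run/qemu-server/{vmid}.qmp"  # assumed PVE socket location
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        s.connect(path)
        rfile = s.makefile("r")
        json.loads(rfile.readline())                     # QMP greeting banner
        s.sendall(b'{"execute": "qmp_capabilities"}\n')  # enter command mode
        json.loads(rfile.readline())
        s.sendall(b'{"execute": "query-status"}\n')      # e.g. {"return": {"running": true, ...}}
        return json.loads(rfile.readline())

if __name__ == "__main__":
    print(qmp_query_status(13113))
```

If this call times out while the VM is supposedly running, the QEMU process itself is hung or gone, which matches the "timeout after 51 retries" in the quoted log line.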
From seiichirou.hiraoka at gmail.com Sat Nov 23 03:31:28 2024
From: seiichirou.hiraoka at gmail.com (Seiichirou Hiraoka)
Date: Sat, 23 Nov 2024 11:31:28 +0900
Subject: [PVE-User] Proposal to establish a Japanese section on the official Proxmox forum.
In-Reply-To: <000e01db3cff$d6a20130$83e60390$@gmail.com>
References: <000e01db3cff$d6a20130$83e60390$@gmail.com>
Message-ID: 

Dear members of the Proxmox community,

Thank you very much for your support. We have recently started using Proxmox and are deeply impressed by its excellent functionality and stability. However, in the process of gathering information, I realised that the official Japanese documentation is not well prepared. For this reason, we are making efforts to share information ourselves by translating documents using DeepL and publishing them on GitHub.

A look at the official forums shows that in addition to English, a German forum has been set up, indicating an active user community in German-speaking countries. We believe that such multilingual support is very beneficial for the expansion of the community and the exchange of information between users. There are many examples of open source software (OSS) forming communities in various languages and contributing to global dissemination.

In Japan, the demand for Proxmox is growing and there are many users. However, language barriers still limit access to information and communication. We therefore propose that a Japanese section be set up in the official forums. This would allow Japanese-speaking users to exchange information and solve problems more smoothly, which we believe would result in the further spread of Proxmox and the revitalisation of the community.

Of course, we understand that there may be cases where it is difficult to immediately set up a forum due to the absence of Japanese-speaking staff. In such cases, we are considering community-led Japanese-language forums, and would be grateful for official support and advice.

We would be grateful if you could consider this proposal. Thank you very much in advance.
- flathill

From alwin at antreich.com Mon Nov 25 06:32:16 2024
From: alwin at antreich.com (Alwin Antreich)
Date: Mon, 25 Nov 2024 06:32:16 +0100
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
In-Reply-To: 
References: 
Message-ID: <254CB7A1-E72D-442B-9956-721A4D66BEAE@antreich.com>

On November 22, 2024 7:16:53 AM GMT+01:00, JR Richardson wrote:
>Hey Folks, > >Just wanted to share an experience I recently had, Cluster parameters: >7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage. >Server Specs: >CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets) >Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z) >Manager Version pve-manager/8.2.4/faa83925c9641325 > >Super stable environment for many years through software and hardware >upgrades, few issues to speak of, then without warning one of my >hypervisors in 3 node group crashed with a memory DIMM error, cluster >HA took over and restarted the VMs on the other two nodes in the group >as expected. The problem quickly materialized as the VMs started >rebooting quickly, a lot of network issues and notice of migration >pending. I could not lock down exactly what the root cause was. Notable

This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling?

>was these particular VMs all have multiple network interfaces. After >several hours of not being able to get the current VMs stable, I tried >spinning up new VMs on to no avail, reboots persisted on the new VMs. >This seemed to only affect the VMs that were on the hypervisor that >failed; all other VMs across the cluster were fine. > >I have not installed any third-party monitoring software, found a few >posts in the forum about it, but was not my issue. > >In an act of desperation, I performed a dist-upgrade and this solved >the issue straight away. >Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) >Manager Version pve-manager/8.3.0/c1689ccb1065a83b

The upgrade likely restarted the pve-ha-lrm service, which could break the migration cycle.

The systemd logs should give you a clue to what was happening, the ha stack logs the actions on the given node.

Cheers,
Alwin

Hi JR,

From jmr.richardson at gmail.com Mon Nov 25 16:08:17 2024
From: jmr.richardson at gmail.com (JR Richardson)
Date: Mon, 25 Nov 2024 09:08:17 -0600
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
In-Reply-To: 
References: 
Message-ID: 

> >Super stable environment for many years through software and hardware > >upgrades, few issues to speak of, then without warning one of my > >hypervisors in 3 node group crashed with a memory DIMM error, cluster > >HA took over and restarted the VMs on the other two nodes in the group > >as expected. The problem quickly materialized as the VMs started > >rebooting quickly, a lot of network issues and notice of migration > >pending. I could not lock down exactly what the root cause was. Notable > This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling?

CRS option is set to basic, not dynamic.

> > >was these particular VMs all have multiple network interfaces. After > >several hours of not being able to get the current VMs stable, I tried > >spinning up new VMs on to no avail, reboots persisted on the new VMs. > >This seemed to only affect the VMs that were on the hypervisor that > >failed; all other VMs across the cluster were fine. > > > >I have not installed any third-party monitoring software, found a few > >posts in the forum about it, but was not my issue.
> > > >In an act of desperation, I performed a dist-upgrade and this solved > >the issue straight away. > >Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) > >Manager Version pve-manager/8.3.0/c1689ccb1065a83b > The upgrade likely restarted the pve-ha-lrm service, which could break the migration cycle. > > The systemd logs should give you a clue to what was happening, the ha stack logs the actions on the given node. I don't see anything particular in the lrm logs, just starting the VMs over and over. Here are relevant syslog entries from the end of one cycle reboot to beginning startup. 2024-11-21T18:36:59.023578-06:00 vvepve13 qmeventd[3838]: Starting cleanup for 13101 2024-11-21T18:36:59.105435-06:00 vvepve13 qmeventd[3838]: Finished cleanup for 13101 2024-11-21T18:37:30.758618-06:00 vvepve13 pve-ha-lrm[1608]: successfully acquired lock 'ha_agent_vvepve13_lock' 2024-11-21T18:37:30.758861-06:00 vvepve13 pve-ha-lrm[1608]: watchdog active 2024-11-21T18:37:30.758977-06:00 vvepve13 pve-ha-lrm[1608]: status change wait_for_agent_lock => active 2024-11-21T18:37:30.789271-06:00 vvepve13 pve-ha-lrm[4337]: starting service vm:13101 2024-11-21T18:37:30.808204-06:00 vvepve13 pve-ha-lrm[4338]: start VM 13101: UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: 2024-11-21T18:37:30.808383-06:00 vvepve13 pve-ha-lrm[4337]: starting task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: 2024-11-21T18:37:31.112154-06:00 vvepve13 systemd[1]: Started 13101.scope. 2024-11-21T18:37:32.802414-06:00 vvepve13 kernel: [ 316.379944] tap13101i0: entered promiscuous mode 2024-11-21T18:37:32.846352-06:00 vvepve13 kernel: [ 316.423935] vmbr0: port 10(tap13101i0) entered blocking state 2024-11-21T18:37:32.846372-06:00 vvepve13 kernel: [ 316.423946] vmbr0: port 10(tap13101i0) entered disabled state 2024-11-21T18:37:32.846375-06:00 vvepve13 kernel: [ 316.423990] tap13101i0: entered allmulticast mode 2024-11-21T18:37:32.847377-06:00 vvepve13 kernel: [ 316.424825] vmbr0: port 10(tap13101i0) entered blocking state 2024-11-21T18:37:32.847391-06:00 vvepve13 kernel: [ 316.424832] vmbr0: port 10(tap13101i0) entered forwarding state 2024-11-21T18:37:34.594397-06:00 vvepve13 kernel: [ 318.172029] tap13101i1: entered promiscuous mode 2024-11-21T18:37:34.640376-06:00 vvepve13 kernel: [ 318.217302] vmbr0: port 11(tap13101i1) entered blocking state 2024-11-21T18:37:34.640393-06:00 vvepve13 kernel: [ 318.217310] vmbr0: port 11(tap13101i1) entered disabled state 2024-11-21T18:37:34.640396-06:00 vvepve13 kernel: [ 318.217341] tap13101i1: entered allmulticast mode 2024-11-21T18:37:34.640398-06:00 vvepve13 kernel: [ 318.218073] vmbr0: port 11(tap13101i1) entered blocking state 2024-11-21T18:37:34.640400-06:00 vvepve13 kernel: [ 318.218077] vmbr0: port 11(tap13101i1) entered forwarding state 2024-11-21T18:37:35.819630-06:00 vvepve13 pve-ha-lrm[4337]: Task 'UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam:' still active, waiting 2024-11-21T18:37:36.249349-06:00 vvepve13 kernel: [ 319.827024] tap13101i2: entered promiscuous mode 2024-11-21T18:37:36.291346-06:00 vvepve13 kernel: [ 319.868406] vmbr0: port 12(tap13101i2) entered blocking state 2024-11-21T18:37:36.291365-06:00 vvepve13 kernel: [ 319.868417] vmbr0: port 12(tap13101i2) entered disabled state 2024-11-21T18:37:36.291367-06:00 vvepve13 kernel: [ 319.868443] tap13101i2: entered allmulticast mode 2024-11-21T18:37:36.291368-06:00 vvepve13 kernel: [ 319.869185] vmbr0: port 12(tap13101i2) entered blocking state 
2024-11-21T18:37:36.291369-06:00 vvepve13 kernel: [ 319.869191] vmbr0: port 12(tap13101i2) entered forwarding state 2024-11-21T18:37:37.997394-06:00 vvepve13 kernel: [ 321.575034] tap13101i3: entered promiscuous mode 2024-11-21T18:37:38.040384-06:00 vvepve13 kernel: [ 321.617225] vmbr0: port 13(tap13101i3) entered blocking state 2024-11-21T18:37:38.040396-06:00 vvepve13 kernel: [ 321.617236] vmbr0: port 13(tap13101i3) entered disabled state 2024-11-21T18:37:38.040400-06:00 vvepve13 kernel: [ 321.617278] tap13101i3: entered allmulticast mode 2024-11-21T18:37:38.040402-06:00 vvepve13 kernel: [ 321.618070] vmbr0: port 13(tap13101i3) entered blocking state 2024-11-21T18:37:38.040403-06:00 vvepve13 kernel: [ 321.618077] vmbr0: port 13(tap13101i3) entered forwarding state 2024-11-21T18:37:38.248094-06:00 vvepve13 pve-ha-lrm[4337]: end task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: OK 2024-11-21T18:37:38.254144-06:00 vvepve13 pve-ha-lrm[4337]: service status vm:13101 started 2024-11-21T18:37:44.256824-06:00 vvepve13 QEMU[3794]: kvm: ../accel/kvm/kvm-all.c:1836: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. 2024-11-21T18:38:17.486394-06:00 vvepve13 kernel: [ 361.063298] vmbr0: port 10(tap13101i0) entered disabled state 2024-11-21T18:38:17.486423-06:00 vvepve13 kernel: [ 361.064099] tap13101i0 (unregistering): left allmulticast mode 2024-11-21T18:38:17.486426-06:00 vvepve13 kernel: [ 361.064110] vmbr0: port 10(tap13101i0) entered disabled state 2024-11-21T18:38:17.510386-06:00 vvepve13 kernel: [ 361.087517] vmbr0: port 11(tap13101i1) entered disabled state 2024-11-21T18:38:17.510400-06:00 vvepve13 kernel: [ 361.087796] tap13101i1 (unregistering): left allmulticast mode 2024-11-21T18:38:17.510403-06:00 vvepve13 kernel: [ 361.087805] vmbr0: port 11(tap13101i1) entered disabled state 2024-11-21T18:38:17.540386-06:00 vvepve13 kernel: [ 361.117511] vmbr0: port 12(tap13101i2) entered disabled state 2024-11-21T18:38:17.540402-06:00 vvepve13 kernel: [ 361.117817] tap13101i2 (unregistering): left allmulticast mode 2024-11-21T18:38:17.540404-06:00 vvepve13 kernel: [ 361.117827] vmbr0: port 12(tap13101i2) entered disabled state 2024-11-21T18:38:17.561380-06:00 vvepve13 kernel: [ 361.138518] vmbr0: port 13(tap13101i3) entered disabled state 2024-11-21T18:38:17.561394-06:00 vvepve13 kernel: [ 361.138965] tap13101i3 (unregistering): left allmulticast mode 2024-11-21T18:38:17.561399-06:00 vvepve13 kernel: [ 361.138977] vmbr0: port 13(tap13101i3) entered disabled state 2024-11-21T18:38:17.584412-06:00 vvepve13 systemd[1]: 13101.scope: Deactivated successfully. 2024-11-21T18:38:17.584619-06:00 vvepve13 systemd[1]: 13101.scope: Consumed 51.122s CPU time. 
2024-11-21T18:38:18.522886-06:00 vvepve13 pvestatd[1476]: VM 13101 qmp command failed - VM 13101 not running 2024-11-21T18:38:18.523725-06:00 vvepve13 pve-ha-lrm[4889]: end task UPID:vvepve13:0000131A:00008A78:673FD272:qmstart:13104:root at pam: OK 2024-11-21T18:38:18.945142-06:00 vvepve13 qmeventd[4990]: Starting cleanup for 13101 2024-11-21T18:38:19.022405-06:00 vvepve13 qmeventd[4990]: Finished cleanup for 13101 Thanks JR From alwin at antreich.com Wed Nov 27 10:38:59 2024 From: alwin at antreich.com (Alwin Antreich) Date: Wed, 27 Nov 2024 09:38:59 +0000 Subject: [PVE-User] VMs With Multiple Interfaces Rebooting In-Reply-To: References: Message-ID: <64bc4bad4c8528beaf44558880c9723751431d16@antreich.com> Hi JR, November 25, 2024 at 4:08 PM, "JR Richardson" wrote: > > > > > Super stable environment for many years through software and hardware > > upgrades, few issues to speak of, then without warning one of my > > hypervisors in 3 node group crashed with a memory dimm error, cluster > > HA took over and restarted the VMs on the other two nodes in the group > > as expected. The problem quickly materialized as the VMs started > > rebooting quickly, a lot of network issues and notice of migration > > pending. I could not lockdown exactly what the root cause was. Notable > > This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling? > > > CRS option is set to basic, not dynamic. K, basic. And I meant is rebalance active. :) > > 2024-11-21T18:37:38.248094-06:00 vvepve13 pve-ha-lrm[4337]: > end task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: > OK > 2024-11-21T18:37:38.254144-06:00 vvepve13 pve-ha-lrm[4337]: service > status vm:13101 started > 2024-11-21T18:37:44.256824-06:00 vvepve13 QEMU[3794]: kvm: > ../accel/kvm/kvm-all.c:1836: kvm_irqchip_commit_routes: Assertion `ret > == 0' failed. This doesn't look good. I'd assume that this is VM13101, which failed to start. And was consequently moved to the other remaining node (vice versa). But this doesn't explain the WHY. You will need to look further into the logs to see what else transpired during this time. Cheers, Alwin
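Alwin's suggestion to look further into the logs points at the one line in the excerpt that stands out: the QEMU assertion failure (kvm_irqchip_commit_routes: Assertion `ret == 0' failed.) a few seconds after each HA start, which appears to kill the guest and cause the LRM to start it again. A small sketch that pulls the relevant events out of a saved syslog excerpt and prints them in order, so the start -> assert -> cleanup -> start loop becomes visible (the file name and the patterns are assumptions based on the log lines quoted in this thread):

```python
import re
import sys

# Scan a saved syslog excerpt (e.g. the lines quoted above) and print HA
# starts, QEMU/KVM assertion failures, and qmeventd cleanups in order.
PATTERNS = {
    "ha start":   re.compile(r"pve-ha-lrm\[\d+\]: starting service vm:\d+"),
    "kvm assert": re.compile(r"QEMU\[\d+\]: kvm: .*Assertion .* failed"),
    "cleanup":    re.compile(r"qmeventd\[\d+\]: Starting cleanup for \d+"),
}

def scan(path: str) -> None:
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if not line.strip():
                continue
            timestamp = line.split()[0]
            for label, pat in PATTERNS.items():
                if pat.search(line):
                    print(f"{timestamp}  {label:10s}  {line.strip()}")
                    break

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else "syslog-excerpt.txt")
```

Seeing the assertion repeat right after every "starting service vm:..." entry would support the reading that the crash loop came from the QEMU/KVM side rather than from the HA configuration, which also fits the observation that the newer kernel and QEMU from the dist-upgrade made the problem disappear.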