From anders.ostling at gmail.com Tue Nov 5 14:26:15 2024
From: anders.ostling at gmail.com (Anders Östling)
Date: Tue, 5 Nov 2024 14:26:15 +0100
Subject: [PVE-User] Bug reports
Message-ID: <700A4AF8-12A3-4B12-B802-1215E156CBFB@gmail.com>

Hi

I have just configured a PVE host with AD authentication sync. I think that I found a bug. AD allows spaces in usernames (bad idea, but still). My client uses such user names for some accounts.

starting sync for realm YYY-XXX.SE
value 'cad cam2 at XXX-YYY.SE' does not look like a valid user name
value 'cad cam3 at XXX-YYY.SE' does not look like a valid user name
value 'CAD CAM at XXX-YYY.SE' does not look like a valid user name
got data from server, updating users
syncing users (remove-vanished opts: none)
adding user 'Administrator at XXX-XXX.SE'

So this may or may not be a bug in the sync code. IDK

/Anders

From alwin at antreich.com Tue Nov 5 20:04:42 2024
From: alwin at antreich.com (Alwin Antreich)
Date: Tue, 05 Nov 2024 20:04:42 +0100
Subject: [PVE-User] Bug reports
In-Reply-To: <700A4AF8-12A3-4B12-B802-1215E156CBFB@gmail.com>
References: <700A4AF8-12A3-4B12-B802-1215E156CBFB@gmail.com>
Message-ID: 

On November 5, 2024 2:26:15 PM GMT+01:00, "Anders Östling" wrote:
>Hi
>
>I have just configured a PVE host with AD authentication sync. I think that I found a bug. AD allows spaces in usernames (bad idea, but still). My client uses such user names for some accounts.
>
>starting sync for realm YYY-XXX.SE
>value 'cad cam2 at XXX-YYY.SE' does not look like a valid user name
>value 'cad cam3 at XXX-YYY.SE' does not look like a valid user name
>value 'CAD CAM at XXX-YYY.SE' does not look like a valid user name
>got data from server, updating users
>syncing users (remove-vanished opts: none)
>adding user 'Administrator at XXX-XXX.SE'
>
>So this may or may not be a bug in the sync code. IDK
>
>/Anders
>_______________________________________________
>pve-user mailing list
>pve-user at lists.proxmox.com
>https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

Hi Anders,

please report bugs on .

Cheers,
Alwin
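For context on those sync messages: Proxmox VE validates every synced user ID (name at realm) before creating it, and an entry containing whitespace fails that check, so the affected AD accounts are skipped rather than imported. A minimal sketch of that kind of check (the pattern below is an illustrative assumption, not the exact regex used by pve-access-control):

```python
import re

# Illustrative user-ID check, assuming a PVE-like rule that forbids
# whitespace and separator characters in the user part of 'name@realm'.
# This is NOT the exact pattern shipped with pve-access-control.
USERID_RE = re.compile(r'^[^\s:/]+@[A-Za-z][A-Za-z0-9.\-_]*$')

def looks_like_valid_userid(userid: str) -> bool:
    """Return True if 'name@realm' would pass a PVE-style sanity check."""
    return bool(USERID_RE.match(userid))

for candidate in ["cad cam2@XXX-YYY.SE", "Administrator@XXX-XXX.SE"]:
    state = "ok" if looks_like_valid_userid(candidate) else "does not look like a valid user name"
    print(f"value '{candidate}': {state}")
```

Renaming the affected AD accounts, or syncing on an attribute that carries no spaces if one is available, avoids the rejection.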
From t.lamprecht at proxmox.com Thu Nov 21 13:11:42 2024
From: t.lamprecht at proxmox.com (Thomas Lamprecht)
Date: Thu, 21 Nov 2024 13:11:42 +0100
Subject: [PVE-User] Proxmox VE 8.3 released!
Message-ID: <640074fa-faf4-410d-a58b-4af5f73e2746@proxmox.com>

Hi All!

We are excited to announce that our latest software version 8.3 for Proxmox Virtual Environment is now available for download. This release is based on Debian 12.8 "Bookworm" but uses a newer Linux kernel 6.8.12-4 and kernel 6.11 as opt-in, QEMU 9.0.2, LXC 6.0.0, and ZFS 2.2.6 (with compatibility patches for Kernel 6.11).

Proxmox VE 8.3 comes full of new features and highlights
- Support for Ceph Reef and Ceph Squid
- Tighter integration of the SDN stack with the firewall
- New webhook notification target
- New view type "Tag View" for the resource tree
- New change detection modes for speeding up container backups to Proxmox Backup Server
- More streamlined guest import from files in OVF and OVA
- and much more

As always, we have included countless bugfixes and improvements in many places; see the release notes for all details.

Release notes
https://pve.proxmox.com/wiki/Roadmap

Press release
https://www.proxmox.com/en/news/press-releases

Video tutorial
https://www.proxmox.com/en/training/video-tutorials/item/what-s-new-in-proxmox-ve-8-3

Download
https://www.proxmox.com/en/downloads
Alternate ISO download: https://enterprise.proxmox.com/iso

Documentation
https://pve.proxmox.com/pve-docs

Community Forum
https://forum.proxmox.com

Bugtracker
https://bugzilla.proxmox.com

Source code
https://git.proxmox.com

There has been a lot of feedback from our community members and customers, and many of you reported bugs, submitted patches and were involved in testing - THANK YOU for your support!

With this release we want to pay tribute to a special member of the community who unfortunately passed away too soon. RIP tteck! tteck was a genuine community member and he helped a lot of users with his Proxmox VE Helper-Scripts. He will be missed. We want to express sincere condolences to his wife and family.

FAQ

Q: Can I upgrade latest Proxmox VE 7 to 8 with apt?
A: Yes, please follow the upgrade instructions on https://pve.proxmox.com/wiki/Upgrade_from_7_to_8

Q: Can I upgrade an 8.0 installation to the stable 8.3 via apt?
A: Yes, upgrading from 8.0 is possible via apt and GUI.

Q: Can I install Proxmox VE 8.3 on top of Debian 12 "Bookworm"?
A: Yes, see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_12_Bookworm

Q: Can I upgrade from Ceph Reef to Ceph Squid?
A: Yes, see https://pve.proxmox.com/wiki/Ceph_Reef_to_Squid

Q: Can I upgrade my Proxmox VE 7.4 cluster with Ceph Pacific to Proxmox VE 8.3 and to Ceph Reef?
A: This is a three-step process. First, you have to upgrade Ceph from Pacific to Quincy, and afterwards you can then upgrade Proxmox VE from 7.4 to 8.3. As soon as you run Proxmox VE 8.3, you can upgrade Ceph to Reef. There are a lot of improvements and changes, so please follow the upgrade documentation exactly:
https://pve.proxmox.com/wiki/Ceph_Pacific_to_Quincy
https://pve.proxmox.com/wiki/Upgrade_from_7_to_8
https://pve.proxmox.com/wiki/Ceph_Quincy_to_Reef

Q: Where can I get more information about feature updates?
A: Check the https://pve.proxmox.com/wiki/Roadmap, https://forum.proxmox.com/, the https://lists.proxmox.com/, and/or subscribe to our https://www.proxmox.com/en/news.

--
Best Regards,
Thomas Lamprecht

From jmr.richardson at gmail.com Fri Nov 22 07:16:53 2024
From: jmr.richardson at gmail.com (JR Richardson)
Date: Fri, 22 Nov 2024 00:16:53 -0600
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
Message-ID: 

Hey Folks,

Just wanted to share an experience I recently had, Cluster parameters: 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage.

Server Specs:
CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets)
Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z)
Manager Version pve-manager/8.2.4/faa83925c9641325

Super stable environment for many years through software and hardware upgrades, few issues to speak of, then without warning one of my hypervisors in 3 node group crashed with a memory DIMM error, cluster HA took over and restarted the VMs on the other two nodes in the group as expected. The problem quickly materialized as the VMs started rebooting quickly, a lot of network issues and notice of migration pending. I could not lock down exactly what the root cause was. Notable was these particular VMs all have multiple network interfaces.
After several hours of not being able to get the current VMs stable, I tried spinning up new VMs on to no avail, reboots persisted on the new VMs. This seemed to only affect the VMs that were on the hypervisor that failed; all other VMs across the cluster were fine.

I have not installed any third-party monitoring software, found a few posts in the forum about it, but was not my issue.

In an act of desperation, I performed a dist-upgrade and this solved the issue straight away.
Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z)
Manager Version pve-manager/8.3.0/c1689ccb1065a83b

Hope this was helpful and if there are any ideas on why this happened, I welcome any responses.

Thanks.

JR

From mark at tuxis.nl Fri Nov 22 08:53:29 2024
From: mark at tuxis.nl (Mark Schouten)
Date: Fri, 22 Nov 2024 08:53:29 +0100
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
In-Reply-To: 
References: 
Message-ID: 

Hi JR,

What do you mean by 'reboot'? Does the vm crash so that it is powered down from a HA point of view and started back up? Or does the VM OS nicely reboot?

Mark Schouten

> Op 22 nov 2024 om 07:18 heeft JR Richardson het volgende geschreven: > > Hey Folks, > > Just wanted to share an experience I recently had, Cluster parameters: > 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage. > Server Specs: > CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets) > Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z) > Manager Version pve-manager/8.2.4/faa83925c9641325 > > Super stable environment for many years through software and hardware > upgrades, few issues to speak of, then without warning one of my > hypervisors in 3 node group crashed with a memory DIMM error, cluster > HA took over and restarted the VMs on the other two nodes in the group > as expected. The problem quickly materialized as the VMs started > rebooting quickly, a lot of network issues and notice of migration > pending. I could not lock down exactly what the root cause was. Notable > was these particular VMs all have multiple network interfaces. After > several hours of not being able to get the current VMs stable, I tried > spinning up new VMs on to no avail, reboots persisted on the new VMs. > This seemed to only affect the VMs that were on the hypervisor that > failed; all other VMs across the cluster were fine. > > I have not installed any third-party monitoring software, found a few > posts in the forum about it, but was not my issue. > > In an act of desperation, I performed a dist-upgrade and this solved > the issue straight away. > Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) > Manager Version pve-manager/8.3.0/c1689ccb1065a83b > > Hope this was helpful and if there are any ideas on why this happened, I welcome any responses. > > Thanks. > > JR > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user >
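Mark's question - HA-triggered restart versus an in-guest reboot - can usually be answered from the node's own logs: the HA local resource manager writes a "starting service vm:<vmid>" line every time it (re)starts a resource, as the log excerpts later in this thread show, while a clean reboot from inside the guest does not. A small sketch that counts those events from the systemd journal (assuming journald and the stock pve-ha-lrm unit name; adjust the time window to the incident):

```python
import collections
import re
import subprocess

# Pull recent pve-ha-lrm entries from the systemd journal (assumes journald
# and the stock unit name used by Proxmox VE's HA local resource manager).
out = subprocess.run(
    ["journalctl", "-u", "pve-ha-lrm", "--since", "-6h", "--no-pager", "-o", "short-iso"],
    capture_output=True, text=True, check=True,
).stdout

# Every HA-driven (re)start is logged as "starting service vm:<vmid>".
starts = collections.Counter(re.findall(r"starting service vm:(\d+)", out))

for vmid, count in starts.most_common():
    print(f"VM {vmid}: started by the HA stack {count} time(s) in the last 6 hours")
```

A VM that shows up here repeatedly is being stopped or killed and then recovered by HA, not rebooting on its own.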
From jmr.richardson at gmail.com Fri Nov 22 17:59:03 2024
From: jmr.richardson at gmail.com (JR Richardson)
Date: Fri, 22 Nov 2024 10:59:03 -0600
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
Message-ID: <000e01db3cff$d6a20130$83e60390$@gmail.com>

Hi Mark,

Found this error during log review:
"vvepve13 pvestatd[1468]: VM 13113 qmp command failed - VM 13113 qmp command 'query-proxmox-support' failed - unable to connect to VM 13113 qmp socket - timeout after 51 retries"

HA was sending shutdown to the VM after not being able to verify the VM was running. I initially thought this was networking related, but as I investigated further, this seems like a bug in 'qm', so strange, been running on this version for months, doing migrations and spinning up new VMs without any issues.

Thanks JR

Hi JR,

What do you mean by 'reboot'? Does the vm crash so that it is powered down from a HA point of view and started back up? Or does the VM OS nicely reboot?

Mark Schouten

> Op 22 nov 2024 om 07:18 heeft JR Richardson het volgende geschreven: > > Hey Folks, > > Just wanted to share an experience I recently had, Cluster parameters: > 7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage. > Server Specs: > CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets) > Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z) > Manager Version pve-manager/8.2.4/faa83925c9641325 > > Super stable environment for many years through software and hardware > upgrades, few issues to speak of, then without warning one of my > hypervisors in 3 node group crashed with a memory DIMM error, cluster > HA took over and restarted the VMs on the other two nodes in the group > as expected. The problem quickly materialized as the VMs started > rebooting quickly, a lot of network issues and notice of migration > pending. I could not lock down exactly what the root cause was. Notable > was these particular VMs all have multiple network interfaces. After > several hours of not being able to get the current VMs stable, I tried > spinning up new VMs on to no avail, reboots persisted on the new VMs. > This seemed to only affect the VMs that were on the hypervisor that > failed; all other VMs across the cluster were fine. > > I have not installed any third-party monitoring software, found a few > posts in the forum about it, but was not my issue. > > In an act of desperation, I performed a dist-upgrade and this solved > the issue straight away. > Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) > Manager Version pve-manager/8.3.0/c1689ccb1065a83b > > Hope this was helpful and if there are any ideas on why this happened, > I welcome any responses. > > Thanks. > > JR
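The pvestatd message above means the daemon could not reach the VM's QMP monitor socket, which is how Proxmox VE checks whether a guest is actually alive - consistent with JR's observation that HA cycled the VM when it could not verify it was running. For diagnosis, the same interface can be probed by hand over the per-VM UNIX socket. A minimal sketch (the /var/run/qemu-server/<vmid>.qmp path is the usual PVE convention, assumed here; this is not what qm or pvestatd literally execute):

```python
import json
import socket

def qmp_query_status(vmid: int, timeout: float = 5.0) -> dict:
    """Connect to a VM's QMP socket and ask QEMU for its run state."""
    path = f"/var/run/qemu-server/{vmid}.qmp"  # assumed PVE socket location
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        s.connect(path)
        rfile = s.makefile("r")
        json.loads(rfile.readline())                     # QMP greeting banner
        s.sendall(b'{"execute": "qmp_capabilities"}\n')  # enter command mode
        json.loads(rfile.readline())
        s.sendall(b'{"execute": "query-status"}\n')      # e.g. {"return": {"running": true, ...}}
        return json.loads(rfile.readline())

if __name__ == "__main__":
    print(qmp_query_status(13113))
```

If this call times out while the VM is supposedly running, the QEMU process itself is hung or gone, which matches the "timeout after 51 retries" in the quoted log line.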
From seiichirou.hiraoka at gmail.com Sat Nov 23 03:31:28 2024
From: seiichirou.hiraoka at gmail.com (Seiichirou Hiraoka)
Date: Sat, 23 Nov 2024 11:31:28 +0900
Subject: [PVE-User] Proposal to establish a Japanese section on the official Proxmox forum.
In-Reply-To: <000e01db3cff$d6a20130$83e60390$@gmail.com>
References: <000e01db3cff$d6a20130$83e60390$@gmail.com>
Message-ID: 

Dear members of the Proxmox community,

Thank you very much for your support. We have recently started using Proxmox and are deeply impressed by its excellent functionality and stability. However, in the process of gathering information, I realised that the official Japanese documentation is not well prepared. For this reason, we are making efforts to share information ourselves by translating documents using DeepL and publishing them on GitHub.

A look at the official forums shows that in addition to English, a German forum has been set up, indicating an active user community in German-speaking countries. We believe that such multilingual support is very beneficial for the expansion of the community and the exchange of information between users. There are many examples of open source software (OSS) forming communities in various languages and contributing to global dissemination.

In Japan, the demand for Proxmox is growing and there are many users. However, language barriers still limit access to information and communication. We therefore propose that a Japanese section be set up in the official forums. This would allow Japanese-speaking users to exchange information and solve problems more smoothly, which we believe would result in the further spread of Proxmox and the revitalisation of the community.

Of course, we understand that there may be cases where it is difficult to immediately set up a forum due to the absence of Japanese-speaking staff. In such cases, we are considering community-led Japanese-language forums, and would be grateful for official support and advice.

We would be grateful if you could consider this proposal. Thank you very much in advance.
- flathill

From alwin at antreich.com Mon Nov 25 06:32:16 2024
From: alwin at antreich.com (Alwin Antreich)
Date: Mon, 25 Nov 2024 06:32:16 +0100
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
In-Reply-To: 
References: 
Message-ID: <254CB7A1-E72D-442B-9956-721A4D66BEAE@antreich.com>

On November 22, 2024 7:16:53 AM GMT+01:00, JR Richardson wrote:
>Hey Folks, > >Just wanted to share an experience I recently had, Cluster parameters: >7 nodes, 2 HA Groups (3 nodes and 4 nodes), shared storage. >Server Specs: >CPU(s) 40 x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (2 Sockets) >Kernel Version Linux 6.8.12-1-pve (2024-08-05T16:17Z) >Manager Version pve-manager/8.2.4/faa83925c9641325 > >Super stable environment for many years through software and hardware >upgrades, few issues to speak of, then without warning one of my >hypervisors in 3 node group crashed with a memory DIMM error, cluster >HA took over and restarted the VMs on the other two nodes in the group >as expected. The problem quickly materialized as the VMs started >rebooting quickly, a lot of network issues and notice of migration >pending. I could not lock down exactly what the root cause was. Notable

This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling?

>was these particular VMs all have multiple network interfaces. After >several hours of not being able to get the current VMs stable, I tried >spinning up new VMs on to no avail, reboots persisted on the new VMs. >This seemed to only affect the VMs that were on the hypervisor that >failed; all other VMs across the cluster were fine. > >I have not installed any third-party monitoring software, found a few >posts in the forum about it, but was not my issue. > >In an act of desperation, I performed a dist-upgrade and this solved >the issue straight away. >Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) >Manager Version pve-manager/8.3.0/c1689ccb1065a83b

The upgrade likely restarted the pve-ha-lrm service, which could break the migration cycle.

The systemd logs should give you a clue to what was happening, the ha stack logs the actions on the given node.

Cheers,
Alwin

Hi JR,

From jmr.richardson at gmail.com Mon Nov 25 16:08:17 2024
From: jmr.richardson at gmail.com (JR Richardson)
Date: Mon, 25 Nov 2024 09:08:17 -0600
Subject: [PVE-User] VMs With Multiple Interfaces Rebooting
In-Reply-To: 
References: 
Message-ID: 

> >Super stable environment for many years through software and hardware > >upgrades, few issues to speak of, then without warning one of my > >hypervisors in 3 node group crashed with a memory DIMM error, cluster > >HA took over and restarted the VMs on the other two nodes in the group > >as expected. The problem quickly materialized as the VMs started > >rebooting quickly, a lot of network issues and notice of migration > >pending. I could not lock down exactly what the root cause was. Notable > This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling?

CRS option is set to basic, not dynamic.

> > >was these particular VMs all have multiple network interfaces. After > >several hours of not being able to get the current VMs stable, I tried > >spinning up new VMs on to no avail, reboots persisted on the new VMs. > >This seemed to only affect the VMs that were on the hypervisor that > >failed; all other VMs across the cluster were fine. > > > >I have not installed any third-party monitoring software, found a few > >posts in the forum about it, but was not my issue.
> > > >In an act of desperation, I performed a dist-upgrade and this solved > >the issue straight away. > >Kernel Version Linux 6.8.12-4-pve (2024-11-06T15:04Z) > >Manager Version pve-manager/8.3.0/c1689ccb1065a83b > The upgrade likely restarted the pve-ha-lrm service, which could break the migration cycle. > > The systemd logs should give you a clue to what was happening, the ha stack logs the actions on the given node. I don't see anything particular in the lrm logs, just starting the VMs over and over. Here are relevant syslog entries from the end of one cycle reboot to beginning startup. 2024-11-21T18:36:59.023578-06:00 vvepve13 qmeventd[3838]: Starting cleanup for 13101 2024-11-21T18:36:59.105435-06:00 vvepve13 qmeventd[3838]: Finished cleanup for 13101 2024-11-21T18:37:30.758618-06:00 vvepve13 pve-ha-lrm[1608]: successfully acquired lock 'ha_agent_vvepve13_lock' 2024-11-21T18:37:30.758861-06:00 vvepve13 pve-ha-lrm[1608]: watchdog active 2024-11-21T18:37:30.758977-06:00 vvepve13 pve-ha-lrm[1608]: status change wait_for_agent_lock => active 2024-11-21T18:37:30.789271-06:00 vvepve13 pve-ha-lrm[4337]: starting service vm:13101 2024-11-21T18:37:30.808204-06:00 vvepve13 pve-ha-lrm[4338]: start VM 13101: UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: 2024-11-21T18:37:30.808383-06:00 vvepve13 pve-ha-lrm[4337]: starting task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: 2024-11-21T18:37:31.112154-06:00 vvepve13 systemd[1]: Started 13101.scope. 2024-11-21T18:37:32.802414-06:00 vvepve13 kernel: [ 316.379944] tap13101i0: entered promiscuous mode 2024-11-21T18:37:32.846352-06:00 vvepve13 kernel: [ 316.423935] vmbr0: port 10(tap13101i0) entered blocking state 2024-11-21T18:37:32.846372-06:00 vvepve13 kernel: [ 316.423946] vmbr0: port 10(tap13101i0) entered disabled state 2024-11-21T18:37:32.846375-06:00 vvepve13 kernel: [ 316.423990] tap13101i0: entered allmulticast mode 2024-11-21T18:37:32.847377-06:00 vvepve13 kernel: [ 316.424825] vmbr0: port 10(tap13101i0) entered blocking state 2024-11-21T18:37:32.847391-06:00 vvepve13 kernel: [ 316.424832] vmbr0: port 10(tap13101i0) entered forwarding state 2024-11-21T18:37:34.594397-06:00 vvepve13 kernel: [ 318.172029] tap13101i1: entered promiscuous mode 2024-11-21T18:37:34.640376-06:00 vvepve13 kernel: [ 318.217302] vmbr0: port 11(tap13101i1) entered blocking state 2024-11-21T18:37:34.640393-06:00 vvepve13 kernel: [ 318.217310] vmbr0: port 11(tap13101i1) entered disabled state 2024-11-21T18:37:34.640396-06:00 vvepve13 kernel: [ 318.217341] tap13101i1: entered allmulticast mode 2024-11-21T18:37:34.640398-06:00 vvepve13 kernel: [ 318.218073] vmbr0: port 11(tap13101i1) entered blocking state 2024-11-21T18:37:34.640400-06:00 vvepve13 kernel: [ 318.218077] vmbr0: port 11(tap13101i1) entered forwarding state 2024-11-21T18:37:35.819630-06:00 vvepve13 pve-ha-lrm[4337]: Task 'UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam:' still active, waiting 2024-11-21T18:37:36.249349-06:00 vvepve13 kernel: [ 319.827024] tap13101i2: entered promiscuous mode 2024-11-21T18:37:36.291346-06:00 vvepve13 kernel: [ 319.868406] vmbr0: port 12(tap13101i2) entered blocking state 2024-11-21T18:37:36.291365-06:00 vvepve13 kernel: [ 319.868417] vmbr0: port 12(tap13101i2) entered disabled state 2024-11-21T18:37:36.291367-06:00 vvepve13 kernel: [ 319.868443] tap13101i2: entered allmulticast mode 2024-11-21T18:37:36.291368-06:00 vvepve13 kernel: [ 319.869185] vmbr0: port 12(tap13101i2) entered blocking state 
2024-11-21T18:37:36.291369-06:00 vvepve13 kernel: [ 319.869191] vmbr0: port 12(tap13101i2) entered forwarding state 2024-11-21T18:37:37.997394-06:00 vvepve13 kernel: [ 321.575034] tap13101i3: entered promiscuous mode 2024-11-21T18:37:38.040384-06:00 vvepve13 kernel: [ 321.617225] vmbr0: port 13(tap13101i3) entered blocking state 2024-11-21T18:37:38.040396-06:00 vvepve13 kernel: [ 321.617236] vmbr0: port 13(tap13101i3) entered disabled state 2024-11-21T18:37:38.040400-06:00 vvepve13 kernel: [ 321.617278] tap13101i3: entered allmulticast mode 2024-11-21T18:37:38.040402-06:00 vvepve13 kernel: [ 321.618070] vmbr0: port 13(tap13101i3) entered blocking state 2024-11-21T18:37:38.040403-06:00 vvepve13 kernel: [ 321.618077] vmbr0: port 13(tap13101i3) entered forwarding state 2024-11-21T18:37:38.248094-06:00 vvepve13 pve-ha-lrm[4337]: end task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: OK 2024-11-21T18:37:38.254144-06:00 vvepve13 pve-ha-lrm[4337]: service status vm:13101 started 2024-11-21T18:37:44.256824-06:00 vvepve13 QEMU[3794]: kvm: ../accel/kvm/kvm-all.c:1836: kvm_irqchip_commit_routes: Assertion `ret == 0' failed. 2024-11-21T18:38:17.486394-06:00 vvepve13 kernel: [ 361.063298] vmbr0: port 10(tap13101i0) entered disabled state 2024-11-21T18:38:17.486423-06:00 vvepve13 kernel: [ 361.064099] tap13101i0 (unregistering): left allmulticast mode 2024-11-21T18:38:17.486426-06:00 vvepve13 kernel: [ 361.064110] vmbr0: port 10(tap13101i0) entered disabled state 2024-11-21T18:38:17.510386-06:00 vvepve13 kernel: [ 361.087517] vmbr0: port 11(tap13101i1) entered disabled state 2024-11-21T18:38:17.510400-06:00 vvepve13 kernel: [ 361.087796] tap13101i1 (unregistering): left allmulticast mode 2024-11-21T18:38:17.510403-06:00 vvepve13 kernel: [ 361.087805] vmbr0: port 11(tap13101i1) entered disabled state 2024-11-21T18:38:17.540386-06:00 vvepve13 kernel: [ 361.117511] vmbr0: port 12(tap13101i2) entered disabled state 2024-11-21T18:38:17.540402-06:00 vvepve13 kernel: [ 361.117817] tap13101i2 (unregistering): left allmulticast mode 2024-11-21T18:38:17.540404-06:00 vvepve13 kernel: [ 361.117827] vmbr0: port 12(tap13101i2) entered disabled state 2024-11-21T18:38:17.561380-06:00 vvepve13 kernel: [ 361.138518] vmbr0: port 13(tap13101i3) entered disabled state 2024-11-21T18:38:17.561394-06:00 vvepve13 kernel: [ 361.138965] tap13101i3 (unregistering): left allmulticast mode 2024-11-21T18:38:17.561399-06:00 vvepve13 kernel: [ 361.138977] vmbr0: port 13(tap13101i3) entered disabled state 2024-11-21T18:38:17.584412-06:00 vvepve13 systemd[1]: 13101.scope: Deactivated successfully. 2024-11-21T18:38:17.584619-06:00 vvepve13 systemd[1]: 13101.scope: Consumed 51.122s CPU time. 
2024-11-21T18:38:18.522886-06:00 vvepve13 pvestatd[1476]: VM 13101 qmp command failed - VM 13101 not running 2024-11-21T18:38:18.523725-06:00 vvepve13 pve-ha-lrm[4889]: end task UPID:vvepve13:0000131A:00008A78:673FD272:qmstart:13104:root at pam: OK 2024-11-21T18:38:18.945142-06:00 vvepve13 qmeventd[4990]: Starting cleanup for 13101 2024-11-21T18:38:19.022405-06:00 vvepve13 qmeventd[4990]: Finished cleanup for 13101 Thanks JR From alwin at antreich.com Wed Nov 27 10:38:59 2024 From: alwin at antreich.com (Alwin Antreich) Date: Wed, 27 Nov 2024 09:38:59 +0000 Subject: [PVE-User] VMs With Multiple Interfaces Rebooting In-Reply-To: References: Message-ID: <64bc4bad4c8528beaf44558880c9723751431d16@antreich.com> Hi JR, November 25, 2024 at 4:08 PM, "JR Richardson" wrote: > > > > > Super stable environment for many years through software and hardware > > upgrades, few issues to speak of, then without warning one of my > > hypervisors in 3 node group crashed with a memory dimm error, cluster > > HA took over and restarted the VMs on the other two nodes in the group > > as expected. The problem quickly materialized as the VMs started > > rebooting quickly, a lot of network issues and notice of migration > > pending. I could not lockdown exactly what the root cause was. Notable > > This sounds like it wanted to balance the load. Do you have CRS active and/or static load scheduling? > > > CRS option is set to basic, not dynamic. K, basic. And I meant is rebalance active. :) > > 2024-11-21T18:37:38.248094-06:00 vvepve13 pve-ha-lrm[4337]: > end task UPID:vvepve13:000010F2:00007AEA:673FD24A:qmstart:13101:root at pam: > OK > 2024-11-21T18:37:38.254144-06:00 vvepve13 pve-ha-lrm[4337]: service > status vm:13101 started > 2024-11-21T18:37:44.256824-06:00 vvepve13 QEMU[3794]: kvm: > ../accel/kvm/kvm-all.c:1836: kvm_irqchip_commit_routes: Assertion `ret > == 0' failed. This doesn't look good. I'd assume that this is VM13101, which failed to start. And was consequently moved to the other remaining node (vice versa). But this doesn't explain the WHY. You will need to look further into the logs to see what else transpired during this time. Cheers, Alwin
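Alwin's suggestion to look further into the logs points at the one line in the excerpt that stands out: the QEMU assertion failure (kvm_irqchip_commit_routes: Assertion `ret == 0' failed.) a few seconds after each HA start, which appears to kill the guest and cause the LRM to start it again. A small sketch that pulls the relevant events out of a saved syslog excerpt and prints them in order, so the start -> assert -> cleanup -> start loop becomes visible (the file name and the patterns are assumptions based on the log lines quoted in this thread):

```python
import re
import sys

# Scan a saved syslog excerpt (e.g. the lines quoted above) and print HA
# starts, QEMU/KVM assertion failures, and qmeventd cleanups in order.
PATTERNS = {
    "ha start":   re.compile(r"pve-ha-lrm\[\d+\]: starting service vm:\d+"),
    "kvm assert": re.compile(r"QEMU\[\d+\]: kvm: .*Assertion .* failed"),
    "cleanup":    re.compile(r"qmeventd\[\d+\]: Starting cleanup for \d+"),
}

def scan(path: str) -> None:
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if not line.strip():
                continue
            timestamp = line.split()[0]
            for label, pat in PATTERNS.items():
                if pat.search(line):
                    print(f"{timestamp}  {label:10s}  {line.strip()}")
                    break

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else "syslog-excerpt.txt")
```

Seeing the assertion repeat right after every "starting service vm:..." entry would support the reading that the crash loop came from the QEMU/KVM side rather than from the HA configuration, which also fits the observation that the newer kernel and QEMU from the dist-upgrade made the problem disappear.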