From f.thommen at dkfz-heidelberg.de Mon Jan 4 12:44:57 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Mon, 4 Jan 2021 12:44:57 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
Message-ID: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de>

Dear all,

one of our three PVE hypervisors in the cluster crashed (it was fenced successfully) and rebooted automatically. I took the chance to do a complete dist-upgrade and rebooted again.

The PVE Ceph dashboard now reports that

  * the monitor on the host is down (out of quorum), and
  * "A newer version was installed but old version still running, please restart"

The Ceph UI reports monitor version 14.2.11 while in fact 14.2.16 is installed. The hypervisor has been rebooted twice since the upgrade, so it should be basically impossible that the old version is still running.

`systemctl restart ceph.target` and restarting the monitor through the PVE Ceph UI didn't help. The hypervisor is running PVE 6.3-3 (the other two are running 6.3-2 with monitor 14.2.15).

What to do in this situation?

I am happy with either UI or command-line instructions, but I have no Ceph experience besides setting it up following the PVE instructions.

Any help or hint is appreciated.
Cheers, Frank

From f.thommen at dkfz-heidelberg.de Tue Jan 5 20:01:31 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Tue, 5 Jan 2021 20:01:31 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de>
Message-ID: <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>

On 04.01.21 12:44, Frank Thommen wrote:
> [...]

In an attempt to fix the issue I destroyed the monitor through the UI and recreated it. Unfortunately it still cannot be started. A popup tells me that the monitor has been started, but the overview still shows "stopped" and there is no version number any more.

Then I stopped and started Ceph on the node (`pveceph stop; pveceph start`), which resulted in a degraded cluster (1 host down, 7 of 21 OSDs down). OSDs cannot be started through the UI either.

I feel extremely uncomfortable with this situation and would appreciate any hint as to how I should proceed with the problem.

Cheers, Frank
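For orientation, a minimal way to cross-check the version mismatch described above is to compare the cluster's view of the running daemons with the package state on the node. A sketch, not an official procedure; `odcf-pve02` stands in for the affected node named later in the thread:

ceph versions                                  # running version of every daemon, cluster-wide
ceph tell mon.odcf-pve02 version               # ask one specific monitor (only answers if it is reachable)
dpkg -l ceph-mon | tail -n 1                   # package version installed on this node
systemctl restart ceph-mon@odcf-pve02.service  # restart just this monitor instead of all of ceph.target

If `ceph versions` and `dpkg` still disagree after restarting the unit, the displayed version may simply be stale metadata from a monitor that never rejoined the quorum.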
From f.thommen at dkfz-heidelberg.de Tue Jan 5 20:08:09 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Tue, 5 Jan 2021 20:08:09 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>
Message-ID:

On 05.01.21 20:01, Frank Thommen wrote:
> [...]
>
> In an attempt to fix the issue I destroyed the monitor through the UI and recreated it. Unfortunately it still cannot be started. A popup tells me that the monitor has been started, but the overview still shows "stopped" and there is no version number any more.
>
> Then I stopped and started Ceph on the node (`pveceph stop; pveceph start`), which resulted in a degraded cluster (1 host down, 7 of 21 OSDs down). OSDs cannot be started through the UI either.

OSDs and MDSs just took a bit to start, so from this side it looks OK now. But the monitor still refuses to start.

Frank

From uwe.sauter.de at gmail.com Tue Jan 5 20:10:19 2021
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 5 Jan 2021 20:10:19 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de>
Message-ID:

Hi Frank,

did you look into the logs of MON and OSD? Can you provide the list of installed packages of the affected host and the rest of the cluster?

Is the output of "ceph status" the same for all hosts?

Regards,

	Uwe
* "A newer version was installed but old version still running, please restart" >> >> The Ceph UI reports monitor version 14.2.11 while in fact 14.2.16 is installed. The hypervisor has been rebooted twice >> since the upgrade, so it should be basically impossible that the old version is still running. >> >> `systemctl restart ceph.target` and restarting the monitor through the PVE Ceph UI didn't help. The hypervisor is >> running PVE 6.3-3 (the other two are running 6.3-2 with monitor 14.2.15) >> >> What to do in this situation? >> >> I am happy with either UI or commandline instructions, but I have no Ceph experience besides setting up it up >> following the PVE instructions. >> >> Any help or hint is appreciated. >> Cheers, Frank > > In an attempt to fix the issue I destroyed the monitor through the UI and recreated it.? Unfortunately it can still not > be started.? A popup tells me that the monitor has been started, but the overview still shows "stopped" and there is no > version number any more. > > Then I stopped and started Ceph on the node (`pveceph stop; pveceph start`) which resulted in a degraded cluster (1 host > down, 7 of 21 OSDs down). OSDs cannot be started through the UI either. > > I feel extremely uncomfortable with this situation and would appreciate any hint as to how I should proceed with the > problem. > > Cheers, Frank > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user From f.thommen at dkfz-heidelberg.de Tue Jan 5 20:24:56 2021 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Tue, 5 Jan 2021 20:24:56 +0100 Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum In-Reply-To: References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> Message-ID: <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> Hi Uwe, > did you look into the log of MON and OSD? I can't see any specific MON and OSD logs. However the log available in the UI (Ceph -> Log) has lots of messages regarding scrubbing but no messages regarding issues with starting the monitor > Can you provide the list of > installed packages of the affected host and the rest of the cluster? let me compile the lists and post them somewhere. They are quite long. > > Is the output of "ceph status" the same for all hosts? yes Frank > > > Regards, > > ????Uwe > > Am 05.01.21 um 20:01 schrieb Frank Thommen: >> >> On 04.01.21 12:44, Frank Thommen wrote: >>> >>> Dear all, >>> >>> one of our three PVE hypervisors in the cluster crashed (it was >>> fenced successfully) and rebooted automatically.? I took the chance >>> to do a complete dist-upgrade and rebooted again. >>> >>> The PVE Ceph dashboard now reports, that >>> >>> ?? * the monitor on the host is down (out of quorum), and >>> ?? * "A newer version was installed but old version still running, >>> please restart" >>> >>> The Ceph UI reports monitor version 14.2.11 while in fact 14.2.16 is >>> installed. The hypervisor has been rebooted twice since the upgrade, >>> so it should be basically impossible that the old version is still >>> running. >>> >>> `systemctl restart ceph.target` and restarting the monitor through >>> the PVE Ceph UI didn't help. The hypervisor is running PVE 6.3-3 (the >>> other two are running 6.3-2 with monitor 14.2.15) >>> >>> What to do in this situation? 
From uwe.sauter.de at gmail.com Tue Jan 5 20:29:59 2021
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Tue, 5 Jan 2021 20:29:59 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de>
Message-ID:

Frank,

Am 05.01.21 um 20:24 schrieb Frank Thommen:
> I can't see any specific MON and OSD logs. However, the log available in the UI (Ceph -> Log) has lots of messages regarding scrubbing, but no messages regarding issues with starting the monitor.

On each host the logs should be in /var/log/ceph. These should be rotated (see /etc/logrotate.d/ceph-common for details).

Regards,

	Uwe
From f.thommen at dkfz-heidelberg.de Tue Jan 5 20:35:29 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Tue, 5 Jan 2021 20:35:29 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de>
Message-ID: <58eb5cb7-3423-b65b-6c82-dad47aa1d72e@dkfz-heidelberg.de>

On 05.01.21 20:24, Frank Thommen wrote:
>> Can you provide the list of installed packages of the affected host and the rest of the cluster?
>
> let me compile the lists and post them somewhere. They are quite long.

dpkg -l:

* https://pastebin.com/HacFNTDf
* https://pastebin.com/qmya0Y2Y
* https://pastebin.com/5CmudA6L

The second one is from the host where the monitor refuses to start.

Frank

From f.thommen at dkfz-heidelberg.de Tue Jan 5 20:44:36 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Tue, 5 Jan 2021 20:44:36 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To:
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de>
Message-ID:

On 05.01.21 20:29, Uwe Sauter wrote:
> On each host the logs should be in /var/log/ceph.
> These should be rotated (see /etc/logrotate.d/ceph-common for details).

OK. I see lots of

-----------------------
2021-01-05 20:38:05.900 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:07.208 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:08.688 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:08.744 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:09.092 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:12.268 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:12.468 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:12.964 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:15.752 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:17.440 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:19.388 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:19.468 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:22.712 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
2021-01-05 20:38:22.828 7f979e753700  1 mon.odcf-pve02@-1(probing) e4 handle_auth_request failed to assign global_id
-----------------------

in the mon log on the problematic host.

When (unsuccessfully) starting the monitor through the UI, the following entries appear in ceph.audit.log:

-----------------------
2021-01-05 20:40:07.635369 mon.odcf-pve03 (mon.1) 288082 : audit [DBG] from='client.? 192.168.255.2:0/2418486168' entity='client.admin' cmd=[{"format":"json","prefix":"mgr metadata"}]: dispatch
2021-01-05 20:40:07.636592 mon.odcf-pve03 (mon.1) 288083 : audit [DBG] from='client.? 192.168.255.2:0/2418486168' entity='client.admin' cmd=[{"format":"json","prefix":"mgr dump"}]: dispatch
2021-01-05 20:40:08.296793 mon.odcf-pve03 (mon.1) 288084 : audit [DBG] from='client.? 192.168.255.2:0/778781756' entity='client.admin' cmd=[{"format":"json","prefix":"mon metadata"}]: dispatch
2021-01-05 20:40:08.297767 mon.odcf-pve03 (mon.1) 288085 : audit [DBG] from='client.? 192.168.255.2:0/778781756' entity='client.admin' cmd=[{"prefix":"quorum_status","format":"json"}]: dispatch
2021-01-05 20:40:08.436982 mon.odcf-pve01 (mon.0) 389632 : audit [DBG] from='client.? 192.168.255.2:0/784579843' entity='client.admin' cmd=[{"format":"json","prefix":"df"}]: dispatch
-----------------------

192.168.255.2 is the IP address of the problematic host in the Ceph mesh network. odcf-pve01 and odcf-pve03 are the "good" nodes.

However, I am not sure what kind of information I should look for in the logs.

Frank
From f.thommen at dkfz-heidelberg.de Tue Jan 5 21:17:11 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Tue, 5 Jan 2021 21:17:11 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To:
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de>
Message-ID: <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de>

On 05.01.21 21:02, Uwe Sauter wrote:
> There's a paragraph about probing mons on
> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/

I will check that (tomorrow :-)

> Can you connect to ports TCP/3300 and TCP/6789 from the other two hosts? You can use telnet to check this.
> (On host1 or host3 run "telnet host port". To quit telnet, press ctrl+], then type "quit" + enter.)

yes, all working ok

> Is the clock synchronized on the affected host?

yes

> From the package lists I assume that host1 has a GUI installed while host3 additionally acts as MariaDB server? I always keep the package lists in sync on the clusters I run.

no MariaDB server anywhere, and all three have the PVE web UI. I wouldn't know of any other GUI. Additional packages might have come in as prerequisites for individually installed admin tools (e.g. wireshark). We install them ad hoc and don't usually keep them on all hosts.

> It also looks like hosts 1 and 3 are on different kernels (though that shouldn't be an issue here?).

1 and 3 should be the same, while 2 has been updated more recently.

Frank

Am 05.01.21 um 20:44 schrieb Frank Thommen:
> [...]
From f.thommen at dkfz-heidelberg.de Tue Jan 5 21:18:12 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Tue, 5 Jan 2021 21:18:12 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <058f3eca-2e6f-eead-365a-4d451fa160d3@gmail.com>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <058f3eca-2e6f-eead-365a-4d451fa160d3@gmail.com>
Message-ID: <4b7d2cd0-a524-9293-b1b8-5f2f0d324168@dkfz-heidelberg.de>

On 05.01.21 21:05, Uwe Sauter wrote:
> Also, is there still disk space available? It seems that the monitor refuses to start if it can't write to the log files.

There is tons of free disk space on all partitions and filesystems :-)

Frank
From jmr.richardson at gmail.com Thu Jan 7 21:34:23 2021
From: jmr.richardson at gmail.com (JR Richardson)
Date: Thu, 7 Jan 2021 14:34:23 -0600
Subject: [PVE-User] Cluster Shared Storage Mass Move Disks
Message-ID:

Hi All,

I'm running Cluster 6.2 using several shared NFS storage nodes, working great. My question: I have one storage node that I want to decommission, so I need to move the virtual machine disks (qcow2) to other storage nodes. Is there a mechanism/script to do this with more than one disk at a time? I have 55 disks and would like to move them all in bulk, online, without having to shut down the VMs.

Thanks.
JR

--
JR Richardson
Engineering for the Masses
Chasing the Azeotrope

From f.thommen at dkfz-heidelberg.de Fri Jan 8 11:36:15 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Fri, 8 Jan 2021 11:36:15 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de>
Message-ID: <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de>

On 05.01.21 21:17, Frank Thommen wrote:
> On 05.01.21 21:02, Uwe Sauter wrote:
>> There's a paragraph about probing mons on
>> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/
>
> I will check that (tomorrow :-)

using the monitor's admin socket on either of the three nodes I can query the monitors of 01 and 03 (the good ones) but not of 02 (the problematic one):

root@odcf-pve01:~# ceph tell mon.odcf-pve02 mon_status
Error ENOENT: problem getting command descriptions from mon.odcf-pve02
root@odcf-pve01:~#

The monitor daemon is running on all three and the ports are open.

Any other ideas?

Cheers, Frank
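Worth noting: `ceph tell` goes over the network and needs a reachable, authenticated monitor, so it can fail even when the daemon itself is alive. Querying the daemon's local admin socket on the affected host bypasses that path. A sketch, assuming the default socket location:

ceph daemon mon.odcf-pve02 mon_status      # must be run on odcf-pve02 itself
# equivalent long form:
ceph --admin-daemon /var/run/ceph/ceph-mon.odcf-pve02.asok mon_status

A monitor that reports state "probing" there (as in the log excerpts above) is up but has not managed to contact its peers to join the quorum.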
From uwe.sauter.de at gmail.com Fri Jan 8 11:45:35 2021
From: uwe.sauter.de at gmail.com (Uwe Sauter)
Date: Fri, 8 Jan 2021 11:45:35 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de>
Message-ID:

Am 08.01.21 um 11:36 schrieb Frank Thommen:
> using the monitor's admin socket on either of the three nodes I can query the monitors of 01 and 03 (the good ones) but not of 02 (the problematic one).
>
> Any other ideas?

You could check the permissions on the socket:

ss -xln | grep ceph-mon
SOCK=$(ss -xln | awk '/ceph-mon/ {print $5}')
ls -la ${SOCK}

On my host, this shows

srwxr-xr-x 1 ceph ceph 0 Dec 20 23:47 /var/run/ceph/ceph-mon.px-alpha-cluster.asok

From f.thommen at dkfz-heidelberg.de Fri Jan 8 12:05:18 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Fri, 8 Jan 2021 12:05:18 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To:
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de>
Message-ID:

On 08.01.21 11:45, Uwe Sauter wrote:
> You could check the permissions on the socket:
>
> ss -xln | grep ceph-mon
> SOCK=$(ss -xln | awk '/ceph-mon/ {print $5}')
> ls -la ${SOCK}
>
> On my host, this shows
>
> srwxr-xr-x 1 ceph ceph 0 Dec 20 23:47 /var/run/ceph/ceph-mon.px-alpha-cluster.asok

same here

From aderumier at odiso.com Fri Jan 8 12:09:36 2021
From: aderumier at odiso.com (aderumier at odiso.com)
Date: Fri, 08 Jan 2021 12:09:36 +0100
Subject: [PVE-User] Cluster Shared Storage Mass Move Disks
Message-ID: <69e466f6ee4ed86caa964ea132a1852da80c1a6d.camel@odiso.com>

Hi,

you can use the move disk feature in the VM GUI. It can be done on the command line with "qm move_disk ...." if you want to script it.

Le jeudi 07 janvier 2021 à 14:34 -0600, JR Richardson a écrit :
> [...]
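A rough sketch of such a script, looping `qm move_disk` over all VMs on the local node (the storage names `old-nfs` and `new-nfs` are placeholders, and the `--delete` flag is optional; test on a single VM first):

#!/bin/bash
# Move every qcow2 disk that lives on 'old-nfs' to 'new-nfs', VM by VM.
for vmid in $(qm list | awk 'NR>1 {print $1}'); do
    # config lines look like 'scsi0: old-nfs:100/vm-100-disk-0.qcow2,size=32G'
    for disk in $(qm config "$vmid" | awk -F: '/^(ide|sata|scsi|virtio)[0-9]+:/ && $2 ~ /old-nfs/ {print $1}'); do
        qm move_disk "$vmid" "$disk" new-nfs --format qcow2 --delete
    done
done

Without `--delete` the source image is kept and shows up as an "unused disk" in the VM config; the move itself also works on running VMs.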
From rumors at web.de Fri Jan 8 12:27:42 2021
From: rumors at web.de (Peter Simon)
Date: Fri, 8 Jan 2021 12:27:42 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To:
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de>
Message-ID: <1073f776-3910-dbde-8304-e86b5f6ed4fb@web.de>

Hi Frank,

is your /etc/ceph/ceph.conf the same on all hosts?
Is there

mon host = ip1, ip2, ip3

and separate sections with

[mon.x]
host = hostname
mon addr = ip:6789

Cheers
Peter

From f.thommen at dkfz-heidelberg.de Fri Jan 8 12:44:58 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Fri, 8 Jan 2021 12:44:58 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <1073f776-3910-dbde-8304-e86b5f6ed4fb@web.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de> <1073f776-3910-dbde-8304-e86b5f6ed4fb@web.de>
Message-ID: <8599b7ad-0ea0-b836-4492-d80b9d43cfcb@dkfz-heidelberg.de>

yes, /etc/ceph/ceph.conf is identical on all three hosts and there is a mon_host line with the correct IPs. Interestingly, there is a special section for odcf-pve02:

-----------
[mon.odcf-pve02]
     public_addr = 192.168.255.2
-----------

This is the same IP as in the mon_host line. However, there is no equivalent section for the other two nodes.

Frank

On 08.01.21 12:27, Peter Simon wrote:
> [...]
From rumors at web.de Fri Jan 8 12:57:13 2021
From: rumors at web.de (Peter Simon)
Date: Fri, 8 Jan 2021 12:57:13 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <8599b7ad-0ea0-b836-4492-d80b9d43cfcb@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de> <1073f776-3910-dbde-8304-e86b5f6ed4fb@web.de> <8599b7ad-0ea0-b836-4492-d80b9d43cfcb@dkfz-heidelberg.de>
Message-ID: <489a78d2-bdc8-883c-a4c8-83cd470df0d7@web.de>

Hi,

please try:

[mon.odcf-pve0X]
 host = hostname
 mon addr = 192.168.255.x:6789

a separate entry for each VG

Peter

Am 08.01.21 um 12:44 schrieb Frank Thommen:
> [...]
From f.thommen at dkfz-heidelberg.de Fri Jan 8 13:01:38 2021
From: f.thommen at dkfz-heidelberg.de (Frank Thommen)
Date: Fri, 8 Jan 2021 13:01:38 +0100
Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum
In-Reply-To: <8599b7ad-0ea0-b836-4492-d80b9d43cfcb@dkfz-heidelberg.de>
References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de> <1073f776-3910-dbde-8304-e86b5f6ed4fb@web.de> <8599b7ad-0ea0-b836-4492-d80b9d43cfcb@dkfz-heidelberg.de>
Message-ID:

Could this entry be the result of the fencing which happened when the host initially crashed? I assumed that it would automatically be unfenced when it comes up again. I never ran any manual "unfencing" (I wouldn't know how).

Frank

On 08.01.21 12:44, Frank Thommen wrote:
> [...]

From martin.konold at konsec.com Sat Jan 9 22:11:16 2021
From: martin.konold at konsec.com (Konold, Martin)
Date: Sat, 09 Jan 2021 22:11:16 +0100
Subject: [PVE-User] Single PBS for multiple PVE leads to namespace conflict
Message-ID: <51f40ab8b46dac203ce907c46318645f@konsec.com>

Hi there,

I am pretty new to Proxmox and deeply impressed by the quality of many aspects of its design and implementation.

In my testing I observed that when I have multiple PVE clusters and perform backups to a single datastore on a single PBS, I run into a lack of namespaces.

Why a single datastore on the PBS for multiple PVE clusters? For reasons of efficiency and avoidance of fragmentation I would like to use a single RAIDZ2 as a target. The problem now arises that both PVE clusters see the same "vm/100/{dateTime}".

What about prefixing the backups with the cluster name, e.g. "pve1/vm/100/{dateTime}"?

Is there something I have overlooked so far?

Regards
ppa. Martin Konold

--
Martin Konold - Prokurist, CTO
KONSEC GmbH - make things real
Amtsgericht Stuttgart, HRB 23690
Geschäftsführer: Andreas Mack
Im Köller 3, 70794 Filderstadt, Germany
From jan+pve at brand-web.net Sat Jan 9 23:36:25 2021
From: jan+pve at brand-web.net (Jan Brand)
Date: Sat, 9 Jan 2021 23:36:25 +0100
Subject: [PVE-User] Single PBS for multiple PVE leads to namespace conflict
In-Reply-To: <51f40ab8b46dac203ce907c46318645f@konsec.com>
References: <51f40ab8b46dac203ce907c46318645f@konsec.com>
Message-ID: <7602fc38-076f-575d-f8e6-35a391add4a5@brand-web.net>

Hi Martin,

you can create multiple datastores on one zpool: just create an additional ZFS dataset and configure the second PBS datastore on it. This way all data is stored on the same RAIDZ array, if that is your intention.

If you want to use the deduplication feature of PBS across backups of multiple clusters, I have to disappoint you. AFAIK, deduplication is per datastore.

Another "solution" would be to manually avoid overlapping VMIDs, but this would be error-prone and mean a lot of work in an existing environment.

I would create one backup datastore per cluster and call it a day.

Best regards,
Jan

Am 09.01.2021 um 22:11 schrieb Konold, Martin:
> [...]
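A sketch of that layout, with one child dataset and one PBS datastore per cluster on the same RAIDZ2 pool (the pool name `tank` and the datastore names are made up here):

# on the PBS host
zfs create tank/pbs-cluster1
zfs create tank/pbs-cluster2
proxmox-backup-manager datastore create cluster1 /tank/pbs-cluster1
proxmox-backup-manager datastore create cluster2 /tank/pbs-cluster2

Each PVE cluster then points its storage entry at "its" datastore, so vm/100 from one cluster can no longer collide with vm/100 from the other, at the price Jan mentions: no deduplication across the two datastores.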
From getche_f at yahoo.com Sun Jan 10 21:24:36 2021
From: getche_f at yahoo.com (Getachew Mekonnen)
Date: Sun, 10 Jan 2021 20:24:36 +0000 (UTC)
Subject: Proxmox
References: <866269398.677434.1610310276937.ref@mail.yahoo.com>
Message-ID: <866269398.677434.1610310276937@mail.yahoo.com>

Dear Sir/Madam,

This is Getachew from Ethiopia. I'm an IT expert at a public organization and very keen to use Proxmox for my environment. As a beginner, I tried to understand the basic concepts from the official guide and also watched videos.

But I have some questions which I'm not clear about. My server is a Dell PowerEdge R730 with 56 GB RAM and 6 SAS HDDs of 300 GB each.

1. My first question is: which RAID level should I use?
2. How do I use the hard disks? For example, do I dedicate one 300 GB HDD to Proxmox, another to one VM, another to a second VM, or shall I use a different arrangement?

Generally, it would be nice to get a quick start guide for my scenario (I want to create one Windows file server, one Ubuntu mail server (Zimbra), and one Ubuntu Nagios server).

I greatly appreciate your time and consideration.

Best regards,

Getachew Mekonnen

Senior IT Expert, PBS Program IT Support

ICT Directorate, Bureau of Finance and Economic Development (BoFED)

Hawassa, Ethiopia

From aderumier at odiso.com Mon Jan 11 08:10:43 2021
From: aderumier at odiso.com (aderumier at odiso.com)
Date: Mon, 11 Jan 2021 08:10:43 +0100
Subject: [PVE-User] Proxmox
In-Reply-To:
References: <866269398.677434.1610310276937.ref@mail.yahoo.com>
Message-ID: <00fa1af3338bda8a77303ddd36f853843fb5150c.camel@odiso.com>

Hi,

if you don't need to use ZFS from Proxmox, you can create one big hardware RAID with your 6 disks (RAID10, RAID5, RAID6, ...), as you want. Then install Proxmox with a small xfs/ext4 partition (maybe 30 GB for example). With the remaining space, Proxmox will be able to create VMs with LVM. (You don't need to dedicate one physical disk to Proxmox and other physical disks to the VMs.)

Alternatively, if you don't do hardware RAID, you can use ZFS directly from Proxmox to create a big ZFS RAID pool with your 6 disks. (Proxmox will then create ZFS subvolumes for the VMs.)

In both cases, create one big RAID array from your 6 disks.

Regards,

Alexandre

Le dimanche 10 janvier 2021 à 20:24 +0000, Getachew Mekonnen via pve-user a écrit :
> [...]
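For the second variant, a rough sketch of what "a big ZFS RAID pool with your 6 disks" could look like when it is not created by the installer itself (device names, pool name and the choice of RAIDZ2 are assumptions to adapt; the disks must be passed through without hardware RAID):

zpool create -o ashift=12 tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
pvesm add zfspool vmstore --pool tank --content images,rootdir

`pvesm add zfspool` registers the pool as VM/container storage; Proxmox then creates one ZFS volume or subvolume per guest disk, as described above.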
Regards,

Alexandre

On Sunday, 10 January 2021 at 20:24 +0000, Getachew Mekonnen via pve-user wrote:
> Dear Sir/Madam,
> This is Getachew from Ethiopia. I'm an IT expert at a public organization and very keen to use Proxmox for my environment. As a beginner, I tried to understand the basic concepts from the official guide and also watched videos.
> But I have some questions which I'm not clear about. My server is a Dell PowerEdge R730 with 56 GB RAM and 6 SAS HDDs of 300 GB each.
> 1. My first question is: which RAID level should I use?
> 2. How should I use the hard disks? For example, should I dedicate one 300 GB HDD to Proxmox, one to a VM, one to another VM, or should I use a different arrangement?
> Generally, it would be nice to get a quick-start guide for my scenario (I want to create one Windows file server, one Ubuntu mail server (Zimbra), and one Ubuntu Nagios server).
> I greatly appreciate your time and consideration.
> Best regards,
>
> Getachew Mekonnen
>
> Senior IT Expert, PBS Program IT Support
>
> ICT Directorate, Bureau of Finance and Economic Development (BoFED)
>
> Hawassa, Ethiopia


From aderumier at odiso.com  Mon Jan 11 08:20:40 2021
From: aderumier at odiso.com (aderumier at odiso.com)
Date: Mon, 11 Jan 2021 08:20:40 +0100
Subject: [PVE-User] Single BPS for multiple PVE lead to namespace conflict
In-Reply-To: <51f40ab8b46dac203ce907c46318645f@konsec.com>
References: <51f40ab8b46dac203ce907c46318645f@konsec.com>
Message-ID: <927b14bed1ae378ee740fe7b4eb75cbecaa80811.camel@odiso.com>

Hi,

I think it's on the roadmap. (It would be great to have some kind of namespace on storages too, to be able to share them across multiple clusters.)

On Saturday, 09 January 2021 at 22:11 +0100, Konold, Martin wrote:
> Hi there,
>
> I am pretty new to Proxmox and deeply impressed by the quality of many aspects of its design and implementation.
>
> In my testing I observed that in case I have multiple PVE clusters and perform backups to a single datastore on a single PBS I experience a lack of namespaces. [...]
>
> Is there something I have overlooked so far?
>
> Regards
> ppa. Martin Konold
>
> _______________________________________________
> pve-user mailing list
> pve-user at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


From d.jaeger at proxmox.com  Mon Jan 11 08:36:48 2021
From: d.jaeger at proxmox.com (Dominic Jäger)
Date: Mon, 11 Jan 2021 08:36:48 +0100
Subject: [PVE-User] Single BPS for multiple PVE lead to namespace conflict
In-Reply-To: <927b14bed1ae378ee740fe7b4eb75cbecaa80811.camel@odiso.com>
References: <51f40ab8b46dac203ce907c46318645f@konsec.com> <927b14bed1ae378ee740fe7b4eb75cbecaa80811.camel@odiso.com>
Message-ID: <20210111073648.GA29825@mala>

On Mon, Jan 11, 2021 at 08:20:40AM +0100, aderumier at odiso.com wrote:
> I think it's on the roadmap.
You are right!
> Backup to one (physical) datastore from multiple Proxmox VE clusters, avoiding backup naming conflicts
https://pbs.proxmox.com/wiki/index.php/Roadmap


From nada at verdnatura.es  Mon Jan 11 09:36:00 2021
From: nada at verdnatura.es (nada)
Date: Mon, 11 Jan 2021 09:36:00 +0100
Subject: [PVE-User] Proxmox
In-Reply-To: <866269398.677434.1610310276937@mail.yahoo.com>
References: <866269398.677434.1610310276937.ref@mail.yahoo.com> <866269398.677434.1610310276937@mail.yahoo.com>
Message-ID: <29b39848fc2942cf2613a24459414cdf@verdnatura.es>

good day in Ethiopia ;-)
in case you have a good RAID controller I would do the following:

1. boot to the RAID controller manager and configure
   3 hdd = RAID5
   2 hdd = RAID1
   1 hdd = global hot spare
2. boot the Proxmox ISO and install
   ZFS over that RAID5 (use the whole space)
3. create a filesystem 'backup' over that RAID1 (mirror),
   e.g. ext4 in LVM2
4. add the filesystem 'backup' to the Proxmox storage
5. Nagios and the mail server may be created in CTs (containers)

hope you will have a 2nd server and be able to create a Proxmox cluster in the near future. After that you will see the great advantage of rapid ZFS.
good luck
Nada

On 2021-01-10 21:24, Getachew Mekonnen wrote:
> Dear Sir/Madam,
> This is Getachew from Ethiopia. I'm an IT expert at a public organization and very keen to use Proxmox for my environment. [...]
> I greatly appreciate your time and consideration.
> Best regards,
>
> Getachew Mekonnen
>
> Senior IT Expert, PBS Program IT Support
>
> ICT Directorate, Bureau of Finance and Economic Development (BoFED)
>
> Hawassa, Ethiopia


From os-li-ml at mailbox.org  Mon Jan 11 10:37:35 2021
From: os-li-ml at mailbox.org (Markus Dellermann)
Date: Mon, 11 Jan 2021 10:37:35 +0100
Subject: Re: [PVE-User] Proxmox
In-Reply-To: <29b39848fc2942cf2613a24459414cdf@verdnatura.es>
References: <866269398.677434.1610310276937.ref@mail.yahoo.com> <866269398.677434.1610310276937@mail.yahoo.com> <29b39848fc2942cf2613a24459414cdf@verdnatura.es>
Message-ID: <13202740.RZW9cDOChW@think.das-dellermanns.de>

Hi,
On Monday, 11 January 2021 at 09:36:00 CET, nada wrote:
> good day in Ethiopia ;-)
> in case you have a good RAID controller I would do the following:
>
> 1. boot to the RAID controller manager and configure
>    3 hdd = RAID5
>    2 hdd = RAID1
>    1 hdd = global hot spare
> 2. boot the Proxmox ISO and install
>    ZFS over that RAID5 (use the whole space)

Please don't do that, it's not recommended!
Alexandre has already made another suggestion: hardware RAID or ZFS, not both.

> 3. create a filesystem 'backup' over that RAID1 (mirror),
>    e.g. ext4 in LVM2
> 4. add the filesystem 'backup' to the Proxmox storage
> 5. Nagios and the mail server may be created in CTs (containers)
>
> hope you will have a 2nd server and be able to create a Proxmox cluster in the near future. After that you will see the great advantage of rapid ZFS.
> good luck
> Nada

Regards,
Markus

> > On 2021-01-10 21:24, Getachew Mekonnen wrote:
> > [...]
>
> _______________________________________________
> pve-user mailing list
> pve-user at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


From gilberto.nunes32 at gmail.com  Mon Jan 11 11:41:40 2021
From: gilberto.nunes32 at gmail.com (Gilberto Ferreira)
Date: Mon, 11 Jan 2021 07:41:40 -0300
Subject: Re: [PVE-User] Proxmox
References: <866269398.677434.1610310276937.ref@mail.yahoo.com> <866269398.677434.1610310276937@mail.yahoo.com> <29b39848fc2942cf2613a24459414cdf@verdnatura.es>

Hi

> 1. boot to the RAID controller manager and configure
>    3 hdd = RAID5
>    2 hdd = RAID1
>    1 hdd = global hot spare
> 2. boot the Proxmox ISO and install
>    ZFS over that RAID5 (use the whole space)

Mixing hardware RAID and ZFS is not a good idea; ZFS needs to deal with the disks directly. My advice is to install Proxmox on the RAID-1 and then use the other disks, as RAID-5, for VM storage (LVM-thin for instance, or even XFS).
In this way, the VMs will be safe on the second array.
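For instance, assuming the RAID-5 virtual disk shows up as /dev/sdb (the device, VG and storage names are only examples), the LVM-thin variant would look roughly like this:

    pvcreate /dev/sdb
    vgcreate vmdata /dev/sdb
    lvcreate -L 500G --thinpool vmstore vmdata
    pvesm add lvmthin vmstore --vgname vmdata --thinpool vmstore --content images,rootdir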
Best regards
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram

On Mon, 11 Jan 2021 at 06:37, Markus Dellermann via pve-user wrote:
> ---------- Forwarded message ----------
> From: Markus Dellermann
> To: pve-user at lists.proxmox.com
> Date: Mon, 11 Jan 2021 10:37:35 +0100
> Subject: Re: [PVE-User] Proxmox
>
> Hi,
> On Monday, 11 January 2021 at 09:36:00 CET, nada wrote:
> > [...]
> > 2. boot the Proxmox ISO and install
> >    ZFS over that RAID5 (use the whole space)
>
> Please don't do that, it's not recommended!
> Alexandre has already made another suggestion: hardware RAID or ZFS, not both.
> [...]
> Regards,
> Markus


From getche_f at yahoo.com  Mon Jan 11 12:00:02 2021
From: getche_f at yahoo.com (Getachew Mekonnen)
Date: Mon, 11 Jan 2021 11:00:02 +0000 (UTC)
Subject: Fw: [PVE-User] Proxmox
Message-ID: <1055578638.843330.1610362802168@mail.yahoo.com>

Dear!

I greatly appreciate your genuine and timely answers to my questions. I have now got 2 additional hard disks; as I mentioned, all are 300 GB SAS drives. According to your recommendations: 3 or 4 disks as RAID 5, 2 disks as RAID 1 (for Proxmox), and 1 as hot spare.

What is the 'backup' filesystem? I'm not clear on that. And the other issue: I cannot get another server for a cluster right now. Is cluster creation a must right now?

Thank you in advance.

Best regards,

Getachew Mekonnen

Senior IT Expert, PBS Program IT Support

ICT Directorate, Bureau of Finance and Economic Development (BoFED)

Hawassa, Ethiopia

----- Forwarded message -----
From: "Markus Dellermann via pve-user"
To: "pve-user at lists.proxmox.com"
Cc: "Markus Dellermann"
Sent: Mon, 11 Jan 2021 at 12:37 pm
Subject: Re: [PVE-User] Proxmox

Hi,
On Monday, 11 January 2021 at 09:36:00 CET, nada wrote:
> [...]
> 2. boot the Proxmox ISO and install
>    ZFS over that RAID5 (use the whole space)

Please don't do that, it's not recommended!
Alexandre has already made another suggestion: hardware RAID or ZFS, not both.
[...]
Regards,
Markus


From gilberto.nunes32 at gmail.com  Mon Jan 11 12:09:15 2021
From: gilberto.nunes32 at gmail.com (Gilberto Ferreira)
Date: Mon, 11 Jan 2021 08:09:15 -0300
Subject: [PVE-User] Fw: Proxmox

Hi again... I think it is better if you start with the docs... Proxmox is a very well documented project.

Start here:
Backup and restore: https://pve.proxmox.com/pve-docs/chapter-vzdump.html
(and check this too: https://pbs.proxmox.com/wiki/index.php/Main_Page)
Cluster: https://pve.proxmox.com/pve-docs/chapter-pvecm.html
High Availability: https://pve.proxmox.com/pve-docs/chapter-ha-manager.html

---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram

On Mon, 11 Jan 2021 at 08:00, Getachew Mekonnen via pve-user wrote:
> [...]
>
> _______________________________________________
> pve-user mailing list
> pve-user at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


From smr at kmi.com  Mon Jan 11 13:40:34 2021
From: smr at kmi.com (Stefan M. Radman)
Date: Mon, 11 Jan 2021 12:40:34 +0000
Subject: Re: [PVE-User] Proxmox
Message-ID: <1967D049-9368-4E21-B8EB-ADA42796F10A@kmi.com>
add filesystem 'backup' to proxmox storage 5. nagios and mailserver may be created in CTs (containers) hope you will have 2nd server and be able to create proxmox cluster in near future. After that you will see great advantage of rapid ZFS good luck Nada Regards, Markus On 2021-01-10 21:24, Getachew Mekonnen wrote: Dear Sir/Madam, This is Getachew from Ethiopia. I'm an IT Expert at public organization and very keen to use Proxmox for my environment. As a beginner, I tried to understand the basic concepts from the official guide and also watched videos. But, I have some questions which I'm not clear with. My server is Dell Poweredge r730 with 56 GB RAM; and 6 SAS hdds each with 300 GB. 1. My first question is which RAID level (hd) I use?2. How do I use the hard disks, this is to mean, For example I dedicate the one 300 GB hdd for Proxmox, the other for one VM, the other for another VM, or shall I use another arrangement? Generally, It would be nice if I get a quick start guide for my scenario (In my scenario, I want to create one Windows File Server, One Ubuntu Mail Server (Zimbra), and one Ubuntu Nagios) I greatly appreciate your time and consideration. Best regards, Getachew Mekonnen Senior IT Expert, PBS Program IT Support ICT Directorate, Bureau of Finance and Economic Development (BoFED) Hawassa, Ethiopia _______________________________________________ pve-user mailing list pve-user at lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at lists.proxmox.com https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.proxmox.com%2Fcgi-bin%2Fmailman%2Flistinfo%2Fpve-user&data=04%7C01%7Csmr%40kmi.com%7Cee36ee877da64a2c0bfe08d8b6201aab%7Cc2283768b8d34e008f3d85b1b4f03b33%7C0%7C0%7C637459596324521657%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=yqnKZguw4WVMMru%2Bu4zU1fmBZZ%2F66NvqBm1mep30Kks%3D&reserved=0 CONFIDENTIALITY NOTICE: This communication may contain privileged and confidential information, or may otherwise be protected from disclosure, and is intended solely for use of the intended recipient(s). If you are not the intended recipient of this communication, please notify the sender that you have received this communication in error and delete and destroy all copies in your possession. From gilberto.nunes32 at gmail.com Wed Jan 13 16:46:57 2021 From: gilberto.nunes32 at gmail.com (Gilberto Ferreira) Date: Wed, 13 Jan 2021 12:46:57 -0300 Subject: [PVE-User] Add vm to backup job through CLI. Message-ID: Hi there! I came across a customer request, to make some script in order to automatizate the job to add a certain vm to the backup vzdump task. So after some study about pvesh, I had made this little shell script, which I wish could come in handy. I put it in github: https://github.com/gilbertoferreira/addvmtobackup Improvements would be welcome! Thanks --- Gilberto Nunes Ferreira (47) 99676-7530 - Whatsapp / Telegram From alain.pean at c2n.upsaclay.fr Mon Jan 11 10:00:16 2021 From: alain.pean at c2n.upsaclay.fr (=?UTF-8?Q?Alain_P=c3=a9an?=) Date: Mon, 11 Jan 2021 10:00:16 +0100 Subject: [PVE-User] Proxmox In-Reply-To: <29b39848fc2942cf2613a24459414cdf@verdnatura.es> References: <866269398.677434.1610310276937.ref@mail.yahoo.com> <866269398.677434.1610310276937@mail.yahoo.com> <29b39848fc2942cf2613a24459414cdf@verdnatura.es> Message-ID: <5ca4a21f-980f-6e5b-1625-76586537dc0e@c2n.upsaclay.fr> Hi all, Le 11/01/2021 ? 
Thanks
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram


From alain.pean at c2n.upsaclay.fr  Mon Jan 11 10:00:16 2021
From: alain.pean at c2n.upsaclay.fr (Alain Péan)
Date: Mon, 11 Jan 2021 10:00:16 +0100
Subject: Re: [PVE-User] Proxmox
In-Reply-To: <29b39848fc2942cf2613a24459414cdf@verdnatura.es>
References: <866269398.677434.1610310276937.ref@mail.yahoo.com> <866269398.677434.1610310276937@mail.yahoo.com> <29b39848fc2942cf2613a24459414cdf@verdnatura.es>
Message-ID: <5ca4a21f-980f-6e5b-1625-76586537dc0e@c2n.upsaclay.fr>

Hi all,

On 11/01/2021 at 09:36, nada wrote:
> 1. boot to the RAID controller manager and configure
>    3 hdd = RAID5
>    2 hdd = RAID1
>    1 hdd = global hot spare
> 2. boot the Proxmox ISO and install
>    ZFS over that RAID5 (use the whole space)
> 3. create a filesystem 'backup' over that RAID1 (mirror),
>    e.g. ext4 in LVM2
> 4. add the filesystem 'backup' to the Proxmox storage
> 5. Nagios and the mail server may be created in CTs (containers)

Personally, I would do a RAID 10 with all the disks (3x2, ~900 GB of storage). Better performance and security, and install Proxmox on the whole volume. I would not back up to the same server.

Note that it is not very secure to use only one server for everything, for example for the mail server, which is quite critical. It would be better to have a real cluster with at least three nodes... The PowerEdge R730 is relatively old, is it still under maintenance from Dell?

My two cents...

Alain
--
Administrateur Système/Réseau
C2N Centre de Nanosciences et Nanotechnologies (UMR 9001)
Boulevard Thomas Gobert (ex Avenue de La Vauve), 91120 Palaiseau
Tel : 01-70-27-06-88 Bureau A255


From mariusz at modulesgarden.com  Thu Jan 14 15:25:13 2021
From: mariusz at modulesgarden.com (Mariusz Miodowski)
Date: Thu, 14 Jan 2021 15:25:13 +0100
Subject: Cannot update backup job via API - PVE 6.3.3

Hello All,

We have noticed a problem with the latest PVE when trying to update a backup job via the API. It looks like the problem is caused by the maxfiles parameter. If we omit it, everything works fine. This problem is not only related to our environment; we have already got reports from our clients that they have the same problem.

Request:
PUT https://10.10.11.48:8006/api2/json/cluster/backup/3cb1bee67a58e68ea97db1292a23a22a9ea8d529:1
Array
(
    [vmid] => 8001
    [starttime] => 00:10
    [maxfiles] => 10
    [storage] => local
    [remove] => 1
    [dow] => tue,wed,sat
    [mode] => snapshot
    [compress] => zstd
)

Response:
HTTP 500 HTTP/1.1 500 error during cfs-locked 'file-vzdump_cron' operation: value without key, but schema does not define a default key
Cache-Control: max-age=0
Connection: close
Date: Thu, 14 Jan 2021 14:12:37 GMT
Pragma: no-cache
Server: pve-api-daemon/3.0
Content-Length: 13
Content-Type: application/json;charset=UTF-8
Expires: Thu, 14 Jan 2021 14:12:37 GMT

{"data":null}
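For reference, the same update without maxfiles, which as noted above goes through fine, would look roughly like this via pvesh (JOBID stands for the job id from the URL above):

    pvesh set /cluster/backup/JOBID --vmid 8001 --starttime 00:10 --storage local --remove 1 --dow tue,wed,sat --mode snapshot --compress zstd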
This would make the External Metric Server a good & simple option for Monitoring Clusters as nothing has to be installed on the Hosts themselves and its now Configurable via the GUI. Thanks in advance Samuel From f.gruenbichler at proxmox.com Fri Jan 15 10:12:34 2021 From: f.gruenbichler at proxmox.com (Fabian =?iso-8859-1?q?Gr=FCnbichler?=) Date: Fri, 15 Jan 2021 10:12:34 +0100 Subject: [PVE-User] Cluster Status Missing From External Metric Server Metrics In-Reply-To: <9deb517da529b746fed36bd69d7895c6@lorch.net> References: <9deb517da529b746fed36bd69d7895c6@lorch.net> Message-ID: <1610701788.cjsty5xk9w.astroid@nora.none> On January 15, 2021 8:59 am, Samuel Lorch wrote: > Dear all, > > recently i have deployed allot of small Proxmox VE Clusters which now > have a need for Monitoring. I have used Influx and Grafana to monitor > Containers and VM's in the past and was very surprised to see that the > External Metric Server doesn't supply any Metrics about the Cluster > status (eg if quorate, number of votes, ...) even though this data is > available in the api (pvesh get /cluster/ha/status/current or pvesh get > /cluster/ha/status/manager_status). After investigating the source a bit > with my limited knowledge of perl i think that all needed librarys > already exists and that one could with very low effort add this data to > the External Metric Servers Metrics. > > Is there a reason why these Metrics aren't exposed here or has this just > been forgotten about? likely the latter > > Is this something that the Proxmox team could implement? yes. might be worth opening an enhancement request over at https://bugzilla.proxmox.com not that for clusters, corosync also provides quite some metrics via corosync-cmapctl -m stats that might be interesting to pick up as well.. > > This would make the External Metric Server a good & simple option for > Monitoring Clusters as nothing has to be installed on the Hosts > themselves and its now Configurable via the GUI. > > Thanks in advance > Samuel > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > From f.ebner at proxmox.com Fri Jan 15 13:10:21 2021 From: f.ebner at proxmox.com (Fabian Ebner) Date: Fri, 15 Jan 2021 13:10:21 +0100 Subject: [PVE-User] Cannot update backup job via API - PVE 6.3.3 In-Reply-To: References: Message-ID: <7adad43c-3f8a-62d5-9b84-fae6ea1d57b3@proxmox.com> Hi, thanks for the report! I can reproduce this here and will work out a patch. On 14.01.21 15:25, Mariusz Miodowski via pve-user wrote: > > Hello All, > > We have noticed problem with latest PVE, when trying to update backup job via API. > > It looks that problem is caused by maxfiles parameter. If we omit it, everything works fine. > > This problem is not only related to our environment, we have already got reports from our clients that they also have the same problem. 
> > > > *Request*: > PUT https://10.10.11.48:8006/api2/json/cluster/backup/3cb1bee67a58e68ea97db1292a23a22a9ea8d529:1 > Array > ( > [vmid] => 8001 > [starttime] => 00:10 > [maxfiles] => 10 > [storage] => local > [remove] => 1 > [dow] => tue,wed,sat > [mode] => snapshot > [compress] => zstd > ) > > > *Response*: > HTTP 500 HTTP/1.1 500 error during cfs-locked 'file-vzdump_cron' operation: value without key, but schema does not define a default key > Cache-Control: max-age=0 > Connection: close > Date: Thu, 14 Jan 2021 14:12:37 GMT > Pragma: no-cache > Server: pve-api-daemon/3.0 > Content-Length: 13 > Content-Type: application/json;charset=UTF-8 > Expires: Thu, 14 Jan 2021 14:12:37 GMT > > {"data":null} > > > -- > Regards > Mariusz Miodowski > ModulesGarden Development Team Manager > https://www.modulesgarden.com > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From jmr.richardson at gmail.com Fri Jan 15 17:56:34 2021 From: jmr.richardson at gmail.com (JR Richardson) Date: Fri, 15 Jan 2021 10:56:34 -0600 Subject: [PVE-User] Unused Disk Remove vs Detach Message-ID: Hi All, I'm running PVE 6.2.11, I have VMs with Disks on NFS shared storage. I moved disks to other storage nodes but did not choose to delete the old disk, so the old disk is still assigned to the VM in hardware as 'Unused Disk 0'. When I select the disk, the 'Detach' button changes to 'Remove'. I remember in older versions of PVE, you could just 'Detach' disks instead of removing them. Is this due to the old storage node is still on-line? If I shutdown the old storage node, would I get the option to Detach Unused Disks? Can I get around this via command line utility to detach the old disk? Thanks. JR -- JR Richardson Engineering for the Masses Chasing the Azeotrope From chris.hofstaedtler at deduktiva.com Fri Jan 15 18:46:18 2021 From: chris.hofstaedtler at deduktiva.com (Chris Hofstaedtler | Deduktiva) Date: Fri, 15 Jan 2021 18:46:18 +0100 Subject: [PVE-User] Unused Disk Remove vs Detach In-Reply-To: References: Message-ID: <20210115174618.j6ej5ear4tj7opnp@zeha.at> * JR Richardson [210115 17:57]: > I'm running PVE 6.2.11, I have VMs with Disks on NFS shared storage. I > moved disks to other storage nodes but did not choose to delete the > old disk, so the old disk is still assigned to the VM in hardware as > 'Unused Disk 0'. When I select the disk, the 'Detach' button changes > to 'Remove'. I remember in older versions of PVE, you could just > 'Detach' disks instead of removing them. Is this due to the old > storage node is still on-line? If I shutdown the old storage node, > would I get the option to Detach Unused Disks? I can only speculate, but essentially, completely detached disks are invisible to PVE. I believe, on subsequent actions - like disk moves, VM migrations, ... - the disk "numbers" can be reused. If there's such an old disk around, the operation might either fail or the disk might be silently overwritten. >From this PoV, I would strongly recommend against fully detaching disks from VMs. > Can I get around this via command line utility to detach the old disk? You can edit the VM definition file and remove the unusedN: line. 
Best, Chris -- Chris Hofstaedtler / Deduktiva GmbH (FN 418592 b, HG Wien) www.deduktiva.com / +43 1 353 1707 From f.thommen at dkfz-heidelberg.de Sat Jan 16 13:26:17 2021 From: f.thommen at dkfz-heidelberg.de (Frank Thommen) Date: Sat, 16 Jan 2021 13:26:17 +0100 Subject: [PVE-User] After update Ceph monitor shows wrong version in UI and is down and out of quorum In-Reply-To: References: <21dec802-c6e8-d395-1444-7b30df5620cd@dkfz-heidelberg.de> <255b8af8-8834-0f24-d9a6-819f2d2cf8c8@dkfz-heidelberg.de> <9811d98a-ebf2-8590-ddd0-3b707ede4a4e@dkfz-heidelberg.de> <89a1ad57-6f99-d422-08df-d110f10aa3b9@dkfz-heidelberg.de> <4bdfeb73-582e-2c25-e300-166283e40dc5@dkfz-heidelberg.de> <1073f776-3910-dbde-8304-e86b5f6ed4fb@web.de> <8599b7ad-0ea0-b836-4492-d80b9d43cfcb@dkfz-heidelberg.de> Message-ID: <8d04daad-107b-4497-dccf-84b95d933a39@dkfz-heidelberg.de> Just to close this thread on the maillist: I finally made this a support request @proxmox and we are still working on it. It's not an easy case to solve :-) Frank On 08.01.21 13:01, Frank Thommen wrote: > Could this entry be the result of the fencing which happened when the > host initially crashed?? I assumed, that it would automatically be > unfenced when it comes up again.? I never run some manual "unfencing" (I > wouldn't know how). > > Frank > > > > On 08.01.21 12:44, Frank Thommen wrote: >> yes /etc/ceph/ceph.conf is identical on all three hosts and there is a >> mon_host line with the correct IPs.? Interestingly there is a special >> section for odcf-pve02: >> >> ----------- >> [mon.odcf-pve02] >> ????? public_addr = 192.168.255.2 >> ----------- >> >> This is the same IP as in the mon_host line.? However there is no >> equivalent section for the other two nodes. >> >> Frank >> >> >> On 08.01.21 12:27, Peter Simon wrote: >>> Hi Frank, >>> >>> your /etc/ceph/ceph.conf is the same on all hosts ? >>> >>> is there mon host = ip1, ip2, ip3 >>> >>> and seperate sections with [mon.x] >>> host = hostname >>> mon addr = ip:6789 >>> >>> Cheers >>> Peter >>> >>> Am 08.01.21 um 12:05 schrieb Frank Thommen: >>>> >>>> >>>> On 08.01.21 11:45, Uwe Sauter wrote: >>>>> >>>>> >>>>> Am 08.01.21 um 11:36 schrieb Frank Thommen: >>>>>> >>>>>> On 05.01.21 21:17, Frank Thommen wrote: >>>>>>> On 05.01.21 21:02, Uwe Sauter wrote: >>>>>>>> There's a paragraph about probing mons on >>>>>>>> >>>>>>>> https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> I will check that (tomorrow :-) >>>>>> >>>>>> >>>>>> using the monitor's admin socket on either of the three nodes I can >>>>>> query the monitors of 01 and 03 (the good ones) but not of 02 (the >>>>>> problematic one): >>>>>> >>>>>> root at odcf-pve01:~# ceph tell mon.odcf-pve02 mon_status >>>>>> Error ENOENT: problem getting command descriptions from >>>>>> mon.odcf-pve02 >>>>>> root at odcf-pve01:~# >>>>>> >>>>>> The monitor daemon is running on all three and the ports are open. >>>>>> >>>>>> Any other ideas? 
>>>>> >>>>> You could check the permissions on the socket: >>>>> >>>>> ss -xln | grep ceph-mon >>>>> SOCK=$(ss -xln | awk '/ceph-mon/ {print $5}') >>>>> ls -la ${SOCK} >>>>> >>>>> On my host, this shows >>>>> >>>>> srwxr-xr-x 1 ceph ceph 0 Dec 20 23:47 >>>>> /var/run/ceph/ceph-mon.px-alpha-cluster.asok >>>> >>>> same here >>>> >>>> _______________________________________________ >>>> pve-user mailing list >>>> pve-user at lists.proxmox.com >>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>>> >>> >>> _______________________________________________ >>> pve-user mailing list >>> pve-user at lists.proxmox.com >>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user >>> >> >> _______________________________________________ >> pve-user mailing list >> pve-user at lists.proxmox.com >> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user From krienke at uni-koblenz.de Mon Jan 18 17:43:52 2021 From: krienke at uni-koblenz.de (Rainer Krienke) Date: Mon, 18 Jan 2021 17:43:52 +0100 Subject: [PVE-User] pve & ceph nautilus: Since today I get an error when trying to remove a VM-template Message-ID: <7fb2c2c2-70a5-6226-617c-4b28aabc2525@uni-koblenz.de> Hello, since today I have a strange Problem with my pve installation. The storage backend used is a "external" ceph nautilus cluster. All VMs are created from templates as full clones. It seems whenever I delete a VM-template I get an error. Removing VMs (not templates) is OK. Removing a template results in this kind of error messages since today: Removing all snapshots: 0% complete...failed. Could not remove disk 'ceph-pxa:base-174-disk-0', check manually: error during cfs-locked 'storage-ceph-pxa' operation: rbd snap purge 'base-174-disk-0' error: Removing all snapshots: 0% complete...failed. Removing all snapshots: 0% complete...failed. error during cfs-locked 'storage-ceph-pxa' operation: rbd snap purge 'base-174-disk-0' error: Removing all snapshots: 0% complete...failed. TASK OK Until today this worked always fine. R Inside of pve the template is deleted, however on ceph the template file is still there in form of a rbd with a protected snapshot (which seems to be sane to me): ceph$ rbd snap ls pxa-rbd/base-174-disk-0 SNAPID NAME SIZE PROTECTED TIMESTAMP 2992 __base__ 32 GiB yes Mon Jan 18 11:59:49 2021 I unprotected the snapshot, removed it and then removed the base-rbd which worked just fine. So on the ceph side everything looks good, but on the pve side I get the strange error. This started today after I started to create a pve-snapshot of a VM and the "include memory" checkbox was on, allthough I didn't want RAM to be included. So I stopped the snapshot via web-gui. I got an error saying: Qemu Guest Agent is not running - VM 185 qmp command 'guest-ping' failed - got timeout snapshot create failed: starting cleanup TASK ERROR: VM 185 qmp command 'savevm-start' failed - VM snapshot already started Perhaps this has left a lock somewhere, but perhaps it has nothing to do with the problem itself. Does anyone have an idea what might be wrong or how to find out more? Thanks Rainer Here some more infos: root at pxsrv1: dpkg -l|grep pve ii pve-cluster 6.2-1 amd64 "pmxcfs" distributed cluster filesystem for Proxmox Virtual Environment. 
ii pve-container 3.2-2 all Proxmox VE Container management tool ii pve-docs 6.2-6 all Proxmox VE Documentation ii pve-edk2-firmware 2.20200531-1 all edk2 based firmware modules for virtual machines ii pve-firewall 4.1-3 amd64 Proxmox VE Firewall ii pve-firmware 3.1-3 all Binary firmware code for the pve-kernel ii pve-ha-manager 3.1-1 amd64 Proxmox VE HA Manager ii pve-i18n 2.2-1 all Internationalization support for Proxmox VE ii pve-kernel-5.3 6.1-6 all Latest Proxmox VE Kernel Image ii pve-kernel-5.3.10-1-pve 5.3.10-1 amd64 The Proxmox PVE Kernel Image rc pve-kernel-5.3.18-1-pve 5.3.18-1 amd64 The Proxmox PVE Kernel Image ii pve-kernel-5.3.18-3-pve 5.3.18-3 amd64 The Proxmox PVE Kernel Image ii pve-kernel-5.4 6.2-7 all Latest Proxmox VE Kernel Image rc pve-kernel-5.4.34-1-pve 5.4.34-2 amd64 The Proxmox PVE Kernel Image ii pve-kernel-5.4.44-2-pve 5.4.44-2 amd64 The Proxmox PVE Kernel Image ii pve-kernel-5.4.65-1-pve 5.4.65-1 amd64 The Proxmox PVE Kernel Image ii pve-kernel-helper 6.2-7 all Function for various kernel maintenance tasks. ii pve-lxc-syscalld 0.9.1-1 amd64 PVE LXC syscall daemon ii pve-manager 6.2-12 amd64 Proxmox Virtual Environment Management Tools ii pve-qemu-kvm 5.1.0-3 amd64 Full virtualization on x86 hardware ii pve-xtermjs 4.7.0-2 amd64 Binaries built from the Rust termproxy crate ii smartmontools 7.1-pve2 amd64 control and monitor storage systems using S.M.A.R.T. root at pxsrv1:/etc/pve# pvecm status Cluster information ------------------- Name: pxa Config Version: 5 Transport: knet Secure auth: on Quorum information ------------------ Date: Mon Jan 18 14:02:08 2021 Quorum provider: corosync_votequorum Nodes: 5 Node ID: 0x00000001 Ring ID: 1.327 Quorate: Yes Votequorum information ---------------------- Expected votes: 5 Highest expected: 5 Total votes: 5 Quorum: 3 Flags: Quorate Membership information ---------------------- Nodeid Votes Name 0x00000001 1 a.b.c.d (local) 0x00000002 1 a.b.c.d 0x00000003 1 a.b.c.d 0x00000004 1 a.b.c.d 0x00000005 1 a.b.c.d root at pxa1:/etc/pve# root at pxa1:/etc/pve# pvecm status Cluster information ------------------- Name: pxa Config Version: 5 Transport: knet Secure auth: on Quorum information ------------------ Date: Mon Jan 18 14:06:08 2021 Quorum provider: corosync_votequorum Nodes: 5 Node ID: 0x00000001 Ring ID: 1.327 Quorate: Yes Votequorum information ---------------------- Expected votes: 5 Highest expected: 5 Total votes: 5 Quorum: 3 Flags: Quorate Membership information ---------------------- Nodeid Votes Name 0x00000001 1 a.b.c.d 0x00000002 1 a.b.c.d 0x00000003 1 a.b.c.d 0x00000004 1 a.b.c.d 0x00000005 1 a.b.c.d -- Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1 56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312 PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 1001312 From linuxmail at 4lin.net Tue Jan 19 17:47:57 2021 From: linuxmail at 4lin.net (Denny Fuchs) Date: Tue, 19 Jan 2021 17:47:57 +0100 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <5b545e139718261531562fec0a49eede@verdnatura.es> <56518B49-6608-4161-8424-5088F92853BE@kmi.com> <4C4C3D36-0A76-43CB-B802-1FFCFEECE548@kmi.com> <202541d69d0b3a71b6ee87d103c7a84c@verdnatura.es> <337680261.683808.1580751316031.JavaMail.zimbra@ifsc.edu.br> Message-ID: hi, I want to say also: I have this same issue on one from 7 nodes. 
All nodes where (more or less) identical (Puppet), but one node has also the problem, that pvscan / prefers the direct device (/dev/sdd, over multipath (/dev/mpathd). PVE is the latest 6.x I try with rd.lvm.conf=0 and looking, if it helps. cu denny -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature Type: application/pgp-signature Size: 203 bytes Desc: OpenPGP digital signature URL: From smr at kmi.com Tue Jan 19 19:39:30 2021 From: smr at kmi.com (Stefan M. Radman) Date: Tue, 19 Jan 2021 18:39:30 +0000 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <5b545e139718261531562fec0a49eede@verdnatura.es> <56518B49-6608-4161-8424-5088F92853BE@kmi.com> <4C4C3D36-0A76-43CB-B802-1FFCFEECE548@kmi.com> <202541d69d0b3a71b6ee87d103c7a84c@verdnatura.es> <337680261.683808.1580751316031.JavaMail.zimbra@ifsc.edu.br> Message-ID: Hi Denny In the meantime I'm running 6.3-1 with the latest updates and my workaround delaying the start of the lvm2-pvscan at .service still works (see below). Regards Stefan root at pve:~# systemctl edit lvm2-pvscan at .service root at pve:~# cat /etc/systemd/system/lvm2-pvscan at .service.d/override.conf # Ensure that multipath discovery finishes before LVM pvscan [Unit] After=multipathd.service [Service] ExecStartPre=/bin/sleep 10 root at pve:~# On Jan 19, 2021, at 17:47, Denny Fuchs > wrote: hi, I want to say also: I have this same issue on one from 7 nodes. All nodes where (more or less) identical (Puppet), but one node has also the problem, that pvscan / prefers the direct device (/dev/sdd, over multipath (/dev/mpathd). PVE is the latest 6.x I try with rd.lvm.conf=0 and looking, if it helps. cu denny _______________________________________________ pve-user mailing list pve-user at lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user CONFIDENTIALITY NOTICE: This communication may contain privileged and confidential information, or may otherwise be protected from disclosure, and is intended solely for use of the intended recipient(s). If you are not the intended recipient of this communication, please notify the sender that you have received this communication in error and delete and destroy all copies in your possession. From smr at kmi.com Tue Jan 19 19:39:30 2021 From: smr at kmi.com (Stefan M. Radman) Date: Tue, 19 Jan 2021 18:39:30 +0000 Subject: [PVE-User] LVM autoactivation failed with multipath over iSCSI In-Reply-To: References: <413ca3933a616bd30233b8fefb305154@verdnatura.es> <5b545e139718261531562fec0a49eede@verdnatura.es> <56518B49-6608-4161-8424-5088F92853BE@kmi.com> <4C4C3D36-0A76-43CB-B802-1FFCFEECE548@kmi.com> <202541d69d0b3a71b6ee87d103c7a84c@verdnatura.es> <337680261.683808.1580751316031.JavaMail.zimbra@ifsc.edu.br> Message-ID: Hi Denny In the meantime I'm running 6.3-1 with the latest updates and my workaround delaying the start of the lvm2-pvscan at .service still works (see below). Regards Stefan root at pve:~# systemctl edit lvm2-pvscan at .service root at pve:~# cat /etc/systemd/system/lvm2-pvscan at .service.d/override.conf # Ensure that multipath discovery finishes before LVM pvscan [Unit] After=multipathd.service [Service] ExecStartPre=/bin/sleep 10 root at pve:~# On Jan 19, 2021, at 17:47, Denny Fuchs > wrote: hi, I want to say also: I have this same issue on one from 7 nodes. 
On Jan 19, 2021, at 17:47, Denny Fuchs wrote:

> hi,
>
> I want to say also: I have this same issue on one of 7 nodes. All nodes are (more or less) identical (Puppet), but one node also has the problem that pvscan prefers the direct device (/dev/sdd) over multipath (/dev/mpathd). PVE is the latest 6.x.
>
> I'll try with rd.lvm.conf=0 and see if it helps.
>
> cu denny
>
> _______________________________________________
> pve-user mailing list
> pve-user at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
I can however easily and successfully remove these remains manually even from the pve host where I ran qm destroy using exactly the same cephx-client as configured in pve simply by running rbd commands: rbd -m -n client.rz --conf /etc/pve/priv/ceph/ceph-pxa.conf --keyring /etc/pve/priv/ceph/ceph-pxa.keyring --auth_supported cephx snap unprotect pxa-rbd/base-174-disk-0 at __base__ rbd -m -n client.rz --conf /etc/pve/priv/ceph/ceph-pxa.conf --keyring /etc/pve/priv/ceph/ceph-pxa.keyring --auth_supported cephx snap purge pxa-rbd/base-174-disk-0 rbd -m -n client.rz --conf /etc/pve/priv/ceph/ceph-pxa.conf --keyring /etc/pve/priv/ceph/ceph-pxa.keyring --auth_supported cephx rm pxa-rbd/base-183-disk-0 This works without any problem. So the question is if I can do it manually on a pve host why does proxmox throw an error doing the very same on the same host? For me this looks like a pve bug. I no idea how to find out more. Anyone else with some fresh ideas? Thanks Rainer -- Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1 56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312 PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 1001312 From pfrank at gmx.de Thu Jan 21 14:56:47 2021 From: pfrank at gmx.de (Petric Frank) Date: Thu, 21 Jan 2021 14:56:47 +0100 Subject: [PVE-User] Backup notes on NFS share Message-ID: <1714294.Zkmt1EvEu4@main> Hello, i do backups to a mounted NFS share. But when i try to apply a note to the backup i get the following error: --------------------- cut ----------------------- volume notes are not supported for PVE::Storage::NFSPlugin at /usr/share/perl5/PVE/ Storage/Plugin.pm line 838. (500) --------------------- cut ----------------------- Is that intentional or do i miss something. Running current Proxmox GUI 6.3-3 kind regards Petric From humbertos at ifsc.edu.br Thu Jan 21 16:34:59 2021 From: humbertos at ifsc.edu.br (Humberto Jose de Sousa) Date: Thu, 21 Jan 2021 12:34:59 -0300 Subject: Cloud-init don't manage resolv.conf on Debian 10 Message-ID: Hello folks. I'm trying to use cloud-init in a VM with a fresh install of Debian 10. All works, except resolv.conf. Anyone could give a tip? I'm on the latest proxmox no-subscription. I configured the cloud-init through a web interface. In the VM I installed the cloud-init package. I also tried to install the resolvconf package, but it did not work. -- *Humberto Jos? de Sousa* Analista de TI - CTIC (48) 3381-2821 *Instituto Federal de Santa Catarina - C?mpus S?o Jos?* R. Jos? Lino Kretzer, 608, Praia Comprida, S?o Jos? / SC - CEP: 88103-310 https://www.ifsc.edu.br/web/campus-sao-jose From aderumier at odiso.com Thu Jan 21 17:50:38 2021 From: aderumier at odiso.com (aderumier at odiso.com) Date: Thu, 21 Jan 2021 17:50:38 +0100 Subject: [PVE-User] Cloud-init don't manage resolv.conf on Debian 10 In-Reply-To: References: Message-ID: Le jeudi 21 janvier 2021 ? 12:34 -0300, Humberto Jose de Sousa via pve- user a ?crit?: > Hello folks. > > I'm trying to use cloud-init in a VM with a fresh install of Debian > 10. All > works, except resolv.conf. > > Anyone could give a tip? > > I'm on the latest proxmox no-subscription. I configured the cloud- > init > through a web interface. > In the VM I installed the cloud-init package. I also tried to install > the > resolvconf package, but it did not work. Hi, cloud-init write dns config in /etc/network/interfaces.d/cloudinit.cfg. 
this is working for ubuntu, because /etc/resolv.conf is not a real file, but a symlink manage by resolvconf package (or systemd-resolvd I'm not sure). you can try to install "resolvconf" package on debian, it should works From aderumier at odiso.com Thu Jan 21 17:50:38 2021 From: aderumier at odiso.com (aderumier at odiso.com) Date: Thu, 21 Jan 2021 17:50:38 +0100 Subject: [PVE-User] Cloud-init don't manage resolv.conf on Debian 10 In-Reply-To: References: Message-ID: Le jeudi 21 janvier 2021 ? 12:34 -0300, Humberto Jose de Sousa via pve- user a ?crit?: > Hello folks. > > I'm trying to use cloud-init in a VM with a fresh install of Debian > 10. All > works, except resolv.conf. > > Anyone could give a tip? > > I'm on the latest proxmox no-subscription. I configured the cloud- > init > through a web interface. > In the VM I installed the cloud-init package. I also tried to install > the > resolvconf package, but it did not work. Hi, cloud-init write dns config in /etc/network/interfaces.d/cloudinit.cfg. this is working for ubuntu, because /etc/resolv.conf is not a real file, but a symlink manage by resolvconf package (or systemd-resolvd I'm not sure). you can try to install "resolvconf" package on debian, it should works From humbertos at ifsc.edu.br Thu Jan 21 19:25:03 2021 From: humbertos at ifsc.edu.br (Humberto Jose de Sousa) Date: Thu, 21 Jan 2021 15:25:03 -0300 Subject: [PVE-User] Cloud-init don't manage resolv.conf on Debian 10 In-Reply-To: References: Message-ID: I verified the file content that you pointed: *cat /etc/network/interfaces.d/50-cloud-init * *auto loiface lo inet loopback dns-nameservers x.x.x.x x.x.x.x dns-search xx.xxxx.edu.br auto eth0iface eth0 inet static address x.x.x.x/xx gateway x.x.x.x* I think that problem is because the dns parameters are written in the loopback interface. This happen to you? The resolvconf package didn't work. Em qui., 21 de jan. de 2021 ?s 13:50, escreveu: > Le jeudi 21 janvier 2021 ? 12:34 -0300, Humberto Jose de Sousa via > pve-user a ?crit : > > Hello folks. > > I'm trying to use cloud-init in a VM with a fresh install of Debian 10. All > works, except resolv.conf. > > Anyone could give a tip? > > I'm on the latest proxmox no-subscription. I configured the cloud-init > through a web interface. > In the VM I installed the cloud-init package. I also tried to install the > resolvconf package, but it did not work. > > > Hi, > > cloud-init write dns config in /etc/network/interfaces.d/cloudinit.cfg. > > this is working for ubuntu, because /etc/resolv.conf is not a real file, > but a symlink manage by resolvconf package (or systemd-resolvd I'm not > sure). > > > you can try to install "resolvconf" package on debian, it should works > From humbertos at ifsc.edu.br Thu Jan 21 19:25:03 2021 From: humbertos at ifsc.edu.br (Humberto Jose de Sousa) Date: Thu, 21 Jan 2021 15:25:03 -0300 Subject: [PVE-User] Cloud-init don't manage resolv.conf on Debian 10 In-Reply-To: References: Message-ID: I verified the file content that you pointed: *cat /etc/network/interfaces.d/50-cloud-init * *auto loiface lo inet loopback dns-nameservers x.x.x.x x.x.x.x dns-search xx.xxxx.edu.br auto eth0iface eth0 inet static address x.x.x.x/xx gateway x.x.x.x* I think that problem is because the dns parameters are written in the loopback interface. This happen to you? The resolvconf package didn't work. Em qui., 21 de jan. de 2021 ?s 13:50, escreveu: > Le jeudi 21 janvier 2021 ? 12:34 -0300, Humberto Jose de Sousa via > pve-user a ?crit : > > Hello folks. 
>
> I'm trying to use cloud-init in a VM with a fresh install of Debian 10.
> All works, except resolv.conf.
>
> Could anyone give a tip?
>
> I'm on the latest Proxmox no-subscription. I configured the cloud-init
> settings through the web interface. In the VM I installed the cloud-init
> package. I also tried to install the resolvconf package, but it did not
> work.
>
> Hi,
>
> cloud-init writes its dns config in
> /etc/network/interfaces.d/cloudinit.cfg.
>
> This is working for Ubuntu because there /etc/resolv.conf is not a real
> file but a symlink managed by the resolvconf package (or systemd-resolved,
> I'm not sure).
>
> You can try to install the "resolvconf" package on Debian; it should work.

From aderumier at odiso.com Fri Jan 22 10:04:07 2021
From: aderumier at odiso.com (aderumier at odiso.com)
Date: Fri, 22 Jan 2021 10:04:07 +0100
Subject: [PVE-User] Cloud-init don't manage resolv.conf on Debian 10
In-Reply-To:
References:
Message-ID: <666ba844620b56bd12745f6ad9cfce503b44b7a8.camel@odiso.com>

It works for me with resolvconf and the config on the lo interface:

# apt install resolvconf
(verify that /etc/resolv.conf is now a symlink)
# echo "" > /etc/resolv.conf

Then reboot. After the reboot the values in /etc/resolv.conf are the same
as in /etc/network/interfaces.d/50-cloud-init.
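To double-check after the reboot, for example (the symlink target can
differ depending on which package manages it):

readlink -f /etc/resolv.conf   # should point below /run, not be a plain file
cat /etc/resolv.conf           # nameserver/search lines should match the
                               # cloud-init generated config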
Le jeudi 21 janvier 2021 à 19:41 -0300, Humberto Jose de Sousa a écrit :
> I verified the content of the file you pointed to:
>
> cat /etc/network/interfaces.d/50-cloud-init
> auto lo
> iface lo inet loopback
>     dns-nameservers x.x.x.x x.x.x.x
>     dns-search xx.xxxx.edu.br
>
> auto eth0
> iface eth0 inet static
>     address x.x.x.x/xx
>     gateway x.x.x.x
>
> I think the problem is that the dns parameters are written on the
> loopback interface. Does this happen to you?
>
> The resolvconf package didn't work.

From elacunza at binovo.es Fri Jan 22 12:25:08 2021
From: elacunza at binovo.es (Eneko Lacunza)
Date: Fri, 22 Jan 2021 12:25:08 +0100
Subject: SPICE issue
Message-ID: <75695212-0855-55f5-93f6-3cd146cd7386@binovo.es>

Hi all,

We have a recently installed 3-node cluster. We're having issues connecting
to VM consoles with SPICE; it works when we connect to a VM running on the
same node as the one we use for the WUI, but not for VMs running on the
other 2 nodes.

If we connect to the WUI of another node, then we can SPICE to VMs local to
that node, but not to the others. So the SPICE config of the VMs is correct.

We detected this issue on PVE 6.3-2 and have updated to PVE 6.3-3, but the
issue continues. I can't paste a pveversion -v (weird remote access).

Any ideas?

Thanks a lot

--
Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2ª izda. Oficina 10-11, 20180 Oiartzun
https://www.youtube.com/user/CANALBINOVO/
https://www.linkedin.com/company/37269706/

From humbertos at ifsc.edu.br Fri Jan 22 15:27:03 2021
From: humbertos at ifsc.edu.br (Humberto Jose de Sousa)
Date: Fri, 22 Jan 2021 11:27:03 -0300
Subject: [PVE-User] Cloud-init don't manage resolv.conf on Debian 10
In-Reply-To: <666ba844620b56bd12745f6ad9cfce503b44b7a8.camel@odiso.com>
References: <666ba844620b56bd12745f6ad9cfce503b44b7a8.camel@odiso.com>
Message-ID:

I found the problem. When I commented out the loopback interface in the
interfaces file, the resolvconf service picked up the right configuration.
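That is, the lo stanza in /etc/network/interfaces is now commented out,
roughly like this (just a sketch - the lo config comes from the cloud-init
fragment instead):

# auto lo
# iface lo inet loopback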
Thanks a lot.

Em sex., 22 de jan. de 2021 às 06:04, escreveu:

> It works for me with resolvconf and the config on the lo interface:
>
> # apt install resolvconf
> (verify that /etc/resolv.conf is now a symlink)
> # echo "" > /etc/resolv.conf
>
> Then reboot. After the reboot the values in /etc/resolv.conf are the same
> as in /etc/network/interfaces.d/50-cloud-init.

From jmr.richardson at gmail.com Sat Jan 23 17:07:26 2021
From: jmr.richardson at gmail.com (JR Richardson)
Date: Sat, 23 Jan 2021 10:07:26 -0600
Subject: [PVE-User] Unused Disk Remove vs Detach SOLVED
Message-ID: <000201d6f1a1$d80b0d80$88212880$@gmail.com>

* JR Richardson [210115 17:57]:
> I'm running PVE 6.2.11. I have VMs with disks on NFS shared storage. I
> moved disks to other storage nodes but did not choose to delete the old
> disk, so the old disk is still assigned to the VM in hardware as 'Unused
> Disk 0'. When I select the disk, the 'Detach' button changes to 'Remove'.
> I remember that in older versions of PVE you could just 'Detach' disks
> instead of removing them. Is this because the old storage node is still
> online? If I shut down the old storage node, would I get the option to
> detach unused disks?

I can only speculate, but essentially, completely detached disks are
invisible to PVE. I believe that on subsequent actions - like disk moves,
VM migrations, ... - the disk "numbers" can be reused. If there's such an
old disk around, the operation might either fail or the disk might be
silently overwritten.

From this PoV, I would strongly recommend against fully detaching disks
from VMs.

> Can I get around this via a command line utility to detach the old disk?

You can edit the VM definition file and remove the unusedN: line.

Best,
Chris

Hi Chris,

I removed the unused disk from the VM .conf file and that immediately
removed it from the GUI. I shut down the storage node and kept all my
disks intact without errors.

Thanks.

JR

JR Richardson
Engineering for the Masses
Chasing the Azeotrope
JRx DistillCo
1'st Place Brisket
1'st Place Chili

From elacunza at binovo.es Tue Jan 26 12:56:39 2021
From: elacunza at binovo.es (Eneko Lacunza)
Date: Tue, 26 Jan 2021 12:56:39 +0100
Subject: SPICE issue
In-Reply-To: <75695212-0855-55f5-93f6-3cd146cd7386@binovo.es>
References: <75695212-0855-55f5-93f6-3cd146cd7386@binovo.es>
Message-ID: <9c1e6b3d-c927-fedd-6626-46c1a588ed25@binovo.es>

Hi all,

The issue was that spiceproxy was unable to resolve the other Proxmox
nodes' IPs. Updating /etc/hosts with the corresponding entries solved the
problem.
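Something along these lines on each node (hostnames and addresses here are
made up - use your cluster's real ones):

10.0.0.11   pve1.example.com pve1
10.0.0.12   pve2.example.com pve2
10.0.0.13   pve3.example.com pve3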
Thanks

El 22/1/21 a las 12:25, Eneko Lacunza escribió:
> We have a recently installed 3-node cluster. We're having issues
> connecting to VM consoles with SPICE; it works when we connect to a VM
> running on the same node as the one we use for the WUI, but not for VMs
> running on the other 2 nodes.
>
> If we connect to the WUI of another node, then we can SPICE to VMs local
> to that node, but not to the others. So the SPICE config of the VMs is
> correct.
>
> We detected this issue on PVE 6.3-2 and have updated to PVE 6.3-3, but
> the issue continues. I can't paste a pveversion -v (weird remote access).
>
> Any ideas?
>
> Thanks a lot

--
Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2ª izda. Oficina 10-11, 20180 Oiartzun
https://www.youtube.com/user/CANALBINOVO/
https://www.linkedin.com/company/37269706/

From pfrank at gmx.de Wed Jan 27 15:29:41 2021
From: pfrank at gmx.de (Petric Frank)
Date: Wed, 27 Jan 2021 15:29:41 +0100
Subject: [PVE-User] LXC: Directory mount not writeable
Message-ID: <5486020.peFUeoqG7q@main>

Hello,

created a directory mount using

pct set 300 -mp0 <host-dir>,mp=<mountpoint>,ro=0

But in the LXC VM this directory is not writeable. A test with

touch <mountpoint>/z

fails. Omitting the "ro" option does not change this.

If I change in 300.conf the line from

mp0: <host-dir>,mp=<mountpoint>,ro=0

to

mp0: <host-dir>,mp=<mountpoint>,rw=1

the touch command works inside the container. But then the GUI shows an
error:

mp0: invalid format - format error
mp0.rw: property is not defined in schema and the schema does not allow
additional properties

Is this (mountpoint not being writeable) a bug? If yes, should I write a
bug report?

Updated the Proxmox server today from the no-subscription repository. The
web GUI is at version 6.3-3.

regards
Petric

From leesteken at protonmail.ch Wed Jan 27 15:47:02 2021
From: leesteken at protonmail.ch (Arjen)
Date: Wed, 27 Jan 2021 14:47:02 +0000
Subject: [PVE-User] LXC: Directory mount not writeable
In-Reply-To: <5486020.peFUeoqG7q@main>
References: <5486020.peFUeoqG7q@main>
Message-ID: <-BwUTHT-bzkW_PFM7mZQzI-vLXeP0I22k3h4oG_l7ZdLkqNTbc-D-EJczW4NhNprO2CptyXIQpDrCRdkCblW9Mb8AxZ8yXOAltsaJTP5qNM=@protonmail.ch>

> Hello,
>
> created a directory mount using
> pct set 300 -mp0 <host-dir>,mp=<mountpoint>,ro=0

Is this an unprivileged container?

> But in the LXC VM this directory is not writeable. A test with
> touch <mountpoint>/z
> fails. Omitting the "ro" option does not change this.

This sounds like a user-permission issue and not a read-only/not-writable
problem - unless the directory on the host is mounted (from another device)
as read-only.

> If I change in 300.conf the line from
> mp0: <host-dir>,mp=<mountpoint>,ro=0
> to
> mp0: <host-dir>,mp=<mountpoint>,rw=1
> the touch command works inside the container. But then the GUI shows an
> error:
> mp0: invalid format - format error
> mp0.rw: property is not defined in schema and the schema does not allow
> additional properties

rw does not exist; ro=0 is correct.

> Is this (mountpoint not being writeable) a bug? If yes, should I write a
> bug report?

No, this is probably a user/owner permission issue. Please check that the
user that you use has permission to write in the mounted directory. Please
note that the user-IDs of an unprivileged container are offset by 100000.
Maybe you can show the user-ID, the owner-ID and group-ID of the directory,
and the rwx-permissions of the directory (all from both inside the
container and from the Proxmox host).
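For example, something like this (paths are made up - say the host
directory is /srv/share, bind-mounted at /srv/share in CT 300; use your
real paths):

# on the Proxmox host
stat -c '%u:%g %A %n' /srv/share
# inside the container
pct exec 300 -- stat -c '%u:%g %A %n' /srv/share

If root inside the container should own it, the host side would need
something like chown 100000:100000 /srv/share (container UID 0 plus the
100000 offset).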
> Updated the Proxmox server today from the no-subscription repository. The
> web GUI is at version 6.3-3.

This looks to be up to date.

> regards
> Petric

kind regards

From pfrank at gmx.de Wed Jan 27 17:36:27 2021
From: pfrank at gmx.de (Petric Frank)
Date: Wed, 27 Jan 2021 17:36:27 +0100
Subject: [PVE-User] LXC: Directory mount not writeable
In-Reply-To:
References: <5486020.peFUeoqG7q@main>
Message-ID: <2934333.NQBDW4tYnK@main>

Hello,

Am Mittwoch, 27. Januar 2021, 15:47:02 CET schrieb Arjen via pve-user:
> > Hello,
> >
> > created a directory mount using
> > pct set 300 -mp0 <host-dir>,mp=<mountpoint>,ro=0
>
> Is this an unprivileged container?

yes

> > Is this (mountpoint not being writeable) a bug? If yes, should I write
> > a bug report?
>
> No, this is probably a user/owner permission issue. Please check that the
> user that you use has permission to write in the mounted directory.
> Please note that the user-IDs of an unprivileged container are offset by
> 100000. Maybe you can show the user-ID, the owner-ID and group-ID of the
> directory, and the rwx-permissions of the directory (all from both inside
> the container and from the Proxmox host).

This was the hint I needed. After changing the uid:gid of the host
directory to 100000 everything works well.

Thank you very much for your hint.

> kind regards

Kind regards and keep well
Petric Frank

From lindsay.mathieson at gmail.com Thu Jan 28 07:17:03 2021
From: lindsay.mathieson at gmail.com (Lindsay Mathieson)
Date: Thu, 28 Jan 2021 16:17:03 +1000
Subject: [PVE-User] Mirrored ZFS Boot drive?
Message-ID:

When I set up my latest server, I used ZFS for the boot SSD.

I later added a 2nd SSD as a mirror, but of course that only mirrors the
rpool, not the boot partitions (2 of them?), so not as useful as I thought -
if the first SSD fails, the server won't boot :(

If I installed Proxmox using mirrored SSDs from the start, would that set
up a "dual" boot option on both SSDs, or would one still be the primary and
only boot device?

--
Lindsay

From leesteken at protonmail.ch Thu Jan 28 07:34:07 2021
From: leesteken at protonmail.ch (Arjen)
Date: Thu, 28 Jan 2021 06:34:07 +0000
Subject: [PVE-User] Mirrored ZFS Boot drive?
In-Reply-To:
References:
Message-ID:

------- Original Message -------
On Thursday, January 28, 2021 7:17 AM, Lindsay Mathieson wrote:

> When I set up my latest server, I used ZFS for the boot SSD.
>
> I later added a 2nd SSD as a mirror, but of course that only mirrors the
> rpool, not the boot partitions (2 of them?), so not as useful as I
> thought - if the first SSD fails, the server won't boot :(
>
> If I installed Proxmox using mirrored SSDs from the start, would that set
> up a "dual" boot option on both SSDs, or would one still be the primary
> and only boot device?

The Proxmox installer makes a boot partition per drive and nowadays keeps
them all in sync with pve-efiboot-tool (but it does not check for bitrot
like ZFS does). Please have a look at the Host Bootloader section of the
manual: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot
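For a mirror disk added after installation, the rough sequence from that
chapter is something like the following (device and partition names are
examples only, and sgdisk -R overwrites the target disk - double-check
everything against your own layout first):

sgdisk /dev/sda -R /dev/sdb   # copy the partition layout to the new disk
sgdisk -G /dev/sdb            # give the new disk fresh GUIDs
zpool attach rpool <existing-zfs-partition> /dev/sdb3
pve-efiboot-tool format /dev/sdb2
pve-efiboot-tool init /dev/sdb2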
kind regards

From lindsay.mathieson at gmail.com Thu Jan 28 08:19:30 2021
From: lindsay.mathieson at gmail.com (Lindsay Mathieson)
Date: Thu, 28 Jan 2021 17:19:30 +1000
Subject: [PVE-User] Mirrored ZFS Boot drive?
In-Reply-To:
References:
Message-ID:

On 28/01/2021 4:34 pm, Arjen via pve-user wrote:
> The Proxmox installer makes a boot partition per drive and nowadays keeps
> them all in sync with pve-efiboot-tool (but it does not check for bitrot
> like ZFS does). Please have a look at the Host Bootloader section of the
> manual: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot

Thanks! Very useful. Time to see if I can update the boot or nuke the
server in the process :)

--
Lindsay

From f.ebner at proxmox.com Fri Jan 29 10:07:37 2021
From: f.ebner at proxmox.com (Fabian Ebner)
Date: Fri, 29 Jan 2021 10:07:37 +0100
Subject: [PVE-User] Cannot update backup job via API - PVE 6.3.3
In-Reply-To: <7adad43c-3f8a-62d5-9b84-fae6ea1d57b3@proxmox.com>
References: <7adad43c-3f8a-62d5-9b84-fae6ea1d57b3@proxmox.com>
Message-ID: <198f74ad-1840-e895-6c11-3c6271fac643@proxmox.com>

Hi,
a fix for this was applied in git[0] and will become available with the
next version of pve-manager (but it's not packaged yet).

[0]: https://git.proxmox.com/?p=pve-manager.git;a=commit;h=992ff8857d41182edb3e80d005309bccafd95f68

Am 15.01.21 um 13:10 schrieb Fabian Ebner:
> Hi,
> thanks for the report! I can reproduce this here and will work out a
> patch.
>
> On 14.01.21 15:25, Mariusz Miodowski via pve-user wrote:
>>
>> Hello All,
>>
>> We have noticed a problem with the latest PVE when trying to update a
>> backup job via the API.
>>
>> It looks like the problem is caused by the maxfiles parameter. If we
>> omit it, everything works fine.
>>
>> This problem is not only related to our environment; we have already
>> received reports from our clients that they have the same problem.
>>
>> Request:
>> PUT https://10.10.11.48:8006/api2/json/cluster/backup/3cb1bee67a58e68ea97db1292a23a22a9ea8d529:1
>> Array
>> (
>>     [vmid] => 8001
>>     [starttime] => 00:10
>>     [maxfiles] => 10
>>     [storage] => local
>>     [remove] => 1
>>     [dow] => tue,wed,sat
>>     [mode] => snapshot
>>     [compress] => zstd
>> )
>>
>> Response:
>> HTTP 500 HTTP/1.1 500 error during cfs-locked 'file-vzdump_cron'
>> operation: value without key, but schema does not define a default key
>> Cache-Control: max-age=0
>> Connection: close
>> Date: Thu, 14 Jan 2021 14:12:37 GMT
>> Pragma: no-cache
>> Server: pve-api-daemon/3.0
>> Content-Length: 13
>> Content-Type: application/json;charset=UTF-8
>> Expires: Thu, 14 Jan 2021 14:12:37 GMT
>>
>> {"data":null}
>>
>> --
>> Regards
>> Mariusz Miodowski
>> ModulesGarden Development Team Manager
>> https://www.modulesgarden.com

From pfrank at gmx.de Fri Jan 29 13:36:15 2021
From: pfrank at gmx.de (Petric Frank)
Date: Fri, 29 Jan 2021 13:36:15 +0100
Subject: [PVE-User] LXC mount point not in backup
Message-ID: <3515667.kIdP2CnQM6@main>

Hello,

in my config I have a mount point defined as:

mp0: /storage/nextcloud-gz,mp=/nextcloud,ro=0,backup=1

Despite the option "backup=1" it is not included in a backup. The backup
task log:

--------------------------- cut --------------------------
INFO: starting new backup job: vzdump 200 --compress gzip --node proxmox-gz --remove 0 --mode stop --storage zfs-raid
INFO: filesystem type on dumpdir is 'zfs' - using /var/tmp/vzdumptmp31209_200 for temporary files
INFO: Starting Backup of VM 200 (lxc)
INFO: Backup started at 2021-01-29 13:23:05
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: CT Name: nextcloud
INFO: including mount point rootfs ('/') in backup
INFO: excluding bind mount point mp0 ('/nextcloud') from backup (not a volume)
INFO: creating vzdump archive '/storage/vz/dump/vzdump-lxc-200-2021_01_29-13_23_05.tar.gz'
INFO: Total bytes written: 3123005440 (3.0GiB, 21MiB/s)
INFO: archive file size: 1.39GB
INFO: Finished Backup of VM 200 (00:02:27)
INFO: Backup finished at 2021-01-29 13:25:32
INFO: Backup job finished successfully
TASK OK
--------------------------- cut --------------------------

As you can see, the line below reports it as "not a volume":

INFO: excluding bind mount point mp0 ('/nextcloud') from backup (not a volume)

Is there a way to include the mount point in a backup?

Additional question: what is the behaviour of PBS in this case?

kind regards
Petric

From f.gruenbichler at proxmox.com Fri Jan 29 15:59:50 2021
From: f.gruenbichler at proxmox.com (Fabian Grünbichler)
Date: Fri, 29 Jan 2021 15:59:50 +0100 (CET)
Subject: [PVE-User] LXC mount point not in backup
In-Reply-To: <3515667.kIdP2CnQM6@main>
References: <3515667.kIdP2CnQM6@main>
Message-ID: <1182254484.1418.1611932390376@webmail.proxmox.com>

> Petric Frank hat am 29.01.2021 13:36 geschrieben:
>
> Hello,
>
> in my config I have a mount point defined as:
>
> mp0: /storage/nextcloud-gz,mp=/nextcloud,ro=0,backup=1
>
> Despite the option "backup=1" it is not included in a backup. The backup
> task log:
> [...]
> INFO: excluding bind mount point mp0 ('/nextcloud') from backup (not a
> volume)
> [...]
>
> Is there a way to include the mount point in a backup?

no. the assumption here is that you are using a bind mount to pass in
something that is not managed by PVE (e.g., a dir used in more than one
container, some network share mounted on the host, ..) and thus also not
included in backups.

> Additional question: what is the behaviour of PBS in this case?

if you mean vzdump when backing up to PBS - identical. of course, if you
run proxmox-backup-client from within the container you can choose yourself
which mountpoints are included or not.
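a rough sketch of that (repository, user and archive names are made up -
adjust them to your PBS setup):

# run inside the container; backs up / and the bind-mounted /nextcloud
# as two separate pxar archives
proxmox-backup-client backup root.pxar:/ nextcloud.pxar:/nextcloud --repository backup@pbs@pbs.example.com:store1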
From lindsay.mathieson at gmail.com Sun Jan 31 01:52:29 2021
From: lindsay.mathieson at gmail.com (Lindsay Mathieson)
Date: Sun, 31 Jan 2021 10:52:29 +1000
Subject: [PVE-User] Mirrored ZFS Boot drive?
In-Reply-To:
References:
Message-ID: <38189d4c-306e-93e2-8746-dc569e697fd0@gmail.com>

On 28/01/2021 4:34 pm, Arjen via pve-user wrote:
> The Proxmox installer makes a boot partition per drive and nowadays keeps
> them all in sync with pve-efiboot-tool (but it does not check for bitrot
> like ZFS does). Please have a look at the Host Bootloader section of the
> manual: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot

I tested this out, adding a 2nd SSD; the server is using EFI/systemd-boot,
and it worked a treat. I even zapped and removed the first SSD, and it
still booted fine.

Thanks!

--
Lindsay