From aderumier at odiso.com Sat Jul 1 09:27:20 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Sat, 1 Jul 2017 09:27:20 +0200 (CEST) Subject: [PVE-User] [pve-devel] Proxmox VE 5.0 beta2 released! In-Reply-To: <20170630174825.GC12327@kalry> References: <20170616134344.GC1933@kalry> <947136085.14300301.1497623213661.JavaMail.zimbra@oxygem.tv> <20170618073625.GD1933@kalry> <031dcafe-385f-2829-6fea-b862027e7713@proxmox.com> <649A5203-FBBE-4CD4-8369-AC4D46C8EA45@elchaka.de> <20170630174825.GC12327@kalry> Message-ID: <1225336742.14799284.1498894040871.JavaMail.zimbra@oxygem.tv> I'll not be include in proxmox 5.0. (proxmox devs are working on other things) I'll try to push it for proxmox 5.1. ----- Mail original ----- De: lemonnierk at ulrar.net ?: "proxmoxve" Envoy?: Vendredi 30 Juin 2017 19:48:25 Objet: Re: [PVE-User] [pve-devel] Proxmox VE 5.0 beta2 released! On Fri, Jun 30, 2017 at 07:33:55PM +0200, Mehmet wrote: > I guess "when it is done" :) > > Do you miss something or just to know when it will become done? Just know when it's done. As you can imagine, we've all been waiting for cloud-init for ever. If it's done in two weeks we'll wait, but if it's done in two months we'll have to install a new cluster on pve 4 with the old not so great deployment system, hence the question. _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From gilberto.nunes32 at gmail.com Sat Jul 1 15:00:49 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Sat, 1 Jul 2017 10:00:49 -0300 Subject: [PVE-User] [pve-devel] Proxmox VE 5.0 beta2 released! In-Reply-To: <649A5203-FBBE-4CD4-8369-AC4D46C8EA45@elchaka.de> References: <160f961a-f0b8-7a60-2f8f-8a9c37bc2418@proxmox.com> <420650404.12924802.1495548218032.JavaMail.zimbra@oxygem.tv> <735614091.12925121.1495548899554.JavaMail.zimbra@oxygem.tv> <2021261790.14180495.1497339810341.JavaMail.zimbra@oxygem.tv> <20170616134344.GC1933@kalry> <947136085.14300301.1497623213661.JavaMail.zimbra@oxygem.tv> <20170618073625.GD1933@kalry> <031dcafe-385f-2829-6fea-b862027e7713@proxmox.com> <649A5203-FBBE-4CD4-8369-AC4D46C8EA45@elchaka.de> Message-ID: I heard that soon Debian "Stretch" comes out, so PVE too... Perhaps you right... I miss something... Obrigado Cordialmente Gilberto Ferreira Consultor TI Linux | IaaS Proxmox, CloudStack, KVM | Zentyal Server | Zimbra Mail Server (47) 3025-5907 (47) 99676-7530 Skype: gilberto.nunes36 konnectati.com.br https://www.youtube.com/watch?v=dsiTPeNWcSE 2017-06-30 14:33 GMT-03:00 Mehmet : > I guess "when it is done" :) > > Do you miss something or just to know when it will become done? > > Am 21. Juni 2017 21:33:07 MESZ schrieb Gilberto Nunes < > gilberto.nunes32 at gmail.com>: > >And when PVE 5 comes out?? > > > >2017-06-19 3:29 GMT-03:00 Emmanuel Kasper : > > > >> > In the meantime, I assume I can install proxmox beta on debian 9 > >stable ? 
> >> > To start playing with the API, make sure our applications will > >still work > >> > fine, maybe start adding some Cloud-Init support in there :) > >> > >> Yes, you can install the 5.0 Beta on Debian 9.0, see > >> https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch > >> > >> _______________________________________________ > >> pve-user mailing list > >> pve-user at pve.proxmox.com > >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > >> > > > > > > > >-- > > product&utm_medium=email_sig&utm_campaign=gmail_api&utm_content=thumb> > >Gilberto Ferreira > >about.me/gilbertof > > product&utm_medium=email_sig&utm_campaign=gmail_api&utm_content=thumb> > >_______________________________________________ > >pve-user mailing list > >pve-user at pve.proxmox.com > >https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From devin at pabstatencio.com Mon Jul 3 04:49:18 2017 From: devin at pabstatencio.com (Devin Acosta) Date: Sun, 2 Jul 2017 19:49:18 -0700 Subject: [PVE-User] PVE 5. / Corosync / Fencing Message-ID: *I am using the latest Proxmox Beta in a 2-node cluster right now (just doing some testing). From what I read it appears that I need to setup fencing, and possibly watchdog in order to finish making the cluster fully HA? I want the ability that if a host dies or loses access to the storage network that it will restart the container/VM on another host in the cluster. From what I read it appears the cluster.conf is gone now, and replaced by corosync. I don?t find much information in regards to corosync and configuring it for fencing for HP servers.* *Any assistance would greatly be appreciated. I have 3 HP DL360g8 servers all sharing NFS storage.* *Thanks much.* *Devin* From e.kasper at proxmox.com Mon Jul 3 09:03:33 2017 From: e.kasper at proxmox.com (Emmanuel Kasper) Date: Mon, 3 Jul 2017 09:03:33 +0200 Subject: [PVE-User] PVE 5. / Corosync / Fencing In-Reply-To: References: Message-ID: Hi Devin On 07/03/2017 04:49 AM, Devin Acosta wrote: > *I am using the latest Proxmox Beta in a 2-node cluster right now (just > doing some testing). From what I read it appears that I need to setup > fencing, and possibly watchdog in order to finish making the cluster fully > HA? I want the ability that if a host dies or loses access to the storage > network that it will restart the container/VM on another host in the > cluster. this is exactly what the HA stack provides do you have two or three nodes ? reliable HA requires three nodes. out of the box PVE4 and PVE5 works with a software defined watchdog which will self fence the node if cluster communication is lost see https://pve.proxmox.com/pve-docs/chapter-ha-manager.html#ha_manager_fencing for the fine docs if the VM/CT is defined as a HA resource, it will be automically restarted on another node when fencing takes place From bleek at cross-solution.de Tue Jul 4 08:57:24 2017 From: bleek at cross-solution.de (Carsten Bleek) Date: Tue, 4 Jul 2017 08:57:24 +0200 Subject: [PVE-User] Backup WIN7 VM on Proxmox 2.2, restore on Proxmox 4.4 results in "Boot failed: not a bootable disk In-Reply-To: <5a0f41ef-e439-3efd-0122-a2520d33ae0e@binovo.es> References: <5a0f41ef-e439-3efd-0122-a2520d33ae0e@binovo.es> Message-ID: Hi I guess, it was a windows issue. In the end I could solve the problem by using an AcronisBootableMedia.iso. 
I backup/restore the Disc and windows is booting again. Regards, Carsten Am 30.06.2017 um 09:46 schrieb Eneko Lacunza: > Seems strange. Have you tried to boot to a windows 7 installation ISO > and inspect the virtual disk from within? Maybe try also "repair boot"? > > El 30/06/17 a las 09:39, Carsten Bleek escribi?: >> Hi Eneko, >> >> VM starts, but windows does not boot. >> >> There is something stange with the raw image. On Proxmox 2.2 Windows >> is booting. But I can't get any partition informations >> >> Using parted: >> >> parted /var/lib/vz/images/303/vm-303-disk-1.raw >> GNU Parted 2.3 >> Using /var/lib/vz/images/303/vm-303-disk-1.raw >> Welcome to GNU Parted! Type 'help' to view a list of commands. >> (parted) p >> Error: /var/lib/vz/images/303/vm-303-disk-1.raw: unrecognised disk label >> >> Using: fdisk >> >> root at cross-prox:~# fdisk /var/lib/vz/images/303/vm-303-disk-1.raw >> Device contains neither a valid DOS partition table, nor Sun, SGI or >> OSF disklabel >> Building a new DOS disklabel with disk identifier 0xe6aadbb7. >> Changes will remain in memory only, until you decide to write them. >> After that, of course, the previous content won't be recoverable. >> >> Warning: invalid flag 0x0000 of partition table 4 will be corrected >> by w(rite) >> You must set cylinders. >> You can do this from the extra functions menu. >> >> WARNING: DOS-compatible mode is deprecated. It's strongly recommended to >> switch off the mode (command 'c') and change display units to >> sectors (command 'u'). >> >> Command (m for help): p >> >> Disk /var/lib/vz/images/303/vm-303-disk-1.raw: 0 MB, 0 bytes >> 255 heads, 63 sectors/track, 0 cylinders >> Units = cylinders of 16065 * 512 = 8225280 bytes >> Sector size (logical/physical): 512 bytes / 512 bytes >> I/O size (minimum/optimal): 512 bytes / 512 bytes >> Disk identifier: 0xe6aadbb7 >> >> Device Boot Start End >> Blocks Id System >> >> Command (m for help): q >> >> Regards, >> >> Carsten >> >> Am 30.06.2017 um 09:30 schrieb Eneko Lacunza: >>> Hi Carsten, >>> >>> VM doesn't start, or Windows doesn't boot? >>> >>> If VM doesn't start please send error log. >>> >>> El 30/06/17 a las 09:04, Carsten Bleek escribi?: >>>> Hi, >>>> >>>> I'm trying to move a KVM Win7 VM from an 2.2 to a 4.4 Proxmox using >>>> ZFS. I've done it using the normal backup/restore feature of the >>>> Proxmox GUI. In addition, I've tried to bypass Backup/Restore as >>>> descibed in >>>> https://pve.proxmox.com/wiki/Upgrade_from_3.x_to_4.0#Bypassing_Backup_and_Restore >>>> (same result). 
>>>> >>>> Regards, >>>> >>>> Carsten >>>> >>>> >>>> >>>> On 4.4 KVM runs as (VM is not not booting): >>>> >>>> /usr/bin/kvm -id 303 -chardev >>>> socket,id=qmp,path=/var/run/qemu-server/303.qmp,server,nowait -mon >>>> chardev=qmp,mode=control -pidfile /var/run/qemu-server/303.pid >>>> -daemonize -name Win7 -smp 4,sockets=2,cores=2,maxcpus=4 >>>> -nodefaults -boot >>>> menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg >>>> -vga std -vnc unix:/var/run/qemu-server/303.vnc,x509,password >>>> -no-hpet -cpu >>>> kvm64,+lahf_lm,+sep,+kvm_pv_unhalt,+kvm_pv_eoi,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_reset,hv_vpindex,hv_runtime,hv_relaxed,enforce >>>> -m 4096 -k de -device >>>> pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device >>>> pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device >>>> piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device >>>> usb-tablet,id=tablet,bus=uhci.0,port=1 -device >>>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi >>>> initiator-name=iqn.1993-08.org.debian:01:7a05c3c8a25 -drive >>>> file=/var/lib/vz/images/303/vm-303-disk-1.raw,if=none,id=drive-ide0,cache=writethrough,format=raw,aio=threads,detect-zeroes=on >>>> -device >>>> ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100 >>>> -drive if=none,id=drive-ide2,media=cdrom,aio=threads -device >>>> ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 >>>> -netdev >>>> type=tap,id=net0,ifname=tap303i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown >>>> -device >>>> e1000,mac=08:00:27:8C:BA:31,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 >>>> -rtc driftfix=slew,base=localtime -global >>>> kvm-pit.lost_tick_policy=discard >>>> >>>> On 2.2 KVM runs as and the VM ist booting. >>>> >>>> usr/bin/kvm -id 303 -chardev >>>> socket,id=monitor,path=/var/run/qemu-server/303.mon,server,nowait >>>> -mon chardev=monitor,mode=readline -vnc >>>> unix:/var/run/qemu-server/303.vnc,x509,password -pidfile >>>> /var/run/qemu-server/303.pid -daemonize -usbdevice tablet -name >>>> Win7 -smp sockets=2,cores=2 -nodefaults -boot menu=on -vga std >>>> -localtime -rtc-td-hack -no-kvm-pit-reinjection -no-hpet -k de >>>> -drive >>>> file=/dev/cdrom2,if=none,id=drive-ide2,media=cdrom,aio=native >>>> -device >>>> ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200 >>>> -drive >>>> file=/var/lib/vz/images/303/vm-303-disk-1.raw,if=none,id=drive-ide0,cache=directsync,aio=native >>>> -device >>>> ide-hd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=100 -m >>>> 4096 -netdev >>>> type=tap,id=net0,ifname=tap303i0,script=/var/lib/qemu-server/pve-bridge >>>> -device >>>> e1000,mac=08:00:27:8C:BA:31,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300 >>>> -cpuunits 1000 >>>> >>> >> >> > -- Cross Solution | Carsten Bleek Diemelstra?e 2-4 | Tel: 069-71910361 60486 Frankfurt am Main | Fax: 069-71910369 http://cross-solution.de | bleek at cross-solution.de From martin at proxmox.com Tue Jul 4 17:25:10 2017 From: martin at proxmox.com (Martin Maurer) Date: Tue, 4 Jul 2017 17:25:10 +0200 Subject: [PVE-User] Proxmox VE 5.0 released! Message-ID: Hi all, We are very happy to announce the final release of our Proxmox VE 5.0 - based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. New Proxmox VE Storage Replication Replicas provide asynchronous data replication between two or multiple nodes in a cluster, thus minimizing data loss in case of failure. 
For all organizations using local storage the Proxmox replication feature is a great option to increase data redundancy for high I/Os avoiding the need of complex shared or distributed storage configurations. => https://pve.proxmox.com/wiki/Storage_Replication With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for distributed storage. Packaging is now done by the Proxmox team. The Ceph Luminous is not yet production ready but already available for testing. If you use Ceph, follow the recommendations below. We also have a simplified procedure for disk import from different hypervisors. You can now easily import disks from VMware, Hyper-V, or other hypervisors via a new command line tool called ?qm importdisk?. Other new features are the live migration with local storage via QEMU, added USB und Host PCI address visibility in the GUI, bulk actions and filtering options in the GUI and an optimized NoVNC console. And as always we have included countless bugfixes and improvements on a lot of places. Video Watch our short introduction video - What's new in Proxmox VE 5.0? https://www.proxmox.com/en/training/video-tutorials Release notes https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 Download https://www.proxmox.com/en/downloads Alternate ISO download: http://download.proxmox.com/iso/ Source Code https://git.proxmox.com Bugtracker https://bugzilla.proxmox.com FAQ Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via apt? A: Yes, upgrading from beta to stable can be done via apt. https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_Virtual_Environment_5.x_to_latest_5.0 Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? A: Yes, see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 Q: I am running Ceph Server on V4.4 in a production setup - should I upgrade now? A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest Ceph Luminous release (release candidate status). Therefore not yet recommended for production. But you should start testing in a testlab environment, here is one important wiki article - https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous Many thank you's to our active community for all feedback, testing, bug reporting and patch submissions! -- Best Regards, Martin Maurer Proxmox VE project leader From gilberto.nunes32 at gmail.com Tue Jul 4 19:05:50 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Tue, 4 Jul 2017 14:05:50 -0300 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: Hi That's Nice ! I don't see any reference about Cloud Init... Is it in this release?? Obrigado Cordialmente Gilberto Ferreira Consultor TI Linux | IaaS Proxmox, CloudStack, KVM | Zentyal Server | Zimbra Mail Server (47) 3025-5907 (47) 99676-7530 Skype: gilberto.nunes36 konnectati.com.br https://www.youtube.com/watch?v=dsiTPeNWcSE 2017-07-04 12:25 GMT-03:00 Martin Maurer : > Hi all, > > We are very happy to announce the final release of our Proxmox VE 5.0 - > based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. > > New Proxmox VE Storage Replication > > Replicas provide asynchronous data replication between two or multiple > nodes in a cluster, thus minimizing data loss in case of failure. 
For all > organizations using local storage the Proxmox replication feature is a > great option to increase data redundancy for high I/Os avoiding the need of > complex shared or distributed storage configurations. > => https://pve.proxmox.com/wiki/Storage_Replication > > With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for distributed > storage. Packaging is now done by the Proxmox team. The Ceph Luminous is > not yet production ready but already available for testing. If you use > Ceph, follow the recommendations below. > > We also have a simplified procedure for disk import from different > hypervisors. You can now easily import disks from VMware, Hyper-V, or other > hypervisors via a new command line tool called ?qm importdisk?. > > Other new features are the live migration with local storage via QEMU, > added USB und Host PCI address visibility in the GUI, bulk actions and > filtering options in the GUI and an optimized NoVNC console. > > And as always we have included countless bugfixes and improvements on a > lot of places. > > Video > Watch our short introduction video - What's new in Proxmox VE 5.0? > https://www.proxmox.com/en/training/video-tutorials > > Release notes > https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 > > Download > https://www.proxmox.com/en/downloads > Alternate ISO download: > http://download.proxmox.com/iso/ > > Source Code > https://git.proxmox.com > > Bugtracker > https://bugzilla.proxmox.com > > FAQ > Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via apt? > A: Yes, upgrading from beta to stable can be done via apt. > https://pve.proxmox.com/wiki/Downloads#Update_a_running_Prox > mox_Virtual_Environment_5.x_to_latest_5.0 > > Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? > A: Yes, see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_St > retch > > Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? > A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 > > Q: I am running Ceph Server on V4.4 in a production setup - should I > upgrade now? > A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest Ceph > Luminous release (release candidate status). Therefore not yet recommended > for production. But you should start testing in a testlab environment, here > is one important wiki article - https://pve.proxmox.com/wiki/C > eph_Jewel_to_Luminous > > Many thank you's to our active community for all feedback, testing, bug > reporting and patch submissions! > > -- > Best Regards, > > Martin Maurer > Proxmox VE project leader > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From dietmar at proxmox.com Tue Jul 4 19:13:48 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Tue, 4 Jul 2017 19:13:48 +0200 (CEST) Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: <770257297.58.1499188429122@webmail.proxmox.com> > I don't see any reference about Cloud Init... Is it in this release?? No, it is not. From gilberto.nunes32 at gmail.com Tue Jul 4 19:17:47 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Tue, 4 Jul 2017 14:17:47 -0300 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: <770257297.58.1499188429122@webmail.proxmox.com> References: <770257297.58.1499188429122@webmail.proxmox.com> Message-ID: Did you have any clue when??? 
Obrigado Cordialmente Gilberto Ferreira Consultor TI Linux | IaaS Proxmox, CloudStack, KVM | Zentyal Server | Zimbra Mail Server (47) 3025-5907 (47) 99676-7530 Skype: gilberto.nunes36 konnectati.com.br https://www.youtube.com/watch?v=dsiTPeNWcSE 2017-07-04 14:13 GMT-03:00 Dietmar Maurer : > > I don't see any reference about Cloud Init... Is it in this release?? > > No, it is not. > > From dietmar at proxmox.com Tue Jul 4 21:29:01 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Tue, 4 Jul 2017 21:29:01 +0200 (CEST) Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: <770257297.58.1499188429122@webmail.proxmox.com> Message-ID: <606722669.60.1499196541865@webmail.proxmox.com> > Did you have any clue when??? All information is available on the developer list (pve-devel). From moh at multihouse.dk Tue Jul 4 22:02:20 2017 From: moh at multihouse.dk (Martin Overgaard Hansen) Date: Tue, 4 Jul 2017 20:02:20 +0000 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: Den 4. jul. 2017 kl. 17.25 skrev Martin Maurer >: Hi all, We are very happy to announce the final release of our Proxmox VE 5.0 - based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. New Proxmox VE Storage Replication Replicas provide asynchronous data replication between two or multiple nodes in a cluster, thus minimizing data loss in case of failure. For all organizations using local storage the Proxmox replication feature is a great option to increase data redundancy for high I/Os avoiding the need of complex shared or distributed storage configurations. => https://pve.proxmox.com/wiki/Storage_Replication With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for distributed storage. Packaging is now done by the Proxmox team. The Ceph Luminous is not yet production ready but already available for testing. If you use Ceph, follow the recommendations below. We also have a simplified procedure for disk import from different hypervisors. You can now easily import disks from VMware, Hyper-V, or other hypervisors via a new command line tool called ?qm importdisk?. Other new features are the live migration with local storage via QEMU, added USB und Host PCI address visibility in the GUI, bulk actions and filtering options in the GUI and an optimized NoVNC console. And as always we have included countless bugfixes and improvements on a lot of places. Video Watch our short introduction video - What's new in Proxmox VE 5.0? https://www.proxmox.com/en/training/video-tutorials Release notes https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 Download https://www.proxmox.com/en/downloads Alternate ISO download: http://download.proxmox.com/iso/ Source Code https://git.proxmox.com Bugtracker https://bugzilla.proxmox.com FAQ Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via apt? A: Yes, upgrading from beta to stable can be done via apt. https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_Virtual_Environment_5.x_to_latest_5.0 Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? A: Yes, see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 Q: I am running Ceph Server on V4.4 in a production setup - should I upgrade now? A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest Ceph Luminous release (release candidate status). Therefore not yet recommended for production. 
But you should start testing in a testlab environment, here is one important wiki article - https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous Many thank you's to our active community for all feedback, testing, bug reporting and patch submissions! -- Best Regards, Martin Maurer Proxmox VE project leader _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user Sounds great, really looking forward to the new ceph release becoming stable. What's the general opinion regarding incremental backups with a dirty map like feature? http://wiki.qemu.org/Features/IncrementalBackup I see that as a very important and missing feature. Best Regards, Martin Overgaard Hansen MultiHouse IT Partner A/S From aderumier at odiso.com Tue Jul 4 22:46:12 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Tue, 4 Jul 2017 22:46:12 +0200 (CEST) Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: <1417019251.14902641.1499201172556.JavaMail.zimbra@oxygem.tv> Congrats to proxmox team ! BTW, small typo on https://www.proxmox.com/en/training/video-tutorials "New open-source storage replikation stack" /replikation/replication ----- Mail original ----- De: "Martin Maurer" ?: "pve-devel" , "proxmoxve" Envoy?: Mardi 4 Juillet 2017 17:25:10 Objet: [PVE-User] Proxmox VE 5.0 released! Hi all, We are very happy to announce the final release of our Proxmox VE 5.0 - based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. New Proxmox VE Storage Replication Replicas provide asynchronous data replication between two or multiple nodes in a cluster, thus minimizing data loss in case of failure. For all organizations using local storage the Proxmox replication feature is a great option to increase data redundancy for high I/Os avoiding the need of complex shared or distributed storage configurations. => https://pve.proxmox.com/wiki/Storage_Replication With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for distributed storage. Packaging is now done by the Proxmox team. The Ceph Luminous is not yet production ready but already available for testing. If you use Ceph, follow the recommendations below. We also have a simplified procedure for disk import from different hypervisors. You can now easily import disks from VMware, Hyper-V, or other hypervisors via a new command line tool called ?qm importdisk?. Other new features are the live migration with local storage via QEMU, added USB und Host PCI address visibility in the GUI, bulk actions and filtering options in the GUI and an optimized NoVNC console. And as always we have included countless bugfixes and improvements on a lot of places. Video Watch our short introduction video - What's new in Proxmox VE 5.0? https://www.proxmox.com/en/training/video-tutorials Release notes https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 Download https://www.proxmox.com/en/downloads Alternate ISO download: http://download.proxmox.com/iso/ Source Code https://git.proxmox.com Bugtracker https://bugzilla.proxmox.com FAQ Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via apt? A: Yes, upgrading from beta to stable can be done via apt. https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_Virtual_Environment_5.x_to_latest_5.0 Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? A: Yes, see https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? 
A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 Q: I am running Ceph Server on V4.4 in a production setup - should I upgrade now? A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest Ceph Luminous release (release candidate status). Therefore not yet recommended for production. But you should start testing in a testlab environment, here is one important wiki article - https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous Many thank you's to our active community for all feedback, testing, bug reporting and patch submissions! -- Best Regards, Martin Maurer Proxmox VE project leader _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From elacunza at binovo.es Wed Jul 5 10:20:54 2017 From: elacunza at binovo.es (Eneko Lacunza) Date: Wed, 5 Jul 2017 10:20:54 +0200 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: Thanks a lot for your excelent work! El 04/07/17 a las 17:25, Martin Maurer escribi?: > Hi all, > > We are very happy to announce the final release of our Proxmox VE 5.0 > - based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. > > New Proxmox VE Storage Replication > > Replicas provide asynchronous data replication between two or multiple > nodes in a cluster, thus minimizing data loss in case of failure. For > all organizations using local storage the Proxmox replication feature > is a great option to increase data redundancy for high I/Os avoiding > the need of complex shared or distributed storage configurations. > => https://pve.proxmox.com/wiki/Storage_Replication > > With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for > distributed storage. Packaging is now done by the Proxmox team. The > Ceph Luminous is not yet production ready but already available for > testing. If you use Ceph, follow the recommendations below. > > We also have a simplified procedure for disk import from different > hypervisors. You can now easily import disks from VMware, Hyper-V, or > other hypervisors via a new command line tool called ?qm importdisk?. > > Other new features are the live migration with local storage via QEMU, > added USB und Host PCI address visibility in the GUI, bulk actions and > filtering options in the GUI and an optimized NoVNC console. > > And as always we have included countless bugfixes and improvements on > a lot of places. > > Video > Watch our short introduction video - What's new in Proxmox VE 5.0? > https://www.proxmox.com/en/training/video-tutorials > > Release notes > https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 > > Download > https://www.proxmox.com/en/downloads > Alternate ISO download: > http://download.proxmox.com/iso/ > > Source Code > https://git.proxmox.com > > Bugtracker > https://bugzilla.proxmox.com > > FAQ > Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via > apt? > A: Yes, upgrading from beta to stable can be done via apt. > https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_Virtual_Environment_5.x_to_latest_5.0 > > Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? > A: Yes, see > https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch > > Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? > A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 > > Q: I am running Ceph Server on V4.4 in a production setup - should I > upgrade now? > A: Not yet. 
Ceph packages in Proxmox VE 5.0 are based on the latest > Ceph Luminous release (release candidate status). Therefore not yet > recommended for production. But you should start testing in a testlab > environment, here is one important wiki article - > https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous > > Many thank you's to our active community for all feedback, testing, > bug reporting and patch submissions! > -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943493611 943324914 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es From moh at multihouse.dk Wed Jul 5 15:54:59 2017 From: moh at multihouse.dk (Martin Overgaard Hansen) Date: Wed, 5 Jul 2017 13:54:59 +0000 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: Sounds great, really looking forward to the new ceph release becoming stable. What's the general opinion regarding incremental backups with a dirty map like feature? http://wiki.qemu.org/Features/IncrementalBackup I see that as a very important and missing feature. Best Regards, Martin Overgaard Hansen MultiHouse IT Partner A/S > -----Original Message----- > From: pve-user [mailto:pve-user-bounces at pve.proxmox.com] On Behalf Of > Martin Maurer > Sent: Tuesday, July 4, 2017 5:25 PM > To: pve-devel at pve.proxmox.com; PVE User List user at pve.proxmox.com> > Subject: [PVE-User] Proxmox VE 5.0 released! > > Hi all, > > We are very happy to announce the final release of our Proxmox VE 5.0 - > based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. > > New Proxmox VE Storage Replication > > Replicas provide asynchronous data replication between two or multiple > nodes in a cluster, thus minimizing data loss in case of failure. For all > organizations using local storage the Proxmox replication feature is a great > option to increase data redundancy for high I/Os avoiding the need of > complex shared or distributed storage configurations. > => https://pve.proxmox.com/wiki/Storage_Replication > > With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for > distributed storage. Packaging is now done by the Proxmox team. The Ceph > Luminous is not yet production ready but already available for testing. > If you use Ceph, follow the recommendations below. > > We also have a simplified procedure for disk import from different > hypervisors. You can now easily import disks from VMware, Hyper-V, or > other hypervisors via a new command line tool called ?qm importdisk?. > > Other new features are the live migration with local storage via QEMU, > added USB und Host PCI address visibility in the GUI, bulk actions and filtering > options in the GUI and an optimized NoVNC console. > > And as always we have included countless bugfixes and improvements on a > lot of places. > > Video > Watch our short introduction video - What's new in Proxmox VE 5.0? > https://www.proxmox.com/en/training/video-tutorials > > Release notes > https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 > > Download > https://www.proxmox.com/en/downloads > Alternate ISO download: > http://download.proxmox.com/iso/ > > Source Code > https://git.proxmox.com > > Bugtracker > https://bugzilla.proxmox.com > > FAQ > Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via apt? > A: Yes, upgrading from beta to stable can be done via apt. 
> https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_V > irtual_Environment_5.x_to_latest_5.0 > > Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? > A: Yes, see > https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch > > Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? > A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 > > Q: I am running Ceph Server on V4.4 in a production setup - should I upgrade > now? > A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest Ceph > Luminous release (release candidate status). Therefore not yet > recommended for production. But you should start testing in a testlab > environment, here is one important wiki article - > https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous > > Many thank you's to our active community for all feedback, testing, bug > reporting and patch submissions! > > -- > Best Regards, > > Martin Maurer > Proxmox VE project leader > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From aderumier at odiso.com Wed Jul 5 19:38:13 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Wed, 5 Jul 2017 19:38:13 +0200 (CEST) Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: <1038450618.14942055.1499276293482.JavaMail.zimbra@oxygem.tv> >>What's the general opinion regarding incremental backups with a dirty map like feature? http://wiki.qemu.org/Features/IncrementalBackup >> >>I see that as a very important and missing feature. It need to be implement in proxmox vma format and backup code. (as proxmox don't use qemu drive_backup qmp) Note that currently, dirty map can't be saved on disk (on vm shutdown for exemple), so if vm is shutdown, you need to do a full backup again. ----- Mail original ----- De: "Martin Overgaard Hansen" ?: "proxmoxve" Envoy?: Mercredi 5 Juillet 2017 15:54:59 Objet: Re: [PVE-User] Proxmox VE 5.0 released! Sounds great, really looking forward to the new ceph release becoming stable. What's the general opinion regarding incremental backups with a dirty map like feature? http://wiki.qemu.org/Features/IncrementalBackup I see that as a very important and missing feature. Best Regards, Martin Overgaard Hansen MultiHouse IT Partner A/S > -----Original Message----- > From: pve-user [mailto:pve-user-bounces at pve.proxmox.com] On Behalf Of > Martin Maurer > Sent: Tuesday, July 4, 2017 5:25 PM > To: pve-devel at pve.proxmox.com; PVE User List user at pve.proxmox.com> > Subject: [PVE-User] Proxmox VE 5.0 released! > > Hi all, > > We are very happy to announce the final release of our Proxmox VE 5.0 - > based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. > > New Proxmox VE Storage Replication > > Replicas provide asynchronous data replication between two or multiple > nodes in a cluster, thus minimizing data loss in case of failure. For all > organizations using local storage the Proxmox replication feature is a great > option to increase data redundancy for high I/Os avoiding the need of > complex shared or distributed storage configurations. > => https://pve.proxmox.com/wiki/Storage_Replication > > With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for > distributed storage. Packaging is now done by the Proxmox team. The Ceph > Luminous is not yet production ready but already available for testing. > If you use Ceph, follow the recommendations below. 
> > We also have a simplified procedure for disk import from different > hypervisors. You can now easily import disks from VMware, Hyper-V, or > other hypervisors via a new command line tool called ?qm importdisk?. > > Other new features are the live migration with local storage via QEMU, > added USB und Host PCI address visibility in the GUI, bulk actions and filtering > options in the GUI and an optimized NoVNC console. > > And as always we have included countless bugfixes and improvements on a > lot of places. > > Video > Watch our short introduction video - What's new in Proxmox VE 5.0? > https://www.proxmox.com/en/training/video-tutorials > > Release notes > https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 > > Download > https://www.proxmox.com/en/downloads > Alternate ISO download: > http://download.proxmox.com/iso/ > > Source Code > https://git.proxmox.com > > Bugtracker > https://bugzilla.proxmox.com > > FAQ > Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via apt? > A: Yes, upgrading from beta to stable can be done via apt. > https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_V > irtual_Environment_5.x_to_latest_5.0 > > Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? > A: Yes, see > https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch > > Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? > A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 > > Q: I am running Ceph Server on V4.4 in a production setup - should I upgrade > now? > A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest Ceph > Luminous release (release candidate status). Therefore not yet > recommended for production. But you should start testing in a testlab > environment, here is one important wiki article - > https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous > > Many thank you's to our active community for all feedback, testing, bug > reporting and patch submissions! > > -- > Best Regards, > > Martin Maurer > Proxmox VE project leader > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From moh at multihouse.dk Wed Jul 5 21:07:33 2017 From: moh at multihouse.dk (Martin Overgaard Hansen) Date: Wed, 5 Jul 2017 19:07:33 +0000 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: <1038450618.14942055.1499276293482.JavaMail.zimbra@oxygem.tv> References: <1038450618.14942055.1499276293482.JavaMail.zimbra@oxygem.tv> Message-ID: <4B11358D-5E3B-451A-A5F8-16C5111F658F@multihouse.dk> On 5 Jul 2017, at 19.38, Alexandre DERUMIER > wrote: Note that currently, dirty map can't be saved on disk (on vm shutdown for exemple), so if vm is shutdown, you need to do a full backup again. That should change in the near future, looking forward to a similar feature in proxmox. Thanks for the response. Best Regards, Martin Overgaard Hansen MultiHouse IT Partner A/S From dimitris.beletsiotis at gmail.com Wed Jul 5 23:04:37 2017 From: dimitris.beletsiotis at gmail.com (Dimitris Beletsiotis) Date: Thu, 6 Jul 2017 00:04:37 +0300 Subject: [PVE-User] Proxmox VE 5.0 released! In-Reply-To: References: Message-ID: <930a6076-613d-0d04-71fb-a09c88ae56d5@gmail.com> Just watched the video, great work! 
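For anyone wanting to try the disk import mentioned in the announcement quoted below: as far as I can tell the rough shape of the new command is

  qm importdisk <vmid> <source-image> <target-storage>

e.g. something like "qm importdisk 120 exported-disk.vmdk local-lvm" - the VM ID, file name and storage name are only placeholders here, so better check "man qm" on 5.0 for the exact options.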
Regards, Dimitris Beletsiotis On 04-Jul-17 18:25, Martin Maurer wrote: > Hi all, > > We are very happy to announce the final release of our Proxmox VE 5.0 > - based on the great Debian 9 codename "Stretch" and a Linux Kernel 4.10. > > New Proxmox VE Storage Replication > > Replicas provide asynchronous data replication between two or multiple > nodes in a cluster, thus minimizing data loss in case of failure. For > all organizations using local storage the Proxmox replication feature > is a great option to increase data redundancy for high I/Os avoiding > the need of complex shared or distributed storage configurations. > => https://pve.proxmox.com/wiki/Storage_Replication > > With Proxmox VE 5.0 Ceph RBD becomes the de-facto standard for > distributed storage. Packaging is now done by the Proxmox team. The > Ceph Luminous is not yet production ready but already available for > testing. If you use Ceph, follow the recommendations below. > > We also have a simplified procedure for disk import from different > hypervisors. You can now easily import disks from VMware, Hyper-V, or > other hypervisors via a new command line tool called ?qm importdisk?. > > Other new features are the live migration with local storage via QEMU, > added USB und Host PCI address visibility in the GUI, bulk actions and > filtering options in the GUI and an optimized NoVNC console. > > And as always we have included countless bugfixes and improvements on > a lot of places. > > Video > Watch our short introduction video - What's new in Proxmox VE 5.0? > https://www.proxmox.com/en/training/video-tutorials > > Release notes > https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_5.0 > > Download > https://www.proxmox.com/en/downloads > Alternate ISO download: > http://download.proxmox.com/iso/ > > Source Code > https://git.proxmox.com > > Bugtracker > https://bugzilla.proxmox.com > > FAQ > Q: Can I upgrade a 5.x beta installation to the stable 5.0 release via > apt? > A: Yes, upgrading from beta to stable can be done via apt. > https://pve.proxmox.com/wiki/Downloads#Update_a_running_Proxmox_Virtual_Environment_5.x_to_latest_5.0 > > Q: Can I install Proxmox VE 5.0 on top of Debian Stretch? > A: Yes, see > https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch > > Q: Can I upgrade Proxmox VE 4.4 to 5.0 with apt dist-upgrade? > A: Yes, see https://pve.proxmox.com/wiki/Upgrade_from_4.x_to_5.0 > > Q: I am running Ceph Server on V4.4 in a production setup - should I > upgrade now? > A: Not yet. Ceph packages in Proxmox VE 5.0 are based on the latest > Ceph Luminous release (release candidate status). Therefore not yet > recommended for production. But you should start testing in a testlab > environment, here is one important wiki article - > https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous > > Many thank you's to our active community for all feedback, testing, > bug reporting and patch submissions! > From mark at tuxis.nl Thu Jul 6 11:25:09 2017 From: mark at tuxis.nl (Mark Schouten) Date: Thu, 06 Jul 2017 11:25:09 +0200 Subject: [PVE-User] Conntrack on FORWARD-chain Message-ID: <1802398.K1LpG5bMBr@tuxis> Hi, We have a cluster with the firewall enabled on cluster- and host-level, not on VM-level. One of the VM's is a firewall which routes traffic for the other VM's. We ran into issues because the Proxmox firewall is looking at the FORWARD-chain, and dropping ctstate INVALID. That is causing issues, because it feels the routed traffic has state invalid. 
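For reference, the rule and its hit counters can be listed with plain iptables before removing anything (nothing PVE-specific here, and the rule position may of course differ per host):

  iptables -L PVEFW-FORWARD -n -v --line-numbers

If the conntrack tool (from the conntrack package) is installed, the tracked state of a flow can also be checked with something like "conntrack -L | grep <address of the routing VM>" - the address is obviously a placeholder.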
Everything starts working as soon as I do a `iptables -D PVEFW-FORWARD 1`. Am I misinterpreting stuff, doing something wrong, or is this something else? Thanks, -- Kerio Operator in de Cloud? https://www.kerioindecloud.nl/ Mark Schouten | Tuxis Internet Engineering KvK: 61527076 | http://www.tuxis.nl/ T: 0318 200208 | info at tuxis.nl From uwe.sauter.de at gmail.com Thu Jul 6 11:32:50 2017 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Thu, 6 Jul 2017 11:32:50 +0200 Subject: [PVE-User] Automatic migration before reboot / shutdown? Migration to host in same group? Message-ID: Hi all, 1) I was wondering how a PVE (4.4) cluster will behave when one of the nodes is restarted / shutdown either via WebGUI or via commandline. Will hosted, HA-managed VMs be migrated to other hosts before shutting down or will they be stopped (and restared on another host once HA recognizes them as gone)? 2) Currently I run a cluster of four nodes that share the same 2U chassis: +-----+-----+ | A | B | +-----+-----+ | C | D | +-----+-----+ (Please don't comment on whether this setup is ideal ? I'm aware of the risks a single chassis brings?) I created several HA groups: - left contains A & C - right contains B & D - upper contains A & B - lower contains C & D - all contains all nodes and configured VMs to run inside one of the groups. For updates I usually follow the following steps: - migrate VMs from node via "bulk migrate" feature, selecting one of the other nodes - when no more VMs run, do a "apt-get dist-upgrade" and reboot - repeat till all nodes are up-to-date One issue I ran into with this procedure is that sometimes while a VM is still migrated to another host, already migrated VMs are migrated back onto the current node because the target that was selected for "bulk migrate" was not inside the same group as the current host. Practical example: - VM 101 is configured to run on the left side of the cluster - VM 102 is configured to run on the lower level of the cluster - node C shall be updated - I select "bulk migrate" to node D - VM 101 is migrated to D - VM 102 is migrated to D, but takes some time (a lot of RAM) - HA recognizes that VM 101 is not running in the correct group and schedules a migration back to node C - migration of VM 102 finishes and migration of VM 101 back to node C immediatelly starts - once migration of VM 101 has finished I manually need to initate another migration (and after that need to be faster then HA to do a reboot) Would it be possible to implement another "bulk action" that will evacuate a host in a way that for every VM, the appropriate target node is selected, depending on HA group configuration? This might also temporarily disable that node in HA management for e.g. 10min or until next reboot so that maintenance work can be done? What do you think of that idea? Regards, Uwe From t.lamprecht at proxmox.com Thu Jul 6 14:15:00 2017 From: t.lamprecht at proxmox.com (Thomas Lamprecht) Date: Thu, 6 Jul 2017 14:15:00 +0200 Subject: [PVE-User] Automatic migration before reboot / shutdown? Migration to host in same group? In-Reply-To: References: Message-ID: <324b7a97-b637-b251-fe7b-2606df1db243@proxmox.com> Hi, On 07/06/2017 11:32 AM, Uwe Sauter wrote: > Hi all, > > 1) I was wondering how a PVE (4.4) cluster will behave when one of the nodes is restarted / shutdown either via WebGUI or via > commandline. Will hosted, HA-managed VMs be migrated to other hosts before shutting down or will they be stopped (and restared on > another host once HA recognizes them as gone)? 
First: on any graceful shutdown, which triggers stopping the pve-ha-lrm service, all HA managed services will be queued to stop (graceful shutdown with timeout). This is done to ensure consistency. If a HA service gets then recovered to another node, or "waits" until the current node comes up again depends if you triggered a shutdown or a reboot. On a shutdown the service will be recovered after the node is seen as "dead" (~2 minutes) but on a reboot we mark the service as freezed, so the ha stack does not touches it. The idea here is that if a user reboots the node without migrating away a service he expects that the node comes up again fast and starts the service on its own again. Now, we know that this may not always be ideal, especially on really big machines with hundreds of gigabyte of RAM and a slow as hell firmware, where a boot may need > 10 minutes. An idea is to allow the configuration of the behavior and add two additional behaviors, i.e. migrate away and relocate away. > 2) Currently I run a cluster of four nodes that share the same 2U chassis: > > +-----+-----+ > | A | B | > +-----+-----+ > | C | D | > +-----+-----+ > > (Please don't comment on whether this setup is ideal ? I'm aware of the risks a single chassis brings?) As long as nodes share continents your never save anyway :-P > I created several HA groups: > > - left contains A & C > - right contains B & D > - upper contains A & B > - lower contains C & D > - all contains all nodes > > and configured VMs to run inside one of the groups. > > For updates I usually follow the following steps: > - migrate VMs from node via "bulk migrate" feature, selecting one of the other nodes > - when no more VMs run, do a "apt-get dist-upgrade" and reboot > - repeat till all nodes are up-to-date > > One issue I ran into with this procedure is that sometimes while a VM is still migrated to another host, already migrated VMs are > migrated back onto the current node because the target that was selected for "bulk migrate" was not inside the same group as the > current host. This is expected, you told the ha-manager that a service should or can not run there, thus it tried to bring it in an "OK" state again. > Practical example: > - VM 101 is configured to run on the left side of the cluster > - VM 102 is configured to run on the lower level of the cluster > - node C shall be updated > - I select "bulk migrate" to node D > - VM 101 is migrated to D > - VM 102 is migrated to D, but takes some time (a lot of RAM) > - HA recognizes that VM 101 is not running in the correct group and schedules a migration back to node C > - migration of VM 102 finishes and migration of VM 101 back to node C immediatelly starts > - once migration of VM 101 has finished I manually need to initate another migration (and after that need to be faster then HA to > do a reboot) > > > Would it be possible to implement another "bulk action" that will evacuate a host in a way that for every VM, the appropriate > target node is selected, depending on HA group configuration? This might also temporarily disable that node in HA management for > e.g. 10min or until next reboot so that maintenance work can be done? > What do you think of that idea? > Quasi, a maintenance mode? I'm not opposed to it, but if such a thing would be done it would be only a light wrapper around already existing functionality. Can I ask if whats the reason for your group setup? I assume that all VMs may run on all nodes, but you want to "pin" some VMs to specific nodes for load reasons? 
If this is the case I'd suggest changing the group configuration. I.e. each node gets a group, A, B, C and D. Each group has the respective node with priority 2 and all others with priority 1. When doing an system upgrade on node A you would edit group A and set node A's priority to 0, now all should migrate away from this node, trying to balance the service count over all nodes. You do not need to trigger a bulk action, at least for the HA managed VMs. After all migrated execute the upgrade and reboot. Then reconfigure the Group A that node A has again the highest priority, i.e. 2, and the respective services migrate back to it again. This should be quite fast to do after the initial setup, you just need to open the group configuration dialog and lower/higher the priority of one node. You could also use a simmilar procedure on your current group configuration. The main thing what changes is that you need to edit two groups to make a node free. The advantage of mine method would be that the services get distributed on all other nodes not just moved to a single one. If anything is unclear or cannot apply to your situation, feel free to ask. cheers, Thomas PS: if not already read, please see also: From uwe.sauter.de at gmail.com Thu Jul 6 15:14:37 2017 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Thu, 6 Jul 2017 15:14:37 +0200 Subject: [PVE-User] Automatic migration before reboot / shutdown? Migration to host in same group? In-Reply-To: <324b7a97-b637-b251-fe7b-2606df1db243@proxmox.com> References: <324b7a97-b637-b251-fe7b-2606df1db243@proxmox.com> Message-ID: <9fa3634d-a702-4f4d-5597-2d1d9706c3bb@gmail.com> Hi Thomas, thank you for your insight. >> 1) I was wondering how a PVE (4.4) cluster will behave when one of the nodes is restarted / shutdown either via WebGUI or via >> commandline. Will hosted, HA-managed VMs be migrated to other hosts before shutting down or will they be stopped (and restared on >> another host once HA recognizes them as gone)? > > First: on any graceful shutdown, which triggers stopping the pve-ha-lrm service, > all HA managed services will be queued to stop (graceful shutdown with timeout). > This is done to ensure consistency. > > If a HA service gets then recovered to another node, or "waits" until the current > node comes up again depends if you triggered a shutdown or a reboot. > On a shutdown the service will be recovered after the node is seen as "dead" (~2 minutes) > but on a reboot we mark the service as freezed, so the ha stack does not touches it. > The idea here is that if a user reboots the node without migrating away a service he expects > that the node comes up again fast and starts the service on its own again. > Now, we know that this may not always be ideal, especially on really big machines > with hundreds of gigabyte of RAM and a slow as hell firmware, where a boot may need > 10 minutes. Understood. This is also kind of what I expected. What is still unclear to me is what you consider a "graceful" shutdown? Every action that stops pve-ha-lrm? > An idea is to allow the configuration of the behavior and add two additional behaviors, > i.e. migrate away and relocate away. What's the difference between migration and relocation? Temporary vs. permanent? >> 2) Currently I run a cluster of four nodes that share the same 2U chassis: >> >> +-----+-----+ >> | A | B | >> +-----+-----+ >> | C | D | >> +-----+-----+ >> >> (Please don't comment on whether this setup is ideal ? I'm aware of the risks a single chassis brings?) 
> As long as nodes share continents your never save anyway :-P True, but impossible to implement for approx. 99.999999% of all PVE users. And latencies will be a nightmare then, esp. with Ceph :D >> I created several HA groups: >> >> - left contains A & C >> - right contains B & D >> - upper contains A & B >> - lower contains C & D >> - all contains all nodes >> >> and configured VMs to run inside one of the groups. >> >> For updates I usually follow the following steps: >> - migrate VMs from node via "bulk migrate" feature, selecting one of the other nodes >> - when no more VMs run, do a "apt-get dist-upgrade" and reboot >> - repeat till all nodes are up-to-date >> >> One issue I ran into with this procedure is that sometimes while a VM is still migrated to another host, already migrated VMs are >> migrated back onto the current node because the target that was selected for "bulk migrate" was not inside the same group as the >> current host. > This is expected, you told the ha-manager that a service should or can not run there, > thus it tried to bring it in an "OK" state again. Yes, I was aware of the reasons why the VM was moved back, though it would make more sense to move it to another node in the same (allowed) group for the maintenance case I'm describing here. >> Practical example: >> - VM 101 is configured to run on the left side of the cluster >> - VM 102 is configured to run on the lower level of the cluster >> - node C shall be updated >> - I select "bulk migrate" to node D >> - VM 101 is migrated to D >> - VM 102 is migrated to D, but takes some time (a lot of RAM) >> - HA recognizes that VM 101 is not running in the correct group and schedules a migration back to node C >> - migration of VM 102 finishes and migration of VM 101 back to node C immediatelly starts >> - once migration of VM 101 has finished I manually need to initate another migration (and after that need to be faster then HA to >> do a reboot) >> >> >> Would it be possible to implement another "bulk action" that will evacuate a host in a way that for every VM, the appropriate >> target node is selected, depending on HA group configuration? This might also temporarily disable that node in HA management for >> e.g. 10min or until next reboot so that maintenance work can be done? >> What do you think of that idea? >> > > Quasi, a maintenance mode? I'm not opposed to it, but if such a thing would be done > it would be only a light wrapper around already existing functionality. Absolutely. Just another action that would evacuate the current host as optimal as possible. All VMs that are constrained to a specific node group should be migrated within that group, all other VMs should be migrated to any node available (possible doing some load balancing inside the cluster). > Can I ask if whats the reason for your group setup? > I assume that all VMs may run on all nodes, but you want to "pin" some VMs to specific nodes for load reasons? We started to build a cluster out of just one chassis with four nodes. In the next few weeks I will add additional nodes that possibly be located in another building. Those nodes will be grouped similarily and there will be additional groups that include subsets of nodes from each building. The reason behind my group setup is that I have two projects which have several services that are running on two VMs each (for redundency and load balancing, e.g. LDAP). 
A configuration where one LDAP is running "left" and the other is running "right" eliminates the risk that both VMs run on the same node (and have a disruption of service if that particular node fails). So for the first project I distribute all important VMs between "left" and right" and the other project's important VMs are distrbuted between "upper" and "lower". This ensures that for both projects, important services are not interrupted if *one* node fails. All less-important VMs are allowed to run on all nodes. If there are valid concerns against this reasoning, I'm open to suggestions for improvement. > If this is the case I'd suggest changing the group configuration. > I.e. each node gets a group, A, B, C and D. Each group has the respective node with priority 2 and all others with priority 1. > When doing an system upgrade on node A you would edit group A and set node A's priority to 0, > now all should migrate away from this node, trying to balance the service count over all nodes. > You do not need to trigger a bulk action, at least for the HA managed VMs. > > After all migrated execute the upgrade and reboot. > Then reconfigure the Group A that node A has again the highest priority, > i.e. 2, and the respective services migrate back to it again. > > This should be quite fast to do after the initial setup, you just need to open the group configuration > dialog and lower/higher the priority of one node. > > You could also use a simmilar procedure on your current group configuration. > The main thing what changes is that you need to edit two groups to make a node free. > The advantage of mine method would be that the services get distributed on all other nodes not just moved to a single one. Interesting idea. Didn't have a look at priorities yet. Request for improvement: In "datacenter -> HA -> groups" show the configured priority, e.g. in a format "nodename(priority)[,nodename(priority)]" Regards, Uwe > If anything is unclear or cannot apply to your situation, feel free to ask. > > cheers, > Thomas > > PS: if not already read, please see also: > > > From t.lamprecht at proxmox.com Thu Jul 6 16:03:03 2017 From: t.lamprecht at proxmox.com (Thomas Lamprecht) Date: Thu, 6 Jul 2017 16:03:03 +0200 Subject: [PVE-User] Automatic migration before reboot / shutdown? Migration to host in same group? In-Reply-To: <9fa3634d-a702-4f4d-5597-2d1d9706c3bb@gmail.com> References: <324b7a97-b637-b251-fe7b-2606df1db243@proxmox.com> <9fa3634d-a702-4f4d-5597-2d1d9706c3bb@gmail.com> Message-ID: <7ec69140-85d9-bd63-b6c5-d77da9e46b71@proxmox.com> Hi, On 07/06/2017 03:14 PM, Uwe Sauter wrote: > Hi Thomas, > > thank you for your insight. > > >>> 1) I was wondering how a PVE (4.4) cluster will behave when one of the nodes is restarted / shutdown either via WebGUI or via >>> commandline. Will hosted, HA-managed VMs be migrated to other hosts before shutting down or will they be stopped (and restared on >>> another host once HA recognizes them as gone)? >> First: on any graceful shutdown, which triggers stopping the pve-ha-lrm service, >> all HA managed services will be queued to stop (graceful shutdown with timeout). >> This is done to ensure consistency. >> >> If a HA service gets then recovered to another node, or "waits" until the current >> node comes up again depends if you triggered a shutdown or a reboot. >> On a shutdown the service will be recovered after the node is seen as "dead" (~2 minutes) >> but on a reboot we mark the service as freezed, so the ha stack does not touches it. 
>> The idea here is that if a user reboots the node without migrating away a service he expects >> that the node comes up again fast and starts the service on its own again. >> Now, we know that this may not always be ideal, especially on really big machines >> with hundreds of gigabyte of RAM and a slow as hell firmware, where a boot may need > 10 minutes. > Understood. This is also kind of what I expected. > > What is still unclear to me is what you consider a "graceful" shutdown? Every action that stops pve-ha-lrm? No, not every action which stops the pve-ha-lrm. If it gets a stop request by anyone we check if a shutdown or reboot is in progress, if so we know that we have to stop/shutdown the services. If no shutdown or reboot is in progress we just freeze the services and to not touch them, this is done as the only case where this happens is the one where an user manually triggers an stop via: # systemctl stop pve-ha-lrm or # systemctl restart pve-ha-lrm in both cases stopping running services is probably unwanted, we expect that the user knows why he does this. One reason could be to shutdown the LRM watchdog connection as quorum loss is expected in the next minutes. >> An idea is to allow the configuration of the behavior and add two additional behaviors, >> i.e. migrate away and relocate away. > What's the difference between migration and relocation? Temporary vs. permanent? Migration does an online migration if possible (=on VMs) and the services is already running. Relocation *always* stops the service if it runs and only then migrates it. If it then gets started on the other side again depends on the request state. The latter one may be useful on really big VMs where short down time can be accepted and online migration would need far to long or cause congestion on the network. >>> 2) Currently I run a cluster of four nodes that share the same 2U chassis: >>> >>> +-----+-----+ >>> | A | B | >>> +-----+-----+ >>> | C | D | >>> +-----+-----+ >>> >>> (Please don't comment on whether this setup is ideal ? I'm aware of the risks a single chassis brings?) >> As long as nodes share continents your never save anyway :-P > True, but impossible to implement for approx. 99.999999% of all PVE users. And latencies will be a nightmare then, esp. with Ceph :D Haha, yeah, would be quite a nightmare, if you haven't your own sea cable connection :D >>> I created several HA groups: >>> >>> - left contains A & C >>> - right contains B & D >>> - upper contains A & B >>> - lower contains C & D >>> - all contains all nodes >>> >>> and configured VMs to run inside one of the groups. >>> >>> For updates I usually follow the following steps: >>> - migrate VMs from node via "bulk migrate" feature, selecting one of the other nodes >>> - when no more VMs run, do a "apt-get dist-upgrade" and reboot >>> - repeat till all nodes are up-to-date >>> >>> One issue I ran into with this procedure is that sometimes while a VM is still migrated to another host, already migrated VMs are >>> migrated back onto the current node because the target that was selected for "bulk migrate" was not inside the same group as the >>> current host. >> This is expected, you told the ha-manager that a service should or can not run there, >> thus it tried to bring it in an "OK" state again. > Yes, I was aware of the reasons why the VM was moved back, though it would make more sense to move it to another node in the same > (allowed) group for the maintenance case I'm describing here. 
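To make the migrate/relocate distinction Thomas describes above concrete, both operations are also exposed on the command line of the HA stack; with a placeholder service ID and target node:

    # ha-manager migrate vm:101 nodeB      (online migration if the guest supports it)
    # ha-manager relocate vm:101 nodeB     (stop the service, move it, start it again on the target)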
> >>> Practical example: >>> - VM 101 is configured to run on the left side of the cluster >>> - VM 102 is configured to run on the lower level of the cluster >>> - node C shall be updated >>> - I select "bulk migrate" to node D >>> - VM 101 is migrated to D >>> - VM 102 is migrated to D, but takes some time (a lot of RAM) >>> - HA recognizes that VM 101 is not running in the correct group and schedules a migration back to node C >>> - migration of VM 102 finishes and migration of VM 101 back to node C immediatelly starts >>> - once migration of VM 101 has finished I manually need to initate another migration (and after that need to be faster then HA to >>> do a reboot) >>> >>> >>> Would it be possible to implement another "bulk action" that will evacuate a host in a way that for every VM, the appropriate >>> target node is selected, depending on HA group configuration? This might also temporarily disable that node in HA management for >>> e.g. 10min or until next reboot so that maintenance work can be done? >>> What do you think of that idea? >>> >> Quasi, a maintenance mode? I'm not opposed to it, but if such a thing would be done >> it would be only a light wrapper around already existing functionality. > Absolutely. Just another action that would evacuate the current host as optimal as possible. All VMs that are constrained to a > specific node group should be migrated within that group, all other VMs should be migrated to any node available (possible doing > some load balancing inside the cluster). I'll look again in this, if I get an idea how to incorporate this without breaking edge cases I can give it a shot, no promise yet, though, sorry :) >> Can I ask if whats the reason for your group setup? >> I assume that all VMs may run on all nodes, but you want to "pin" some VMs to specific nodes for load reasons? > We started to build a cluster out of just one chassis with four nodes. In the next few weeks I will add additional nodes that > possibly be located in another building. Those nodes will be grouped similarily and there will be additional groups that include > subsets of nodes from each building. > > The reason behind my group setup is that I have two projects which have several services that are running on two VMs each (for > redundency and load balancing, e.g. LDAP). A configuration where one LDAP is running "left" and the other is running "right" > eliminates the risk that both VMs run on the same node (and have a disruption of service if that particular node fails). > So for the first project I distribute all important VMs between "left" and right" and the other project's important VMs are > distrbuted between "upper" and "lower". This ensures that for both projects, important services are not interrupted if *one* node > fails. > All less-important VMs are allowed to run on all nodes. > > If there are valid concerns against this reasoning, I'm open to suggestions for improvement. Sounds OK, I have to think about it if I can propose a better fitting solution regarding our HA stack. An idea was to add simple dependencies, i.e. this group/service should not run on the same node as the other group/services. Not sure if this is quite specialism or more people would profit from it... >> If this is the case I'd suggest changing the group configuration. >> I.e. each node gets a group, A, B, C and D. Each group has the respective node with priority 2 and all others with priority 1. 
>> When doing an system upgrade on node A you would edit group A and set node A's priority to 0, >> now all should migrate away from this node, trying to balance the service count over all nodes. >> You do not need to trigger a bulk action, at least for the HA managed VMs. >> >> After all migrated execute the upgrade and reboot. >> Then reconfigure the Group A that node A has again the highest priority, >> i.e. 2, and the respective services migrate back to it again. >> >> This should be quite fast to do after the initial setup, you just need to open the group configuration >> dialog and lower/higher the priority of one node. >> >> You could also use a simmilar procedure on your current group configuration. >> The main thing what changes is that you need to edit two groups to make a node free. >> The advantage of mine method would be that the services get distributed on all other nodes not just moved to a single one. > Interesting idea. Didn't have a look at priorities yet. > > Request for improvement: In "datacenter -> HA -> groups" show the configured priority, e.g. in a format > "nodename(priority)[,nodename(priority)]" Hmm, this should already be the case, except if the default priority is set. I added this when I reworked the HA group editor sometimes in 4.3. cheers, Thomas From uwe.sauter.de at gmail.com Thu Jul 6 16:14:28 2017 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Thu, 6 Jul 2017 16:14:28 +0200 Subject: [PVE-User] Automatic migration before reboot / shutdown? Migration to host in same group? In-Reply-To: <7ec69140-85d9-bd63-b6c5-d77da9e46b71@proxmox.com> References: <324b7a97-b637-b251-fe7b-2606df1db243@proxmox.com> <9fa3634d-a702-4f4d-5597-2d1d9706c3bb@gmail.com> <7ec69140-85d9-bd63-b6c5-d77da9e46b71@proxmox.com> Message-ID: <0b090eaf-fb32-2828-4c0b-880b1bd65d51@gmail.com> Thomas, >>> An idea is to allow the configuration of the behavior and add two additional behaviors, >>> i.e. migrate away and relocate away. >> What's the difference between migration and relocation? Temporary vs. permanent? > > Migration does an online migration if possible (=on VMs) and the services is already running. > Relocation *always* stops the service if it runs and only then migrates it. > If it then gets started on the other side again depends on the request state. > > The latter one may be useful on really big VMs where short down time can be accepted > and online migration would need far to long or cause congestion on the network. Ah, ok. So relocation is migration without "online" checked (UI) or "qm migrate" without "--online"? >>>> (Please don't comment on whether this setup is ideal ? I'm aware of the risks a single chassis brings?) >>> As long as nodes share continents your never save anyway :-P >> True, but impossible to implement for approx. 99.999999% of all PVE users. And latencies will be a nightmare then, esp. with >> Ceph :D > > Haha, yeah, would be quite a nightmare, if you haven't your own sea cable connection :D Even then the latency adds up by approx. 1 millisecond per 100km? >>> Quasi, a maintenance mode? I'm not opposed to it, but if such a thing would be done >>> it would be only a light wrapper around already existing functionality. >> Absolutely. Just another action that would evacuate the current host as optimal as possible. All VMs that are constrained to a >> specific node group should be migrated within that group, all other VMs should be migrated to any node available (possible doing >> some load balancing inside the cluster). 
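Until such a maintenance/evacuate action exists, the priority-based drain quoted above can be driven by hand from the shell; a rough sketch, assuming the per-node groups sketched earlier (names are placeholders, option spelling per man ha-manager):

    # ha-manager groupset groupA --nodes "nodeA:0,nodeB:1,nodeC:1,nodeD:1"
    (wait until "ha-manager status" shows no more services on nodeA, then)
    # apt-get dist-upgrade && reboot
    (once nodeA is back and quorate, restore its priority so the services move home)
    # ha-manager groupset groupA --nodes "nodeA:2,nodeB:1,nodeC:1,nodeD:1"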
> > I'll look again in this, if I get an idea how to incorporate this without breaking edge cases I can give it a shot, > no promise yet, though, sorry :) >> If there are valid concerns against this reasoning, I'm open to suggestions for improvement. > > Sounds OK, I have to think about it if I can propose a better fitting solution regarding our HA stack. > An idea was to add simple dependencies, i.e. this group/service should > not run on the same node as the other group/services. Not sure if this is quite specialism or more people would profit from it... I'd like to hear you suggestions if you find the time? >> Interesting idea. Didn't have a look at priorities yet. >> >> Request for improvement: In "datacenter -> HA -> groups" show the configured priority, e.g. in a format >> "nodename(priority)[,nodename(priority)]" > > Hmm, this should already be the case, except if the default priority is set. > I added this when I reworked the HA group editor sometimes in 4.3. Well, I didn't play with priorities yet so it might be that it just didn't show up in my case. Thank you, Uwe From tonci at suma-informatika.hr Fri Jul 7 07:35:42 2017 From: tonci at suma-informatika.hr (=?UTF-8?B?VG9uxI1pIFN0aXBpxI1ldmnEhw==?=) Date: Fri, 7 Jul 2017 07:35:42 +0200 Subject: [PVE-User] prox storage replication <> iscsi multipath problem Message-ID: <71c6acd8-8250-0b37-20c3-37f177ab4e8f@suma-informatika.hr> Hi to all, I'm testing pvesr and it works correct so far and is big step ahead regarding migration/replication w/o shared storage. Actually that is something I was really waiting for , because it is easier to find neighborhood with two (only) server than two servers with shared storage. This is the way that one host can really back the other one up (since sync frequency is fine-tunable) But I do have problem with some kind of collisions. My test lab has 3 hosts and one freenas shared storage. The connection in between is iscsi-target-multipath , so each node (incl freenas as shared storage -> lvm) has 3 nics . In order to test and play with pvesr I created zfspool on each host using local hard drive (1 drive -> one zfs volume ... no redundancy etc) and storage replication was working fine . But after a while iscsi-multipath connections are still on but my shared lvm iscsi freenas storage disappears . The only way I was able to got it back was deleting all pvesr jobs and destroying zfs pools on each node. I repeated this scenario more time times but the result was the same I'm aware that this scenario ( shared storage and storage replication ) is not kind a usual (but it should be possible) but I'm still wondering why this pvesr killed my freenas-iscsi target ? Thank you in advance Best regards Tonci -- / / /srda?an pozdrav / best regards/ Ton?i Stipi?evi?, dipl. ing. Elektr. /direktor / manager/** ** d.o.o. ltd. *podr?ka / upravljanje **IT*/ sustavima za male i srednje tvrtke/ /Small & Medium Business /*IT*//*support / management* Badali?eva 27 / 10000 Zagreb / Hrvatska ? Croatia url: www.suma-informatika.hr mob: +385 91 1234003 fax: +385 1 5560007 From w.link at proxmox.com Fri Jul 7 07:57:05 2017 From: w.link at proxmox.com (Wolfgang Link) Date: Fri, 7 Jul 2017 07:57:05 +0200 Subject: [PVE-User] prox storage replication <> iscsi multipath problem In-Reply-To: <71c6acd8-8250-0b37-20c3-37f177ab4e8f@suma-informatika.hr> References: <71c6acd8-8250-0b37-20c3-37f177ab4e8f@suma-informatika.hr> Message-ID: <26281f11-c87b-1f17-0204-3eff12354810@proxmox.com> Hi Tonci, I guess it is the network traffic. 
You should limited the replica speed or use a separate Network. On 07/07/2017 07:35 AM, Ton?i Stipi?evi? wrote: > Hi to all, > > I'm testing pvesr and it works correct so far and is big step ahead > regarding migration/replication w/o shared storage. Actually that is > something I was really waiting for , because it is easier to find > neighborhood with two (only) server than two servers with shared > storage. This is the way that one host can really back the other one up > (since sync frequency is fine-tunable) > > But I do have problem with some kind of collisions. My test lab has > 3 hosts and one freenas shared storage. The connection in between is > iscsi-target-multipath , so each node (incl freenas as shared storage -> > lvm) has 3 nics . In order to test and play with pvesr I created zfspool > on each host using local hard drive (1 drive -> one zfs volume ... no > redundancy etc) and storage replication was working fine . But after a > while iscsi-multipath connections are still on but my shared lvm iscsi > freenas storage disappears . The only way I was able to got it back was > deleting all pvesr jobs and destroying zfs pools on each node. > > I repeated this scenario more time times but the result was the same > > > I'm aware that this scenario ( shared storage and storage replication ) > is not kind a usual (but it should be possible) but I'm still wondering > why this pvesr killed my freenas-iscsi target ? > > > Thank you in advance > > Best regards > Tonci From aderumier at odiso.com Fri Jul 7 10:46:33 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Fri, 7 Jul 2017 10:46:33 +0200 (CEST) Subject: [PVE-User] corosync unicast : does somebody use it in production with 10-16 nodes ? Message-ID: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> Hi, I'm looking to remove multicast from my network (Don't have too much time to explain, but we have multicast storm problem,because of igmp snooping bug) Does somebody running it with "big" clusters ? (10-16 nodes) I'm currently testing it with 9 nodes (1200vm+containers), I'm seeing around 3mbit/s of traffic on each node, and I don't have any cluster break for now. (Switch have recents asics with around 0,015ms latency). Any return of experience is welcome :) Thanks ! Alexandre From aderumier at odiso.com Fri Jul 7 11:51:42 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Fri, 7 Jul 2017 11:51:42 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> Message-ID: <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> note that I'm just seeing , time to time (around once by hour), pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected But I don't have any corosync error / retransmit. ----- Mail original ----- De: "aderumier" ?: "pve-devel" , "proxmoxve" Envoy?: Vendredi 7 Juillet 2017 10:46:33 Objet: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? Hi, I'm looking to remove multicast from my network (Don't have too much time to explain, but we have multicast storm problem,because of igmp snooping bug) Does somebody running it with "big" clusters ? (10-16 nodes) I'm currently testing it with 9 nodes (1200vm+containers), I'm seeing around 3mbit/s of traffic on each node, and I don't have any cluster break for now. 
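For anyone wanting to try the same: with corosync 2.x, unicast essentially means setting the UDPU transport in the totem section of /etc/pve/corosync.conf and bumping config_version; a sketch where everything except the transport line is a placeholder:

    totem {
      cluster_name: mycluster
      config_version: 5
      version: 2
      secauth: on
      transport: udpu
      interface {
        bindnetaddr: 10.0.0.0
        ringnumber: 0
      }
    }

Changing the transport is not picked up on the fly, so corosync has to be restarted on the nodes afterwards.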
(Switch have recents asics with around 0,015ms latency). Any return of experience is welcome :) Thanks ! Alexandre _______________________________________________ pve-devel mailing list pve-devel at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel From tonci at suma-informatika.hr Fri Jul 7 14:35:47 2017 From: tonci at suma-informatika.hr (=?UTF-8?B?VG9uxI1pIFN0aXBpxI1ldmnEhw==?=) Date: Fri, 7 Jul 2017 14:35:47 +0200 Subject: [PVE-User] prox storage replication <> iscsi multipath problem In-Reply-To: References: Message-ID: Hi Wolfgang Thank you for your response . Everything is VLAN-separated ... all three multipath links have its own subnets and the link between zfs local storages uses its own VLAN-separated link (actually vmbr1 -> intranet link ) Any ideas ? :) -- / / /srda?an pozdrav / best regards/ Ton?i Stipi?evi?, dipl. ing. Elektr. /direktor / manager/** ** d.o.o. ltd. *podr?ka / upravljanje **IT*/ sustavima za male i srednje tvrtke/ /Small & Medium Business /*IT*//*support / management* Badali?eva 27 / 10000 Zagreb / Hrvatska ? Croatia url: www.suma-informatika.hr mob: +385 91 1234003 fax: +385 1 5560007 On 07/07/2017 12:00, pve-user-request at pve.proxmox.com wrote: > Send pve-user mailing list submissions to > pve-user at pve.proxmox.com > > To subscribe or unsubscribe via the World Wide Web, visit > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > or, via email, send a message with subject or body 'help' to > pve-user-request at pve.proxmox.com > > You can reach the person managing the list at > pve-user-owner at pve.proxmox.com > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of pve-user digest..." > > > Today's Topics: > > 1. prox storage replication <> iscsi multipath problem > (Ton?i Stipi?evi?) > 2. Re: prox storage replication <> iscsi multipath problem > (Wolfgang Link) > 3. corosync unicast : does somebody use it in production with > 10-16 nodes ? (Alexandre DERUMIER) > 4. Re: [pve-devel] corosync unicast : does somebody use it in > production with 10-16 nodes ? (Alexandre DERUMIER) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Fri, 7 Jul 2017 07:35:42 +0200 > From: Ton?i Stipi?evi? > To: pve-user at pve.proxmox.com > Subject: [PVE-User] prox storage replication <> iscsi multipath > problem > Message-ID: <71c6acd8-8250-0b37-20c3-37f177ab4e8f at suma-informatika.hr> > Content-Type: text/plain; charset=utf-8; format=flowed > > Hi to all, > > I'm testing pvesr and it works correct so far and is big step ahead > regarding migration/replication w/o shared storage. Actually that is > something I was really waiting for , because it is easier to find > neighborhood with two (only) server than two servers with shared > storage. This is the way that one host can really back the other one up > (since sync frequency is fine-tunable) > > But I do have problem with some kind of collisions. My test lab has > 3 hosts and one freenas shared storage. The connection in between is > iscsi-target-multipath , so each node (incl freenas as shared storage -> > lvm) has 3 nics . In order to test and play with pvesr I created zfspool > on each host using local hard drive (1 drive -> one zfs volume ... no > redundancy etc) and storage replication was working fine . But after a > while iscsi-multipath connections are still on but my shared lvm iscsi > freenas storage disappears . 
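Coming back to Wolfgang's suggestion above to limit the replica speed: the storage replication jobs themselves accept a bandwidth limit, so something along these lines should work (job ID and value are examples; please verify the option name against man pvesr, this is written from memory):

    # pvesr update 100-0 --rate 10                                    (limit an existing job to about 10 MB/s)
    # pvesr create-local-job 100-0 othernode --schedule "*/15" --rate 10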
The only way I was able to got it back was > deleting all pvesr jobs and destroying zfs pools on each node. > > I repeated this scenario more time times but the result was the same > > > I'm aware that this scenario ( shared storage and storage replication ) > is not kind a usual (but it should be possible) but I'm still wondering > why this pvesr killed my freenas-iscsi target ? > > > Thank you in advance > > Best regards > Tonci From dietmar at proxmox.com Fri Jul 7 17:17:18 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Fri, 7 Jul 2017 17:17:18 +0200 (CEST) Subject: [PVE-User] prox storage replication <> iscsi multipath problem In-Reply-To: References: Message-ID: <331998136.49.1499440638536@webmail.proxmox.com> > Everything is VLAN-separated ... all three multipath links have its own > subnets and the link between zfs local storages uses its own > VLAN-separated link (actually vmbr1 -> intranet link ) Usually VLAN separation does not help to prevent network overload. Or do you have some special switches which can guarantee minimum transfer rates? Besides, I cannot see why replication (ssh/zfs) can disturb an iscsi connection. What error do you get exactly on the iscsi connection? From nick-liste at posteo.eu Fri Jul 7 17:38:24 2017 From: nick-liste at posteo.eu (Nicola Ferrari (#554252)) Date: Fri, 7 Jul 2017 17:38:24 +0200 Subject: [PVE-User] qm migrate: strange output In-Reply-To: <5239afd5-f45e-9621-ec06-8b1978d842a3@gmail.com> References: <5239afd5-f45e-9621-ec06-8b1978d842a3@gmail.com> Message-ID: On 20/06/2017 18:19, Uwe Sauter wrote: > > Can someone explain under which circumstances this output is displayed instead of just the short message that migration > was started? I can't answer actually, but as far as I can remember, I've always seen the "long" version of the output, also in the qemu monitor in the web interface.. I've never seen a "qm migrate" one line output in my config.. That sounds quite strange :) Could you please post your pveversion -v output? I may compare it with mine.. Bye! Nick -- +---------------------+ | Linux User #554252 | +---------------------+ From murphy.lawson at outlook.com Fri Jul 7 17:46:48 2017 From: murphy.lawson at outlook.com (Murphy Lawson) Date: Fri, 7 Jul 2017 15:46:48 +0000 Subject: [PVE-User] smartd - Bad IEC (SMART) mode page Message-ID: Hi Everyone, A colleague has recently deployed 3 servers all running PVE (Virtual Environment 4.4-1) and it has been reported that they are all reporting SMART errors but only for one drive in an LSI/Avago RAID array. When running a single scan with 'smartd -q onecheck' the following error is returned: Device: /dev/bus/0 [megaraid_disk_05], [SEAGATE ST2000NM0045 N002], lu id: 0x4f6a1e4f8c6b, S/N: AB123412341234, 2.00 TB Device: /dev/bus/0 [megaraid_disk_05], Bad IEC (SMART) mode page, err=-5, skip device Unable to register SCSI device /dev/bus/0 [megaraid_disk_05] at line 21 of file /etc/smartd.conf In this instance the server had been up for just under an hour: 15:07:00 up 54 min, 1 user, load average: 0.09, 0.06, 0.01 When the error is seen, it is no longer possible to control the SMART status on the reported drive and no other details are available (e.g. 
temperature): # smartctl -s off /dev/bus/0 -d megaraid,5 smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF ENABLE/DISABLE COMMANDS SECTION === unable to fetch IEC (SMART) mode page [Input/output error] Onlythe above error when queried is consistent, the error logged when smartd is running as a daemon varies on each server: smartd[1649]: Device: /dev/bus/0 [megaraid_disk_05], failed to read Temperature smartd[2050]: Device: /dev/bus/0 [megaraid_disk_04], Read SMART Self-Test Log Failed smartd[1422]: Device: /dev/bus/0 [megaraid_disk_04], failed to read SMART values The error only seems to occur after the server has been running for a few while and the only way I found to clear it is to reboot or power down the servers. The drives are using the 'megaraid_sas' kernel module which I've tried to reload but as the drives are active, this is not possible. Further hardware details below: scsi host0: Avago SAS based MegaRAID driver 81:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02) Kernel version is: Linux SERVERNAME 4.4.35-1-pve #1 SMP Fri Dec 9 11:09:55 CET 2016 x86_64 GNU/Linux I've checked out the smartmontools forum and there doesn't appear to have been an recent report of this issue. A few other similar reports from Redhat but this was before these controllers were supported by the kernel. Has anyone else come across this issue? I've changed the RAID controller to not power down any spare/unused drives in case this is what's occuring but in case this doesn't resolve it, any other advice would be appreciated. Thanks in advance Murphy From uwe.sauter.de at gmail.com Fri Jul 7 18:00:52 2017 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Fri, 7 Jul 2017 18:00:52 +0200 Subject: [PVE-User] qm migrate: strange output In-Reply-To: References: <5239afd5-f45e-9621-ec06-8b1978d842a3@gmail.com> Message-ID: Here you go (but beware that there have been updates since I posted): # pveversion -v proxmox-ve: 4.4-92 (running kernel: 4.4.67-1-pve) pve-manager: 4.4-15 (running version: 4.4-15/7599e35a) pve-kernel-4.4.35-1-pve: 4.4.35-77 pve-kernel-4.4.59-1-pve: 4.4.59-87 pve-kernel-4.4.67-1-pve: 4.4.67-92 pve-kernel-4.4.62-1-pve: 4.4.62-88 lvm2: 2.02.116-pve3 corosync-pve: 2.4.2-2~pve4+1 libqb0: 1.0.1-1 pve-cluster: 4.0-52 qemu-server: 4.0-110 pve-firmware: 1.1-11 libpve-common-perl: 4.0-95 libpve-access-control: 4.0-23 libpve-storage-perl: 4.0-76 pve-libspice-server1: 0.12.8-2 vncterm: 1.3-2 pve-docs: 4.4-4 pve-qemu-kvm: 2.7.1-4 pve-container: 1.0-100 pve-firewall: 2.0-33 pve-ha-manager: 1.0-41 ksm-control-daemon: 1.2-1 glusterfs-client: 3.5.2-2+deb8u3 lxc-pve: 2.0.7-4 lxcfs: 2.0.6-pve1 criu: 1.6.0-1 novnc-pve: 0.5-9 smartmontools: 6.5+svn4324-1~pve80 zfsutils: 0.6.5.9-pve15~bpo80 ceph: 10.2.7-1~bpo80+1 Am 07.07.2017 um 17:38 schrieb Nicola Ferrari (#554252): > On 20/06/2017 18:19, Uwe Sauter wrote: >> >> Can someone explain under which circumstances this output is displayed instead of just the short message that migration >> was started? > > I can't answer actually, but as far as I can remember, I've always seen > the "long" version of the output, also in the qemu monitor in the web > interface.. I've never seen a "qm migrate" one line output in my > config.. That sounds quite strange :) > > Could you please post your pveversion -v output? I may compare it with > mine.. > > Bye! 
> Nick > From tonci at suma-informatika.hr Sat Jul 8 00:49:17 2017 From: tonci at suma-informatika.hr (=?UTF-8?B?VG9uxI1pIFN0aXBpxI1ldmnEhw==?=) Date: Sat, 8 Jul 2017 00:49:17 +0200 Subject: [PVE-User] prox storage replication <> iscsi multipath problem In-Reply-To: <331998136.49.1499440638536@webmail.proxmox.com> References: <331998136.49.1499440638536@webmail.proxmox.com> Message-ID: <06c35e44-9e35-8346-f02a-253f884b313f@suma-informatika.hr> no sorry my mistake , zfs interconnection goes through other switch , no VLAns, all hosts (vmbr's) just plugged into the hpe switch (8port) iscsi multipath goes through 3 separate nics and through 3 separate vlans on hp procurve 1720 and when there is no zfs pools I get fantastic results with multipath (alua) : root at pvesuma01:~# multipath -ll FNAS04 (36589cfc0000004081c5d751435a19ea7) dm-3 FreeNAS,iSCSI Disk size=3.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw `-+- policy='round-robin 0' prio=50 status=active |- 8:0:0:0 sdd 8:48 active ready running |- 7:0:0:0 sde 8:64 active ready running `- 9:0:0:0 sdf 8:80 active ready running root at pvesuma01:~# iscsiadm --mode session tcp: [1] 10.1.10.4:3260,1 iqn.2005-fn4.org.freenas.ctl:target1 (non-flash) tcp: [2] 10.3.10.4:3260,3 iqn.2005-fn4.org.freenas.ctl:target1 (non-flash) tcp: [3] 10.2.10.4:3260,2 iqn.2005-fn4.org.freenas.ctl:target1 (non-flash) When I restore 3 VMs from all 3 hosts in the same time , from the Freenas shared nfs storage onto this iscsi multipath volume this multipath link gets saturated up to 80% and all 3 links are equally occupied (800Mbs each) . So 200Euros PC (1 quadports nic , 5 sata drives on the jbod controller - because of zfs) is receiving average data stream of 2,4Gbs etc ... So this switch would not be bottle-neck Now I have additional and more accurate feedback : 1. No VM on iscsi target is running so there is no switch load at all , zpools are created and iscsilvm target is still online and visible 2. 
I clone one VM to the zpool and everything is still ok root at pvesuma01:~# multipath -ll FNAS04 (36589cfc0000004081c5d751435a19ea7) dm-3 FreeNAS,iSCSI Disk size=3.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw `-+- policy='round-robin 0' prio=50 status=active |- 7:0:0:0 sdd 8:48 active ready running |- 8:0:0:0 sde 8:64 active ready running `- 9:0:0:0 sdf 8:80 active ready running root at pvesuma01:~# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk ??sda1 8:1 0 1.8T 0 part sdb 8:16 1 465.8G 0 disk ??sdb1 8:17 1 1007K 0 part ??sdb2 8:18 1 127M 0 part ??sdb3 8:19 1 465.7G 0 part ??pve-swap 253:0 0 15G 0 lvm [SWAP] ??pve-root 253:1 0 96G 0 lvm / ??pve-data 253:2 0 338.7G 0 lvm /var/lib/vz sdc 8:32 1 465.8G 0 disk ??sdc1 8:33 1 465.8G 0 part ??sdc9 8:41 1 8M 0 part sdd 8:48 0 3T 0 disk ??FNAS04 253:3 0 3T 0 mpath ??vg4-vm--9999--disk--1 253:4 0 1G 0 lvm ??vg4-vm--9998--disk--1 253:5 0 10G 0 lvm ??vg4-vm--9997--disk--1 253:6 0 32G 0 lvm ??vg4-vm--8002--disk--1 253:7 0 32G 0 lvm ??vg4-vm--9995--disk--1 253:8 0 10G 0 lvm ??vg4-vm--8001--disk--1 253:9 0 32G 0 lvm ??vg4-vm--9994--disk--1 253:10 0 15G 0 lvm ??vg4-vm--9993--disk--1 253:11 0 32G 0 lvm ??vg4-vm--9996--disk--1 253:12 0 25G 0 lvm ??vg4-vm--9996--disk--2 253:13 0 32G 0 lvm ??vg4-vm--9996--disk--3 253:14 0 5G 0 lvm ??vg4-vm--9991--disk--1 253:15 0 32G 0 lvm ??vg4-vm--9990--disk--1 253:16 0 32G 0 lvm ??vg4-vm--9989--disk--1 253:17 0 32G 0 lvm ??vg4-vm--9988--disk--1 253:18 0 64G 0 lvm ??vg4-vm--9987--disk--1 253:19 0 42G 0 lvm ??vg4-vm--9990--disk--2 253:20 0 32G 0 lvm ??vg4-vm--6001--disk--1 253:21 0 15G 0 lvm sde 8:64 0 3T 0 disk ??FNAS04 253:3 0 3T 0 mpath ??vg4-vm--9999--disk--1 253:4 0 1G 0 lvm ??vg4-vm--9998--disk--1 253:5 0 10G 0 lvm ??vg4-vm--9997--disk--1 253:6 0 32G 0 lvm ??vg4-vm--8002--disk--1 253:7 0 32G 0 lvm ??vg4-vm--9995--disk--1 253:8 0 10G 0 lvm ??vg4-vm--8001--disk--1 253:9 0 32G 0 lvm ??vg4-vm--9994--disk--1 253:10 0 15G 0 lvm ??vg4-vm--9993--disk--1 253:11 0 32G 0 lvm ??vg4-vm--9996--disk--1 253:12 0 25G 0 lvm ??vg4-vm--9996--disk--2 253:13 0 32G 0 lvm ??vg4-vm--9996--disk--3 253:14 0 5G 0 lvm ??vg4-vm--9991--disk--1 253:15 0 32G 0 lvm ??vg4-vm--9990--disk--1 253:16 0 32G 0 lvm ??vg4-vm--9989--disk--1 253:17 0 32G 0 lvm ??vg4-vm--9988--disk--1 253:18 0 64G 0 lvm ??vg4-vm--9987--disk--1 253:19 0 42G 0 lvm ??vg4-vm--9990--disk--2 253:20 0 32G 0 lvm ??vg4-vm--6001--disk--1 253:21 0 15G 0 lvm sdf 8:80 0 3T 0 disk ??FNAS04 253:3 0 3T 0 mpath ??vg4-vm--9999--disk--1 253:4 0 1G 0 lvm ??vg4-vm--9998--disk--1 253:5 0 10G 0 lvm ??vg4-vm--9997--disk--1 253:6 0 32G 0 lvm ??vg4-vm--8002--disk--1 253:7 0 32G 0 lvm ??vg4-vm--9995--disk--1 253:8 0 10G 0 lvm ??vg4-vm--8001--disk--1 253:9 0 32G 0 lvm ??vg4-vm--9994--disk--1 253:10 0 15G 0 lvm ??vg4-vm--9993--disk--1 253:11 0 32G 0 lvm ??vg4-vm--9996--disk--1 253:12 0 25G 0 lvm ??vg4-vm--9996--disk--2 253:13 0 32G 0 lvm ??vg4-vm--9996--disk--3 253:14 0 5G 0 lvm ??vg4-vm--9991--disk--1 253:15 0 32G 0 lvm ??vg4-vm--9990--disk--1 253:16 0 32G 0 lvm ??vg4-vm--9989--disk--1 253:17 0 32G 0 lvm ??vg4-vm--9988--disk--1 253:18 0 64G 0 lvm ??vg4-vm--9987--disk--1 253:19 0 42G 0 lvm ??vg4-vm--9990--disk--2 253:20 0 32G 0 lvm ??vg4-vm--6001--disk--1 253:21 0 15G 0 lvm zd0 230:0 0 15G 0 disk 3. 
after reboot iscsi-target multipath disappears root at pvesuma01:~# multipath -ll root at pvesuma01:~# iscsiadm --mode session tcp: [1] 10.1.10.4:3260,1 iqn.2005-fn4.org.freenas.ctl:target1 (non-flash) tcp: [2] 10.3.10.4:3260,3 iqn.2005-fn4.org.freenas.ctl:target1 (non-flash) tcp: [3] 10.2.10.4:3260,2 iqn.2005-fn4.org.freenas.ctl:target1 (non-flash) root at pvesuma01:~# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 1.8T 0 disk ??sda1 8:1 0 1.8T 0 part sdb 8:16 1 465.8G 0 disk ??sdb1 8:17 1 1007K 0 part ??sdb2 8:18 1 127M 0 part ??sdb3 8:19 1 465.7G 0 part ??pve-swap 253:0 0 15G 0 lvm [SWAP] ??pve-root 253:1 0 96G 0 lvm / ??pve-data 253:2 0 338.7G 0 lvm /var/lib/vz sdc 8:32 1 465.8G 0 disk ??sdc1 8:33 1 465.8G 0 part ??sdc9 8:41 1 8M 0 part sdd 8:48 0 3T 0 disk sde 8:64 0 3T 0 disk sdf 8:80 0 3T 0 disk zd0 230:0 0 15G 0 disk ??zd0p1 230:1 0 13G 0 part ??zd0p2 230:2 0 1K 0 part ??zd0p5 230:5 0 2G 0 part -- / / /srda?an pozdrav / best regards/ Ton?i Stipi?evi?, dipl. ing. Elektr. /direktor / manager/** ** d.o.o. ltd. *podr?ka / upravljanje **IT*/ sustavima za male i srednje tvrtke/ /Small & Medium Business /*IT*//*support / management* Badali?eva 27 / 10000 Zagreb / Hrvatska ? Croatia url: www.suma-informatika.hr mob: +385 91 1234003 fax: +385 1 5560007 On 07/07/2017 17:17, Dietmar Maurer wrote: >> Everything is VLAN-separated ... all three multipath links have its own >> subnets and the link between zfs local storages uses its own >> VLAN-separated link (actually vmbr1 -> intranet link ) > Usually VLAN separation does not help to prevent network overload. Or do you > have some special switches which can guarantee minimum transfer rates? > > Besides, I cannot see why replication (ssh/zfs) can disturb an iscsi connection. > > What error do you get exactly on the iscsi connection? > From tonci at suma-informatika.hr Sat Jul 8 08:22:59 2017 From: tonci at suma-informatika.hr (=?UTF-8?B?VG9uxI1pIFN0aXBpxI1ldmnEhw==?=) Date: Sat, 8 Jul 2017 08:22:59 +0200 Subject: [PVE-User] prox storage replication <> iscsi multipath problem In-Reply-To: <331998136.49.1499440638536@webmail.proxmox.com> References: <331998136.49.1499440638536@webmail.proxmox.com> Message-ID: <1d1b5374-d973-f316-0ecb-dfc82c9f1343@suma-informatika.hr> Hello Dietmar , It seems that iscsi layer works but lvm does not. Volume group "vg4" not found TASK ERROR: can't activate LV '/dev/vg4/vm-6001-disk-1': Cannot process volume group vg4 As soon as I remove VM that has image in zpool and reboot hostst , everything regarding iscsi-target-multipath works like before w/o any data loss or something like that Thank you in adavnce BR Tonci -- / / /srda?an pozdrav / best regards/ Ton?i Stipi?evi?, dipl. ing. Elektr. /direktor / manager/** ** d.o.o. ltd. *podr?ka / upravljanje **IT*/ sustavima za male i srednje tvrtke/ /Small & Medium Business /*IT*//*support / management* Badali?eva 27 / 10000 Zagreb / Hrvatska ? Croatia url: www.suma-informatika.hr mob: +385 91 1234003 fax: +385 1 5560007 On 07/07/2017 17:17, Dietmar Maurer wrote: >> Everything is VLAN-separated ... all three multipath links have its own >> subnets and the link between zfs local storages uses its own >> VLAN-separated link (actually vmbr1 -> intranet link ) > Usually VLAN separation does not help to prevent network overload. Or do you > have some special switches which can guarantee minimum transfer rates? > > Besides, I cannot see why replication (ssh/zfs) can disturb an iscsi connection. > > What error do you get exactly on the iscsi connection? 
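One thing worth checking for the "Cannot process volume group vg4" symptom above: as soon as zvols (/dev/zd*) exist, the host's LVM can end up scanning them, and it may also grab the individual iSCSI paths (sdd/sde/sdf in the lsblk output) before multipath has assembled FNAS04, which is known to interfere with VG activation in some setups. A common workaround discussed for such setups is a filter in /etc/lvm/lvm.conf; a sketch only, the device names are taken from the output above and must be adapted (they can also change across reboots):

    devices {
        # prefer the multipath map, never scan zvols or the raw iSCSI paths
        global_filter = [ "a|/dev/mapper/FNAS04|", "r|/dev/zd.*|", "r|/dev/sd[d-f].*|", "a|.*|" ]
    }

Whether this is really what breaks the vg4 activation here would still need to be confirmed, e.g. with pvs / vgscan -vvv output after a reboot.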
> From devin at pabstatencio.com Sun Jul 9 04:49:30 2017 From: devin at pabstatencio.com (Devin Acosta) Date: Sat, 8 Jul 2017 19:49:30 -0700 Subject: [PVE-User] PVE5/LXC CPU Shows Crazy Percentage? Message-ID: I am running the latest release of Proxmox and I am using LXC containers. I notice that it appears that when processes run that "top" can provide crazy numbers and almost not realistic. I installed Nessus and it was just downloading plugins, not even doing a scan, and it says 2400% CPU load, when the box is only 2.5x. Is this a common issue with LXC not reporting correct percentages? top - 22:46:58 up 0 min, 1 user, load average: 2.52, 1.78, 0.85 Tasks: 22 total, 1 running, 21 sleeping, 0 stopped, 0 zombie %Cpu(s): 60.6 us, 3.4 sy, 0.0 ni, 36.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem : 4194304 total, 787784 free, 310964 used, 3095556 buff/cache KiB Swap: 4194304 total, 4194304 free, 0 used. 787784 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 528 root 20 0 503520 144832 7240 S 2400 3.5 10:41.94 nessusd 1 root 20 0 41012 4828 3720 S 0.0 0.1 0:00.16 systemd 37 root 20 0 36828 6512 6220 S 0.0 0.2 0:00.04 systemd-journal 56 root 20 0 82568 6128 5268 S 0.0 0.1 0:00.00 sshd 57 root 20 0 221892 4580 3760 S 0.0 0.1 0:00.00 rsyslogd 58 root 20 0 26540 2864 2588 S 0.0 0.1 0:00.00 systemd-logind 59 root 20 0 221848 7336 5488 S 0.0 0.2 0:00.05 httpd 61 dbus 20 0 26584 2816 2488 S 0.0 0.1 0:00.03 dbus-daemon 67 root 20 0 71344 1828 1144 S 0.0 0.0 0:00.00 saslauthd 71 root 20 0 6464 1760 1640 S 0.0 0.0 0:00.00 agetty 72 root 20 0 6464 1640 1524 S 0.0 0.0 0:00.00 agetty 73 root 20 0 22764 2720 2108 S 0.0 0.1 0:00.00 crond 74 root 20 0 6464 1564 1436 S 0.0 0.0 0:00.00 agetty 76 root 20 0 71344 684 0 S 0.0 0.0 0:00.00 saslauthd 77 root 20 0 27188 1572 1340 S 0.0 0.0 0:00.00 xinetd 91 root 20 0 90292 3452 1716 S 0.0 0.1 0:00.01 sendmail 95 apache 20 0 221848 4028 2168 S 0.0 0.1 0:00.00 httpd 114 smmsp 20 0 85736 3428 1904 S 0.0 0.1 0:00.00 sendmail 295 root 20 0 136876 7800 6540 S 0.0 0.2 0:00.07 sshd 297 root 20 0 11784 2988 2592 S 0.0 0.1 0:00.03 bash 356 root 20 0 6484 1444 1336 S 0.0 0.0 0:00.00 nessus-service 570 root 20 0 51888 3728 3128 R 0.0 0.1 0:00.00 top -- Devin Acosta Red Hat Certified Architect, LinuxStack From dorsyka at yahoo.com Sun Jul 9 12:04:44 2017 From: dorsyka at yahoo.com (dORSY) Date: Sun, 9 Jul 2017 10:04:44 +0000 (UTC) Subject: [PVE-User] proxmox 5 - replication fails References: <2108859951.1396331.1499594684882.ref@mail.yahoo.com> Message-ID: <2108859951.1396331.1499594684882@mail.yahoo.com> Hi, I upgraded my 2-node setup to proxmox 5 and stretch. I am using ZFS local storages and unicast corosync (OVH). Did set up storage migration from both sides to the other for some VMs and a container. It makes transfering VMs really fast, however VMs need to be shut down as online migration doesn't work anymore with replication. I can live with that for the time being, but would be a great improvement. Problem is,? some zfs migration attempts failed. And stays failed. 
2017-07-09 11:03:00 103-0: start replication job 2017-07-09 11:03:00 103-0: guest => VM 103, running => 2979 2017-07-09 11:03:00 103-0: volumes => local-zfs:vm-103-disk-1 2017-07-09 11:03:00 103-0: delete stale replication snapshot '__replicate_103-0_1499590800__' on local-zfs:vm-103-disk-1 2017-07-09 11:03:05 103-0: (remote_prepare_local_job) delete stale replication snapshot '__replicate_103-0_1499590800__' on local-zfs:vm-103-disk-1 2017-07-09 11:03:09 103-0: create snapshot '__replicate_103-0_1499590980__' on local-zfs:vm-103-disk-1 2017-07-09 11:03:09 103-0: full sync 'local-zfs:vm-103-disk-1' (__replicate_103-0_1499590980__) 2017-07-09 11:03:10 103-0: delete previous replication snapshot '__replicate_103-0_1499590980__' on local-zfs:vm-103-disk-1 2017-07-09 11:03:10 103-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-103-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_103-0_1499590980__ | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=xxhostnamexx' root at ip.of.other.node -- pvesm import local-zfs:vm-103-disk-1 zfs - -with-snapshots 1' failed: exit code 255 Could not run the command by hand as it deletes those snapshots right away. Digging into pvesr logs in journal, I found that migration works as expected for a time: Jul 09 10:15:02 NameOfNode1 pvesr[642]: send from @__replicate_103-0_1499587200__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499588100__ estimated size is 60.5M Jul 09 10:15:02 NameOfNode1 pvesr[642]: total estimated size is 60.5M Jul 09 10:15:02 NameOfNode1 pvesr[642]: TIME????????SENT?? SNAPSHOT Jul 09 10:15:02 NameOfNode1 pvesr[642]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499587200__????????name????????rpool/data/vm-103-disk-1 at __replicate_103-0_1499587200__????????- Jul 09 10:15:03 NameOfNode1 pvesr[642]: 10:15:03?? 5.47M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499588100__ Jul 09 10:15:04 NameOfNode1 pvesr[642]: 10:15:04?? 10.4M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499588100__ Jul 09 10:15:05 NameOfNode1 pvesr[642]: 10:15:05?? 40.4M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499588100__ Jul 09 10:15:12 NameOfNode1 pvesr[642]: send from @__replicate_109-0_1499587210__ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ estimated size is 172M Jul 09 10:15:12 NameOfNode1 pvesr[642]: total estimated size is 172M Jul 09 10:15:12 NameOfNode1 pvesr[642]: TIME????????SENT?? SNAPSHOT Jul 09 10:15:12 NameOfNode1 pvesr[642]: rpool/data/vm-109-disk-1 at __replicate_109-0_1499587210__????????name????????rpool/data/vm-109-disk-1 at __replicate_109-0_1499587210__????????- Jul 09 10:15:13 NameOfNode1 pvesr[642]: 10:15:13?? 15.6M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:14 NameOfNode1 pvesr[642]: 10:15:14?? 41.0M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:15 NameOfNode1 pvesr[642]: 10:15:15?? 71.2M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:16 NameOfNode1 pvesr[642]: 10:15:16?? 88.8M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:17 NameOfNode1 pvesr[642]: 10:15:17????114M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:18 NameOfNode1 pvesr[642]: 10:15:18????121M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:19 NameOfNode1 pvesr[642]: 10:15:19????149M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:20 NameOfNode1 pvesr[642]: 10:15:20????160M?? 
rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:15:21 NameOfNode1 pvesr[642]: 10:15:21????172M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__ Jul 09 10:30:05 NameOfNode1 pvesr[3893]: send from @__replicate_103-0_1499588100__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__ estimated size is 8.30M Jul 09 10:30:05 NameOfNode1 pvesr[3893]: total estimated size is 8.30M Jul 09 10:30:05 NameOfNode1 pvesr[3893]: TIME????????SENT?? SNAPSHOT Jul 09 10:30:05 NameOfNode1 pvesr[3893]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499588100__????????name????????rpool/data/vm-103-disk-1 at __replicate_103-0_1499588100__????????- Jul 09 10:30:06 NameOfNode1 pvesr[3893]: 10:30:06?? 2.11M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__ Jul 09 10:30:07 NameOfNode1 pvesr[3893]: 10:30:07?? 2.11M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__ Jul 09 10:30:08 NameOfNode1 pvesr[3893]: 10:30:08?? 2.11M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__ Jul 09 10:30:09 NameOfNode1 pvesr[3893]: 10:30:09?? 2.11M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__ Jul 09 10:30:10 NameOfNode1 pvesr[3893]: 10:30:10?? 3.10M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__ Jul 09 10:30:23 NameOfNode1 pvesr[3893]: send from @__replicate_109-0_1499588111__ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499589022__ estimated size is 22.5M Jul 09 10:30:23 NameOfNode1 pvesr[3893]: total estimated size is 22.5M Jul 09 10:30:23 NameOfNode1 pvesr[3893]: TIME????????SENT?? SNAPSHOT Jul 09 10:30:24 NameOfNode1 pvesr[3893]: rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__????????name????????rpool/data/vm-109-disk-1 at __replicate_109-0_1499588111__????????- Jul 09 10:30:24 NameOfNode1 pvesr[3893]: 10:30:24?? 2.10M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499589022__ Jul 09 10:30:25 NameOfNode1 pvesr[3893]: 10:30:25?? 2.10M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499589022__ Jul 09 10:30:26 NameOfNode1 pvesr[3893]: 10:30:26?? 4.55M?? rpool/data/vm-109-disk-1 at __replicate_109-0_1499589022__ But then it somehow wants to send "@" and note that the estimated size even exceeds the size of data on the VM's disk, close to the volume sizes. This is happening both for VM 103 and 109 (sometimes it is just one node) ZFS list of those volumes: rpool/data/vm-103-disk-1??????14.5G??2.29T??14.5G??- rpool/data/vm-109-disk-1??????21.4G??2.29T??21.4G??- rpool/data/vm-103-disk-1??????volsize??????20G??????local rpool/data/vm-109-disk-1??????volsize??????30G??????local Jul 09 10:36:11 NameOfNode1 pvesr[5123]: send from @ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499589360__ estimated size is 28.4G Jul 09 10:36:11 NameOfNode1 pvesr[5123]: total estimated size is 28.4G Jul 09 10:36:11 NameOfNode1 pvesr[5123]: TIME????????SENT?? SNAPSHOT Jul 09 10:36:12 NameOfNode1 pvesr[5123]: rpool/data/vm-109-disk-1????????name????????rpool/data/vm-109-disk-1????????- Jul 09 10:36:12 NameOfNode1 pvesr[5123]: volume 'rpool/data/vm-109-disk-1' already exists Jul 09 10:36:12 NameOfNode1 pvesr[5123]: warning: cannot send 'rpool/data/vm-109-disk-1 at __replicate_109-0_1499589360__': Broken pipe Jul 09 10:36:12 NameOfNode1 pvesr[5123]: cannot send 'rpool/data/vm-109-disk-1': I/O error Jul 09 10:36:12 NameOfNode1 pvesr[5123]: command 'zfs send -Rpv -- rpool/data/vm-109-disk-1 at __replicate_109-0_1499589360__' failed: exit code 1 Jul 09 10:36:12 NameOfNode1 pvesr[5123]: send/receive failed, cleaning up snapshot(s).. 
Jul 09 10:45:17 NameOfNode1 pvesr[6669]: send from @__replicate_103-0_1499589000__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499589900__ estimated size is 9.12M Jul 09 10:45:17 NameOfNode1 pvesr[6669]: total estimated size is 9.12M Jul 09 10:45:17 NameOfNode1 pvesr[6669]: TIME????????SENT?? SNAPSHOT Jul 09 10:45:17 NameOfNode1 pvesr[6669]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__????????name????????rpool/data/vm-103-disk-1 at __replicate_103-0_1499589000__????????- Jul 09 10:45:18 NameOfNode1 pvesr[6669]: 10:45:18?? 7.24M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499589900__ Jul 09 10:46:02 NameOfNode1 pvesr[7055]: send from @ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499589960__ estimated size is 28.4G Jul 09 10:46:02 NameOfNode1 pvesr[7055]: total estimated size is 28.4G Jul 09 10:46:02 NameOfNode1 pvesr[7055]: TIME????????SENT?? SNAPSHOT Jul 09 10:46:02 NameOfNode1 pvesr[7055]: rpool/data/vm-109-disk-1????????name????????rpool/data/vm-109-disk-1????????- Jul 09 10:46:02 NameOfNode1 pvesr[7055]: volume 'rpool/data/vm-109-disk-1' already exists Jul 09 10:46:02 NameOfNode1 pvesr[7055]: warning: cannot send 'rpool/data/vm-109-disk-1 at __replicate_109-0_1499589960__': Broken pipe Jul 09 10:46:02 NameOfNode1 pvesr[7055]: cannot send 'rpool/data/vm-109-disk-1': I/O error Jul 09 10:46:02 NameOfNode1 pvesr[7055]: command 'zfs send -Rpv -- rpool/data/vm-109-disk-1 at __replicate_109-0_1499589960__' failed: exit code 1 Jul 09 10:46:02 NameOfNode1 pvesr[7055]: send/receive failed, cleaning up snapshot(s).. Jul 09 11:00:08 NameOfNode1 pvesr[9425]: send from @__replicate_103-0_1499589900__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499590800__ estimated size is 7.24M Jul 09 11:00:08 NameOfNode1 pvesr[9425]: total estimated size is 7.24M Jul 09 11:00:08 NameOfNode1 pvesr[9425]: TIME????????SENT?? SNAPSHOT Jul 09 11:00:08 NameOfNode1 pvesr[9425]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499589900__????????name????????rpool/data/vm-103-disk-1 at __replicate_103-0_1499589900__????????- Jul 09 11:00:09 NameOfNode1 pvesr[9425]: 11:00:09?? 2.11M?? rpool/data/vm-103-disk-1 at __replicate_103-0_1499590800__ Jul 09 11:03:09 NameOfNode1 pvesr[10151]: send from @ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499590980__ estimated size is 19.5G Jul 09 11:03:09 NameOfNode1 pvesr[10151]: total estimated size is 19.5G Jul 09 11:03:09 NameOfNode1 pvesr[10151]: TIME????????SENT?? SNAPSHOT Jul 09 11:03:10 NameOfNode1 pvesr[10151]: rpool/data/vm-103-disk-1????????name????????rpool/data/vm-103-disk-1????????- Jul 09 11:03:10 NameOfNode1 pvesr[10151]: volume 'rpool/data/vm-103-disk-1' already exists Jul 09 11:03:10 NameOfNode1 pvesr[10151]: warning: cannot send 'rpool/data/vm-103-disk-1 at __replicate_103-0_1499590980__': Broken pipe Jul 09 11:03:10 NameOfNode1 pvesr[10151]: cannot send 'rpool/data/vm-103-disk-1': I/O error Jul 09 11:03:10 NameOfNode1 pvesr[10151]: command 'zfs send -Rpv -- rpool/data/vm-103-disk-1 at __replicate_103-0_1499590980__' failed: exit code 1 Jul 09 11:03:10 NameOfNode1 pvesr[10151]: send/receive failed, cleaning up snapshot(s).. Jul 09 11:04:02 NameOfNode1 pvesr[10347]: send from @ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499591040__ estimated size is 28.4G Jul 09 11:04:02 NameOfNode1 pvesr[10347]: total estimated size is 28.4G Jul 09 11:04:02 NameOfNode1 pvesr[10347]: TIME????????SENT?? 
SNAPSHOT Jul 09 11:04:02 NameOfNode1 pvesr[10347]: rpool/data/vm-109-disk-1????????name????????rpool/data/vm-109-disk-1????????- Jul 09 11:04:02 NameOfNode1 pvesr[10347]: volume 'rpool/data/vm-109-disk-1' already exists Jul 09 11:04:02 NameOfNode1 pvesr[10347]: warning: cannot send 'rpool/data/vm-109-disk-1 at __replicate_109-0_1499591040__': Broken pipe Jul 09 11:04:02 NameOfNode1 pvesr[10347]: cannot send 'rpool/data/vm-109-disk-1': I/O error Jul 09 11:04:02 NameOfNode1 pvesr[10347]: command 'zfs send -Rpv -- rpool/data/vm-109-disk-1 at __replicate_109-0_1499591040__' failed: exit code 1 Jul 09 11:04:02 NameOfNode1 pvesr[10347]: send/receive failed, cleaning up snapshot(s).. Jul 09 11:13:03 NameOfNode1 pvesr[11914]: send from @ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499591580__ estimated size is 19.5G Jul 09 11:13:03 NameOfNode1 pvesr[11914]: total estimated size is 19.5G Jul 09 11:13:03 NameOfNode1 pvesr[11914]: TIME????????SENT?? SNAPSHOT Jul 09 11:13:03 NameOfNode1 pvesr[11914]: rpool/data/vm-103-disk-1????????name????????rpool/data/vm-103-disk-1????????- Jul 09 11:13:03 NameOfNode1 pvesr[11914]: volume 'rpool/data/vm-103-disk-1' already exists Jul 09 11:13:03 NameOfNode1 pvesr[11914]: warning: cannot send 'rpool/data/vm-103-disk-1 at __replicate_103-0_1499591580__': Broken pipe Jul 09 11:13:03 NameOfNode1 pvesr[11914]: cannot send 'rpool/data/vm-103-disk-1': I/O error Jul 09 11:13:03 NameOfNode1 pvesr[11914]: command 'zfs send -Rpv -- rpool/data/vm-103-disk-1 at __replicate_103-0_1499591580__' failed: exit code 1 Jul 09 11:13:03 NameOfNode1 pvesr[11914]: send/receive failed, cleaning up snapshot(s).. After that it really wants to transfer whole disks again and again but it fails to do that.There is no way to recover from this, just by destroying the whole ZFS volume on the target side and restransfer it. Please help me get around this problem or to find out what goes wrong there. regards,dorsy From elacunza at binovo.es Mon Jul 10 10:15:57 2017 From: elacunza at binovo.es (Eneko Lacunza) Date: Mon, 10 Jul 2017 10:15:57 +0200 Subject: [PVE-User] Proxmox 4.4 boot issues on Dell T130 Message-ID: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> Hi all, I'm having problems to boot to a Proxmox 4.4 USB created as usual with proxmox-ve_4.4-eb2d6f1e-2.iso When trying to boot with BIOS mode, grub rescue console appears. When trying to boot with UEFI mode, I choose the efi boot file from USB and , grub console appears (but no menu). I tried 2 different USB drives, and two USB ports (front and back). I tried with proxmox-ve_4.1-2f9650d4-21.iso, and this worked without issues with BIOS boot (then had to upgrade to latest 4.4) What I'm doing wrong? :-) Thanks Eneko -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943493611 943324914 Astigarraga bidea 2, planta 6 dcha., ofi. 
3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es From dorsyka at yahoo.com Mon Jul 10 11:14:43 2017 From: dorsyka at yahoo.com (dorsy) Date: Mon, 10 Jul 2017 11:14:43 +0200 Subject: [PVE-User] proxmox 5 - replication fails In-Reply-To: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> References: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> Message-ID: Same error again, nothing strange until 3:20 AM: Jul 10 03:00:02 xxnodenamexx pvesr[32023]: send from @__replicate_103-0_1499647500__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ estimated size is 7.27M Jul 10 03:00:02 xxnodenamexx pvesr[32023]: total estimated size is 7.27M Jul 10 03:00:02 xxnodenamexx pvesr[32023]: TIME SENT SNAPSHOT Jul 10 03:00:02 xxnodenamexx pvesr[32023]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499647500__ name rpool/data/vm-103-disk-1 at __replicate_103-0_1499647500__ - Jul 10 03:00:03 xxnodenamexx pvesr[32023]: 03:00:03 3.10M rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ Jul 10 03:00:06 xxnodenamexx pvesr[32023]: send from @__replicate_109-0_1499647505__ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ estimated size is 24.0M Jul 10 03:00:06 xxnodenamexx pvesr[32023]: total estimated size is 24.0M Jul 10 03:00:06 xxnodenamexx pvesr[32023]: TIME SENT SNAPSHOT Jul 10 03:00:06 xxnodenamexx pvesr[32023]: rpool/data/vm-109-disk-1 at __replicate_109-0_1499647505__ name rpool/data/vm-109-disk-1 at __replicate_109-0_1499647505__ - Jul 10 03:00:07 xxnodenamexx pvesr[32023]: 03:00:07 17.9M rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ Jul 10 03:15:02 xxnodenamexx pvesr[2801]: send from @__replicate_103-0_1499648400__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ estimated size is 63.9M Jul 10 03:15:02 xxnodenamexx pvesr[2801]: total estimated size is 63.9M Jul 10 03:15:02 xxnodenamexx pvesr[2801]: TIME SENT SNAPSHOT Jul 10 03:15:03 xxnodenamexx pvesr[2801]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ name rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ - Jul 10 03:15:03 xxnodenamexx pvesr[2801]: 03:15:03 6.00M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:04 xxnodenamexx pvesr[2801]: 03:15:04 13.5M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:05 xxnodenamexx pvesr[2801]: 03:15:05 13.9M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:06 xxnodenamexx pvesr[2801]: 03:15:06 17.2M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:07 xxnodenamexx pvesr[2801]: 03:15:07 49.7M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:24 xxnodenamexx pvesr[2801]: send from @__replicate_109-0_1499648405__ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ estimated size is 169M Jul 10 03:15:24 xxnodenamexx pvesr[2801]: total estimated size is 169M Jul 10 03:15:24 xxnodenamexx pvesr[2801]: rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ name rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ - Jul 10 03:15:24 xxnodenamexx pvesr[2801]: TIME SENT SNAPSHOT Jul 10 03:15:25 xxnodenamexx pvesr[2801]: 03:15:25 18.6M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:26 xxnodenamexx pvesr[2801]: 03:15:26 49.3M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:27 xxnodenamexx pvesr[2801]: 03:15:27 69.9M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:28 xxnodenamexx pvesr[2801]: 03:15:28 74.0M rpool/data/vm-109-disk-1 at 
__replicate_109-0_1499649323__ Jul 10 03:15:29 xxnodenamexx pvesr[2801]: 03:15:29 80.3M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:30 xxnodenamexx pvesr[2801]: 03:15:30 85.5M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:31 xxnodenamexx pvesr[2801]: 03:15:31 92.9M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:32 xxnodenamexx pvesr[2801]: 03:15:32 101M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:33 xxnodenamexx pvesr[2801]: 03:15:33 102M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:34 xxnodenamexx pvesr[2801]: 03:15:34 126M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:35 xxnodenamexx pvesr[2801]: 03:15:35 141M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:36 xxnodenamexx pvesr[2801]: 03:15:36 141M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:37 xxnodenamexx pvesr[2801]: 03:15:37 142M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:38 xxnodenamexx pvesr[2801]: 03:15:38 142M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:39 xxnodenamexx pvesr[2801]: 03:15:39 151M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:40 xxnodenamexx pvesr[2801]: 03:15:40 167M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:41 xxnodenamexx pvesr[2801]: 03:15:41 169M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ And then it wants to send from "@" isntead of "@__replicate_103-0_[time]__ Also strange that the schedule is */15, but it is starting at 3:20 after 3:15 Jul 10 03:20:05 xxnodenamexx pvesr[4021]: send from @ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499649600__ estimated size is 19.5G Jul 10 03:20:05 xxnodenamexx pvesr[4021]: total estimated size is 19.5G Jul 10 03:20:05 xxnodenamexx pvesr[4021]: TIME SENT SNAPSHOT Jul 10 03:20:05 xxnodenamexx pvesr[4021]: rpool/data/vm-103-disk-1 name rpool/data/vm-103-disk-1 - Jul 10 03:20:05 xxnodenamexx pvesr[4021]: volume 'rpool/data/vm-103-disk-1' already exists Jul 10 03:20:05 xxnodenamexx pvesr[4021]: warning: cannot send 'rpool/data/vm-103-disk-1 at __replicate_103-0_1499649600__': Broken pipe Jul 10 03:20:06 xxnodenamexx pvesr[4021]: cannot send 'rpool/data/vm-103-disk-1': I/O error Jul 10 03:20:06 xxnodenamexx pvesr[4021]: command 'zfs send -Rpv -- rpool/data/vm-103-disk-1 at __replicate_103-0_1499649600__' failed: exit code 1 Jul 10 03:20:06 xxnodenamexx pvesr[4021]: send/receive failed, cleaning up snapshot(s).. Hope it helps. From dorsyka at yahoo.com Mon Jul 10 11:15:57 2017 From: dorsyka at yahoo.com (dorsy) Date: Mon, 10 Jul 2017 11:15:57 +0200 Subject: [PVE-User] proxmox 5 - replication fails In-Reply-To: References: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> Message-ID: <6278d743-c2d9-8845-d1b6-84a91db6b8f2@yahoo.com> Sorry not meant to reply here. 
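For anyone hitting the same "send from @" symptom: pvesr can only do an incremental send while the last __replicate_* snapshot still exists on both the source and the target; once it is gone on either side, zfs falls back to a full stream, which then collides with the volume that already exists on the target. A minimal sketch for checking this by hand, with the dataset name taken from the logs above and "targetnode" as a placeholder for the replication target:

# on the source node: list the replication snapshots that still exist
zfs list -r -t snapshot -o name rpool/data/vm-103-disk-1

# on the target node (run via ssh from the source):
ssh root@targetnode zfs list -r -t snapshot -o name rpool/data/vm-103-disk-1

# if a common __replicate_* snapshot shows up on both sides, an incremental
# send is possible, roughly:
#   zfs send -i @__replicate_103-0_<old>__ rpool/data/vm-103-disk-1@__replicate_103-0_<new>__
# if it is missing on either side, "zfs send -Rpv rpool/data/vm-103-disk-1@<snap>"
# produces a full stream and the receive fails with "volume ... already exists",
# exactly as in the log above.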
From dorsyka at yahoo.com Mon Jul 10 13:39:37 2017 From: dorsyka at yahoo.com (dorsy) Date: Mon, 10 Jul 2017 13:39:37 +0200 Subject: [PVE-User] proxmox 5 - replication fails In-Reply-To: <2108859951.1396331.1499594684882@mail.yahoo.com> References: <2108859951.1396331.1499594684882@mail.yahoo.com> Message-ID: <08f7dc5f-1d81-7f04-dcf6-f05a08cb1d51@yahoo.com> Same error again, nothing strange until 3:20 AM: Jul 10 03:00:02 xxnodenamexx pvesr[32023]: send from @__replicate_103-0_1499647500__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ estimated size is 7.27M Jul 10 03:00:02 xxnodenamexx pvesr[32023]: total estimated size is 7.27M Jul 10 03:00:02 xxnodenamexx pvesr[32023]: TIME SENT SNAPSHOT Jul 10 03:00:02 xxnodenamexx pvesr[32023]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499647500__ name rpool/data/vm-103-disk-1 at __replicate_103-0_1499647500__ - Jul 10 03:00:03 xxnodenamexx pvesr[32023]: 03:00:03 3.10M rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ Jul 10 03:00:06 xxnodenamexx pvesr[32023]: send from @__replicate_109-0_1499647505__ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ estimated size is 24.0M Jul 10 03:00:06 xxnodenamexx pvesr[32023]: total estimated size is 24.0M Jul 10 03:00:06 xxnodenamexx pvesr[32023]: TIME SENT SNAPSHOT Jul 10 03:00:06 xxnodenamexx pvesr[32023]: rpool/data/vm-109-disk-1 at __replicate_109-0_1499647505__ name rpool/data/vm-109-disk-1 at __replicate_109-0_1499647505__ - Jul 10 03:00:07 xxnodenamexx pvesr[32023]: 03:00:07 17.9M rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ Jul 10 03:15:02 xxnodenamexx pvesr[2801]: send from @__replicate_103-0_1499648400__ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ estimated size is 63.9M Jul 10 03:15:02 xxnodenamexx pvesr[2801]: total estimated size is 63.9M Jul 10 03:15:02 xxnodenamexx pvesr[2801]: TIME SENT SNAPSHOT Jul 10 03:15:03 xxnodenamexx pvesr[2801]: rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ name rpool/data/vm-103-disk-1 at __replicate_103-0_1499648400__ - Jul 10 03:15:03 xxnodenamexx pvesr[2801]: 03:15:03 6.00M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:04 xxnodenamexx pvesr[2801]: 03:15:04 13.5M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:05 xxnodenamexx pvesr[2801]: 03:15:05 13.9M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:06 xxnodenamexx pvesr[2801]: 03:15:06 17.2M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:07 xxnodenamexx pvesr[2801]: 03:15:07 49.7M rpool/data/vm-103-disk-1 at __replicate_103-0_1499649300__ Jul 10 03:15:24 xxnodenamexx pvesr[2801]: send from @__replicate_109-0_1499648405__ to rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ estimated size is 169M Jul 10 03:15:24 xxnodenamexx pvesr[2801]: total estimated size is 169M Jul 10 03:15:24 xxnodenamexx pvesr[2801]: rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ name rpool/data/vm-109-disk-1 at __replicate_109-0_1499648405__ - Jul 10 03:15:24 xxnodenamexx pvesr[2801]: TIME SENT SNAPSHOT Jul 10 03:15:25 xxnodenamexx pvesr[2801]: 03:15:25 18.6M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:26 xxnodenamexx pvesr[2801]: 03:15:26 49.3M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:27 xxnodenamexx pvesr[2801]: 03:15:27 69.9M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:28 xxnodenamexx pvesr[2801]: 03:15:28 74.0M rpool/data/vm-109-disk-1 at 
__replicate_109-0_1499649323__ Jul 10 03:15:29 xxnodenamexx pvesr[2801]: 03:15:29 80.3M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:30 xxnodenamexx pvesr[2801]: 03:15:30 85.5M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:31 xxnodenamexx pvesr[2801]: 03:15:31 92.9M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:32 xxnodenamexx pvesr[2801]: 03:15:32 101M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:33 xxnodenamexx pvesr[2801]: 03:15:33 102M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:34 xxnodenamexx pvesr[2801]: 03:15:34 126M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:35 xxnodenamexx pvesr[2801]: 03:15:35 141M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:36 xxnodenamexx pvesr[2801]: 03:15:36 141M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:37 xxnodenamexx pvesr[2801]: 03:15:37 142M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:38 xxnodenamexx pvesr[2801]: 03:15:38 142M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:39 xxnodenamexx pvesr[2801]: 03:15:39 151M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:40 xxnodenamexx pvesr[2801]: 03:15:40 167M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ Jul 10 03:15:41 xxnodenamexx pvesr[2801]: 03:15:41 169M rpool/data/vm-109-disk-1 at __replicate_109-0_1499649323__ And then it wants to send from "@" isntead of "@__replicate_103-0_[time]__ Also strange that the schedule is */15, but it is starting at 3:20 after 3:15 Jul 10 03:20:05 xxnodenamexx pvesr[4021]: send from @ to rpool/data/vm-103-disk-1 at __replicate_103-0_1499649600__ estimated size is 19.5G Jul 10 03:20:05 xxnodenamexx pvesr[4021]: total estimated size is 19.5G Jul 10 03:20:05 xxnodenamexx pvesr[4021]: TIME SENT SNAPSHOT Jul 10 03:20:05 xxnodenamexx pvesr[4021]: rpool/data/vm-103-disk-1 name rpool/data/vm-103-disk-1 - Jul 10 03:20:05 xxnodenamexx pvesr[4021]: volume 'rpool/data/vm-103-disk-1' already exists Jul 10 03:20:05 xxnodenamexx pvesr[4021]: warning: cannot send 'rpool/data/vm-103-disk-1 at __replicate_103-0_1499649600__': Broken pipe Jul 10 03:20:06 xxnodenamexx pvesr[4021]: cannot send 'rpool/data/vm-103-disk-1': I/O error Jul 10 03:20:06 xxnodenamexx pvesr[4021]: command 'zfs send -Rpv -- rpool/data/vm-103-disk-1 at __replicate_103-0_1499649600__' failed: exit code 1 Jul 10 03:20:06 xxnodenamexx pvesr[4021]: send/receive failed, cleaning up snapshot(s).. Hope it helps. From uwe.sauter.de at gmail.com Mon Jul 10 16:25:35 2017 From: uwe.sauter.de at gmail.com (Uwe Sauter) Date: Mon, 10 Jul 2017 16:25:35 +0200 Subject: [PVE-User] qm migrate: strange output In-Reply-To: References: <5239afd5-f45e-9621-ec06-8b1978d842a3@gmail.com> Message-ID: <7cf6bc3b-0358-611a-8a47-f4a81d8aec77@gmail.com> Ah, thanks. (Sorry for the late reply, Gmail put you answer into the spam folder.) 
Am 20.06.2017 um 19:01 schrieb Michael Rasmussen: > The former is for HA vm's the latter for non HA vm's > > On June 20, 2017 6:19:36 PM GMT+02:00, Uwe Sauter @gmail.com> wrote: > > Hi all, > > usually when I update my PVE cluster I do it in a rolling fashion: > 1) empty one node from running VMs > 2) update & reboot that node > 3) go to next node > 4) migrate all running VMs to already updated node > 5) go to 2 until no more nodes need update > > For step 1 (or 4) I usually do: > > # qm list > VMID NAME STATUS MEM(MB) BOOTDISK(GB) PID > 106 test1 running 2048 32.00 4993 > 112 test2 running 1024 16.00 5218 > > # for i in 106 112; do qm migrate $i px-bravo-cluster --online; done > > Usually I get multiple lines like: > > Executing HA migrate for VM 106 to node px-bravo-cluster > Executing HA migrate for VM 112 to node px-bravo-cluster > > But once in a while (in the last few days more often) I get: > > Jun 20 18:15:42 starting migration of VM 106 to node 'px-bravo-cluster' (169.254.42.49 ) > Jun 20 18:15:42 copying disk images > Jun 20 18:15:42 starting VM 106 on remote node 'px-bravo-cluster' > Jun 20 18:15:46 start remote tunnel > Jun 20 18:15:46 starting online/live migration on unix:/run/qemu-server/106.migrate > Jun 20 18:15:46 migrate_set_speed: 8589934592 > Jun 20 18:15:46 migrate_set_downtime: 0.1 > Jun 20 18:15:46 set migration_caps > Jun 20 18:15:46 set cachesize: 214748364 > Jun 20 18:15:46 start migrate command to unix:/run/qemu-server/106.migrate > Jun 20 18:15:48 migration status: active (transferred 632787397, remaining 463048704), total 2156732416) > Jun 20 18:15:48 migration xbzrle cachesize: 134217728 transferred 0 pages 0 cachemiss 0 overflow 0 > Jun 20 18:15:50 migration speed: 512.00 MB/s - downtime 81 ms > Jun 20 18:15:50 migration status: completed > Jun 20 18:15:54 migration finished successfully (duration 00:00:13) > Jun 20 18:15:55 starting migration of VM 112 to node 'px-bravo-cluster' (169.254.42.49 ) > Jun 20 18:15:55 copying disk images > Jun 20 18:15:55 starting VM 112 on remote node 'px-bravo-cluster' > Jun 20 18:15:58 start remote tunnel > Jun 20 18:15:58 starting online/live migration on unix:/run/qemu-server/112.migrate > Jun 20 18:15:58 migrate_set_speed: 8589934592 > Jun 20 18:15:58 migrate_set_downtime: 0.1 > Jun 20 18:15:58 set migration_caps > Jun 20 18:15:58 set cachesize: 107374182 > Jun 20 18:15:58 start migrate command to unix:/run/qemu-server/112.migrate > Jun 20 18:16:00 migration status: active (transferred 876920642, remaining 143405056), total 1082990592) > Jun 20 18:16:00 migration xbzrle cachesize: 67108864 transferred 0 pages 0 cachemiss 0 overflow 0 > Jun 20 18:16:02 migration speed: 256.00 MB/s - downtime 66 ms > Jun 20 18:16:02 migration status: completed > Jun 20 18:16:06 migration finished successfully (duration 00:00:12) > > > Can someone explain under which circumstances this output is displayed instead of just the short message that migration > was started? > > > Regards, > > Uwe > ---------------------------------------------------------------------------------------------------------------------------------- > > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. > ---- This mail was virus scanned and spam checked before delivery. This mail is also DKIM signed. See header dkim-signature. 
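The drain step in 1)/4) can also be scripted from the same `qm list` output; a minimal sketch, assuming every guest on the node is a KVM VM (containers would need pct instead) and reusing the target node name from the thread:

# migrate every running VM on this node to px-bravo-cluster
for i in $(qm list | awk '$3 == "running" {print $1}'); do
    qm migrate "$i" px-bravo-cluster --online
done

# HA-managed VMs answer with "Executing HA migrate ...", non-HA VMs print the
# full migration log, as Michael explained above.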
From wolfgang.riegler at gmx.de Mon Jul 10 18:52:01 2017 From: wolfgang.riegler at gmx.de (Wolfgang Riegler) Date: Mon, 10 Jul 2017 18:52:01 +0200 Subject: [PVE-User] Proxmox 5.0 storage replication failure Message-ID: <3934799.vJvyeggbSL@wolfgang> Hi, the new storage replication feature is really great! But there is an issue: Unfortunately the replication breaks completely if somebody do a rollback to an older snapshot than the last sync of a container and destroys that snapshot before the next sync. kind regards Wolfgang From dietmar at proxmox.com Mon Jul 10 20:25:41 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Mon, 10 Jul 2017 20:25:41 +0200 (CEST) Subject: [PVE-User] Proxmox 5.0 storage replication failure In-Reply-To: <3934799.vJvyeggbSL@wolfgang> References: <3934799.vJvyeggbSL@wolfgang> Message-ID: <1034256638.12.1499711142257@webmail.proxmox.com> > the new storage replication feature is really great! But there is an issue: > Unfortunately the replication breaks completely if somebody do a rollback to > an older snapshot than the last sync of a container and destroys that snapshot > before the next sync. AFAIK it simply syncs from rollbacked snapshot instead. Please can you post the replication log with the error? From nick-liste at posteo.eu Tue Jul 11 08:56:42 2017 From: nick-liste at posteo.eu (Nicola Ferrari (#554252)) Date: Tue, 11 Jul 2017 08:56:42 +0200 Subject: [PVE-User] Proxmox 4.4 boot issues on Dell T130 In-Reply-To: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> References: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> Message-ID: Il 10/07/2017 10:15, Eneko Lacunza ha scritto: > > What I'm doing wrong? :-) Hi Eneko! Did you check the MD5 hash for the 4.4 image? Which method did you use to prepare the usb? I usually use dd if=pve.iso of=myusb bs=1M without problems.. HTH, Nick -- +---------------------+ | Linux User #554252 | +---------------------+ From elacunza at binovo.es Tue Jul 11 15:18:22 2017 From: elacunza at binovo.es (Eneko Lacunza) Date: Tue, 11 Jul 2017 15:18:22 +0200 Subject: [PVE-User] Proxmox 4.4 boot issues on Dell T130 In-Reply-To: References: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> Message-ID: Hi Nicola, El 11/07/17 a las 08:56, Nicola Ferrari (#554252) escribi?: > Il 10/07/2017 10:15, Eneko Lacunza ha scritto: >> What I'm doing wrong? :-) > Did you check the MD5 hash for the 4.4 image? Argg, this was it! I had previously used the ISO (I didn't download it for this server), so I didn't check it... checked now and yes, it was corrupt. I changed this laptop's SSD because sometimes it gave access errors... seems one of them corrupted the iso file. Thanks for pointing this obvious problem ;) > Which method did you use to prepare the usb? > > I usually use > dd if=pve.iso of=myusb bs=1M > > without problems.. Yes, this is what I do too. Cheers Eneko -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943493611 943324914 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es From robson.branco at gmail.com Tue Jul 11 18:54:55 2017 From: robson.branco at gmail.com (Robson Branco) Date: Tue, 11 Jul 2017 13:54:55 -0300 Subject: [PVE-User] CEPH: HEALTH_WARN 1 mons down Message-ID: Greetings, We had a problem on the hd of one of our nodes, and we had to reinstall the server. Everything returned to normal operation, but we have a problem with the CEPH monitor that was out of the quorum. 
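In the monmap shown further down the monitors are registered under their bare names ("0", "1", "2"), which is why `ceph mon remove mon.2` below reports that mon.2 does not exist, while `pveceph createmon` still sees the old entry in /etc/pve/ceph.conf. A minimal recovery sketch, assuming the reinstalled node is the one that hosted mon "2" and that the stale [mon.2] section is still present in the config:

# drop the dead monitor from the monmap, addressing it by its bare name
ceph mon remove 2

# edit /etc/pve/ceph.conf and delete the leftover [mon.2] section by hand;
# pveceph createmon refuses to reuse an address that is still listed there

# clear any stale monitor data directory on the reinstalled node
rm -rf /var/lib/ceph/mon/ceph-2

# recreate the monitor on the reinstalled node
pveceph createmon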
# pveversion pve-manager/4.4-1/eb2d6f1e (running kernel: 4.4.35-1-pve) # ceph health detail 2017-07-11 13:49:17.115883 7f256c273700 0 -- :/3226337302 >> 10.10.10.12:6789/0 pipe(0x7f256805a550 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f256805e840).fault HEALTH_WARN 1 mons down, quorum 0,1 0,1 mon.2 (rank 2) addr 10.10.10.12:6789/0 is down (out of quorum) # ceph mon remove mon.2 mon mon.2 does not exist or has already been removed # pveceph createmon monitor address '10.10.10.12:6789' already in use by 'mon.2' # pveceph status { "quorum" : [ 0, 1 ], "osdmap" : { "osdmap" : { "num_osds" : 12, "num_up_osds" : 12, "full" : false, "epoch" : 7409, "num_in_osds" : 12, "num_remapped_pgs" : 0, "nearfull" : false } }, "pgmap" : { "write_bytes_sec" : 2684, "version" : 19003699, "data_bytes" : 4676668142879, "bytes_used" : 9346006102016, "pgs_by_state" : [ { "state_name" : "active+clean", "count" : 511 }, { "state_name" : "active+clean+scrubbing+deep", "count" : 1 } ], "num_pgs" : 512, "read_bytes_sec" : 3020, "op_per_sec" : 1, "bytes_avail" : 14647047196672, "bytes_total" : 23993053298688 }, "monmap" : { "fsid" : "f5d2413a-0c0c-4bfa-b709-74bc07749789", "mons" : [ { "rank" : 0, "addr" : "10.10.10.10:6789/0", "name" : "0" }, { "name" : "1", "rank" : 1, "addr" : "10.10.10.11:6789/0" }, { "addr" : "10.10.10.12:6789/0", "rank" : 2, "name" : "2" } ], "modified" : "2016-10-06 12:48:35.417421", "created" : "2016-10-06 12:48:15.988857", "epoch" : 3 }, "quorum_names" : [ "0", "1" ], "mdsmap" : { "epoch" : 1, "up" : 0, "in" : 0, "by_rank" : [], "max" : 0 }, "election_epoch" : 510, "fsid" : "f5d2413a-0c0c-4bfa-b709-74bc07749789", "health" : { "overall_status" : "HEALTH_WARN", "detail" : [], "summary" : [ { "severity" : "HEALTH_WARN", "summary" : "1 mons down, quorum 0,1 0,1" } ], "timechecks" : { "round" : 120, "round_status" : "finished", "epoch" : 510, "mons" : [ { "skew" : 0, "name" : "0", "latency" : 0, "health" : "HEALTH_OK" }, { "health" : "HEALTH_OK", "name" : "1", "latency" : 0.000667, "skew" : 0.000802 } ] }, "health" : { "health_services" : [ { "mons" : [ { "kb_avail" : 9161916, "kb_total" : 28510348, "name" : "0", "kb_used" : 17877152, "store_stats" : { "bytes_sst" : 0, "bytes_misc" : 28985450, "last_updated" : "0.000000", "bytes_total" : 31036097, "bytes_log" : 2050647 }, "avail_percent" : 32, "last_updated" : "2017-07-11 13:50:12.768057", "health" : "HEALTH_OK" }, { "name" : "1", "kb_total" : 28510348, "kb_avail" : 18154988, "avail_percent" : 63, "last_updated" : "2017-07-11 13:50:39.318205", "store_stats" : { "bytes_misc" : 28857931, "bytes_sst" : 0, "last_updated" : "0.000000", "bytes_total" : 29497513, "bytes_log" : 639582 }, "kb_used" : 8884080, "health" : "HEALTH_OK" } ] } ] } } } Cordialmente, *Robson Rosa Branco <>><* [C]: +55 21 99525-6856 [E]: robson.branco at gmail.com [S]: b_r_a_n_c_o [L]: http://www.linkedin.com/in/robsonbranco [T]: https://twitter.com/robsonrbranco [+]: https://google.com/+RobsonBrancoRosa From aderumier at odiso.com Wed Jul 12 09:14:16 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Wed, 12 Jul 2017 09:14:16 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? 
In-Reply-To: <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> Message-ID: <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> Hi, just for the record, I have migrate all my clusters with unicast, also big clusters with 16-20 nodes, and It's working fine. "pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected " seem to be gone. don't see any error on the cluster. traffic is around 3-4mbit/s on each node. ----- Mail original ----- De: "aderumier" ?: "pve-devel" Cc: "proxmoxve" Envoy?: Vendredi 7 Juillet 2017 11:51:42 Objet: Re: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? note that I'm just seeing , time to time (around once by hour), pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected But I don't have any corosync error / retransmit. ----- Mail original ----- De: "aderumier" ?: "pve-devel" , "proxmoxve" Envoy?: Vendredi 7 Juillet 2017 10:46:33 Objet: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? Hi, I'm looking to remove multicast from my network (Don't have too much time to explain, but we have multicast storm problem,because of igmp snooping bug) Does somebody running it with "big" clusters ? (10-16 nodes) I'm currently testing it with 9 nodes (1200vm+containers), I'm seeing around 3mbit/s of traffic on each node, and I don't have any cluster break for now. (Switch have recents asics with around 0,015ms latency). Any return of experience is welcome :) Thanks ! Alexandre _______________________________________________ pve-devel mailing list pve-devel at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel _______________________________________________ pve-devel mailing list pve-devel at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel From dimitris.beletsiotis at gmail.com Wed Jul 12 09:33:38 2017 From: dimitris.beletsiotis at gmail.com (Dimitris Beletsiotis) Date: Wed, 12 Jul 2017 07:33:38 +0000 Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> Message-ID: Hi Alexander, Nice, thanks for sharing, it will definitely help in some cases. Regards, Dimitris Beletsiotis On Wed, Jul 12, 2017, 10:15 Alexandre DERUMIER wrote: > Hi, > > just for the record, > > I have migrate all my clusters with unicast, also big clusters with 16-20 > nodes, and It's working fine. > > > "pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected " > seem to be gone. > > don't see any error on the cluster. > > traffic is around 3-4mbit/s on each node. > > > > ----- Mail original ----- > De: "aderumier" > ?: "pve-devel" > Cc: "proxmoxve" > Envoy?: Vendredi 7 Juillet 2017 11:51:42 > Objet: Re: [pve-devel] corosync unicast : does somebody use it in > production with 10-16 nodes ? > > note that I'm just seeing , time to time (around once by hour), > > pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected > > But I don't have any corosync error / retransmit. 
> > > ----- Mail original ----- > De: "aderumier" > ?: "pve-devel" , "proxmoxve" < > pve-user at pve.proxmox.com> > Envoy?: Vendredi 7 Juillet 2017 10:46:33 > Objet: [pve-devel] corosync unicast : does somebody use it in production > with 10-16 nodes ? > > Hi, > > I'm looking to remove multicast from my network (Don't have too much time > to explain, but we have multicast storm problem,because of igmp snooping > bug) > > Does somebody running it with "big" clusters ? (10-16 nodes) > > > I'm currently testing it with 9 nodes (1200vm+containers), I'm seeing > around 3mbit/s of traffic on each node, > > and I don't have any cluster break for now. (Switch have recents asics > with around 0,015ms latency). > > > Any return of experience is welcome :) > > Thanks ! > > Alexandre > _______________________________________________ > pve-devel mailing list > pve-devel at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > _______________________________________________ > pve-devel mailing list > pve-devel at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From dietmar at proxmox.com Wed Jul 12 12:03:31 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Wed, 12 Jul 2017 12:03:31 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> Message-ID: <1770977615.77.1499853811553@webmail.proxmox.com> > just for the record, > > I have migrate all my clusters with unicast, also big clusters with 16-20 > nodes, and It's working fine. > > > "pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected " seem > to be gone. > > don't see any error on the cluster. > > traffic is around 3-4mbit/s on each node. how many running VMs/Containers? From dorsyka at yahoo.com Wed Jul 12 13:08:31 2017 From: dorsyka at yahoo.com (dorsy) Date: Wed, 12 Jul 2017 13:08:31 +0200 Subject: [PVE-User] proxmox 5 - replication fails In-Reply-To: References: <49cb2934-2b98-5e83-b957-72f9809c20fc@binovo.es> Message-ID: <5bbefaab-d022-69dc-d7b4-84189570ec3a@yahoo.com> Another strange thing is that the failing jobs are running out of schedule. Jul 12 12:00:03 ns pvesr[12049]: total estimated size is 119M Jul 12 12:00:03 ns pvesr[12049]: TIME SENT SNAPSHOT Jul 12 12:00:04 ns pvesr[12049]: 12:00:04 9.18M rpool/data/vm-100-disk-1 at __replicate_100-0_1499853601__ ... Jul 12 12:00:14 ns pvesr[12049]: 12:00:14 113M rpool/data/vm-100-disk-1 at __replicate_100-0_1499853601__ No errors in logs. Finishes just seemingly fine. Schedule is */2:00. However after approx. 5 minutes it wants to send the whole disk: Jul 12 12:06:03 ns pvesr[13511]: send from @ to rpool/data/vm-100-disk-1 at __replicate_100-0_1499853960__ estimated size is 8.64G Is there any way to get more debug info from pvesr or the scheduling of the replication? 
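Regarding more debug info: pvesr keeps its per-job state in a small JSON file, and the jobs themselves are kicked off by a systemd timer, so both can be inspected directly. As far as I can tell, a job with a non-zero fail_count is retried a few minutes after the failed attempt rather than on its configured schedule, which would explain runs at 03:20 and 12:06. A minimal sketch; the job id 100-0 is taken from the log above, and the `pvesr run` options should be double-checked against `man pvesr`:

# summary of all jobs on this node: last/next sync, duration, fail count
pvesr status

# the state pvesr keeps between runs; last_sync here decides whether the next
# run can be incremental or has to be a full resync
cat /var/lib/pve-manager/pve-replication-state.json

# when the timer actually fired and what the runs logged
journalctl -u pvesr.timer -u pvesr.service --since "today"

# run a single job by hand and watch its output interactively
pvesr run --id 100-0 --verbose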
From d.csapak at proxmox.com Wed Jul 12 13:39:55 2017 From: d.csapak at proxmox.com (Dominik Csapak) Date: Wed, 12 Jul 2017 13:39:55 +0200 Subject: [PVE-User] proxmox 5 - replication fails In-Reply-To: <08f7dc5f-1d81-7f04-dcf6-f05a08cb1d51@yahoo.com> References: <2108859951.1396331.1499594684882@mail.yahoo.com> <08f7dc5f-1d81-7f04-dcf6-f05a08cb1d51@yahoo.com> Message-ID: <21c643b9-6fd8-2b41-54ef-3020ad57d677@proxmox.com> hi, i reply here, to avoid confusion in the other thread can you post the content of the two files: /etc/pve/replication.cfg /var/lib/pve-manager/pve-replication-state.json (of the source node) ? From dorsyka at yahoo.com Wed Jul 12 13:50:44 2017 From: dorsyka at yahoo.com (dorsy) Date: Wed, 12 Jul 2017 13:50:44 +0200 Subject: [PVE-User] proxmox 5 - replication fails In-Reply-To: <21c643b9-6fd8-2b41-54ef-3020ad57d677@proxmox.com> References: <2108859951.1396331.1499594684882@mail.yahoo.com> <08f7dc5f-1d81-7f04-dcf6-f05a08cb1d51@yahoo.com> <21c643b9-6fd8-2b41-54ef-3020ad57d677@proxmox.com> Message-ID: <18cc2741-ee2e-171a-54ac-754ce34ae3e2@yahoo.com> # cat /etc/pve/replication.cfg local: 105-0 target ns302695 rate 10 schedule */2:00 local: 103-0 target ns3511723 rate 11 schedule */20 local: 109-0 target ns3511723 rate 10 local: 102-0 target ns302695 rate 10 schedule 22:30 local: 107-0 target ns302695 rate 10 local: 100-0 target ns302695 rate 10 schedule */2:00 cat /var/lib/pve-manager/pve-replication-state.json {"103":{"local/ns3511723":{"storeid_list":["local-zfs"],"fail_count":0,"last_try":1499859600,"last_sync":1499859600,"last_iteration":1499859600,"last_node":"ns302695","duration":4.482678}},"109":{"local/ns3511723":{"fail_count":0,"storeid_list":["local-zfs"],"last_sync":1499859000,"last_try":1499859000,"last_iteration":1499859000,"last_node":"ns302695","duration":7.828846}}} On the failed node (at the moment, I had failures from both sides): # cat /var/lib/pve-manager/pve-replication-state.json {"105":{"local/ns302695":{"last_iteration":1499853601,"fail_count":0,"duration":32.107092,"last_node":"ns3511723","storeid_list":["local-zfs"],"last_try":1499853633,"last_sync":1499853633}},"102":{"local/ns302695":{"last_try":1499805001,"last_sync":1499805001,"last_node":"ns3511723","duration":126.81862,"storeid_list":["local-zfs"],"last_iteration":1499805001,"fail_count":0}},"107":{"local/ns302695":{"fail_count":0,"last_iteration":1499859000,"duration":3.511844,"last_node":"ns3511723","storeid_list":["local-zfs"],"last_try":1499859000,"last_sync":1499859000}},"100":{"local/ns302695":{"error":"command 'set -o pipefail && pvesm export local-zfs:vm-100-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_100-0_1499858220__ | /usr/bin/cstream -t 10000000 | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=ns302695' root at IP.OF.TAR.GET -- pvesm import local-zfs:vm-100-disk-1 zfs - -with-snapshots 1' failed: exit code 255","fail_count":5,"last_iteration":1499858220,"duration":2.493542,"last_node":"ns3511723","storeid_list":["local-zfs"],"last_try":1499858220,"last_sync":1499846406}}} But I knew all these from the API :) pve:/> get nodes/ns3511723/replication/100-0/log 200 OK [ { "n" : 1, "t" : "2017-07-12 13:17:00 100-0: start replication job" }, { "n" : 2, "t" : "2017-07-12 13:17:00 100-0: guest => VM 100, running => 12279" }, { "n" : 3, "t" : "2017-07-12 13:17:00 100-0: volumes => local-zfs:vm-100-disk-1" }, { "n" : 4, "t" : "2017-07-12 13:17:01 100-0: create snapshot '__replicate_100-0_1499858220__' on local-zfs:vm-100-disk-1" }, { "n" : 5, "t" : "2017-07-12 13:17:01 100-0: full 
sync 'local-zfs:vm-100-disk-1' (__replicate_100-0_1499858220__)" }, { "n" : 6, "t" : "2017-07-12 13:17:03 100-0: delete previous replication snapshot '__replicate_100-0_1499858220__' on local-zfs:vm-100-disk-1" }, { "n" : 7, "t" : "2017-07-12 13:17:03 100-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-100-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_100-0_1499858220__ | /usr/bin/cstream -t 10000000 | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=ns302695' root at IP.OF.TAR.GET -- pvesm import local-zfs:vm-100-disk-1 zfs - -with-snapshots 1' failed: exit code 255" } ] pve:/> get nodes/ns3511723/replication/100-0/status 200 OK { "duration" : 2.493542, "error" : "command 'set -o pipefail && pvesm export local-zfs:vm-100-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_100-0_1499858220__ | /usr/bin/cstream -t 10000000 | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=ns302695' root at IP.OF.TAR.GET -- pvesm import local-zfs:vm-100-disk-1 zfs - -with-snapshots 1' failed: exit code 255", "fail_count" : 5, "guest" : "100", "id" : "100-0", "jobnum" : "0", "last_sync" : 1499846406, "last_try" : 1499858220, "next_sync" : 1499860020, "rate" : 10, "schedule" : "*/2:00", "target" : "ns302695", "type" : "local", "vmtype" : "qemu" } Also, I have set a throttle of 10MB/s for the replication jobs, which is just a portion of the available bandwidth between the nodes, it should not be an issue. On 2017-07-12 13:39, Dominik Csapak wrote: > hi, > > i reply here, to avoid confusion in the other thread > > can you post the content of the two files: > > /etc/pve/replication.cfg > /var/lib/pve-manager/pve-replication-state.json (of the source node) > > ? > > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From aderumier at odiso.com Wed Jul 12 14:46:18 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Wed, 12 Jul 2017 14:46:18 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <1770977615.77.1499853811553@webmail.proxmox.com> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> <1770977615.77.1499853811553@webmail.proxmox.com> Message-ID: <2897780.181311.1499863578595.JavaMail.zimbra@oxygem.tv> >>how many running VMs/Containers? on 20 cluster nodes, around 1000 vm on a 10 cluster nodes, 800vm + 800ct on a 9 cluster nodes, 400vm ----- Mail original ----- De: "dietmar" ?: "aderumier" , "pve-devel" Cc: "proxmoxve" Envoy?: Mercredi 12 Juillet 2017 12:03:31 Objet: Re: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? > just for the record, > > I have migrate all my clusters with unicast, also big clusters with 16-20 > nodes, and It's working fine. > > > "pvedaemon: ipcc_send_rec failed: Transport endpoint is not connected " seem > to be gone. > > don't see any error on the cluster. > > traffic is around 3-4mbit/s on each node. how many running VMs/Containers? From dietmar at proxmox.com Wed Jul 12 17:42:18 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Wed, 12 Jul 2017 17:42:18 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? 
In-Reply-To: <2897780.181311.1499863578595.JavaMail.zimbra@oxygem.tv> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> <1770977615.77.1499853811553@webmail.proxmox.com> <2897780.181311.1499863578595.JavaMail.zimbra@oxygem.tv> Message-ID: <173720687.115.1499874138734@webmail.proxmox.com> > >>how many running VMs/Containers? > > on 20 cluster nodes, around 1000 vm > > on a 10 cluster nodes, 800vm + 800ct > > on a 9 cluster nodes, 400vm Interesting. So far I did not know anybody using that with more than 6 nodes ... From aderumier at odiso.com Wed Jul 12 18:05:06 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Wed, 12 Jul 2017 18:05:06 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <173720687.115.1499874138734@webmail.proxmox.com> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> <1770977615.77.1499853811553@webmail.proxmox.com> <2897780.181311.1499863578595.JavaMail.zimbra@oxygem.tv> <173720687.115.1499874138734@webmail.proxmox.com> Message-ID: <73091534.190421.1499875506007.JavaMail.zimbra@oxygem.tv> forgot to said, it's with proxmox 4, corosync 2.4.2-2~pve4+1 cpu are CPU E5-2687W v3 @ 3.10GHz ----- Mail original ----- De: "dietmar" ?: "aderumier" Cc: "proxmoxve" , "pve-devel" Envoy?: Mercredi 12 Juillet 2017 17:42:18 Objet: Re: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? > >>how many running VMs/Containers? > > on 20 cluster nodes, around 1000 vm > > on a 10 cluster nodes, 800vm + 800ct > > on a 9 cluster nodes, 400vm Interesting. So far I did not know anybody using that with more than 6 nodes ... From t.lamprecht at proxmox.com Thu Jul 13 07:56:49 2017 From: t.lamprecht at proxmox.com (Thomas Lamprecht) Date: Thu, 13 Jul 2017 07:56:49 +0200 Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <73091534.190421.1499875506007.JavaMail.zimbra@oxygem.tv> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> <1770977615.77.1499853811553@webmail.proxmox.com> <2897780.181311.1499863578595.JavaMail.zimbra@oxygem.tv> <173720687.115.1499874138734@webmail.proxmox.com> <73091534.190421.1499875506007.JavaMail.zimbra@oxygem.tv> Message-ID: <50b45aea-f537-6f3e-dde4-5416179baf9d@proxmox.com> On 07/12/2017 06:05 PM, Alexandre DERUMIER wrote: > forgot to said, it's with proxmox 4, corosync 2.4.2-2~pve4+1 > > cpu are CPU E5-2687W v3 @ 3.10GHz And what switches do you employ there? You said something about modern low latency ASIC switches. But yes, quite interesting that this can be done. Maybe we should adapt pvecm so that unicast mode is a bit easier to setup. > > ----- Mail original ----- > De: "dietmar" > ?: "aderumier" > Cc: "proxmoxve" , "pve-devel" > Envoy?: Mercredi 12 Juillet 2017 17:42:18 > Objet: Re: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? > >>>> how many running VMs/Containers? >> on 20 cluster nodes, around 1000 vm >> >> on a 10 cluster nodes, 800vm + 800ct >> >> on a 9 cluster nodes, 400vm > Interesting. 
So far I did not know anybody using that with more than 6 nodes ... > > _______________________________________________ > pve-devel mailing list > pve-devel at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel From aderumier at odiso.com Thu Jul 13 15:42:20 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Thu, 13 Jul 2017 15:42:20 +0200 (CEST) Subject: [PVE-User] [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? In-Reply-To: <50b45aea-f537-6f3e-dde4-5416179baf9d@proxmox.com> References: <332178591.16090.1499417193573.JavaMail.zimbra@oxygem.tv> <513766960.21884.1499421102872.JavaMail.zimbra@oxygem.tv> <1863695419.164479.1499843656435.JavaMail.zimbra@oxygem.tv> <1770977615.77.1499853811553@webmail.proxmox.com> <2897780.181311.1499863578595.JavaMail.zimbra@oxygem.tv> <173720687.115.1499874138734@webmail.proxmox.com> <73091534.190421.1499875506007.JavaMail.zimbra@oxygem.tv> <50b45aea-f537-6f3e-dde4-5416179baf9d@proxmox.com> Message-ID: <1079188659.222502.1499953340776.JavaMail.zimbra@oxygem.tv> >>And what switches do you employ there? You said something about modern >>low latency ASIC switches. new mellanox switches, with spectrum asic http://www.mellanox.com/page/products_dyn?product_family=251&mtag=sn2000 ----- Mail original ----- De: "Thomas Lamprecht" ?: "pve-devel" , "aderumier" , "dietmar" Cc: "proxmoxve" Envoy?: Jeudi 13 Juillet 2017 07:56:49 Objet: Re: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? On 07/12/2017 06:05 PM, Alexandre DERUMIER wrote: > forgot to said, it's with proxmox 4, corosync 2.4.2-2~pve4+1 > > cpu are CPU E5-2687W v3 @ 3.10GHz And what switches do you employ there? You said something about modern low latency ASIC switches. But yes, quite interesting that this can be done. Maybe we should adapt pvecm so that unicast mode is a bit easier to setup. > > ----- Mail original ----- > De: "dietmar" > ?: "aderumier" > Cc: "proxmoxve" , "pve-devel" > Envoy?: Mercredi 12 Juillet 2017 17:42:18 > Objet: Re: [pve-devel] corosync unicast : does somebody use it in production with 10-16 nodes ? > >>>> how many running VMs/Containers? >> on 20 cluster nodes, around 1000 vm >> >> on a 10 cluster nodes, 800vm + 800ct >> >> on a 9 cluster nodes, 400vm > Interesting. So far I did not know anybody using that with more than 6 nodes ... > > _______________________________________________ > pve-devel mailing list > pve-devel at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel From infolist at schwarz-fr.net Sat Jul 15 16:02:04 2017 From: infolist at schwarz-fr.net (Phil Schwarz) Date: Sat, 15 Jul 2017 16:02:04 +0200 Subject: [PVE-User] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous Message-ID: Hi, short version : I broke my cluster ! Long version , with context: With a 4 nodes Proxmox Cluster The nodes are all Pproxmox 5.05+Ceph luminous with filestore -3 mon+OSD -1 LXC+OSD Was working fine Added a fifth node (proxmox+ceph) today a broke everything.. Though every node can ping each other, the web GUI is full of red crossed nodes. No LXC is seen though there up and alive. However, every other proxmox is manageable through the web GUI.... In logs, i've tons of same message on 2 over 3 mons : " failed to decode message of type 80 v6: buffer::malformed_input: void pg_history_t::decode(ceph::buffer::list::iterator&) unknown encoding version > 7" Thanks for your answers. 
Best regards While investigating, i wondered about my config : Question relative to /etc/hosts file : Should i use private_replication_LAN Ip or public ones ? From lists at merit.unu.edu Sun Jul 16 14:28:18 2017 From: lists at merit.unu.edu (mj) Date: Sun, 16 Jul 2017 14:28:18 +0200 Subject: [PVE-User] btrfs in a guest Message-ID: <87415083-f73d-0259-8ec9-6820b61513ab@merit.unu.edu> Hi, Just a quick question: I am building a new samba 4.6 fileserver on proxmox with ceph storage backend. I am tempted to try btrfs as a filesystem for the samba shares, because of the 'previous versions' functionality. (btrfs + snapshots) However... googling about btrfs/ceph etc, I'm not sure if it's a wise decision, and it's difficult to find info on it, as mostly the results are about using btrfs as storage for ceph. So, is it a good idea to use btrfs in a guest, or better stick with (for example) xfs? Have a nice sunday everybody! MJ From gilberto.nunes32 at gmail.com Sun Jul 16 23:50:03 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Sun, 16 Jul 2017 18:50:03 -0300 Subject: [PVE-User] LXC Livre Migration Message-ID: pct migrate 100 pve01 -online 2017-07-16 18:48:36 ERROR: migration aborted (duration 00:00:00): lxc live migration is currently not implemented migration aborted pveversion pve-manager/5.0-23/af4267bf (running kernel: 4.10.15-1-pve) Gilberto Ferreira From lindsay.mathieson at gmail.com Mon Jul 17 01:04:17 2017 From: lindsay.mathieson at gmail.com (Lindsay Mathieson) Date: Mon, 17 Jul 2017 09:04:17 +1000 Subject: [PVE-User] btrfs in a guest In-Reply-To: <87415083-f73d-0259-8ec9-6820b61513ab@merit.unu.edu> References: <87415083-f73d-0259-8ec9-6820b61513ab@merit.unu.edu> Message-ID: On 16/07/2017 10:28 PM, mj wrote: > Just a quick question: I am building a new samba 4.6 fileserver on > proxmox with ceph storage backend. I am tempted to try btrfs as a > filesystem for the samba shares, because of the 'previous versions' > functionality. (btrfs + snapshots) The Samba server is a Qemu VM? The backing filesystem (Ceph) should be irrelevant to whatever filesystem you use in the VM. Personally I'd go with zfs over btrf. -- Lindsay Mathieson From lists at merit.unu.edu Mon Jul 17 13:20:34 2017 From: lists at merit.unu.edu (lists) Date: Mon, 17 Jul 2017 13:20:34 +0200 Subject: [PVE-User] btrfs in a guest In-Reply-To: References: <87415083-f73d-0259-8ec9-6820b61513ab@merit.unu.edu> Message-ID: Hi Lindsay, Thanks for your reply. On 17-7-2017 1:04, Lindsay Mathieson wrote: > The Samba server is a Qemu VM? yes. > The backing filesystem (Ceph) should be irrelevant to whatever > filesystem you use in the VM. Yes, I realise that. I know it's possible, and btrfs and xfs also seem to perform (after some brief testing) similarly. But there is a lot of discussion about "CoW penalty". And that's why I'm asking. For what it's worth: Our ceph has xfs OSDs. So, should I worry about this CoW penalty or not really? > Personally I'd go with zfs over btrf. Interesting. I see that also with zfs, you can expose previous versions via samba. You prefer zfs, because..? (The "more mature" argument, or other reasons as well..? perhaps specific to running on Qemu VM on ceph storage?) 
MJ From lindsay.mathieson at gmail.com Mon Jul 17 13:56:32 2017 From: lindsay.mathieson at gmail.com (Lindsay Mathieson) Date: Mon, 17 Jul 2017 21:56:32 +1000 Subject: [PVE-User] btrfs in a guest In-Reply-To: References: <87415083-f73d-0259-8ec9-6820b61513ab@merit.unu.edu> Message-ID: On 17/07/2017 9:20 PM, lists wrote: > So, should I worry about this CoW penalty or not really? Ah, I see what you mean. I wouldn't have thought so, but for the definitive answer probably best asked on the Ceph list. > >> Personally I'd go with zfs over btrf. > Interesting. I see that also with zfs, you can expose previous > versions via samba. > > You prefer zfs, because..? (The "more mature" argument, or other > reasons as well..? perhaps specific to running on Qemu VM on ceph storag More mature basically, and I don't have a lot of faith in btrfs stability. Last I checked RAID 5/6 was still experimental, for good reasons. Still issues with fragmentation. -- Lindsay Mathieson From yannis.milios at gmail.com Mon Jul 17 13:57:55 2017 From: yannis.milios at gmail.com (Yannis Milios) Date: Mon, 17 Jul 2017 12:57:55 +0100 Subject: [PVE-User] btrfs in a guest In-Reply-To: References: <87415083-f73d-0259-8ec9-6820b61513ab@merit.unu.edu> Message-ID: > > >> Personally I'd go with zfs over btrf. > >> Interesting. I see that also with zfs, you can expose previous versions via samba. >> You prefer zfs, because..? (The "more mature" argument, or other reasons as well..? perhaps specific to running on Qemu VM on ceph >> storage?) I would go for ZFS for that scenario but definitely I wouldn't try to use it on a VM. I would prefer a physical server running a linux distro and ZoL for ZFS or maybe FreeNAS + SAMBA to expose the shares on clients. You could also use a second server as a ZFS sync target for failover purposes.. Yannis On Mon, Jul 17, 2017 at 12:20 PM, lists wrote: > Hi Lindsay, > > Thanks for your reply. > > On 17-7-2017 1:04, Lindsay Mathieson wrote: > >> The Samba server is a Qemu VM? >> > yes. > > The backing filesystem (Ceph) should be irrelevant to whatever filesystem >> you use in the VM. >> > Yes, I realise that. I know it's possible, and btrfs and xfs also seem to > perform (after some brief testing) similarly. But there is a lot of > discussion about "CoW penalty". > > And that's why I'm asking. > > For what it's worth: Our ceph has xfs OSDs. > > So, should I worry about this CoW penalty or not really? > > Personally I'd go with zfs over btrf. >> > Interesting. I see that also with zfs, you can expose previous versions > via samba. > > You prefer zfs, because..? (The "more mature" argument, or other reasons > as well..? perhaps specific to running on Qemu VM on ceph storage?) > > MJ > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From wolfgang.riegler at gmx.de Mon Jul 17 16:27:22 2017 From: wolfgang.riegler at gmx.de (Wolfgang Riegler) Date: Mon, 17 Jul 2017 16:27:22 +0200 Subject: [PVE-User] Proxmox 5.0 storage replication failure In-Reply-To: <1034256638.12.1499711142257@webmail.proxmox.com> References: <3934799.vJvyeggbSL@wolfgang> <1034256638.12.1499711142257@webmail.proxmox.com> Message-ID: <1500309724.KvqRkjb0X2@wolfgang> Sorry for long delay ... 
here is the log: 2017-07-17 16:26:00 100-0: start replication job 2017-07-17 16:26:00 100-0: guest => CT 100, running => 0 2017-07-17 16:26:01 100-0: volumes => local-zfs:subvol-100-disk-1 2017-07-17 16:26:01 100-0: create snapshot '__replicate_100-0_1500301560__' on local-zfs:subvol-100-disk-1 2017-07-17 16:26:01 100-0: full sync 'local-zfs:subvol-100-disk-1' (__replicate_100-0_1500301560__) 2017-07-17 16:26:02 100-0: delete previous replication snapshot '__replicate_100-0_1500301560__' on local-zfs:subvol-100-disk-1 2017-07-17 16:26:02 100-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:subvol-100-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_100-0_1500301560__ | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=vbox-proxmox1' root at 192.168.43.220 -- pvesm import local-zfs:subvol-100-disk-1 zfs - -with-snapshots 1' failed: exit code 255 kind regards Wolfgang Am Montag, 10. Juli 2017, 20:25:41 CEST schrieb Dietmar Maurer: > > the new storage replication feature is really great! But there is an issue: > > Unfortunately the replication breaks completely if somebody do a rollback to > > an older snapshot than the last sync of a container and destroys that snapshot > > before the next sync. > > AFAIK it simply syncs from rollbacked snapshot instead. Please can you post the > replication log with the error? > > From m at plus-plus.su Wed Jul 19 01:10:39 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 02:10:39 +0300 Subject: [PVE-User] ZFS over iSCSI in Proxmox 5.0 issues Message-ID: <6de25416-8a88-ee39-d51c-7b573affb9fb@plus-plus.su> Hello, I'm trying to setup Proxmox 5.0 node as shared storage node (basically I have a 4x10TB disks inside that server, Proxmox installed on ZFS RAID 10 filesystem). I'm willing to use this node as a host for KVM vms, and also as a storage source for my other nodes. The issue is that I cannot seem to setup this node as a ZFS-over-iSCSI source for the other nodes. It looks like Debian Stretch (base system for Proxmox 5.0) has dropped support for "iscsitarget" package that was available in Debian Jessie - iscsitarget package provides tools for IET (ietadm, ietd). So the problem comes when I'm trying to setup new ZFS-over-iSCSI storage from the Datacenter GUI: there I have to choose iSCSI Provider module - Comstar, istgt, IET. I cannot choose Comstar because it is purely for Solaris type of OS. I cannot choose IET because my Proxmox 5.0 host has no IET (iscsitarget) package available. And as a last resort, I have tried "istgt" as a provider (before that, I installed "istgt" package inside my Proxmox 5.0 storage node). Before doing this, I followed Proxmox wiki page and set up ssh keys for authorization on Proxmox 5.0 storage server. This is all working good. However, if I choose istgt as a provider, I get the following error whenever I try to create/run new vm that's storage source is set to ZFS-over-iSCSI volume: TASK ERROR: create failed - No configuration found. Install istgt on 192.168.88.2 at /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm line 99. istgt is actually installed on 192.168.88.2. The question is how to use ZFS-over-iSCSI in this scenario? Thanks! From m at plus-plus.su Wed Jul 19 01:34:39 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 02:34:39 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS Message-ID: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> Hello, Having another issue here that I can't figure out. 
I have a 2-node cluster here, running Proxmox 4.4. This cluster is using shared storage that connects to a Linux storage server using iSCSI via direct (NIC-to-NIC) 10-Gbit LAN (storage server has 2x10-Gbit ports network cards; Proxmox nodes also have 10Gbit LAN NICs). My storage model for the VMs is LVM-over-iSCSI (storage.cfg from one of the nodes): iscsi: Storage portal 192.168.4.1 target iqn.2016-03.eu.company:storage.all.luns content none lvm: WWW-hosts vgname vg-www-hosts base Storage:0.0.1.scsi-1494554000000000060fd8f3f2072a735bf2387ea56fe72a0 content images shared 1 nfs: vmnfs export /mnt/vmnfs path /mnt/pve/vmnfs server 192.168.4.1 content images maxfiles 1 options vers=4 Problem is that LVM access to the virtual machine's LVs is terribly slow, while NFS transfers are fast enough (storage server has 4x4TB SAS drives in RAID10 configured with MDADM). Here's 2 simple test I performed to get the transfer speeds: 1) For the LVM-over-iSCSI transfers: (on the proxmox node) # dd if=/dev/vg-www-hosts/vm-82105-disk-1 of=vm-82105-disk-1 bs=1M count=3000 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 70.0969 s, 44.9 MB/s # 2) For the NFS transfer from the same storage server (NFS is configured as above): root at pm1:~# dd if=/mnt/pve/ISOimages/template/iso/myOS.iso of=myOS.iso bs=1M count=3000 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 1.17121 s, 2.7 GB/s root at pm1:~# So I cannot figure out why LVM-over-iSCSI is so slow. I'm using iscsitarget package on Debian Jessie on the storage server as a iSCSI target. Could this be due to iscsitarget package? As I wrote earlier, iscsitarget appears to be abandoned in Debian Stretch, which leads me to question if this is the right iSCSI target software I'm using here.. Thanks! From w.link at proxmox.com Wed Jul 19 07:17:42 2017 From: w.link at proxmox.com (Wolfgang Link) Date: Wed, 19 Jul 2017 07:17:42 +0200 Subject: [PVE-User] ZFS over iSCSI in Proxmox 5.0 issues In-Reply-To: <6de25416-8a88-ee39-d51c-7b573affb9fb@plus-plus.su> References: <6de25416-8a88-ee39-d51c-7b573affb9fb@plus-plus.su> Message-ID: <7ec0a483-1999-59ba-7c08-7b2d4e4e913e@proxmox.com> Hi, Proxmox VE is not a storage box, so we do not provide this kind of setups. ZFS over iSCSI is used if you have a external storage box like FreeNas. Debian Stretch use lio as iscsi target what should also work with IET. On 07/19/2017 01:10 AM, Mikhail wrote: > Hello, > > I'm trying to setup Proxmox 5.0 node as shared storage node (basically I > have a 4x10TB disks inside that server, Proxmox installed on ZFS RAID 10 > filesystem). I'm willing to use this node as a host for KVM vms, and > also as a storage source for my other nodes. > > The issue is that I cannot seem to setup this node as a ZFS-over-iSCSI > source for the other nodes. It looks like Debian Stretch (base system > for Proxmox 5.0) has dropped support for "iscsitarget" package that was > available in Debian Jessie - iscsitarget package provides tools for IET > (ietadm, ietd). So the problem comes when I'm trying to setup new > ZFS-over-iSCSI storage from the Datacenter GUI: there I have to choose > iSCSI Provider module - Comstar, istgt, IET. > > I cannot choose Comstar because it is purely for Solaris type of OS. > I cannot choose IET because my Proxmox 5.0 host has no IET (iscsitarget) > package available. > And as a last resort, I have tried "istgt" as a provider (before that, I > installed "istgt" package inside my Proxmox 5.0 storage node). 
> Before doing this, I followed Proxmox wiki page and set up ssh keys for > authorization on Proxmox 5.0 storage server. This is all working good. > However, if I choose istgt as a provider, I get the following error > whenever I try to create/run new vm that's storage source is set to > ZFS-over-iSCSI volume: > > TASK ERROR: create failed - No configuration found. Install istgt on > 192.168.88.2 at /usr/share/perl5/PVE/Storage/LunCmd/Istgt.pm line 99. > > istgt is actually installed on 192.168.88.2. > > The question is how to use ZFS-over-iSCSI in this scenario? > > Thanks! > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From dietmar at proxmox.com Wed Jul 19 08:41:43 2017 From: dietmar at proxmox.com (Dietmar Maurer) Date: Wed, 19 Jul 2017 08:41:43 +0200 (CEST) Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> Message-ID: <524889534.2.1500446503537@webmail.proxmox.com> > So I cannot figure out why LVM-over-iSCSI is so slow. I guess your benchmark is simply wrong. You are testing the local cache, because you do not sync the data back to the storage. From elacunza at binovo.es Wed Jul 19 09:03:10 2017 From: elacunza at binovo.es (Eneko Lacunza) Date: Wed, 19 Jul 2017 09:03:10 +0200 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <524889534.2.1500446503537@webmail.proxmox.com> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> Message-ID: El 19/07/17 a las 08:41, Dietmar Maurer escribi?: >> So I cannot figure out why LVM-over-iSCSI is so slow. > I guess your benchmark is simply wrong. You are testing the > local cache, because you do not sync the data back to the storage. Really, 2.7GB/s for 4x4TB disks in RAID10 seems totally unreasonable (I guess they're not SSD drives...) I think that in the best conditions that could give about 200-250MB/s max, totally sequential writes, etc. Don't know why iSCSI is so slow, have you checked CPU usage on both sides? Anyhow your test copy is too small, use a file that at least is double the available RAM on storage server, or otherwise force sync. Cheers Eneko -- Zuzendari Teknikoa / Director T?cnico Binovo IT Human Project, S.L. Telf. 943493611 943324914 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es From m at plus-plus.su Wed Jul 19 09:58:46 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 10:58:46 +0300 Subject: [PVE-User] ZFS over iSCSI in Proxmox 5.0 issues In-Reply-To: <7ec0a483-1999-59ba-7c08-7b2d4e4e913e@proxmox.com> References: <6de25416-8a88-ee39-d51c-7b573affb9fb@plus-plus.su> <7ec0a483-1999-59ba-7c08-7b2d4e4e913e@proxmox.com> Message-ID: <251349ab-5214-8be7-3626-a8848523e5de@plus-plus.su> On 07/19/2017 08:17 AM, Wolfgang Link wrote: > Hi, > > Proxmox VE is not a storage box, so we do not provide this kind of setups. > > ZFS over iSCSI is used if you have a external storage box like FreeNas. > > Debian Stretch use lio as iscsi target what should also work with IET. Hello, Thanks for your response. I realize that Proxmox should not act as a both storage/vm host - I'm doing this more for testing purposes. Should I install some additional package on the Proxmox 5.0 host that will act as a ZFS-iSCSI target/portal? 
I'm asking because it does not seem to work as is, I'm getting the following when trying to run/migrate VM on created storage: TASK ERROR: clone failed: No configuration found. Install iet on 192.168.88.2 at /usr/share/perl5/PVE/Storage/LunCmd/Iet.pm line 113. Thanks, Mikhail. From m at plus-plus.su Wed Jul 19 11:32:05 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 12:32:05 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> Message-ID: <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> Hello, Thanks for your responses. The issue appears to be somewhere beyond iSCSI. I just tried to do some "dd" tests locally on the storage server and I'm getting very low write speeds: root at storage:/root# dd if=/dev/vg0/isoimages of=isoimages.vg0 62914560+0 records in 62914560+0 records out 32212254720 bytes (32 GB) copied, 945.573 s, 34.1 MB/s root at storage:/root# (/dev/vg0/isoimages is local LV to the storage server) So will have to find out what's the problem or bottleneck somewhere else. Load average on storage server is 4.0-5.0 for the following CPU according to lscpu output: Model name: Intel(R) Xeon(R) CPU E3-1230 v5 @ 3.40GHz Stepping: 3 CPU MHz: 800.000 CPU max MHz: 3401.0000 CPU min MHz: 800.0000 Could this be due to low (800.000 MHz) CPU frequency? Thanks! On 07/19/2017 10:03 AM, Eneko Lacunza wrote: > El 19/07/17 a las 08:41, Dietmar Maurer escribi?: >>> So I cannot figure out why LVM-over-iSCSI is so slow. >> I guess your benchmark is simply wrong. You are testing the >> local cache, because you do not sync the data back to the storage. > Really, 2.7GB/s for 4x4TB disks in RAID10 seems totally unreasonable (I > guess they're not SSD drives...) > > I think that in the best conditions that could give about 200-250MB/s > max, totally sequential writes, etc. > > Don't know why iSCSI is so slow, have you checked CPU usage on both sides? > > Anyhow your test copy is too small, use a file that at least is double > the available RAM on storage server, or otherwise force sync. > > Cheers > Eneko > From e.kasper at proxmox.com Wed Jul 19 11:52:36 2017 From: e.kasper at proxmox.com (Emmanuel Kasper) Date: Wed, 19 Jul 2017 11:52:36 +0200 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> Message-ID: <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> On 07/19/2017 11:32 AM, Mikhail wrote: > Hello, > > Thanks for your responses. > The issue appears to be somewhere beyond iSCSI. 
> I just tried to do some "dd" tests locally on the storage server and I'm > getting very low write speeds: do not use dd to benchmark storages, use fio with a command line like fio --size=9G --bs=64k --rw=write --direct=1 --runtime=60 --name=64kwrite --group_reporting | grep bw inside your mount point or use the --filename option to point to a block device from this you will get reliable sequential write info From m at plus-plus.su Wed Jul 19 12:01:45 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 13:01:45 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> Message-ID: On 07/19/2017 12:52 PM, Emmanuel Kasper wrote: > do not use dd to benchmark storages, use fio > > with a command line like > > fio --size=9G --bs=64k --rw=write --direct=1 --runtime=60 > --name=64kwrite --group_reporting | grep bw > > inside your mount point > > or use the --filename option to point to a block device > > from this you will get reliable sequential write info Emmanuel, thanks for the hint! Just tried benchmarking with fio using your command line. Results below, looks very slow (avg=24888.52 KB/s): # fio --size=9G --bs=64k --rw=write --direct=1 --runtime=60 --name=64kwrite --group_reporting 64kwrite: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1 fio-2.1.11 Starting 1 process 64kwrite: Laying out IO file(s) (1 file(s) / 9216MB) Jobs: 1 (f=1): [W(1)] [15.4% done] [0KB/1022KB/0KB /s] [0/15/0 iops] [eta 05m:34s] 64kwrite: (groupid=0, jobs=1): err= 0: pid=7841: Wed Jul 19 12:57:15 2017 write: io=1422.6MB, bw=24231KB/s, iops=378, runt= 60117msec clat (usec): min=87, max=293416, avg=2637.70, stdev=14667.15 lat (usec): min=87, max=293418, avg=2639.85, stdev=14667.17 clat percentiles (usec): | 1.00th=[ 87], 5.00th=[ 88], 10.00th=[ 88], 20.00th=[ 89], | 30.00th=[ 101], 40.00th=[ 135], 50.00th=[ 195], 60.00th=[ 235], | 70.00th=[ 334], 80.00th=[ 414], 90.00th=[ 700], 95.00th=[ 8384], | 99.00th=[81408], 99.50th=[117248], 99.90th=[193536], 99.95th=[211968], | 99.99th=[250880] bw (KB /s): min= 555, max=172928, per=100.00%, avg=24888.52, stdev=34949.10 lat (usec) : 100=29.27%, 250=32.97%, 500=25.85%, 750=2.35%, 1000=1.41% lat (msec) : 2=0.49%, 4=0.37%, 10=3.04%, 20=1.57%, 50=1.22% lat (msec) : 100=0.78%, 250=0.67%, 500=0.01% cpu : usr=0.18%, sys=1.34%, ctx=26211, majf=0, minf=8 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued : total=r=0/w=22761/d=0, short=r=0/w=0/d=0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): WRITE: io=1422.6MB, aggrb=24231KB/s, minb=24231KB/s, maxb=24231KB/s, mint=60117msec, maxt=60117msec Disk stats (read/write): dm-7: ios=0/22961, merge=0/0, ticks=0/77576, in_queue=77692, util=98.84%, aggrios=2437/28407, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00% md0: ios=2437/28407, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=1035/14632, aggrmerge=53/259, aggrticks=4785/68958, aggrin_queue=73796, aggrutil=67.74% sda: ios=1782/14834, merge=50/265, ticks=8488/77372, in_queue=85876, util=67.74% sdb: ios=1153/14837, merge=50/264, 
ticks=4460/71308, in_queue=75792, util=63.19% sdc: ios=737/14428, merge=57/254, ticks=3924/65828, in_queue=69896, util=56.76% sdd: ios=471/14431, merge=55/255, ticks=2268/61324, in_queue=63620, util=54.84% # I have also changed CPU freq. to max 3.40GHz, but looks like this was not an issue. Mikhail. From yannis.milios at gmail.com Wed Jul 19 13:43:40 2017 From: yannis.milios at gmail.com (Yannis Milios) Date: Wed, 19 Jul 2017 12:43:40 +0100 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> Message-ID: >> (storage server has 4x4TB SAS >> drives in RAID10 configured with MDADM) Have you checked if these drives are properly aligned, sometimes that can cause low r/w performance. Is there any particular reason you use mdadm instead of h/w raid controller? Yannis From m at plus-plus.su Wed Jul 19 14:10:22 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 15:10:22 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> Message-ID: <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> On 07/19/2017 02:43 PM, Yannis Milios wrote: > Have you checked if these drives are properly aligned, sometimes that can > cause low r/w performance. > Is there any particular reason you use mdadm instead of h/w raid controller? Hello Yannis There's no h/w raid controller because first we wanted to adopt ZFS on that storage server. I wanted to use OmniOS as a base OS, but by the time (about ~15 months ago) OmniOS did not support Intel's X550 10GiGE (no drivers in kernel) NICs we have inside that server, so had to fall back to Linux. As you know ZFS feels better when it has direct access to the drives, without h/w raid level. The MDADM RAID10 array was created without specifying any special alignment options. What's the best way to check if the drives are aligned in a proper way on existent arrwy? 
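For an existing array, each layer can simply be asked for its offsets; a rough sketch, using the device names that appear in this thread (the partition number is the RAID member partition shown further down), where "aligned" means the partition start, the md data offset and the LVM pe_start are all multiples of the 4 KiB physical sector:

parted /dev/sda align-check optimal 2    # partition vs. the disk's reported optimal I/O size
mdadm -E /dev/sda2 | grep -i offset      # per-member "Data Offset" of the md array
pvs -o +pe_start /dev/md0                # where LVM starts allocating on top of md0
lsblk -t                                 # kernel topology view, see the ALIGNMENT column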
Here's what I can see now: 1) fdisk output for one of disks in array: # fdisk -l /dev/sda Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors Units: sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 4096 bytes I/O size (minimum/optimal): 4096 bytes / 4096 bytes Disklabel type: gpt Disk identifier: 482FED1A-9CD0-4AEF-ACFC-D981C9916FE2 Device Start End Sectors Size Type /dev/sda1 2048 1953791 1951744 953M Linux filesystem /dev/sda2 1953792 7814035455 7812081664 3.7T Linux RAID 2) MDADM array details: # mdadm --detail /dev/md0 /dev/md0: Version : 1.2 Creation Time : Fri Mar 18 18:27:06 2016 Raid Level : raid10 Array Size : 7811819520 (7449.93 GiB 7999.30 GB) Used Dev Size : 3905909760 (3724.97 GiB 3999.65 GB) Raid Devices : 4 Total Devices : 4 Persistence : Superblock is persistent Intent Bitmap : Internal Update Time : Wed Jul 19 14:58:57 2017 State : active, checking Active Devices : 4 Working Devices : 4 Failed Devices : 0 Spare Devices : 0 Layout : near=2 Chunk Size : 512K Check Status : 43% complete Name : storage:0 (local to host storage) UUID : 7346ef36:0a6b33f6:37eb29cd:58d04b7c Events : 1010431 Number Major Minor RaidDevice State 0 8 2 0 active sync set-A /dev/sda2 1 8 18 1 active sync set-B /dev/sdb2 2 8 34 2 active sync set-A /dev/sdc2 3 8 50 3 active sync set-B /dev/sdd2 3) and LVM information for the PV that resides on md0 arrway: # pvdisplay --- Physical volume --- PV Name /dev/md0 VG Name vg0 PV Size 7.28 TiB / not usable 2.00 MiB Allocatable yes PE Size 4.00 MiB Total PE 1907182 Free PE 182430 Allocated PE 1724752 PV UUID CefFFF-Q6yz-eX2p-Ziev-jdFW-3G6h-vHaesD The mdadm array is running check right now, but the speed is limited to it's defaults: # cat /proc/sys/dev/raid/speed_limit_max 200000 # cat /proc/sys/dev/raid/speed_limit_min 1000 # cat /proc/mdstat Personalities : [raid10] md0 : active raid10 sda2[0] sdd2[3] sdc2[2] sdb2[1] 7811819520 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU] [========>............] check = 43.6% (3412377088/7811819520) finish=17599.2min speed=4165K/sec bitmap: 16/59 pages [64KB], 65536KB chunk unused devices: Thanks for your help! From jf+proxmox at codingfield.com Wed Jul 19 16:52:39 2017 From: jf+proxmox at codingfield.com (=?UTF-8?Q?Jean-Fran=C3=A7ois_Milants?=) Date: Wed, 19 Jul 2017 16:52:39 +0200 Subject: [PVE-User] Proxmox 5 : IPv6 for containers and VM Message-ID: <3db4053e5cc211ee6cbd6f3784eecfd1@codingfield.com> Hello, I'm discovering the joys of virtualization, containers,... with Proxmox 5, and I have an issue with IPv6. I've already posted this question on the forum, but I didn't get any answer. I hope it's fine to re-post it here... So! I'm using Proxmox 5. My provider provides me with 1IPv4 and a /64 IPv6 subnet. I've already managed to create a NAT network to share the IPv4 with containers and VM. But I don't know how to assignIPv6 addresses to these containers and VM. As I have a whole /64 subnet, I guess I should be able to assign one or more addresses to the guest VM. 
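For what it's worth, when the whole /64 is on-link (which the provider details given next suggest), the usual approach is to attach the guest to vmbr0 and give it its own address out of that /64. A guest-side /etc/network/interfaces sketch, with an arbitrarily picked ::100 host part:

auto eth0
iface eth0 inet6 static
        address 2a03:4000:1c:199::100
        netmask 64
        gateway fe80::1

If the provider instead routes the /64 to the host address only, the host additionally has to forward (net.ipv6.conf.all.forwarding=1) or proxy NDP for the guests; that is a different setup.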
Here are the information from my provider: IP : 2a03:4000:1c:199::/64 Gateway : fe80::1 And here is my current configuration (/etc/network/interfaces) ---- auto lo iface lo inet loopback iface lo inet6 loopback iface ens3 inet manual auto vmbr0 iface vmbr0 inet static address ***.***.***.*** netmask 255.255.252.0 gateway ***.***.***.*** bridge_ports ens3 bridge_stp off bridge_fd 0 iface vmbr0 inet6 static address 2a03:4000:1c:199::2 netmask 64 gateway fe80::1 auto vmbr1 iface vmbr1 inet static address 10.129.0.1 netmask 255.255.0.0 bridge_ports none bridge_stp off bridge_fd 0 post-up echo 1 > /proc/sys/net/ipv4/ip_forward post-up iptables -t nat -A POSTROUTING -s '10.129.0.0/16' -o vmbr0 -j MASQUERADE post-down iptables -t nat -D POSTROUTING -s '10.129.0.0/16' -o vmbr0 -j MASQUERADE ---- With this configuration, the host has IPv6 access. Could you explain me how to configure the CT/VM to assign them IPv6 addresses? Thanks ! From mir at miras.org Wed Jul 19 16:15:08 2017 From: mir at miras.org (Michael Rasmussen) Date: Wed, 19 Jul 2017 16:15:08 +0200 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> Message-ID: <20170719161508.5fab0715@sleipner.datanom.net> On Wed, 19 Jul 2017 15:10:22 +0300 Mikhail wrote: > Here's what I can see now: > > 1) fdisk output for one of disks in array: > # fdisk -l /dev/sda > > Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors > Units: sectors of 1 * 512 = 512 bytes > Sector size (logical/physical): 512 bytes / 4096 bytes > I/O size (minimum/optimal): 4096 bytes / 4096 bytes > Disklabel type: gpt > Disk identifier: 482FED1A-9CD0-4AEF-ACFC-D981C9916FE2 > > Device Start End Sectors Size Type > /dev/sda1 2048 1953791 1951744 953M Linux filesystem > /dev/sda2 1953792 7814035455 7812081664 3.7T Linux RAID > > 2) MDADM array details: > > # mdadm --detail /dev/md0 > /dev/md0: > Version : 1.2 > Creation Time : Fri Mar 18 18:27:06 2016 > Raid Level : raid10 > Array Size : 7811819520 (7449.93 GiB 7999.30 GB) > Used Dev Size : 3905909760 (3724.97 GiB 3999.65 GB) > Raid Devices : 4 > Total Devices : 4 > Persistence : Superblock is persistent > > Intent Bitmap : Internal > > Update Time : Wed Jul 19 14:58:57 2017 > State : active, checking > Active Devices : 4 > Working Devices : 4 > Failed Devices : 0 > Spare Devices : 0 > > Layout : near=2 > Chunk Size : 512K > Try do read here: http://dennisfleurbaaij.blogspot.dk/2013/01/setting-up-linux-mdadm-raid-array-with.html -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: He hated being thought of as one of those people that wore stupid ornamental armour. It was gilt by association. -- Terry Pratchett, "Night Watch" -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From m at plus-plus.su Wed Jul 19 17:30:01 2017 From: m at plus-plus.su (Mikhail) Date: Wed, 19 Jul 2017 18:30:01 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <20170719161508.5fab0715@sleipner.datanom.net> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> <20170719161508.5fab0715@sleipner.datanom.net> Message-ID: <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> On 07/19/2017 05:15 PM, Michael Rasmussen wrote: > Try do read here: > http://dennisfleurbaaij.blogspot.dk/2013/01/setting-up-linux-mdadm-raid-array-with.html Hello, Thanks, I also checked that post earlier today. Basically, it looks like MDADM array, and LVM on top (and possibly FS inside the VMs) need to be created with manual calculations for alignment and these calculations need to be specified on the command line at the time of creation. It is pity to find out this now, when server is in active use - many manuals mention that MDADM, LVM, etc are smart enough these days to make these calculations automatically at the time of creation, but this does not appear to be true and that's where problems come from later on. I guess my only way to fix this is to migrate everything off that server and reinstall it from scratch, throwing away things like MDADM and LVM this time and replacing them with ZFS for storage purposes. Thanks all. From mir at miras.org Wed Jul 19 17:38:31 2017 From: mir at miras.org (Michael Rasmussen) Date: Wed, 19 Jul 2017 17:38:31 +0200 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> <20170719161508.5fab0715@sleipner.datanom.net> <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> Message-ID: <20170719173831.388ecaf4@sleipner.datanom.net> On Wed, 19 Jul 2017 18:30:01 +0300 Mikhail wrote: > > Basically, it looks like MDADM array, and LVM on top (and possibly FS > inside the VMs) need to be created with manual calculations for > alignment and these calculations need to be specified on the command > line at the time of creation. It is pity to find out this now, when > server is in active use - many manuals mention that MDADM, LVM, etc are > smart enough these days to make these calculations automatically at the > time of creation, but this does not appear to be true and that's where > problems come from later on. > Your problem is that your disks is native 4k which advertises 512b as well. This means lvm and mdadm got confused ;-) > I guess my only way to fix this is to migrate everything off that server > and reinstall it from scratch, throwing away things like MDADM and LVM > this time and replacing them with ZFS for storage purposes. > You mentioned before that you hoped to use Omnios. Latest stable now supports your nics. 
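If the rebuild does end up on ZFS, the 512e/4k ambiguity can be taken out of the picture by forcing 4 KiB sectors at pool creation time, for example (pool name and layout are only illustrative):

zpool create -o ashift=12 tank mirror sda sdb mirror sdc sdd

If md/LVM were kept instead, the same idea applies there: keep the 1 MiB-aligned partition starts the tools default to and create the PV with an explicit alignment, e.g. pvcreate --dataalignment 4m /dev/md0.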
-- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: I'm telling you that the kernel is stable not because it's a kernel, but because I refuse to listen to arguments like this. -- Linus Torvalds -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gilberto.nunes32 at gmail.com Wed Jul 19 21:59:15 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Wed, 19 Jul 2017 16:59:15 -0300 Subject: [PVE-User] Short question... Message-ID: Hi... One doubt: is it SCSI better than virtio?? Thanks From m at plus-plus.su Wed Jul 19 23:07:44 2017 From: m at plus-plus.su (Mikhail) Date: Thu, 20 Jul 2017 00:07:44 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <20170719173831.388ecaf4@sleipner.datanom.net> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> <20170719161508.5fab0715@sleipner.datanom.net> <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> <20170719173831.388ecaf4@sleipner.datanom.net> Message-ID: <7438f91a-35a4-106e-2f17-ed6e366ac587@plus-plus.su> On 07/19/2017 06:38 PM, Michael Rasmussen wrote: >> I guess my only way to fix this is to migrate everything off that server >> and reinstall it from scratch, throwing away things like MDADM and LVM >> this time and replacing them with ZFS for storage purposes. >> > You mentioned before that you hoped to use Omnios. Latest stable > now supports your nics. Yes, that was more than a year ago when I deployed this storage server - you have good memory, Michael! =) The time to give OmniOS a try has come, I also noticed that X550 NICs are supported by OmniOS since late autumn 2016. Luckily Proxmox supports online storage migration (Move disk) without bringing vms down - this simplifies storage migration a lot in live environment! cheers, Mikhail. From mir at miras.org Wed Jul 19 23:33:38 2017 From: mir at miras.org (Michael Rasmussen) Date: Wed, 19 Jul 2017 23:33:38 +0200 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <7438f91a-35a4-106e-2f17-ed6e366ac587@plus-plus.su> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> <20170719161508.5fab0715@sleipner.datanom.net> <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> <20170719173831.388ecaf4@sleipner.datanom.net> <7438f91a-35a4-106e-2f17-ed6e366ac587@plus-plus.su> Message-ID: <20170719233338.3e558bc5@sleipner.datanom.net> On Thu, 20 Jul 2017 00:07:44 +0300 Mikhail wrote: > > The time to give OmniOS a try has come, I also noticed that X550 NICs > are supported by OmniOS since late autumn 2016. 
Luckily Proxmox supports > online storage migration (Move disk) without bringing vms down - this > simplifies storage migration a lot in live environment! > Remember to get it here: http://www.omniosce.org/ -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: Don't stop at one bug. - The Elements of Programming Style (Kernighan & Plaugher) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From m at plus-plus.su Wed Jul 19 23:49:29 2017 From: m at plus-plus.su (Mikhail) Date: Thu, 20 Jul 2017 00:49:29 +0300 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: <20170719233338.3e558bc5@sleipner.datanom.net> References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> <20170719161508.5fab0715@sleipner.datanom.net> <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> <20170719173831.388ecaf4@sleipner.datanom.net> <7438f91a-35a4-106e-2f17-ed6e366ac587@plus-plus.su> <20170719233338.3e558bc5@sleipner.datanom.net> Message-ID: On 07/20/2017 12:33 AM, Michael Rasmussen wrote: >> The time to give OmniOS a try has come, I also noticed that X550 NICs >> are supported by OmniOS since late autumn 2016. Luckily Proxmox supports >> online storage migration (Move disk) without bringing vms down - this >> simplifies storage migration a lot in live environment! >> > Remember to get it here: http://www.omniosce.org/ I heard about original OmniOS, by OmniTI, is being discountinued and that OmniOSCE takes care of it. But I could not find installation ISO images of the OmniOSCE. Which procedure to follow to get OmniOSCE installed? I guess it is to get latest available OmniOS installation ISO from https://omnios.omniti.com/wiki.php/Installation and then follow procedure described on the http://www.omniosce.org/ page to convert it into OmniOSCE? thanks, Mikhail. From mir at miras.org Thu Jul 20 00:15:20 2017 From: mir at miras.org (Michael Rasmussen) Date: Thu, 20 Jul 2017 00:15:20 +0200 Subject: [PVE-User] Shared storage on NAS speed - LVM(over iSCSI) vs NFS In-Reply-To: References: <82d2dd2c-7576-df4f-0391-5a3c3f52f2c6@plus-plus.su> <524889534.2.1500446503537@webmail.proxmox.com> <5b16989d-a23f-d910-23a4-16d0d4b53479@plus-plus.su> <52648401-9646-e3c1-d39f-7d3d9f26c12e@proxmox.com> <57d8803f-1fc9-449e-1630-a768ca05d262@plus-plus.su> <20170719161508.5fab0715@sleipner.datanom.net> <25d4a2e3-1d4a-c00f-070e-d9dfb01a1593@plus-plus.su> <20170719173831.388ecaf4@sleipner.datanom.net> <7438f91a-35a4-106e-2f17-ed6e366ac587@plus-plus.su> <20170719233338.3e558bc5@sleipner.datanom.net> Message-ID: <20170720001520.0c7f0be6@sleipner.datanom.net> On Thu, 20 Jul 2017 00:49:29 +0300 Mikhail wrote: > > I heard about original OmniOS, by OmniTI, is being discountinued and > that OmniOSCE takes care of it. > But I could not find installation ISO images of the OmniOSCE. Which > procedure to follow to get OmniOSCE installed? 
I guess it is to get > latest available OmniOS installation ISO from > https://omnios.omniti.com/wiki.php/Installation and then follow > procedure described on the http://www.omniosce.org/ page to convert it > into OmniOSCE? > This is correct. Pay attention to upgrade to latest kernel release r151022i if you have HBA's based on LSI SAS >= 2300 (using mr_sas driver) -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: "I've finally learned what `upward compatible' means. It means we get to keep all our old mistakes." -- Dennie van Tassel -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From e.kasper at proxmox.com Thu Jul 20 10:06:30 2017 From: e.kasper at proxmox.com (Emmanuel Kasper) Date: Thu, 20 Jul 2017 10:06:30 +0200 Subject: [PVE-User] Short question... In-Reply-To: References: Message-ID: <98653dcd-7d4c-daa0-7775-672f4905d5a5@proxmox.com> On 07/19/2017 09:59 PM, Gilberto Nunes wrote: > Hi... > > One doubt: is it SCSI better than virtio?? > > Thanks Read The Fine Manual :) https://pve.proxmox.com/pve-docs/chapter-qm.html#qm_hard_disk From mityapetuhov at gmail.com Thu Jul 20 10:32:15 2017 From: mityapetuhov at gmail.com (Dmitry Petuhov) Date: Thu, 20 Jul 2017 11:32:15 +0300 Subject: [PVE-User] Short question... In-Reply-To: References: Message-ID: 19.07.2017 22:59, Gilberto Nunes wrote: > One doubt: is it SCSI better than virtio?? I think you meant virtio-scsi vs virtio-blk. Other virtual SCSI controllers are usually worse. Depends on what do you want from it. Virtio-scsi is the only interface that supports trim/discard, so it's best choice for thin-provisioned storages. Also, it can pass through raw SCSI device to guest (in PVE it does [did?] so with libiscsi-backed storages), so minimizing virtualisation overhead. But it uses single IO thread per controller, so if you want max performance with multiple virtual disks, you must select virtio-scsi-single From gilberto.nunes32 at gmail.com Thu Jul 20 12:32:33 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Thu, 20 Jul 2017 07:32:33 -0300 Subject: [PVE-User] Short question... In-Reply-To: References: Message-ID: Thanks Dmitry That the answer I would like to hear, because I always knew that Virtio Bus is high perfomance. Now, after read the Manual, I realize that virtio are deprecated... But after your explanation, I will use only SCSI Thanks a lot 2017-07-20 5:32 GMT-03:00 Dmitry Petuhov : > 19.07.2017 22:59, Gilberto Nunes wrote: > >> One doubt: is it SCSI better than virtio?? >> > I think you meant virtio-scsi vs virtio-blk. Other virtual SCSI > controllers are usually worse. > Depends on what do you want from it. > > Virtio-scsi is the only interface that supports trim/discard, so it's best > choice for thin-provisioned storages. Also, it can pass through raw SCSI > device to guest (in PVE it does [did?] so with libiscsi-backed storages), > so minimizing virtualisation overhead. 
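In PVE terms that combination is the controller type plus a discard flag on the disk line; a sketch of the relevant part of a hypothetical /etc/pve/qemu-server/100.conf (VMID, storage name and size are made up):

scsihw: virtio-scsi-pci
scsi0: local-zfs:vm-100-disk-1,discard=on,size=32G

The same can be set with qm set 100 --scsihw virtio-scsi-pci, or from the GUI.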
But it uses single IO thread per > controller, so if you want max performance with multiple virtual disks, you > must select virtio-scsi-single > > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From h.hampel at rac.de Thu Jul 20 16:58:18 2017 From: h.hampel at rac.de (Holger Hampel | RA Consulting) Date: Thu, 20 Jul 2017 14:58:18 +0000 Subject: [PVE-User] Firewall Zones Message-ID: Hello, in the documentation there are two zones mentioned: host and VM. I matched the host zone to firewall configuration in datacenter and host. But enabling the firewall in datacenter locked out all VMs (guests firewall default is disabled). So what is the right way starting firewalling on a system in use when I need the firewall only for a few guests (KVM)? Proxmox: 4.4 Regards Holger Hampel From gaio at sv.lnf.it Thu Jul 20 17:56:50 2017 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Thu, 20 Jul 2017 17:56:50 +0200 Subject: [PVE-User] Containers, stretch and php... Message-ID: <20170720155650.GD4065@sv.lnf.it> (PVE 4.4, upgraded to latest patches) I've build up a LXC container based on debian 9 (stretch), but after installing PHP i've started to have in logs in the container: Jul 20 16:09:14 vglpi systemd[1]: phpsessionclean.service: Failed to reset devices.list: Operation not permitted Jul 20 16:09:14 vglpi systemd[6345]: phpsessionclean.service: Failed at step NETWORK spawning /usr/lib/php/sessionclean: Permission denied Jul 20 16:09:14 vglpi systemd[1]: phpsessionclean.service: Main process exited, code=exited, status=225/NETWORK Jul 20 16:09:14 vglpi systemd[1]: Failed to start Clean php session files. Jul 20 16:09:14 vglpi systemd[1]: phpsessionclean.service: Unit entered failed state. Jul 20 16:09:14 vglpi systemd[1]: phpsessionclean.service: Failed with result 'exit-code'. Jul 20 16:39:14 vglpi systemd[1]: phpsessionclean.service: Failed to reset devices.list: Operation not permitted Jul 20 16:39:14 vglpi systemd[6364]: phpsessionclean.service: Failed at step NETWORK spawning /usr/lib/php/sessionclean: Permission denied Jul 20 16:39:14 vglpi systemd[1]: phpsessionclean.service: Main process exited, code=exited, status=225/NETWORK Jul 20 16:39:14 vglpi systemd[1]: Failed to start Clean php session files. Jul 20 16:39:14 vglpi systemd[1]: phpsessionclean.service: Unit entered failed state. Jul 20 16:39:14 vglpi systemd[1]: phpsessionclean.service: Failed with result 'exit-code'. 
and on the host: Jul 20 16:09:14 tessier kernel: [22451057.039944] audit: type=1400 audit(1500559754.627:239): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=10038 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:09:14 tessier kernel: [22451057.039949] audit: type=1400 audit(1500559754.627:240): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=10038 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:09:14 tessier kernel: [22451057.039953] audit: type=1400 audit(1500559754.627:241): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=10038 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:09:14 tessier kernel: [22451057.039956] audit: type=1400 audit(1500559754.627:242): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=10038 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:39:14 tessier kernel: [22452857.015429] audit: type=1400 audit(1500561554.627:243): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=12677 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:39:14 tessier kernel: [22452857.015434] audit: type=1400 audit(1500561554.627:244): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=12677 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:39:14 tessier kernel: [22452857.015438] audit: type=1400 audit(1500561554.627:245): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=12677 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Jul 20 16:39:14 tessier kernel: [22452857.015441] audit: type=1400 audit(1500561554.627:246): apparmor="DENIED" operation="file_lock" profile="lxc-container-default-cgns" pid=12677 comm="(ionclean)" family="unix" sock_type="dgram" protocol=0 addr=none Why? Thanks. -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG - Via della Bont?, 7 - 33078 - San Vito al Tagliamento (PN) marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797 Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA! http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) From aderumier at odiso.com Fri Jul 21 10:58:52 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Fri, 21 Jul 2017 10:58:52 +0200 (CEST) Subject: [PVE-User] qemu-img convert async Message-ID: <986137330.110285.1500627532526.JavaMail.zimbra@oxygem.tv> hi, I'm seeing that qemu 2.9 have new flags to make qemu-img convert async http://git.qemu.org/?p=qemu.git;a=commit;h=2d9187bc65727d9dd63e2c410b5500add3db0b0d "This patches introduces 2 new cmdline parameters. The -m parameter to specify the number of coroutines running in parallel (defaults to 8). And the -W parameter to allow qemu-img to write to the target out of order rather than sequential. This improves performance as the writes do not have to wait for each other to complete." Does somebody already have tested it ? (-W flag) From gaio at sv.lnf.it Fri Jul 21 12:13:33 2017 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Fri, 21 Jul 2017 12:13:33 +0200 Subject: [PVE-User] Containers, stretch and php... 
In-Reply-To: <20170720155650.GD4065@sv.lnf.it> References: <20170720155650.GD4065@sv.lnf.it> Message-ID: <20170721101333.GI4039@sv.lnf.it> > Why? My first stretch bug. ;-) https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=869182 -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG - Via della Bont?, 7 - 33078 - San Vito al Tagliamento (PN) marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797 Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA! http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) From gaio at sv.lnf.it Fri Jul 21 15:10:30 2017 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Fri, 21 Jul 2017 15:10:30 +0200 Subject: [PVE-User] Creating a template from an existing container... Message-ID: <20170721131030.GO4039@sv.lnf.it> I've googled a bit finding old result (relative to old container system, not LXC). Having a just setup container, there's some quick way to 'template-ize' it? Thanks. -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG - Via della Bont?, 7 - 33078 - San Vito al Tagliamento (PN) marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797 Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA! http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) From infolist at schwarz-fr.net Fri Jul 21 23:22:12 2017 From: infolist at schwarz-fr.net (Phil Schwarz) Date: Fri, 21 Jul 2017 23:22:12 +0200 Subject: [PVE-User] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous In-Reply-To: References: Message-ID: <1226aba0-3672-eb58-ed75-20613089193e@schwarz-fr.net> Hi, after some investigations (and getting the cluster back) , seems that i've got an issue with pveceph creating OSD: ceph-disk zap /dev/sdc pveceph createosd /dev/sdc -bluestore 0 -fstype xfs Unknown option: bluestore According to the doc (1) , it should be OK. pveceph createosd /dev/sdc -fstype xfs --> Should be using Filestore, that's on purpose. create OSD on /dev/sdc (xfs) Caution: invalid backup GPT header, but valid main header; regenerating backup header from main header. **************************************************************************** Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk verification and recovery are STRONGLY recommended. **************************************************************************** GPT data structures destroyed! You may now partition the disk using fdisk or other utilities. Creating new GPT entries. The operation has completed successfully. Setting name! partNum is 0 REALLY setting name! The operation has completed successfully. Setting name! partNum is 1 REALLY setting name! The operation has completed successfully. The operation has completed successfully. meta-data=/dev/sdc1 isize=2048 agcount=4, agsize=6400 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=0, rmapbt=0, reflink=0 data = bsize=4096 blocks=25600, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal log bsize=4096 blocks=864, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 The operation has completed successfully. 
But OSD never gets into GUI, neither in crushmap : ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 7.47186 root default -2 3.17259 host H1 1 1.81360 osd.1 up 1.00000 1.00000 3 1.35899 osd.3 up 1.00000 1.00000 -3 0.67699 host H2 0 0.67699 osd.0 up 1.00000 1.00000 -4 1.80869 host H3 2 0.44969 osd.2 up 1.00000 1.00000 4 1.35899 osd.4 up 1.00000 1.00000 -5 1.81360 host H4 5 1.81360 osd.5 up 1.00000 1.00000 6 0 osd.6 down 0 1.00000 The osd.6 shloud appear under H5. Thanks Best regards (1) : https://pve.proxmox.com/pve-docs/pveceph.1.html Le 15/07/2017 ? 16:02, Phil Schwarz a ?crit : > Hi, > > short version : > I broke my cluster ! > > Long version , with context: > With a 4 nodes Proxmox Cluster > The nodes are all Pproxmox 5.05+Ceph luminous with filestore > -3 mon+OSD > -1 LXC+OSD > > Was working fine > Added a fifth node (proxmox+ceph) today a broke everything.. > > Though every node can ping each other, the web GUI is full of red > crossed nodes. No LXC is seen though there up and alive. > However, every other proxmox is manageable through the web GUI.... > > In logs, i've tons of same message on 2 over 3 mons : > > " failed to decode message of type 80 v6: buffer::malformed_input: void > pg_history_t::decode(ceph::buffer::list::iterator&) unknown encoding > version > 7" > > Thanks for your answers. > Best regards > > While investigating, i wondered about my config : > Question relative to /etc/hosts file : > Should i use private_replication_LAN Ip or public ones ? > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From alarig at grifon.fr Mon Jul 24 00:52:31 2017 From: alarig at grifon.fr (Alarig Le Lay) Date: Mon, 24 Jul 2017 00:52:31 +0200 Subject: [PVE-User] Matching VM-id and routed IP Message-ID: <20170723225231.jmpxqwcyf36uwqth@mew.swordarmor.fr> Hi, I?m wondering if there is a way to do some matching between the id of a VM and it?s routed IP. The plan is to announce the IPv4?s /32 and IPv6?s /48 of each VM running on an hypervisor to edge routers with iBGP or OSPF. Of course, a cluster will be set up, so ID will be unique in the cluster?s scope. My basic idea is to have something like 89.234.186.18/32 dev tap101i0 2a00:5884:8218::/48 dev tap101i0 on the host where VM 101 is running, and doing 'redistribute kernel' with quagga or bird. If this feature is not included in promox, is it possible to use the pmxcfs to achieve this (by example by writing the routed IP in /etc/pve/nodes/${node}/priv/101-routes.conf) and having some hooks at the startup/shutdown that will read that file? Thanks, -- alarig -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From leithner at itronic.at Mon Jul 24 12:11:46 2017 From: leithner at itronic.at (Harald Leithner) Date: Mon, 24 Jul 2017 12:11:46 +0200 Subject: [PVE-User] ZFS Checksum Error in VM but not on Host Message-ID: <676f48c1-12d8-ee6f-fd05-b2c61e8d1925@itronic.at> Hi, I'm not sure if this is Proxmox/Qemu related but I try it here. 
We have a VM on a ZFS Pool with Proxmox kernel for ZFS, so the result is RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk) We got 2 Mails from ZFS inside the VM: --- ZFS has detected a checksum error: eid: 37 class: checksum host: backup time: 2017-07-21 15:07:59+0200 vtype: disk vpath: /dev/sdc1 vguid: 0x003AC1491C2AC7D2 cksum: 1 read: 0 write: 0 pool: backup --- ZFS has detected a data error: eid: 36 class: data host: backup time: 2017-07-21 15:07:59+0200 pool: backup --- The status of the zfs pool inside the VM: --- zpool status -v pool: backup state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://zfsonlinux.org/msg/ZFS-8000-8A scan: scrub repaired 0 in 0h42m with 0 errors on Sun Jul 9 01:06:26 2017 config: NAME STATE READ WRITE CKSUM backup ONLINE 0 0 13 sdc ONLINE 0 0 26 errors: Permanent errors have been detected in the following files: /battlefield/backup/kunde/server/data/d7/d79c0feb29ef024ce0164253ee08e6daa986bd1d599f4640167de2c3d7828524 --- But on the host there is no error: zpool status -v pool: slow state: ONLINE scan: scrub in progress since Fri Jul 21 15:45:13 2017 54.0G scanned out of 486G at 60.8M/s, 2h1m to go 0 repaired, 11.11% done config: NAME STATE READ WRITE CKSUM slow ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 sda2 ONLINE 0 0 0 sdb2 ONLINE 0 0 0 errors: No known data errors (Scrub is also finished with no errors) --- HOST: pveversion --verbose proxmox-ve: 5.0-16 (running kernel: 4.10.15-1-pve) pve-manager: 5.0-23 (running version: 5.0-23/af4267bf) pve-kernel-4.10.15-1-pve: 4.10.15-15 pve-kernel-4.4.35-1-pve: 4.4.35-77 pve-kernel-4.10.8-1-pve: 4.10.8-7 pve-kernel-4.4.59-1-pve: 4.4.59-87 pve-kernel-4.10.11-1-pve: 4.10.11-9 pve-kernel-4.10.17-1-pve: 4.10.17-16 libpve-http-server-perl: 2.0-5 lvm2: 2.02.168-pve2 corosync: 2.4.2-pve3 libqb0: 1.0.1-1 pve-cluster: 5.0-12 qemu-server: 5.0-14 pve-firmware: 2.0-2 libpve-common-perl: 5.0-16 libpve-guest-common-perl: 2.0-11 libpve-access-control: 5.0-5 libpve-storage-perl: 5.0-12 pve-libspice-server1: 0.12.8-3 vncterm: 1.5-2 pve-docs: 5.0-9 pve-qemu-kvm: 2.9.0-2 pve-container: 2.0-14 pve-firewall: 3.0-2 pve-ha-manager: 2.0-2 ksm-control-daemon: 1.2-2 glusterfs-client: 3.8.8-1 lxc-pve: 2.0.8-3 lxcfs: 2.0.7-pve2 criu: 2.11.1-1~bpo90 novnc-pve: 0.6-4 smartmontools: 6.5+svn4324-1 zfsutils-linux: 0.6.5.9-pve16~bpo90 --- VM: Linux backup 4.10.15-1-pve #1 SMP PVE 4.10.15-12 (Mon, 12 Jun 2017 11:18:07 +0200) x86_64 GNU/Linux zfsutils-linux: 0.6.5.9-pve16~bpo90 Some hints would be very appreciated! bye Harald -- Harald Leithner ITronic Wiedner Hauptstra?e 120/5.1, 1050 Wien, Austria Tel: +43-1-545 0 604 Mobil: +43-699-123 78 4 78 Mail: leithner at itronic.at | itronic.at From yannis.milios at gmail.com Mon Jul 24 13:32:00 2017 From: yannis.milios at gmail.com (Yannis Milios) Date: Mon, 24 Jul 2017 12:32:00 +0100 Subject: [PVE-User] ZFS Checksum Error in VM but not on Host In-Reply-To: <676f48c1-12d8-ee6f-fd05-b2c61e8d1925@itronic.at> References: <676f48c1-12d8-ee6f-fd05-b2c61e8d1925@itronic.at> Message-ID: Hello, >> RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk) Is there any particular reason of having this kind of setup? I mean in general using ZFS inside a VM is not recommended. 
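Snapshots and checksumming are in any case already provided by the host pool underneath the guest image, along these lines (the zvol name is a placeholder, the pool name is the one from the host output above):

zfs snapshot slow/vm-100-disk-1@before-change
zpool scrub slow
zpool status -v slow

so the guest itself can usually get away with a plain filesystem.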
>> NAME STATE READ WRITE CKSUM >> backup ONLINE 0 0 * 13* >> sdc ONLINE 0 0 *26* >> errors: Permanent errors have been detected in the following files: >> */battlefield/backup/kunde/serv*er/data/d7/d79c0feb29ef024ce01 64253ee08e6daa986bd1d599f4640167de2c3d7828524 Apparently you had checksum errors which lead to corruption of that file. Since this pool is not redundant, you will have to delete the file, restore it from a backup and then scrub the volume. Yannis On Mon, Jul 24, 2017 at 11:11 AM, Harald Leithner wrote: > Hi, > > I'm not sure if this is Proxmox/Qemu related but I try it here. > > We have a VM on a ZFS Pool with Proxmox kernel for ZFS, so the result is > > RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk) > > We got 2 Mails from ZFS inside the VM: > > --- > > ZFS has detected a checksum error: > > eid: 37 > class: checksum > host: backup > time: 2017-07-21 15:07:59+0200 > vtype: disk > vpath: /dev/sdc1 > vguid: 0x003AC1491C2AC7D2 > cksum: 1 > read: 0 > write: 0 > pool: backup > > --- > > ZFS has detected a data error: > > eid: 36 > class: data > host: backup > time: 2017-07-21 15:07:59+0200 > pool: backup > > --- > > The status of the zfs pool inside the VM: > > --- > > zpool status -v > pool: backup > state: ONLINE > status: One or more devices has experienced an error resulting in data > corruption. Applications may be affected. > action: Restore the file in question if possible. Otherwise restore the > entire pool from backup. > see: http://zfsonlinux.org/msg/ZFS-8000-8A > scan: scrub repaired 0 in 0h42m with 0 errors on Sun Jul 9 01:06:26 2017 > config: > > NAME STATE READ WRITE CKSUM > backup ONLINE 0 0 13 > sdc ONLINE 0 0 26 > > errors: Permanent errors have been detected in the following files: > > > /battlefield/backup/kunde/server/data/d7/d79c0feb29ef024ce01 > 64253ee08e6daa986bd1d599f4640167de2c3d7828524 > > --- > > But on the host there is no error: > > zpool status -v > > pool: slow > state: ONLINE > scan: scrub in progress since Fri Jul 21 15:45:13 2017 > 54.0G scanned out of 486G at 60.8M/s, 2h1m to go > 0 repaired, 11.11% done > config: > > NAME STATE READ WRITE CKSUM > slow ONLINE 0 0 0 > mirror-0 ONLINE 0 0 0 > sda2 ONLINE 0 0 0 > sdb2 ONLINE 0 0 0 > > errors: No known data errors > > > (Scrub is also finished with no errors) > > --- > > HOST: > > pveversion --verbose > proxmox-ve: 5.0-16 (running kernel: 4.10.15-1-pve) > pve-manager: 5.0-23 (running version: 5.0-23/af4267bf) > pve-kernel-4.10.15-1-pve: 4.10.15-15 > pve-kernel-4.4.35-1-pve: 4.4.35-77 > pve-kernel-4.10.8-1-pve: 4.10.8-7 > pve-kernel-4.4.59-1-pve: 4.4.59-87 > pve-kernel-4.10.11-1-pve: 4.10.11-9 > pve-kernel-4.10.17-1-pve: 4.10.17-16 > libpve-http-server-perl: 2.0-5 > lvm2: 2.02.168-pve2 > corosync: 2.4.2-pve3 > libqb0: 1.0.1-1 > pve-cluster: 5.0-12 > qemu-server: 5.0-14 > pve-firmware: 2.0-2 > libpve-common-perl: 5.0-16 > libpve-guest-common-perl: 2.0-11 > libpve-access-control: 5.0-5 > libpve-storage-perl: 5.0-12 > pve-libspice-server1: 0.12.8-3 > vncterm: 1.5-2 > pve-docs: 5.0-9 > pve-qemu-kvm: 2.9.0-2 > pve-container: 2.0-14 > pve-firewall: 3.0-2 > pve-ha-manager: 2.0-2 > ksm-control-daemon: 1.2-2 > glusterfs-client: 3.8.8-1 > lxc-pve: 2.0.8-3 > lxcfs: 2.0.7-pve2 > criu: 2.11.1-1~bpo90 > novnc-pve: 0.6-4 > smartmontools: 6.5+svn4324-1 > zfsutils-linux: 0.6.5.9-pve16~bpo90 > > --- > > VM: > > Linux backup 4.10.15-1-pve #1 SMP PVE 4.10.15-12 (Mon, 12 Jun 2017 > 11:18:07 +0200) x86_64 GNU/Linux > > zfsutils-linux: 0.6.5.9-pve16~bpo90 > > > Some hints would be very appreciated! 
> > bye > Harald > > > -- > Harald Leithner > > ITronic > Wiedner Hauptstra?e 120/5.1, 1050 Wien, Austria > Tel: +43-1-545 0 604 > Mobil: +43-699-123 78 4 78 > Mail: leithner at itronic.at | itronic.at > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From gaio at sv.lnf.it Mon Jul 24 16:24:00 2017 From: gaio at sv.lnf.it (Marco Gaiarin) Date: Mon, 24 Jul 2017 16:24:00 +0200 Subject: [PVE-User] Containers, stretch and php... In-Reply-To: <20170720155650.GD4065@sv.lnf.it> References: <20170720155650.GD4065@sv.lnf.it> Message-ID: <20170724142400.GA14797@sv.lnf.it> > I've build up a LXC container based on debian 9 (stretch), but after > installing PHP i've started to have in logs in the container: I've upgraded to stretch 9.1 (the container) and upgraded pve-container to 1.0-101, but nothing changed. FYI. -- dott. Marco Gaiarin GNUPG Key ID: 240A3D66 Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/ Polo FVG - Via della Bont?, 7 - 33078 - San Vito al Tagliamento (PN) marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797 Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA! http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000 (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA) From leithner at itronic.at Mon Jul 24 17:21:04 2017 From: leithner at itronic.at (Harald Leithner) Date: Mon, 24 Jul 2017 17:21:04 +0200 Subject: [PVE-User] ZFS Checksum Error in VM but not on Host In-Reply-To: References: <676f48c1-12d8-ee6f-fd05-b2c61e8d1925@itronic.at> Message-ID: <3702bfd2-627b-eb6c-d16b-871fca637441@itronic.at> Hi, Am 24.07.2017 um 13:32 schrieb Yannis Milios: > Hello, > >>> RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk) > > > Is there any particular reason of having this kind of setup? I mean in > general using ZFS inside a VM is not recommended. 2 reason for this, first having checksums^^, second snapshots. And I prefer ZFS over any other filesystem. Whats the reason why ZFS is not good in a VM? > > > >> NAME STATE READ WRITE CKSUM > >> backup ONLINE 0 0 * 13* > >> sdc ONLINE 0 0 *26* > >>> errors: Permanent errors have been detected in the following files: > > >>> */battlefield/backup/kunde/serv*er/data/d7/d79c0feb29ef024ce01 > 64253ee08e6daa986bd1d599f4640167de2c3d7828524 > > Apparently you had checksum errors which lead to corruption of that file. > Since this pool is not redundant, you will have to delete the file, restore > it from a backup and then scrub the volume. I understand the error and the solution, but not really why it happen. In the meantime I got an answer from Wolfgang Link who thinks it could be a bit flip in memory... Do you have any other Filesystem that support checksumming, thats maybe better for this job? > > Yannis Harald > > On Mon, Jul 24, 2017 at 11:11 AM, Harald Leithner > wrote: > >> Hi, >> >> I'm not sure if this is Proxmox/Qemu related but I try it here. 
>> >> We have a VM on a ZFS Pool with Proxmox kernel for ZFS, so the result is >> >> RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk) >> >> We got 2 Mails from ZFS inside the VM: >> >> --- >> >> ZFS has detected a checksum error: >> >> eid: 37 >> class: checksum >> host: backup >> time: 2017-07-21 15:07:59+0200 >> vtype: disk >> vpath: /dev/sdc1 >> vguid: 0x003AC1491C2AC7D2 >> cksum: 1 >> read: 0 >> write: 0 >> pool: backup >> >> --- >> >> ZFS has detected a data error: >> >> eid: 36 >> class: data >> host: backup >> time: 2017-07-21 15:07:59+0200 >> pool: backup >> >> --- >> >> The status of the zfs pool inside the VM: >> >> --- >> >> zpool status -v >> pool: backup >> state: ONLINE >> status: One or more devices has experienced an error resulting in data >> corruption. Applications may be affected. >> action: Restore the file in question if possible. Otherwise restore the >> entire pool from backup. >> see: http://zfsonlinux.org/msg/ZFS-8000-8A >> scan: scrub repaired 0 in 0h42m with 0 errors on Sun Jul 9 01:06:26 2017 >> config: >> >> NAME STATE READ WRITE CKSUM >> backup ONLINE 0 0 13 >> sdc ONLINE 0 0 26 >> >> errors: Permanent errors have been detected in the following files: >> >> >> /battlefield/backup/kunde/server/data/d7/d79c0feb29ef024ce01 >> 64253ee08e6daa986bd1d599f4640167de2c3d7828524 >> >> --- >> >> But on the host there is no error: >> >> zpool status -v >> >> pool: slow >> state: ONLINE >> scan: scrub in progress since Fri Jul 21 15:45:13 2017 >> 54.0G scanned out of 486G at 60.8M/s, 2h1m to go >> 0 repaired, 11.11% done >> config: >> >> NAME STATE READ WRITE CKSUM >> slow ONLINE 0 0 0 >> mirror-0 ONLINE 0 0 0 >> sda2 ONLINE 0 0 0 >> sdb2 ONLINE 0 0 0 >> >> errors: No known data errors >> >> >> (Scrub is also finished with no errors) >> >> --- >> >> HOST: >> >> pveversion --verbose >> proxmox-ve: 5.0-16 (running kernel: 4.10.15-1-pve) >> pve-manager: 5.0-23 (running version: 5.0-23/af4267bf) >> pve-kernel-4.10.15-1-pve: 4.10.15-15 >> pve-kernel-4.4.35-1-pve: 4.4.35-77 >> pve-kernel-4.10.8-1-pve: 4.10.8-7 >> pve-kernel-4.4.59-1-pve: 4.4.59-87 >> pve-kernel-4.10.11-1-pve: 4.10.11-9 >> pve-kernel-4.10.17-1-pve: 4.10.17-16 >> libpve-http-server-perl: 2.0-5 >> lvm2: 2.02.168-pve2 >> corosync: 2.4.2-pve3 >> libqb0: 1.0.1-1 >> pve-cluster: 5.0-12 >> qemu-server: 5.0-14 >> pve-firmware: 2.0-2 >> libpve-common-perl: 5.0-16 >> libpve-guest-common-perl: 2.0-11 >> libpve-access-control: 5.0-5 >> libpve-storage-perl: 5.0-12 >> pve-libspice-server1: 0.12.8-3 >> vncterm: 1.5-2 >> pve-docs: 5.0-9 >> pve-qemu-kvm: 2.9.0-2 >> pve-container: 2.0-14 >> pve-firewall: 3.0-2 >> pve-ha-manager: 2.0-2 >> ksm-control-daemon: 1.2-2 >> glusterfs-client: 3.8.8-1 >> lxc-pve: 2.0.8-3 >> lxcfs: 2.0.7-pve2 >> criu: 2.11.1-1~bpo90 >> novnc-pve: 0.6-4 >> smartmontools: 6.5+svn4324-1 >> zfsutils-linux: 0.6.5.9-pve16~bpo90 >> >> --- >> >> VM: >> >> Linux backup 4.10.15-1-pve #1 SMP PVE 4.10.15-12 (Mon, 12 Jun 2017 >> 11:18:07 +0200) x86_64 GNU/Linux >> >> zfsutils-linux: 0.6.5.9-pve16~bpo90 >> >> >> Some hints would be very appreciated! 
>> >> bye >> Harald >> >> >> -- >> Harald Leithner >> >> ITronic >> Wiedner Hauptstra?e 120/5.1, 1050 Wien, Austria >> Tel: +43-1-545 0 604 >> Mobil: +43-699-123 78 4 78 >> Mail: leithner at itronic.at | itronic.at >> _______________________________________________ >> pve-user mailing list >> pve-user at pve.proxmox.com >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user >> > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > -- Harald Leithner ITronic Wiedner Hauptstra?e 120/5.1, 1050 Wien, Austria Tel: +43-1-545 0 604 Mobil: +43-699-123 78 4 78 Mail: leithner at itronic.at | itronic.at From yannis.milios at gmail.com Mon Jul 24 18:14:20 2017 From: yannis.milios at gmail.com (Yannis Milios) Date: Mon, 24 Jul 2017 17:14:20 +0100 Subject: [PVE-User] ZFS Checksum Error in VM but not on Host In-Reply-To: <3702bfd2-627b-eb6c-d16b-871fca637441@itronic.at> References: <676f48c1-12d8-ee6f-fd05-b2c61e8d1925@itronic.at> <3702bfd2-627b-eb6c-d16b-871fca637441@itronic.at> Message-ID: > > >> 2 reason for this, first having checksums^^, second snapshots. > And I prefer ZFS over any other filesystem. > > Whats the reason why ZFS is not good in a VM? IMHO that's a waste of system resources. Since your VM disk already lies on a ZFS filesystem, where it can leverage all features you said (checksums, snapshots etc), what's the point of having ZFS inside VM at the same time? ZFS is not just another f/s, it consumes a lot of resources, particularly RAM. Of course I don't say it's not doable, I would use it in VM just for testing stuff... > > I understand the error and the solution, but not really why it happen. In > the meantime I got an answer from Wolfgang Link who thinks it could be a > bit flip in memory... > > Usually checksum errors are caused by RAM issues (other factors could be damaged SATA cables and more). Are you using ECC RAM on the server? I would suggest you to post this question on ZoL mailing list, there you can get much better feedback about pros and cons. Yannis From abreuer1521 at gmail.com Tue Jul 25 04:36:04 2017 From: abreuer1521 at gmail.com (Eric Abreu) Date: Mon, 24 Jul 2017 20:36:04 -0600 Subject: [PVE-User] Increase memory QXL graphic adapter In-Reply-To: References: Message-ID: Hi. I wonder if there are ways of improving graphic performance of a kvm VM. Is there a way of increasing the memory of the virtual graphic adapter (QXL) from the terminal or tweaking the VM in a way that a user could stream YouTube videos without interruption. Thank you in advance. From aderumier at odiso.com Tue Jul 25 09:21:01 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Tue, 25 Jul 2017 09:21:01 +0200 (CEST) Subject: [PVE-User] Increase memory QXL graphic adapter In-Reply-To: References: Message-ID: <992749884.192492.1500967261653.JavaMail.zimbra@oxygem.tv> ovirt have a good draft for auto tune value, I wonder if we could we use this for proxmox ? http://www.ovirt.org/documentation/draft/video-ram/ default value are ram='65536', vram='65536', vgamem='16384', 'heads=1'. also, seem that a new vram64 value is available https://www.redhat.com/archives/libvir-list/2016-February/msg01082.html I think you can do tests with args: -global qxl-vga.ram_size=134217728 -global qxl-vga.vram_size=134217728 -global qxl-vga.vgamem_mb=32 .... 
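Written out in a VM config that would look roughly like this in /etc/pve/qemu-server/<vmid>.conf, using the same values as above; note that args: only takes effect when the qemu process is started, so the VM needs a full stop/start rather than a reboot:

vga: qxl
args: -global qxl-vga.ram_size=134217728 -global qxl-vga.vram_size=134217728 -global qxl-vga.vgamem_mb=32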
----- Mail original ----- De: "Eric Abreu" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 04:36:04 Objet: [PVE-User] Increase memory QXL graphic adapter Hi. I wonder if there are ways of improving graphic performance of a kvm VM. Is there a way of increasing the memory of the virtual graphic adapter (QXL) from the terminal or tweaking the VM in a way that a user could stream YouTube videos without interruption. Thank you in advance. _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From aderumier at odiso.com Tue Jul 25 09:23:14 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Tue, 25 Jul 2017 09:23:14 +0200 (CEST) Subject: [PVE-User] Increase memory QXL graphic adapter In-Reply-To: <992749884.192492.1500967261653.JavaMail.zimbra@oxygem.tv> References: <992749884.192492.1500967261653.JavaMail.zimbra@oxygem.tv> Message-ID: <737035153.194483.1500967394763.JavaMail.zimbra@oxygem.tv> another interesting article in deutsh http://linux-blog.anracom.com/2017/07/06/kvmqemu-mit-qxl-hohe-aufloesungen-und-virtuelle-monitore-im-gastsystem-definieren-und-nutzen-i/ ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:21:01 Objet: Re: [PVE-User] Increase memory QXL graphic adapter ovirt have a good draft for auto tune value, I wonder if we could we use this for proxmox ? http://www.ovirt.org/documentation/draft/video-ram/ default value are ram='65536', vram='65536', vgamem='16384', 'heads=1'. also, seem that a new vram64 value is available https://www.redhat.com/archives/libvir-list/2016-February/msg01082.html I think you can do tests with args: -global qxl-vga.ram_size=134217728 -global qxl-vga.vram_size=134217728 -global qxl-vga.vgamem_mb=32 .... ----- Mail original ----- De: "Eric Abreu" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 04:36:04 Objet: [PVE-User] Increase memory QXL graphic adapter Hi. I wonder if there are ways of improving graphic performance of a kvm VM. Is there a way of increasing the memory of the virtual graphic adapter (QXL) from the terminal or tweaking the VM in a way that a user could stream YouTube videos without interruption. Thank you in advance. _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From leithner at itronic.at Tue Jul 25 09:49:22 2017 From: leithner at itronic.at (Harald Leithner) Date: Tue, 25 Jul 2017 09:49:22 +0200 Subject: [PVE-User] ZFS Checksum Error in VM but not on Host In-Reply-To: References: <676f48c1-12d8-ee6f-fd05-b2c61e8d1925@itronic.at> <3702bfd2-627b-eb6c-d16b-871fca637441@itronic.at> Message-ID: <540938e9-731e-e978-caba-852558b6a890@itronic.at> Am 24.07.2017 um 18:14 schrieb Yannis Milios: >> >> >>> 2 reason for this, first having checksums^^, second snapshots. >> And I prefer ZFS over any other filesystem. >> >> > Whats the reason why ZFS is not good in a VM? > > > IMHO that's a waste of system resources. Since your VM disk already lies on > a ZFS filesystem, where it can leverage all features you said (checksums, > snapshots etc), what's the point of having ZFS inside VM at the same time? > ZFS is not just another f/s, it consumes a lot of resources, particularly > RAM. 
Of course I don't say it's not doable, I would use it in VM just for > testing stuff... As you can see, the host zfs doesn't have the checksum problem, so not really a vaste on resources. > >> > > >> I understand the error and the solution, but not really why it happen. In >> the meantime I got an answer from Wolfgang Link who thinks it could be a >> bit flip in memory... >> >> > Usually checksum errors are caused by RAM issues (other factors could be > damaged SATA cables and more). Are you using ECC RAM on the server? I would > suggest you to post this question on ZoL mailing list, there you can get > much better feedback about pros and cons. Server has no ECC (AMD Ryzen) with a Mainboard that doesn only support ECC in non-ECC mode (a bit useless). But thx for you help. > > Yannis > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > -- Harald Leithner ITronic Wiedner Hauptstra?e 120/5.1, 1050 Wien, Austria Tel: +43-1-545 0 604 Mobil: +43-699-123 78 4 78 Mail: leithner at itronic.at | itronic.at From aderumier at odiso.com Tue Jul 25 10:06:26 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Tue, 25 Jul 2017 10:06:26 +0200 (CEST) Subject: [PVE-User] Increase memory QXL graphic adapter In-Reply-To: <737035153.194483.1500967394763.JavaMail.zimbra@oxygem.tv> References: <992749884.192492.1500967261653.JavaMail.zimbra@oxygem.tv> <737035153.194483.1500967394763.JavaMail.zimbra@oxygem.tv> Message-ID: <1669312993.195774.1500969986096.JavaMail.zimbra@oxygem.tv> also discussion about youtube performance https://www.spinics.net/lists/spice-devel/msg27403.html "Since you are on el7 system you can test our nightly builds: https://copr.fedorainfracloud.org/coprs/g/spice/nightly/ which provides ability to switch the video encoder in spicy (package spice-gtk-tools) under Options menu. Your vm needs to have the video streaming enabled (set to 'filter' or 'all'). (virsh edit VM ; and add to graphics node) Also check if the image compression is turned on (ideally set to glz) " this can be tuned in -spice command line. (this need change in proxmox QemuServer.pm) "-spice port=5903,tls-port=5904,addr=127.0.0.1,\ x509-dir=/etc/pki/libvirt-spice,\ image-compression=auto_glz,jpeg-wan-compression=auto,\ zlib-glz-wan-compression=auto,\ playback-compression=on,streaming-video=all" not sure about new spicy video encoder. ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:23:14 Objet: Re: [PVE-User] Increase memory QXL graphic adapter another interesting article in deutsh http://linux-blog.anracom.com/2017/07/06/kvmqemu-mit-qxl-hohe-aufloesungen-und-virtuelle-monitore-im-gastsystem-definieren-und-nutzen-i/ ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:21:01 Objet: Re: [PVE-User] Increase memory QXL graphic adapter ovirt have a good draft for auto tune value, I wonder if we could we use this for proxmox ? http://www.ovirt.org/documentation/draft/video-ram/ default value are ram='65536', vram='65536', vgamem='16384', 'heads=1'. also, seem that a new vram64 value is available https://www.redhat.com/archives/libvir-list/2016-February/msg01082.html I think you can do tests with args: -global qxl-vga.ram_size=134217728 -global qxl-vga.vram_size=134217728 -global qxl-vga.vgamem_mb=32 .... 
----- Mail original ----- De: "Eric Abreu" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 04:36:04 Objet: [PVE-User] Increase memory QXL graphic adapter Hi. I wonder if there are ways of improving graphic performance of a kvm VM. Is there a way of increasing the memory of the virtual graphic adapter (QXL) from the terminal or tweaking the VM in a way that a user could stream YouTube videos without interruption. Thank you in advance. _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From daniel at linux-nerd.de Tue Jul 25 10:53:50 2017 From: daniel at linux-nerd.de (Daniel) Date: Tue, 25 Jul 2017 08:53:50 +0000 Subject: [PVE-User] Installation PRoblems with Proxmox 5 Message-ID: Hi there, i got 4 new Servers which i wanted to install Proxmox 5. I Downloaded the ISO and try the install via CD. After the Installtion I see the following error (attached as picture) Any Idea what I can do? -- Gr?sse Daniel From daniel at linux-nerd.de Tue Jul 25 11:05:45 2017 From: daniel at linux-nerd.de (Daniel) Date: Tue, 25 Jul 2017 09:05:45 +0000 Subject: [PVE-User] Installation PRoblems with Proxmox 5 Message-ID: <88DF50C1-2399-4C35-9817-74E0CA601247@linux-nerd.de> Seems Picture was removed: I got this error: Command ?chroot /target dpkg ?force-confold ?configure ?a failed With exit code 1 at /usr/bun/proxmoxinstall line 385 -- Gr?sse Daniel Am 25.07.17, 10:53 schrieb "pve-user im Auftrag von Daniel" : Hi there, i got 4 new Servers which i wanted to install Proxmox 5. I Downloaded the ISO and try the install via CD. After the Installtion I see the following error (attached as picture) Any Idea what I can do? -- Gr?sse Daniel _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From aderumier at odiso.com Tue Jul 25 11:12:08 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Tue, 25 Jul 2017 11:12:08 +0200 (CEST) Subject: [PVE-User] Increase memory QXL graphic adapter In-Reply-To: <1669312993.195774.1500969986096.JavaMail.zimbra@oxygem.tv> References: <992749884.192492.1500967261653.JavaMail.zimbra@oxygem.tv> <737035153.194483.1500967394763.JavaMail.zimbra@oxygem.tv> <1669312993.195774.1500969986096.JavaMail.zimbra@oxygem.tv> Message-ID: <87445321.199163.1500973928767.JavaMail.zimbra@oxygem.tv> seem that streaming-video=off by default, maybe can you test with editing /usr/share/perl5/PVE/QemuServer.pm push @$devices, '-spice', "tls-port=${spice_port},addr=$localhost,tls-ciphers=HIGH,seamless-migration=on"; and add push @$devices, '-spice', "tls-port=${spice_port},addr=$localhost,tls-ciphers=HIGH,seamless-migration=on,streaming-video=filter"; then restart systemctl restart pvedaemon and start your vm. 
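A quick way to confirm the patched line is actually being used (assuming a hypothetical VM id 101 with a SPICE display): after the pvedaemon restart and a cold start of the VM, the option should appear in the generated KVM command line. Keep in mind that /usr/share/perl5/PVE/QemuServer.pm belongs to the qemu-server package, so this local edit will be overwritten by the next package upgrade.

qm showcmd 101 | grep -o 'streaming-video=[a-z]*'
# expected output with the patched line from above:
# streaming-video=filter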
(test with "filter" and "all" value to compare) ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 10:06:26 Objet: Re: [PVE-User] Increase memory QXL graphic adapter also discussion about youtube performance https://www.spinics.net/lists/spice-devel/msg27403.html "Since you are on el7 system you can test our nightly builds: https://copr.fedorainfracloud.org/coprs/g/spice/nightly/ which provides ability to switch the video encoder in spicy (package spice-gtk-tools) under Options menu. Your vm needs to have the video streaming enabled (set to 'filter' or 'all'). (virsh edit VM ; and add to graphics node) Also check if the image compression is turned on (ideally set to glz) " this can be tuned in -spice command line. (this need change in proxmox QemuServer.pm) "-spice port=5903,tls-port=5904,addr=127.0.0.1,\ x509-dir=/etc/pki/libvirt-spice,\ image-compression=auto_glz,jpeg-wan-compression=auto,\ zlib-glz-wan-compression=auto,\ playback-compression=on,streaming-video=all" not sure about new spicy video encoder. ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:23:14 Objet: Re: [PVE-User] Increase memory QXL graphic adapter another interesting article in deutsh http://linux-blog.anracom.com/2017/07/06/kvmqemu-mit-qxl-hohe-aufloesungen-und-virtuelle-monitore-im-gastsystem-definieren-und-nutzen-i/ ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:21:01 Objet: Re: [PVE-User] Increase memory QXL graphic adapter ovirt have a good draft for auto tune value, I wonder if we could we use this for proxmox ? http://www.ovirt.org/documentation/draft/video-ram/ default value are ram='65536', vram='65536', vgamem='16384', 'heads=1'. also, seem that a new vram64 value is available https://www.redhat.com/archives/libvir-list/2016-February/msg01082.html I think you can do tests with args: -global qxl-vga.ram_size=134217728 -global qxl-vga.vram_size=134217728 -global qxl-vga.vgamem_mb=32 .... ----- Mail original ----- De: "Eric Abreu" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 04:36:04 Objet: [PVE-User] Increase memory QXL graphic adapter Hi. I wonder if there are ways of improving graphic performance of a kvm VM. Is there a way of increasing the memory of the virtual graphic adapter (QXL) from the terminal or tweaking the VM in a way that a user could stream YouTube videos without interruption. Thank you in advance. 
_______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From aderumier at odiso.com Tue Jul 25 14:31:46 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Tue, 25 Jul 2017 14:31:46 +0200 (CEST) Subject: [PVE-User] Matching VM-id and routed IP In-Reply-To: <20170723225231.jmpxqwcyf36uwqth@mew.swordarmor.fr> References: <20170723225231.jmpxqwcyf36uwqth@mew.swordarmor.fr> Message-ID: <3820779.206372.1500985906417.JavaMail.zimbra@oxygem.tv> Hi, they are no hook script for vm stop/start, but maybe can you try to hack /var/lib/qemu-server/pve-bridge this is the script that qemu is executing when vm stop/start, or nic is hotplugged. (I'm not sure about live migration, as you need to announce ip only when the vm is resumed on target host, and I think network script is launch before) also, I'm still working on cloudinit support, where we'll be able to put ip address for qemu machine in vm config. But for now, you can write config in /etc/pve/ if you want. I'm interested to see if it's working fine, as in the future, I would like to test this kind of setup, to eliminate layer2 on my network. (BTW, you can contact me directly if you want, in french ;) ----- Mail original ----- De: "Alarig Le Lay" ?: "proxmoxve" Envoy?: Lundi 24 Juillet 2017 00:52:31 Objet: [PVE-User] Matching VM-id and routed IP Hi, I?m wondering if there is a way to do some matching between the id of a VM and it?s routed IP. The plan is to announce the IPv4?s /32 and IPv6?s /48 of each VM running on an hypervisor to edge routers with iBGP or OSPF. Of course, a cluster will be set up, so ID will be unique in the cluster?s scope. My basic idea is to have something like 89.234.186.18/32 dev tap101i0 2a00:5884:8218::/48 dev tap101i0 on the host where VM 101 is running, and doing 'redistribute kernel' with quagga or bird. If this feature is not included in promox, is it possible to use the pmxcfs to achieve this (by example by writing the routed IP in /etc/pve/nodes/${node}/priv/101-routes.conf) and having some hooks at the startup/shutdown that will read that file? Thanks, -- alarig _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From abreuer1521 at gmail.com Tue Jul 25 18:48:27 2017 From: abreuer1521 at gmail.com (Eric Abreu) Date: Tue, 25 Jul 2017 10:48:27 -0600 Subject: [PVE-User] Increase memory QXL graphic adapter In-Reply-To: References: <992749884.192492.1500967261653.JavaMail.zimbra@oxygem.tv> <737035153.194483.1500967394763.JavaMail.zimbra@oxygem.tv> <1669312993.195774.1500969986096.JavaMail.zimbra@oxygem.tv> <87445321.199163.1500973928767.JavaMail.zimbra@oxygem.tv> Message-ID: Thanks a lot Alexandre, I will study all the posible solution and then I will test it. 
Best Regards Eric El jul 25, 2017 3:12 AM, "Alexandre DERUMIER" escribi?: seem that streaming-video=off by default, maybe can you test with editing /usr/share/perl5/PVE/QemuServer.pm push @$devices, '-spice', "tls-port=${spice_port},addr=$ localhost,tls-ciphers=HIGH,seamless-migration=on"; and add push @$devices, '-spice', "tls-port=${spice_port},addr=$ localhost,tls-ciphers=HIGH,seamless-migration=on,streaming-video=filter"; then restart systemctl restart pvedaemon and start your vm. (test with "filter" and "all" value to compare) ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 10:06:26 Objet: Re: [PVE-User] Increase memory QXL graphic adapter also discussion about youtube performance https://www.spinics.net/lists/spice-devel/msg27403.html "Since you are on el7 system you can test our nightly builds: https://copr.fedorainfracloud.org/coprs/g/spice/nightly/ which provides ability to switch the video encoder in spicy (package spice-gtk-tools) under Options menu. Your vm needs to have the video streaming enabled (set to 'filter' or 'all'). (virsh edit VM ; and add to graphics node) Also check if the image compression is turned on (ideally set to glz) " this can be tuned in -spice command line. (this need change in proxmox QemuServer.pm) "-spice port=5903,tls-port=5904,addr=127.0.0.1,\ x509-dir=/etc/pki/libvirt-spice,\ image-compression=auto_glz,jpeg-wan-compression=auto,\ zlib-glz-wan-compression=auto,\ playback-compression=on,streaming-video=all" not sure about new spicy video encoder. ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:23:14 Objet: Re: [PVE-User] Increase memory QXL graphic adapter another interesting article in deutsh http://linux-blog.anracom.com/2017/07/06/kvmqemu-mit-qxl- hohe-aufloesungen-und-virtuelle-monitore-im-gastsystem-definieren-und- nutzen-i/ ----- Mail original ----- De: "aderumier" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 09:21:01 Objet: Re: [PVE-User] Increase memory QXL graphic adapter ovirt have a good draft for auto tune value, I wonder if we could we use this for proxmox ? http://www.ovirt.org/documentation/draft/video-ram/ default value are ram='65536', vram='65536', vgamem='16384', 'heads=1'. also, seem that a new vram64 value is available https://www.redhat.com/archives/libvir-list/2016-February/msg01082.html I think you can do tests with args: -global qxl-vga.ram_size=134217728 -global qxl-vga.vram_size=134217728 -global qxl-vga.vgamem_mb=32 .... ----- Mail original ----- De: "Eric Abreu" ?: "proxmoxve" Envoy?: Mardi 25 Juillet 2017 04:36:04 Objet: [PVE-User] Increase memory QXL graphic adapter Hi. I wonder if there are ways of improving graphic performance of a kvm VM. Is there a way of increasing the memory of the virtual graphic adapter (QXL) from the terminal or tweaking the VM in a way that a user could stream YouTube videos without interruption. Thank you in advance. 
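Coming back to the Matching VM-id and routed IP thread a few messages above, the pve-bridge hack Alexandre suggests could look roughly like the wrapper below, combined with 'redistribute kernel' in quagga or bird. This is an untested sketch: the .orig backup name and the per-VM routes file are made-up illustrations of the naming idea from that thread, and live migration is not handled.

#!/bin/sh
# qemu calls the network script with the tap interface name (e.g. tap101i0) as $1
IFACE="$1"
VMID="$(printf '%s' "$IFACE" | sed 's/^tap\([0-9]\+\)i[0-9]\+$/\1/')"

# keep the stock behaviour (bridge attach etc.) by calling the original script
/var/lib/qemu-server/pve-bridge.orig "$@"

# one prefix per line, stored on pmxcfs as proposed in the thread (hypothetical path)
ROUTES="/etc/pve/nodes/$(hostname)/priv/${VMID}-routes.conf"
if [ -f "$ROUTES" ]; then
    while read -r PREFIX; do
        [ -n "$PREFIX" ] && ip route replace "$PREFIX" dev "$IFACE"
    done < "$ROUTES"
fi

With 'redistribute kernel' enabled on the routing daemon, the /32 and /48 routes installed on the tap interface are then announced to the edge routers as soon as the NIC comes up.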
_______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From marcelo_reimer at wycliffe.net Wed Jul 26 10:50:23 2017 From: marcelo_reimer at wycliffe.net (Marcelo Reimer) Date: Wed, 26 Jul 2017 10:50:23 +0200 Subject: [PVE-User] request iso for proxmox 3.2 Message-ID: <4c05e9af21d8c51b91ba405cbd58bed4@mail.gmail.com> Hi Does anyone have the iso for ve 3.2? I have a few legacy systems that need to be worked on and the iso files are missing. The main reason I need 3.2 is because 3.4 has a bug in x that won?t let me install it, I will upgrade to 3.4 when I get there. Marcelo Reimer IT Support Services Wycliffe Global Alliance Europe Area Siegenweg 30 57299 Burbach Tel: 02736 298 302 ext 125 Mob: 0157 5791 6974 marcelo_reimer at wycliffe.net www.wycliffe.net From vy.nt at vinahost.vn Wed Jul 26 10:52:46 2017 From: vy.nt at vinahost.vn (=?UTF-8?B?Tmd1eeG7hW4gVOG6pW4gVuG7uQ==?=) Date: Wed, 26 Jul 2017 15:52:46 +0700 Subject: [PVE-User] request iso for proxmox 3.2 In-Reply-To: <4c05e9af21d8c51b91ba405cbd58bed4@mail.gmail.com> References: <4c05e9af21d8c51b91ba405cbd58bed4@mail.gmail.com> Message-ID: Hello, Check the link https://archive.org/download/Proxmox-ve_Released_Iso Regards, [image: C?ng ty TNHH VinaHost] *Nguyen Tan Vy * Mobile: 0909619508 Skype: tonyha1090 *C?ng ty TNHH VinaHost* Collocation I Server Dedicated I SSD VPSI Domain I Web ? Email Hosting I Email Server Website: https://vinahost.vn Hotline: 19006046 ext 4 Tr? s? ch?nh: 351/31 N? Trang Long, P.13, B?nh Th?nh, TP. H? Ch? Minh V?n ph?ng ??i di?n: 154 Nguy?n Xi?n, Qu?n Thanh Xu?n, TP. H? N?i [image: Facebook] [image: Youtube] On Wed, Jul 26, 2017 at 3:50 PM, Marcelo Reimer wrote: > Hi > > > > Does anyone have the iso for ve 3.2? > > I have a few legacy systems that need to be worked on and the iso files are > missing. The main reason I need 3.2 is because 3.4 has a bug in x that > won?t let me install it, I will upgrade to 3.4 when I get there. > > > > Marcelo Reimer > > > > IT Support Services > > Wycliffe Global Alliance Europe Area > > Siegenweg 30 > > 57299 Burbach > > Tel: 02736 298 302 ext 125 > > Mob: 0157 5791 6974 > > marcelo_reimer at wycliffe.net > > www.wycliffe.net > _______________________________________________ > pve-user mailing list > pve-user at pve.proxmox.com > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From gilberto.nunes32 at gmail.com Wed Jul 26 23:26:45 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Wed, 26 Jul 2017 18:26:45 -0300 Subject: [PVE-User] Issues with CPU and Memory HotPlug Message-ID: Hi list I am using PVE 5.0, and try to hotplug Memory and CPU. 
I set the file /lib/udev/rules.d/80-hotplug-cpu-mem.rules, with SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1" SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online" I have install a VM with Ubuntu 17.04, with ZFS as storage, with 4 GB of memory. When I try decrease the memory, I receive a kernelops: kernel BUG at /build/linux-Fsa3B1/linux-4.10.0/mm/memory_hotplug.c:2172 And in the web interface I get: Parameter verification failed. (400) memory: hotplug problem - 400 parameter verification failed. dimm5: error unplug memory This is the conf of the vm: agent: 1 bootdisk: scsi0 cores: 2 cpu: Conroe hotplug: disk,network,usb,memory,cpu ide2: STG-01:iso/mini-ubuntu-64.iso,media=cdrom memory: 2048 name: VM-Ubuntu net0: virtio=FE:26:B8:AC:47:0B,bridge=vmbr0 numa: 1 ostype: l26 protection: 1 scsi0: STG-ZFS-01:vm-101-disk-1,size=32G scsihw: virtio-scsi-pci smbios1: uuid=5794fcd3-c227-46fe-bea9-84d41f75a0d7 sockets: 2 usb0: spice usb1: spice usb2: spice vga: qxl What is wrong??? Thanks From gilberto.nunes32 at gmail.com Thu Jul 27 00:15:59 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Wed, 26 Jul 2017 19:15:59 -0300 Subject: [PVE-User] Issues with CPU and Memory HotPlug In-Reply-To: References: Message-ID: UPDATE I thing I figured out that I can only increase and not decrease the resource in VM KVM, right??? 2017-07-26 18:26 GMT-03:00 Gilberto Nunes : > Hi list > > I am using PVE 5.0, and try to hotplug Memory and CPU. > I set the file > /lib/udev/rules.d/80-hotplug-cpu-mem.rules, with > > SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1" > SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online" > > I have install a VM with Ubuntu 17.04, with ZFS as storage, with 4 GB of memory. > > When I try decrease the memory, I receive a kernelops: > > kernel BUG at /build/linux-Fsa3B1/linux-4.10.0/mm/memory_hotplug.c:2172 > > And in the web interface I get: > > Parameter verification failed. (400) > > memory: hotplug problem - 400 parameter verification failed. dimm5: error unplug memory > > This is the conf of the vm: > > agent: 1 > bootdisk: scsi0 > cores: 2 > cpu: Conroe > hotplug: disk,network,usb,memory,cpu > ide2: STG-01:iso/mini-ubuntu-64.iso,media=cdrom > memory: 2048 > name: VM-Ubuntu > net0: virtio=FE:26:B8:AC:47:0B,bridge=vmbr0 > numa: 1 > ostype: l26 > protection: 1 > scsi0: STG-ZFS-01:vm-101-disk-1,size=32G > scsihw: virtio-scsi-pci > smbios1: uuid=5794fcd3-c227-46fe-bea9-84d41f75a0d7 > sockets: 2 > usb0: spice > usb1: spice > usb2: spice > vga: qxl > > > What is wrong??? > > Thanks > From aderumier at odiso.com Thu Jul 27 12:21:05 2017 From: aderumier at odiso.com (Alexandre DERUMIER) Date: Thu, 27 Jul 2017 12:21:05 +0200 (CEST) Subject: [PVE-User] Issues with CPU and Memory HotPlug In-Reply-To: References: Message-ID: <812922601.367777.1501150865813.JavaMail.zimbra@oxygem.tv> cpu hotplug/unplug works fine memory hotplug works fine memory unplug is mostly broken in linux and not implemented in windows. 
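For completeness, the hot-add direction is just a config change from the host while the guest is running. A rough sketch against the configuration posted above (VM id 101, hotplug already enabled and the udev rules in place; the values are examples only):

# hot-add a third vCPU (allowed up to cores x sockets, here 2 x 2 = 4)
qm set 101 -vcpus 3

# grow memory online from 2048 to 3072 MB; shrinking it again on a running Linux
# guest is the unplug case that triggers the memory_hotplug.c BUG quoted above
qm set 101 -memory 3072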
see notes here : https://pve.proxmox.com/wiki/Hotplug_(qemu_disk,nic,cpu,memory) Alexandre Derumier Ing?nieur syst?me et stockage Manager Infrastructure Fixe : +33 3 59 82 20 10 125 Avenue de la r?publique 59110 La Madeleine [ https://twitter.com/OdisoHosting ] [ https://twitter.com/mindbaz ] [ https://www.linkedin.com/company/odiso ] [ https://www.viadeo.com/fr/company/odiso ] [ https://www.facebook.com/monsiteestlent ] [ https://www.monsiteestlent.com/ | MonSiteEstLent.com ] - Blog d?di? ? la webperformance et la gestion de pics de trafic De: "Gilberto Nunes" ?: "proxmoxve" Envoy?: Mercredi 26 Juillet 2017 23:26:45 Objet: [PVE-User] Issues with CPU and Memory HotPlug Hi list I am using PVE 5.0, and try to hotplug Memory and CPU. I set the file /lib/udev/rules.d/80-hotplug-cpu-mem.rules, with SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1" SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online" I have install a VM with Ubuntu 17.04, with ZFS as storage, with 4 GB of memory. When I try decrease the memory, I receive a kernelops: kernel BUG at /build/linux-Fsa3B1/linux-4.10.0/mm/memory_hotplug.c:2172 And in the web interface I get: Parameter verification failed. (400) memory: hotplug problem - 400 parameter verification failed. dimm5: error unplug memory This is the conf of the vm: agent: 1 bootdisk: scsi0 cores: 2 cpu: Conroe hotplug: disk,network,usb,memory,cpu ide2: STG-01:iso/mini-ubuntu-64.iso,media=cdrom memory: 2048 name: VM-Ubuntu net0: virtio=FE:26:B8:AC:47:0B,bridge=vmbr0 numa: 1 ostype: l26 protection: 1 scsi0: STG-ZFS-01:vm-101-disk-1,size=32G scsihw: virtio-scsi-pci smbios1: uuid=5794fcd3-c227-46fe-bea9-84d41f75a0d7 sockets: 2 usb0: spice usb1: spice usb2: spice vga: qxl What is wrong??? Thanks _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From gilberto.nunes32 at gmail.com Thu Jul 27 13:14:41 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Thu, 27 Jul 2017 08:14:41 -0300 Subject: [PVE-User] Issues with CPU and Memory HotPlug In-Reply-To: <812922601.367777.1501150865813.JavaMail.zimbra@oxygem.tv> References: <812922601.367777.1501150865813.JavaMail.zimbra@oxygem.tv> Message-ID: Hum.... I got it... I'm just confuse about the terms... Thanks Em 27 de jul de 2017 07:21, "Alexandre DERUMIER" escreveu: cpu hotplug/unplug works fine memory hotplug works fine memory unplug is mostly broken in linux and not implemented in windows. see notes here : https://pve.proxmox.com/wiki/Hotplug_(qemu_disk,nic,cpu,memory) Alexandre Derumier Ing?nieur syst?me et stockage Manager Infrastructure Fixe : +33 3 59 82 20 10 125 Avenue de la r?publique 59110 La Madeleine [ https://twitter.com/OdisoHosting ] [ https://twitter.com/mindbaz ] [ https://www.linkedin.com/company/odiso ] [ https://www.viadeo.com/fr/ company/odiso ] [ https://www.facebook.com/monsiteestlent ] [ https://www.monsiteestlent.com/ | MonSiteEstLent.com ] - Blog d?di? ? la webperformance et la gestion de pics de trafic De: "Gilberto Nunes" ?: "proxmoxve" Envoy?: Mercredi 26 Juillet 2017 23:26:45 Objet: [PVE-User] Issues with CPU and Memory HotPlug Hi list I am using PVE 5.0, and try to hotplug Memory and CPU. 
I set the file /lib/udev/rules.d/80-hotplug-cpu-mem.rules, with SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1" SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online" I have install a VM with Ubuntu 17.04, with ZFS as storage, with 4 GB of memory. When I try decrease the memory, I receive a kernelops: kernel BUG at /build/linux-Fsa3B1/linux-4.10.0/mm/memory_hotplug.c:2172 And in the web interface I get: Parameter verification failed. (400) memory: hotplug problem - 400 parameter verification failed. dimm5: error unplug memory This is the conf of the vm: agent: 1 bootdisk: scsi0 cores: 2 cpu: Conroe hotplug: disk,network,usb,memory,cpu ide2: STG-01:iso/mini-ubuntu-64.iso,media=cdrom memory: 2048 name: VM-Ubuntu net0: virtio=FE:26:B8:AC:47:0B,bridge=vmbr0 numa: 1 ostype: l26 protection: 1 scsi0: STG-ZFS-01:vm-101-disk-1,size=32G scsihw: virtio-scsi-pci smbios1: uuid=5794fcd3-c227-46fe-bea9-84d41f75a0d7 sockets: 2 usb0: spice usb1: spice usb2: spice vga: qxl What is wrong??? Thanks _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user _______________________________________________ pve-user mailing list pve-user at pve.proxmox.com https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user From gilberto.nunes32 at gmail.com Mon Jul 31 23:26:34 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Mon, 31 Jul 2017 18:26:34 -0300 Subject: [PVE-User] Backup just stuck Message-ID: Hello list I have here a fresh installation of PVE 5. I have deploy a Windows 10 64 bits and try to make a backup! But, oddly, the backup process stuck in: INFO: starting new backup job: vzdump 102 --node PROXMOX01 --compress lzo --storage local --remove 0 --mode snapshot INFO: Starting Backup of VM 102 (qemu) INFO: status = running INFO: update VM 102: -lock backup INFO: VM Name: Win10-01 INFO: include disk 'scsi0' 'ZFS-LOCAL:vm-102-disk-2' 53G INFO: backup mode: snapshot INFO: ionice priority: 7 INFO: creating archive '/var/lib/vz/dump/vzdump-qemu-102-2017_07_31-18_17_07.vma.lzo' If I try to make a backup of KVM with Linux, it's gone! But with Windows 10 I get the above behavior! Somebody has the same effect?? What can I do to solve it? Thanks From gilberto.nunes32 at gmail.com Mon Jul 31 23:36:59 2017 From: gilberto.nunes32 at gmail.com (Gilberto Nunes) Date: Mon, 31 Jul 2017 18:36:59 -0300 Subject: [PVE-User] Backup just stuck In-Reply-To: References: Message-ID: Well I just remove the qemu guest agent and everything is ok! Perhaps some bug in Qemu Guest Agent or something??? Obrigado Cordialmente Gilberto Ferreira Consultor TI Linux | IaaS Proxmox, CloudStack, KVM | Zentyal Server | Zimbra Mail Server (47) 3025-5907 (47) 99676-7530 Skype: gilberto.nunes36 konnectati.com.br https://www.youtube.com/watch?v=dsiTPeNWcSE 2017-07-31 18:26 GMT-03:00 Gilberto Nunes : > Hello list > > > I have here a fresh installation of PVE 5. > > I have deploy a Windows 10 64 bits and try to make a backup! 
> > But, oddly, the backup process stuck in: > > > INFO: starting new backup job: vzdump 102 --node PROXMOX01 --compress lzo > --storage local --remove 0 --mode snapshot > INFO: Starting Backup of VM 102 (qemu) > INFO: status = running > INFO: update VM 102: -lock backup > INFO: VM Name: Win10-01 > INFO: include disk 'scsi0' 'ZFS-LOCAL:vm-102-disk-2' 53G > INFO: backup mode: snapshot > INFO: ionice priority: 7 > INFO: creating archive '/var/lib/vz/dump/vzdump-qemu- > 102-2017_07_31-18_17_07.vma.lzo' > > > If I try to make a backup of KVM with Linux, it's gone! But with Windows > 10 I get the above behavior! > > > Somebody has the same effect?? What can I do to solve it? > > > Thanks > > > > From lindsay.mathieson at gmail.com Mon Jul 31 23:59:57 2017 From: lindsay.mathieson at gmail.com (Lindsay Mathieson) Date: Tue, 1 Aug 2017 07:59:57 +1000 Subject: [PVE-User] Backup just stuck In-Reply-To: References: Message-ID: <3d3d5167-f7ac-db5e-c4d7-5b5156987667@gmail.com> On 1/08/2017 7:36 AM, Gilberto Nunes wrote: > Well > I just remove the qemu guest agent and everything is ok! > Perhaps some bug in Qemu Guest Agent or something??? Quite possibly, I've been having heaps of problems with it. the vss_freeze/thaw command often hangs with a 1 hour timeout. -- Lindsay Mathieson
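If the guest agent is indeed what hangs the job, the quickest workaround is to disable the agent option for that VM so vzdump stops issuing the fs-freeze/fs-thaw calls during snapshot mode, then re-run the backup. A minimal sketch using the VM id and options from the log above:

qm set 102 -agent 0
vzdump 102 --mode snapshot --compress lzo --storage local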