[PVE-User] BIG cluster questions

Thu Jul 1 08:45:38 CEST 2021

Hi Laurent,

Sure, subscription is coming, but it's going through another channel ;)

On 25/6/21 at 19:33, Laurent Dumont wrote:
> This is anecdotal but I have never seen one cluster that big. You 
> might want to inquire about professional support which would give you 
> a better perspective for that kind of scale.
>
> On Thu, Jun 24, 2021 at 10:30 AM Eneko Lacunza via pve-user
> <pve-user at lists.proxmox.com> wrote:
>
>
>
>
>     ---------- Forwarded message ----------
>     From: Eneko Lacunza <elacunza at binovo.es>
>     To: "pve-user at pve.proxmox.com" <pve-user at pve.proxmox.com>
>     Cc:
>     Bcc:
>     Date: Thu, 24 Jun 2021 16:30:31 +0200
>     Subject: BIG cluster questions
>     Hi all,
>
>     We're currently helping a customer configure an 88-server
>     virtualization cluster for VDI.
>
>     Right now we're testing the feasibility of building a single Proxmox
>     cluster of 88 nodes. A 4-node cluster has also been configured for
>     comparison (same servers and networking/racks).
>
>     Each node has 2 NICs with 2x25 Gbps ports each. Currently two LACP
>     bonds are configured (one per NIC): one for storage (NFS v4.2) and
>     the other for everything else (VMs, cluster).
>
>     The cluster has two rings, one on each bond.
>
>     - With the clusters at rest (no significant number of VMs running),
>     we see quite different average corosync/knet latency on our 88-node
>     cluster (~300-400) and on our 4-node cluster (<100).
>
>
>     For 88-node cluster:
>
>     - Creating some VMs (say 16), one every 30 s, works well.
>     - Destroying some VMs (say 16), one every 30 s, outputs error
>     messages (related to the storage cfs lock) and fails to remove some
>     of the VMs.
>
>     - Rebooting 32 nodes, one every 30 seconds (a node takes about 120 s
>     to boot) so that quorum is never lost, creates a cluster traffic
>     "flood". Some of the rebooted nodes don't rejoin the cluster, and the
>     WUI shows all nodes in cluster quorum with a grey ? instead of a
>     green OK. In this situation corosync latency on some nodes can
>     skyrocket to tens or hundreds of times the values seen before the
>     reboots. Access to pmxcfs is very slow, and the only fix we have
>     found is rebooting all nodes.
>
>     - We have tried changing the knet transport on one ring from UDP to
>     SCTP, as reported here:
>     https://forum.proxmox.com/threads/proxmox-6-2-corosync-3-rare-and-spontaneous-disruptive-udp-5405-storm-flood.75871/page-2
>     This gives better corosync latencies, but the reboot issue
>     continues.
>
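As a rough sanity check on the rolling-reboot scenario quoted above, the quorum arithmetic can be sketched as follows. This is only a sketch and assumes things not stated in the thread: one vote per node, the usual majority rule (floor(N/2) + 1), and that a node counts as down for the whole ~120 s boot time (the time needed to rejoin the cluster is ignored). The function names are made up for illustration.

```python
import math


def quorum_votes(total_nodes: int) -> int:
    """Votes needed for a majority, assuming one vote per node."""
    return total_nodes // 2 + 1


def max_nodes_down(reboot_interval_s: float, downtime_s: float, batch: int) -> int:
    """Upper bound on nodes down at the same time during a rolling reboot.

    A new reboot starts every `reboot_interval_s` seconds and each node is
    assumed to be unavailable for `downtime_s` seconds.
    """
    overlapping = math.ceil(downtime_s / reboot_interval_s)
    return min(batch, overlapping)


total_nodes = 88                      # cluster size from the thread
needed = quorum_votes(total_nodes)
down = max_nodes_down(reboot_interval_s=30, downtime_s=120, batch=32)
remaining = total_nodes - down

print(f"votes needed for quorum: {needed}")
print(f"nodes down at once (estimate): {down}")
print(f"nodes still up: {remaining}, quorate: {remaining >= needed}")
```

Under these assumptions only a handful of nodes are down at any moment, so the vote count alone would not explain the behaviour described; a longer effective downtime (for example, slow rejoins) would raise the estimate.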
>     We don't know whether both issues are related or not.
>
>     Could LACP bonds be the issue?
>     https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_network_configuration
>     "
>     If your switch support the LACP (IEEE 802.3ad) protocol then we
>     recommend using the corresponding bonding mode (802.3ad). Otherwise
>     you should generally use the active-backup mode.
>     If you intend to run your cluster network on the bonding interfaces,
>     then you have to use active-passive mode on the bonding interfaces,
>     other modes are unsupported.
>     "
>     Going by the second paragraph, we understand that running cluster
>     networking over an LACP bond is not supported (just to confirm our
>     interpretation)? We're in the process of reconfiguring
>     nodes/switches to test without a bond, to see if that gives us a
>     stable cluster (we will report on this). Do you think this could be
>     the issue?
>
>
>     Now for more general questions: do you think an 88-node Proxmox VE
>     cluster is feasible?
>
>     Those 88 nodes will host about 14,000 VMs. Will the HA manager be
>     able to manage them, or are there too many? (HA for those VMs doesn't
>     seem to be a requirement right now.)
>
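For a sense of scale behind the HA question above, the average VM density follows directly from the two figures given (88 nodes, about 14,000 VMs). A minimal sketch: the function name and the failed-node scenario are illustrative assumptions, and it assumes VMs are spread evenly across the surviving nodes.

```python
def vms_per_node(total_vms: int, total_nodes: int, failed_nodes: int = 0) -> float:
    """Average VMs per surviving node, assuming an even spread."""
    surviving = total_nodes - failed_nodes
    if surviving <= 0:
        raise ValueError("no surviving nodes to host VMs")
    return total_vms / surviving


# Figures from the thread: about 14,000 VMs on 88 nodes.
print(f"all nodes up: {vms_per_node(14_000, 88):.1f} VMs per node")
# Hypothetical scenario: 4 nodes unavailable at the same time.
print(f"4 nodes down: {vms_per_node(14_000, 88, failed_nodes=4):.1f} VMs per node")
```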
>
>     Thanks a lot
>     Eneko
>
>
>     Eneko Lacunza
>     CTO | Zuzendari teknikoa
>     Binovo IT Human Project
>
>     Tel. 943 569 206
>     elacunza at binovo.es
>     binovo.es
>     Astigarragako Bidea, 2 - 2 izda. Oficina 10-11, 20180 Oiartzun
>
>     youtube: https://www.youtube.com/user/CANALBINOVO/
>     linkedin: https://www.linkedin.com/company/37269706/
>

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/