[pve-devel] question/idea : managing big proxmox cluster (100nodes), get rid of corosync ?

Thomas Lamprecht t.lamprecht at proxmox.com
Wed Sep 21 09:40:01 CEST 2016


On 09/21/2016 08:50 AM, Alexandre DERUMIER wrote:
>>> Forgot to mention that consul supports multiple clusters and/or multi
>>> center clusters out of the box.
> yes, I read the docs yesterday. It seems very interesting.
>
> Most of the work would be replacing pmxcfs with the consul kv store. I have seen some consul FUSE fs implementations,
> but they don't have all pmxcfs features (like symlinks, for example).
>
> Zookeeper seems to be lower level.
>
> reading the sheepdog plugin: (1500 loc)
>
> https://github.com/sheepdog/sheepdog/blob/8772904509ce6b10c5edca4f497022686aecc18f/sheep/cluster/zookeeper.c
> vs
> https://github.com/sheepdog/sheepdog/blob/8772904509ce6b10c5edca4f497022686aecc18f/sheep/cluster/corosync.c
Discussing and evaluating options is good, but instantly throwing everything
away and switching to another - not necessarily better - cluster stack is
maybe a bit of an overreaction. :) I also think that our current cluster
stack, with corosync + pve-cluster (pmxcfs), is quite stable, and a lot of
things depend on it.

Also, corosync is very well tested software and works really well, at least
with small to mid-size clusters (< 60 nodes - which I find is quite an
achievement for a cluster!). You also have to consider that quite some
overhead, and thus the node limitation, may come from the database used by
pmxcfs: each transaction needs to be synced to disk to make everything
reliable, and while this is quite optimized it makes things slower (placing
the DB on really fast storage could help here).
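
To get a rough feeling for the cost of syncing every transaction to disk, a
small sketch like the following could be run on the storage in question. It
is only an illustration, not pmxcfs code: the table layout, the helper name
timed_commits and the row count are made up for this example, and it simply
compares SQLite commits with PRAGMA synchronous set to FULL and to OFF.

```python
import os
import sqlite3
import tempfile
import time


def timed_commits(synchronous, n=200):
    """Return the seconds needed for n single-row commits.

    `synchronous` is a SQLite PRAGMA synchronous level, e.g. "FULL" or "OFF".
    The schema is hypothetical and only serves the measurement.
    """
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        con = sqlite3.connect(path)
        con.execute(f"PRAGMA synchronous = {synchronous}")
        con.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")
        con.commit()
        start = time.perf_counter()
        for i in range(n):
            con.execute("INSERT INTO kv VALUES (?, ?)", (f"key{i}", b"x" * 512))
            con.commit()  # one transaction per write
        elapsed = time.perf_counter() - start
        con.close()
        return elapsed
    finally:
        os.remove(path)


if __name__ == "__main__":
    for level in ("FULL", "OFF"):
        print(level, round(timed_commits(level), 3), "s")
```

The absolute numbers depend entirely on the disk and filesystem, so only the
ratio between the two runs on the same machine says anything.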

I, personally, would prefer to keep corosync and introduce a protocol which
allows connecting multiple clusters (easier said than done, but still less
change and work than adapting to another cluster stack, which is almost
surely not better, or has other drawbacks.)

Also taking a look at the corosync satellite approach sounds interesting.

Connecting multiple clusters is also a different approach than a small
cluster with a lot of satellite nodes per cluster node. I see the former as
better, as it's more decentralized and seems to fit better in our current
design. :)

>
> Note that for scaling, zookeeper, consul, ... have some kind of master nodes for the quorum, and client nodes (same as corosync satellite).
> I don't think it's technically possible to scale full-mesh master nodes to a lot of nodes.

No, with full mesh you won't really overcome the limits and problems
corosync has here; corosync utilizes the possibilities quite well with
multicast here.
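
As a back-of-the-envelope illustration of why a full mesh gets expensive,
the number of node pairs grows quadratically with the cluster size. The
helper name below is made up for this sketch, and it only counts pairs; it
says nothing about how corosync actually schedules its traffic.

```python
def full_mesh_links(nodes):
    """Number of distinct node pairs in a full mesh of `nodes` members."""
    return nodes * (nodes - 1) // 2


for n in (16, 60, 100):
    # 16 -> 120, 60 -> 1770, 100 -> 4950
    print(n, "nodes:", full_mesh_links(n), "pairs")
```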

@Alexandre, you say that with 16 nodes the cluster is pretty much at its
maximum. Can I get some more information from you, as I currently do not
have the hardware to test this? :)

Do you use IGMP snooping/queriers?
Which network does corosync communicate on, an independent one? And how
fast is it?
Do you also use redundant rings?


>
> ----- Original Message -----
> From: "datanom.net" <mir at datanom.net>
> To: "pve-devel" <pve-devel at pve.proxmox.com>
> Sent: Wednesday, 21 September 2016 07:49:06
> Subject: Re: [pve-devel] question/idea : managing big proxmox cluster (100nodes), get rid of corosync ?
>
> On Wed, 21 Sep 2016 01:45:18 +0200
> Michael Rasmussen <mir at datanom.net> wrote:
>
>> https://github.com/hashicorp/consul
>>
> Forgot to mention that consul supports multiple clusters and/or multi
> center clusters out of the box.
>




