[pve-devel] has somebody already tested corosync3 alpha and new knet transport ?

Alexandre DERUMIER aderumier at odiso.com
Wed Jun 27 08:35:53 CEST 2018

>>FYI, knet is an abstraction layer, it still uses udp (aka multicast) 

Are you sure ? 


Currently supported
○ UDP (unicast)
○ SCTP (connection-oriented)
○ Loopback (for localhost only … obviously)
○ No multicast, but could be added if really wanted
○ No broadcast
■ We are no longer that insane

So only UDP unicast and SCTP.



Other options in the interface section do just what you might expect.
mcastport: <n> 
  tells knet to use that port number <n> for communication. The default remains the old one of 5405
  +linknumber, but you can override it per link here. Even though knet doesn't do actual multicasting
  the name remains for old time's sake
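
To make the default above concrete, here is a minimal sketch of the port rule as I read it: each link gets 5405 + linknumber unless mcastport overrides it. The function name and the overrides mapping are hypothetical, for illustration only; this is not corosync code.

```python
# Base port quoted in the documentation excerpt above.
DEFAULT_BASE_PORT = 5405


def knet_link_port(linknumber, overrides=None):
    """Return the port for a link (hypothetical helper).

    overrides maps linknumber -> port and stands in for a per-link
    mcastport setting; without an override the default is
    5405 + linknumber.
    """
    if linknumber < 0:
        raise ValueError("linknumber must be non-negative")
    if overrides and linknumber in overrides:
        return overrides[linknumber]
    return DEFAULT_BASE_PORT + linknumber


if __name__ == "__main__":
    # Link 0 and link 1 use the defaults, link 2 is overridden.
    for link in range(3):
        print(link, knet_link_port(link, overrides={2: 6000}))
```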

If it's really working without multicast, with lower latencies, that's a big improvement :)

>>Are there links to the presentation, could be interesting :) 
I'll try to get it. But it was more about causal consistency vs Paxos.
The guy is only beginning to implement his container orchestrator (in Rust).

----- Original Message -----
From: "Thomas Lamprecht" <t.lamprecht at proxmox.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com>
Sent: Wednesday, 27 June 2018 08:02:30
Subject: Re: [pve-devel] has somebody already tested corosync3 alpha and new knet transport ?


On 6/26/18 10:54 PM, Alexandre DERUMIER wrote: 
> I have found this presentation about the upcoming corosync3 (which seems to have reached alpha recently) 
> http://build.clusterlabs.org/corosync/presentations/2017-Kronosnet-The-new-face-of-corosync-communications.pdf 
> with the new kronosnet (knet) transport. 

Yes, I've been tracking it somewhat for over half a year. It looks really 
good on paper, but I haven't yet had time to do much testing, 
as it'd be in the PVE 6.X timeframe anyway. 

> Latency results are really impressive, and no more multicast! (users will be happy ;) 

FYI, knet is an abstraction layer, it still uses udp (aka multicast), 
as you do not get to handle a lot of links with a lot of nodes without 
multicast - i.e., multicast is a very good thing, even if some hosting 
environments and switch default settings are against it :) 
It can also use SCTP as transport method, which is a layer 4 protocol, 
on the same level as UDP or TCP - i.e., it's not encapsulated in those. 

> and a lot of other improvements (dynamic mtu, ifdown/ifup without breaking the cluster), and it seems to be compatible with corosync2 (with udp, udpu transports) 
> I'm still looking to build bigger proxmox clusters in the future :) 

Yes, it looks definitely nice and it's on our radar. I'll try to build 
a corosync 3 package if I get a bit of time to spare. 

> BTW, I was at a kubernetes/container conference in Paris today, 
> and one talk was by a guy trying to create his own orchestrator instead of kubernetes (because of problems with etcd, network lag bringing down the k8s master,...), 
> talking about clusters, paxos, strong consistency. 
> He's looking to use a causal consistency model instead of strong consistency. I had never heard of this, 
> but it seems really great for managing bigger clusters, and also geo clusters. 

You can do more in parallel with it. In strong consistency models all 
events (for our case, write/read operations) are ensured to be ordered. 
If node A sees write OP-A happen before write OP-B, then this principle 
guarantees that all other nodes see OP-B after OP-A. 
Causal consistency does this too, but only if OP-A and OP-B are related, 
i.e., they affect each other (like a write to the same file would). 
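
A rough sketch of the "related or not" distinction, using vector clocks. That is my choice for illustration and an assumption, not necessarily what cure or antidote do internally; all names here are made up. Two operations must be seen in the same order everywhere only if one causally precedes the other; otherwise nodes may apply them in any order.

```python
def happened_before(a, b):
    """True if vector clock a causally precedes vector clock b.

    Clocks are dicts mapping node name -> counter; a missing node
    counts as 0.
    """
    nodes = set(a) | set(b)
    no_greater = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    some_less = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return no_greater and some_less


def concurrent(a, b):
    """True if neither operation causally precedes the other."""
    nodes = set(a) | set(b)
    same = all(a.get(n, 0) == b.get(n, 0) for n in nodes)
    return not same and not happened_before(a, b) and not happened_before(b, a)


# OP-A: a write issued on node A.
op_a = {"A": 1}
# OP-B: a write on node B issued after B has seen OP-A (related).
op_b = {"A": 1, "B": 1}
# OP-C: an independent write on node C (unrelated to OP-A).
op_c = {"C": 1}

if __name__ == "__main__":
    print(happened_before(op_a, op_b))  # related: order must be kept
    print(concurrent(op_a, op_c))       # unrelated: any order is fine
```

A strongly consistent system would also fix the order of OP-A and OP-C on every node; the causal model only fixes OP-A before OP-B.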

Are there links to the presentation? Could be interesting :) 
It seems they use a protocol named "cure" for the update replication: 

> He gave a link to an open source key-value store using causal consistency, called "antidote" 
> https://syncfree.github.io/antidote/ 
> Maybe for the future (proxmox 10 ;), it could be great to have this kind of model. 
> (I'm not expert enough to say if it could work, or if it would be possible to reimplement pmxcfs with this kind of protocol and manage other things like pve-crm/lrm) 

Hmm, an academic erlang project with a rather short whitepaper, 
I'm a bit wary of such projects - but it sounds definitely interesting. 
