[pve-devel] [PATCH cluster v2 0/8] initial API adaption to corosync 3/kronosnet

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Jun 14 19:04:51 CEST 2019


On 6/14/19 5:50 PM, Thomas Lamprecht wrote:
> On 6/14/19 3:03 PM, Fabian Grünbichler wrote:
>> Jun 14 14:23:50 clustertest71 systemd[1]: Starting Corosync Cluster Engine...
>> Jun 14 14:23:50 clustertest71 corosync[2160]:   [MAIN  ] Corosync Cluster Engine 3.0.1-dirty starting up
>> Jun 14 14:23:50 clustertest71 corosync[2160]:   [MAIN  ] Corosync built-in features: dbus monitoring watchdog systemd xmlconf snmp pie relro bindnow
>> Jun 14 14:23:50 clustertest71 corosync[2160]:   [MAIN  ] parse error in config: Not all nodes have the same number of links
>> Jun 14 14:23:50 clustertest71 corosync[2160]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1386.
>> Jun 14 14:23:50 clustertest71 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
>> Jun 14 14:23:50 clustertest71 systemd[1]: corosync.service: Failed with result 'exit-code'.
>> Jun 14 14:23:50 clustertest71 systemd[1]: Failed to start Corosync Cluster Engine.
>>
>>[snip]

>> doing the same with link0 and link1 instead of link0 and link5 works.
>> subsequently changing corosync.conf to have link0 and linkX with X != 1
>> also works, although the reload complains with the same error message
>> (cmap and corosync-cfgtool show the updated status just fine).
>> restarting corosync fails, again with the status shown above.
>>
>> haven't checked yet whether that is an issue on our side or corosync,
>> but probably worth an investigation ;)
> 
> this is a "bug" in corosync.
> 
> the following check in exec/totemconfig fails:
> 
> for (i=0; i<num_configured; i++) {
> 	if (totem_config->interfaces[i].member_count != members) err...
> }
> 
> here, num_configured is the correct number of configured interfaces
> (2), the struct entry member_count is 1 (one node, which seems OK here
> too) but members is 0...
> 
> members is set a bit above with:
> members = totem_config->interfaces[0].member_count;
> 
> 
> but totem_config->interfaces gets dynamically allocated with:
> totem_config->interfaces = malloc (sizeof (struct totem_interface) * INTERFACE_MAX);
> 
> So the array is not indexed by the configured interfaces (0 being the
> lowest one configured, 1 the next, ...) but by the _actual_ link
> numbers from 0 to INTERFACE_MAX - 1 (== 7).
> 
> So here it _always_ gets member_count from link0; if link0 is
> not present in the config, then that is the default of 0...
> 
> So either link0 isn't as optional as you meant/wished, or they have
> at least one, and probably a few more, bugs where they falsely assume
> that interfaces[0] is the first configured link rather than link0...
> 
> 

see: https://github.com/corosync/corosync/pull/484
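
To make the failure mode quoted above easier to follow, here is a minimal
standalone sketch. It is not the corosync code and not necessarily what the
pull request does: struct iface, its "configured" flag and both function
names are made up for illustration, and INTERFACE_MAX is simply set to 8 to
match the "0 to 7" range mentioned above. The first function takes its
reference count from slot 0 unconditionally, as described; the second takes
it from the first link that is actually configured.

```c
#include <stdbool.h>
#include <stddef.h>

#define INTERFACE_MAX 8 /* assumption: links 0..7, as stated in the thread */

/* Hypothetical, simplified stand-in for struct totem_interface. */
struct iface {
	bool configured;           /* is this link number present in the config? */
	unsigned int member_count; /* number of nodes with an address on this link */
};

/* Reference count always comes from slot 0, even if link0 is unconfigured. */
static bool links_consistent_slot0(const struct iface *ifs)
{
	unsigned int members = ifs[0].member_count;

	for (size_t i = 0; i < INTERFACE_MAX; i++) {
		if (ifs[i].configured && ifs[i].member_count != members)
			return false;
	}
	return true;
}

/* Reference count comes from the first link that is actually configured. */
static bool links_consistent_first_configured(const struct iface *ifs)
{
	bool have_ref = false;
	unsigned int members = 0;

	for (size_t i = 0; i < INTERFACE_MAX; i++) {
		if (!ifs[i].configured)
			continue;
		if (!have_ref) {
			members = ifs[i].member_count;
			have_ref = true;
		} else if (ifs[i].member_count != members) {
			return false;
		}
	}
	return true;
}
```

With a zero-initialized array in which only links 1 and 5 are marked as
configured, each with one member (an illustrative setup, not the exact config
from the thread), links_consistent_slot0() returns false because slot 0 still
holds the default 0, while links_consistent_first_configured() returns true.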




