[pve-devel] [PATCH cluster] fix #3596: handle delnode of offline node

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Nov 12 13:14:17 CET 2021


On 12.11.21 12:50, Fabian Ebner wrote:
> Am 12.11.21 um 09:45 schrieb Fabian Grünbichler:
>> the recommended way is to first shutdown, then delnode, and never let it
>> come back online, in which case corosync-cfgtool won't be able to kill
>> the removed (offline) node.
>>
>> also, the order was wrong - if we first update corosync.conf to remove
>> the node entry from the nodelist, corosync doesn't know about the nodeid
>> anymore, so killing will fail even if the node is still online.
>>
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
>> ---
>>   data/PVE/API2/ClusterConfig.pm | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/data/PVE/API2/ClusterConfig.pm b/data/PVE/API2/ClusterConfig.pm
>> index 8f4a5bb..5a6a1ac 100644
>> --- a/data/PVE/API2/ClusterConfig.pm
>> +++ b/data/PVE/API2/ClusterConfig.pm
>> @@ -485,9 +485,13 @@ __PACKAGE__->register_method ({
>>             delete $nodelist->{$node};
>>   -        PVE::Corosync::update_nodelist($conf, $nodelist);
>> +        # allowed to fail when node is already shut down!
>> +        eval {
>> +        PVE::Tools::run_command(['corosync-cfgtool','-k', $nodeid])
>> +            if defined($nodeid);
>> +        };
>>   
> 
> But what if it fails for a different reason than 'CS_ERR_NOT_EXIST'? Shouldn't we match the error?

at least that examples is like ENOENT on unlink, an OK error (user could
have -k'illed it before that).





More information about the pve-devel mailing list