[pve-devel] [PATCH v2 cluster 3/5] Add verification and fallback to cluster join/addnode
Thomas Lamprecht
t.lamprecht at proxmox.com
Tue Dec 17 07:09:56 CET 2019
On 12/4/19 3:04 PM, Stefan Reiter wrote:
> Verify that the config of the new node is valid and compatible with the
> cluster (i.e. that the links for the new node match the currently
> configured nodes).
>
> Additionally, fallback is provided via a new parameter to addnode,
> 'new_node_ip'. Previously, fallback was handled on the joining node, by
> setting it's local IP as 'link0', however, a cluster with only one link,
> but numbered 1-7 is still valid, and a fallback is possible, but the old
> code would now fail.
>
> Instead, pass the locally resolved IP via a seperate parameter
> (resolving the IP on the cluster side is impractical, as IP resolution
> could fail or provide a wrong IP for Corosync).
>
> For compatibility reasons, allow fallback to occur via the old link0
> method as well, but mark with FIXME for future removal.
>
> Fallback fails in case the cluster has more than one link, in this case
> only the user can know which NIC/IP corresponds to which cluster link.
>
> Signed-off-by: Stefan Reiter <s.reiter at proxmox.com>
> ---
> data/PVE/API2/ClusterConfig.pm | 54 +++++++++++++++++++++++++++++++++-
> data/PVE/CLI/pvecm.pm | 6 ++++
> data/PVE/Cluster/Setup.pm | 6 ++++
> 3 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/data/PVE/API2/ClusterConfig.pm b/data/PVE/API2/ClusterConfig.pm
> index 2c5bb03..10dcd6d 100644
> --- a/data/PVE/API2/ClusterConfig.pm
> +++ b/data/PVE/API2/ClusterConfig.pm
> @@ -207,6 +207,12 @@ __PACKAGE__->register_method ({
> description => "Do not throw error if node already exists.",
> optional => 1,
> },
> + new_node_ip => {
> + type => 'string',
> + description => "IP Address of node to add. Used as fallback if no links are given.",
> + format => 'ip',
> + optional => 1,
> + },
> }),
> },
> returns => {
> @@ -268,6 +274,50 @@ __PACKAGE__->register_method ({
> $check_duplicate_addr->($link);
> }
>
> + # Make sure that the newly added node has links compatible with the
> + # rest of the cluster. To start, extract all links that currently
> + # exist. Check any node, all have the same links here (because of
> + # verify_conf above).
> + my $node_options = (values %$nodelist)[0];
> + my $cluster_links = {};
> + foreach my $opt (keys %$node_options) {
> + next if $opt !~ $PVE::Corosync::link_addr_re;
> + $cluster_links->{$2} = $node_options->{$opt};
> + }
> +
> + if (%$links) {
> + # verify specified links exist and none are missing
> + for my $linknum (0..$PVE::Corosync::max_link_iter) {
> + my $have_cluster = defined($cluster_links->{$linknum});
> + my $have_new = defined($links->{$linknum});
$have_cluster_link
$have_new_link (or _node_ ?)
else it's slightly confusing, i.e., as you'd check for if there's a cluster
> +
> + die "corosync: link $linknum exists in cluster config but wasn't specified for new node\n"
> + if $have_cluster && !$have_new;
> + die "corosync: link $linknum specified for new node doesn't exist in cluster config\n"
> + if !$have_cluster && $have_new;
> + }
> + } else {
> + # FIXME: Probably a good idea to remove this once we're sure
> + # noone would join a node old enough to need this to a recent
> + # cluster
# FIXME: remove in 8.0 or 7.0 if we need to switch clustering tech yet again ;)
> + $param->{new_node_ip} = $cluster_links->{0}->{address}
> + if !$param->{new_node_ip} && $cluster_links->{0}
> + && scalar(%$cluster_links) == 1;
<insert post-if for multiple line spanning statement nitpick-stanza here>
> +
> + # when called without any link parameters, fall back to
> + # new_node_ip, if all existing nodes only have a single link too
> + die "no links and no fallback ip (new_node_ip/link0) given, cannot join cluster\n"
> + if !$param->{new_node_ip};
but this and the other branch of the if above does not covers the:
* single link passed (e.g., the 0 one)
* that one isn't in the config
* but there's just a single link defined in the config (on another ID)
with fitting network
-> magically switch the passed ID to the config ID
> +
> + if (%$cluster_links == 1) {
please use scalar() to be more explicit, or even scalar(keys %$cluster_links) ?
> + my $linknum = (keys %$cluster_links)[0];
> + $links->{$linknum} = { address => $param->{new_node_ip} };
> + } else {
> + die "no links given but multiple links found on other nodes,"
> + . " fallback only supported on single-link clusters\n";
> + }
> + }
> +
> if (defined(my $res = $nodelist->{$name})) {
> $param->{nodeid} = $res->{nodeid} if !$param->{nodeid};
> $param->{votes} = $res->{quorum_votes} if !defined($param->{votes});
> @@ -494,7 +544,9 @@ __PACKAGE__->register_method ({
> path => 'join',
> method => 'POST',
> protected => 1,
> - description => "Joins this node into an existing cluster.",
> + description => "Joins this node into an existing cluster. If no links are"
> + . " given, default to IP resolved by node's hostname on single"
> + . " link (fallback fails for clusters with multiple links).",
> parameters => {
> additionalProperties => 0,
> properties => PVE::Corosync::add_corosync_link_properties({
> diff --git a/data/PVE/CLI/pvecm.pm b/data/PVE/CLI/pvecm.pm
> index df0b1b9..364d9b7 100755
> --- a/data/PVE/CLI/pvecm.pm
> +++ b/data/PVE/CLI/pvecm.pm
> @@ -401,6 +401,12 @@ __PACKAGE__->register_method ({
> $links->{$link}, get_standard_option('corosync-link'));
> }
>
> + # this will be used as fallback if no links are specified
> + push @$cmd, '--new_node_ip', $local_ip_address;
> +
you can only add that after 5/5 + guarding, else this breaks joining to a slightly
outdated node, which I want really to avoid.. If it'd be a user select-able switch
which triggers this, OK, but plainly adding a parameter isn't.
For the ssh way you could always fallback to just link0 if you implement the logic
I already suggested.
> + print "No local links given, will attempt fallback to $local_ip_address\n"
> + if !%$links;
"No cluster network links passed explicitly, fallback to local node IP '$local_ip_address'\n"
> +
> if (system (@$cmd) != 0) {
> my $cmdtxt = join (' ', @$cmd);
> die "unable to add node: command failed ($cmdtxt)\n";
> diff --git a/data/PVE/Cluster/Setup.pm b/data/PVE/Cluster/Setup.pm
> index 71a637f..8aa7f4b 100644
> --- a/data/PVE/Cluster/Setup.pm
> +++ b/data/PVE/Cluster/Setup.pm
> @@ -686,6 +686,12 @@ sub join {
> $args->{"link$link"} = PVE::Corosync::print_corosync_link($links->{$link});
> }
>
> + # this will be used as fallback if no links are specified
> + $args->{new_node_ip} = $local_ip_address;
same as above, needs to be guarded somehow.
> +
> + print "No local links given, will attempt fallback to $local_ip_address\n"
> + if !%$links;
same as above for the error message
> +
> print "Request addition of this node\n";
> my $res = eval { $conn->post("/cluster/config/nodes/$nodename", $args); };
> if (my $err = $@) {
>
More information about the pve-devel
mailing list