[PVE-User] pve-firewall, clustering and HA gone bad
mark at tuxis.nl
Thu Jun 13 13:30:15 CEST 2019
On Thu, Jun 13, 2019 at 12:34:28PM +0200, Thomas Lamprecht wrote:
> Do your ringX_addr in corosync.conf use the hostnames or the resolved
> addresses? As with nodes added on newer PVE (at least 5.1, IIRC) we try
> to resolve the nodename and use the resolved address to exactly avoid
> such issues. If it doesn't use that, I recommend changing that instead
> of the all-nodes-in-all-/etc/hosts approach.
It has the hostnames. It's a cluster upgraded from 4.2 up to current.
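
For anyone wanting to check this on their own cluster, here is a minimal
sketch (not part of PVE) that lists the ringX_addr values in a corosync.conf
and reports whether each one is a literal address or a hostname. The sample
config, the node names and the helper name classify_ring_addrs are made up
for illustration; only the 192.168.1.0/24 range comes from this thread.

```python
import ipaddress
import re

# Hypothetical sample config; on a real node you would read the
# cluster's corosync.conf instead of this string.
SAMPLE = """
nodelist {
  node {
    name: node1
    nodeid: 1
    ring0_addr: node1
  }
  node {
    name: node2
    nodeid: 2
    ring0_addr: 192.168.1.12
  }
}
"""

# Matches lines such as "ring0_addr: node1" or "ring1_addr: 10.0.0.1".
RING_RE = re.compile(r"^\s*(ring\d+_addr)\s*:\s*(\S+)\s*$", re.MULTILINE)


def classify_ring_addrs(conf_text):
    """Return (key, value, kind) tuples, kind being 'ip' or 'hostname'."""
    results = []
    for key, value in RING_RE.findall(conf_text):
        try:
            # Parses both IPv4 and IPv6 literals; raises ValueError otherwise.
            ipaddress.ip_address(value)
            kind = "ip"
        except ValueError:
            kind = "hostname"
        results.append((key, value, kind))
    return results


if __name__ == "__main__":
    for key, value, kind in classify_ring_addrs(SAMPLE):
        print(f"{key}: {value} ({kind})")
```

Any entry reported as "hostname" depends on name resolution and is a
candidate for being replaced with the resolved address.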
> > It seems that pve-firewall tries to detect localnet, but failed to do so
> > correctly. localnet should be 192.168.1.0/24, but instead it detected the
> > IPv6 addresses. Which isn't entirely incorrect, but IPv6 is not used for
> > clustering, so I should open IPv4 in the firewall, not IPv6. So it seems
> > like name resolution is used to define localnet, and not what corosync is
> > actually using.
> From a quick look at the code: That seems true and is definitely the
> wrong behavior :/
Ok, I'll file a bug for that.
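
To illustrate the behavior I would expect, here is a rough sketch. It is not
how pve-firewall is implemented, and localnet_for and the example addresses
are hypothetical (2001:db8::/64 is just the documentation prefix). The idea:
take the address corosync actually uses and pick the configured network that
contains it, instead of whatever the hostname happens to resolve to.

```python
import ipaddress


def localnet_for(ring_addr, configured_networks):
    """Return the configured network containing ring_addr, or None.

    ring_addr is the address corosync binds to; configured_networks is a
    list of "address/prefix" strings as found on the node's interfaces.
    """
    addr = ipaddress.ip_address(ring_addr)
    for net in configured_networks:
        # strict=False accepts interface-style values with host bits set,
        # e.g. "192.168.1.11/24" becomes 192.168.1.0/24.
        network = ipaddress.ip_network(net, strict=False)
        if addr.version == network.version and addr in network:
            return network
    return None


if __name__ == "__main__":
    nets = ["2001:db8::11/64", "192.168.1.11/24"]
    print(localnet_for("192.168.1.11", nets))
```

With an IPv4 ring address this yields the IPv4 network even when an IPv6
network is listed first, which is the result I expected for localnet here.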
> > 2: ha-manager should not be able to start the VMs when they are running
> > elsewhere
> This can only happen if fencing fails, and that fencing works is always
> a base assumption we must make (as else no HA is possible at all).
> So it would be interesting to know why fencing did not work here (see below
> for the reason I could not determine that yet, as I did not have your logs
> at hand)
We must indeed make assumptions. Are there ways we can assume better? :)
> The list trims attachments, could you please send them directly to my
> address? I'd really like to see those.
Attached again, so you should receive it now.
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | info at tuxis.nl