[PVE-User] Quorum Activity blocked
Piviul
piviul at riminilug.it
Mon Nov 7 08:44:03 CET 2022
Good morning, in a 3-node Proxmox 6.4 cluster all 3 nodes seem to be
working and all running VM guests continue to work, but if I try to
start a VM guest, the start fails with the message: "cluster not ready -
no quorum? (500)". This is the cluster manager status:
# pvecm status
Cluster information
-------------------
Name:             CSA-cluster1
Config Version:   3
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Nov 7 08:37:20 2022
Quorum provider:  corosync_votequorum
Nodes:            1
Node ID:          0x00000002
Ring ID:          2.91e
Quorate:          No

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      1
Quorum:           2 Activity blocked
Flags:

Membership information
----------------------
    Nodeid      Votes Name
0x00000002          1 192.168.255.2 (local)
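
For reference, the numbers above are consistent with the usual majority
rule: quorum is more than half of the expected votes, so with 3 expected
votes the threshold is 2, and a node that sees only its own vote is not
quorate. A minimal sketch of that arithmetic follows; the function names
are illustrative only and are not part of pvecm or corosync:

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True when the visible votes reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # Values taken from the "Votequorum information" section above.
    expected, total = 3, 1
    print("quorum threshold:", quorum_threshold(expected))
    print("quorate:", is_quorate(total, expected))
```

With the values from the status output this gives a threshold of 2 and
reports the node as not quorate, matching "Quorum: 2 Activity blocked".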
These are the first syslog entries showing that a problem occurred:
Nov 4 23:38:01 pve02 systemd[1]: Started Proxmox VE replication runner.
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] link: host: 3 link: 0 is down
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] host: host: 3 has no active links
Nov 4 23:38:28 pve02 corosync[1703]: [TOTEM ] Token has not been received in 2737 ms
Nov 4 23:38:30 pve02 corosync[1703]: [KNET ] rx: host: 3 link: 0 is up
Nov 4 23:38:30 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync members[2]: 1 2
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync left[1]: 3
Nov 4 23:38:32 pve02 corosync[1703]: [TOTEM ] A new membership (1.873) was formed. Members left: 3
Nov 4 23:38:32 pve02 corosync[1703]: [TOTEM ] Failed to receive the leave message. failed: 3
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: starting data syncronisation
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: starting data syncronisation
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Members[2]: 1 2
Nov 4 23:38:32 pve02 corosync[1703]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: received sync request (epoch 1/1626/00000009)
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: received sync request (epoch 1/1626/00000009)
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: received all states
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: leader is 1/1626
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: synced members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: all data is up to date
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: dfsm_deliver_queue: queue length 2
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: received all states
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: all data is up to date
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: dfsm_deliver_queue: queue length 46
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] link: host: 3 link: 0 is down
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] host: host: 3 has no active links
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] link: host: 1 link: 0 is down
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] host: host: 1 has no active links
Nov 4 23:38:42 pve02 corosync[1703]: [TOTEM ] Token has not been received in 2737 ms
Nov 4 23:38:43 pve02 corosync[1703]: [TOTEM ] A processor failed, forming new configuration: token timed out (3650ms), waiting 4380ms for consensus.
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Sync members[1]: 2
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Sync left[1]: 1
Nov 4 23:38:48 pve02 corosync[1703]: [TOTEM ] A new membership (2.877) was formed. Members left: 1
Nov 4 23:38:48 pve02 corosync[1703]: [TOTEM ] Failed to receive the leave message. failed: 1
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] notice: members: 2/1578
Nov 4 23:38:48 pve02 pmxcfs[1578]: [status] notice: members: 2/1578
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Members[1]: 2
Nov 4 23:38:48 pve02 corosync[1703]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 4 23:38:48 pve02 pmxcfs[1578]: [status] notice: node lost quorum
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] crit: received write while not quorate - trigger resync
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] crit: leaving CPG group
Nov 4 23:38:48 pve02 pve-ha-lrm[1943]: unable to write lrm status file - unable to open file '/etc/pve/nodes/pve02/lrm_status.tmp.1943' - Permission denied
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] notice: start cluster connection
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] crit: cpg_join failed: 14
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] crit: can't initialize service
Nov 4 23:38:55 pve02 pmxcfs[1578]: [dcdb] notice: members: 2/1578
Nov 4 23:38:55 pve02 pmxcfs[1578]: [dcdb] notice: all data is up to date
Nov 4 23:39:00 pve02 systemd[1]: Starting Proxmox VE replication runner...
Nov 4 23:39:01 pve02 pvesr[2146320]: trying to acquire cfs lock 'file-replication_cfg' ...
[...]
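
To follow how the membership shrank in the log above (first nodes 1 and
2, then node 2 alone), the "[QUORUM] Members[...]" lines can be pulled
out on their own. The following is only a sketch: the function name and
the regular expression are assumptions based on the line format shown in
this excerpt, and it expects each log entry on a single line.

```python
import re

# Matches lines such as:
#   Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Members[2]: 1 2
# "Sync members[...]" lines are deliberately not matched.
MEMBERS_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+corosync\[\d+\]:\s+"
    r"\[QUORUM\s*\]\s+Members\[\d+\]:\s*(.*)$"
)


def membership_timeline(lines):
    """Return a list of (timestamp, [node ids]) for each membership line."""
    timeline = []
    for line in lines:
        match = MEMBERS_RE.match(line.strip())
        if match:
            timestamp, ids = match.groups()
            timeline.append((timestamp, [int(n) for n in ids.split()]))
    return timeline


if __name__ == "__main__":
    sample = [
        "Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync members[2]: 1 2",
        "Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Members[2]: 1 2",
        "Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Members[1]: 2",
    ]
    for timestamp, members in membership_timeline(sample):
        print(timestamp, members)
```

Running the same function over the full syslog (for example, lines read
from a file) would show each point at which corosync reported a new
member list.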
What happened to my cluster? Does anyone have suggestions for
troubleshooting the problem?
Piviul