[PVE-User] TASK ERROR: cluster not ready - no quorum?

Shain Miley smiley at npr.org
Mon Mar 9 17:33:59 CET 2015


I am looking into the possibility that there is a multicast issue here, 
as I am unable to ping any of the multicast IP addresses on any of the nodes.
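
A minimal sketch of how multicast delivery between two nodes could be checked independently of corosync. The group address and port below are placeholders chosen for illustration, not values taken from this cluster; substitute the multicast address your cluster actually uses and a spare UDP port.

```python
import socket
import struct

# Hypothetical values for illustration only; replace with your own.
GROUP = "239.192.0.1"   # placeholder multicast group
PORT = 50000            # placeholder spare UDP port


def membership_request(group, iface_ip="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used by IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def listen(group=GROUP, port=PORT, timeout=10.0):
    """Join the group and wait for one datagram.

    Returns (sender_ip, payload), or None if nothing arrives in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group))
        sock.settimeout(timeout)
        data, sender = sock.recvfrom(1024)
        return sender[0], data
    except socket.timeout:
        return None
    finally:
        sock.close()


def send(message=b"mcast-test", group=GROUP, port=PORT, ttl=1):
    """Send one datagram to the group; ttl=1 keeps it on the local segment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        sock.sendto(message, (group, port))
    finally:
        sock.close()
```

Running listen() on one node and send() on another shortly afterwards gives a rough yes/no answer: if listen() returns None while ordinary unicast pings succeed, that points towards multicast being filtered somewhere between the nodes (for example by IGMP snooping on a switch).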

I have reached out to Cisco support for some additional help.

I will let you know what I find out.

Thanks again,

Shain


On 3/9/15 11:54 AM, Eneko Lacunza wrote:
> It seems yesterday something happened at 20:40:53:
>
> Mar 08 20:40:53 corosync [TOTEM ] FAILED TO RECEIVE
> Mar 08 20:41:05 corosync [CLM   ] CLM CONFIGURATION CHANGE
> Mar 08 20:41:05 corosync [CLM   ] New Configuration:
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.48)
> Mar 08 20:41:05 corosync [CLM   ] Members Left:
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.16)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.33)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.49)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.50)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.69)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.75)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.77)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.87)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.141)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.142)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.161)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.163)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.165)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.215)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.216)
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.219)
> Mar 08 20:41:05 corosync [CLM   ] Members Joined:
> Mar 08 20:41:05 corosync [QUORUM] Members[16]: 1 2 4 5 6 7 8 10 11 12 13 14 15 16 17 19
> Mar 08 20:41:05 corosync [QUORUM] Members[15]: 1 2 4 5 6 7 8 11 12 13 14 15 16 17 19
> Mar 08 20:41:05 corosync [QUORUM] Members[14]: 1 2 4 5 6 7 8 11 12 14 15 16 17 19
> Mar 08 20:41:05 corosync [QUORUM] Members[13]: 1 2 4 5 6 7 8 11 12 15 16 17 19
> Mar 08 20:41:05 corosync [QUORUM] Members[12]: 1 2 4 5 6 7 8 11 12 15 17 19
> Mar 08 20:41:05 corosync [QUORUM] Members[11]: 1 2 4 5 6 7 8 11 12 15 17
> Mar 08 20:41:05 corosync [QUORUM] Members[10]: 1 2 4 5 6 7 8 11 12 17
> Mar 08 20:41:05 corosync [CMAN  ] quorum lost, blocking activity
> Mar 08 20:41:05 corosync [QUORUM] This node is within the non-primary component and will NOT provide any services.
> Mar 08 20:41:05 corosync [QUORUM] Members[9]: 1 2 5 6 7 8 11 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[8]: 1 2 5 6 7 11 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[7]: 1 2 5 6 7 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[6]: 1 2 6 7 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[5]: 1 2 7 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[4]: 1 2 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[3]: 1 12 17
> Mar 08 20:41:05 corosync [QUORUM] Members[2]: 1 12
> Mar 08 20:41:05 corosync [QUORUM] Members[1]: 12
> Mar 08 20:41:05 corosync [CLM   ] CLM CONFIGURATION CHANGE
> Mar 08 20:41:05 corosync [CLM   ] New Configuration:
> Mar 08 20:41:05 corosync [CLM   ]     r(0) ip(172.31.2.48)
> Mar 08 20:41:05 corosync [CLM   ] Members Left:
> Mar 08 20:41:05 corosync [CLM   ] Members Joined:
> Mar 08 20:41:05 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Mar 08 20:41:05 corosync [CPG   ] chosen downlist: sender r(0) ip(172.31.2.48) ; members(old:17 left:16)
> Mar 08 20:41:05 corosync [MAIN  ] Completed service synchronization, ready to provide service
>
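
To make the log above easier to read, here is a rough sketch that pulls the member counts out of the "[QUORUM] Members[N]" lines and compares them with the majority needed for quorum. It assumes one vote per node and 19 expected votes (the number of nodes listed by "pvecm nodes"), which gives a threshold of 10; the function names are made up for this example.

```python
import re

# Matches the member count in lines such as
# "... corosync [QUORUM] Members[16]: 1 2 4 ..."
MEMBERS_RE = re.compile(r"\[QUORUM\] Members\[(\d+)\]")


def quorum_threshold(expected_votes):
    """Simple majority, assuming every node carries exactly one vote."""
    return expected_votes // 2 + 1


def member_counts(log_text):
    """Return the member counts in the order they appear in the log."""
    return [int(n) for n in MEMBERS_RE.findall(log_text)]


def first_below_quorum(counts, expected_votes):
    """Return the first count that falls below the majority, or None."""
    need = quorum_threshold(expected_votes)
    for count in counts:
        if count < need:
            return count
    return None
```

Fed the quoted log, member_counts() would yield the sequence 16, 15, ... down to 1, and the first count below the assumed threshold is 9, which is close to the point where the log reports "quorum lost, blocking activity".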
> Is the "pvecm nodes" output similar on all nodes?
>
> I don't have experience troubleshooting corosync, but it seems you have 
> to re-establish the corosync cluster and quorum.
>
> Check "corosync-quorumtool -l -i". Also check the cman_tool command for 
> diagnosing the cluster.
>
> Is the corosync service loaded and running? Does restarting it change 
> anything (service cman restart)?
>
>
>
> On 09/03/15 16:13, Shain Miley wrote:
>> Oddly enough...there is nothing in the latest corosync 
>> logfile...however the one from last night (when we started seeing the 
>> problem) has a lot of info in it.
>>
>> Here is the link to the entire file:
>>
>> http://717b5bb5f6a032ce28eb-fa7f03050c118691fd4b41bf00a93863.r71.cf1.rackcdn.com/corosync.log.1
>>
>> Thanks again for your help so far.
>>
>> Shain
>>
>> On 3/9/15 10:53 AM, Eneko Lacunza wrote:
>>> What about /var/log/cluster/corosync.log ?
>>>
>>> On 09/03/15 15:34, Shain Miley wrote:
>>>> Yes,
>>>>
>>>> All the nodes are pingable and resolvable via their hostname.
>>>>
>>>> Here is the output of 'pvecm nodes':
>>>>
>>>>
>>>> root@proxmox13:~# pvecm nodes
>>>> Node  Sts   Inc   Joined               Name
>>>>    1   X    964                        proxmox22
>>>>    2   X    964                        proxmox23
>>>>    3   X    756                        proxmox24
>>>>    4   X    808                        proxmox18
>>>>    5   X    964                        proxmox19
>>>>    6   X    964                        proxmox20
>>>>    7   X    964                        proxmox21
>>>>    8   X    964                        proxmox1
>>>>    9   X      0                        proxmox2
>>>>   10   X    756                        proxmox3
>>>>   11   X    964                        proxmox4
>>>>   12   M    696   2014-10-20 01:10:09  proxmox13
>>>>   13   X    904                        proxmox14
>>>>   14   X    848                        proxmox15
>>>>   15   X    856                        proxmox16
>>>>   16   X    836                        proxmox17
>>>>   17   X    964                        proxmox25
>>>>   18   X    960                        proxmox26
>>>>   19   X    868                        proxmox28
>>>>
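
For comparing this table across nodes, a small sketch that turns "pvecm nodes" output in the format shown above into a name-to-status mapping. It assumes that "M" in the Sts column marks a current member and "X" a node that is not, and it strips mail quote markers so the quoted text can be pasted in directly; the function names are invented for this example.

```python
def parse_pvecm_nodes(text):
    """Return {node_name: status} from `pvecm nodes` output."""
    nodes = {}
    for line in text.splitlines():
        # Drop mail quote markers and leading spaces before splitting.
        fields = line.lstrip("> ").split()
        # Data rows start with a numeric node ID, then the status letter;
        # the node name is always the last column.
        if len(fields) >= 4 and fields[0].isdigit():
            nodes[fields[-1]] = fields[1]
    return nodes


def members(nodes):
    """List the node names whose status is 'M'."""
    return [name for name, status in nodes.items() if status == "M"]
```

Applied to the table above, members() would return only proxmox13, which matches the single "M" row and the "Members[1]: 12" line in the corosync log.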
>>>> Thanks,
>>>>
>>>> Shain
>>>>
>>>> On 3/9/15 10:23 AM, Eneko Lacunza wrote:
>>>>> pvecm nodes
>>>>
>>>>
>>>> -- 
>>>> _NPR | Shain Miley| Manager of Systems and Infrastructure, Digital 
>>>> Media | smiley at npr.org | p: 202-513-3649
>>>
>>>
>>> -- 
>>> Zuzendari Teknikoa / Director Técnico
>>> Binovo IT Human Project, S.L.
>>> Telf. 943575997
>>>        943493611
>>> Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
>>> www.binovo.es


-- 
_NPR | Shain Miley| Manager of Systems and Infrastructure, Digital Media 
| smiley at npr.org | p: 202-513-3649