[PVE-User] Cluster issue!

Wed Jan 31 11:35:23 CET 2018

>> the bindnetaddr is not the ip of the master, but is used to determine in
>> which network corosync sends/receives (so it should not really matter if it
>> is yyy.120 or yyy.0 as long as those are in the same network)
Yes, I know that!
My question is: why do I need to change it manually?
I expected pvecm to do it automatically...
I tried it several times: I destroyed the cluster, did a fresh install of
Proxmox, and created a new cluster. Every time I opened
/etc/pve/corosync.conf, bindnetaddr was set to the IP of the NIC installed
in the first PVE cluster node.
Only after I changed it to the network address, instead of the host IP, did
the cluster work properly.
I do not understand why I need to change it manually!

---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36


2018-01-31 5:52 GMT-02:00 Dominik Csapak <d.csapak at proxmox.com>:

> On 01/30/2018 08:09 PM, Gilberto Nunes wrote:
>
>> Hi there
>>
>> After I changed the corosync.conf, the cluster is functioning again.
>>
>> Here's the original corosync.conf, just after I created the cluster:
>>
>> logging {
>>    debug: off
>>    to_syslog: yes
>> }
>>
>> nodelist {
>>    node {
>>      name: pve01
>>      nodeid: 1
>>      quorum_votes: 1
>>      ring0_addr: 10.10.10.210
>>    }
>>    node {
>>      name: pve02
>>      nodeid: 2
>>      quorum_votes: 1
>>      ring0_addr: 10.10.10.220
>>    }
>> }
>>
>> quorum {
>>    provider: corosync_votequorum
>> }
>>
>> totem {
>>    cluster_name: HOMECLUSTER
>>    config_version: 2
>>    interface {
>>      bindnetaddr: 10.10.10.120    -----------> this is the IP of "master" node....
>>
>
> the bindnetaddr is not the ip of the master, but is used to determine in
> which network corosync sends/receives (so it should not really matter if it
> is yyy.120 or yyy.0 as long as those are in the same network)
>
>
>>      ringnumber: 0
>>
>>    }
>>    ip_version: ipv4
>>    secauth: on
>>    version: 2
>> }
>> }
>>
>>
>> And this is the "now working" version:
>>
>> logging {
>>    debug: off
>>    to_syslog: yes
>> }
>>
>> nodelist {
>>    node {
>>      name: pve01
>>      nodeid: 1
>>      quorum_votes: 1
>>      ring0_addr: 10.10.10.210
>>    }
>>    node {
>>      name: pve02
>>      nodeid: 2
>>      quorum_votes: 1
>>      ring0_addr: 10.10.10.220
>>    }
>> }
>>
>> quorum {
>>    provider: corosync_votequorum
>> }
>>
>> totem {
>>    cluster_name: HOMECLUSTER
>>    config_version: 2
>>    interface {
>>      bindnetaddr: 10.10.10.0
>>      ringnumber: 0
>>      mcastport: 5405
>>    }
>>    transport: udpu
>>
>
> I guess this is what made it work; that is, multicast probably does not
> work properly in your network
>
>>    ip_version: ipv4
>>    secauth: on
>>    version: 2
>> }
>> logging {
>>          fileline: off
>>          to_logfile: yes
>>          to_syslog: yes
>>          debug: off
>>          logfile: /var/log/cluster/corosync.log
>>          debug: off
>>          timestamp: on
>>          logger_subsys {
>>                  subsys: AMF
>>                  debug: off
>>          }
>> }
>>
>>
>> After a reboot, everything is running smoothly.
>>
>> 2018-01-30 15:39 GMT-02:00 Gilberto Nunes <gilberto.nunes32 at gmail.com>:
>>
>>> Hi
>>>
>>> I have a fresh installation of Proxmox 5.1.
>>> In /etc/hosts I have:
>>>
>>> 127.0.0.1 localhost.localdomain localhost
>>> 10.10.10.210 pve01.domain.com pve01 pvelocalhost
>>> 10.10.10.220 pve02.domain.com pve02
>>>
>>> on both sides, pve01 and pve02.
>>>
>>> I formed the cluster with the command pvecm create HOMECLUSTER.
>>> I ssh'd to pve02 and ran pvecm add pve01.
>>> The cluster was formed as expected, but after 2 minutes I got this error
>>> in /var/log/syslog:
>>>
>>> Jan 30 15:23:04 pve01 corosync[1482]: error   [TOTEM ] FAILED TO RECEIVE
>>> Jan 30 15:23:04 pve01 corosync[1482]:  [TOTEM ] FAILED TO RECEIVE
>>> Jan 30 15:23:05 pve01 corosync[1482]: notice  [TOTEM ] A new membership (10.10.10.210:12) was formed. Members left: 2
>>> Jan 30 15:23:05 pve01 corosync[1482]: notice  [TOTEM ] Failed to receive the leave message. failed: 2
>>> Jan 30 15:23:05 pve01 corosync[1482]:  [TOTEM ] A new membership (10.10.10.210:12) was formed. Members left: 2
>>> Jan 30 15:23:05 pve01 corosync[1482]:  [TOTEM ] Failed to receive the leave message. failed: 2
>>>
>>> So I stopped the cluster (systemctl stop pve-cluster; systemctl stop
>>> corosync) and started pmxcfs -l (local mode).
>>> I saw this line in the /etc/pve/corosync.conf file:
>>>
>>>      bindnetaddr: 10.10.10.210
>>>
>>> After I changed this line to:
>>>
>>>      bindnetaddr: 10.10.10.0
>>>
>>> and restarted both nodes, the cluster went back to normal.
>>>
>>> Shouldn't the pvecm script write this second line itself?
>>> Why do I need to change it to the network address myself, instead of the
>>> pvecm script doing it automatically?
>>>
>>> I cannot understand!
>>>
>>> Any advice?
>>>
>>> Thanks a lot.
>>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>
>>
>