[PVE-User] Whole cluster brokes

Daniel daniel at linux-nerd.de
Wed Mar 8 13:12:59 CET 2017


Hi,

I was able to resolve this myself. After I restarted the network interface (bonding), it was working again.
So maybe the problem was the bonding in this case.
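
A minimal sketch of how the state of a bond could be inspected before or after restarting it. The thread does not name the bond device, so `bond0` is an assumption, and `bond_status` is a hypothetical helper that only reads the kernel's bonding status file:

```shell
#!/bin/sh
# Assumption: the bond is called bond0; the thread does not name it.
BOND="${BOND:-bond0}"

# Print the bonding mode plus each slave interface and its MII link status.
# An optional first argument overrides the status file to read.
bond_status() {
    status_file="${1:-/proc/net/bonding/$BOND}"
    if [ ! -r "$status_file" ]; then
        echo "no bonding status at $status_file" >&2
        return 1
    fi
    grep -E '^(Bonding Mode|Slave Interface|MII Status)' "$status_file"
}

# Do not abort on hosts that have no such bond.
bond_status || true
```

A slave reported with `MII Status: down` would be one hint that the bond, rather than the switch, is dropping traffic.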


-- 
Regards
 
Daniel

On 08.03.17, 12:51, "pve-user on behalf of Daniel" <pve-user-bounces at pve.proxmox.com on behalf of daniel at linux-nerd.de> wrote:

    Hi,
    
    There were absolutely no network changes at all.
    
    I got some strange errors:
    
    omping: Can't get addr info for omping: Name or service not known
    
    On host01 it is working with
    omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
    
    On host2 I ran the following and got an error:
    
    omping -c 10 -i 1 -q 10.0.2.111 10.0.2.110
    
    And then I got the error omping: Can't get addr info for omping: Name or service not known
    
    I absolutely can't understand what is happening here.
    All servers have the same network config.
    
    -- 
    Regards
     
    Daniel
    
    On 08.03.17, 12:39, "pve-user on behalf of Thomas Lamprecht" <pve-user-bounces at pve.proxmox.com on behalf of t.lamprecht at proxmox.com> wrote:
    
        Hi,
        On 03/08/2017 11:38 AM, Daniel wrote:
        > Hi,
        >
        > when I try the command with 2 nodes I get the following error.
        > So it really seems to be a multicast problem.
        >
        > root at host01:~# omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
        > 10.0.2.111 : waiting for response msg
        > 10.0.2.111 : waiting for response msg
        
        The command is OK like this; the one from your other mail is not.
        But you have to start it on both the node with IP 10.0.2.110 *and* the 
        one with 10.0.2.111 to make it work.
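
A minimal sketch of starting the same omping command on both nodes at once. The addresses are the ones from this thread; `run_omping_on_all` is a hypothetical helper, and passwordless root SSH to each node is an assumption. By default it only prints what it would run:

```shell
#!/bin/sh
# Addresses taken from the thread; adjust for your own cluster.
NODES="10.0.2.110 10.0.2.111"

# Print (DRY_RUN=1, the default) or run the same omping command on every node.
# Assumption: root can SSH to each node without a password prompt.
run_omping_on_all() {
    for node in $NODES; do
        # -c 10: ten probes, -i 1: one per second, -q: summary only
        cmd="omping -c 10 -i 1 -q $NODES"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh root@$node $cmd"
        else
            ssh "root@$node" "$cmd" &
        fi
    done
    # Wait for all background omping runs (a no-op in dry-run mode).
    wait
}

run_omping_on_all
```

Every node gets the full address list, so each instance finds its own address in the list and probes the others.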
        
        >
        > I can't restart pve-cluster; I get errors. Corosync has not been restarted yet. And yes, I don't have HA configured yet.
        > Is there any special command to restart Corosync?
        
        systemctl restart corosync
        
        >
        > Would it help if I tried this on one node?
        >
        > echo 1 > /sys/devices/virtual/net/vmbr0/bridge/multicast_querier
        >   
        
        Yes, you can try that.
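
A minimal sketch for checking the bridge's multicast flags before changing them. The bridge name `vmbr0` and the sysfs path come from the command quoted above; `show_multicast_flags` and `enable_querier` are hypothetical helper names:

```shell
#!/bin/sh
# Bridge name and sysfs path as used in the thread.
BRIDGE="${BRIDGE:-vmbr0}"
SYSFS_DIR="${SYSFS_DIR:-/sys/devices/virtual/net/$BRIDGE/bridge}"

# Print the current snooping and querier flags (1 = enabled, 0 = disabled).
show_multicast_flags() {
    for flag in multicast_snooping multicast_querier; do
        printf '%s=%s\n' "$flag" "$(cat "$SYSFS_DIR/$flag")"
    done
}

# Enable the querier on this bridge. Needs root; the setting is lost on reboot.
enable_querier() {
    echo 1 > "$SYSFS_DIR/multicast_querier"
}

# Only report when the bridge actually exists on this host.
if [ -d "$SYSFS_DIR" ]; then
    show_multicast_flags
fi
```

The sketch only reports the flags; `enable_querier` has to be called explicitly, as root, on the one node that should act as querier.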
        
        > I am not sure how long the cluster was working after 13 was shut down.
        >
        Changes on the switch/network?
        
        > OK, it seems that multicast is not working anymore. But how can this happen? It was working before without any trouble.
        
        As said, either there were changes in the network, or the other node
        really acted as a multicast querier.
        
        But the omping output looks like no multicast is working at all. With a
        missing querier you would get problems after about 5 minutes, but before
        that it should work.
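
Since a missing querier would only show up after about 5 minutes, a longer omping run can tell the two cases apart. A minimal sketch, assuming the same two addresses; `long_omping_cmd` is a hypothetical helper that only prints the command, and the 10-minute duration is an arbitrary choice that exceeds the 5 minutes mentioned above:

```shell
#!/bin/sh
# Addresses taken from the thread; adjust for your own cluster.
NODES="10.0.2.110 10.0.2.111"

# Build an omping command that sends one probe per second
# for the given number of minutes.
long_omping_cmd() {
    minutes="$1"
    count=$((minutes * 60))
    echo "omping -c $count -i 1 -q $NODES"
}

# Prints: omping -c 600 -i 1 -q 10.0.2.110 10.0.2.111
long_omping_cmd 10
```

As with the short test, the printed command has to be started on all listed nodes at roughly the same time.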
        
        
        
        
        _______________________________________________
        pve-user mailing list
        pve-user at pve.proxmox.com
        http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
        
    
    


