[PVE-User] Whole cluster brokes

Daniel daniel at linux-nerd.de
Fri Mar 10 15:53:48 CET 2017


Hi there,

I got the same error again today after adding VLANs on the switch and removing them again:

Mar 10 15:51:59 host01 systemd[1]: Stopping The Proxmox VE cluster filesystem...
Mar 10 15:52:00 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127136) was formed. Members
Mar 10 15:52:00 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:00 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
Mar 10 15:52:01 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127140) was formed. Members
Mar 10 15:52:01 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:01 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
Mar 10 15:52:04 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127144) was formed. Members
Mar 10 15:52:04 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:04 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
Mar 10 15:52:10 host01 systemd[1]: pve-cluster.service stop-sigterm timed out. Killing.
Mar 10 15:52:10 host01 cron[2006]: (*system*vzdump) CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
Mar 10 15:52:10 host01 pve-ha-lrm[2128]: unable to write lrm status file - unable to open file '/etc/pve/nodes/host01/lrm_status.tmp.2128' - Transport endpoint is not connected
Mar 10 15:52:10 host01 systemd[1]: pve-cluster.service: main process exited, code=killed, status=9/KILL
Mar 10 15:52:10 host01 systemd[1]: Unit pve-cluster.service entered failed state.
Mar 10 15:52:10 host01 systemd[1]: Starting The Proxmox VE cluster filesystem...
Mar 10 15:52:10 host01 pmxcfs[23185]: [status] notice: update cluster info (cluster name  fcse, version = 13)
Mar 10 15:52:10 host01 pmxcfs[23185]: [status] notice: node has quorum
Mar 10 15:52:10 host01 pmxcfs[23185]: [dcdb] notice: members: 1/23185, 3/1990, 4/1910, 5/22930, 6/1893, 7/2035, 8/1927, 9/1887, 10/1989, 11/1509, 12/2135
Mar 10 15:52:10 host01 pmxcfs[23185]: [dcdb] notice: starting data syncronisation
Mar 10 15:52:10 host01 pmxcfs[23185]: [dcdb] notice: received sync request (epoch 1/23185/00000001)
Mar 10 15:52:10 host01 pmxcfs[23185]: [status] notice: members: 1/23185, 3/1990, 4/1910, 5/22930, 6/1893, 7/2035, 8/1927, 9/1887, 10/1989, 11/1509, 12/2135
Mar 10 15:52:10 host01 pmxcfs[23185]: [status] notice: starting data syncronisation
Mar 10 15:52:10 host01 pmxcfs[23185]: [status] notice: received sync request (epoch 1/23185/00000001)
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Transport endpoint is not connected
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: status update time (35.230 seconds)
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:10 host01 pvestatd[31372]: ipcc_send_rec failed: Connection refused
Mar 10 15:52:13 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127148) was formed. Members
Mar 10 15:52:13 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:13 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
Mar 10 15:52:15 host01 pve-ha-lrm[2128]: loop take too long (45 seconds)
Mar 10 15:52:15 host01 pve-ha-crm[2115]: ipcc_send_rec failed: Transport endpoint is not connected
Mar 10 15:52:15 host01 pve-ha-lrm[2128]: ipcc_send_rec failed: Transport endpoint is not connected
Mar 10 15:52:22 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127152) was formed. Members
Mar 10 15:52:22 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:22 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
Mar 10 15:52:25 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127156) was formed. Members
Mar 10 15:52:25 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:25 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
Mar 10 15:52:27 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127160) was formed. Members
Mar 10 15:52:27 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12
Mar 10 15:52:27 host01 corosync[14350]:  [MAIN  ] Completed service synchronization, ready to provide service.
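
The excerpt above shows corosync forming a new membership every few seconds. As a rough way to quantify that flapping, here is a minimal sketch that pulls the membership events out of a syslog excerpt. The function name `count_membership_changes` is hypothetical, and the regular expression is an assumption based only on the line format shown in this log, not on any documented corosync log format.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   "... corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127136) was formed. Members"
MEMBERSHIP_RE = re.compile(
    r"corosync\[\d+\]:\s+\[TOTEM\s*\] A new membership "
    r"\((?P<addr>[\d.]+):(?P<seq>\d+)\) was formed"
)


def count_membership_changes(lines):
    """Return the number after the colon in each new-membership line, in log order."""
    seqs = []
    for line in lines:
        match = MEMBERSHIP_RE.search(line)
        if match:
            seqs.append(int(match.group("seq")))
    return seqs


if __name__ == "__main__":
    # Three lines copied from the log above; only the TOTEM lines should match.
    sample = [
        "Mar 10 15:52:00 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127136) was formed. Members",
        "Mar 10 15:52:00 host01 corosync[14350]:  [QUORUM] Members[12]: 1 2 3 4 5 6 7 8 9 10 11 12",
        "Mar 10 15:52:01 host01 corosync[14350]:  [TOTEM ] A new membership (10.0.2.110:127140) was formed. Members",
    ]
    seqs = count_membership_changes(sample)
    print(f"{len(seqs)} membership changes: {seqs}")
```

Running this over the full syslog of each node would show whether all nodes see the same burst of membership changes at the same time, which may help narrow the problem down to the switch or to the bonding.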




-- 
Regards
 
Daniel

On 08.03.17, 13:16, "pve-user on behalf of Thomas Lamprecht" <pve-user-bounces at pve.proxmox.com on behalf of t.lamprecht at proxmox.com> wrote:

    Hi,
    
    On 03/08/2017 01:12 PM, Daniel wrote:
    > Hi,
    >
    > I was able to resolve this myself. After I restarted the network interface (bonding), it was working again.
    > So maybe the bonding was the problem in that case.
    >
    >
    
    Ok, glad to hear!
    
    cheers,
    Thomas
    
    _______________________________________________
    pve-user mailing list
    pve-user at pve.proxmox.com
    http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
    


