[PVE-User] Web GUI: connection reset by peer (596)

Uwe Sauter uwe.sauter.de at gmail.com
Sat Feb 25 09:23:34 CET 2017


I'm sorry I forgot to mention that I already switched to "transport: udpu".

I tested multicast before creating the cluster. While the first test (omping -c 10000 -i 0.001 -F -q px-a px-b px-c
px-d) showed no packet loss, the second one mentioned at [1] (omping -c 600 -i 1 -q px-a px-b px-c px-d) showed
roughly 70% loss for multicast:

root at px-b # omping -c 600 -i 1 -q px-a px-b px-c px-d
[…]
px-a :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.077/0.250/0.443/0.065
px-a : multicast, xmt/rcv/%loss = 600/182/69%, min/avg/max/std-dev = 0.157/0.280/0.432/0.062
px-c :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.084/0.236/0.391/0.062
px-c : multicast, xmt/rcv/%loss = 600/182/69%, min/avg/max/std-dev = 0.153/0.265/0.407/0.057
px-d :   unicast, xmt/rcv/%loss = 600/600/0%, min/avg/max/std-dev = 0.080/0.243/0.400/0.066
px-d : multicast, xmt/rcv/%loss = 600/180/70%, min/avg/max/std-dev = 0.134/0.265/0.401/0.060

As I have no control over the switch in use, I decided to go with UDPU; we don't plan to grow the cluster beyond
~15 nodes anyway.
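
(For completeness: besides the switch-side IGMP handling, the notes at [1] also point at IGMP snooping on a Linux
bridge as a possible culprit. A quick check, purely as a sketch; the bridge name vmbr0 is just an example, our
cluster NIC may not even be bridged:

    cat /sys/class/net/vmbr0/bridge/multicast_snooping
    # if the switch provides no IGMP querier, enabling one on the bridge can help:
    echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier

Since I can't touch the switch either way, UDPU stays the simpler option for us.)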

This is my corosync.conf (I'm using 169.254.42.0/24 for cluster internal communication):

###############
logging {
  debug: off
  logfile: /var/log/corosync/corosync.log
  timestamp: on
  to_logfile: yes
  to_syslog: yes
}

nodelist {
  node {
    name: px-c
    nodeid: 3
    quorum_votes: 1
    ring0_addr: px-c
  }

  node {
    name: px-d
    nodeid: 4
    quorum_votes: 1
    ring0_addr: px-d
  }

  node {
    name: px-a
    nodeid: 1
    quorum_votes: 1
    ring0_addr: px-a
  }

  node {
    name: px-b
    nodeid: 2
    quorum_votes: 1
    ring0_addr: px-b
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: px-infra
  config_version: 5
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
  interface {
    bindnetaddr: 169.254.42.48
    ringnumber: 0
  }
}
###############
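
In case someone wants to replicate the switch to UDPU, this is roughly the usual PVE workflow (a sketch; it assumes
root access on all nodes): edit the pmxcfs-managed copy once so it gets distributed cluster-wide, bump
config_version, then restart corosync everywhere.

    # on one node only; pmxcfs replicates the file to the other nodes
    vi /etc/pve/corosync.conf     # set "transport: udpu" and increment config_version

    # afterwards, on every node
    systemctl restart corosync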

[1] https://pve.proxmox.com/wiki/Multicast_notes#Diagnosis_from_first_principles


Am 25.02.2017 um 06:54 schrieb Yannis Milios:
> In my opinion this is related to difficulties in cluster communication. Have a look at these notes:
> 
> https://pve.proxmox.com/wiki/Multicast_notes
> 
> 
> 
>     On Fri, 24 Feb 2017 at 22:45, Uwe Sauter <uwe.sauter.de at gmail.com> wrote:
> 
>     Hi,
> 
>     No, I didn't think about that.
> 
>     I have now tried that, and restarted pveproxy afterwards, but to no avail.
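> 
>     For the record, roughly what I ran (a sketch; it assumes passwordless root ssh between the nodes):
> 
>         for n in px-a px-b px-c px-d; do
>             ssh root@$n 'pvecm updatecerts && systemctl restart pveproxy'
>         done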
> 
>     Can you explain why you thought that this might help?
> 
> 
>     Regards,
> 
>             Uwe
> 
> 
>     Am 24.02.2017 um 21:28 schrieb Gilberto Nunes:
>     > Hi
>     >
>     > Did you try to execute:
>     >
>     > pvecm updatecerts
>     >
>     > on every node?
>     >
>     > 2017-02-24 15:04 GMT-03:00 Uwe Sauter <uwe.sauter.de at gmail.com>:
>     >
>     >     Hi,
>     >
>     >     I have a GUI problem with a four-node cluster that I installed recently. I was able
>     >     to follow this up to ext-all.js, but I'm no web developer, so this is where I got stuck.
>     >
>     >     Background:
>     >     * four node cluster
>     >     * each node has two interfaces in use
>     >     ** eth0 (1 GbE) is used for management and some VM traffic
>     >     ** eth2 (10 GbE) is used for cluster synchronization, Ceph, and more VM traffic
>     >     * host names are resolved via /etc/hosts
>     >     * let's call the nodes px-a, px-b, px-c, px-d
>     >     * Proxmox version 4.4-12/e71b7a74
>     >
>     >
>     >     Problem:
>     >     When I access the cluster via the web GUI on px-a I can view all info regarding px-a
>     >     without any problems. If I try to view info regarding the other nodes, I almost
>     >     every time get "connection reset by peer (596)".
>     >     If I access the cluster GUI on px-b, I can view this node's info but not the info of
>     >     the other nodes.
>     >
>     >     I started to migrate VMs to the cluster today. Before that, when the cluster had no
>     >     VMs running, access between the nodes worked without problems.
>     >
>     >
>     >     Debugging:
>     >     I was able to trace this using Chrome's developer tools up to the point where
>     >     some method inside ext-all.js fails with said "connection reset by peer".
>     >
>     >     Detail using pretty formatted version of ext-all.js:
>     >
>     >     The object (?) created via Ext.cmd.derive("Ext.data.request.Ajax", Ext.data.request.Base, ...) begins at line 11370
>     >
>     >     Method "start" begins at line 11394
>     >
>     >     Error occurs at line 11409 "h.send(e);"
>     >
>     >
>     >     I don't know what causes h.send(e) to fail. Any suggestions as to what could cause
>     >     this, or how to debug it further, are appreciated.
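>     >
>     >     In case it helps with reproducing: even a plain request against pveproxy on a
>     >     neighbouring node should show whether the connection is also reset outside the
>     >     browser (px-b is just an example; 8006 is the standard pveproxy port):
>     >
>     >         curl -kv https://px-b:8006/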
>     >
>     >     Regards,
>     >
>     >             Uwe
>     >     _______________________________________________
>     >     pve-user mailing list
>     >     pve-user at pve.proxmox.com
>     >     http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>     >
>     >
>     >
>     >
>     > --
>     >
>     > Gilberto Ferreira
>     > +55 (47) 99676-7530
>     > Skype: gilberto.nunes36
>     >
>     _______________________________________________
>     pve-user mailing list
>     pve-user at pve.proxmox.com
>     http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 
> -- 
> Sent from Gmail Mobile


