[PVE-User] troubles creating a cluster
Gilberto Nunes
gilberto.nunes32 at gmail.com
Tue Oct 30 17:36:31 CET 2018
Consider reinstalling Proxmox.
---
Gilberto Nunes Ferreira
(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram
Skype: gilberto.nunes36
On Tue, 30 Oct 2018 at 13:28, Adam Weremczuk <adamw at matrixscience.com>
wrote:
> It doesn't appear to be related to /etc/hosts.
> I've reverted them to defaults on all systems, commented out IPv6
> sections and restarted all nodes.
> The problem on node1 (lion) persists:
>
> systemctl status pve-cluster.service
> ● pve-cluster.service - The Proxmox VE cluster filesystem
> Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
> Active: active (running) since Tue 2018-10-30 16:18:10 GMT; 3min 7s ago
> Process: 1864 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
> Process: 1819 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
> Main PID: 1853 (pmxcfs)
> Tasks: 6 (limit: 4915)
> Memory: 46.4M
> CPU: 699ms
> CGroup: /system.slice/pve-cluster.service
> └─1853 /usr/bin/pmxcfs
>
> Oct 30 16:18:08 lion pmxcfs[1853]: [dcdb] crit: can't initialize service
> Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: cpg_initialize failed: 2
> Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: can't initialize service
> Oct 30 16:18:10 lion systemd[1]: Started The Proxmox VE cluster filesystem.
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: update cluster info (cluster name MS-HA-Cluster, version = 1)
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: node has quorum
> Oct 30 16:18:14 lion pmxcfs[1853]: [dcdb] notice: members: 1/1853
> Oct 30 16:18:14 lion pmxcfs[1853]: [dcdb] notice: all data is up to date
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: members: 1/1853
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: all data is up to date
>
>
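When a status dump like the one above mixes "crit" and "notice" lines, it can help to pull out only the pmxcfs entries at one level and compare their timestamps. The following is a minimal sketch in Python, assuming the syslog-style line layout seen in this thread; the function name `pmxcfs_events` and the regular expression are illustrative and not part of any Proxmox tool. As far as I know, the value 2 in "cpg_initialize failed: 2" corresponds to corosync's CS_ERR_LIBRARY, which would mean pmxcfs could not reach corosync at that moment, but treat that reading as an assumption.

```python
import re

# Assumed layout: "Mon DD HH:MM:SS host pmxcfs[pid]: [subsystem] level: message"
LINE = re.compile(
    r"^(?P<when>\w{3} +\d+ [\d:]{8}) (?P<host>\S+) pmxcfs\[(?P<pid>\d+)\]: "
    r"\[(?P<subsystem>\w+)\] (?P<level>\w+): (?P<message>.*)$"
)


def pmxcfs_events(log_text, level="crit"):
    """Return (timestamp, subsystem, message) for pmxcfs lines at the given level."""
    events = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match and match.group("level") == level:
            events.append(
                (match.group("when"), match.group("subsystem"), match.group("message"))
            )
    return events


# Lines copied from the status output quoted in this thread.
SAMPLE = """\
Oct 30 16:18:08 lion pmxcfs[1853]: [dcdb] crit: can't initialize service
Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: cpg_initialize failed: 2
Oct 30 16:18:10 lion systemd[1]: Started The Proxmox VE cluster filesystem.
Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: node has quorum
"""

for when, subsystem, message in pmxcfs_events(SAMPLE):
    print(when, subsystem, message)
```

In the quoted output the "crit" lines are stamped 16:18:08 and the "node has quorum" notice follows at 16:18:14, so the failures may only reflect pmxcfs starting before corosync was ready. The member list "1/1853" shows a single member, which suggests the other nodes had not joined at that point.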
> On 30/10/18 15:06, Adam Weremczuk wrote:
> > I have modified /etc/hosts on all nodes indeed.
> > That's because DNS will be served from one of the containers on the
> > cluster. I don't want the cluster nodes to rely on DNS when
> > communicating with each other.
> > Maybe I'm trying to duplicate what Proxmox already does under the hood?
> >
> > Anyway my hosts files look like below:
> >
> > node1
> > 192.168.8.101 node1.example.com node1 pvelocalhost
> > 192.168.8.102 node2.example.com node2
> > 192.168.8.103 node3.example.com node3
> >
> > node2
> > 192.168.8.101 node1.example.com node1
> > 192.168.8.102 node2.example.com node2 pvelocalhost
> > 192.168.8.103 node3.example.com node3
> >
> > node3
> > 192.168.8.101 node1.example.com node1
> > 192.168.8.102 node2.example.com node2
> > 192.168.8.103 node3.example.com node3 pvelocalhost
> >
> > + IPv6 section (identical on all) which I should probably comment out:
> >
> > ::1 ip6-localhost ip6-loopback
> > fe00::0 ip6-localnet
> > ff00::0 ip6-mcastprefix
> > ff02::1 ip6-allnodes
> > ff02::2 ip6-allrouters
> > ff02::3 ip6-allhosts
> >
> >
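A hosts file laid out like the ones above can be sanity-checked offline before restarting any services. The sketch below, in Python, parses hosts-file text and reports whether a node name is missing, maps to a loopback address, or maps to more than one address. The helper names `parse_hosts` and `check_node` are made up for this example, and the rule that a node's own name should resolve to one non-loopback address is my understanding of what Proxmox expects, not something confirmed in this thread.

```python
import ipaddress


def parse_hosts(text):
    """Map each name found in hosts-file text to the list of addresses it is listed under."""
    names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not an address line, skip it
        for name in fields[1:]:
            names.setdefault(name, []).append(addr)
    return names


def check_node(text, hostname):
    """Return a list of problems with how `hostname` resolves in the hosts text."""
    addrs = parse_hosts(text).get(hostname, [])
    problems = []
    if not addrs:
        problems.append(f"{hostname}: no entry")
    if any(a.is_loopback for a in addrs):
        problems.append(f"{hostname}: maps to a loopback address")
    if len(set(addrs)) > 1:
        problems.append(f"{hostname}: maps to several addresses")
    return problems


# The node1 hosts file as posted above.
NODE1_HOSTS = """\
192.168.8.101 node1.example.com node1 pvelocalhost
192.168.8.102 node2.example.com node2
192.168.8.103 node3.example.com node3
"""

print(check_node(NODE1_HOSTS, "node1"))
```

An empty list means the name has exactly one non-loopback entry. Note that the logs in this thread show the host as "lion" while the posted files use "node1", so on the real machine the check would need the actual hostname.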
> > On 30/10/18 14:54, Gilberto Nunes wrote:
> >> How about the /etc/hosts file?
> >> Remember that Proxmox needs to know its IP and hostname correctly
> >> in order to start the CRM properly.
> >>
> >> On Tue, 30 Oct 2018 at 11:47, Adam Weremczuk
> >> <adamw at matrixscience.com> wrote:
> >>
> >> Yes, I have 3 nodes (2 x Lenovo servers + a VM) all on the same LAN
> >> with static IPv4 addresses.
> >> They can happily ping each other and Proxmox web GUI looks ok on
> >> all 3.
> >> No IPv6 in use.
> >>
> >> "Systemctl status pve-cluster.service" looks clean on the other
> >> nodes but on this troublesome one returns:
> >>
> >> Active: active (running)
> >> (...)
> >> Oct 30 14:17:10 lion pmxcfs[18003]: [dcdb] crit: can't initialize service
> >> Oct 30 14:17:10 lion pmxcfs[18003]: [status] crit: cpg_initialize failed: 2
> >> Oct 30 14:17:10 lion pmxcfs[18003]: [status] crit: can't initialize service
> >>
> >>
> >> On 30/10/18 14:38, Gilberto Nunes wrote:
> >> > Hi
> >> >
> >> > It seems to be a problem with the network connection between
> >> > the servers.
> >> > Can they ping each other?
> >> > Is this a separate network, isolated from your LAN?
> >> >
> >> > On Tue, 30 Oct 2018 at 11:36, Adam Weremczuk
> >> > <adamw at matrixscience.com> wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> My errors:
> >> >>
> >> >> Connection error 500: RPCEnvironment init request failed: Unable to load access control list: Connection refused
> >> >>
> >> >> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:06 lion pvesr[17960]: Unable to load access control list: Connection refused
> >> >> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Main process exited, code=exited, status=111/n/a
> >> >> Oct 30 14:17:06 lion systemd[1]: Failed to start Proxmox VE replication runner.
> >> >> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Unit entered failed state.
> >> >> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Failed with result 'exit-code'.
> >> >> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:07 lion ntpd[1700]: Soliciting pool server 2001:4860:4806:8::
> >> >> Oct 30 14:17:07 lion pve-ha-lrm[1980]: updating service status from manager failed: Connection refused
> >> >> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[4] failed: Connection refused
> >> >> Oct 30 14:17:08 lion pvestatd[1879]: status update error: Connection refused
> >> >> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
> >> >> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
> >> >> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
> >> >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
> >> >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Killing process 1813 (pmxcfs) with signal SIGKILL.
> >> >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
> >> >> Oct 30 14:17:10 lion systemd[1]: Stopped The Proxmox VE cluster filesystem.
> >> >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Unit entered failed state.
> >> >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Failed with result 'timeout'.
> >> >>
> >> >> System info:
> >> >>
> >> >> pveversion -v
> >> >> proxmox-ve: 5.2-2 (running kernel: 4.15.17-1-pve)
> >> >> pve-manager: 5.2-10 (running version: 5.2-10/6f892b40)
> >> >> pve-kernel-4.15: 5.2-1
> >> >> pve-kernel-4.15.17-1-pve: 4.15.17-9
> >> >> corosync: 2.4.2-pve5
> >> >> criu: 2.11.1-1~bpo90
> >> >> glusterfs-client: 3.8.8-1
> >> >> ksm-control-daemon: 1.2-2
> >> >> libjs-extjs: 6.0.1-2
> >> >> libpve-access-control: 5.0-8
> >> >> libpve-apiclient-perl: 2.0-5
> >> >> libpve-common-perl: 5.0-40
> >> >> libpve-guest-common-perl: 2.0-18
> >> >> libpve-http-server-perl: 2.0-11
> >> >> libpve-storage-perl: 5.0-23
> >> >> libqb0: 1.0.1-1
> >> >> lvm2: 2.02.168-pve6
> >> >> lxc-pve: 3.0.2+pve1-3
> >> >> lxcfs: 3.0.2-2
> >> >> novnc-pve: 1.0.0-2
> >> >> proxmox-widget-toolkit: 1.0-20
> >> >> pve-cluster: 5.0-30
> >> >> pve-container: 2.0-23
> >> >> pve-docs: 5.2-8
> >> >> pve-firewall: 3.0-14
> >> >> pve-firmware: 2.0-5
> >> >> pve-ha-manager: 2.0-5
> >> >> pve-i18n: 1.0-6
> >> >> pve-libspice-server1: 0.12.8-3
> >> >> pve-qemu-kvm: 2.11.1-5
> >> >> pve-xtermjs: 1.0-5
> >> >> qemu-server: 5.0-38
> >> >> smartmontools: 6.5+svn4324-1
> >> >> spiceterm: 3.0-5
> >> >> vncterm: 1.5-3
> >> >> zfsutils-linux: 0.7.11-pve1~bpo1
> >> >>
> >> >> Any idea what's wrong with my (fresh and default) installation?
> >> >>
> >> >> Thanks,
> >> >> Adam
> >> >>
> >> >> _______________________________________________
> >> >> pve-user mailing list
> >> >> pve-user at pve.proxmox.com
> >> >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> >> >>
> >>
> >
>
>