[PVE-User] troubles creating a cluster

Gilberto Nunes gilberto.nunes32 at gmail.com
Tue Oct 30 17:36:31 CET 2018


Consider reinstalling Proxmox.
---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36





On Tue, 30 Oct 2018 at 13:28, Adam Weremczuk <adamw at matrixscience.com>
wrote:

> It doesn't appear to be related to /etc/hosts.
> I've reverted them to defaults on all systems, commented out IPv6
> sections and restarted all nodes.
> The problem on node1 (lion) persists:
>
> systemctl status pve-cluster.service
> ● pve-cluster.service - The Proxmox VE cluster filesystem
>     Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
>     Active: active (running) since Tue 2018-10-30 16:18:10 GMT; 3min 7s ago
>    Process: 1864 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
>    Process: 1819 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
>   Main PID: 1853 (pmxcfs)
>      Tasks: 6 (limit: 4915)
>     Memory: 46.4M
>        CPU: 699ms
>     CGroup: /system.slice/pve-cluster.service
>             └─1853 /usr/bin/pmxcfs
>
> Oct 30 16:18:08 lion pmxcfs[1853]: [dcdb] crit: can't initialize service
> Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: cpg_initialize failed: 2
> Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: can't initialize service
> Oct 30 16:18:10 lion systemd[1]: Started The Proxmox VE cluster filesystem.
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: update cluster info (cluster name  MS-HA-Cluster, version = 1)
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: node has quorum
> Oct 30 16:18:14 lion pmxcfs[1853]: [dcdb] notice: members: 1/1853
> Oct 30 16:18:14 lion pmxcfs[1853]: [dcdb] notice: all data is up to date
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: members: 1/1853
> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: all data is up to date
>
>
> On 30/10/18 15:06, Adam Weremczuk wrote:
> > I have indeed modified /etc/hosts on all nodes.
> > That's because DNS will be served from one of the containers on the
> > cluster, and I don't want the cluster nodes to rely on DNS when
> > communicating with each other.
> > Maybe I'm trying to duplicate what Proxmox already does under the hood?
> >
> > Anyway my hosts files look like below:
> >
> > node1
> > 192.168.8.101 node1.example.com node1 pvelocalhost
> > 192.168.8.102 node2.example.com node2
> > 192.168.8.103 node3.example.com node3
> >
> > node2
> > 192.168.8.101 node1.example.com node1
> > 192.168.8.102 node2.example.com node2 pvelocalhost
> > 192.168.8.103 node3.example.com node3
> >
> > node3
> > 192.168.8.101 node1.example.com node1
> > 192.168.8.102 node2.example.com node2
> > 192.168.8.103 node3.example.com node3 pvelocalhost
> >
> > + IPv6 section (identical on all) which I should probably comment out:
> >
> > ::1     ip6-localhost ip6-loopback
> > fe00::0 ip6-localnet
> > ff00::0 ip6-mcastprefix
> > ff02::1 ip6-allnodes
> > ff02::2 ip6-allrouters
> > ff02::3 ip6-allhosts
> >
> >
> > On 30/10/18 14:54, Gilberto Nunes wrote:
> >> How about the /etc/hosts file?
> >> Remember that Proxmox needs to know its own IP address and hostname
> >> correctly in order to start the CRM properly.
> >>
> >> On Tue, 30 Oct 2018 at 11:47, Adam Weremczuk
> >> <adamw at matrixscience.com> wrote:
> >>
> >>     Yes, I have 3 nodes (2 x Lenovo servers + a VM) all on the same
> >>     LAN with static IPv4 addresses.
> >>     They can happily ping each other and the Proxmox web GUI looks ok
> >>     on all 3.
> >>     No IPv6 in use.
> >>
> >>     "systemctl status pve-cluster.service" looks clean on the other
> >>     nodes but on this troublesome one returns:
> >>
> >>     Active: active (running)
> >>     (...)
> >>     Oct 30 14:17:10 lion pmxcfs[18003]: [dcdb] crit: can't initialize service
> >>     Oct 30 14:17:10 lion pmxcfs[18003]: [status] crit: cpg_initialize failed: 2
> >>     Oct 30 14:17:10 lion pmxcfs[18003]: [status] crit: can't initialize service
> >>
> >>
> >>     On 30/10/18 14:38, Gilberto Nunes wrote:
> >>     > Hi
> >>     >
> >>     > It seems to be a problem with the network connection between
> >>     > the servers.
> >>     > Can they ping each other?
> >>     > Is this a separate network, isolated from your LAN?
> >>     >
> >>     >
> >>     > On Tue, 30 Oct 2018 at 11:36, Adam Weremczuk
> >>     > <adamw at matrixscience.com> wrote:
> >>     >
> >>     >> Hi all,
> >>     >>
> >>     >> My errors:
> >>     >>
> >>     >> Connection error 500: RPCEnvironment init request failed:
> >>     Unable to load
> >>     >> access control list: Connection refused
> >>     >>
> >>     >> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[1] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[2] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[3] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[1] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[2] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[3] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:06 lion pvesr[17960]: Unable to load access
> >>     control list:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Main process
> >>     exited,
> >>     >> code=exited, status=111/n/a
> >>     >> Oct 30 14:17:06 lion systemd[1]: Failed to start Proxmox VE
> >>     replication
> >>     >> runner.
> >>     >> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Unit entered
> >>     failed state.
> >>     >> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Failed with
> >> result
> >>     >> 'exit-code'.
> >>     >> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[1] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[2] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[3] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:07 lion ntpd[1700]: Soliciting pool server
> >>     2001:4860:4806:8::
> >>     >> Oct 30 14:17:07 lion pve-ha-lrm[1980]: updating service status
> >> from
> >>     >> manager failed: Connection refused
> >>     >> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[1] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[2] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[3] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[1] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[2] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[3] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[4] failed:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:08 lion pvestatd[1879]: status update error:
> >>     Connection
> >>     >> refused
> >>     >> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[1] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[2] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[3] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[1] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[2] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[3] failed:
> >>     >> Connection refused
> >>     >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: State
> >>     >> 'stop-sigterm' timed out. Killing.
> >>     >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Killing
> >>     process
> >>     >> 1813 (pmxcfs) with signal SIGKILL.
> >>     >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Main
> >> process
> >>     >> exited, code=killed, status=9/KILL
> >>     >> Oct 30 14:17:10 lion systemd[1]: Stopped The Proxmox VE cluster
> >>     filesystem.
> >>     >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Unit
> >> entered
> >>     >> failed state.
> >>     >> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Failed
> >>     with result
> >>     >> 'timeout'.
> >>     >>
> >>     >> System info:
> >>     >>
> >>     >> pveversion -v
> >>     >> proxmox-ve: 5.2-2 (running kernel: 4.15.17-1-pve)
> >>     >> pve-manager: 5.2-10 (running version: 5.2-10/6f892b40)
> >>     >> pve-kernel-4.15: 5.2-1
> >>     >> pve-kernel-4.15.17-1-pve: 4.15.17-9
> >>     >> corosync: 2.4.2-pve5
> >>     >> criu: 2.11.1-1~bpo90
> >>     >> glusterfs-client: 3.8.8-1
> >>     >> ksm-control-daemon: 1.2-2
> >>     >> libjs-extjs: 6.0.1-2
> >>     >> libpve-access-control: 5.0-8
> >>     >> libpve-apiclient-perl: 2.0-5
> >>     >> libpve-common-perl: 5.0-40
> >>     >> libpve-guest-common-perl: 2.0-18
> >>     >> libpve-http-server-perl: 2.0-11
> >>     >> libpve-storage-perl: 5.0-23
> >>     >> libqb0: 1.0.1-1
> >>     >> lvm2: 2.02.168-pve6
> >>     >> lxc-pve: 3.0.2+pve1-3
> >>     >> lxcfs: 3.0.2-2
> >>     >> novnc-pve: 1.0.0-2
> >>     >> proxmox-widget-toolkit: 1.0-20
> >>     >> pve-cluster: 5.0-30
> >>     >> pve-container: 2.0-23
> >>     >> pve-docs: 5.2-8
> >>     >> pve-firewall: 3.0-14
> >>     >> pve-firmware: 2.0-5
> >>     >> pve-ha-manager: 2.0-5
> >>     >> pve-i18n: 1.0-6
> >>     >> pve-libspice-server1: 0.12.8-3
> >>     >> pve-qemu-kvm: 2.11.1-5
> >>     >> pve-xtermjs: 1.0-5
> >>     >> qemu-server: 5.0-38
> >>     >> smartmontools: 6.5+svn4324-1
> >>     >> spiceterm: 3.0-5
> >>     >> vncterm: 1.5-3
> >>     >> zfsutils-linux: 0.7.11-pve1~bpo1
> >>     >>
> >>     >> Any idea what's wrong with my (fresh and default) installation?
> >>     >>
> >>     >> Thanks,
> >>     >> Adam
> >>     >>
> >>     >> _______________________________________________
> >>     >> pve-user mailing list
> >>     >> pve-user at pve.proxmox.com <mailto:pve-user at pve.proxmox.com>
> >>     >> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> >>     >>
> >>
> >
>
>
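For readers who want to rule out /etc/hosts in a similar situation, the per-node layout quoted above can be checked mechanically. The following is only a minimal sketch, not a Proxmox tool: the function names `parse_hosts` and `check_hosts` are made up for illustration, and the rules it applies (every cluster node maps to exactly one address, `pvelocalhost` sits on the local node's line, the local node does not resolve to a loopback address) are assumptions drawn from the hosts files shown in this thread. It says nothing about the `cpg_initialize failed: 2` error itself.

```python
def parse_hosts(text):
    """Return a list of (ip, [names]) for each usable line of a hosts file."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if len(fields) < 2:                  # blank line or address without names
            continue
        entries.append((fields[0], fields[1:]))
    return entries


def check_hosts(text, local_node, expected_nodes):
    """Return a list of human-readable problems; an empty list means none found."""
    name_to_ips = {}
    for ip, names in parse_hosts(text):
        for name in names:
            name_to_ips.setdefault(name, set()).add(ip)

    problems = []
    for node in expected_nodes:
        ips = name_to_ips.get(node, set())
        if not ips:
            problems.append(f"{node}: no entry")
        elif len(ips) > 1:
            problems.append(f"{node}: maps to several addresses {sorted(ips)}")

    # Assumption: pvelocalhost, if present, belongs on the local node's line.
    alias_ips = name_to_ips.get("pvelocalhost", set())
    if alias_ips and alias_ips != name_to_ips.get(local_node, set()):
        problems.append(f"pvelocalhost does not point at {local_node}")

    # Assumption: the local node name should not resolve to 127.x.x.x.
    for ip in sorted(name_to_ips.get(local_node, set())):
        if ip.startswith("127."):
            problems.append(f"{local_node}: resolves to loopback address {ip}")
    return problems


# The node1 file exactly as quoted in the thread.
NODE1_HOSTS = """\
192.168.8.101 node1.example.com node1 pvelocalhost
192.168.8.102 node2.example.com node2
192.168.8.103 node3.example.com node3
"""

if __name__ == "__main__":
    for problem in check_hosts(NODE1_HOSTS, "node1", ["node1", "node2", "node3"]):
        print(problem)
```

Run against the node1 file from the thread, the check finds nothing to report, which is consistent with Adam's conclusion that the hosts files are not the cause.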


