[PVE-User] Cluster problem...
Gilberto Nunes
gilberto.nunes32 at gmail.com
Fri Jul 11 19:43:37 CEST 2014
Hi...
Here's another problem around cluster administration on Proxmox...
I set up two VirtualBox VMs running the latest PVE version...
My laptop, an Intel Core i5 running Ubuntu, acts as storage with a TGT
target...
I am able to create the cluster and define the quorum disk...
However, when I reboot both nodes, I get this error:
Starting qdiskd [ FAILED ]...
No local IP Address has been set...
I think it is something to do with DLM locking or a similar issue...
But, if I go to CLI and check:
pve01:~# /etc/init.d/cman status
qdiskd is stopped
pve01:~# /etc/init.d/rgmanager status
rgmanager is stopped
On both nodes, cman and rgmanager are dead!
So, if I type the command sequence below:
/etc/init.d/cman start
/etc/init.d/rgmanager start
on both nodes, the cluster goes online...
I have experienced this issue on physical machines too...
At first I thought it could be a problem with the VirtualBox VMs, but it is
not...
So, as a workaround, I put these commands in rc.local:
/etc/init.d/cman stop
/etc/init.d/rgmanager stop
/etc/init.d/cman start
/etc/init.d/rgmanager start
in order to bring the cluster online...
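The same workaround could be wrapped in a small function, sketched here under stated assumptions. The names `INIT_DIR` and `restart_cluster_stack` are hypothetical, not part of Proxmox. The sketch also stops rgmanager before cman (the reverse of the start order), which is an assumption about the safer ordering and differs from the sequence above.

```shell
#!/bin/sh
# Sketch only: INIT_DIR and restart_cluster_stack are hypothetical names,
# not part of Proxmox. INIT_DIR can be pointed at stub scripts for a dry run.
INIT_DIR="${INIT_DIR:-/etc/init.d}"

restart_cluster_stack() {
    # Stop in reverse dependency order; ignore failures from services
    # that are already stopped.
    for svc in rgmanager cman; do
        "$INIT_DIR/$svc" stop || true
    done
    # Start cman first; give up if it fails, since rgmanager depends on it.
    "$INIT_DIR/cman" start || return 1
    "$INIT_DIR/rgmanager" start
}
```

Calling `restart_cluster_stack` from rc.local would then stand in for the four lines above.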
Here's the cluster.conf:
<?xml version="1.0"?>
<cluster config_version="35" name="CLUSTER">
  <cman expected_votes="3" keyfile="/var/lib/pve-cluster/corosync.authkey"/>
  <quorumd allow_kill="0" interval="3" label="quorum" tko="10" votes="1">
    <heuristic interval="3" program="ping 192.168.1.100 -c1 -w1" score="1" tko="4"/>
    <heuristic interval="3" program="ip addr | grep eth0 | grep -q UP" score="2" tko="3"/>
  </quorumd>
  <totem token="54000"/>
  <clusternodes>
    <clusternode name="pve01" nodeid="1" votes="1">
    </clusternode>
    <clusternode name="pve02" nodeid="2" votes="1">
    </clusternode>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="serverfailover" ordered="1" restricted="0">
        <failoverdomainnode name="pve01" priority="1"/>
        <failoverdomainnode name="pve02" priority="2"/>
      </failoverdomain>
    </failoverdomains>
    <pvevm autostart="1" vmid="100"/>
  </rm>
</cluster>
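As a rough check of the vote arithmetic in this configuration: each of the two nodes has one vote and the quorum disk adds one, which matches expected_votes="3". Assuming cman applies its usual majority rule (expected votes divided by two, rounded down, plus one), the cluster needs 2 votes to be quorate. The helper name `quorum_threshold` is hypothetical.

```shell
#!/bin/sh
# Sketch: majority threshold as cman is assumed to compute it
# (integer division). quorum_threshold is a hypothetical helper name.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

node_votes=2      # pve01 + pve02, one vote each (from the clusternode entries)
qdisk_votes=1     # votes="1" on the quorumd element
total=$(( node_votes + qdisk_votes ))
echo "total votes: $total, needed for quorum: $(quorum_threshold "$total")"
```

Under that assumption, one node plus the quorum disk would be enough to keep the cluster quorate.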
And /etc/default/redhat-cluster-pve has the content:
FENCE_JOIN="yes"
After running this:
/etc/init.d/cman stop
/etc/init.d/cman start
/etc/init.d/rgmanager stop
/etc/init.d/rgmanager start
/etc/init.d/pve-cluster stop
/etc/init.d/pve-cluster start
/etc/init.d/pveproxy stop
/etc/init.d/pveproxy start
my cluster gets online, but the weirder issue is this:
clustat
Cluster Status for CLUSTER @ Thu Jul 10 11:43:30 2014
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 pve01                                 1 Online, Local, rgmanager
 pve02                                 2 Online
 /dev/block/8:33                       0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 pvevm:100                   pve01                       starting
I removed that VM, 100... It doesn't exist anymore... But it is still there,
according to clustat!
Seconds later, I ran clustat again and got this output:
pve01:~# clustat
Cluster Status for CLUSTER @ Thu Jul 10 11:43:46 2014
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 pve01                                 1 Online, Local, rgmanager
 pve02                                 2 Online
 /dev/block/8:33                       0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 pvevm:100                   (none)                      recoverable
And finally:
clustat
Cluster Status for CLUSTER @ Thu Jul 10 11:44:07 2014
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 pve01                                 1 Online, Local, rgmanager
 pve02                                 2 Online, rgmanager
 /dev/block/8:33                       0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 pvevm:100                   (pve01)                     failed
But, again, there is no VM...
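One thing that may be worth checking: the cluster.conf shown earlier still contains a pvevm entry with vmid="100" inside the rm section, and rgmanager presumably reports its services from there. The sketch below lists the VM IDs declared in that section. The function name `list_ha_vmids` is hypothetical, and the /etc/pve/cluster.conf path is an assumption about where the file lives.

```shell
#!/bin/sh
# Sketch: print the vmid of every <pvevm .../> element in a cluster.conf.
# list_ha_vmids is a hypothetical helper; it expects one element per line.
list_ha_vmids() {
    sed -n 's/.*<pvevm[^>]*vmid="\([0-9][0-9]*\)".*/\1/p' "$1"
}

# Assumed location of the cluster configuration; adjust as needed.
conf="${1:-/etc/pve/cluster.conf}"
if [ -r "$conf" ]; then
    list_ha_vmids "$conf"
fi
```

Any ID printed here that no longer has a matching VM would be a candidate for the stale service entry.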
Am I doing something wrong?
--
Gilberto Ferreira