[PVE-User] Proxmox CEPH 6 servers failures!
Gilberto Nunes
gilberto.nunes32 at gmail.com
Fri Oct 5 17:48:24 CEST 2018
I have 6 monitors.
What if I reduce it to 5? Or 4? Would that help?
---
Gilberto Nunes Ferreira
(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram
Skype: gilberto.nunes36
On Fri, Oct 5, 2018 at 11:46, Marcus Haarmann <marcus.haarmann at midoco.de> wrote:
> This is corosync you are talking about. There, too, a quorum is needed to
> work properly.
> It needs to be configured in the same way as ceph.
> You will always need a majority (e.g. 4 out of 6; 3 out of 6 won't do).
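>
> Just as a rough sketch of the arithmetic, the majority is floor(N/2) + 1 votes:
>
>   nodes   votes needed   failures tolerated
>     4          3                 1
>     5          3                 2
>     6          4                 2
>     7          4                 3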
>
> Your main problem is that you might lose one location, and the part which
> holds the majority of servers may be exactly the part that is down.
> In my opinion, a 7th server would get you to 7 active servers with 4 needed
> for quorum, so 3 can be offline (remember to check your crush map so that you
> still have a working ceph cluster on the remaining servers; see the sketch below).
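>
> As a sketch only, these show how the data is currently being spread:
>
>   ceph osd crush tree                   # bucket hierarchy (root, hosts, osds, ...)
>   ceph osd crush rule dump              # placement rules used by the pools
>   ceph osd pool get <pool> crush_rule   # <pool> is a placeholder for your pool name
>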
> Depending on which side goes offline, only one side will be able to operate
> without the other; the other side won't.
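>
> Concretely, with your current 3 + 3 split (6 corosync votes and 6 ceph mons),
> as far as I can see from your output:
>
>   buildA alone: 3 of 6 votes, 3 of 6 mons -> no quorum
>   buildB alone: 3 of 6 votes, 3 of 6 mons -> no quorum
>
> which is why everything stops as soon as the fiber link is cut.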
>
> Marcus Haarmann
>
>
> Von: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> An: "pve-user" <pve-user at pve.proxmox.com>
> Gesendet: Freitag, 5. Oktober 2018 15:08:24
> Betreff: Re: [PVE-User] Proxmox CEPH 6 servers failures!
>
> OK! Now I get it!
> pvecm shows me:
> pve-ceph01:/etc/pve# pvecm status
> Quorum information
> ------------------
> Date: Fri Oct 5 10:04:57 2018
> Quorum provider: corosync_votequorum
> Nodes: 6
> Node ID: 0x00000001
> Ring ID: 1/32764
> Quorate: Yes
>
> Votequorum information
> ----------------------
> Expected votes: 6
> Highest expected: 6
> Total votes: 6
> Quorum: 4
> Flags: Quorate
>
> Membership information
> ----------------------
> Nodeid Votes Name
> 0x00000001 1 10.10.10.100 (local)
> 0x00000002 1 10.10.10.110
> 0x00000003 1 10.10.10.120
> 0x00000004 1 10.10.10.130
> 0x00000005 1 10.10.10.140
> 0x00000006 1 10.10.10.150
>
> *Quorum: 4*
> So I need at least 4 servers online!
> When I lose 3 of 6, I am of course left with just 3 and not the 4 which are
> required...
> I will request a new server to make quorum. Thanks for clarifying this
> situation!
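>
> If I understand the arithmetic right, with a 7th node pvecm status should then
> report something like this (just a sketch based on the output above):
>
>   Expected votes:   7
>   Highest expected: 7
>   Total votes:      7
>   Quorum:           4
>   Flags:            Quorate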
> ---
> Gilberto Nunes Ferreira
>
> (47) 3025-5907
> (47) 99676-7530 - Whatsapp / Telegram
>
> Skype: gilberto.nunes36
>
>
>
>
>
> On Fri, Oct 5, 2018 at 09:53, Gilberto Nunes <gilberto.nunes32 at gmail.com> wrote:
>
> > Folks...
> >
> > My CEPH servers are all in the same network: 10.10.10.0/24...
> > There is an optic channel between the buildings, buildA and buildB, just to
> > identify them!
> > When I created the cluster the first time, 3 servers went down in buildB,
> > and the remaining ceph servers continued to work properly...
> > I do not understand why this can't happen anymore!
> > Sorry if I sound like a newbie! I am still learning about this!
> > ---
> > Gilberto Nunes Ferreira
> >
> > (47) 3025-5907
> > (47) 99676-7530 - Whatsapp / Telegram
> >
> > Skype: gilberto.nunes36
> >
> >
> >
> >
> >
> > On Fri, Oct 5, 2018 at 09:44, Marcus Haarmann <marcus.haarmann at midoco.de> wrote:
> >
> >> Gilberto,
> >>
> >> the underlying problem is a ceph problem and not related to VMs or Proxmox.
> >> The ceph system requires a majority of monitor nodes to be active.
> >> Your setup seems to have 3 mon nodes, which results in a loss of quorum
> >> when two of these servers are gone.
> >> Check "ceph -s" on each side to see whether ceph reacts at all.
> >> If not, probably not enough mons are present.
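> >>
> >> As a sketch, a few commands that show the monitor state (run on a node that
> >> is still up):
> >>
> >>   ceph -s              # overall cluster health, including mon quorum
> >>   ceph mon stat        # which mons exist and which are in quorum
> >>   ceph quorum_status   # detailed quorum information (JSON)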
> >>
> >> Also, when one side is down you should see that some OSD instances are
> >> missing.
> >> In this case ceph might be up, but your VMs, whose data is spread over the
> >> OSD disks, might block because the primary storage is not accessible.
> >> The distribution of data over the OSD instances is steered by the crush map.
> >> You should make sure to have enough copies configured and the crush map set
> >> up in such a way that each side of your cluster holds at least one copy.
> >> In case the crush map is mis-configured, all copies of your data may end up
> >> on the wrong side, resulting in Proxmox not being able to access the VM data.
> >> A rough sketch of how that could look is below.
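> >>
> >> Only as a rough sketch of the mechanism (the bucket names room-a/room-b are
> >> made up, and with just two rooms and size = 3 a plain room-level rule leaves
> >> one copy unplaced, so this is not a finished rule):
> >>
> >>   ceph osd crush add-bucket room-a room
> >>   ceph osd crush add-bucket room-b room
> >>   ceph osd crush move room-a root=default
> >>   ceph osd crush move room-b root=default
> >>   ceph osd crush move pve-ceph01 room=room-a     # repeat for every host
> >>   ceph osd crush move pve-ceph04 room=room-b
> >>   ceph osd crush rule create-replicated per-room default room
> >>   ceph osd pool set <pool> crush_rule per-room   # <pool> is a placeholder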
> >>
> >> Marcus Haarmann
> >>
> >>
> >> From: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >> To: "pve-user" <pve-user at pve.proxmox.com>
> >> Sent: Friday, October 5, 2018 14:31:20
> >> Subject: Re: [PVE-User] Proxmox CEPH 6 servers failures!
> >>
> >> Nice... Perhaps if I create a VM on Proxmox01 and Proxmox02 and join them
> >> to the Ceph cluster, could that solve the quorum problem?
> >> ---
> >> Gilberto Nunes Ferreira
> >>
> >> (47) 3025-5907
> >> (47) 99676-7530 - Whatsapp / Telegram
> >>
> >> Skype: gilberto.nunes36
> >>
> >>
> >>
> >>
> >>
> >> On Fri, Oct 5, 2018 at 09:23, dorsy <dorsyka at yahoo.com> wrote:
> >>
> >> > Your question has already been answered. You need a majority to have
> >> > quorum.
> >> >
> >> > On 2018. 10. 05. 14:10, Gilberto Nunes wrote:
> >> > > Hi
> >> > > Perhaps this can help:
> >> > >
> >> > > https://imageshack.com/a/img921/6208/X7ha8R.png
> >> > >
> >> > > I was thinking about it, and perhaps if I deploy a VM on each side with
> >> > > Proxmox and add them to the CEPH cluster, maybe that can help!
> >> > >
> >> > > thanks
> >> > > ---
> >> > > Gilberto Nunes Ferreira
> >> > >
> >> > > (47) 3025-5907
> >> > > (47) 99676-7530 - Whatsapp / Telegram
> >> > >
> >> > > Skype: gilberto.nunes36
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Fri, Oct 5, 2018 at 03:55, Alexandre DERUMIER <aderumier at odiso.com> wrote:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> Can you resend your diagram? It is impossible to read.
> >> > >>
> >> > >>
> >> > >> But you need to have quorum on the monitors for the cluster to work.
> >> > >>
> >> > >>
> >> > >> ----- Original Message -----
> >> > >> From: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >> > >> To: "proxmoxve" <pve-user at pve.proxmox.com>
> >> > >> Sent: Thursday, October 4, 2018 22:05:16
> >> > >> Subject: [PVE-User] Proxmox CEPH 6 servers failures!
> >> > >>
> >> > >> Hi there
> >> > >>
> >> > >> I have something like this:
> >> > >>
> >> > >> CEPH01 ----|                                       |---- CEPH04
> >> > >>            |                                       |
> >> > >> CEPH02 ----|------------- Optic Fiber -------------|---- CEPH05
> >> > >>            |                                       |
> >> > >> CEPH03 ----|                                       |---- CEPH06
> >> > >> Sometimes, when the optic fiber link fails and just CEPH01, CEPH02 and
> >> > >> CEPH03 remain, the entire cluster fails!
> >> > >> I can't find out the cause!
> >> > >>
> >> > >> ceph.conf
> >> > >>
> >> > >> [global]
> >> > >>      auth client required = cephx
> >> > >>      auth cluster required = cephx
> >> > >>      auth service required = cephx
> >> > >>      cluster network = 10.10.10.0/24
> >> > >>      fsid = e67534b4-0a66-48db-ad6f-aa0868e962d8
> >> > >>      keyring = /etc/pve/priv/$cluster.$name.keyring
> >> > >>      mon allow pool delete = true
> >> > >>      osd journal size = 5120
> >> > >>      osd pool default min size = 2
> >> > >>      osd pool default size = 3
> >> > >>      public network = 10.10.10.0/24
> >> > >>
> >> > >> [osd]
> >> > >>      keyring = /var/lib/ceph/osd/ceph-$id/keyring
> >> > >>
> >> > >> [mon.pve-ceph01]
> >> > >>      host = pve-ceph01
> >> > >>      mon addr = 10.10.10.100:6789
> >> > >>      mon osd allow primary affinity = true
> >> > >>
> >> > >> [mon.pve-ceph02]
> >> > >>      host = pve-ceph02
> >> > >>      mon addr = 10.10.10.110:6789
> >> > >>      mon osd allow primary affinity = true
> >> > >>
> >> > >> [mon.pve-ceph03]
> >> > >>      host = pve-ceph03
> >> > >>      mon addr = 10.10.10.120:6789
> >> > >>      mon osd allow primary affinity = true
> >> > >>
> >> > >> [mon.pve-ceph04]
> >> > >>      host = pve-ceph04
> >> > >>      mon addr = 10.10.10.130:6789
> >> > >>      mon osd allow primary affinity = true
> >> > >>
> >> > >> [mon.pve-ceph05]
> >> > >>      host = pve-ceph05
> >> > >>      mon addr = 10.10.10.140:6789
> >> > >>      mon osd allow primary affinity = true
> >> > >>
> >> > >> [mon.pve-ceph06]
> >> > >>      host = pve-ceph06
> >> > >>      mon addr = 10.10.10.150:6789
> >> > >>      mon osd allow primary affinity = true
> >> > >>
> >> > >> Any help will be welcome!
> >> > >>
> >> > >> ---
> >> > >> Gilberto Nunes Ferreira
> >> > >>
> >> > >> (47) 3025-5907
> >> > >> (47) 99676-7530 - Whatsapp / Telegram
> >> > >>
> >> > >> Skype: gilberto.nunes36