[PVE-User] Proxmox CEPH 6 servers failures!
Gilberto Nunes
gilberto.nunes32 at gmail.com
Fri Oct 5 18:54:01 CEST 2018
>> Have you read any documentation?
>> At all? Even just a quick-start guide? If so, did you retain any of
>> it? (Odd numbers, quorum, etc.)
Yes, I did some research and read the docs... In fact, I was just missing
the odd-numbers part!
Sorry for sending so much mail to the list about my inquiries...
Thanks
---
Gilberto Nunes Ferreira
(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram
Skype: gilberto.nunes36
On Fri, 5 Oct 2018 at 13:35, Woods, Ken A (DNR) <ken.woods at alaska.gov>
wrote:
> Gilberto,
>
> I have a few questions, which I think many of us have, given your recent and
> not-so-recent history. Please don’t take them as insults; they’re not
> intended as such. I’m just trying to figure out how best to help you solve
> the problems you keep having.
>
> Have you read any documentation ?
> At all? Even just a quick-start guide? If so, did you retain any of it?
> (Odd numbers, quorum, etc)
>
> Or do you fire off an email to the list without first trying to find the
> solution yourself?
>
> Additionally, how many times do you need to receive the same answer
> before you believe it?
> Have you considered buying a full service maintenance subscription?
>
> Thanks. I’m pretty sure that if we can figure out how you think about these
> issues, we can better help you... because at this point, I’m ready to
> start telling you to STFU&RTFM.
>
> Compassionately,
>
> Ken
>
>
>
> > On Oct 5, 2018, at 07:49, Gilberto Nunes <gilberto.nunes32 at gmail.com>
> wrote:
> >
> > I have 6 monitors.
> > What if I reduce it to 5? Or 4? Would that help?
> > ---
> > Gilberto Nunes Ferreira
> >
> > (47) 3025-5907
> > (47) 99676-7530 - Whatsapp / Telegram
> >
> > Skype: gilberto.nunes36
> >
> >
> >
> >
> >
> > On Fri, 5 Oct 2018 at 11:46, Marcus Haarmann <marcus.haarmann at midoco.de>
> > wrote:
> >
> >> This is corosync you are talking about. There, too, a quorum is needed
> >> to work properly.
> >> It needs to be configured in the same way as ceph.
> >> You will always need a majority (e.g. 4 out of 6; 3 out of 6 won't do).
> >>
> >> Your main problem could be that you lose one location, and the part
> >> which has the majority of servers is the part that is down.
> >> In my opinion, in your situation a 7th server would get you to 7 active
> >> servers with 4 needed, so 3 can be offline (remember to check your crush
> >> map so you will have a working ceph cluster on the remaining servers).
> >> Depending on which side goes offline, only one side will be able to
> >> operate without the other; the other side won't.
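> >>
> >> As a rough sketch of the arithmetic (assuming the default of one vote
> >> per node):
> >>
> >>     quorum = floor(total_votes / 2) + 1
> >>     6 nodes -> floor(6/2) + 1 = 4 votes needed, so only 2 nodes may fail
> >>     7 nodes -> floor(7/2) + 1 = 4 votes needed, so 3 nodes may fail
> >>
> >> You can check the live values with "pvecm status" or
> >> "corosync-quorumtool -s" on any node.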
> >>
> >> Marcus Haarmann
> >>
> >>
> >> From: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >> To: "pve-user" <pve-user at pve.proxmox.com>
> >> Sent: Friday, 5 October 2018 15:08:24
> >> Subject: Re: [PVE-User] Proxmox CEPH 6 servers failures!
> >>
> >> OK! Now I get it!
> >> pvecm shows me:
> >> pve-ceph01:/etc/pve# pvecm status
> >> Quorum information
> >> ------------------
> >> Date: Fri Oct 5 10:04:57 2018
> >> Quorum provider: corosync_votequorum
> >> Nodes: 6
> >> Node ID: 0x00000001
> >> Ring ID: 1/32764
> >> Quorate: Yes
> >>
> >> Votequorum information
> >> ----------------------
> >> Expected votes: 6
> >> Highest expected: 6
> >> Total votes: 6
> >> Quorum: 4
> >> Flags: Quorate
> >>
> >> Membership information
> >> ----------------------
> >> Nodeid Votes Name
> >> 0x00000001 1 10.10.10.100 (local)
> >> 0x00000002 1 10.10.10.110
> >> 0x00000003 1 10.10.10.120
> >> 0x00000004 1 10.10.10.130
> >> 0x00000005 1 10.10.10.140
> >> 0x00000006 1 10.10.10.150
> >>
> >> *Quorum: 4*
> >> So I need at least 4 servers online!
> >> Now, when I lose 3 of 6, I remain, of course, with just 3 and not the 4
> >> which are required...
> >> I will request a new server to keep quorum. Thanks for clarifying this
> >> situation!
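> >>
> >> (As a rough sketch, with a 7th node added the votequorum part of the same
> >> output should then look something like this; 4 votes are still needed, so
> >> up to 3 nodes may be lost:)
> >>
> >> Expected votes: 7
> >> Highest expected: 7
> >> Total votes: 7
> >> Quorum: 4
> >> Flags: Quorate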
> >> ---
> >> Gilberto Nunes Ferreira
> >>
> >> (47) 3025-5907
> >> (47) 99676-7530 - Whatsapp / Telegram
> >>
> >> Skype: gilberto.nunes36
> >>
> >>
> >>
> >>
> >>
> >> On Fri, 5 Oct 2018 at 09:53, Gilberto Nunes <gilberto.nunes32 at gmail.com>
> >> wrote:
> >>
> >>> Folks...
> >>>
> >>> My CEPH servers are all in the same network: 10.10.10.0/24...
> >>> There is an optic fiber channel between the buildings, buildingA and
> >>> buildingB, just to identify them!
> >>> When I first created the cluster, 3 servers went down in buildingB,
> >>> and the remaining ceph servers continued to work properly...
> >>> I do not understand why this can't happen anymore!
> >>> Sorry if I sound like a newbie! I am still learning about it!
> >>> ---
> >>> Gilberto Nunes Ferreira
> >>>
> >>> (47) 3025-5907
> >>> (47) 99676-7530 - Whatsapp / Telegram
> >>>
> >>> Skype: gilberto.nunes36
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, 5 Oct 2018 at 09:44, Marcus Haarmann <marcus.haarmann at midoco.de>
> >>> wrote:
> >>>
> >>>> Gilberto,
> >>>>
> >>>> the underlying problem is a ceph problem and not related to VMs or
> >>>> Proxmox.
> >>>> The ceph system requires a majority of monitor nodes to be active.
> >>>> Your setup seems to have 3 mon nodes, which results in a loss of quorum
> >>>> when two of these servers are gone.
> >>>> Check "ceph -s" on each side to see whether ceph reacts at all.
> >>>> If not, probably not enough mons are present.
> >>>>
> >>>> Also, when one side is down, you should see that some OSD instances are
> >>>> missing.
> >>>> In this case, ceph might be up, but your VMs, which are spread over the
> >>>> OSD disks, might block because the primary storage is not accessible.
> >>>> The distribution of data over the OSD instances is steered by the crush
> >>>> map.
> >>>> You should make sure that enough copies are configured and that the
> >>>> crush map is set up in a way that at least one copy ends up on each side
> >>>> of your cluster.
> >>>> In case the crush map is mis-configured, all copies of your data may be
> >>>> on the wrong side, resulting in proxmox not being able to access the VM
> >>>> data.
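> >>>>
> >>>> A minimal set of checks for this (the pool name "rbd" below is only an
> >>>> assumed example; substitute your real pool):
> >>>>
> >>>> ceph -s                          # overall health, mon quorum, OSDs up/in
> >>>> ceph osd tree                    # how the OSDs are grouped per host in the crush map
> >>>> ceph osd pool get rbd size       # number of copies kept
> >>>> ceph osd pool get rbd min_size   # copies needed before a PG blocks I/O
> >>>>
> >>>> If the surviving side cannot satisfy min_size, the pool blocks I/O even
> >>>> while the mons still have quorum.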
> >>>>
> >>>> Marcus Haarmann
> >>>>
> >>>>
> >>>> From: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >>>> To: "pve-user" <pve-user at pve.proxmox.com>
> >>>> Sent: Friday, 5 October 2018 14:31:20
> >>>> Subject: Re: [PVE-User] Proxmox CEPH 6 servers failures!
> >>>>
> >>>> Nice... Perhaps if I create a VM on Proxmox01 and on Proxmox02 and join
> >>>> these VMs into the Ceph cluster, could that solve the quorum problem?
> >>>> ---
> >>>> Gilberto Nunes Ferreira
> >>>>
> >>>> (47) 3025-5907
> >>>> (47) 99676-7530 - Whatsapp / Telegram
> >>>>
> >>>> Skype: gilberto.nunes36
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> On Fri, 5 Oct 2018 at 09:23, dorsy <dorsyka at yahoo.com> wrote:
> >>>>>
> >>>>> Your question has already been answered. You need a majority to have
> >>>>> quorum.
> >>>>>
> >>>>>> On 2018. 10. 05. 14:10, Gilberto Nunes wrote:
> >>>>>> Hi
> >>>>>> Perhaps this can help:
> >>>>>>
> >>>>>>
> >>>>>> https://imageshack.com/a/img921/6208/X7ha8R.png
> >>>>>>
> >>>>>> I was thinking about it, and perhaps if I deploy a VM on each side
> >>>>>> with Proxmox and add these VMs to the CEPH cluster, maybe this can help!
> >>>>>>
> >>>>>> thanks
> >>>>>> ---
> >>>>>> Gilberto Nunes Ferreira
> >>>>>>
> >>>>>> (47) 3025-5907
> >>>>>> (47) 99676-7530 - Whatsapp / Telegram
> >>>>>>
> >>>>>> Skype: gilberto.nunes36
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, 5 Oct 2018 at 03:55, Alexandre DERUMIER <aderumier at odiso.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> Can you resend your schema, because it's impossible to read.
> >>>>>>>
> >>>>>>>
> >>>>>>> but you need to have quorum on the monitors to have the cluster
> >>>>>>> working.
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>> From: "Gilberto Nunes" <gilberto.nunes32 at gmail.com>
> >>>>>>> To: "proxmoxve" <pve-user at pve.proxmox.com>
> >>>>>>> Sent: Thursday, 4 October 2018 22:05:16
> >>>>>>> Subject: [PVE-User] Proxmox CEPH 6 servers failures!
> >>>>>>>
> >>>>>>> Hi there
> >>>>>>>
> >>>>>>> I have something like this:
> >>>>>>>
> >>>>>>> CEPH01 ----|                                           |---- CEPH04
> >>>>>>>            |                                           |
> >>>>>>> CEPH02 ----|--------------- Optic Fiber ---------------|---- CEPH05
> >>>>>>>            |                                           |
> >>>>>>> CEPH03 ----|                                           |---- CEPH06
> >>>>>>> Sometimes, when the optic fiber link fails and just CEPH01, CEPH02 and
> >>>>>>> CEPH03 remain, the entire cluster fails!
> >>>>>>> I can't figure out the cause!
> >>>>>>>
> >>>>>>> ceph.conf
> >>>>>>>
> >>>>>>> [global]
> >>>>>>> auth client required = cephx
> >>>>>>> auth cluster required = cephx
> >>>>>>> auth service required = cephx
> >>>>>>> cluster network = 10.10.10.0/24
> >>>>>>> fsid = e67534b4-0a66-48db-ad6f-aa0868e962d8
> >>>>>>> keyring = /etc/pve/priv/$cluster.$name.keyring
> >>>>>>> mon allow pool delete = true
> >>>>>>> osd journal size = 5120
> >>>>>>> osd pool default min size = 2
> >>>>>>> osd pool default size = 3
> >>>>>>> public network = 10.10.10.0/24
> >>>>>>>
> >>>>>>> [osd]
> >>>>>>> keyring = /var/lib/ceph/osd/ceph-$id/keyring
> >>>>>>>
> >>>>>>> [mon.pve-ceph01]
> >>>>>>> host = pve-ceph01
> >>>>>>> mon addr = 10.10.10.100:6789
> >>>>>>> mon osd allow primary affinity = true
> >>>>>>>
> >>>>>>> [mon.pve-ceph02]
> >>>>>>> host = pve-ceph02
> >>>>>>> mon addr = 10.10.10.110:6789
> >>>>>>> mon osd allow primary affinity = true
> >>>>>>>
> >>>>>>> [mon.pve-ceph03]
> >>>>>>> host = pve-ceph03
> >>>>>>> mon addr = 10.10.10.120:6789
> >>>>>>> mon osd allow primary affinity = true
> >>>>>>>
> >>>>>>> [mon.pve-ceph04]
> >>>>>>> host = pve-ceph04
> >>>>>>> mon addr = 10.10.10.130:6789
> >>>>>>> mon osd allow primary affinity = true
> >>>>>>>
> >>>>>>> [mon.pve-ceph05]
> >>>>>>> host = pve-ceph05
> >>>>>>> mon addr = 10.10.10.140:6789
> >>>>>>> mon osd allow primary affinity = true
> >>>>>>>
> >>>>>>> [mon.pve-ceph06]
> >>>>>>> host = pve-ceph06
> >>>>>>> mon addr = 10.10.10.150:6789
> >>>>>>> mon osd allow primary affinity = true
> >>>>>>>
> >>>>>>> Any help will be welcome!
> >>>>>>>
> >>>>>>> ---
> >>>>>>> Gilberto Nunes Ferreira
> >>>>>>>
> >>>>>>> (47) 3025-5907
> >>>>>>> (47) 99676-7530 - Whatsapp / Telegram
> >>>>>>>
> >>>>>>> Skype: gilberto.nunes36
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
More information about the pve-user mailing list