<div class="moz-cite-prefix">Hi,<br>
<br>
El 17/03/16 a las 10:51, Jean-Laurent Ivars escribió:<br>
</div>
>> On 16/03/16 at 20:39, Jean-Laurent Ivars wrote:
<div class="">
<div>
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<blockquote
cite="mid:3BAE8E69-5D5F-4A99-9C74-C89E0E171EA4@ipgenius.fr"
type="cite" class="">
<div class="">I have a 2 host cluster setup with ZFS
and replicated on each other with pvesync script
among other things and my VMs are running on these
hosts for now but I am impatient to be able to
migrate on my new infrastructure. I decided to
change my infrastructure because I really would like
to take advantage of CEPH for replication, expanding
abilities, live migration and even maybe high
availability setup.
<div class=""><br class="">
</div>
<div class="">After having read a lot of
documentations/books/forums, I decided to go with
CEPH storage which seem to be the way to go for
me.</div>
<div class=""><br class="">
</div>
<div class="">My servers are hosted by OVH and from
what I read, and with the budget I have, the best
options with CEPH storage in mind seemed to be the
following servers : <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="https://www.ovh.com/fr/serveurs_dedies/details-servers.xml?range=HOST&id=2016-HOST-32H">https://www.ovh.com/fr/serveurs_dedies/details-servers.xml?range=HOST&id=2016-HOST-32H</a> </div>
<div class="">With the following storage options :
No HW Raid, 2X300Go SSD and 2X2To HDD</div>
>>
>> About the SSDs, what exact brand/model are they? I can't find this
>> info on the OVH web site.
<div><br class="">
</div>
<div>The models are INTEL SSDSC2BB30, you can find information
here : <a moz-do-not-send="true"
href="https://www.ovh.com/fr/serveurs_dedies/avantages-disques-ssd.xml"
class="">https://www.ovh.com/fr/serveurs_dedies/avantages-disques-ssd.xml</a></div>
<div>They are datacenter SSD and they have the Power Loss
Imminent protection.</div>
</div>
</div>

OK, they should perform well for Ceph; I have one of those in a setup.
You should monitor their wear-out, though, as they are rated for only
0.3 drive writes per day.
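
A quick way to keep an eye on that is SMART; a minimal sketch, assuming
smartmontools is installed and the SSD is /dev/sda (the attribute name
varies by vendor; on Intel DC drives it is usually
Media_Wearout_Indicator):

    # print the vendor SMART attributes and pick out the wear counters
    smartctl -A /dev/sda | grep -i -e Media_Wearout -e Wear_Leveling

On Intel drives the indicator typically starts at 100 and counts down
towards 0 as the NAND wears.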

>>> I know it would be better to give Ceph whole disks, but I have to
>>> put my system somewhere... I was thinking that, even if it's not the
>>> best (I can't afford more), this layout would work. So I tried to
>>> give Ceph the OSD with my SSD journal partition, using the
>>> appropriate command, but it didn't seem to work, and I assume that's
>>> because Ceph doesn't want partitions but whole drives:
>>>
<div class="">
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);"><span
class="" style="color: rgb(195, 55, 32);">root</span><span
class="" style="color: rgb(175, 173, 36);">@</span><span
class="" style="color: rgb(52, 187, 199);">pvegra1 </span><span
class="" style="color: rgb(175, 173, 36);">~ </span><span
class="" style="color: rgb(213, 59, 211);"># </span>pveceph
createosd /dev/sdc -journal_dev /dev/sda4</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">create
OSD on /dev/sdc (xfs)</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">using
device '/dev/sda4' for journal</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">Creating
new GPT entries.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">GPT
data structures destroyed! You may now partition
the disk using fdisk or</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">other
utilities.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">Creating
new GPT entries.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">The
operation has completed successfully.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">WARNING:ceph-disk:OSD
will not be hot-swappable if journal is not the
same device as the osd data</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">WARNING:ceph-disk:Journal
/dev/sda4 was not prepared with ceph-disk.
Symlinking directly.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">Setting
name!</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">partNum
is 0</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">REALLY
setting name!</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">The
operation has completed successfully.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">meta-data=/dev/sdc1
isize=2048 agcount=4,
agsize=122094597 blks</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">
= sectsz=512 attr=2,
projid32bit=1</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">
= crc=0 finobt=0</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">data
= bsize=4096
blocks=488378385, imaxpct=5</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">
= sunit=0 swidth=0
blks</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">naming
=version 2 bsize=4096 ascii-ci=0
ftype=0</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">log
=internal log bsize=4096
blocks=238466, version=2</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">
= sectsz=512 sunit=0
blks, lazy-count=1</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">realtime
=none extsz=4096 blocks=0,
rtextents=0</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">Warning:
The kernel is still using the old partition table.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">The new
table will be used at the next reboot.</div>
<div class="" style="margin: 0px; line-height:
normal; font-family: 'Andale Mono'; color: rgb(41,
249, 20); background-color: rgb(0, 0, 0);">The
operation has completed successfully.</div>
</div>
<div class=""><br class="">
</div>
<div class="">I saw the following threads : </div>
<div class=""><a moz-do-not-send="true"
href="https://forum.proxmox.com/threads/ceph-server-feedback.17909/"
class="">https://forum.proxmox.com/threads/ceph-server-feedback.17909/</a> </div>
<div class=""><a moz-do-not-send="true"
href="https://forum.proxmox.com/threads/ceph-server-why-block-devices-and-not-partitions.17863/"
class="">https://forum.proxmox.com/threads/ceph-server-why-block-devices-and-not-partitions.17863/</a><br
class="">
<div class=""><br class="webkit-block-placeholder">
</div>
<div class="">But this kind of setting seem to
suffer performance issue and It’s not officially
supported and I am not feeling very well with that
because at the moment, I only took community
subscription from Proxmox but I want to be able to
move on a different plan to get support from them
if I need it and if I go this way, I’m afraid they
will say me it’s a non supported configuration.</div>
<div class=""><br class="">
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>So you aren’t « shocked » I want to use partitions
instead of whole drives in my configuration ?</div>
</div>
</div>
</blockquote>

OSD journals are always a partition. :-) That is what Proxmox does from
the GUI: it creates a new partition for the journal on the journal
drive; if you don't choose a journal drive, it creates two partitions on
the OSD disk, one for the journal and the other for the data.
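
If you want to double-check what the GUI did, here is a minimal sketch,
assuming the journal SSD is /dev/sdb (a hypothetical device name):

    # print the GPT; ceph-disk labels journal partitions "ceph journal"
    sgdisk -p /dev/sdb
    # verify the partition type GUID of partition 1; for a ceph journal
    # it should be 45B0969E-9B03-4F30-B4C6-B4B80CEFF106
    sgdisk -i 1 /dev/sdb

The earlier "Journal /dev/sda4 was not prepared with ceph-disk" warning
means your hand-made partition lacked exactly that type GUID.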

>>> OVH can provide USB keys, so I could install the system on one and
>>> keep the whole disks for Ceph, but I think that is not supported
>>> either. Moreover, I fear for performance and stability over time
>>> with this solution.
<div class=""><br class="">
</div>
<div class="">Maybe I could use one SSD for the
system and journal partitions (but again it’s a
mix not really supported) and the other SSD
dedicated to CEPH… but with this solution I loose
my system RAID protection… and a lot of SSD
space...</div>
<div class=""><br class="">
</div>
<div class="">I’m a little bit confused about the
best partitioning scheme and how to manage to
obtain a stable, supported, which the less space
lost and performant configuration.</div>
<div class=""><br class="">
</div>
<div class="">Should I continue with my partitioning
scheme even if it’s not the best supported, it
seem the most appropriate in my case or do I need
to completing rethink my install ?</div>
<div class=""><br class="">
</div>
<div class="">Please can someone give me advice, I’m
all yours :)</div>
<div class="">Thanks a lot for anyone taking the
time to read this mail and giving me good advices.</div>
>>
>> I suggest you only mirror the swap and root partitions, then use one
>> SSD for each OSD's journal.
>>
>> So, to fix your problems, please try the following:
>> - Remove all OSDs from the Proxmox GUI (or CLI)
>> - Remove the journal partitions
>> - Remove the journal partition mirrors
>> - Now we have 2 partitions on each SSD (swap and root), mirrored.
>> - Create the OSDs from the Proxmox GUI, using a different SSD for the
>>   journal of each OSD. If you can't do this, it's because the SSD
>>   drives don't have a GPT partition table.
>
> Thank you very much for your suggestion. I am going to follow your
> advice, changing only one thing: as a French ML user told me, swap is
> not really a good idea. My system won't really need it, and if it
> does, that would not be good for overall performance, since swapping
> can cause intensive I/O; so I should not add it to my setup, which
> already stresses the SSDs enough...

I have seen problems with too much swap, but 1-2 GB shouldn't be a
problem. In fact, new Proxmox ISOs will limit swap to a maximum size of
8 or 4 GB (I don't recall right now).
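
If you do keep a small swap, you can also make the kernel less eager to
use it; a minimal sketch (the value is a judgment call on my part, not
a Proxmox recommendation):

    # check what swap is active and how large it is
    swapon -s
    # prefer reclaiming page cache over swapping process memory
    sysctl vm.swappiness=10
    # persist the setting across reboots
    echo 'vm.swappiness = 10' >> /etc/sysctl.conf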

>>> P.S. If someone from the official Proxmox support team sees this
>>> message, can you tell me whether, if I buy a subscription with
>>> tickets, I can get assistance with this kind of question? And if I
>>> buy a subscription, I will also ask for help to configure Ceph for
>>> the best: an SSD pool, a normal-speed pool, how to set redundancy,
>>> how to make snapshots, how to make backups, and so on... is that the
>>> kind of thing you can help me with?
>>
>> You need to buy a subscription first.
<div><br class="">
</div>
<div>I already have a community subscription but what I was
really asking is IF i buy a higher one is this the kind of
question the support can give me answers.</div>
</div>
</div>
</blockquote>

Maybe better to write directly to Dietmar or Martin to ask about this :)

> Thank you again for taking the time to answer me :)

You're welcome!

Cheers
Eneko
<pre class="moz-signature" cols="72">--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
<a class="moz-txt-link-abbreviated" href="http://www.binovo.es">www.binovo.es</a></pre>