[PVE-User] Ceph Journal Performance

Eneko Lacunza elacunza at binovo.es
Mon Nov 3 09:10:41 CET 2014

Hi Lindsay,

On 02/11/14 08:59, Lindsay Mathieson wrote:
> One OSD per node.
>> You're breaking Ceph's philosophy. It's
>> designed to be used with at least tens of OSDs. You can use any old/cheap
>> drives. The more, the better.
> Yah, bit of a learning curve here, having to adjust my preconceptions and
> expectations.
> If I increased the number of OSDs to 4 per node would that improve the write
> performance?
> - Could I ensure there was a copy of all data on each node?
> - Would it be usable with one node down? (redundancy)
If you configure Ceph to work with a minimum size of 1, yes. Ceph knows
which node each OSD is on, so it does take that into account to balance
data across nodes.
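
To illustrate the minimum size point, here is a toy model in Python. It is
not Ceph code: the function and parameter names are invented for this
sketch, and it assumes at most one replica per node. On a real cluster the
setting is changed per pool, e.g. with "ceph osd pool set <pool> min_size 1".

```python
def pool_is_writable(nodes_total, nodes_down, size, min_size):
    """Toy model, not Ceph code: can the pool still accept I/O?

    Assumes at most one replica per node, so the pool cannot hold
    more replicas than there are nodes.
    """
    if size > nodes_total:
        raise ValueError("cannot place more replicas than nodes")
    # Worst case: every node that went down was holding a replica.
    surviving = max(size - nodes_down, 0)
    return surviving >= min_size


# Hypothetical two-node cluster with two replicas, one node down:
print(pool_is_writable(nodes_total=2, nodes_down=1, size=2, min_size=1))
print(pool_is_writable(nodes_total=2, nodes_down=1, size=2, min_size=2))
```

With min_size 1 the surviving replica keeps the pool usable; with
min_size 2 the same failure blocks I/O until the node comes back.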

4 drives per server will be better, but using SSDs for journals will help
you a lot, and could even give you better performance than 4 OSDs per
server. We ran a 2-OSD setup for some months with journals on Intel SSD
320s and about 20 VMs, and it worked quite well (we didn't test performance).

Take into account that usually you won't see sequential IO, but almost 
all will be random, due to IO from different VMs mixing in.
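
A small sketch of why that happens, as an idealized model only: each VM
writes sequentially inside its own region of the disk, and requests are
interleaved strictly round-robin. The function name and the region size are
assumptions made for illustration; real I/O schedulers batch and merge
requests, so the effect is less extreme in practice.

```python
def sequential_fraction(num_vms, blocks_per_vm, region_size=1_000_000):
    """Fraction of requests that directly follow the previous block."""
    if num_vms * blocks_per_vm < 2:
        raise ValueError("need at least two requests to compare")
    # Each VM issues sequential block numbers inside its own region.
    streams = [
        [vm * region_size + i for i in range(blocks_per_vm)]
        for vm in range(num_vms)
    ]
    # Round-robin interleave: one request from each VM in turn.
    merged = [stream[i] for i in range(blocks_per_vm) for stream in streams]
    sequential = sum(
        1 for prev, cur in zip(merged, merged[1:]) if cur == prev + 1
    )
    return sequential / (len(merged) - 1)


print(sequential_fraction(num_vms=1, blocks_per_vm=100))
print(sequential_fraction(num_vms=20, blocks_per_vm=100))
```

A single VM gives a fully sequential stream, while 20 interleaved VMs leave
no two consecutive requests adjacent on disk in this model.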
>> For 2 drives maybe better use DRBD.
> Yes, I'm wondering if I'm trying to force square pegs into round holes. I like
> ceph, especially its flexibility and potential for growth, but I'm wondering
> if it's overkill and a mismatch for my requirements.
> There were several things I was wanting out of this exercise:
> - Shared Storage. Migration is essential.
> - Redundancy. I need this to operate with one node down.
> - Better performance. Our NAS is only just adequate.
> - Quick recovery. We have unreliable power and several times this year, even
> with a big UPS, the nodes and NAS have been forced to shut down. The NAS takes a
> lot longer to come up than proxmox, meaning none of the virtual servers
> autostart.
> GlusterFS has tested out well so far and is easier to setup than DRBD, but I'd
> really like to give ceph a go as it seems the future for proxmox.
I have no experience with GlusterFS. I suggest you skip DRBD unless
you're sure you won't need more disks; configuring Proxmox/DRBD to shut
down and start up OK automatically is a bit of a pain; you'll need to
create some init scripts with dependencies, etc. (It is cheaper to buy
SSDs/more OSDs.) Then test, test, test :) I was there last week :)
>>> Would it be ok to use spare space on the proxmox boot/swap/system SSD? or
>>> bad idea?
>> If your system SSD is fast enough, then why not? PVE itself doesn't use much
>> bandwidth on it.
> Do the OSD journals all use the same raw device partition, or would I have to
> create a partition on the SSD for each OSD?
Proxmox/ceph will create a separate partition (5GB default) for each 
OSD's journal. Check your SSD's write IOPS too.
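
As a rough sizing sketch: the 5 GB figure is the default mentioned above,
while the function name and the SSD IOPS number are hypothetical. It assumes
all journals on a node share one SSD, so that SSD's write IOPS are split
between the OSDs.

```python
DEFAULT_JOURNAL_GB = 5  # per-OSD journal partition size mentioned above


def journal_plan(osds_per_node, ssd_write_iops, journal_gb=DEFAULT_JOURNAL_GB):
    """Return (SSD space needed in GB, write IOPS available per OSD)."""
    if osds_per_node < 1:
        raise ValueError("need at least one OSD")
    total_gb = osds_per_node * journal_gb
    # All journals share the same SSD, so its write IOPS are divided up.
    iops_per_osd = ssd_write_iops / osds_per_node
    return total_gb, iops_per_osd


# Hypothetical SSD rated at 20000 write IOPS, 4 OSDs per node:
print(journal_plan(osds_per_node=4, ssd_write_iops=20000))
```

So 4 OSDs need about 20 GB of SSD space for journals, and each OSD gets
only a quarter of the SSD's write capacity.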


Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943575997
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
