[PVE-User] Ceph Journal Performance
Lindsay Mathieson
lindsay.mathieson at gmail.com
Wed Nov 5 01:52:15 CET 2014
On 3 November 2014 18:10, Eneko Lacunza <elacunza at binovo.es> wrote:
> Hi Lindsay,
Thanks for the informative reply Eneko, most helpful.
> 4 drives per server will be better, but using SSD for journals will help you
> a lot, and could even give you better performance than 4 OSDs per server. We
> had for some months a 2-OSD setup with journals on Intel SSD 320s and about
> 20 VMs working quite well (didn't test performance).
I finally got around to testing Ceph with an SSD journal. It took me a
while, as I had to use a GParted boot ISO to repartition the OS SSD to
free up space - Ceph doesn't seem to like LVM volumes for journals.
I had to create the OSD from the command line (pveceph createosd), as
the web UI didn't list my SSD partitions.
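For reference, the command ended up being roughly along these lines
(device names are just examples from my hardware, adjust to suit; the
-journal_dev argument is what points the OSD's journal at the SSD
partition):

    # data disk, journal on a pre-made GPT partition of the SSD
    pveceph createosd /dev/sdb -journal_dev /dev/sda4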
It did make a huge difference: raw VM I/O increased from 3MB/s to
40MB/s, and multiple VMs were much more responsive - quite usable.
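(Rough numbers only - a quick way to get a figure like that inside a
guest is a direct sequential write, e.g.:

    dd if=/dev/zero of=/root/ddtest bs=1M count=1024 oflag=direct

which bypasses the guest page cache, so you're seeing the storage
rather than RAM.)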
Overall, I seemed to get similar I/O to what I was getting with
Gluster once I'd implemented an SSD cache for it (EXT4 with SSD
journal). However, Ceph seemed to cope better with high loads: in one
of my stress tests - starting 7 VMs simultaneously - Gluster seemed to
fail, with some of the VMs reporting I/O errors and crashing, whereas
with Ceph they were very slow :) but all started normally.
Good enough results that I think I will get a dedicated journal SSD
and add a couple of extra disks, though I also have to work on our
network link. It's 2 bonded 1GbE ports, but it's maxing out at 90MB/s,
so it should do better. Probably because I'm only using balance-rr; we
have a managed switch with LACP, but I'd have to move it :) Need to
replug everything ...
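If anyone wants to compare notes, the LACP bond I'm planning would look
roughly like this in /etc/network/interfaces (interface names, hash
policy and address are only examples, and the switch ports need a
matching LACP group configured):

    auto bond0
    iface bond0 inet manual
        bond-slaves eth0 eth1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4

    auto vmbr0
    iface vmbr0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0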
> Take into account that usually you won't see sequential IO, but almost all
> will be random, due to IO from different VMs mixing in.
That's definitely where the SSD has helped; VMs are much more responsive now.
>>> For 2 drives maybe better use DRBD.
Yah, looked at that - not flexible enough, as we would want to expand,
and way too fiddly to set up.
>
> Proxmox/ceph will create a separate partition (5GB default) for each OSD's
> journal. Check your SSD's write IOPS too.
Can a journal be too large? If I gave 20GB+ to a journal for 3TB
drives, would it be used, or is that just a waste?
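(If I'm reading the tooling right, the journal partition size comes
from the "osd journal size" setting (in MB) in ceph.conf at OSD
creation time, so presumably a larger journal would just mean
something like:

    [osd]
        osd journal size = 20480    # in MB, i.e. 20GB partitions

but whether that much would ever actually get used is my question.)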
thanks,
--
Lindsay