[PVE-User] Best configuration among these : Episode 2

Eneko Lacunza elacunza at binovo.es
Tue Feb 24 14:28:01 CET 2015


On 24/02/15 13:53, Philippe Schwarz wrote:
>> I have 3 small proxmox-ceph clusters and neither of them surpasses
>> 250Mbps on peak use (and it's for backups), and there are some
>> DB-heavy processes running every two hours. Normally with so small
>> number of OSD you'll be limited by the IOPS of magnetic hard
>> disks.
> OK, so a single 10Gbe is far from nowadays limit for us. Good for the
> future.
> Gonna give a try to the poor-man failover solution.
I think this is a good idea. I monitor servers with munin for more 
detailed data; you can then check whether you're running out of network 
bandwidth and upgrade to 10 Gbit if needed. What you can do is 
separate the "public ceph" (VMs->ceph) network from the "private ceph" 
(osd->osd copies, backfilling) network, using a different 1 Gbit port for
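
A minimal sketch of how that public/private split might look in ceph.conf. The two subnets are made-up placeholders, not values from this thread; substitute the networks actually attached to each port:

```ini
[global]
	 # "public ceph": client (VM) -> ceph traffic. Placeholder subnet.
	 public network = 192.168.10.0/24
	 # "private ceph": osd -> osd replication and backfilling. Placeholder subnet.
	 cluster network = 192.168.20.0/24
```
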
>> Currently Ceph (firefly) has some performance bottlenecks with SSD
>> drives and can't use all their performance, so I don't think going
>> PCIe SSD for journals will help you, unless you use the same disk
>> for more OSD journals. I'd choose option #2, you could even put
>> proxmox on the same SSD as the journals. You can use the excess of
>> money for more OSD disks, generally people use 1 SSD per 3-4 OSDs.
>> You can also put the journals of 2 OSD in one SSD and the other 2
>> journals and proxmox on the other.
> Interesting Setup. It doubles the IOPS on the journal but divides by
> two the MTBF, because the failure of the journal is the failure of the
> server. Incidentally, it divides by two the number of writes on the
> SSD and doubles the average lifetime of the SSD. Interesting !
You can also softraid the two SSDs for the system partition, so that 
you only lose 2 OSDs if an SSD fails. Softraid is not officially 
supported in Proxmox, but it works very well. Nevertheless, Intel SSDs 
have a very good track record for quality, so I don't know if this is worth it.
>> Generally speaking, more OSDs = better performance. Take into
>> account that Ceph is optimized for multiple user/VMs access to
>> storage, not for 1 user/VM fast access. You can also add more
>> OSDs/SSDs to existing servers.
> Yes, the focus was put on this point in the tests i have read. It's
> important not to test from a single client.
> Ceph's purpose is to share the load onto many clients.
> Testing (and designing it) with a single one is non-sense.
> The more OSDs (the disks), the better.
> Those cheap 7200rpm 1TB SATA are good enough for this purpose i think,
> isnt'it ?
> The most difficult part will be to get caddies..
Yes, ebay is your friend. :-)
> Thanks for all your answers, you're a brilliant Ceph evangelist ;-)
> You probably made a new fan !
I hope your new cluster works for your needs! Tell us when it's built!

Also, please note that since it will be a small Ceph cluster, you must 
tune the default backfilling parameters a bit:

	 osd max backfills = 1
	 osd recovery max active = 1
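
As a sketch, and assuming you set them in ceph.conf rather than injecting them at runtime, the two settings above would normally go under the [osd] section:

```ini
[osd]
	 # Limit concurrent backfill operations per OSD
	 osd max backfills = 1
	 # Limit concurrent recovery operations per OSD
	 osd recovery max active = 1
```

OSDs read ceph.conf at startup, so a restart of the OSD daemons would be needed for the change to take effect this way.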


Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943575997
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
