[PVE-User] Best configuration among these : Episode 2

Eneko Lacunza elacunza at binovo.es
Tue Feb 24 11:21:00 CET 2015


On 24/02/15 10:39, Philippe Schwarz wrote:
> So, i modified my setup and have a few questions.
> No more SAN (no more SPOF), but an increase in hardware specs of the 3
> remaining servers.
> About storage network :
> I can't afford losing the cluster because of the network failing.
> I'm on the way to use Netgear XS708E, a 10Gbe switch. Should be
> doubled to be redundant. But, i should therefore double the 10Gbe NIC
> too. Too expensive solution!
> So i planned to setup an active/passive bond :
> 1X10gbe (active) + 1X1Gbe(passive), the 10gbe plugged on the XS708E,
> the 1Gbe plugged on a cheap 1gbe switch and both switches connected with
> a single (or dual, LACP) link.
> The other (because of Intel Dual port 10Gb) NIC 10 Gbe will be
> connected to the LAN using the same principle (active10Gbe+passive
> 1gbe bond).
> Is there an issue with that ?
I never did something like this myself. I think you should be OK with a 
1 Gbit network for the number of OSDs you're listing. The only exception 
could be the 5xSSD setup, but you'll need lots of CPU power to make use 
of all the IOPS in that setup.

I have 3 small proxmox-ceph clusters and none of them surpasses 
250 Mbps at peak use (and that's for backups), and there are some DB-heavy 
processes running every two hours. Normally, with such a small number of 
OSDs, you'll be limited by the IOPS of the magnetic hard disks.
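
To put that peak figure in perspective, here is a minimal sketch of the
arithmetic. The 250 Mbps value is the peak quoted above for my clusters, not
a measurement of yours, and the function name is only illustrative.

```python
def link_utilisation(peak_mbps: float, link_mbps: float) -> float:
    """Return peak traffic as a fraction of the link capacity."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    return peak_mbps / link_mbps


PEAK_MBPS = 250  # peak quoted above; an assumption for any other cluster

for name, capacity_mbps in (("1GbE", 1_000), ("10GbE", 10_000)):
    share = link_utilisation(PEAK_MBPS, capacity_mbps)
    print(f"{name}: {share:.1%} of capacity at peak")
```

With a similar load, even the passive 1GbE leg of the bond would still have
headroom after a failover.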
> About Ceph journal :
> Because of the smart use of fast device for the journal and
> slow/cheap/large device for the data, i wonder which solution should
> be the best:
> 1. 1SSD-200G(ProxMox)+1SSDPCIE-400G:IntelP3700:(journal)+4SATA 1TB=2200€
> 2. 1SSD-200G(ProxMox)+1SSD-200G(journal)+4SATA 1TB= 900€
> 3. 1SSD-200G(ProxMox)+5SSD 1TB, no journal= 2600€
> Not the same price, but not the same perfs either...
> Non PCIe SSD would be either Intel S3700 200GB or Samsung 850 Pro 1TB
> Any clue ?
Samsung 840 Pro is total shit for ceph, so I won't even test the 850. 
The ceph community consensus seems to be Intel S3700 200GB, so I won't 
look elsewhere.

Currently Ceph (firefly) has some performance bottlenecks with SSD 
drives and can't use all their performance, so I don't think going PCIe 
SSD for journals will help you, unless you use the same disk for more 
OSD journals. I'd choose option #2; you could even put proxmox on the 
same SSD as the journals. You can use the money saved for more OSD 
disks; generally people use 1 SSD per 3-4 OSDs. You can also put the 
journals of 2 OSD in one SSD and the other 2 journals and proxmox on the 
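
The "1 SSD per 3-4 OSDs" rule of thumb above can be sketched as a small
calculation. The function name and the default ratio of 4 are illustrative
assumptions, not anything Ceph itself provides.

```python
import math


def journal_ssds_needed(osd_count: int, osds_per_ssd: int = 4) -> int:
    """Number of journal SSDs for osd_count OSDs at the given ratio."""
    if osd_count < 0 or osds_per_ssd < 1:
        raise ValueError("osd_count must be >= 0 and osds_per_ssd >= 1")
    # Round up: a partly used SSD is still a whole SSD to buy.
    return math.ceil(osd_count / osds_per_ssd)


# 4 SATA OSDs per server, as in options 1 and 2 of the question.
print(journal_ssds_needed(4, osds_per_ssd=4))  # 1
print(journal_ssds_needed(4, osds_per_ssd=3))  # 2
```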

> About raid controller :
> Raid is mandatory on this controller, but Jbod (or single disk array)
> mode will be used.
> Dell H730 is sold with 1GB Non-volatile Cache and 2GB NVcache.
> Is the difference of 300€ (the price of a good SSD) worth it ???
I wouldn't invest in doubling the NV cache. Better put more OSDs and SSDs :)

> Other  hardware considerations :
>   My proxmox cluster will be made of 1 samba + 1WS(Trend) + 1WS(WSUS) +
> 1WS(autocad licenses) + 1 squid + 1 LTSP + many other little other
> servers (apt-proxy, xibo,...)
> So, except for the Squid+Squidguard, nothing really CPU/RAM/IOPS hungry..
> Before Ceph (previous idea was using a ZFS/FreeBSD SAN) i planned to
> use 64 GB of RAM and Dual 2630 CPU.
> Should i go up to 96 GB and Dual 2650 CPU ?? (not sure i can afford both)
Calculate 1 GB for each OSD, 1 GB for ceph monitors, then some for 
proxmox and cache, maybe 4 GB or so. If the remaining RAM is enough for 
your VMs to fit in 2 servers, then you're OK.
If that is a Xeon E5-2650, I think you're OK with 1 CPU. Count 1 GHz for 
each OSD and 1 core for metadata. This will leave 4-5 cores for your VMs 
with 1 CPU on each server.
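
As a rough sketch of that sizing arithmetic: the function names are
illustrative, and the 8 cores at 2.0 GHz used for the CPU are an assumption
about which E5-2650 generation you'd get, so check the real spec sheet.

```python
def ram_left_for_vms_gb(total_ram_gb: float, osd_count: int,
                        monitor_count: int = 1,
                        host_overhead_gb: float = 4.0) -> float:
    """RAM left after 1 GB per OSD, 1 GB per monitor and host overhead."""
    reserved_gb = osd_count * 1.0 + monitor_count * 1.0 + host_overhead_gb
    return total_ram_gb - reserved_gb


def cores_left_for_vms(cores: int, core_ghz: float, osd_count: int,
                       metadata_cores: int = 1) -> float:
    """Cores left after 1 GHz per OSD and the cores kept for metadata."""
    osd_cores = osd_count * 1.0 / core_ghz
    return cores - metadata_cores - osd_cores


def vms_fit_on_surviving_nodes(vm_ram_gb: float, ram_left_per_node_gb: float,
                               nodes: int = 3) -> bool:
    """True if all VMs still fit when one node is down."""
    return vm_ram_gb <= (nodes - 1) * ram_left_per_node_gb


left_gb = ram_left_for_vms_gb(64, osd_count=4)
print(left_gb)                                          # 55.0
print(cores_left_for_vms(8, 2.0, osd_count=4))          # 5.0
print(vms_fit_on_surviving_nodes(100, left_gb))         # True
```

The 100 GB of total VM RAM in the last call is a made-up figure; plug in the
sum of your own VMs.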

> About proxmox only :
> Is possible to setup a fourth proxmox only as ceph server and not join
> it to the previous proxmox cluster but join it to the ceph cluster
> (i've to find the real term)? I don't see any issue with that.
It is possible, but you would lose the integration of the ceph 
administration on that fourth node. It would be a bit strange; I 
wouldn't do it. You can join it to the proxmox cluster and simply not run 
any VMs on that node.
> Last one :
> Should i reduce the costs for those 3 servers to be able to buy a
> fourth one (next year) to increase my ceph osds numbers (won't be a
> proxmox server) or not ?
> I didn't find benchmarks on how the perfs increase with the
> number of servers (and so OSDs)
As said before, I would extend the proxmox/ceph cluster so that it is 
consistent across servers. You plan to use the Proxmox Ceph Server 
integration, right?
Generally speaking, more OSDs = better performance. Take into account 
that Ceph is optimized for access to storage by multiple users/VMs, not 
for fast access by 1 user/VM.
You can also add more OSDs/SSDs to existing servers.


Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943575997
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
