[PVE-User] Ceph Cache Tiering
Lindsay Mathieson
lindsay.mathieson at gmail.com
Mon Oct 10 13:46:01 CEST 2016
On 10/10/2016 8:19 PM, Brian :: wrote:
> I think that for clusters with VM-type workloads, at the scale Proxmox
> users tend to build (< 20 OSD servers), a cache tier adds a layer of
> complexity that isn't going to pay back. If you want decent
> IOPS / throughput at this scale with Ceph, no spinning rust allowed
> anywhere :)
I think you're right. For all the talk of small-scale deployments and
commodity hardware, Ceph at the small-business scale has a poor
price/performance ratio. To get decent performance out of it you have to
spend big on terabyte SSDs, battery-backed disk controllers etc.

It's a shame, because the flexibility is really good and it's rock solid
in the 9.x range if you don't push it :) Its ability to quickly heal only
dirty data is outstanding. I just tested a simple 3 node setup backed by
our ZFS pools and was only getting 50MB/s sequential writes (rough test
sketched below). With SSD journals that probably would have reached the
100MB/s level, which would have been good enough, but there is one deal
breaker for us and that's snapshots: they are incredibly slow to restore,
45 minutes for one that was only a few minutes old, and it gets worse the
more writes you add. qcow2 snapshots on gluster only take a couple of
minutes, and we use snapshots a lot for testing, development and support.
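For reference, the 50MB/s figure came from a quick sequential-write test
at the RADOS level. Something along the lines of the sketch below gives a
comparable number; the pool name "bench", the chunk size and the object
count are placeholders, and it assumes the python-rados bindings are
installed and /etc/ceph/ceph.conf is readable:

# Minimal sequential-write sketch using python-rados (placeholder values).
import time
import rados

CHUNK = 4 * 1024 * 1024                 # 4 MiB per write
COUNT = 256                             # 1 GiB total
data = b"\0" * CHUNK

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("bench")  # throwaway test pool, assumed to exist
    try:
        start = time.time()
        for i in range(COUNT):
            # write_full() writes each object in one shot
            ioctx.write_full("bench-obj-%d" % i, data)
        elapsed = time.time() - start
        print("%.1f MB/s" % (CHUNK * COUNT / elapsed / 1e6))
        for i in range(COUNT):           # clean up the test objects
            ioctx.remove_object("bench-obj-%d" % i)
    finally:
        ioctx.close()
finally:
    cluster.shutdown()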
> Additionally I think it is error prone.
> I ran into a problem where an SSD got stuck because it was full, causing the complete storage to stall.
It does seem to be very much a work in progress :)
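FWIW, that stall is usually the cache pool never being told when to flush
and evict, so the SSDs just fill up. Capping the cache with
target_max_bytes and the cache_target_*_ratio settings is meant to avoid
it. Roughly along these lines; the pool name "hot-cache" and the values
are placeholders, not tuning advice:

# Sketch: cap a cache-tier pool so the tiering agent flushes/evicts
# before the SSDs run full. Pool name and values are made-up examples.
import subprocess

POOL = "hot-cache"                            # assumed cache-tier pool name
settings = {
    "target_max_bytes": str(200 * 1024**3),   # hard cap on cached data (200 GiB)
    "cache_target_dirty_ratio": "0.4",        # flush dirty objects from 40% full
    "cache_target_full_ratio": "0.8",         # evict clean objects from 80% full
}

for key, value in settings.items():
    # equivalent to: ceph osd pool set hot-cache <key> <value>
    subprocess.check_call(["ceph", "osd", "pool", "set", POOL, key, value])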
--
Lindsay Mathieson