[pve-devel] [PATCH] ceph : use jemalloc for build

Alexandre DERUMIER aderumier at odiso.com
Tue May 2 15:27:27 CEST 2017


>>I did some quick fio benchmark on Friday, and did not see any real
>>performance improvements (but the memory usage grows by ~50% for the OSD
>>processes!). 

Yes, the memory increase is expected. (That's why it is not enabled by default; the Ceph developers didn't want to impact servers with a lot of disks.)


>>could you share your benchmarks/test setup?

I don't have my test cluster running at the moment, but here is what my setup looked like when I ran the tests:


My test setup was 3 nodes (2x 10 cores @ 3.1 GHz each), with 6 OSDs (Intel SSD S3610) per node, so 18 OSDs in total.

Without jemalloc I was around 400k IOPS randread 4k; with jemalloc, 600k IOPS randread 4k.
Without jemalloc I was around 150k IOPS randwrite 4k; with jemalloc, 200k IOPS randwrite 4k.

(Both tests used 10 fio jobs with a queue depth of 64.)
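
For reference, a run like this can be reproduced with fio's rbd engine. The following is only a sketch; the pool name, image name and client name are placeholders, not the exact values from my setup:

fio example (sketch)
--------------------
fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fiotest \
    --rw=randread --bs=4k --iodepth=64 --numjobs=10 \
    --direct=1 --time_based --runtime=60 --group_reporting \
    --name=rand4k-test

(Replace --rw=randread with --rw=randwrite for the write test.)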

During the tests, CPU usage across the cluster was at 100%.


Maybe you don't see a difference because you are not CPU limited?
But I think you should still see a latency difference. (I don't have exact numbers.)



Here is a benchmark presentation with jemalloc results:
https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2015/20150813_S303E_Zhang.pdf



ceph.conf
---------
[global]
fsid = 49312468-47e2-47f3-87a2-ed8033f515a2
mon_initial_members = ceph1-1, ceph1-2, ceph1-3
mon_host = 10.5.0.34,10.5.0.35,10.5.0.36
auth_cluster_required = none
auth_service_required = none
auth_client_required = none

debug paxos = 0/0
debug journal = 0/0
debug mds_balancer = 0/0
debug mds = 0/0

debug lockdep = 0/0
debug auth = 0/0
debug mds_log = 0/0
debug mon = 0/0
debug perfcounter = 0/0
debug monc = 0/0
debug rbd = 0/0
debug throttle = 0/0
debug mds_migrator = 0/0
debug client = 0/0
debug rgw = 0/0
debug finisher = 0/0
debug journaler = 0/0
debug ms = 0/0
debug hadoop = 0/0
debug mds_locker = 0/0
debug tp = 0/0
debug context = 0/0
debug osd = 0/0
debug bluestore = 0/0
debug objclass = 0/0
debug objecter = 0/0

osd pool default size = 3
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 1024
osd_pool_default_pgp_num = 1024


osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k,delaylog
osd_mkfs_type = xfs
osd_mkfs_options_xfs = -f -i size=2048
mon_pg_warn_max_per_osd = 10000

filestore_queue_max_ops = 5000
osd_client_message_size_cap = 0
objecter_inflight_op_bytes = 1048576000
ms_dispatch_throttle_bytes = 1048576000

filestore_wbthrottle_enable = true
filestore_fd_cache_shards = 64
objecter_inflight_ops = 1024000
filestore_queue_committing_max_bytes = 1048576000
osd_op_num_threads_per_shard = 2
filestore_queue_max_bytes = 10485760000
osd_op_threads = 20
osd_op_num_shards = 10
filestore_max_sync_interval = 10
filestore_op_threads = 16
osd_pg_object_context_cache_count = 10240
journal_queue_max_ops = 3000
journal_queue_max_bytes = 10485760000
journal_max_write_entries = 1000
filestore_queue_committing_max_ops = 5000
journal_max_write_bytes = 1048576000
osd_enable_op_tracker = False
filestore_fd_cache_size = 10240
osd_client_message_cap = 0

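
As a quick way to double-check which allocator an OSD binary actually ends up using, something like this works (a sketch, assuming a single ceph-osd process per node so the pidof shortcut is unambiguous):

allocator check (example)
-------------------------
grep jemalloc /proc/$(pidof -s ceph-osd)/maps

With a jemalloc build you should see libjemalloc mapped there; with a tcmalloc build you would see libtcmalloc instead.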


----- Original Message -----
From: "Fabian Grünbichler" <f.gruenbichler at proxmox.com>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Tuesday, May 2, 2017 10:37:19
Subject: Re: [pve-devel] [PATCH] ceph : use jemalloc for build

On Fri, Apr 28, 2017 at 02:32:07PM +0200, Alexandre DERUMIER wrote: 
> >>this flag is only for rocksdb, and only for the windows build? 
> 
> I have tested it, and it's working fine. (using perf command, I was seeing jemalloc) 

should have been more clear here - it DOES work, but not because of that 
flag (the CMake build output even explicitly states that it will ignore 
it ;)). the CMake build scripts use tcmalloc or jemalloc based on 
whether you have tcmalloc or jemalloc headers installed (in that order), 
and because of your added Build-Conflicts tcmalloc headers cannot be 
installed. 

ALLOCATOR explicitly sets the desired allocator, bypassing the 
autodetection. 
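
(for illustration: forcing the allocator at configure time would then look 
roughly like 

    cmake -DALLOCATOR=jemalloc <other options> .. 

i.e. the switch is passed wherever the packaging invokes cmake, instead of 
relying on which malloc headers happen to be installed.)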

> 
> >>I propose "-DALLOCATOR=jemalloc" instead - what do you think? 
> 
> I think it should work, I will test it this weekend. 
> 

thanks for confirming that it works as expected. 

I did some quick fio benchmark on Friday, and did not see any real 
performance improvements (but the memory usage grows by ~50% for the OSD 
processes!). could you share your benchmarks/test setup? 



