[PVE-User] High ceph OSD latency

Fabrizio Cuseo f.cuseo at panservice.it
Thu Jan 15 11:25:44 CET 2015


I have a small Proxmox/Ceph cluster:

- 3 x Dell CS24, each with:
    - 2 x Xeon CPUs
    - 24 GB RAM
    - 1 x 500 GB SATA disk (used for Proxmox)
    - 3 x 2 TB WD2000F9YZ SATA Enterprise Edition disks (used for Ceph OSDs)
    - 1 x Gbit Ethernet (used for Ceph and Proxmox)
    - 1 x Gbit Ethernet (used for VM traffic)

What is strange is that the OSD tree shows high latency: typically the apply latency is between 5 and 25 ms, but the commit latency is between 150 and 300 ms (and sometimes 500-600 ms), with only 5-10 op/s and a few B/s of reads and writes. I have only 3 VMs, and only 1 is running now, so the cluster is essentially unloaded.

I am using a pool with 3 copies, and I have increased pg_num to 256 (the default value of 64 is too low), but the OSD latency stays the same regardless of the pg_num value.
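For reference, the placement-group arithmetic behind that choice can be sketched as below. This is only a rough check, assuming the commonly cited rule of thumb of about 100 PGs per OSD; the function names, the target value and the "round to the nearest power of two" step are illustrative assumptions, not something taken from the Ceph tools.

```python
import math


def pgs_per_osd(pg_num: int, replicas: int, osd_count: int) -> float:
    """Average number of PG copies each OSD holds for a single pool."""
    return pg_num * replicas / osd_count


def suggested_pg_num(osd_count: int, replicas: int, target_per_osd: int = 100) -> int:
    """Rule-of-thumb pg_num: (OSDs * target) / replicas, rounded to the
    nearest power of two. The target of 100 is an assumed default."""
    raw = osd_count * target_per_osd / replicas
    lower = 2 ** math.floor(math.log2(raw))
    upper = lower * 2
    return lower if raw - lower <= upper - raw else upper


# Cluster described above: 3 nodes x 3 OSDs = 9 OSDs, pool with 3 copies.
OSDS = 3 * 3
REPLICAS = 3

if __name__ == "__main__":
    print(f"pg_num=64  -> {pgs_per_osd(64, REPLICAS, OSDS):.1f} PGs per OSD")
    print(f"pg_num=256 -> {pgs_per_osd(256, REPLICAS, OSDS):.1f} PGs per OSD")
    print(f"suggested pg_num: {suggested_pg_num(OSDS, REPLICAS)}")
```

With 9 OSDs and 3 copies this gives roughly 21 PGs per OSD at pg_num 64 and roughly 85 at pg_num 256, so 256 is in the usual range and the PG count is unlikely to explain the latency by itself.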

I have other clusters (similar configuration, using Dell 2950 servers, dual Ethernet for Ceph and Proxmox, 4 OSDs on 1 TB drives, PERC 5/i controller) running several VMs, and there the commit and apply latency is 1-2 ms.

Another cluster (a test cluster) with 3 x Dell PE860, with only 1 OSD per node, has better latency (10-20 ms).

What can I check?

Thanks in advance, Fabrizio

Fabrizio Cuseo - mailto:f.cuseo at panservice.it
General Management - Panservice InterNetWorking
Professional Services for Internet and Networking
Panservice is a member of AIIP - RIPE Local Registry
Phone: +39 0773 410020 - Fax: +39 0773 470219
http://www.panservice.it  mailto:info at panservice.it
National toll-free number: 800 901492
