[PVE-User] ceph - slow rbd ls -l

Eneko Lacunza elacunza at binovo.es
Tue Oct 11 10:43:43 CEST 2016


Hi,

On 11/10/16 at 03:43, Thiago Damas wrote:
> 2016-10-10 20:47 GMT-03:00 Lindsay Mathieson <lindsay.mathieson at gmail.com>:
>
>> On 11/10/2016 7:59 AM, Thiago Damas wrote:
>>
>>> I'm experiencing some timeouts when creating new disks/VMs on a ceph
>>> storage.
>>>     Is there some way to speed up the long listing of rbd ls, i.e. "rbd ls -l"?
>>>
>> How does your "ceph -s" look?
>>
>>
>> Are the logs showing any particular OSDs as being slow to respond?
>>
> Look:
>
> ~# ceph -s
>      cluster 71b07854-5d50-46e7-a130-2230712cd3aa
>       health HEALTH_OK
>       monmap e9: 5 mons at
> {0=192.168.33.AAA:6789/0,1=192.168.33.BBB:6789/0,2=192.168.33.CCC:6789/0,3=192.168.33.DDD:6789/0,4=192.168.33.EEE:6789/0}
>              election epoch 294, quorum 0,1,2,3,4 0,1,2,3,4
>       osdmap e3466: 40 osds: 40 up, 40 in
>        pgmap v15080315: 2048 pgs, 1 pools, 7910 GB data, 1995 kobjects
>              23467 GB used, 122 TB / 145 TB avail
>                  2048 active+clean
>    client io 36159 kB/s rd, 11745 kB/s wr, 1385 op/s
>
> ~# ceph osd perf
> osd fs_commit_latency(ms) fs_apply_latency(ms)
>    0                     1                   10
>    1                     1                   13
>    2                     0                    7
>    3                     1                   12
>    4                     0                    6
>    5                     0                    7
>    6                     0                    5
>    7                     1                    9
>    8                     1                   10
>    9                     1                    9
>   10                     1                   11
>   11                     1                    6
>   12                     1                   13
>   13                     0                   22
>   14                     2                    7
>   15                     1                    8
>   16                     1                    8
>   17                     0                    8
>   18                     1                    9
>   19                     1                    7
>   20                     3                   13
>   21                     2                   30
>   22                     3                   16
>   23                     1                   10
>   24                     1                    8
>   25                     0                    8
>   26                     1                    7
>   27                     1                    8
>   28                     1                    7
>   29                     0                    6
>   30                     1                   13
>   31                     1                    9
>   32                     1                   11
>   33                     1                    8
>   34                     1                    7
>   35                     1                   10
>   36                     1                   13
>   37                     1                    6
>   38                     1                    9
>   39                     2                   12
>
> ~# time rbd ls -l
> ...lines and lines...
> real 0m37.351s
> user 0m0.552s
> sys 0m0.232s
>
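In the "ceph osd perf" output quoted above, OSDs 13 and 21 stand out with apply latencies of 22 ms and 30 ms, while most others sit between 5 and 13 ms. Whether that has anything to do with the slow listing is not established here. A minimal Python sketch for flagging such outliers from a saved copy of the output follows; the helper names and the "2x the median" cutoff are assumptions chosen for illustration, not ceph conventions.

```python
import statistics

def parse_osd_perf(text):
    """Parse plain-text `ceph osd perf` output into {osd_id: (commit_ms, apply_ms)}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three integer columns; the header is skipped.
        if len(fields) == 3 and all(f.isdigit() for f in fields):
            osd, commit_ms, apply_ms = map(int, fields)
            result[osd] = (commit_ms, apply_ms)
    return result

def slow_osds(perf, factor=2.0):
    """Return sorted OSD ids whose apply latency exceeds factor * median.

    The factor is an arbitrary illustrative cutoff (assumption).
    """
    if not perf:
        return []
    median = statistics.median(apply_ms for _, apply_ms in perf.values())
    return sorted(osd for osd, (_, apply_ms) in perf.items()
                  if apply_ms > factor * median)

# A few rows copied from the output quoted above.
SAMPLE = """\
osd fs_commit_latency(ms) fs_apply_latency(ms)
  12                     1                   13
  13                     0                   22
  14                     2                    7
  20                     3                   13
  21                     2                   30
  22                     3                   16
  23                     1                   10
"""

if __name__ == "__main__":
    print(slow_osds(parse_osd_perf(SAMPLE)))
```

The output could be saved with "ceph osd perf > perf.txt" and the file contents passed to parse_osd_perf().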
For our cluster (12 OSDs), listing a pool with 58 images takes 5s the first
time, then it's <1s.

I'm not sure what makes it so slow on your cluster.
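One way to narrow it down might be to time the plain listing against the long one. The long listing presumably has to open every image to read its size, parent and format, so its cost should grow with the number of images, while plain "rbd ls" only needs the image names. A rough Python sketch, assuming the rbd client is installed and that the pool is named "rbd" (both are assumptions; the function names are hypothetical):

```python
import subprocess
import time

def time_command(argv):
    """Run argv, discard its output, and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=True)
    return time.monotonic() - start

def compare_listings(pool="rbd"):
    """Time `rbd ls` and `rbd ls -l` on one pool.

    Needs a working rbd client and cluster access; the default pool
    name is an assumption.
    """
    short = time_command(["rbd", "ls", pool])
    long_ = time_command(["rbd", "ls", "-l", pool])
    return short, long_
```

Calling compare_listings() twice in a row would also show whether the second run gets faster, as it does here.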

Cheers

-- 
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
       943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



