[pve-devel] [PATCH] iothreads : create 1 iothread by virtio device

Alexandre DERUMIER aderumier at odiso.com
Thu Jan 29 08:21:06 CET 2015


>>On our SSD pool which has 24x Intel S3700 across 4 hosts, the most we can get inside a guest using virtio-scsi is ~9000IOPS @ 4K size.  
>>In guest CPU usage is around 7% yet the guest is using 100% for each allocated core on the host, which makes me think that the multiple IO threads patch will unleash the performance for us.

Hi Andrew,
Indeed, I have noticed the same results as you.


librbd is really CPU intensive, and qemu uses 1 global thread by default for all I/O.
That means that your 9000 IOPS is limited by the CPU of 1 core (the higher the CPU frequency, the more IOPS you'll get).
(You can add 10 disks to the VM, you'll still have 1 thread for all disks.)

So, one solution is to use the krbd kernel module. I sent a patch to the mailing list some time ago; it is not yet in master.
With the krbd kernel module, I can reach around 40000 IOPS with default qemu, without iothread.
http://pve.proxmox.com/pipermail/pve-devel/2014-November/013175.html

(krbd uses 4x less CPU than librbd. I have sent a bug report to Inktank; they are currently looking at it.)



Now, with librbd + iothread (note that currently it works fine with virtio-blk, but for virtio-scsi it's not stable yet),
you can use 1 thread for each disk.
With 1 disk, I think I'm around 15000 IOPS vs 9000 IOPS.
With X disks, I'm around X x 15000 IOPS.
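
To illustrate "1 thread for each disk", here is a minimal Python sketch that builds a qemu command line with one iothread object per virtio-blk disk. The `-object iothread`, `-drive` and `-device virtio-blk-pci,...,iothread=` options are regular qemu options; the helper name `build_iothread_args`, the ids and the example rbd image names are hypothetical, and this is not necessarily how the patch itself generates the command line.

```python
def build_iothread_args(disk_paths):
    """Return qemu arguments giving each virtio-blk disk its own iothread."""
    args = []
    for i, path in enumerate(disk_paths):
        iothread_id = f"iothread{i}"
        drive_id = f"drive{i}"
        # One dedicated I/O thread object per disk.
        args += ["-object", f"iothread,id={iothread_id}"]
        # The backing drive, declared without a default bus (if=none).
        args += ["-drive", f"file={path},if=none,id={drive_id},format=raw"]
        # The virtio-blk device, bound to its own iothread.
        args += ["-device", f"virtio-blk-pci,drive={drive_id},iothread={iothread_id}"]
    return args


if __name__ == "__main__":
    # Hypothetical image names, for illustration only.
    disks = ["rbd:pool/vm-100-disk-1", "rbd:pool/vm-100-disk-2"]
    print(" ".join(build_iothread_args(disks)))
```

Each disk gets its own `-object iothread` entry, so the librbd work for that disk no longer runs in the single global thread.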


For virtio-scsi, my patch series also adds support for multiqueue in virtio-scsi, which also helps.
This requires a recent guest kernel.
I have seen some improvement for sequential reads with small blocks: the small blocks are aggregated into bigger ones,
so fewer I/Os go to ceph and less CPU is used.
(10000 -> 50000 IOPS for this workload)



Also, to reduce CPU usage and improve performance, you can tune ceph.conf with:

        
        debug lockdep = 0/0
        debug context = 0/0
        debug crush = 0/0
        debug buffer = 0/0
        debug timer = 0/0
        debug journaler = 0/0
        debug osd = 0/0
        debug optracker = 0/0
        debug objclass = 0/0
        debug filestore = 0/0
        debug journal = 0/0
        debug ms = 0/0
        debug monc = 0/0
        debug tp = 0/0
        debug auth = 0/0
        debug finisher = 0/0
        debug heartbeatmap = 0/0
        debug perfcounter = 0/0
        debug asok = 0/0
        debug throttle = 0/0

        cephx sign messages = false
        cephx require signatures = false
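
The settings above can also be generated instead of typed by hand. This is a minimal Python sketch that prints them as a ceph.conf fragment; putting them under a `[global]` section is an assumption (the section is not stated above), and the name `render_ceph_tuning` is hypothetical.

```python
# Debug subsystems listed above, all set to 0/0 (no logging).
DEBUG_SUBSYSTEMS = [
    "lockdep", "context", "crush", "buffer", "timer", "journaler", "osd",
    "optracker", "objclass", "filestore", "journal", "ms", "monc", "tp",
    "auth", "finisher", "heartbeatmap", "perfcounter", "asok", "throttle",
]


def render_ceph_tuning(section="global"):
    """Return the tuning options as text for one ceph.conf section."""
    lines = [f"[{section}]"]  # assumed section name, see the note above
    lines += [f"        debug {name} = 0/0" for name in DEBUG_SUBSYSTEMS]
    lines.append("")
    lines.append("        cephx sign messages = false")
    lines.append("        cephx require signatures = false")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_ceph_tuning())
```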



From my benchmarks, the most IOPS I was able to reach with 1 VM was:

3 x 40000 IOPS with 3 virtio-blk disks + iothreads + krbd

This was with a small CPU: E5-2603 v2 @ 1.80GHz.
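
To put the figures from this mail side by side, here is a small Python sketch. It assumes IOPS scale linearly with the number of disks when each disk has its own iothread, and stay flat when all disks share the single global thread; that linear model is a simplification, not a measurement, and the names are hypothetical.

```python
# Approximate IOPS with the single global qemu thread (no iothread),
# whatever the number of disks. Figures are the ones quoted in this mail.
GLOBAL_THREAD_IOPS = {"librbd": 9000, "krbd": 40000}

# Approximate IOPS per disk when each disk has its own iothread.
PER_DISK_IOTHREAD_IOPS = {"librbd": 15000, "krbd": 40000}


def estimate_iops(backend, disks, iothreads):
    """Rough estimate of total IOPS for one VM (linear model, assumption)."""
    if iothreads:
        return disks * PER_DISK_IOTHREAD_IOPS[backend]
    # One thread serves all disks, so adding disks does not help.
    return GLOBAL_THREAD_IOPS[backend]


if __name__ == "__main__":
    for backend in ("librbd", "krbd"):
        for iothreads in (False, True):
            print(backend, "iothreads" if iothreads else "global thread",
                  estimate_iops(backend, 3, iothreads))
```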


I'm going to have a full SSD ceph cluster next month (18x S3500 on 3 hosts),
with more powerful nodes and more powerful clients (20 cores, 3.1GHz).

I'll do benchmarks and post results with different setups.


If you have time, maybe you can test my krbd patch?
(Note that discard support is not yet available in krbd in kernel 3.10.)


Hope it helps.


Alexandre



(I'm also going to try to test the vhost-scsi feature of qemu soon, which is a kind of full passthrough, bypassing all emulation layers.)

----- Original message -----
From: "Andrew Thrift" <andrew at networklabs.co.nz>
To: "aderumier" <aderumier at odiso.com>
Sent: Thursday, January 29, 2015 02:55:11
Subject: Re: [pve-devel] [PATCH] iothreads : create 1 iothread by virtio device

Hi Alexandre, 
Without this patch what IOPS do you get ? 

On our SSD pool which has 24x Intel S3700 across 4 hosts, the most we can get inside a guest using virtio-scsi is ~9000IOPS @ 4K size. 

In guest CPU usage is around 7% yet the guest is using 100% for each allocated core on the host, which makes me think that the multiple IO threads patch will unleash the performance for us. 


Regards, 




Andrew 




On Fri, Jan 16, 2015 at 8:43 AM, Dietmar Maurer < dietmar at proxmox.com > wrote: 


> The patch series for block jobs was here 
> http://lists.gnu.org/archive/html/qemu-devel/2014-10/msg02191.html 
> 
> 
> (I'm not sure about the difference between proxmox backup patches and qemu 
> implemention) 

We have a bunch of patches above that (vma). 

_______________________________________________ 
pve-devel mailing list 
pve-devel at pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 




