[pve-devel] [PATCH qemu-server] add discard_granularity to 4M for rbd storage

Alexandre DERUMIER aderumier at odiso.com
Wed Jul 4 13:54:35 CEST 2018


Hi,

I'm not sure, but I think we could take the cluster_size value from the block drivers that support it.

Something like:

    if (!bdrv_get_info(bs, &bdi) && bdi.cluster_size) {
        s->qdev.conf.discard_granularity = bdi.cluster_size;
    } else {
        s->qdev.conf.discard_granularity = DEFAULT_DISCARD_GRANULARITY;
    }

This should work with rbd, file formats (qcow2, vmdk, ...), and iscsi.


For local block storage, I don't know how to retrieve the granularity from within qemu.
Maybe something similar to what block/file-posix.c does:

    sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
                                major(st->st_rdev), minor(st->st_rdev));
    fd = open(sysfspath, O_RDONLY);
    if (fd == -1) {
        ret = -errno;
        goto out;
    }
   ....
?



----- Original Message -----
From: "Wolfgang Bumiller" <w.bumiller at proxmox.com>
To: "Alexandre Derumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Tuesday, 3 July 2018 09:50:05
Subject: Re: [pve-devel] [PATCH qemu-server] add discard_granularity to 4M for rbd storage

I'm answering here instead of the storage patches since I think we 
should finish the discussion first. 

On Fri, Jun 29, 2018 at 10:27:52AM +0200, Alexandre DERUMIER wrote: 
> >>IMHO: I would be up for this approach, as a first measure, so the discard 
> >>granularity can be overwritten. 
> 
> I'll rework my patch on qemu-server and make a new sub in pve-storage plugins to retrieve the block size. 
> 
> 
> >> A second step may be that Qemu should be able to work with the proper granularity by itself. 
> 
> I really don't know if qemu is able to retrieve the value from zfs (as it's a simple block storage). 

Block storages have their discard granularity exposed via sysfs. Qemu 
already detects the difference between raw files and block devices in 
order to know whether to do e.g. a BLKDISCARD ioctl() vs. a 
fallocate(FALLOC_FL_PUNCH_HOLE), so figuring out the granularity will be 
as simple as querying 
/sys/dev/block/$major:$minor/queue/discard_granularity for block 
devices. 

# zfs get volblocksize tank/test 
NAME PROPERTY VALUE SOURCE 
tank/test volblocksize 8K default 
# ls -l /dev/zvol/tank/test 
lrwxrwxrwx 1 root root 9 Jul 3 07:56 /dev/zvol/tank/test -> ../../zd0 
# cat /sys/block/zd0/queue/discard_granularity 
8192 
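
A minimal sketch of that sysfs query in C, assuming Linux and glibc. This is
not QEMU code: the helper names (discard_granularity_path, read_u64_file,
probe_discard_granularity) are hypothetical and only illustrate the lookup
described above: stat() the path, check that it is a block device, then read
queue/discard_granularity for its major:minor pair.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: build the sysfs path for a block device number. */
static int discard_granularity_path(char *buf, size_t len, dev_t rdev)
{
    int n = snprintf(buf, len,
                     "/sys/dev/block/%u:%u/queue/discard_granularity",
                     major(rdev), minor(rdev));
    /* snprintf reports the length it wanted; treat truncation as an error. */
    return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* Hypothetical helper: read one unsigned decimal value from a file. */
static int read_u64_file(const char *path, unsigned long long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f) {
        return -errno;
    }
    ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -EINVAL;
}

/*
 * Returns 0 and stores the granularity in bytes if dev_path is a block
 * device. Returns -ENOTBLK for anything else (e.g. a raw image file),
 * or another negative errno value on failure.
 */
static int probe_discard_granularity(const char *dev_path,
                                     unsigned long long *granularity)
{
    struct stat st;
    char path[128];
    int ret;

    if (stat(dev_path, &st) < 0) {   /* stat() follows the zvol symlink */
        return -errno;
    }
    if (!S_ISBLK(st.st_mode)) {
        return -ENOTBLK;
    }
    ret = discard_granularity_path(path, sizeof(path), st.st_rdev);
    if (ret < 0) {
        return ret;
    }
    return read_u64_file(path, granularity);
}
```

Called on /dev/zvol/tank/test from the example above, this would be expected
to read the same value that cat printed. A caller would still need its own
fallback for the -ENOTBLK case, since raw files have no sysfs entry.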

Raw files are file system dependent and for that it *may* make sense to 
allow specifying an actual size in the VM config. But I don't know if 
there are many actual file systems in use by our users with different 
granularities. (Not sure how cephfs handles these fallocate() 
operations?) 

So I still think we should first ask on qemu-devel whether they'd accept 
a way to add attempts to automatically detect the discard granularity. 
We can then still do it on pve-storage side. 

Unless someone knows a storage type where this really is too difficult 
on qemu's side while still possible on our side?


