[pve-devel] [PATCH pve-storage] qcow2 format: enable subcluster allocation by default

Fiona Ebner f.ebner at proxmox.com
Wed Sep 11 13:44:02 CEST 2024


On 03.07.24 at 16:24, Alexandre Derumier via pve-devel wrote:
> 
> 
> extended_l2 is an optimisation to reduce write amplification.
> Currently,without it, when a vm write 4k, a full 64k cluster

s/write/writes/

> need to be writen.

needs to be written.

> 
> When enabled, the cluster is splitted in 32 subclusters.

s/splitted/split/

> 
> We use a 128k cluster by default, to have 32 * 4k subclusters
> 
> https://blogs.igalia.com/berto/2020/12/03/subcluster-allocation-for-qcow2-images/
> https://static.sched.com/hosted_files/kvmforum2020/d9/qcow2-subcluster-allocation.pdf
> 
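For reference, the arithmetic behind the 128k choice can be sketched as
below. This is only an illustration, not code from pve-storage or QEMU:
the function names are made up, and it assumes aligned writes and the
worst case where a write to a not-yet-allocated area has to fill a whole
allocation unit (a cluster without extended_l2, a subcluster with it).

```python
# Illustrative only: the names below are hypothetical helpers.
KiB = 1024
SUBCLUSTERS_PER_CLUSTER = 32  # "the cluster is split in 32 subclusters"


def allocation_unit(cluster_size: int, extended_l2: bool) -> int:
    """Smallest unit that gets allocated: a subcluster with extended_l2, else a cluster."""
    if extended_l2:
        return cluster_size // SUBCLUSTERS_PER_CLUSTER
    return cluster_size


def worst_case_amplification(write_size: int, cluster_size: int, extended_l2: bool) -> float:
    """Bytes written divided by bytes requested, assuming an aligned write to unallocated space."""
    unit = allocation_unit(cluster_size, extended_l2)
    units_touched = -(-write_size // unit)  # ceiling division
    return units_touched * unit / write_size


if __name__ == "__main__":
    # 128k cluster with extended_l2 gives 4k subclusters.
    print(allocation_unit(128 * KiB, True) // KiB)            # 4
    # 4k guest write, 64k cluster, no subclusters: 16x amplification.
    print(worst_case_amplification(4 * KiB, 64 * KiB, False))  # 16.0
    # 4k guest write, 128k cluster with subclusters: no amplification.
    print(worst_case_amplification(4 * KiB, 128 * KiB, True))  # 1.0
```

So with 128k clusters the subcluster size matches a 4k guest write,
which is presumably why that value was picked.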
> some stats for 4k randwrite benchmark

Can you please share the exact command you used? What kind of underlying
disks do you have?

> 
> Cluster size   Without subclusters     With subclusters
> 16 KB          5859 IOPS               8063 IOPS
> 32 KB          5674 IOPS               11107 IOPS
> 64 KB          2527 IOPS               12731 IOPS
> 128 KB         1576 IOPS               11808 IOPS
> 256 KB         976 IOPS                9195 IOPS
> 512 KB         510 IOPS                7079 IOPS
> 1 MB           448 IOPS                3306 IOPS
> 2 MB           262 IOPS                2269 IOPS
> 
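To make the table easier to compare, here is a small sketch that turns
the quoted figures into speedup factors. The numbers are copied from the
table above as posted; the variable and function names are just for
illustration, and nothing here says anything about how the benchmark was
run.

```python
# Figures copied from the quoted table:
# (cluster size, IOPS without subclusters, IOPS with subclusters)
RESULTS = [
    ("16 KB", 5859, 8063),
    ("32 KB", 5674, 11107),
    ("64 KB", 2527, 12731),
    ("128 KB", 1576, 11808),
    ("256 KB", 976, 9195),
    ("512 KB", 510, 7079),
    ("1 MB", 448, 3306),
    ("2 MB", 262, 2269),
]


def speedups(results):
    """Map each cluster size to (IOPS with subclusters) / (IOPS without), rounded to 2 places."""
    return {size: round(with_sc / without, 2) for size, without, with_sc in results}


if __name__ == "__main__":
    for size, factor in speedups(RESULTS).items():
        print(f"{size:>6}: {factor:.2f}x")
```

For the proposed 128 KB default this works out to roughly 7.5x in the
posted numbers.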

How does read performance compare for you (with 128 KiB cluster size)?

I don't see any noticeable difference in my testing with an ext4
directory storage on an SSD, attaching the qcow2 images as SCSI disks to
the VM, for either reading or writing. I only compared the state without
your change against the state with it, using 4k (rand)read and
(rand)write.

I'm not sure we should enable this for everybody; added complexity
always carries a risk of breaking things. Maybe it's better to have a
storage configuration option that people can opt in to, e.g.

qcow2-create-opts extended_l2=on,cluster_size=128k

If we get enough positive feedback, we can still change the default in a
future (major) release.
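
A rough sketch of what I mean, in Python only for brevity (the real
code would live in the Perl plugin). Everything here is an assumption
for illustration: 'qcow2-create-opts' does not exist yet, the helper
names and the example path are made up, and the extra options are
simply appended to the '-o' list that 'qemu-img create' already gets.

```python
from typing import List, Optional


def qcow2_create_options(preallocation: str, extra_opts: Optional[str] = None) -> str:
    """Join the preallocation setting with optional, user-supplied creation options.

    extra_opts stands in for the value of a hypothetical 'qcow2-create-opts'
    storage option, e.g. "extended_l2=on,cluster_size=128k".
    """
    opts = [f"preallocation={preallocation}"]
    if extra_opts:
        # Skip empty entries such as a trailing comma.
        opts.extend(o for o in extra_opts.split(",") if o)
    return ",".join(opts)


def qemu_img_create_cmd(path: str, size: str, preallocation: str = "metadata",
                        extra_opts: Optional[str] = None) -> List[str]:
    """Build an argv list of the form: qemu-img create -f qcow2 -o <opts> <path> <size>."""
    return ["qemu-img", "create", "-f", "qcow2",
            "-o", qcow2_create_options(preallocation, extra_opts),
            path, size]


if __name__ == "__main__":
    # Default behaviour stays unchanged when the option is not set.
    print(qemu_img_create_cmd("/tmp/example.qcow2", "32G"))
    # Opt-in: storage configured with extended_l2=on,cluster_size=128k.
    print(qemu_img_create_cmd("/tmp/example.qcow2", "32G",
                              extra_opts="extended_l2=on,cluster_size=128k"))
```

That way nothing changes for existing setups unless the option is set.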

> Signed-off-by: Alexandre Derumier <alexandre.derumier at groupe-cyllene.com>
> ---
>  src/PVE/Storage/Plugin.pm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
> index 6444390..31b20fe 100644
> --- a/src/PVE/Storage/Plugin.pm
> +++ b/src/PVE/Storage/Plugin.pm
> @@ -561,7 +561,7 @@ sub preallocation_cmd_option {
>  	die "preallocation mode '$prealloc' not supported by format '$fmt'\n"
>  	    if !$QCOW2_PREALLOCATION->{$prealloc};
>  
> -	return "preallocation=$prealloc";
> +	return "preallocation=$prealloc,extended_l2=on,cluster_size=128k";

Also, this doesn't really fit in the preallocation helper, since that
helper is specific to the preallocation setting.

>      } elsif ($fmt eq 'raw') {
>  	$prealloc = $prealloc // 'off';
>  	$prealloc = 'off' if $prealloc eq 'metadata';
> -- 
> 2.39.2
> 
> 



