[pve-devel] [PATCH zfsonlinux] pick bug-fixes staged for 2.2.1

Stoiko Ivanov s.ivanov at proxmox.com
Fri Nov 17 15:03:02 CET 2023


ZFS 2.2.1 is currently being prepared, but the 3 patches added here
seem quite relevant, as the issues they address might cause data loss
or panics on setups which run `zpool upgrade`.
See upstream's discussion for 2.2.1:
https://github.com/openzfs/zfs/pull/15498/
and the most critical issue:
https://github.com/openzfs/zfs/pull/15529
Finally:
https://github.com/openzfs/zfs/commit/459c99ff2339a4a514abcf2255f9b3e5324ef09e
should not hurt either.

The change to the UBSAN patch (0013) is unrelated, cosmetic only, and
happened by running export-patchqueue.

Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>
---
Minimally tested by building our current kernel with this and booting it in
2 VMs: the tunable (module parameter) is present and set to 0.
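
For anyone wanting to repeat the check above, a minimal sketch follows. It
assumes the parameter shows up under the usual sysfs location for Linux
module parameters (/sys/module/zfs/parameters). The function name
bclone_state and the PARAM_DIR override are hypothetical and exist only for
illustration; they are not part of this patch.

```shell
#!/bin/sh
# Sketch: report the state of the zfs_bclone_enabled module parameter.
# PARAM_DIR is a hypothetical override so the check can be pointed at a
# different directory; by default the standard sysfs path is used.
PARAM_DIR="${PARAM_DIR:-/sys/module/zfs/parameters}"

bclone_state() {
    param="$PARAM_DIR/zfs_bclone_enabled"
    # The file is missing if the module is not loaded or predates the tunable.
    if [ ! -r "$param" ]; then
        echo "unavailable"
        return 1
    fi
    # 0 means block cloning is disabled; any other value enables it.
    case "$(cat "$param")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}
```

Calling bclone_state on a host booted with the patched module should report
"disabled" as long as the default from patch 0017 is in effect.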
 ...und-UBSAN-errors-for-variable-arrays.patch |   5 +-
 ...g-between-unencrypted-and-encrypted-.patch |  44 ++++
 ...Add-a-tunable-to-disable-BRT-support.patch | 201 ++++++++++++++++++
 ...2.1-Disable-block-cloning-by-default.patch |  42 ++++
 debian/patches/series                         |   3 +
 5 files changed, 291 insertions(+), 4 deletions(-)
 create mode 100644 debian/patches/0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
 create mode 100644 debian/patches/0016-Add-a-tunable-to-disable-BRT-support.patch
 create mode 100644 debian/patches/0017-zfs-2.2.1-Disable-block-cloning-by-default.patch

diff --git a/debian/patches/0013-Workaround-UBSAN-errors-for-variable-arrays.patch b/debian/patches/0013-Workaround-UBSAN-errors-for-variable-arrays.patch
index 02815311..0b98c42a 100644
--- a/debian/patches/0013-Workaround-UBSAN-errors-for-variable-arrays.patch
+++ b/debian/patches/0013-Workaround-UBSAN-errors-for-variable-arrays.patch
@@ -1,4 +1,4 @@
-From 28be24aefc13b11e4c96e172cf2685994e03150d Mon Sep 17 00:00:00 2001
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
 From: Tony Hutter <hutter2 at llnl.gov>
 Date: Thu, 9 Nov 2023 16:43:35 -0800
 Subject: [PATCH] Workaround UBSAN errors for variable arrays
@@ -70,6 +70,3 @@ index c13217159..b9c284a24 100644
  # Suppress incorrect warnings from versions of objtool which are not
  # aware of x86 EVEX prefix instructions used for AVX512.
  OBJECT_FILES_NON_STANDARD_vdev_raidz_math_avx512bw.o := y
--- 
-2.39.2
-
diff --git a/debian/patches/0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch b/debian/patches/0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
new file mode 100644
index 00000000..c2fc506e
--- /dev/null
+++ b/debian/patches/0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
@@ -0,0 +1,44 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Martin=20Matu=C5=A1ka?= <mm at FreeBSD.org>
+Date: Tue, 31 Oct 2023 21:49:41 +0100
+Subject: [PATCH] Fix block cloning between unencrypted and encrypted datasets
+
+Block cloning from an encrypted dataset into an unencrypted dataset
+and vice versa is not possible. The current code did allow cloning
+unencrypted files into an encrypted dataset causing a panic when
+these were accessed. Block cloning between encrypted and encrypted
+is currently supported on the same filesystem only.
+
+Reviewed-by: Alexander Motin <mav at FreeBSD.org>
+Reviewed-by: Kay Pedersen <mail at mkwg.de>
+Reviewed-by: Rob N <robn at despairlabs.com>
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Martin Matuska <mm at FreeBSD.org>
+Closes #15464
+Closes #15465
+(cherry picked from commit 459c99ff2339a4a514abcf2255f9b3e5324ef09e)
+Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>
+---
+ module/zfs/zfs_vnops.c | 9 +++++++++
+ 1 file changed, 9 insertions(+)
+
+diff --git a/module/zfs/zfs_vnops.c b/module/zfs/zfs_vnops.c
+index 40d6c87a7..84e6b10ef 100644
+--- a/module/zfs/zfs_vnops.c
++++ b/module/zfs/zfs_vnops.c
+@@ -1094,6 +1094,15 @@ zfs_clone_range(znode_t *inzp, uint64_t *inoffp, znode_t *outzp,
+ 
+ 	ASSERT(!outzfsvfs->z_replay);
+ 
++	/*
++	 * Block cloning from an unencrypted dataset into an encrypted
++	 * dataset and vice versa is not supported.
++	 */
++	if (inos->os_encrypted != outos->os_encrypted) {
++		zfs_exit_two(inzfsvfs, outzfsvfs, FTAG);
++		return (SET_ERROR(EXDEV));
++	}
++
+ 	error = zfs_verify_zp(inzp);
+ 	if (error == 0)
+ 		error = zfs_verify_zp(outzp);
diff --git a/debian/patches/0016-Add-a-tunable-to-disable-BRT-support.patch b/debian/patches/0016-Add-a-tunable-to-disable-BRT-support.patch
new file mode 100644
index 00000000..53977479
--- /dev/null
+++ b/debian/patches/0016-Add-a-tunable-to-disable-BRT-support.patch
@@ -0,0 +1,201 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Rich Ercolani <214141+rincebrain at users.noreply.github.com>
+Date: Thu, 16 Nov 2023 14:35:22 -0500
+Subject: [PATCH] Add a tunable to disable BRT support.
+
+Copy the disable parameter that FreeBSD implemented, and extend it to
+work on Linux as well, until we're sure this is stable.
+
+Reviewed-by: Alexander Motin <mav at FreeBSD.org>
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Rich Ercolani <rincebrain at gmail.com>
+Closes #15529
+(cherry picked from commit 87e9e828655c250ce064874ff5df16f870c0a52e)
+Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>
+---
+ include/os/freebsd/zfs/sys/zfs_vfsops_os.h        |  1 +
+ include/os/linux/zfs/sys/zfs_vfsops_os.h          |  2 ++
+ man/man4/zfs.4                                    |  5 +++++
+ module/os/freebsd/zfs/zfs_vfsops.c                |  4 ++++
+ module/os/freebsd/zfs/zfs_vnops_os.c              |  5 +++++
+ module/os/linux/zfs/zfs_vnops_os.c                |  4 ++++
+ module/os/linux/zfs/zpl_file_range.c              |  5 +++++
+ tests/zfs-tests/include/libtest.shlib             | 15 +++++++++++++++
+ tests/zfs-tests/include/tunables.cfg              |  1 +
+ .../tests/functional/block_cloning/cleanup.ksh    |  4 ++++
+ .../tests/functional/block_cloning/setup.ksh      |  5 +++++
+ 11 files changed, 51 insertions(+)
+
+diff --git a/include/os/freebsd/zfs/sys/zfs_vfsops_os.h b/include/os/freebsd/zfs/sys/zfs_vfsops_os.h
+index 24bb03575..56a0ac96a 100644
+--- a/include/os/freebsd/zfs/sys/zfs_vfsops_os.h
++++ b/include/os/freebsd/zfs/sys/zfs_vfsops_os.h
+@@ -286,6 +286,7 @@ typedef struct zfid_long {
+ 
+ extern uint_t zfs_fsyncer_key;
+ extern int zfs_super_owner;
++extern int zfs_bclone_enabled;
+ 
+ extern void zfs_init(void);
+ extern void zfs_fini(void);
+diff --git a/include/os/linux/zfs/sys/zfs_vfsops_os.h b/include/os/linux/zfs/sys/zfs_vfsops_os.h
+index b4d5db21f..220466550 100644
+--- a/include/os/linux/zfs/sys/zfs_vfsops_os.h
++++ b/include/os/linux/zfs/sys/zfs_vfsops_os.h
+@@ -45,6 +45,8 @@ extern "C" {
+ typedef struct zfsvfs zfsvfs_t;
+ struct znode;
+ 
++extern int zfs_bclone_enabled;
++
+ /*
+  * This structure emulates the vfs_t from other platforms.  It's purpose
+  * is to facilitate the handling of mount options and minimize structural
+diff --git a/man/man4/zfs.4 b/man/man4/zfs.4
+index cfadd79d8..32f1765a5 100644
+--- a/man/man4/zfs.4
++++ b/man/man4/zfs.4
+@@ -1137,6 +1137,11 @@ Selecting any option other than
+ results in vector instructions
+ from the respective CPU instruction set being used.
+ .
++.It Sy zfs_bclone_enabled Ns = Ns Sy 1 Ns | Ns 0 Pq int
++Enable the experimental block cloning feature.
+If this setting is 0, then even if feature@block_cloning is enabled,
++attempts to clone blocks will act as though the feature is disabled.
++.
+ .It Sy zfs_blake3_impl Ns = Ns Sy fastest Pq string
+ Select a BLAKE3 implementation.
+ .Pp
+diff --git a/module/os/freebsd/zfs/zfs_vfsops.c b/module/os/freebsd/zfs/zfs_vfsops.c
+index e8b9ada13..09e18de81 100644
+--- a/module/os/freebsd/zfs/zfs_vfsops.c
++++ b/module/os/freebsd/zfs/zfs_vfsops.c
+@@ -89,6 +89,10 @@ int zfs_debug_level;
+ SYSCTL_INT(_vfs_zfs, OID_AUTO, debug, CTLFLAG_RWTUN, &zfs_debug_level, 0,
+ 	"Debug level");
+ 
++int zfs_bclone_enabled = 1;
++SYSCTL_INT(_vfs_zfs, OID_AUTO, bclone_enabled, CTLFLAG_RWTUN,
++	&zfs_bclone_enabled, 0, "Enable block cloning");
++
+ struct zfs_jailparam {
+ 	int mount_snapshot;
+ };
+diff --git a/module/os/freebsd/zfs/zfs_vnops_os.c b/module/os/freebsd/zfs/zfs_vnops_os.c
+index c498a1328..f672deed3 100644
+--- a/module/os/freebsd/zfs/zfs_vnops_os.c
++++ b/module/os/freebsd/zfs/zfs_vnops_os.c
+@@ -6243,6 +6243,11 @@ zfs_freebsd_copy_file_range(struct vop_copy_file_range_args *ap)
+ 	int error;
+ 	uint64_t len = *ap->a_lenp;
+ 
++	if (!zfs_bclone_enabled) {
++		mp = NULL;
++		goto bad_write_fallback;
++	}
++
+ 	/*
+ 	 * TODO: If offset/length is not aligned to recordsize, use
+ 	 * vn_generic_copy_file_range() on this fragment.
+diff --git a/module/os/linux/zfs/zfs_vnops_os.c b/module/os/linux/zfs/zfs_vnops_os.c
+index 33baac9db..76fac3a02 100644
+--- a/module/os/linux/zfs/zfs_vnops_os.c
++++ b/module/os/linux/zfs/zfs_vnops_os.c
+@@ -4229,4 +4229,8 @@ EXPORT_SYMBOL(zfs_map);
+ module_param(zfs_delete_blocks, ulong, 0644);
+ MODULE_PARM_DESC(zfs_delete_blocks, "Delete files larger than N blocks async");
+ 
++/* CSTYLED */
++module_param(zfs_bclone_enabled, uint, 0644);
++MODULE_PARM_DESC(zfs_bclone_enabled, "Enable block cloning");
++
+ #endif
+diff --git a/module/os/linux/zfs/zpl_file_range.c b/module/os/linux/zfs/zpl_file_range.c
+index c47fe99da..73476ff40 100644
+--- a/module/os/linux/zfs/zpl_file_range.c
++++ b/module/os/linux/zfs/zpl_file_range.c
+@@ -31,6 +31,8 @@
+ #include <sys/zfs_vnops.h>
+ #include <sys/zfeature.h>
+ 
++int zfs_bclone_enabled = 1;
++
+ /*
+  * Clone part of a file via block cloning.
+  *
+@@ -50,6 +52,9 @@ __zpl_clone_file_range(struct file *src_file, loff_t src_off,
+ 	fstrans_cookie_t cookie;
+ 	int err;
+ 
++	if (!zfs_bclone_enabled)
++		return (-EOPNOTSUPP);
++
+ 	if (!spa_feature_is_enabled(
+ 	    dmu_objset_spa(ITOZSB(dst_i)->z_os), SPA_FEATURE_BLOCK_CLONING))
+ 		return (-EOPNOTSUPP);
+diff --git a/tests/zfs-tests/include/libtest.shlib b/tests/zfs-tests/include/libtest.shlib
+index 844caa17d..d5d7bb6c8 100644
+--- a/tests/zfs-tests/include/libtest.shlib
++++ b/tests/zfs-tests/include/libtest.shlib
+@@ -3334,6 +3334,21 @@ function set_tunable_impl
+ 	esac
+ }
+ 
++function save_tunable
++{
++	[[ ! -d $TEST_BASE_DIR ]] && return 1
++	[[ -e $TEST_BASE_DIR/tunable-$1 ]] && return 2
++	echo "$(get_tunable """$1""")" > "$TEST_BASE_DIR"/tunable-"$1"
++}
++
++function restore_tunable
++{
++	[[ ! -e $TEST_BASE_DIR/tunable-$1 ]] && return 1
++	val="$(cat $TEST_BASE_DIR/tunable-"""$1""")"
++	set_tunable64 "$1" "$val"
++	rm $TEST_BASE_DIR/tunable-$1
++}
++
+ #
+ # Get a global system tunable
+ #
+diff --git a/tests/zfs-tests/include/tunables.cfg b/tests/zfs-tests/include/tunables.cfg
+index 80e7bcb3b..a0edad14d 100644
+--- a/tests/zfs-tests/include/tunables.cfg
++++ b/tests/zfs-tests/include/tunables.cfg
+@@ -90,6 +90,7 @@ VOL_INHIBIT_DEV			UNSUPPORTED			zvol_inhibit_dev
+ VOL_MODE			vol.mode			zvol_volmode
+ VOL_RECURSIVE			vol.recursive			UNSUPPORTED
+ VOL_USE_BLK_MQ			UNSUPPORTED			zvol_use_blk_mq
++BCLONE_ENABLED			zfs_bclone_enabled		zfs_bclone_enabled
+ XATTR_COMPAT			xattr_compat			zfs_xattr_compat
+ ZEVENT_LEN_MAX			zevent.len_max			zfs_zevent_len_max
+ ZEVENT_RETAIN_MAX		zevent.retain_max		zfs_zevent_retain_max
+diff --git a/tests/zfs-tests/tests/functional/block_cloning/cleanup.ksh b/tests/zfs-tests/tests/functional/block_cloning/cleanup.ksh
+index 7ac13adb6..b985445a5 100755
+--- a/tests/zfs-tests/tests/functional/block_cloning/cleanup.ksh
++++ b/tests/zfs-tests/tests/functional/block_cloning/cleanup.ksh
+@@ -31,4 +31,8 @@ verify_runnable "global"
+ 
+ default_cleanup_noexit
+ 
++if tunable_exists BCLONE_ENABLED ; then
++	log_must restore_tunable BCLONE_ENABLED
++fi
++
+ log_pass
+diff --git a/tests/zfs-tests/tests/functional/block_cloning/setup.ksh b/tests/zfs-tests/tests/functional/block_cloning/setup.ksh
+index 512f5a064..58441bf8f 100755
+--- a/tests/zfs-tests/tests/functional/block_cloning/setup.ksh
++++ b/tests/zfs-tests/tests/functional/block_cloning/setup.ksh
+@@ -33,4 +33,9 @@ fi
+ 
+ verify_runnable "global"
+ 
++if tunable_exists BCLONE_ENABLED ; then
++    log_must save_tunable BCLONE_ENABLED
++    log_must set_tunable32 BCLONE_ENABLED 1
++fi
++
+ log_pass
diff --git a/debian/patches/0017-zfs-2.2.1-Disable-block-cloning-by-default.patch b/debian/patches/0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
new file mode 100644
index 00000000..53a088da
--- /dev/null
+++ b/debian/patches/0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
@@ -0,0 +1,42 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Tony Hutter <hutter2 at llnl.gov>
+Date: Thu, 16 Nov 2023 11:42:19 -0800
+Subject: [PATCH] zfs-2.2.1: Disable block cloning by default
+
+Disable block cloning by default to mitigate possible data corruption
+(see #15529 and #15526).
+
+Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
+(cherry picked from commit 479dca51c66a731e637bd2d4f9bba01a05f9ac9f)
+Signed-off-by: Stoiko Ivanov <s.ivanov at proxmox.com>
+---
+ module/os/freebsd/zfs/zfs_vfsops.c   | 2 +-
+ module/os/linux/zfs/zpl_file_range.c | 2 +-
+ 2 files changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/module/os/freebsd/zfs/zfs_vfsops.c b/module/os/freebsd/zfs/zfs_vfsops.c
+index 09e18de81..0ac670ed9 100644
+--- a/module/os/freebsd/zfs/zfs_vfsops.c
++++ b/module/os/freebsd/zfs/zfs_vfsops.c
+@@ -89,7 +89,7 @@ int zfs_debug_level;
+ SYSCTL_INT(_vfs_zfs, OID_AUTO, debug, CTLFLAG_RWTUN, &zfs_debug_level, 0,
+ 	"Debug level");
+ 
+-int zfs_bclone_enabled = 1;
++int zfs_bclone_enabled = 0;
+ SYSCTL_INT(_vfs_zfs, OID_AUTO, bclone_enabled, CTLFLAG_RWTUN,
+ 	&zfs_bclone_enabled, 0, "Enable block cloning");
+ 
+diff --git a/module/os/linux/zfs/zpl_file_range.c b/module/os/linux/zfs/zpl_file_range.c
+index 73476ff40..139c51cf4 100644
+--- a/module/os/linux/zfs/zpl_file_range.c
++++ b/module/os/linux/zfs/zpl_file_range.c
+@@ -31,7 +31,7 @@
+ #include <sys/zfs_vnops.h>
+ #include <sys/zfeature.h>
+ 
+-int zfs_bclone_enabled = 1;
++int zfs_bclone_enabled = 0;
+ 
+ /*
+  * Clone part of a file via block cloning.
diff --git a/debian/patches/series b/debian/patches/series
index 5927d521..d20a6054 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -12,3 +12,6 @@
 0012-Fix-nfs_truncate_shares-without-etc-exports.d.patch
 0013-Workaround-UBSAN-errors-for-variable-arrays.patch
 0014-zpool-status-tighten-bounds-for-noalloc-stat-availab.patch
+0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
+0016-Add-a-tunable-to-disable-BRT-support.patch
+0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
-- 
2.39.2





