[pve-devel] [PATCH kernel 1/2] add fix for CIFS client memory leak

Fiona Ebner f.ebner at proxmox.com
Wed Jul 10 13:37:08 CEST 2024


As reported in the community forum [0], there currently is a memory
leak in the CIFS client code. It can be reproduced by running a backup
with a CIFS target storage:

> while true; do vzdump 101 --storage cifs --prune-backups keep-last=1; echo 3 > /proc/sys/vm/drop_caches; done
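
Each iteration leaks pagecache that not even the drop_caches call above
can reclaim. The growth can be watched with the same check used in the
fix's commit message below:

> echo 3 > /proc/sys/vm/drop_caches ; cat /proc/meminfo | grep file

where Inactive(file) keeps growing across iterations.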

A fix tagged for stable v6.6+ was found on the kernel mailing list;
it does solve the issue, but is not yet included in any (stable)
kernel.

[0]: https://forum.proxmox.com/threads/147603/post-682388

Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
---
 ...ix-pagecache-leak-when-do-writepages.patch | 108 ++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100644 patches/kernel/0018-cifs-fix-pagecache-leak-when-do-writepages.patch

diff --git a/patches/kernel/0018-cifs-fix-pagecache-leak-when-do-writepages.patch b/patches/kernel/0018-cifs-fix-pagecache-leak-when-do-writepages.patch
new file mode 100644
index 0000000..495dd71
--- /dev/null
+++ b/patches/kernel/0018-cifs-fix-pagecache-leak-when-do-writepages.patch
@@ -0,0 +1,108 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Yang Erkun <yangerkun at huawei.com>
+Date: Tue, 25 Jun 2024 11:43:32 +0800
+Subject: [PATCH] cifs: fix pagecache leak when do writepages
+
+After commit f3dc1bdb6b0b ("cifs: Fix writeback data corruption"), the
+writepages path for cifs finds all folios that need writeback in two
+phases: the first folio is found in cifs_writepages_begin, and the
+following folios are found in cifs_extend_writeback.
+
+All of these paths first take a reference on the folio, and in the
+normal case the reference should be put again once writeback has been
+set and the actual write has been done. The folios found in
+cifs_extend_writeback get this done via folio_batch_release, but the
+folio found in cifs_writepages_begin never gets that chance. So every
+writepages call leaks a folio (found while running xfstests over cifs;
+the output below shows about 600M+ leaked per run of generic/074).
+
+echo 3 > /proc/sys/vm/drop_caches ; cat /proc/meminfo | grep file
+Active(file):      34092 kB
+Inactive(file):   176192 kB
+./check generic/074 (smb v1)
+...
+generic/074 50s ...  53s
+Ran: generic/074
+Passed all 1 tests
+
+echo 3 > /proc/sys/vm/drop_caches ; cat /proc/meminfo | grep file
+Active(file):      35036 kB
+Inactive(file):   854708 kB
+
+Besides, the error exit paths never handle this folio correctly either;
+fix that too with this patch.
+
+The problem does not exist in mainline, since the writepages path for
+cifs has been changed to netfs (3ee1a1fc3981 ("cifs: Cut over to using
+netfslib")). It's hard to backport all the related changes, so try to
+fix this problem with this single patch.
+
+Fixes: f3dc1bdb6b0b ("cifs: Fix writeback data corruption")
+Cc: stable at kernel.org # v6.6+
+Signed-off-by: Yang Erkun <yangerkun at huawei.com>
+(picked from https://lore.kernel.org/linux-cifs/20240625034332.750312-1-yangerkun@huawei.com/)
+Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
+---
+ fs/smb/client/file.c | 16 +++++++++++++---
+ 1 file changed, 13 insertions(+), 3 deletions(-)
+
+diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
+index af5c476db6e6..8aee0f520300 100644
+--- a/fs/smb/client/file.c
++++ b/fs/smb/client/file.c
+@@ -2845,17 +2845,21 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
+ 	rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY, &cfile);
+ 	if (rc) {
+ 		cifs_dbg(VFS, "No writable handle in writepages rc=%d\n", rc);
++		folio_unlock(folio);
+ 		goto err_xid;
+ 	}
+ 
+ 	rc = server->ops->wait_mtu_credits(server, cifs_sb->ctx->wsize,
+ 					   &wsize, credits);
+-	if (rc != 0)
++	if (rc != 0) {
++		folio_unlock(folio);
+ 		goto err_close;
++	}
+ 
+ 	wdata = cifs_writedata_alloc(cifs_writev_complete);
+ 	if (!wdata) {
+ 		rc = -ENOMEM;
++		folio_unlock(folio);
+ 		goto err_uncredit;
+ 	}
+ 
+@@ -3002,17 +3006,22 @@ static ssize_t cifs_writepages_begin(struct address_space *mapping,
+ lock_again:
+ 	if (wbc->sync_mode != WB_SYNC_NONE) {
+ 		ret = folio_lock_killable(folio);
+-		if (ret < 0)
++		if (ret < 0) {
++			folio_put(folio);
+ 			return ret;
++		}
+ 	} else {
+-		if (!folio_trylock(folio))
++		if (!folio_trylock(folio)) {
++			folio_put(folio);
+ 			goto search_again;
++		}
+ 	}
+ 
+ 	if (folio->mapping != mapping ||
+ 	    !folio_test_dirty(folio)) {
+ 		start += folio_size(folio);
+ 		folio_unlock(folio);
++		folio_put(folio);
+ 		goto search_again;
+ 	}
+ 
+@@ -3042,6 +3051,7 @@ static ssize_t cifs_writepages_begin(struct address_space *mapping,
+ out:
+ 	if (ret > 0)
+ 		*_start = start + ret;
++	folio_put(folio);
+ 	return ret;
+ }
+ 
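
For illustration only (this is not the kernel code and not part of the
patch): a minimal userspace C sketch of the reference-counting rule the
fix enforces, namely that the folio reference taken when writeback
starts must be dropped on every path out of the function, error paths
included. All names here (toy_folio, toy_get, toy_put, writeback_one)
are made up for this sketch.

#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for struct folio; only the reference count matters here. */
struct toy_folio {
	int refcount;
};

static void toy_get(struct toy_folio *f)
{
	f->refcount++;
}

static void toy_put(struct toy_folio *f)
{
	f->refcount--;
	if (f->refcount == 0)
		free(f); /* memory can be reclaimed again */
}

/*
 * Mimics the fixed cifs_writepages_begin(): the reference taken via
 * toy_get() is dropped on every exit path. Before the fix, the early
 * returns (and the normal path for the first folio) returned without
 * the put, leaking one folio per writepages call.
 */
static int writeback_one(struct toy_folio *f, int fail)
{
	toy_get(f); /* like the folio lookup taking a reference */

	if (fail) {
		toy_put(f); /* put on the error path, as added by the fix */
		return -1;
	}

	/* ... set writeback and submit the actual write here ... */

	toy_put(f); /* put on the normal path, as added by the fix */
	return 0;
}

int main(void)
{
	struct toy_folio *f = malloc(sizeof(*f));

	f->refcount = 1; /* reference held by the (toy) pagecache */
	writeback_one(f, 0);
	writeback_one(f, 1);
	/* balanced get/put: only the pagecache reference remains */
	printf("refcount after writeback: %d\n", f->refcount);
	toy_put(f);
	return 0;
}

Dropping either toy_put() reproduces the bug shape: each writeback_one()
call then leaves the refcount one higher, so the folio can never be
freed, which is the per-call leak described in the commit message above.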
-- 
2.39.2