[pve-devel] [PATCH pve-kernel] fix #5558: cherry-pick NFSv4 fix

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Jul 11 10:22:57 CEST 2024


picked from v6.9.8. according to upstream, the bug can cause lost NFS
connections; according to our user report, it may also corrupt backups.

Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
---
numbered after Fiona's two cherry-picks already on the list, assuming those
will all be applied in one go ;)

 ...0-SUNRPC-Fix-backchannel-reply-again.patch | 58 +++++++++++++++++++
 1 file changed, 58 insertions(+)
 create mode 100644 patches/kernel/0020-SUNRPC-Fix-backchannel-reply-again.patch

diff --git a/patches/kernel/0020-SUNRPC-Fix-backchannel-reply-again.patch b/patches/kernel/0020-SUNRPC-Fix-backchannel-reply-again.patch
new file mode 100644
index 0000000..7fe2703
--- /dev/null
+++ b/patches/kernel/0020-SUNRPC-Fix-backchannel-reply-again.patch
@@ -0,0 +1,58 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Chuck Lever <chuck.lever at oracle.com>
+Date: Wed, 19 Jun 2024 09:51:08 -0400
+Subject: [PATCH] SUNRPC: Fix backchannel reply, again
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+[ Upstream commit 6ddc9deacc1312762c2edd9de00ce76b00f69f7c ]
+
+I still see "RPC: Could not send backchannel reply error: -110"
+quite often, along with slow-running tests. Debugging shows that the
+backchannel is still stumbling when it has to queue a callback reply
+on a busy transport.
+
+Note that every one of these timeouts causes a connection loss by
+virtue of the xprt_conditional_disconnect() call in that arm of
+call_cb_transmit_status().
+
+I found that setting to_maxval is necessary to get the RPC timeout
+logic to behave whenever to_exponential is not set.
+
+Fixes: 57331a59ac0d ("NFSv4.1: Use the nfs_client's rpc timeouts for backchannel")
+Signed-off-by: Chuck Lever <chuck.lever at oracle.com>
+Reviewed-by: Benjamin Coddington <bcodding at redhat.com>
+Signed-off-by: Trond Myklebust <trond.myklebust at hammerspace.com>
+Signed-off-by: Sasha Levin <sashal at kernel.org>
+(cherry picked from commit bd1e42e0f2567c911d3df761cf7a33b021fdceeb)
+Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
+---
+ net/sunrpc/svc.c | 5 ++++-
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
+index b969e505c7b7..5fc974dda811 100644
+--- a/net/sunrpc/svc.c
++++ b/net/sunrpc/svc.c
+@@ -1548,9 +1548,11 @@ void svc_process(struct svc_rqst *rqstp)
+  */
+ void svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp)
+ {
++	struct rpc_timeout timeout = {
++		.to_increment		= 0,
++	};
+ 	struct rpc_task *task;
+ 	int proc_error;
+-	struct rpc_timeout timeout;
+ 
+ 	/* Build the svc_rqst used by the common processing routine */
+ 	rqstp->rq_xid = req->rq_xid;
+@@ -1603,6 +1605,7 @@ void svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp)
+ 		timeout.to_initval = req->rq_xprt->timeout->to_initval;
+ 		timeout.to_retries = req->rq_xprt->timeout->to_retries;
+ 	}
++	timeout.to_maxval = timeout.to_initval;
+ 	memcpy(&req->rq_snd_buf, &rqstp->rq_res, sizeof(req->rq_snd_buf));
+ 	task = rpc_run_bc_task(req, &timeout);
+ 
-- 
2.39.2
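
The upstream commit message says that setting to_maxval is needed for the RPC timeout logic to behave whenever to_exponential is not set. The following is a simplified user-space model of that effect, not the kernel's code. The names `model_timeout` and `model_major_timeout` are hypothetical, and the clamping rule is an assumption modeled on how the overall ("major") timeout appears to be derived from a `struct rpc_timeout`.

```c
/* Hypothetical stand-in for the kernel's struct rpc_timeout
 * (field names mirror the ones touched by the patch). */
struct model_timeout {
	unsigned long to_initval;	/* initial timeout */
	unsigned long to_maxval;	/* upper bound for the timeout */
	unsigned long to_increment;	/* linear backoff step */
	unsigned int to_retries;	/* retries before a major timeout */
	unsigned char to_exponential;	/* non-zero: exponential backoff */
};

/* Assumed rule: grow the initial timeout linearly (or exponentially),
 * then clamp it to to_maxval. A to_maxval of 0 therefore clamps every
 * non-zero timeout down to 0, i.e. the request times out immediately. */
static unsigned long model_major_timeout(const struct model_timeout *to)
{
	unsigned long major = to->to_initval;

	if (to->to_exponential)
		major <<= to->to_retries;
	else
		major += to->to_increment * to->to_retries;
	if (major > to->to_maxval || major == 0)
		major = to->to_maxval;
	return major;
}
```

Under this model, a timeout with to_initval = 60, to_retries = 2 and all other fields zero yields 0, while additionally setting to_maxval = to_initval (as the second hunk does) yields 60. The zero-initializing declaration in the first hunk matters for the same reason: as far as the hunks show, the remaining fields were previously left uninitialized on the stack.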




