[pve-devel] proxmox6 : vhost_net experimental_zcopytx=1 bug?

Alexandre DERUMIER aderumier at odiso.com
Wed Aug 21 10:01:37 CEST 2019


>>I do not really see the connection, though... O.o 
>>Could also be an issue with our systemd related code, or with systemd's 
>>scope handling (yet again)..

For the user who reported this stack trace:
Aug 17 11:45:26 server kernel: Call Trace:
Aug 17 11:45:26 server kernel: __schedule+0x2d4/0x870
Aug 17 11:45:26 server kernel: ? wait_for_completion+0xc2/0x140
Aug 17 11:45:26 server kernel: ? wake_up_q+0x80/0x80
Aug 17 11:45:26 server kernel: schedule+0x2c/0x70
Aug 17 11:45:26 server kernel: vhost_net_ubuf_put_and_wait+0x60/0x90 [vhost_net]
Aug 17 11:45:26 server kernel: ? wait_woken+0x80/0x80
Aug 17 11:45:26 server kernel: vhost_net_ioctl+0x5fe/0xa50 [vhost_net]
Aug 17 11:45:26 server kernel: ? send_signal+0x3e/0x80
Aug 17 11:45:26 server kernel: do_vfs_ioctl+0xa9/0x640
Aug 17 11:45:26 server kernel: ksys_ioctl+0x67/0x90
Aug 17 11:45:26 server kernel: __x64_sys_ioctl+0x1a/0x20
Aug 17 11:45:26 server kernel: do_syscall_64+0x5a/0x110
Aug 17 11:45:26 server kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 17 11:45:26 server kernel: RIP: 0033:0x7f03ce3bc427
Aug 17 11:45:26 server kernel: Code: Bad RIP value.

there was a similar bug in kernel 4.14 (with the same stack trace, in vhost_net_ubuf_put_and_wait):
https://bugzilla.redhat.com/show_bug.cgi?id=1494974

(mainly at VM shutdown, but I think the user hits the problem when trying to shut down, then start again)

At that time, it was caused by this kernel patch:
"Date:   Thu Aug 3 16:29:38 2017 -0400

    sock: skb_copy_ubufs support for compound pages
    
    Refine skb_copy_ubufs to support compound pages. With upcoming TCP
    zerocopy sendmsg, such fragments may appear.
    
    The existing code replaces each page one for one. Splitting each
    compound page into an independent number of regular pages can result
    in exceeding limit MAX_SKB_FRAGS if data is not exactly page aligned.
    
    Instead, fill all destination pages but the last to PAGE_SIZE.
    Split the existing alloc + copy loop into separate stages:
    1. compute bytelength and minimum number of pages to store this.
    2. allocate
    3. copy, filling each page except the last to PAGE_SIZE bytes
    4. update skb frag array"
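The MAX_SKB_FRAGS overflow the patch describes can be illustrated with some quick arithmetic (a sketch assuming the usual 4 KiB page size and MAX_SKB_FRAGS value of 17; the concrete numbers are mine, not from the patch):

```shell
# Why one-for-one page replacement can overflow MAX_SKB_FRAGS
# (assumed values: 4 KiB pages, MAX_SKB_FRAGS = 17).
PAGE=4096
MAX_SKB_FRAGS=17
len=$((17 * PAGE))   # payload spanning exactly 17 pages' worth of data
off=100              # ...but starting 100 bytes into the first page

# Old behaviour: replace pages one for one; unaligned data spills
# into one extra destination page.
one_for_one=$(( (off + len + PAGE - 1) / PAGE ))

# New behaviour: fill every destination page but the last to PAGE_SIZE,
# so only the minimum number of pages is needed.
filled=$(( (len + PAGE - 1) / PAGE ))

echo "one-for-one: $one_for_one pages (limit $MAX_SKB_FRAGS), filled: $filled"
# → one-for-one: 18 pages (limit 17), filled: 17
```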


That was fixed, but maybe there is a new regression in recent kernels?
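For anyone hitting this, the workaround the affected users applied (experimental_zcopytx=0, mentioned in the forum thread) can be set roughly like this (a sketch; the sysfs and modprobe.d paths are the standard module-parameter locations, not Proxmox-specific):

```shell
# Check the current value (1 = zerocopy TX enabled):
cat /sys/module/vhost_net/parameters/experimental_zcopytx

# Disable it persistently; takes effect once the module is reloaded
# (or after a reboot):
echo "options vhost_net experimental_zcopytx=0" > /etc/modprobe.d/vhost-net.conf
```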



----- Original message -----
From: "Wolfgang Bumiller" <w.bumiller at proxmox.com>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, 21 August 2019 09:31:04
Subject: Re: [pve-devel] proxmox6 : vhost_net  experimental_zcopytx=1 bug?

On Tue, Aug 20, 2019 at 11:32:57AM +0200, Alexandre DERUMIER wrote: 
> Hi, 
> 
> Some users have reported a VM start timeout on Proxmox 6 
> 
> https://forum.proxmox.com/threads/vm-doesnt-start-proxmox-6-timeout-waiting-on-systemd.56218/ 
> 
> and for at least 2 of them, 
> 
> setting the vhost_net module option experimental_zcopytx=0 has fixed it. 

I do not really see the connection, though... O.o 
Could also be an issue with our systemd related code, or with systemd's 
scope handling (yet again)... 

> It has been enabled by default since 2012, so maybe there's a regression somewhere. (Also, Red Hat disables it by default.) 
> 
> 
> It seems that in kernel 5.1, the default has been rolled back to 0: 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.2.9&id=9d28094f3cf3b315e784f04ba617de8c8d8978fa 
> 
> Not sure if this patch could be backported to the proxmox 5.0 kernel? 

Makes sense IMO. After all it's an upstream decision made due to known issues. 
