[pve-devel] [PATCH] add virtio-vhost-user nic support
Alexandre DERUMIER
aderumier at odiso.com
Fri May 6 11:38:07 CEST 2016
>>I was confused for a minute by this. This modifies the variable outside
>>the loop to be used afterwards to decide on the $numa_object.
I wonder if it could be better to enable hugepage with a new vm config option.
It can be used without vhost_user (some very specific vm workloads can benefit from hugepages).
It could also help with vhost_user nic hotplug.
(currently, start a vm without vhost_user, don't enable hugepage, so hotplug a vhost_user nic after that, will not work)
Any opininion ?
----- Mail original -----
De: "aderumier" <aderumier at odiso.com>
À: "pve-devel" <pve-devel at pve.proxmox.com>
Envoyé: Jeudi 5 Mai 2016 07:15:18
Objet: Re: [pve-devel] [PATCH] add virtio-vhost-user nic support
>>$use_virtio was used in past, to disabling use of openvz cfq scheduler, as far I remember.
>>I think we have forgot to remove it
https://git.proxmox.com/?p=qemu-server.git;a=blobdiff;f=PVE/QemuServer.pm;h=94b9176b7b76d139c0df8e4aba0a948c7ae98356;hp=fe401402830df96f24ba020f42fa4d37537c8a7e;hb=ba02e591f0b53d9b0e783bffa5b3fdf2620972e6;hpb=c8effec354c6323124187cf77facd78a491a2262
----- Mail original -----
De: "aderumier" <aderumier at odiso.com>
À: "Wolfgang Bumiller" <w.bumiller at proxmox.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Envoyé: Jeudi 5 Mai 2016 07:14:29
Objet: Re: [pve-devel] [PATCH] add virtio-vhost-user nic support
Thanks for the comment, I'll rebase my patch
>>This code already contains the dead $use_virtio variable taken over from
>>its previous hunk (found somewhere below) which as far as I can tell is
>>only ever written and never actually used, so it should probably be
>>removed (including its declaration somewhere further up in the
>>function).
$use_virtio was used in past, to disabling use of openvz cfq scheduler, as far I remember.
I think we have forgot to remove it
----- Mail original -----
De: "Wolfgang Bumiller" <w.bumiller at proxmox.com>
À: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Envoyé: Mercredi 4 Mai 2016 16:09:27
Objet: Re: [pve-devel] [PATCH] add virtio-vhost-user nic support
On Tue, May 03, 2016 at 05:05:38PM +0200, Alexandre Derumier wrote:
> vm config:
>
> net0: virtio-vhost-user=66:32:37:61:66:64,bridge=vmbr0,tag=10
> numa: 1
>
> virtio-vhost-user is userland only, so no tap interface
> Numa need to be enabled, and all vm memory is mapped to hugepages
>
> The qemu netdev script/downscript don't work with vhost-user
>
> System configuration:
>
> hugepages need to be mounted. (can be 2MB or 1GB Hupages)
>
> /etc/fstab
> ----------
> hugetlbfs /dev/hugepages hugetlbfs pagesize=2048k 0 0
>
> sysctl
> ------
> echo 2048 > /proc/sys/vm/nr_hugepages
>
> Openvswitch need to have dpdk enabled
>
> /etc/default/openvswitch-switch
> --------------------------------
> DPDK_OPTS='--dpdk -c 0x1 -n 4'
>
> ovs switch need to have datapath_type=netdev
>
> /etc/network/interfaces
> -----------------------
> auto vmbr0
> iface vmbr10 inet manual
> ovs_type OVSBridge
> ovs_ports eth0
> ovs_extra set bridge vmbr10 datapath_type=netdev
>
> Signed-off-by: Alexandre Derumier <aderumier at odiso.com>
> ---
> PVE/QemuServer.pm | 108 +++++++++++++++++++++++++++++++++++++++++-------------
> 1 file changed, 83 insertions(+), 25 deletions(-)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 86c2f32..9070dec 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -564,7 +564,7 @@ for (my $i = 0; $i < $MAX_NUMA; $i++) {
>
> my $nic_model_list = ['rtl8139', 'ne2k_pci', 'e1000', 'pcnet', 'virtio',
> 'ne2k_isa', 'i82551', 'i82557b', 'i82559er', 'vmxnet3',
> - 'e1000-82540em', 'e1000-82544gc', 'e1000-82545em'];
> + 'e1000-82540em', 'e1000-82544gc', 'e1000-82545em', 'virtio-vhost-user'];
> my $nic_model_list_txt = join(' ', sort @$nic_model_list);
>
> my $net_fmt = {
> @@ -1565,9 +1565,9 @@ sub print_netdevice_full {
> my $bootorder = $conf->{boot} || $confdesc->{boot}->{default};
>
> my $device = $net->{model};
> - if ($net->{model} eq 'virtio') {
> + if ($net->{model} =~ m/^virtio/) {
> $device = 'virtio-net-pci';
> - };
> + };
You fixed the indentation but forgot about the semicolon :p
> my $pciaddr = print_pci_addr("$netid", $bridges);
> my $tmpstr = "$device,mac=$net->{macaddr},netdev=$netid$pciaddr,id=$netid";
> @@ -1622,7 +1622,12 @@ sub print_netdev_full {
> my $script = $hotplug ? "pve-bridge-hotplug" : "pve-bridge";
>
> if ($net->{bridge}) {
> - $netdev = "type=tap,id=$netid,ifname=${ifname},script=/var/lib/qemu-server/$script,downscript=/var/lib/qemu-server/pve-bridgedown$vhostparam";
> + if ($net->{model} eq 'virtio-vhost-user') {
> + $netdev = "type=vhost-user,id=$netid,chardev=charvhostuser$i,vhostforce";
> + } else {
> + $netdev = "type=tap,id=$netid,ifname=${ifname},script=/var/lib/qemu-server/$script,downscript=/var/lib/qemu-server/pve-bridgedown$vhostparam";
> + }
> +
> } else {
> $netdev = "type=user,id=$netid,hostname=$vmname";
> }
> @@ -3122,6 +3127,36 @@ sub config_to_command {
>
> push @$cmd, '-cpu', $cpu;
>
> + my $use_vhost_user = undef;
> +
> + for (my $i = 0; $i < $MAX_NETS; $i++) {
> + next if !$conf->{"net$i"};
> + my $d = parse_net($conf->{"net$i"});
> + next if !$d;
> +
> + $use_virtio = 1 if $d->{model} eq 'virtio';
> + $use_vhost_user = 1 if $d->{model} eq 'virtio-vhost-user';
I was confused for a minute by this. This modifies the variable outside
the loop to be used afterwards to decide on the $numa_object.
What bugs me is that it's very close to an `if()` containing the exact
same condition and with the real use of this variable being way down in
the next hunk this relation might be missed in a future patch which
could potentially break this.
I'm mentioning this because of ease of maintenance:
This code already contains the dead $use_virtio variable taken over from
its previous hunk (found somewhere below) which as far as I can tell is
only ever written and never actually used, so it should probably be
removed (including its declaration somewhere further up in the
function).
So my suggestion is this:
-) rename the outer variable to $have_vhost_user
-) remove $use_virtio (unless I'm wrong and it has a purpose?)
-) change this piece to:
my $vhost_user = ($d->{model} eq 'virtio-vhost-user');
$have_vhost_user ||= $vhost_user;
-) change the if() marked below to use $vhost_user
This also prevents people from accidentally "optimizing" the if() below
to use $use_vhost_user ;-).
> +
> + if ($bootindex_hash->{n}) {
> + $d->{bootindex} = $bootindex_hash->{n};
> + $bootindex_hash->{n} += 1;
> + }
> +
> + if ($d->{model} eq 'virtio-vhost-user') {
see above comment
Also you're now switching indentation from from 9 spaces to 1 tab.
(The for-loop body should begin at 1-tab instead.)
> + my $vhostuser = "vhost-user".$vmid."i".$i;
> + push @$devices, '-chardev', "socket,id=charvhostuser$i,path=/var/run/openvswitch/$vhostuser";
> +
> + }
> + my $netdevfull = print_netdev_full($vmid,$conf,$d,"net$i");
> + push @$devices, '-netdev', $netdevfull;
> +
> + my $netdevicefull = print_netdevice_full($vmid, $conf, $d, "net$i", $bridges, $use_old_bios_files);
> + push @$devices, '-device', $netdevicefull;
> + }
> +
> + die "Numa need to be enabled ot use vhost-user nics" if $use_vhost_user && !$conf->{numa};
"needs"; "to" vs "ot".
> +
> +
> my $memory = $conf->{memory} || $defaults->{memory};
> my $static_memory = 0;
> my $dimm_memory = 0;
> @@ -3151,7 +3186,14 @@ sub config_to_command {
> die "missing numa node$i memory value\n" if !$numa->{memory};
> my $numa_memory = $numa->{memory};
> $numa_totalmemory += $numa_memory;
> - my $numa_object = "memory-backend-ram,id=ram-node$i,size=${numa_memory}M";
> +
> + my $numa_object = '';
> +
> + if ($use_vhost_user) {
> + $numa_object = "memory-backend-file,id=ram-node$i,size=${numa_memory}M,mem-path=/dev/hugepages,share=on";
> + } else {
> + $numa_object = "memory-backend-ram,id=ram-node$i,size=${numa_memory}M";
> + }
>
> # cpus
> my $cpulists = $numa->{cpus};
> @@ -3201,10 +3243,19 @@ sub config_to_command {
> my $cpus = $cpustart;
> $cpus .= "-$cpuend" if $cpuend;
>
> - push @$cmd, '-object', "memory-backend-ram,size=$numa_memory,id=ram-node$i";
> + my $numa_object = '';
> + if ($use_vhost_user) {
> + $numa_object = "memory-backend-file,size=$numa_memory,id=ram-node$i,mem-path=/dev/hugepages,share=on";
> + } else {
> + $numa_object = "memory-backend-ram,size=$numa_memory,id=ram-node$i";
> + }
> +
> + push @$cmd, '-object', $numa_object;
> push @$cmd, '-numa', "node,nodeid=$i,cpus=$cpus,memdev=ram-node$i";
> }
> }
> +
> + push @$devices, '-mem-prealloc' if $use_vhost_user;
> }
>
> if ($hotplug_features->{memory}) {
> @@ -3352,25 +3403,6 @@ sub config_to_command {
> push @$devices, '-device', print_drivedevice_full($storecfg, $conf, $vmid, $drive, $bridges);
> });
>
> - for (my $i = 0; $i < $MAX_NETS; $i++) {
> - next if !$conf->{"net$i"};
> - my $d = parse_net($conf->{"net$i"});
> - next if !$d;
> -
> - $use_virtio = 1 if $d->{model} eq 'virtio';
> -
> - if ($bootindex_hash->{n}) {
> - $d->{bootindex} = $bootindex_hash->{n};
> - $bootindex_hash->{n} += 1;
> - }
> -
> - my $netdevfull = print_netdev_full($vmid,$conf,$d,"net$i");
> - push @$devices, '-netdev', $netdevfull;
> -
> - my $netdevicefull = print_netdevice_full($vmid, $conf, $d, "net$i", $bridges, $use_old_bios_files);
> - push @$devices, '-device', $netdevicefull;
> - }
> -
> if (!$q35) {
> # add pci bridges
> if (qemu_machine_feature_enabled ($machine_type, $kvmver, 2, 3)) {
> @@ -4571,6 +4603,21 @@ sub vm_start {
> }
> }
>
> + #ovs vhost-user
> + for (my $i = 0; $i < $MAX_NETS; $i++) {
> + next if !$conf->{"net$i"};
> + my $d = parse_net($conf->{"net$i"});
> + next if !$d;
> + if ($d->{model} eq 'virtio-vhost-user') {
> + my $interface = "vhost-user".$vmid."i".$i;
> + #remove old uncleaned port
> + if(-e "/var/run/openvswitch/$interface") {
> + run_command("/usr/bin/ovs-vsctl del-port $interface", outfunc => sub {}, errfunc => sub {});
> + }
> + PVE::Network::ovs_bridge_add_port($d->{bridge}, $interface, $d->{vlan}, 'dpdkvhostuser', $d->{trunks});
> + }
> + }
> +
> PVE::Storage::activate_volumes($storecfg, $vollist);
>
> if (!check_running($vmid, 1) && -d "/sys/fs/cgroup/systemd/qemu.slice/$vmid.scope") {
> @@ -4746,6 +4793,17 @@ sub vm_stop_cleanup {
> PVE::Storage::deactivate_volumes($storecfg, $vollist);
> }
>
> + #ovs vhost-user
> + for (my $i = 0; $i < $MAX_NETS; $i++) {
> + next if !$conf->{"net$i"};
> + my $d = parse_net($conf->{"net$i"});
> + next if !$d;
> + if ($d->{model} eq 'virtio-vhost-user') {
> + my $ovsintport = "vhost-user".$vmid."i".$i;
> + run_command("/usr/bin/ovs-vsctl del-port $ovsintport", outfunc => sub {}, errfunc => sub {});
> + }
> + }
> +
> foreach my $ext (qw(mon qmp pid vnc qga)) {
> unlink "/var/run/qemu-server/${vmid}.$ext";
> }
> --
> 2.1.4
_______________________________________________
pve-devel mailing list
pve-devel at pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
_______________________________________________
pve-devel mailing list
pve-devel at pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
More information about the pve-devel
mailing list