[pve-devel] [PATCH] Fix error handling if ram hot-plug fail.
Alexandre DERUMIER
aderumier at odiso.com
Mon Apr 11 05:08:59 CEST 2016
Hi,
Sorry to be late, I was on holiday for the last two weeks.
>> Does memory hot-*un*-plugging work for you?
Yes, memory hot-unplugging is working.
But the main problem is the Linux kernel :p
I need to retrieve the documentation, but the Linux kernel currently places some unmovable memory allocations.
If these allocations are on the specific dimm, the kernel will refuse to offline it.
Here is a good slide deck:
https://events.linuxfoundation.org/sites/events/files/lcjp13_ishimatsu.pdf
(Note that I haven't checked with the latest 4.x kernels.)
>> Does it need any
>>special guest OS? Because the `device_del` qmp command doesn't seem to
>>have any effect regardless of the `removable` or `online` states in the
>>guest's /sys/devices/system/memory/memory* files.
A udev rule should help to set the offline state (like for hotplug).
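As an illustration of the hotplug analogue mentioned above, the commonly shipped rule that auto-onlines newly hotplugged memory blocks looks like this (the exact match keys here are a sketch, not a rule taken from this thread):

```
# /etc/udev/rules.d/80-hotplug-mem.rules (illustrative)
# Online every memory block as soon as it appears.
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
```

A similar rule (or a script triggered on the eject request) would need to write `offline` to `/sys/devices/system/memory/memory*/state` before `device_del` can succeed.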
>>As bug #931 reports that it takes a huge amount of time for memory
>>unplugging to give up I also wonder why we retry 5 times with a timeout
>>of 3 seconds per dimm. Can't we just send the device_del commands for
>>all dimms at once, then wait 3 seconds _once_, then check? Why bother
>>with so many retries?
>>Of course the foreach_dimm*() would have to use qemu_dimm_list() instead
>>of assuming a default layout if eg. a remove command ended up removing
>>some dimms in between but failing on the last ones, otherwise further
>>changes will be problematic.
I don't think you can send device_del on all dimms at the same time (but I haven't tested it).
As we don't manage a dimm list in the config (we only use the memory size to derive the memory mapping),
we need to unplug them in reverse order.
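To illustrate the point about deriving the mapping from size alone, here is a small Python sketch (not the actual Perl code in qemu-server; `STATIC_MEM`, `DIMM_SIZE`, and the fixed dimm size are simplifying assumptions — the real layout uses growing dimm sizes):

```python
# Illustrative sketch: with no dimm list in the config, the set of
# hotplugged dimms is reconstructed from the total memory size, so
# unplugging must walk that derived list in reverse order.
STATIC_MEM = 1024   # assumed base (non-hotpluggable) memory in MB
DIMM_SIZE = 512     # assumed fixed dimm size for this sketch

def dimm_list(total_mem):
    """Return dimm names for the hotplugged range, in plug order."""
    n = (total_mem - STATIC_MEM) // DIMM_SIZE
    return [f"dimm{i}" for i in range(n)]

def unplug(total_mem, target_mem, device_del):
    """Unplug dimms in reverse order until target_mem is reached."""
    dimms = dimm_list(total_mem)
    keep = len(dimm_list(target_mem))
    for name in reversed(dimms[keep:]):
        device_del(name)   # send a QMP device_del for this dimm

removed = []
unplug(3072, 2048, removed.append)
# removed == ["dimm3", "dimm2"]
```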
As for the retry and timeout, I added them because they sometimes helped. Feel free to remove them.
Until Linux has proper fixes for unplug, I don't think we can do better.
----- Original Message -----
From: "Wolfgang Bumiller" <w.bumiller at proxmox.com>
To: "Wolfgang Link" <w.link at proxmox.com>, "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Thursday, 7 April 2016 12:52:58
Subject: Re: [pve-devel] [PATCH] Fix error handling if ram hot-plug fail.
@Alexandre: pinging you about this, since you added the memory
hotplug/unplug code:
Does memory hot-*un*-plugging work for you? Does it need any
special guest OS? Because the `device_del` qmp command doesn't seem to
have any effect regardless of the `removable` or `online` states in the
guest's /sys/devices/system/memory/memory* files.
As bug #931 reports that it takes a huge amount of time for memory
unplugging to give up I also wonder why we retry 5 times with a timeout
of 3 seconds per dimm. Can't we just send the device_del commands for
all dimms at once, then wait 3 seconds _once_, then check? Why bother
with so many retries?
Of course the foreach_dimm*() would have to use qemu_dimm_list() instead
of assuming a default layout if eg. a remove command ended up removing
some dimms in between but failing on the last ones, otherwise further
changes will be problematic.
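The batched approach suggested above could be sketched like this (illustrative Python, not the actual Perl in qemu-server; `device_del` and `qemu_dimm_list` here are stand-in callables for the real QMP command and helper):

```python
# Sketch of "send all device_del, wait once, then verify":
# instead of retrying per dimm, fire every removal request,
# sleep a single grace period, and compare against the live
# dimm list to see which removals actually failed.
import time

def batched_unplug(dimms, device_del, qemu_dimm_list, wait=3):
    for name in dimms:
        device_del(name)                 # queue all removal requests up front
    time.sleep(wait)                     # one grace period for the guest
    remaining = set(qemu_dimm_list())    # query the dimms still present
    return [d for d in dimms if d in remaining]  # dimms that failed to unplug

# toy example: the guest refuses to release dimm2
failed = batched_unplug(["dimm2", "dimm3"],
                        device_del=lambda n: None,
                        qemu_dimm_list=lambda: ["dimm2"],
                        wait=0)
# failed == ["dimm2"]
```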
On Wed, Apr 06, 2016 at 10:24:35AM +0200, Wolfgang Link wrote:
> There is no need to cancel the program if the ram can't remove.
> The user will see that it is pending.
> ---
> PVE/API2/Qemu.pm | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 0d33f6c..96829c8 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -960,7 +960,14 @@ my $update_vm_api = sub {
> if ($running) {
> my $errors = {};
> PVE::QemuServer::vmconfig_hotplug_pending($vmid, $conf, $storecfg, $modified, $errors);
> - raise_param_exc($errors) if scalar(keys %$errors);
> + if (scalar(keys %$errors)) {
> + foreach my $k (keys %$errors) {
> + my $msg = $errors->{$k};
> + $msg =~ s/\n/ /;
> + print $msg;
> + syslog('warning', "$k: $msg");
> + }
> + }
> } else {
> PVE::QemuServer::vmconfig_apply_pending($vmid, $conf, $storecfg, $running);
> }
> --
> 2.1.4