[pve-devel] migration problems since qemu 1.3
Alexandre DERUMIER
aderumier at odiso.com
Thu Dec 20 16:11:16 CET 2012
Maybe you can add some logging in QemuMigrate.pm, to see exactly where it is hanging:
$self->log('info', "starting migration tunnel");

## create tunnel to remote port
my $lport = PVE::QemuServer::next_migrate_port();
$self->{tunnel} = $self->fork_tunnel($self->{nodeip}, $lport, $rport);

$self->log('info', "starting online/live migration on port $lport");

# start migration
my $start = time();

my $capabilities = {};
$capabilities->{capability} = "xbzrle";
$capabilities->{state} = JSON::false;

eval {
    # >> add log here
    PVE::QemuServer::vm_mon_cmd_nocheck($vmid, "migrate-set-capabilities", capabilities => [$capabilities]);
};

# set cachesize to 10% of the total memory
my $cachesize = int($conf->{memory}*1048576/10);
eval {
    # >> add log here
    PVE::QemuServer::vm_mon_cmd_nocheck($vmid, "migrate-set-cache-size", value => $cachesize);
};

eval {
    # >> add log here
    PVE::QemuServer::vm_mon_cmd_nocheck($vmid, "migrate", uri => "tcp:localhost:$lport");
};
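
To make the idea concrete, here is a rough standalone sketch of the kind of logging I mean. It logs before and after each monitor command and also reports whatever error eval caught, so the last line in the log shows which command never returned. Note that log_info, fake_mon_cmd and logged_mon_cmd are hypothetical stand-ins I made up so the script runs on its own; in QemuMigrate.pm you would call $self->log('info', ...) and PVE::QemuServer::vm_mon_cmd_nocheck(...) instead. The vmid, port and memory size are just the values mentioned in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for $self->log('info', ...): prints the message
# and keeps a copy so the sequence can be inspected afterwards.
my @log_lines;
sub log_info {
    my ($msg) = @_;
    push @log_lines, $msg;
    print "$msg\n";
}

# Hypothetical stand-in for PVE::QemuServer::vm_mon_cmd_nocheck().
# It fails on one command only to demonstrate the error path.
sub fake_mon_cmd {
    my ($vmid, $cmd, %params) = @_;
    die "simulated failure\n" if $cmd eq 'migrate-set-cache-size';
    return 1;
}

# Log before and after the monitor command and report any error caught
# by eval. If the command hangs, the "returned" line never shows up.
sub logged_mon_cmd {
    my ($vmid, $cmd, %params) = @_;
    log_info("sending qmp command '$cmd'");
    eval { fake_mon_cmd($vmid, $cmd, %params); };
    if (my $err = $@) {
        chomp $err;
        log_info("qmp command '$cmd' failed: $err");
        return 0;
    }
    log_info("qmp command '$cmd' returned");
    return 1;
}

my $vmid  = 107;
my $lport = 60000;

logged_mon_cmd($vmid, 'migrate-set-capabilities',
    capabilities => [{ capability => 'xbzrle', state => 0 }]);
# 10% of 2GB of memory, as in the cachesize computation above
logged_mon_cmd($vmid, 'migrate-set-cache-size', value => int(2048*1048576/10));
logged_mon_cmd($vmid, 'migrate', uri => "tcp:localhost:$lport");
```

With the real monitor call in place, a "sending" line without a matching "returned" or "failed" line tells you which qmp command is blocking.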
----- Original message -----
From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
To: "Alexandre DERUMIER" <aderumier at odiso.com>
Cc: pve-devel at pve.proxmox.com
Sent: Thursday 20 December 2012 15:50:38
Subject: Re: [pve-devel] migration problems since qemu 1.3
Hi,
On 20.12.2012 15:49, Alexandre DERUMIER wrote:
>>> I had it again.
> Have you applied today's fix for ballooning?
> https://git.proxmox.com/?p=qemu-server.git;a=commit;h=95381ce06cea266d40911a7129da6067a1640cbf
Yes.
>>> I cannot even connect to this VM through the console anymore.
>
> Hmm, it seems that something breaks qmp on the source VM...
> Is the source VM running? (Is ssh working?)
It is marked as running and the kvm process is still there, but no service
is running anymore, so I cannot even connect via ssh.
Stefan
> ----- Original message -----
>
> From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
> To: "Alexandre DERUMIER" <aderumier at odiso.com>
> Cc: pve-devel at pve.proxmox.com
> Sent: Thursday 20 December 2012 15:27:53
> Subject: Re: [pve-devel] migration problems since qemu 1.3
>
> Hi,
>
> I had it again.
>
> Migration hangs at:
> Dec 20 15:23:03 starting migration of VM 107 to node 'cloud1-1202'
> (10.255.0.20)
> Dec 20 15:23:03 copying disk images
> Dec 20 15:23:03 starting VM 107 on remote node 'cloud1-1202'
> Dec 20 15:23:06 starting migration tunnel
> Dec 20 15:23:06 starting online/live migration on port 60000
>
> I cannot even connect to this VM through the console anymore.
>
> Stefan
>
> On 20.12.2012 12:31, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> at least migration works at all ;-) I'll wait until tomorrow and test
>> again. I've restarted all VMs with latest pve-qemu-kvm.
>>
>> Thanks!
>>
>> On 20.12.2012 11:57, Alexandre DERUMIER wrote:
>>> With the latest git, I think it's related to the balloon driver being
>>> enabled by default and the qmp command that is sent (see my previous mail).
>>>
>>>
>>> Can you try to replace (in QemuServer.pm)
>>>
>>> if (!defined($conf->{balloon}) || $conf->{balloon}) {
>>>     vm_mon_cmd($vmid, "balloon", value => $conf->{balloon}*1024*1024)
>>>         if $conf->{balloon};
>>>
>>>     vm_mon_cmd($vmid, 'qom-set',
>>>                path => "machine/peripheral/balloon0",
>>>                property => "stats-polling-interval",
>>>                value => 2);
>>> }
>>>
>>> with
>>>
>>> if (!defined($conf->{balloon}) || $conf->{balloon}) {
>>>     vm_mon_cmd_nocheck($vmid, "balloon", value => $conf->{balloon}*1024*1024)
>>>         if $conf->{balloon};
>>>
>>>     vm_mon_cmd_nocheck($vmid, 'qom-set',
>>>                        path => "machine/peripheral/balloon0",
>>>                        property => "stats-polling-interval",
>>>                        value => 2);
>>> }
>>>
>>>
>>> (vm_mon_cmd_nocheck)
>>>
>>> ----- Original message -----
>>>
>>> From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
>>> To: "Alexandre DERUMIER" <aderumier at odiso.com>
>>> Cc: pve-devel at pve.proxmox.com
>>> Sent: Thursday 20 December 2012 11:48:06
>>> Subject: Re: [pve-devel] migration problems since qemu 1.3
>>>
>>> Hi,
>>>
>>> On 20.12.2012 10:04, Alexandre DERUMIER wrote:
>>>>>> Yes. It works fine with NEWLY started VMs, but once the VMs have been
>>>>>> running for more than 1-3 days it stops working and the VMs just crash
>>>>>> during migration.
>>>> Maybe VMs that have been running for 1-3 days have more memory in use,
>>>> so they take more time to live migrate.
>>>
>>> I see totally different outputs - the vm crashes and the status output
>>> stops.
>>>
>>> With git from yesterday I'm just getting this:
>>> ----------------------------------------------------------
>>> Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203'
>>> (10.255.0.22)
>>> Dec 20 11:34:21 copying disk images
>>> Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203'
>>> Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o
>>> 'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock
>>> --migratedfrom cloud1-1202' failed: exit code 255
>>> Dec 20 11:34:23 aborting phase 2 - cleanup resources
>>> Dec 20 11:34:24 ERROR: migration finished with problems (duration
>>> 00:00:03)
>>> TASK ERROR: migration problems
>>> ----------------------------------------------------------
>>>
>>>
>>>> Does it crash at the start of the migration, or in the middle of the
>>>> migration?
>>>
>>> At the beginning. Mostly I see no more output after:
>>> migration listens on port 60000
>>>
>>>
>>>> What is your VM config? (memory size, storage?)
>>> 2GB mem, RBD / Ceph Storage
>>>
>>> Stefan
>>>