[pve-devel] migration problems since qemu 1.3

Alexandre DERUMIER aderumier at odiso.com
Thu Dec 20 16:11:16 CET 2012


Could you add some log statements in QemuMigrate.pm, to see exactly where it hangs?



    $self->log('info', "starting migration tunnel");

    ## create tunnel to remote port
    my $lport = PVE::QemuServer::next_migrate_port();
    $self->{tunnel} = $self->fork_tunnel($self->{nodeip}, $lport, $rport);

    $self->log('info', "starting online/live migration on port $lport");
    # start migration

    my $start = time();

    my $capabilities = {};
    $capabilities->{capability} = "xbzrle";
    $capabilities->{state} = JSON::false;

    eval {
        # >> add log here
        PVE::QemuServer::vm_mon_cmd_nocheck($vmid, "migrate-set-capabilities", capabilities => [$capabilities]);

    };

    #set cachesize 10% of the total memory
    my $cachesize = int($conf->{memory}*1048576/10);
    eval {
        # >> add log here
        PVE::QemuServer::vm_mon_cmd_nocheck($vmid, "migrate-set-cache-size", value => $cachesize);
    };

    eval {
        # >> add log here
        PVE::QemuServer::vm_mon_cmd_nocheck($vmid, "migrate", uri => "tcp:localhost:$lport");
    };
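
A minimal sketch of what such logging could look like, so that the last line in the task log shows which monitor command never returned. It is meant to run standalone, so `log_msg`, `fake_mon_cmd` and `logged_mon_cmd` are hypothetical stand-ins (not part of qemu-server); in QemuMigrate.pm the calls would be `$self->log(...)` and `PVE::QemuServer::vm_mon_cmd_nocheck(...)`. The VM id, memory size and port are taken from the reports in this thread; the simulated failure on `migrate` is an assumption for illustration only.

```perl
use strict;
use warnings;

my @log_lines;

# Hypothetical stand-in for $self->log(): record and print one log line.
sub log_msg {
    my ($level, $msg) = @_;
    push @log_lines, "$level: $msg";
    print "$level: $msg\n";
}

# Hypothetical stand-in for PVE::QemuServer::vm_mon_cmd_nocheck().
# It fails on 'migrate' only to demonstrate the error path.
sub fake_mon_cmd {
    my ($vmid, $cmd, %params) = @_;
    die "simulated QMP timeout\n" if $cmd eq 'migrate';
    return 1;
}

# Log before and after each monitor command; if the command hangs,
# the "sending" line is the last thing that appears in the log.
sub logged_mon_cmd {
    my ($vmid, $cmd, %params) = @_;
    log_msg('info', "sending '$cmd'");
    my $ok = eval { fake_mon_cmd($vmid, $cmd, %params); 1 };
    if ($ok) {
        log_msg('info', "'$cmd' returned");
        return 1;
    }
    my $err = $@ || "unknown error\n";
    chomp $err;
    log_msg('err', "'$cmd' failed: $err");
    return 0;
}

my $vmid = 107;
my $memory_mb = 2048;                            # 2GB VM, as in the report
my $cachesize = int($memory_mb * 1048576 / 10);  # 10% of total memory, in bytes

# The original code passes JSON::false for state; 0 is used here to avoid
# depending on the JSON module in this sketch.
logged_mon_cmd($vmid, 'migrate-set-capabilities',
    capabilities => [{ capability => 'xbzrle', state => 0 }]);
logged_mon_cmd($vmid, 'migrate-set-cache-size', value => $cachesize);
logged_mon_cmd($vmid, 'migrate', uri => 'tcp:localhost:60000');
```

Logging both before and after each call matters here: the existing eval blocks discard errors silently, so without the extra lines a failed command and a hanging one look the same in the task log.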



----- Original message -----

From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
To: "Alexandre DERUMIER" <aderumier at odiso.com>
Cc: pve-devel at pve.proxmox.com
Sent: Thursday, 20 December 2012 15:50:38
Subject: Re: [pve-devel] migration problems since qemu 1.3

Hi, 
On 20.12.2012 15:49, Alexandre DERUMIER wrote:
>>> I had it again.
> Have you applied today's fix for ballooning?
> https://git.proxmox.com/?p=qemu-server.git;a=commit;h=95381ce06cea266d40911a7129da6067a1640cbf 

Yes. 

>>> I cannot even connect to this VM through the console anymore.
>
> Hmm, it seems that something breaks QMP on the source VM...
> Is the source VM running? (Is ssh working?)
It is marked as running, and the kvm process is still there. But no service
is running anymore, so I cannot even connect via ssh.

Stefan 

> ----- Original message -----
>
> From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
> To: "Alexandre DERUMIER" <aderumier at odiso.com>
> Cc: pve-devel at pve.proxmox.com
> Sent: Thursday, 20 December 2012 15:27:53
> Subject: Re: [pve-devel] migration problems since qemu 1.3
> 
> Hi, 
> 
> I had it again.
> 
> Migration hangs at: 
> Dec 20 15:23:03 starting migration of VM 107 to node 'cloud1-1202' 
> (10.255.0.20) 
> Dec 20 15:23:03 copying disk images 
> Dec 20 15:23:03 starting VM 107 on remote node 'cloud1-1202' 
> Dec 20 15:23:06 starting migration tunnel 
> Dec 20 15:23:06 starting online/live migration on port 60000 
> 
> I cannot even connect to this VM through the console anymore.
> 
> Stefan 
> 
> On 20.12.2012 12:31, Stefan Priebe - Profihost AG wrote:
>> Hi, 
>> 
>> at least migration works at all ;-) I'll wait until tomorrow and test 
>> again. I've restarted all VMs with latest pve-qemu-kvm. 
>> 
>> Thanks! 
>> 
>> On 20.12.2012 11:57, Alexandre DERUMIER wrote:
>>> With the latest git, I think it's related to the balloon driver being
>>> enabled by default and the QMP commands being sent (see my previous mail).
>>>
>>>
>>> Can you try replacing (in QemuServer.pm)
>>> 
>>> if (!defined($conf->{balloon}) || $conf->{balloon}) { 
>>> vm_mon_cmd($vmid, "balloon", value => 
>>> $conf->{balloon}*1024*1024) 
>>> if $conf->{balloon}; 
>>> 
>>> vm_mon_cmd($vmid, 'qom-set', 
>>> path => "machine/peripheral/balloon0", 
>>> property => "stats-polling-interval", 
>>> value => 2); 
>>> } 
>>> 
>>> with
>>> 
>>> if (!defined($conf->{balloon}) || $conf->{balloon}) { 
>>> vm_mon_cmd_nocheck($vmid, "balloon", value => 
>>> $conf->{balloon}*1024*1024) 
>>> if $conf->{balloon}; 
>>> 
>>> vm_mon_cmd_nocheck($vmid, 'qom-set', 
>>> path => "machine/peripheral/balloon0", 
>>> property => "stats-polling-interval", 
>>> value => 2); 
>>> } 
>>> 
>>> 
>>> (i.e. use vm_mon_cmd_nocheck instead of vm_mon_cmd)
>>> 
>>> ----- Original message -----
>>>
>>> From: "Stefan Priebe - Profihost AG" <s.priebe at profihost.ag>
>>> To: "Alexandre DERUMIER" <aderumier at odiso.com>
>>> Cc: pve-devel at pve.proxmox.com
>>> Sent: Thursday, 20 December 2012 11:48:06
>>> Subject: Re: [pve-devel] migration problems since qemu 1.3
>>> 
>>> Hi, 
>>> 
>>> On 20.12.2012 10:04, Alexandre DERUMIER wrote:
>>>>>> Yes. It works fine with NEWLY started VMs, but once the VMs have been
>>>>>> running for more than 1-3 days, it stops working and the VMs just crash
>>>>>> during migration.
>>>> Maybe VMs that have been running for 1-3 days have more memory in use, so
>>>> they take more time to live migrate.
>>> 
>>> I see totally different outputs - the vm crashes and the status output 
>>> stops. 
>>> 
>>> With git from yesterday I'm just getting this:
>>> ---------------------------------------------------------- 
>>> Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203' 
>>> (10.255.0.22) 
>>> Dec 20 11:34:21 copying disk images 
>>> Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203' 
>>> Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o 
>>> 'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock
>>> --migratedfrom cloud1-1202' failed: exit code 255 
>>> Dec 20 11:34:23 aborting phase 2 - cleanup resources 
>>> Dec 20 11:34:24 ERROR: migration finished with problems (duration 
>>> 00:00:03) 
>>> TASK ERROR: migration problems 
>>> ---------------------------------------------------------- 
>>> 
>>> 
>>>> Does it crash at the start of the migration, or in the middle of the
>>>> migration?
>>>
>>> At the beginning, mostly. I see no more output after:
>>> migration listens on port 60000 
>>> 
>>> 
>>>> What is your VM config? (memory size, storage?)
>>> 2GB mem, RBD / Ceph Storage 
>>> 
>>> Stefan 
>>> 


