[pve-devel] [PATCH qemu-server 4/4] implement PoC migration to remote cluster/node

Alexandre DERUMIER aderumier at odiso.com
Wed Mar 11 11:55:20 CET 2020


>>not yet. I guess you are thinking about Ceph? if you have a pool per 
>>cluster, you could simply configure them (with appropriate names to 
>>avoid mis-usage) on all clusters, and we could add a third level to our 
>>proposed targetstorage map. e.g., for a migration from cluster a to 
>>cluster b: 

Yes, Ceph or other shared storage (excluding shared LVM).

The problem with using the same storage is that you can have VMID conflicts:
you could have vm-100-disk0 owned by cluster1 and vm-100-disk1 owned by cluster2,
and deleting the VM on cluster1 would then also delete the disk belonging to cluster2.
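
A rough sketch of that failure mode (the pool content and the cleanup logic here are made up, just to picture it):

use strict;
use warnings;

# hypothetical shared pool visible to both clusters; each cluster has its own VM 100
my @pool = ('vm-100-disk0', 'vm-100-disk1');   # disk0 owned by cluster1, disk1 by cluster2

# destroying VM 100 on cluster1 selects volumes by owner VMID only ...
my $vmid = 100;
my @doomed = grep { /^vm-\Q$vmid\E-disk\d+$/ } @pool;

# ... so cluster2's disk gets removed as well
print "would delete: $_\n" for @doomed;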




>>as long as you somehow ensure that VMIDs don't conflict between 
>>clusters[1] a sort of shared storage migration would easily be possible. 
>>you could even not give any user allocatespace permission on the 
>>storages that are owned by other clusters. it kind of conflicts with 
>>another idea that I have been toying with though, which is 'targetvmid' 
>>to enable cross-cluster migration where the VMID does have a conflict ;)
>>1: e.g., by having a naming scheme where the VMID is prefixed by some 
>>sort of short cluster ID 

I have thought about this; it works until you migrate a VM across clusters.

for example:

1)

cluster1: vm-100-dc1:vm-100-dc1-disk0         
cluster2: vm-100-dc2:vm-100-dc2-disk0


2) migrate vm-100-dc1 from cluster1 to cluster2

cluster1: 
cluster2: vm-100-dc1:vm-100-dc1-disk0         
        : vm-100-dc2:vm-100-dc2-disk0


3) create a new VM on cluster1; here is the problem: how can we be sure not to recreate a VM with the same ID?

cluster1: vm-100-dc1:vm-100-dc1-disk1
cluster2: vm-100-dc1:vm-100-dc1-disk0         
        : vm-100-dc2:vm-100-dc2-disk0


Maybe add a feature in pmxcfs to never reuse a VMID after a VM is deleted?
It has been discussed here:
https://pve.proxmox.com/pipermail/pve-devel/2018-April/031490.html
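
Purely to illustrate the idea (no such tombstone list exists in pmxcfs today), the VMID allocator would just have to skip retired IDs in addition to the ones still in use:

use strict;
use warnings;

my %retired = map { $_ => 1 } (100);        # VMIDs that were used once and then deleted
my %in_use  = map { $_ => 1 } (101, 102);   # VMIDs currently present on the cluster

my $next = 100;
$next++ while $retired{$next} || $in_use{$next};
print "next free vmid: $next\n";            # 103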



----- Original Message -----
From: "Fabian Grünbichler" <f.gruenbichler at proxmox.com>
To: "aderumier" <aderumier at odiso.com>, "Thomas Lamprecht" <t.lamprecht at proxmox.com>
Cc: "dietmar" <dietmar at proxmox.com>, "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, March 11, 2020 09:48:29
Subject: Re: [pve-devel] [PATCH qemu-server 4/4] implement PoC migration to remote cluster/node

On March 11, 2020 8:55 am, Alexandre DERUMIER wrote: 
> Hi, 
> 
> Thinking about cross-cluster migration, 
> 
> is there any plan to share storage across different clusters? 
> It's not uncommon to have multiple clusters, as we are currently limited by corosync, and a shared storage with different pools. 

not yet. I guess you are thinking about Ceph? if you have a pool per 
cluster, you could simply configure them (with appropriate names to 
avoid mis-usage) on all clusters, and we could add a third level to our 
proposed targetstorage map. e.g., for a migration from cluster a to 
cluster b: 

local_ceph:ceph_cluster_a:1 

where the '1' indicates that these two storages are 'shared' in the PVE 
sense of having identical content on both clusters. in that case, 
instead of allocating a new volume and starting an NBD mirror, we'd just 
need to rewrite the volid and that drive would be done. 
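
very roughly, something like this (sketch only, neither the third field nor any of this code exists yet): 

use strict;
use warnings;

my $map_entry = 'local_ceph:ceph_cluster_a:1';
my ($source_sid, $target_sid, $shared) = split /:/, $map_entry;

my $volid = 'local_ceph:vm-100-disk-0';
if ($shared) {
    # identical content on both clusters: no allocation, no NBD mirror,
    # just point the target config at the same volume via the other storage ID
    $volid =~ s/^\Q$source_sid\E:/$target_sid:/;
}
print "$volid\n";   # ceph_cluster_a:vm-100-disk-0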

we could extend the storage config schema to add a 'guests' property 
similar to the 'nodes' one, to limit the view to certain vmids. a remote 
incoming migration could then add the vmid to that list on the target 
cluster, a remote outgoing migration could remove it on the source 
cluster. alternatively, we could have such a list/list of ranges on the 
datacenter level instead of the storage level (to signify: these VMs 
belong to this cluster), and simply ignore anything else. I think I need 
to think more about this ;) 
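
just to picture the 'guests' idea (the property does not exist, this is only a mock-up of what a storage.cfg entry could look like): 

rbd: ceph_cluster_a
	pool cluster_a_vms
	content images
	guests 100,105,200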

> It could avoid copying data and allow simple live migration (through the new websocket) 

yes, and to do a final 'cleanup' you could then do a move disk, or you 
could keep the disk where it is forever. 

> The main problem is unique IDs. GUIDs could be a solution, but if only implemented for disk IDs, 
> we couldn't track orphaned disks' association with a VMID. 

the problem is that we'd like to have the vmid in the volid to track 
ownership. replacing that with a random ID does not solve the problem, 
as we'd still need to track ownership somehow. as soon as we put that 
somewhere, we are back to square one. 

> and if we use a GUID for the VMID too, we can't keep the current tap|veth naming scheme, as those names are limited to 15 bytes. 
> (GUIDs are 128 bits, so 16 bytes, plus an extra byte for the NIC number). 

they are also way less readable / memorable. generating a short ID for 
the NICs is not the issue I think. 
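
e.g. something like this would already stay below the 15 character limit (just a sketch, not how interface names are generated today): 

use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# derive a short, stable interface name from a guest GUID plus NIC index
my $guid   = '6f9619ff-8b86-d011-b42d-00cf4fc964ff';
my $net_id = 0;
my $ifname = 'tap' . substr(sha1_hex($guid), 0, 8) . "i$net_id";
print length($ifname), " $ifname\n";   # 13 characters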

> Of course, this should be optional, but if a GUID is detected, we could allow cross-cluster live migration without disk migration. 

as long as you somehow ensure that VMIDs don't conflict between 
clusters[1] a sort of shared storage migration would easily be possible. 
you could even not give any user allocatespace permission on the 
storages that are owned by other clusters. it kind of conflicts with 
another idea that I have been toying with though, which is 'targetvmid' 
to enable cross-cluster migration where the VMID does have a conflict ;) 

1: e.g., by having a naming scheme where the VMID is prefixed by some 
sort of short cluster ID 



