[pve-devel] RFC: vm migration+storage to external/remote proxmox cluster

Fri Mar 10 08:46:45 CET 2017

On 03/10/2017 06:35 AM, Alexandre DERUMIER wrote:
 > Hi,
 >
 >> First, thanks for your work!
 >
 > And thanks for your review ! :)
 >
 >
 >> I did some shallow review of most patches, I hope most of it is somewhat
 >> constructive (I had to include a few nit picks, sorry :))
 >
 > I'll try to take the time to reply to each patch comment.
 >
 >> Some general methods, like doing a lot manually with rm and echo > pipe,
 >> still look hacky, but as it's an RFC I assume this wasn't your concern yet :)
 >> As such methods can be dangerous, I would favor the ones where the target
 >> cluster does this stuff, i.e. each cluster touches only its own stuff if
 >> possible.
 >
 > Yes, currently it's more a proof of concept without cleanup. (I coded it in 1 night.)
 > I needed it quickly to move a customer to a new cluster/storage in a remote datacenter.
 > It's really hacky ;)  (but it works: I have migrated around 50 VMs with it without problems)
 >
 >>
 >> A general flow I could imagine to work well with our stack and which
 >> then would allow us to move
 >> or implement this easier in a project like Proxmox Datacenter Manager
 >> would be:
 > Can't wait for Proxmox Datacenter Manager :)  (I now have 3 clusters with 16 nodes, each around 700 VMs)
 >
 > 1) external_migrate command gets executed
 >
 >> 2) Access to the other cluster gets made over API, this access can be
 >> kept for the whole process
 >
 > Great! I was not sure about it, as in the current migration code we use only the qm command through an ssh tunnel.
 > Doing it with the API allows us to do more things.
 >
 > For authentication, I don't know which is better.
 > If we use the GUI, we could reuse the client ticket.

Then the user must already be logged in on the remote side, else we need
to ask for credentials or something.

 >
 > But for the command line? Maybe generate the ticket through the ssh tunnel, then use the API?
 >
 > Do we already have a Perl client implementation somewhere in the code?

Dietmar wrote an example client:

https://git.proxmox.com/?p=pve-apiclient.git;a=tree

Maybe some of this can be reused.
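
To illustrate the ticket-based access discussed above, here is a minimal
sketch in Python (not the Perl client linked above). It only builds the
request pieces and does no network I/O. The helper names ticket_request and
build_auth_headers are hypothetical; the endpoint /api2/json/access/ticket,
the PVEAuthCookie cookie and the CSRFPreventionToken header are the ones the
PVE API uses, and port 8006 is assumed as the default API port.

```python
from urllib.parse import urlencode


def ticket_request(host, username, password, port=8006):
    """Return (url, body) for the POST that asks the remote side for a ticket.

    The helper itself is hypothetical; it only formats the request.
    """
    url = f"https://{host}:{port}/api2/json/access/ticket"
    body = urlencode({"username": username, "password": password})
    return url, body


def build_auth_headers(ticket_data, method="GET"):
    """Build headers for later API calls from the 'data' part of the ticket reply.

    The ticket goes into the PVEAuthCookie cookie; write requests also need
    the CSRF prevention token.
    """
    headers = {"Cookie": "PVEAuthCookie=" + ticket_data["ticket"]}
    if method.upper() in ("POST", "PUT", "DELETE"):
        headers["CSRFPreventionToken"] = ticket_data["CSRFPreventionToken"]
    return headers
```

With something like this, the ticket obtained once in step 2 could be kept
and reused for every later call of the migration, whether it comes from a
GUI login or from credentials asked for on the command line.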

 >
 >> 3) do some error checks on the remote side, is the target storage
 >> available, ...
 >
 > Yes. Currently it only stops when the remote VM creates the disk. It works, but the disks need to be cleaned up afterwards.
 > If we can do the check early, it's better.
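
An early check of this kind could look roughly like the following Python
sketch. It assumes the remote storage list has already been fetched over the
API and that each entry carries "storage", "content" and "active" fields; the
function name check_target_storage is hypothetical and not part of the
patches.

```python
def check_target_storage(target_storage, remote_storages):
    """Return a list of problems with the target storage (empty means OK).

    remote_storages is assumed to be a list of dicts with the keys
    'storage' (ID), 'content' (comma-separated types) and 'active'.
    """
    for entry in remote_storages:
        if entry.get("storage") != target_storage:
            continue
        problems = []
        if not entry.get("active"):
            problems.append(f"storage '{target_storage}' is not active")
        if "images" not in entry.get("content", "").split(","):
            problems.append(
                f"storage '{target_storage}' does not allow VM disk images"
            )
        return problems
    return [f"storage '{target_storage}' does not exist on the target"]
```

Running this before any disk is created means a failed check leaves nothing
to clean up on the remote side.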
 >
 >> 4) get a VMID from the remote side, with /cluster/nextid (I plan to
 >> revive my "reserve VMID" patch so that this can be done in a nice manner)
 >
 > I was not sure about the "reserve VMID". Does it work currently?
 > (Is the nextid reserved for some seconds between 2 API calls?)

No, currently it does not really reserve the VMID, so two successive calls
(in fact even two parallel ones, as there is no lock yet) can get the same
VMID.
I already sent a patch which did that, and got some enhancement comments
from Fabian which sounded good.

Basically I lock the code, get a free VMID, generate a unique token, and
then save the VMID together with a timestamp and this token. The VMID and
the token get returned to the code reserving the VMID.

Now API calls which create a new VMID (i.e. VM/CT create) must provide this
exact token if they want to create a VM with a reserved VMID, else they
cannot do it (a non-reserved VMID can be created as normal).

The reservation times out after some time (2 minutes?) or can be deleted by
the one holding the UUID token.

Besides the unique token, this was already proposed in a patch series,
where these two would be the important ones:
http://pve.proxmox.com/pipermail/pve-devel/2016-October/023440.html
http://pve.proxmox.com/pipermail/pve-devel/2016-October/023439.html

Fabian's valid comments regarding the need for some token:
http://pve.proxmox.com/pipermail/pve-devel/2016-October/023447.html
I'll try to do this soon.
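
To make the reservation idea above more concrete, here is a rough in-process
sketch in Python. It is not the patch itself: the class and method names are
hypothetical, the state lives in memory instead of a cluster-wide locked
file, and the 120 second timeout just mirrors the "2 minutes?" guess.

```python
import threading
import time
import uuid


class VmidReservations:
    """Token-based VMID reservations with a timeout (illustrative only)."""

    def __init__(self, timeout=120.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._timeout = timeout
        self._clock = clock
        self._reserved = {}  # vmid -> (token, reserved_at)

    def _expire(self):
        # Drop reservations older than the timeout; caller holds the lock.
        now = self._clock()
        self._reserved = {
            vmid: (token, ts)
            for vmid, (token, ts) in self._reserved.items()
            if now - ts < self._timeout
        }

    def reserve(self, used_vmids, start=100):
        """Pick the lowest free VMID and return (vmid, token)."""
        with self._lock:
            self._expire()
            vmid = start
            while vmid in used_vmids or vmid in self._reserved:
                vmid += 1
            token = uuid.uuid4().hex
            self._reserved[vmid] = (token, self._clock())
            return vmid, token

    def may_create(self, vmid, token=None):
        """A reserved VMID needs the matching token; others are free to use."""
        with self._lock:
            self._expire()
            entry = self._reserved.get(vmid)
            return entry is None or entry[0] == token

    def release(self, vmid, token):
        """Delete a reservation early; only the token holder may do so."""
        with self._lock:
            entry = self._reserved.get(vmid)
            if entry is not None and entry[0] == token:
                del self._reserved[vmid]
                return True
            return False
```

The create call would consult may_create() with whatever token the caller
passed, so unreserved VMIDs keep working exactly as they do today.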

 >> 5) Create the target VM skeleton from its source config, via API call,
 >> this should probably be done in phase1
 >> 6) Sync disks now, as we have a VMID which currently still belongs to us;
 >> we mustn't do anything before we have this VM on the remote cluster
 >> 7) Start the VM on the target node, maybe add an "external_migration"
 >> parameter so that we can allow incoming migrations but differentiate
 >> between external or not.
 >
 > Ok, got it.
 >
 >> 8) do the migration as always
 >> 9) cleanup locally (maybe with option to keep the VM on the old cluster
 >> (as mentioned in reviews: can be potentially dangerous))
 >> At least, it should be optional.  (we also have the "disconnect" option on VM NICs)
 >>
 >