Cross cluster/node VM/qcow2 replication

Robert Lech robert.lech at ajboggs.com
Mon Jan 30 20:32:00 CET 2023


Hi,

I'm interested in adding replication for VMs using qcow2 that aren't backed by a ZFS file system.
I've got a proof of concept, written as a bash script, below.
My company currently does this type of replication in VMware to minimize RTO and RPO. I'd like to start moving us to Proxmox, but noticed it falls short here.
The test below proves to me that it's possible, but I'd like to make something configurable from the UI that's easier to use.
I'm new to Proxmox development but I'm working to get myself familiar with the code.
I'd eventually like this to work across clusters, but for now I'm focusing on replication between nodes with local storage.
Where should I be looking to lay the foundation for this?

The basic workflow I currently have below is:

  1.  Generate a signature of the qcow2 file on the remote host
     *   A copy of the VM already exists on the remote host in this scenario
  2.  Copy the signature back to the host running vm100
  3.  Take a snapshot of vm100
  4.  Create a delta from the signature and the local qcow2 file
  5.  Delete the snapshot of vm100
  6.  Transfer the delta to the remote host
  7.  Apply the delta on the remote host
  8.  Repair the qcow2 file and apply the snapshot to remove any data written to the file after the snapshot was taken
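
As a rough illustration of steps 1, 4 and 7, the rdiff round trip can be tried on two local files before involving any VM. This is only a sketch: sync_with_rdiff is a hypothetical helper name, both files are assumed to sit on the same machine (so there is no ssh/scp), and rdiff from librsync is assumed to be installed.

```shell
#!/bin/bash
# Sketch: bring a stale copy up to date with rdiff, locally.
# sync_with_rdiff is a hypothetical helper, not part of Proxmox.
# $1 = stale copy (plays the role of the remote qcow2 file)
# $2 = current file (plays the role of the local qcow2 file)
sync_with_rdiff() {
    local stale_copy=$1 current=$2 workdir
    workdir=$(mktemp -d) || return 1

    # Step 1: signature of the stale copy.
    rdiff signature "$stale_copy" "$workdir/copy.sig" &&
    # Step 4: delta that turns the stale copy into the current file.
    rdiff delta "$workdir/copy.sig" "$current" "$workdir/copy.delta" &&
    # Step 7: apply the delta, then swap the patched file into place.
    rdiff patch "$stale_copy" "$workdir/copy.delta" "$workdir/copy.patched" &&
    mv -f "$workdir/copy.patched" "$stale_copy"
    local rc=$?

    rm -rf "$workdir"
    return $rc
}
```

Calling `sync_with_rdiff old.img new.img` should leave old.img byte-identical to new.img. Only the signature and the delta would have to cross the network in the real workflow.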

It looks like there may be a better solution using dirty bitmaps, as described in the QEMU docs here: https://qemu.readthedocs.io/en/latest/interop/bitmaps.html
However, I couldn't figure out how to perform those steps in a Proxmox environment.
In the future, I'd like to seed a replication from a backup but I'm getting ahead of myself at this point.
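
On the dirty bitmap idea, the first step in the QEMU docs is a block-dirty-bitmap-add QMP command. The sketch below only builds that request as JSON. qmp_bitmap_add is a hypothetical helper name, and the node name drive-virtio0 and bitmap name repl0 are assumptions for illustration, not values I have confirmed on a Proxmox node.

```shell
#!/bin/bash
# Sketch only: prints the QMP request that asks QEMU to start tracking
# writes to one block node. Nothing here talks to a running VM.

# qmp_bitmap_add is a hypothetical helper name.
# $1 = block node or device name (assumed value, e.g. drive-virtio0)
# $2 = bitmap name (any label, e.g. repl0)
qmp_bitmap_add() {
    local node=$1 name=$2
    printf '{"execute": "block-dirty-bitmap-add", "arguments": {"node": "%s", "name": "%s", "persistent": true}}\n' \
        "$node" "$name"
}

qmp_bitmap_add drive-virtio0 repl0
```

Sending it would need a QMP session on the VM (after the usual qmp_capabilities handshake). Where Proxmox exposes that socket, and which node names it uses, is the part I have not worked out.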

Thanks,
Robert


---
#!/bin/bash
#define vm
vm=100
time_stamp=$(date +%F-%T | sed 's/[-:]/_/g')

#define host
remote_host=root@10.0.0.41

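#map each dir storage to its path (not used yet; the paths below are hard-coded to /var/lib/vz/images)
storage=$(grep -A1 ':' /etc/pve/storage.cfg | perl -0777 -pe 's/dir: ([a-z]*)\s*path ([a-z\/0-9]*)/\1: \2/igs')
#extract the volume path (relative to the images directory) from each virtio drive line
#note: the rest of the script only uses the last drive found
mapfile -t drives < <(grep 'virtio[0-9]:' "/etc/pve/qemu-server/$vm.conf")
for drive in "${drives[@]}"; do
               drive=$(echo "$drive" | cut -d':' -f3 | cut -d',' -f1)
               echo "$drive"
done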

#get signature from remote vm
echo "Generating remote signature"
remote_cmd="qm stop $vm --timeout=2; rm -f /tmp/vm-$vm.sig; rdiff signature /var/lib/vz/images/$drive /tmp/vm-$vm.sig"
ssh $remote_host -t $remote_cmd
echo "Getting remote signature"
scp $remote_host:/tmp/vm-$vm.sig /var/lib/vz/images/$vm/

#build diff
echo "Snapshotting vm"
qm snapshot $vm "replication_$time_stamp"
rm -f /var/lib/vz/images/$vm/vm-$vm.delta
echo "Building diff"
rdiff delta /var/lib/vz/images/$vm/vm-$vm.sig /var/lib/vz/images/$drive /var/lib/vz/images/$vm/vm-$vm.delta
echo "Deleting snapshot"
qm delsnapshot $vm "replication_$time_stamp"

#send diff
echo "Sending diff"
scp /var/lib/vz/images/$vm/vm-$vm.delta $remote_host:/var/lib/vz/images/$vm/

#apply diff
echo "Applying diff"
remote_cmd="rdiff patch /var/lib/vz/images/$drive /var/lib/vz/images/$vm/vm-$vm.delta /var/lib/vz/images/$drive.patched"
ssh $remote_host -t $remote_cmd

remote_cmd="mv -f /var/lib/vz/images/$drive{.patched,}"
ssh $remote_host -t $remote_cmd

remote_cmd="qemu-img check -r all /var/lib/vz/images/$drive && qemu-img snapshot -a replication_$time_stamp /var/lib/vz/images/$drive"
ssh $remote_host -t $remote_cmd
echo "Done!"
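
The drive parsing in the script relies on two cut calls and throws away the storage name. A slightly more explicit version is sketched below. parse_drive_line is a hypothetical helper name, and the sample config line is an assumed example of the "virtioN: storage:volume,options" layout the script expects.

```shell
#!/bin/bash
# Sketch: split one drive line from a VM config into storage and volume.
# parse_drive_line is a hypothetical helper, not a Proxmox tool.
# Prints "<storage> <volume>" on one line.
parse_drive_line() {
    local line=$1 value volid
    value=${line#*: }     # drop the "virtio0: " key
    volid=${value%%,*}    # drop trailing options such as ",size=32G"
    printf '%s %s\n' "${volid%%:*}" "${volid#*:}"
}

# Assumed sample line, for illustration only.
parse_drive_line 'virtio0: local:100/vm-100-disk-0.qcow2,size=32G'
```

With the storage name available, the hard-coded /var/lib/vz/images prefix could be replaced by a lookup; on a PVE node, `pvesm path <storage>:<volume>` should resolve a volume to its filesystem path, though I have not wired that in here.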




More information about the pve-devel mailing list