[pve-devel] [RFC qemu-server 00/10] improve live-migration downtime

Fabian Grünbichler f.gruenbichler at proxmox.com
Fri Aug 4 10:54:59 CEST 2017


this patch series attempts to reduce the downtime occurring during
live-migration of VMs to sane levels by
- conditionalizing potentially unneeded SSH connections
- replacing commands over SSH with new 'qm mtunnel' commands
- reducing the polling interval to notice a completed migration faster

attempts to monitor downtime via ping produced rather unreliable results,
probably because of ARP? but old to old is reliably the slowest there too..

the following are durations spent in the 'paused' state, between 'paused
inmigrate' and 'running', measured by polling the qmp status with a 0.1s sleep
in between. each test was repeated 5 times on a network-rate-limited virtual
cluster.
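
as a rough illustration of how such numbers can be derived from the polled
status samples, here is a minimal sketch. it is not the script used for the
measurements above: the function name, the (timestamp, status) sample format
and the exact status strings are assumptions made for this example. since the
status is only sampled every 0.1s, any result is accurate to roughly one
polling interval at best, and a pause shorter than the interval may not be
observed at all.

```python
def paused_durations(samples):
    """Return the length of each contiguous 'paused' run, in milliseconds.

    samples: time-ordered list of (timestamp_ms, status) tuples, as they
    might be collected by polling the VM status in a loop (hypothetical
    format, assumed for this sketch).
    """
    durations = []
    run_start = None
    for timestamp_ms, status in samples:
        if status == "paused":
            # remember when the first paused sample of a run was seen
            if run_start is None:
                run_start = timestamp_ms
        elif run_start is not None:
            # first non-paused sample after a paused run closes the run
            durations.append(timestamp_ms - run_start)
            run_start = None
    # a run that is still open when the samples end is not counted
    return durations


# hypothetical samples taken every 100ms on the target node
example = [
    (0, "paused inmigrate"),
    (100, "paused inmigrate"),
    (200, "paused"),
    (300, "paused"),
    (400, "running"),
]
print(paused_durations(example))
```

only samples that report exactly 'paused' are counted here, so the time spent
in 'paused inmigrate' while RAM is still being transferred does not count
towards the downtime.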

with old polling, 2G RAM (actual RAM transfer in <2s, so no auto-reduction of
polling interval happens):

old code: average 3.2s
new to old: average 1.6s (skips pvesr set-state)
new to new: average 1.2s

with old polling, 8G RAM (auto-reduction of polling interval kicks in, slightly better results):

old code: average 2.7s
new to old: 1s
new to new: 0.7s

with reduced polling interval (last patch applied), 2G and 8G RAM:
new to old: 0.4s
new to new: one single instance of logged paused state over 5 migrations!

with reduced polling interval, 8G RAM, old code but with last patch applied:
2s

so it seems like this is the right combination of changes to get downtime back
to acceptable levels without sacrificing consistency.

commands which might be integrated into mtunnel as well in the future:
- pvesr set-state
- qm nbdstop
- qm unlock

Fabian Grünbichler (10):
  migrate: switch back to qm mtunnel
  migrate: refactor mtunnel read/write
  qm mtunnel: add tunnel version
  migrate: read mtunnel version
  qm mtunnel: add write helper
  mtunnel: add and handle OK/ERR replies
  qm mtunnel/migrate: add resume VMID command
  migrate: finish tunnel in phase 3
  migrate: keep track of replication
  migrate: reduce polling intervals

 PVE/CLI/qm.pm      |  28 ++++++++++--
 PVE/QemuMigrate.pm | 126 ++++++++++++++++++++++++++++++++++++++---------------
 2 files changed, 116 insertions(+), 38 deletions(-)

-- 
2.11.0




