[pve-devel] PVE child process behavior question

Denis Kanchev denis.kanchev at storpool.com
Mon Jun 2 10:35:22 CEST 2025


> I thought your storage plugin is a shared storage, so there is no storage
migration at all, yet you keep talking about storage migration?
It is indeed shared storage. The issue was that the migration process on
the destination host got OOM-killed and the migration failed; that is most
probably why there is no log about the storage migration. However, that did
not stop the storage migration on the destination host.
2025-04-11T03:26:52.283913+07:00 telpr01pve03 kernel: [96031.290519] pvesh
invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0

Here is one more migration task attempt that lived long enough to produce a
more detailed log:

2025-04-11 03:29:11 starting migration of VM 2421 to node 'telpr01pve06'
(10.10.17.6)
2025-04-11 03:29:11 starting VM 2421 on remote node 'telpr01pve06'
2025-04-11 03:29:15 [telpr01pve06] Warning: sch_htb: quantum of class 10001
is big. Consider r2q change.
2025-04-11 03:29:15 [telpr01pve06] kvm: failed to find file
'/usr/share/qemu-server/bootsplash.jpg'
2025-04-11 03:29:15 start remote tunnel
2025-04-11 03:29:16 ssh tunnel ver 1
2025-04-11 03:29:16 starting online/live migration on
unix:/run/qemu-server/2421.migrate
2025-04-11 03:29:16 set migration capabilities
2025-04-11 03:29:16 migration downtime limit: 100 ms
2025-04-11 03:29:16 migration cachesize: 256.0 MiB
2025-04-11 03:29:16 set migration parameters
2025-04-11 03:29:16 start migrate command to
unix:/run/qemu-server/2421.migrate
2025-04-11 03:29:17 migration active, transferred 281.0 MiB of 2.0 GiB
VM-state, 340.5 MiB/s
2025-04-11 03:29:18 migration active, transferred 561.5 MiB of 2.0 GiB
VM-state, 307.2 MiB/s
2025-04-11 03:29:19 migration active, transferred 849.2 MiB of 2.0 GiB
VM-state, 288.5 MiB/s
2025-04-11 03:29:20 migration active, transferred 1.1 GiB of 2.0 GiB
VM-state, 283.7 MiB/s
2025-04-11 03:29:21 migration active, transferred 1.4 GiB of 2.0 GiB
VM-state, 302.5 MiB/s
2025-04-11 03:29:23 migration active, transferred 1.8 GiB of 2.0 GiB
VM-state, 278.6 MiB/s
2025-04-11 03:29:23 migration status error: failed
2025-04-11 03:29:23 ERROR: online migrate failure - aborting
2025-04-11 03:29:23 aborting phase 2 - cleanup resources
2025-04-11 03:29:23 migrate_cancel
2025-04-11 03:29:25 ERROR: migration finished with problems (duration
00:00:14)
TASK ERROR: migration problems




>  could you provide the full migration task log and the VM config?
2025-04-11 03:26:50 starting migration of VM 2421 to node 'telpr01pve03'
(10.10.17.3) ### QemuMigrate::phase1() +749
2025-04-11 03:26:50 starting VM 2421 on remote node 'telpr01pve03' ###
QemuMigrate::phase2_start_local_cluster() +888
2025-04-11 03:26:52 ERROR: online migrate failure - remote command failed
with exit code 255
2025-04-11 03:26:52 aborting phase 2 - cleanup resources
2025-04-11 03:26:52 migrate_cancel
2025-04-11 03:26:53 ERROR: migration finished with problems (duration
00:00:03)
TASK ERROR: migration problems


VM config
#Ubuntu-24.04-14082024
#StorPool adjustment
agent: 1,fstrim_cloned_disks=1
autostart: 1
boot: c
bootdisk: scsi0
cipassword: XXX
citype: nocloud
ciupgrade: 0
ciuser: test
cores: 2
cpu: EPYC-Genoa
cpulimit: 2
ide0: VMDataSp:vm-2421-cloudinit.raw,media=cdrom
ipconfig0: ipxxx
memory: 2048
meta: creation-qemu=8.1.5,ctime=1722917972
name: kredibel-service
nameserver: xxx
net0: virtio=xxx,bridge=vmbr2,firewall=1,rate=250,tag=220
numa: 0
onboot: 1
ostype: l26
scsi0: VMDataSp:vm-2421-disk-0-sp-bj7n.b.sdj.raw,aio=native,discard=on,iops_rd=20000,iops_rd_max=40000,iops_rd_max_length=60,iops_wr=20000,iops_wr_max=40000,iops_wr_max_length=60,iothread=1,size=40G
scsihw: virtio-scsi-single
searchdomain: neo.internal
serial0: socket
smbios1: uuid=dfxxx
sockets: 1
sshkeys: ssh-rsa%
vmgenid: 17b154a0-


In this case, the call to PVE::Storage::Plugin::activate_volume() was
performed after the migration cancellation:
2025-04-11T03:26:53.072206+07:00 telpr01pve03 qm[3670228]: StorPool plugin:
NOT a live migration of VM 2421, will force detach volume ~bj7n.b.abe <<<
This log line comes from sub activate_volume() in our custom storage plugin.
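
To make the decision point concrete, here is a minimal, hypothetical Perl
sketch (not the actual StorPool plugin code) of the kind of check an
activate_volume() implementation might do. It assumes a simple heuristic,
namely the presence of a live QEMU pidfile under /run/qemu-server/, to
decide whether the activation belongs to an incoming live migration; the
real plugin may well use a different test. The point is only that, because
activate_volume() runs after the migration has already been cancelled, such
a check ends up in the "NOT a live migration" branch seen in the syslog
above.

#!/usr/bin/perl
# Illustrative only: NOT the actual StorPool plugin code.
# Hypothetical heuristic: treat the activation as part of an incoming live
# migration only if a local QEMU pidfile for the VMID points at a running
# process.
use strict;
use warnings;

sub looks_like_incoming_live_migration {
    my ($vmid) = @_;

    my $pidfile = "/run/qemu-server/$vmid.pid";
    return 0 unless -e $pidfile;

    open(my $fh, '<', $pidfile) or return 0;
    my $pid = <$fh>;
    close($fh);
    chomp $pid if defined $pid;

    # kill(0, $pid) only checks whether the process exists.
    return (defined $pid && $pid =~ /^\d+$/ && kill(0, $pid)) ? 1 : 0;
}

sub activate_volume_example {
    my ($vmid, $volname) = @_;

    if (looks_like_incoming_live_migration($vmid)) {
        warn "StorPool plugin: live migration of VM $vmid, not detaching volume $volname\n";
    } else {
        # With the migration already cancelled, the check fails and we end
        # up here, matching the "will force detach volume" syslog message.
        warn "StorPool plugin: NOT a live migration of VM $vmid, will force detach volume $volname\n";
    }
}

activate_volume_example(2421, '~bj7n.b.abe');

On a node where no QEMU process for VM 2421 exists anymore, this prints the
same "NOT a live migration ... will force detach volume" message as in the
syslog above.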

