[PVE-User] VM Migration Not Happening :-(
duluxoz
duluxoz at gmail.com
Mon Sep 2 07:50:30 CEST 2024
Hi Gilberto, and thank you for getting back to me.
Just to be 100% clear: the Proxmox (with Hyper-Converged Ceph) cluster
is working A-OK, except for the fact that I can't migrate *any* of the
VMs (live or shut down).
Yes, I can SSH into each node from every other node (a quick
non-interactive check is sketched after this list) using:
* the hostname of the "management" NIC
* the hostname of the "migration traffic" NIC
* the IP address of the "management" NIC
* the IP address of the "migration traffic" NIC
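Since live migration runs over a root SSH tunnel, it's worth confirming
that SSH works with no password or host-key prompt at all (which is how
the tunnel behaves). A minimal sketch, using the names and addresses
from my /etc/hosts below:
```
# From each node: prompt-free root SSH to every peer, over both the
# management and migration-traffic hostnames and IP addresses.
for host in pven1 pven2 pven3 pvent1 pvent2 pvent3 \
            192.168.100.10{1,2,3} 192.168.200.10{1,2,3}; do
    # BatchMode=yes makes ssh fail instead of prompting, mimicking
    # what the migration tunnel does.
    ssh -o BatchMode=yes -o ConnectTimeout=5 root@"$host" true \
        && echo "OK   $host" || echo "FAIL $host"
done
```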
The VM's HDD is on the rbd storage (see below).
As requested:
/etc/pve/storage.cfg
```
dir: local
    path /var/lib/vz
    content vztmpl,iso,backup

lvmthin: local-lvm
    thinpool data
    vgname pve
    content images,rootdir

rbd: rbd
    content images,rootdir
    krbd 0
    pool rbd

cephfs: cephfs
    path /data/cephfs
    content backup,vztmpl,iso
    fs-name cephfs
```
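Side note: since the VM's disk is on rbd, it's also worth confirming
that the storage is active and the disk visible from the *target* node,
as a disk the target can't open will abort a migration too. For example:
```
# On every node: all storages (especially 'rbd') should report "active"
pvesm status

# On the target node: the migrating VM's disk image should be listed
# (VM 100 is the one from the task log quoted below)
pvesm list rbd
```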
/etc/pve/datacenter.cfg
```
console: html5
crs: ha-rebalance-on-start=1
ha: shutdown_policy=migrate
keyboard: en-us
migration: secure,network=192.168.200.0/24
next-id: lower=1000
```
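For what it's worth, the migration network configured here can also be
overridden per-run from the CLI, which helps isolate whether the
192.168.200.0/24 network itself is the problem. A sketch (VM 100 and
target pven3 taken from the task log quoted below; for an HA-managed VM
the request may be routed via the HA stack):
```
# Live-migrate VM 100 to pven3 over the configured migration network:
qm migrate 100 pven3 --online --migration_network 192.168.200.0/24

# For comparison, force the same migration over the management network:
qm migrate 100 pven3 --online --migration_network 192.168.100.0/24
```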
/etc/pve/corosync.conf
```
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pven1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.100.101
    ring1_addr: 192.168.200.101
  }
  node {
    name: pven2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.100.102
    ring1_addr: 192.168.200.102
  }
  node {
    name: pven3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.100.103
    ring1_addr: 192.168.200.103
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: cluster1
  config_version: 4
  interface {
    knet_link_priority: 10
    linknumber: 0
  }
  interface {
    knet_link_priority: 20
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
```
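Both corosync links (and the HA stack, given that the failing task was
started by the HA resource agent) can be sanity-checked with the
standard tools:
```
# Quorum and membership overview
pvecm status

# Per-link (knet) connectivity as corosync sees it; link 1 is the
# 192.168.200.x ring, i.e. the same network the migration traffic uses
corosync-cfgtool -s

# HA manager / resource state
ha-manager status
```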
/etc/hosts
```
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6 ip6-localhost ip6-loopback

192.168.100.101 pven1.mydomain.local pven1
192.168.100.102 pven2.mydomain.local pven2
192.168.100.103 pven3.mydomain.local pven3

192.168.200.101 pvent1.mydomain.local pvent1
192.168.200.102 pvent2.mydomain.local pvent2
192.168.200.103 pvent3.mydomain.local pvent3
```
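And to rule out stale DNS overriding these entries, each node can
confirm that the migration-network names resolve to the addresses above:
```
# Should return the 192.168.200.x addresses from /etc/hosts:
getent hosts pvent1 pvent2 pvent3

# And this node's own hostname should map to its management address:
hostname --ip-address
```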
/etc/network/interfaces (for pven1; pven2 & pven3 are the same, except
for the IP addresses (see above))
```
auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto enp2s0
iface enp2s0 inet static
    address 192.168.200.20/24
    mtu 9000

auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-mode 802.3ad
    bond-xmit-hash-policy layer2+3
    bond-miimon 100
    bond-downdelay 200
    bond-updelay 200

auto vmbr0
iface vmbr0 inet static
    bridge-ports bond0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 100,200

auto vmbr0.200
iface vmbr0.200 inet static
    address 192.168.100.101/24
    gateway 192.168.100.1
```
Note: iface enp2s0 (on all 3 nodes) sits on an isolated VLAN which
(obviously) has no gateway, i.e. the only hosts on that VLAN are pven1,
pven2, & pven3, and all are "pingable" from each other.
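One caveat with "pingable": enp2s0 is set to mtu 9000, and a default
ping uses small packets, so it can succeed even if the switch silently
drops jumbo frames - something that would only surface under a bulk
transfer like a migration. A full-MTU path test (8972 = 9000 minus 28
bytes of IP + ICMP headers):
```
# From pven1: send full-size packets with the Don't-Fragment bit set
ping -M do -s 8972 -c 3 192.168.200.102
ping -M do -s 8972 -c 3 192.168.200.103
```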
Thanks for the help
Dulux-Oz
On 2/9/24 08:02, Gilberto Ferreira wrote:
> Hi
> We need more details.
> Send us the following:
>
> cat /etc/pve/storage.cfg
> cat /etc/pve/datacenter.cfg
> cat /etc/pve/corosync.conf
> cat /etc/hosts
> cat /etc/network/interfaces
>
> Can you ssh between the nodes?
>
> ---
>
> Gilberto Nunes Ferreira
> (47) 99676-7530 - Whatsapp / Telegram
>
> On Sun, 1 Sep 2024 at 05:11, duluxoz <duluxoz at gmail.com> wrote:
>
> Hi All,
>
> I need help with figuring out why I can't migrate a VM from one
> Proxmox
> Node to another (in the same cluster, of course).
>
> These are the details provided by the Proxmox Task Log:
>
> ```
> task started by HA resource agent
> 2024-09-01 18:02:30 use dedicated network address for sending migration traffic (192.168.200.103)
> 2024-09-01 18:02:30 starting migration of VM 100 to node 'pven3' (192.168.200.103)
> 2024-09-01 18:02:30 starting VM 100 on remote node 'pven3'
> 2024-09-01 18:02:30 [pven3]
> 2024-09-01 18:02:32 start remote tunnel
> 2024-09-01 18:02:33 ssh tunnel ver 1
> 2024-09-01 18:02:33 starting online/live migration on unix:/run/qemu-server/100.migrate
> 2024-09-01 18:02:33 set migration capabilities
> 2024-09-01 18:02:33 migration downtime limit: 100 ms
> 2024-09-01 18:02:33 migration cachesize: 256.0 MiB
> 2024-09-01 18:02:33 set migration parameters
> 2024-09-01 18:02:33 start migrate command to unix:/run/qemu-server/100.migrate
> channel 2: open failed: connect failed: open failed
> 2024-09-01 18:02:34 migration status error: failed - Unable to write to socket: Broken pipe
> 2024-09-01 18:02:34 ERROR: online migrate failure - aborting
> 2024-09-01 18:02:34 aborting phase 2 - cleanup resources
> 2024-09-01 18:02:34 migrate_cancel
> 2024-09-01 18:02:36 ERROR: migration finished with problems (duration 00:00:07)
> TASK ERROR: migration problems
> ```
>
> If someone could point me in the right direction to resolve this issue I'd be very grateful - thanks
>
> Cheers
>
> Dulux-Oz