[PVE-User] VM Migration Not Happening :-(

duluxoz duluxoz at gmail.com
Mon Sep 2 07:50:30 CEST 2024


Hi Gilberto, and thank you for getting back to me.

Just to be 100% clear: the Proxmox (with Hyper-Converged Ceph) cluster
is working AOK, except for the fact that I can't migrate *any* of the VMs
(either live or shut down).

Yes, I can SSH into each node from every other node (a sketch of the kind of check I mean follows the list) using:

  * the hostname of the "management" NIC
  * the hostname of the "migration traffic" NIC
  * the IP address of the "management" NIC
  * the IP address of the "migration traffic" NIC
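
Specifically, something along these lines, run as root from any one node, is
what I mean (hostnames/IPs as listed in /etc/hosts below; BatchMode=yes makes
sure a password prompt can't hide behind a "working" login, since migration
needs passwordless root SSH):

```
for target in pven2 pven3 pvent2 pvent3 \
              192.168.100.102 192.168.100.103 \
              192.168.200.102 192.168.200.103; do
    # BatchMode=yes fails instead of prompting, so key-based auth is proven
    ssh -o BatchMode=yes "root@${target}" hostname && echo "OK: ${target}"
done
```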

The VM's HDD is on the rbd storage (see below).

As requested:

/etc/pve/storage.cfg

```

dir: local
        path /var/lib/vz
        content vztmpl,iso,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

rbd: rbd
        content images,rootdir
        krbd 0
        pool rbd

cephfs: cephfs
        path /data/cephfs
        content backup,vztmpl,iso
        fs-name cephfs
```
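
In case it matters, a quick way to double-check that the rbd storage itself is
active and readable from each node would be something like the following
(stock pvesm/rbd commands; pool name "rbd" as per the config above):

```
# On each node: is the 'rbd' storage reported as active, and is the pool readable?
pvesm status --storage rbd
rbd ls rbd
```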

/etc/pve/datacenter.cfg

```

console: html5
crs: ha-rebalance-on-start=1
ha: shutdown_policy=migrate
keyboard: en-us
migration: secure,network=192.168.200.0/24
next-id: lower=1000
```
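
Since migration is configured as `secure` over 192.168.200.0/24, re-running a
migration from the CLI should reproduce the failure with the full output
visible. A sketch, using VM 100 and target pven3 as in the task log further
down (option names as per the `qm migrate` / `ha-manager` help; double-check
against your PVE version):

```
# Reproduce the migration outside the GUI to capture the full output
qm migrate 100 pven3 --online \
    --migration_type secure \
    --migration_network 192.168.200.0/24

# or, since the failing task was started by the HA resource agent:
ha-manager migrate vm:100 pven3
```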

/etc/pve/corosync.conf

```

logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pven1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.100.101
    ring1_addr: 192.168.200.101
  }
  node {
    name: pven2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: 192.168.100.102
    ring1_addr: 192.168.200.102
  }
  node {
    name: pven3
    nodeid: 3
    quorum_votes: 1
    ring0_addr: 192.168.100.103
    ring1_addr: 192.168.200.103
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: cluster1
  config_version: 4
  interface {
    knet_link_priority: 10
    linknumber: 0
  }
  interface {
    knet_link_priority: 20
    linknumber: 1
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}

```
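
If it helps, the state of both corosync links (ring0 on 192.168.100.x, ring1
on 192.168.200.x) can be confirmed with the standard tools, e.g.:

```
# Per-link knet state on the local node (both links should show "connected")
corosync-cfgtool -s
# Overall cluster membership / quorum view
pvecm status
```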

/etc/hosts

```

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 ip6-localhost ip6-loopback
192.168.100.101  pven1.mydomain.local pven1
192.168.100.102 pven2.mydomain.local pven2
192.168.100.103 pven3.mydomain.local pven3
192.168.200.101 pvent1.mydomain.local pvent1
192.168.200.102 pvent2.mydomain.local pvent2
192.168.200.103 pvent3.mydomain.local pvent3
```
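
And a quick sanity check that both sets of names resolve to the expected
addresses on every node (plain getent against the hosts file above):

```
# Expect 192.168.100.103 for pven3 and 192.168.200.103 for pvent3
getent hosts pven3 pvent3
```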

/etc/network/interfaces (for pven1; pven2 & pven3 are the same, except for the IP addresses (see above))

```

auto lo
iface lo inet loopback

iface eno1 inet manual

iface eno2 inet manual

auto enp2s0
iface enp2s0 inet static
  address 192.168.200.20/24
  mtu 9000

auto bond0
iface bond0 inet manual
  bond-slaves eno1 eno2
  bond-mode 802.3ad
  bond-xmit-hash-policy layer2+3
  bond-miimon 100
  bond-downdelay 200
  bond-updelay 200

auto vmbr0
iface vmbr0 inet static
  bridge-ports bond0
  bridge-stp off
  bridge-fd 0
  bridge-vlan-aware yes
  bridge-vids 100,200

auto vmbr0.200
iface vmbr0.200 inet static
  address 192.168.100.101/24
  gateway 192.168.100.1
```

Note: iface enp2s0 (on all 3 nodes) sits on an isolated VLAN which
(obviously) has no gateway, i.e. the only hosts on that VLAN are pven1,
pven2, & pven3, and all are pingable from each other.
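
One thing I still want to rule out, given the mtu 9000 on enp2s0, is a
jumbo-frame mismatch somewhere on that VLAN; a do-not-fragment ping at
near-full frame size (sizes below assume a 9000-byte MTU end to end) shows
whether large packets actually get through:

```
# From pven1 towards pven3's migration address; 8972 = 9000 - 20 (IP) - 8 (ICMP)
ping -M do -s 8972 -c 3 192.168.200.103
```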

Thanks for the help

Dulux-Oz


On 2/9/24 08:02, Gilberto Ferreira wrote:
> Hi
> We need more details.
> Send us the following:
>
> cat /etc/pve/storage.cfg
> cat /etc/pve/datacenter.cfg
> cat /etc/pve/corosync.conf
> cat /etc/hosts
> cat /etc/network/interfaces
>
> Can you ssh between the nodes?
>
>
> ---
>
>
> Gilberto Nunes Ferreira
> (47) 99676-7530 - Whatsapp / Telegram
>
> On Sun, 1 Sep 2024 at 05:11, duluxoz <duluxoz at gmail.com> wrote:
>
>     Hi All,
>
>     I need help with figuring out why I can't migrate a VM from one Proxmox
>     Node to another (in the same cluster, of course).
>
>     These are the details provided by the Proxmox Task Log:
>
>     ```
>
>     task started by HA resource agent
>     2024-09-01 18:02:30 use dedicated network address for sending migration traffic (192.168.200.103)
>     2024-09-01 18:02:30 starting migration of VM 100 to node 'pven3' (192.168.200.103)
>     2024-09-01 18:02:30 starting VM 100 on remote node 'pven3'
>     2024-09-01 18:02:30 [pven3]
>     2024-09-01 18:02:32 start remote tunnel
>     2024-09-01 18:02:33 ssh tunnel ver 1
>     2024-09-01 18:02:33 starting online/live migration on unix:/run/qemu-server/100.migrate
>     2024-09-01 18:02:33 set migration capabilities
>     2024-09-01 18:02:33 migration downtime limit: 100 ms
>     2024-09-01 18:02:33 migration cachesize: 256.0 MiB
>     2024-09-01 18:02:33 set migration parameters
>     2024-09-01 18:02:33 start migrate command to unix:/run/qemu-server/100.migrate
>     channel 2: open failed: connect failed: open failed
>     2024-09-01 18:02:34 migration status error: failed - Unable to write to socket: Broken pipe
>     2024-09-01 18:02:34 ERROR: online migrate failure - aborting
>     2024-09-01 18:02:34 aborting phase 2 - cleanup resources
>     2024-09-01 18:02:34 migrate_cancel
>     2024-09-01 18:02:36 ERROR: migration finished with problems (duration 00:00:07)
>     TASK ERROR: migration problems
>     ```
>
>     If someone could point me in the correct direction to resolve this issue
>     I'd be very grateful - thanks
>
>     Cheers
>
>     Dulux-Oz
>
>
>     _______________________________________________
>     pve-user mailing list
>     pve-user at lists.proxmox.com
>     https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>

