[PVE-User] Ceph jewel to luminous upgrade problem
Eneko Lacunza
elacunza at binovo.es
Mon Nov 13 16:44:56 CET 2017
Hi again,
It seems we hit this reported (won't-fix) bug:
http://tracker.ceph.com/issues/16211
I managed to start an affected VM by following step #12; I'll continue
applying the fix to see whether all the affected VMs recover this way.
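
In case it helps someone else hitting this: a quick way to see which images
are affected is simply to try "rbd info" on every image in the pool (a rough
sketch, using our pool name):

for img in $(rbd -p proxmox ls); do
    rbd -p proxmox info "$img" >/dev/null 2>&1 || echo "affected: $img"
done

Also, even when "rbd info" fails, the image format can be told apart from the
object names in the pool: a format 1 image has a '<name>.rbd' header object,
while a format 2 image has 'rbd_id.<name>' and 'rbd_header.<id>' objects.
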
Thanks
On 13/11/17 at 16:26, Eneko Lacunza wrote:
> Hi all,
>
> We're in the process of upgrading our office Proxmox v4.4 cluster to
> v5.1.
>
> To do that, we first followed the instructions at
> https://pve.proxmox.com/wiki/Ceph_Jewel_to_Luminous
> to upgrade Ceph from Jewel to Luminous.
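>
> Roughly, the sequence was (from memory; the wiki above is authoritative and
> the exact commands may differ slightly):
>
> # on every node: point the Ceph repo at luminous and upgrade the packages
> sed -i 's/jewel/luminous/' /etc/apt/sources.list.d/ceph.list
> apt update && apt dist-upgrade
> # restart the mons first, one node at a time, then create the mgr daemons
> systemctl restart ceph-mon.target
> pveceph createmgr
> # then restart the OSDs, one node at a time
> systemctl restart ceph-osd.target
> # finally, once everything reports luminous
> ceph osd require-osd-release luminous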
>
> Upgrade was apparently a success:
> # ceph -s
>   cluster:
>     id:     8ee074d4-005c-4bd6-a077-85eddde543b5
>     health: HEALTH_OK
>
>   services:
>     mon: 3 daemons, quorum 0,2,3
>     mgr: butroe(active), standbys: guadalupe, sanmarko
>     osd: 12 osds: 12 up, 12 in
>
>   data:
>     pools:   2 pools, 640 pgs
>     objects: 518k objects, 1966 GB
>     usage:   4120 GB used, 7052 GB / 11172 GB avail
>     pgs:     640 active+clean
>
>   io:
>     client: 644 kB/s rd, 3299 kB/s wr, 61 op/s rd, 166 op/s wr
>
> And versions seem good too:
> # ceph mon versions
> {
>     "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 3
> }
> # ceph osd versions
> {
>     "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 12
> }
>
> But this weekend there were problems backing up some VMs, all with the
> same error:
> no such volume 'ceph-proxmox:vm-120-disk-1'
>
> The "missing" volumes don't show in storage content, but they DO if we
> do a "rbd -p proxmox ls".
>
> When we try an info command on one of them, though, we get an error:
> # rbd -p proxmox info vm-120-disk-1
> 2017-11-13 16:04:02.979006 7f99d8ff9700 -1 librbd::image::OpenRequest:
> failed to retreive immutable metadata: (2) No such file or directory
> rbd: error opening image vm-120-disk-1: (2) No such file or directory
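>
> As far as I understand, that error means librbd cannot read the image's
> immutable metadata (object_prefix, order, size, features) from its
> rbd_header.<id> object. A rough sketch of how to check the objects behind
> the image (the header object name depends on the internal id stored in
> rbd_id.<name>):
>
> # rados -p proxmox stat rbd_id.vm-120-disk-1      # name -> internal id mapping object
> # rados -p proxmox listomapvals rbd_header.<id>   # should list size, order, features, ...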
>
> Other VM disk images behave normally:
> # rbd -p proxmox info vm-119-disk-1
> rbd image 'vm-119-disk-1':
>         size 3072 MB in 768 objects
>         order 22 (4096 kB objects)
>         block_name_prefix: rbd_data.575762ae8944a
>         format: 2
>         features: layering
>         flags:
>
> I don't really know what to look at next to diagnose this further. I recall
> that there was a version 1 format for rbd, but I doubt the "missing" disk
> images are in that old format (and I really don't know how to check that
> when info doesn't work...)
>
> Some of the VMs with "missing" disks are still running in "old" qemu
> processes and work correctly; but if we stop such a VM, it won't start
> again, failing with the error reported above. VMs with non-"missing" disk
> images start and stop normally.
>
> Any hints about what to try next?
>
> OSDs are filestore with XFS (created from GUI).
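>
> (For completeness, this can be double-checked per OSD with something like
> "ceph osd metadata 0 | grep osd_objectstore", which should report
> "filestore" here.)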
>
> # pveversion -v
> proxmox-ve: 4.4-96 (running kernel: 4.4.83-1-pve)
> pve-manager: 4.4-18 (running version: 4.4-18/ef2610e8)
> pve-kernel-4.4.67-1-pve: 4.4.67-92
> pve-kernel-4.4.76-1-pve: 4.4.76-94
> pve-kernel-4.4.83-1-pve: 4.4.83-96
> lvm2: 2.02.116-pve3
> corosync-pve: 2.4.2-2~pve4+1
> libqb0: 1.0.1-1
> pve-cluster: 4.0-53
> qemu-server: 4.0-113
> pve-firmware: 1.1-11
> libpve-common-perl: 4.0-96
> libpve-access-control: 4.0-23
> libpve-storage-perl: 4.0-76
> pve-libspice-server1: 0.12.8-2
> vncterm: 1.3-2
> pve-docs: 4.4-4
> pve-qemu-kvm: 2.9.0-5~pve4
> pve-container: 1.0-101
> pve-firewall: 2.0-33
> pve-ha-manager: 1.0-41
> ksm-control-daemon: 1.2-1
> glusterfs-client: 3.5.2-2+deb8u3
> lxc-pve: 2.0.7-4
> lxcfs: 2.0.6-pve1
> criu: 1.6.0-1
> novnc-pve: 0.5-9
> smartmontools: 6.5+svn4324-1~pve80
> zfsutils: 0.6.5.9-pve15~bpo80
> ceph: 12.2.1-1~bpo80+1
>
> Thanks a lot
> Eneko
>
--
Zuzendari Teknikoa / Director Técnico (Technical Director)
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es