[PVE-User] Boot disk corruption after Ceph OSD destroy with cleanup

Eneko Lacunza elacunza at binovo.es
Fri Mar 22 10:40:17 CET 2019


Hi,

On 22/3/19 at 9:59, Alwin Antreich wrote:
> On Fri, Mar 22, 2019 at 09:03:22AM +0100, Eneko Lacunza wrote:
>> On 22/3/19 at 8:35, Alwin Antreich wrote:
>>> On Thu, Mar 21, 2019 at 03:58:53PM +0100, Eneko Lacunza wrote:
>>>> We have removed an OSD disk from a server in our office cluster, removing
>>>> partitions (with --cleanup 1) and that has made the server unable to boot
>>>> (we have seen this in 2 servers in a row...)
>>>>
>>>> Looking at the command output:
>>>>
>>>> --- cut ---
>>>> root@sanmarko:~# pveceph osd destroy 5 --cleanup 1
>>>> destroy OSD osd.5
>>>> Remove osd.5 from the CRUSH map
>>>> Remove the osd.5 authentication key.
>>>> Remove OSD osd.5
>>>> Unmount OSD osd.5 from  /var/lib/ceph/osd/ceph-5
>>>> remove partition /dev/sda1 (disk '/dev/sda', partnum 1)
>>>> The operation has completed successfully.
>>>> remove partition /dev/sdd7 (disk '/dev/sdd', partnum 7)
>>>> Warning: The kernel is still using the old partition table.
>>>> The new table will be used at the next reboot or after you
>>>> run partprobe(8) or kpartx(8)
>>>> The operation has completed successfully.
>>>> wipe disk: /dev/sda
>>>> 200+0 records in
>>>> 200+0 records out
>>>> 209715200 bytes (210 MB, 200 MiB) copied, 1.29266 s, 162 MB/s
>>>> wipe disk: /dev/sdd
>>>> 200+0 records in
>>>> 200+0 records out
>>>> 209715200 bytes (210 MB, 200 MiB) copied, 1.00753 s, 208 MB/s
>>>> --- cut ---
>>>>
>>>> The boot disk is an SSD; note that the script says it is wiping /dev/sdd!!
>>>> Shouldn't it do that only to the journal partition (/dev/sdd7)?
>>>>
>>>> This cluster is on PVE 5.3 .
>>> Can you please update, I suppose you don't have the pve-manager with
>>> version 5.3-10 or newer installed yet. There the issue has been fixed.
>>>
>>> But if you do and the issue still persists, then please post the
>>> 'pveversion -v'.
>> Seems both servers were on 5.3-8, thanks for the hint.
>>
>> Maybe it would be helpful if you could publish some release notes for each
>> package push made to pve-enterprise/pve-no-subscription (maybe capturing
>> the changed packages' changelogs?), so that this kind of grave (even if
>> corner-case) problem is better communicated when the fix isn't first
>> released in a point release.
> I am not quite sure what you mean by that, but each package ships with a
> changelog, see pve-manager_5.3-11.
> http://download.proxmox.com/debian/pve/dists/stretch/pve-no-subscription/binary-amd64/pve-manager_5.3-11.changelog
>
> There is also the possibility to subscribe to bug reports and get
> notifications. See the corresponding bug report for this issue.
> https://bugzilla.proxmox.com/show_bug.cgi?id=2051
Right now it isn't announced when a repository is updated, nor which
changes/fixes/improvements have been included. (That is only done for point
releases, not for new package uploads.)
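
For anyone else hitting this before upgrading: below is a minimal sketch of a
check that the installed pve-manager is at least 5.3-10 (the version mentioned
above as containing the fix) before running an OSD destroy with cleanup. The
function names are hypothetical, and it assumes version strings of the simple
"5.3-8" form seen in this thread and a "pve-manager: <version> ..." line in the
'pveversion -v' output. It is not a full Debian version comparison.

```python
import re


def parse_version(version: str) -> tuple:
    """Split a version such as '5.3-8' into integers: (5, 3, 8).

    Assumption: only digits separated by '.' or '-'. Anything else
    (epochs, '~' suffixes, letters) raises ValueError.
    """
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def has_fix(installed: str, fixed_in: str = "5.3-10") -> bool:
    """Return True if 'installed' is at least 'fixed_in'.

    Compares numerically; a plain string comparison would wrongly
    rank '5.3-8' above '5.3-10'.
    """
    return parse_version(installed) >= parse_version(fixed_in)


def pve_manager_version(pveversion_output: str):
    """Extract the pve-manager version from 'pveversion -v' style text.

    Assumption: the output has a line starting with 'pve-manager: <version>'.
    Returns None if no such line is found.
    """
    match = re.search(r"^pve-manager:\s*(\S+)", pveversion_output, re.MULTILINE)
    return match.group(1) if match else None
```

With the text of 'pveversion -v' in a string, has_fix(pve_manager_version(text))
returns False for 5.3-8 (the version both of our servers had) and True for
5.3-10 or 5.3-11.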

Thanks
Eneko

-- 
Technical Director
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es



