[PVE-User] Boot disk corruption after Ceph OSD destroy with cleanup

Alwin Antreich a.antreich at proxmox.com
Fri Mar 22 09:59:25 CET 2019


On Fri, Mar 22, 2019 at 09:03:22AM +0100, Eneko Lacunza wrote:
> Hi Alwin,
> 
> On 22/3/19 at 8:35, Alwin Antreich wrote:
> > On Thu, Mar 21, 2019 at 03:58:53PM +0100, Eneko Lacunza wrote:
> > > We have removed an OSD disk from a server in our office cluster, removing
> > > partitions (with --cleanup 1) and that has made the server unable to boot
> > > (we have seen this in 2 servers in a row...)
> > > 
> > > Looking at the command output:
> > > 
> > > --- cut ---
> > > root@sanmarko:~# pveceph osd destroy 5 --cleanup 1
> > > destroy OSD osd.5
> > > Remove osd.5 from the CRUSH map
> > > Remove the osd.5 authentication key.
> > > Remove OSD osd.5
> > > Unmount OSD osd.5 from  /var/lib/ceph/osd/ceph-5
> > > remove partition /dev/sda1 (disk '/dev/sda', partnum 1)
> > > The operation has completed successfully.
> > > remove partition /dev/sdd7 (disk '/dev/sdd', partnum 7)
> > > Warning: The kernel is still using the old partition table.
> > > The new table will be used at the next reboot or after you
> > > run partprobe(8) or kpartx(8)
> > > The operation has completed successfully.
> > > wipe disk: /dev/sda
> > > 200+0 records in
> > > 200+0 records out
> > > 209715200 bytes (210 MB, 200 MiB) copied, 1.29266 s, 162 MB/s
> > > wipe disk: /dev/sdd
> > > 200+0 records in
> > > 200+0 records out
> > > 209715200 bytes (210 MB, 200 MiB) copied, 1.00753 s, 208 MB/s
> > > --- cut ---
> > > 
> > > The boot disk is an SSD, and note that the script says it is wiping
> > > /dev/sdd!! Shouldn't it wipe only the journal partition (/dev/sdd7)?
> > > 
> > > This cluster is on PVE 5.3 .
> > Can you please update? I suppose you don't have pve-manager
> > version 5.3-10 or newer installed yet. The issue has been fixed there.
> > 
> > But if you do and the issue still persists, then please post the
> > output of 'pveversion -v'.
> Seems both servers were on 5.3-8, thanks for the hint.
> 
> Maybe it would be helpful if you could publish some release notes for each
> package push made to pve-enterprise/pve-no-subscription (maybe by capturing
> the changed packages' changelogs?), so that this kind of grave (if perhaps
> corner-case) problem is better communicated when the fix isn't first
> released in a point release.
I am not quite sure what you mean by that, but each package ships with a
changelog; see, for example, the one for pve-manager_5.3-11:
http://download.proxmox.com/debian/pve/dists/stretch/pve-no-subscription/binary-amd64/pve-manager_5.3-11.changelog

There is also the possibility to subscribe to bug reports and get
notifications. See the corresponding bug report for this issue.
https://bugzilla.proxmox.com/show_bug.cgi?id=2051
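
As a rough sketch of the version check discussed above (not an official
Proxmox tool): the shell functions below compare a package version string
against 5.3-10, the pve-manager version from which the issue is fixed. The
names version_ge and has_osd_cleanup_fix are made up for illustration, and
the sort -V fallback only approximates Debian version ordering.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 >= version $2.
# Uses dpkg's own comparison when available; otherwise falls back to
# GNU sort -V, which is only an approximation of Debian ordering.
version_ge() {
    if command -v dpkg >/dev/null 2>&1; then
        dpkg --compare-versions "$1" ge "$2"
    else
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
    fi
}

# Hypothetical helper: reports whether the given pve-manager version
# already contains the fix mentioned in this thread (5.3-10 or newer).
has_osd_cleanup_fix() {
    if version_ge "$1" "5.3-10"; then
        echo "pve-manager $1: fix included"
    else
        echo "pve-manager $1: update before running 'pveceph osd destroy --cleanup 1'"
        return 1
    fi
}

# Example call on a node (assumes pve-manager is installed):
#   has_osd_cleanup_fix "$(dpkg-query -W -f='${Version}' pve-manager)"
```

Passing the version in as an argument keeps the check usable against
'pveversion -v' output collected from several nodes.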


--
Cheers,
Alwin



