[pve-devel] [PATCH manager] add wipe_disk option when destroying ceph disk
Alwin Antreich
a.antreich at proxmox.com
Tue Oct 23 16:54:54 CEST 2018
Additionally, this is also tracked on Ceph's side:
https://tracker.ceph.com/issues/22354
On Tue, Oct 23, 2018 at 04:49:27PM +0200, Alwin Antreich wrote:
> On Tue, Oct 23, 2018 at 04:19:36PM +0200, Thomas Lamprecht wrote:
> > On 10/23/18 4:02 PM, Alwin Antreich wrote:
> > > Nice, was on my list too. ;) Some comments inline.
> > >
> > > On Tue, Oct 23, 2018 at 03:33:44PM +0200, David Limbeck wrote:
> > >> this allows the disk to be reused as a ceph disk by zeroing the
> > >> first 200M of the destroyed disk
> > >>
> > >> Signed-off-by: David Limbeck <d.limbeck at proxmox.com>
> > >> ---
> > >> PVE/API2/Ceph.pm | 22 ++++++++++++++++++++++
> > >> www/manager6/ceph/OSD.js | 18 +++++++++++++++++-
> > >> 2 files changed, 39 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
> > >> index 69489a70..6dce2f01 100644
> > >> --- a/PVE/API2/Ceph.pm
> > >> +++ b/PVE/API2/Ceph.pm
> > >> @@ -347,6 +347,12 @@ __PACKAGE__->register_method ({
> > >> optional => 1,
> > >> default => 0,
> > >> },
> > >> + wipe_disk => {
> > >> + description => 'Wipe first 200M of disk to make it reusable as a ceph OSD.',
> > >> + type => 'boolean',
> > >> + optional => 1,
> > >> + default => 0,
> > >> + },
> > > I suggest not exposing this as a separate option, as the 'cleanup'
> > > flag should do this in one go. If I set the cleanup option, I
> > > definitely want the wipe too.
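> > >
> > > E.g., a rough sketch of what I mean (untested; it reuses the
> > > collection loop from further down in this patch, just keyed off the
> > > existing 'cleanup' parameter instead of a new one):
> > >
> > >     my $disks_to_wipe = {};
> > >     if ($param->{cleanup}) { # cleanup implies the wipe
> > >         foreach my $part (@$partitions_to_remove) {
> > >             next if !$part || (! -b $part);
> > >             my $devpath = PVE::Diskmanage::get_blockdev($part);
> > >             $disks_to_wipe->{$devpath} = 1;
> > >         }
> > >     }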
> >
> > Is the data destroyed too, or could I re-add it (somehow) if
> > it wasn't wiped? If that's the case, we may want to keep the option,
> > but yes, normally you're right.
> The command is 'destroyosd'; doesn't that imply that the data is
> destroyed already? And the cleanup flag zaps the device.
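>
> Roughly, the zap amounts to clearing the partition table (a sketch
> only; the device path is illustrative and sgdisk comes from the gdisk
> package):
>
>     use PVE::Tools qw(run_command);
>
>     # destroy the GPT and MBR data structures on the device
>     run_command(['sgdisk', '--zap-all', '/dev/sdX']);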
>
> The OSD has to be set to down & out prior to the destroy. With or
> without a wipe, it is removed from the ceph cluster. It still contains
> "old data", but that is not usable, or only with forensic effort. ;)
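>
> For reference, the down/out step amounts to something like this (a
> sketch, assuming the standard ceph CLI and systemd unit naming; the
> id is illustrative, not taken from the actual destroyosd code):
>
>     use PVE::Tools qw(run_command);
>
>     my $osdid = 0; # illustrative
>     # mark the OSD out so data is rebalanced away from it, then stop
>     # the daemon, which marks it down
>     run_command(['ceph', 'osd', 'out', "osd.$osdid"]);
>     run_command(['systemctl', 'stop', "ceph-osd\@$osdid.service"]);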
>
> >
> > >
> > >> },
> > >> },
> > >> returns => { type => 'string' },
> > >> @@ -434,6 +440,15 @@ __PACKAGE__->register_method ({
> > >> }
> > >> }
> > >>
> > >> + my $disks_to_wipe = {};
> > >> + if ($param->{wipe_disk}) {
> > >> + foreach my $part (@$partitions_to_remove) {
> > >> + next if !$part || (! -b $part );
> > >> + my $devpath = PVE::Diskmanage::get_blockdev($part);
> > >> + $disks_to_wipe->{$devpath} = 1;
> > >> + }
> > >> + }
> > >> +
> > >> print "Unmount OSD $osdsection from $mountpoint\n";
> > >> eval { run_command(['/bin/umount', $mountpoint]); };
> > >> if (my $err = $@) {
> > >> @@ -443,6 +458,13 @@ __PACKAGE__->register_method ({
> > >> foreach my $part (@$partitions_to_remove) {
> > >> $remove_partition->($part);
> > >> }
> > >> + if ($param->{wipe_disk}) {
> > >> + foreach my $devpath (keys %$disks_to_wipe) {
> > >> + print "wipe disk: $devpath\n";
> > >> + eval { run_command(['/bin/dd', 'if=/dev/zero', "of=${devpath}", 'bs=1M', 'count=200']); };
> > > The dd needs the fdatasync conv option, and maybe additionally
> > > /dev/urandom as input, as some disks or NVMe devices will not write
> > > the data out otherwise.
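> > >
> > > Something like this is what I mean (untested sketch of the line
> > > above):
> > >
> > >     eval { run_command(['/bin/dd', 'if=/dev/zero', "of=${devpath}",
> > >         'bs=1M', 'count=200', 'conv=fdatasync']); };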
> >
> > Citation? If I write zeros to a blockdev, I actually want zeros to
> > be written. Whether the blockdev internally optimizes that by not
> > writing all zeros out, but only marking the area I wrote as zeroed,
> > does not matter for the kernel/user. A read after that operation
> > *must* return zero - else it's not a storage device but garbage...
> That is what should have happened, but with 'if=/dev/zero' alone,
> creating a new OSD failed; fdatasync or urandom helped.
>
> >
> > urandom has a (minimal) chance to write something that looks like an
> > OSD (or whatever) again. Also, we do not want to interfere with the
> > blockdev's features (i.e., write optimizations).
> True. Merely a suggestion, as those optimizations seem to not help.
>