[pve-devel] [PATCH manager] add wipe_disk option when destroying ceph disk

Tue Oct 23 17:08:13 CEST 2018

On 10/23/18 4:49 PM, Alwin Antreich wrote:
> On Tue, Oct 23, 2018 at 04:19:36PM +0200, Thomas Lamprecht wrote:
>> On 10/23/18 4:02 PM, Alwin Antreich wrote:
>>> Nice, was on my list too. ;) Some comments inline.
>>>
>>> On Tue, Oct 23, 2018 at 03:33:44PM +0200, David Limbeck wrote:
>>>> this allows the disk to be reused as ceph disk by zeroing the first 200M
>>>> of the destroyed disk
>>>>
>>>> Signed-off-by: David Limbeck <d.limbeck at proxmox.com>
>>>> ---
>>>>  PVE/API2/Ceph.pm         | 22 ++++++++++++++++++++++
>>>>  www/manager6/ceph/OSD.js | 18 +++++++++++++++++-
>>>>  2 files changed, 39 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
>>>> index 69489a70..6dce2f01 100644
>>>> --- a/PVE/API2/Ceph.pm
>>>> +++ b/PVE/API2/Ceph.pm
>>>> @@ -347,6 +347,12 @@ __PACKAGE__->register_method ({
>>>>  		optional => 1,
>>>>  		default => 0,
>>>>  	    },
>>>> +	    wipe_disk => {
>>>> +		description => 'Wipe first 200M of disk to make it reusable as a ceph OSD.',
>>>> +		type => 'boolean',
>>>> +		optional => 1,
>>>> +		default => 0,
>>>> +	    },
>>> I suggest to not expose this as a separate option, as the 'cleanup'
>>> should do this in one go. If I want to set the cleanup option, I
>>> definitely want the wipe too.
>>
>> Is the data destroyed too, or could I re-add it (somehow) if
>> it wasn't wiped? If that's the case we may want to have the option
>> but yes, normally you're right.
> The command is 'destroyosd', doesn't that imply that the data is
> destroyed already? And the cleanup flag zaps the device.
> 
> The OSD has to be set to down & out prior the destroy. With or witout a
> wipe, it is removed from the ceph cluster. It contains "old data" that is
> not usable or only with forensic efforts. ;)

OK, then yes, the option is probably useless and it should be always done.

> 
>>
>>>
>>>>  	},
>>>>      },
>>>>      returns => { type => 'string' },
>>>> @@ -434,6 +440,15 @@ __PACKAGE__->register_method ({
>>>>  		}
>>>>  	    }
>>>>  
>>>> +	    my $disks_to_wipe = {};
>>>> +	    if ($param->{wipe_disk}) {
>>>> +		foreach my $part (@$partitions_to_remove) {
>>>> +		    next if !$part || (! -b $part );
>>>> +		    my $devpath = PVE::Diskmanage::get_blockdev($part);
>>>> +		    $disks_to_wipe->{$devpath} = 1;
>>>> +		}
>>>> +	    }
>>>> +
>>>>  	    print "Unmount OSD $osdsection from  $mountpoint\n";
>>>>  	    eval { run_command(['/bin/umount', $mountpoint]); };
>>>>  	    if (my $err = $@) {
>>>> @@ -443,6 +458,13 @@ __PACKAGE__->register_method ({
>>>>  		foreach my $part (@$partitions_to_remove) {
>>>>  		    $remove_partition->($part);
>>>>  		}
>>>> +		if ($param->{wipe_disk}) {
>>>> +		    foreach my $devpath (keys %$disks_to_wipe) {
>>>> +			print "wipe disk: $devpath\n";
>>>> +			eval { run_command(['/bin/dd', 'if=/dev/zero', "of=${devpath}", 'bs=1M', 'count=200']); };
>>> The dd needs the fdatasync option and maybe additionally as input
>>> /dev/urandom, as some disks or NVMe will not write the data out.
>>
>> citation? If I write zeros to a blockdev I actually want zeros to be
>> written. If the blockdev internally optimizes that by not writing all
>> zeros out but only marks the area I wrote as zeroed does not matter,
>> for the kernel/user. The read after that operation *must* return zero -
>> else it's not a storage device but garbage...
> That should have happen, but with 'if=/dev/zero' creating a new OSD
> failed and fdatasync or urandom helped.

fdatasync is OK and even desired, no objection there, but urandom really
isn't /dev/zero should be OK.