[pve-devel] [PATCH manager] fix #4631: ceph: osd: create: add osds-per-device
Aaron Lauterer
a.lauterer at proxmox.com
Mon Aug 21 12:51:47 CEST 2023
responses inline
On 8/21/23 10:20, Fiona Ebner wrote:
> On 18.04.23 at 14:26, Aaron Lauterer wrote:
>> Allows automatically creating multiple OSDs per physical device. The
>> main use case is fast NVME drives that would be bottlenecked by a
>> single OSD service.
>>
>> By using the 'ceph-volume lvm batch' command instead of the 'ceph-volume
>> lvm create' for multiple OSDs / device, we don't have to deal with the
>> split of the drive ourselves.
>>
>> But this means that the parameters to specify a DB or WAL device won't
>> work as the 'batch' command doesn't use them. Dedicated DB and WAL
>> devices don't make much sense anyway if we place the OSDs on fast NVME
>> drives.
>>
>> Some other changes to how the command is built were needed as well, as
>> the 'batch' command needs the path to the disk as a positional argument,
>> not as '--data /dev/sdX'.
>> We drop the '--cluster-fsid' parameter because the 'batch' command
>> doesn't accept it. The 'create' will fall back to reading it from the
>> ceph.conf file.
>>
>> Removal of OSDs works as expected without any code changes. As long as
>> there are other OSDs on a disk, the VG & PV won't be removed, even if
>> 'cleanup' is enabled.
>>
>> Signed-off-by: Aaron Lauterer <a.lauterer at proxmox.com>
>> ---
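To illustrate the difference described in the commit message, here is a minimal stand-alone sketch of how the two command lines differ. It is not the code from the patch: `build_osd_cmd` is a hypothetical helper, and passing `--yes` to skip the confirmation prompt of 'batch' is an assumption made for a non-interactive call.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): assemble the ceph-volume
# argument list for a single data device.
sub build_osd_cmd {
    my ($dev, $osds_per_device) = @_;

    my $cmd = ['ceph-volume', 'lvm'];
    if ($osds_per_device) {
        # 'batch' splits the device itself and expects the disk as a
        # positional argument. '--yes' is an assumption of this sketch.
        push @$cmd, 'batch', '--osds-per-device', $osds_per_device, '--yes', $dev;
    } else {
        # 'create' takes the disk via '--data'. No '--cluster-fsid' is
        # passed in either branch; 'create' reads it from ceph.conf.
        push @$cmd, 'create', '--data', $dev;
    }
    return $cmd;
}

print join(' ', @{ build_osd_cmd('/dev/nvme0n1', 4) }), "\n";
print join(' ', @{ build_osd_cmd('/dev/sdb') }), "\n";
```

Building the command as a list rather than a single string keeps the positional device path and the option values as separate arguments.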
>
> I noticed a warning while testing
>
> --> DEPRECATION NOTICE
> --> You are using the legacy automatic disk sorting behavior
> --> The Pacific release will change the default to --no-auto
> --> passed data devices: 1 physical, 0 LVM
> --> relative data size: 0.3333333333333333
>
> Note that I'm on Quincy, so maybe they still didn't change it :P
This also shows up when using `ceph-volume lvm batch …` directly. After consulting
the man page, I guess there is not much we can do about it.
>
>> @@ -275,6 +275,12 @@ __PACKAGE__->register_method ({
>> type => 'string',
>> description => "Set the device class of the OSD in crush."
>> },
>> + 'osds-per-device' => {
>> + optional => 1,
>> + type => 'number',
>
> should be integer
will change
>
>> + minimum => '1',
>> + description => 'OSD services per physical device. Can improve fast NVME utilization.',
>
> Can we add an explicit recommendation against doing it for other disk
> types? I imagine it's not beneficial for those, or?
What about something like:
"Only useful for fast NVME devices to utilize their performance better."?
>
>> + },
>> },
>> },
>> returns => { type => 'string' },
>> @@ -294,6 +300,15 @@ __PACKAGE__->register_method ({
>> # extract parameter info and fail if a device is set more than once
>> my $devs = {};
>>
>> + # allow 'osds-per-device' only without dedicated db and/or wal devs. We cannot specify them with
>> + # 'ceph-volume lvm batch' and they don't make a lot of sense on fast NVMEs anyway.
>> + if ($param->{'osds-per-device'}) {
>> + for my $type ( qw(db_dev wal_dev) ) {
>> + die "Cannot use 'osds-per-device' parameter with '${type}'"
>
> Missing newline after error message.
> Could also use raise_param_exc().
Ah thanks. Will switch it to a `raise_param_exc()`, where we don't need the
newline AFAICT?
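For reference, a stand-alone sketch of the check with the newline added, in case a plain `die` were kept. `check_osds_per_device` is a hypothetical name used only here; in the API handler the check would sit inline, or use `raise_param_exc()` as discussed.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the check from the patch.
sub check_osds_per_device {
    my ($param) = @_;

    return if !$param->{'osds-per-device'};

    for my $type (qw(db_dev wal_dev)) {
        # The trailing "\n" stops Perl from appending " at FILE line N."
        # to the message.
        die "Cannot use 'osds-per-device' parameter with '${type}'\n"
            if $param->{$type};
    }
    return;
}

# Example: combining the option with a DB device is rejected.
eval { check_osds_per_device({ 'osds-per-device' => 2, db_dev => '/dev/sdb' }) };
print $@;
```

Without the trailing newline, the file name and line number of the `die` would be appended to the message shown to the user.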
>
>> + if $param->{$type};
>> + }
>> + }
>> +
>> my $ceph_conf = cfs_read_file('ceph.conf');
>>
>> my $osd_network = $ceph_conf->{global}->{cluster_network};