[PVE-User] Proxmox HCI Ceph: "osd_max_backfills" is overridden and set to 1000
Benjamin Hofer
benjamin at gridscale.io
Tue May 30 12:00:02 CEST 2023
Dear community,
We've set up a Proxmox hyper-converged Ceph cluster in production.
After syncing in one new OSD using the "pveceph osd create" command,
we ran into massive network performance issues and outages. We then
found that "osd_max_backfills" is set to 1000 (the Ceph default is 1)
and that this value, along with several others, has been overridden.
Does anyone know the root cause? I can't imagine this is the Proxmox
default behaviour, and I'm quite sure we didn't change anything
ourselves (in fact, I didn't even know about this setting before
researching it and talking to colleagues with deeper Ceph knowledge).
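For completeness, here is a sketch of how we plan to double-check, per
daemon, which layer supplies the 1000 (osd.1 taken as an example; to be
run on the node hosting that OSD, via the admin socket):

# effective runtime value and the source it comes from
ceph daemon osd.1 config get osd_max_backfills
ceph daemon osd.1 config diff | grep -A 4 osd_max_backfills

# value stored in the cluster configuration database, for comparison
ceph config get osd.1 osd_max_backfills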
System:
PVE version output: pve-manager/7.3-6/723bb6ec (running kernel: 5.15.102-1-pve)
ceph version 17.2.5 (e04241aa9b639588fa6c864845287d2824cb6b55) quincy (stable)
# ceph config get osd.1
WHO    MASK  LEVEL  OPTION                            VALUE         RO
osd.1        basic  osd_mclock_max_capacity_iops_ssd  17080.220753
# ceph config show osd.1
NAME                                             VALUE                             SOURCE    OVERRIDES  IGNORES
auth_client_required                             cephx                             file
auth_cluster_required                            cephx                             file
auth_service_required                            cephx                             file
cluster_network                                  10.0.18.0/24                      file
daemonize                                        false                             override
keyring                                          $osd_data/keyring                 default
leveldb_log                                                                        default
mon_allow_pool_delete                            true                              file
mon_host                                         10.0.18.30 10.0.18.10 10.0.18.20  file
ms_bind_ipv4                                     true                              file
ms_bind_ipv6                                     false                             file
no_config_file                                   false                             override
osd_delete_sleep                                 0.000000                          override
osd_delete_sleep_hdd                             0.000000                          override
osd_delete_sleep_hybrid                          0.000000                          override
osd_delete_sleep_ssd                             0.000000                          override
osd_max_backfills                                1000                              override
osd_mclock_max_capacity_iops_ssd                 17080.220753                      mon
osd_mclock_scheduler_background_best_effort_lim  999999                            default
osd_mclock_scheduler_background_best_effort_res  534                               default
osd_mclock_scheduler_background_best_effort_wgt  2                                 default
osd_mclock_scheduler_background_recovery_lim     2135                              default
osd_mclock_scheduler_background_recovery_res     534                               default
osd_mclock_scheduler_background_recovery_wgt     1                                 default
osd_mclock_scheduler_client_lim                  999999                            default
osd_mclock_scheduler_client_res                  1068                              default
osd_mclock_scheduler_client_wgt                  2                                 default
osd_pool_default_min_size                        2                                 file
osd_pool_default_size                            3                                 file
osd_recovery_max_active                          1000                              override
osd_recovery_max_active_hdd                      1000                              override
osd_recovery_max_active_ssd                      1000                              override
osd_recovery_sleep                               0.000000                          override
osd_recovery_sleep_hdd                           0.000000                          override
osd_recovery_sleep_hybrid                        0.000000                          override
osd_recovery_sleep_ssd                           0.000000                          override
osd_scrub_sleep                                  0.000000                          override
osd_snap_trim_sleep                              0.000000                          override
osd_snap_trim_sleep_hdd                          0.000000                          override
osd_snap_trim_sleep_hybrid                       0.000000                          override
osd_snap_trim_sleep_ssd                          0.000000                          override
public_network                                   10.0.18.0/24                      file
rbd_default_features                             61                                default
rbd_qos_exclude_ops                              0                                 default
setgroup                                         ceph                              cmdline
setuser                                          ceph                              cmdline
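In case it is relevant for the discussion: as a stop-gap we are
considering (not yet applied, so please treat it as a rough sketch
rather than a verified fix) pausing data movement and pushing the
throttle back to the documented default at runtime:

# pause backfill/rebalance while the cause is being investigated
ceph osd set nobackfill
ceph osd set norebalance

# attempt to clamp the runtime value back to the Ceph default of 1;
# whether this can actually win against the "override" source shown
# above is exactly what we are unsure about
ceph tell 'osd.*' injectargs '--osd_max_backfills 1'

# resume data movement afterwards
ceph osd unset nobackfill
ceph osd unset norebalance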
Thanks a lot in advance.
Best
Benjamin