[PVE-User] [ceph-users] Re: Ceph Usage web and terminal.
Сергей Цаболов
tsabolov at t8.ru
Wed Dec 29 13:51:03 CET 2021
Hi, Uwe
On 29.12.2021 14:16, Uwe Sauter wrote:
> Just a feeling but I'd say that the imbalance in OSDs (one host having many more disks than the
> rest) is your problem.
Yes, the last node in the cluster has more disks than the rest, but
one of them is 12 TB and the other 9 HDDs are 1 TB each.
>
> Assuming that your configuration keeps 3 copies of each VM image then the imbalance probably means
> that 2 of these 3 copies reside on pve-3111 and if this host is unavailable, all VM images with 2
> copies on that host become unresponsive, too.
In the Proxmox web UI I set the Ceph pool to Size: 2, Min. Size: 2.
With: ceph osd map vm.pool object-name (the VM ID) I can see that for
some VM objects one copy is on osd.12, for example:
osdmap e14321 pool 'vm.pool' (2) object '114' -> pg 2.10486407 (2.7) ->
up ([12,8], p12) acting ([12,8], p12)
But in this example:
osdmap e14321 pool 'vm.pool' (2) object '113' -> pg 2.8bd09f6d (2.36d)
-> up ([10,7], p10) acting ([10,7], p10)
the copies are on osd.10 and osd.7.
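For reference, the pool's replication settings can be double-checked from the command line (a sketch using the vm.pool name from the output above; the values in the comments only reflect what was set in the web UI):

  ceph osd pool get vm.pool size        # should report size: 2
  ceph osd pool get vm.pool min_size    # should report min_size: 2

With size 2 and min_size 2, a PG stops serving I/O as soon as one of its two OSDs is down, so any VM image with one copy on pve-3111 would stall while that host is offline.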
>
> Check your failure domain for Ceph and possibly change it from OSD to host. This should prevent that
> one host holds multiple copies of a VM image.
I don't quite understand what I should check.
Can you explain it with an example?
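From the Ceph documentation, the failure domain is defined by the CRUSH rule the pool uses, so a rough sketch of what could be checked and changed (the rule name replicated_host below is only an illustrative example):

  ceph osd pool get vm.pool crush_rule   # which CRUSH rule the pool uses
  ceph osd crush rule dump               # check the chooseleaf step: type "osd" vs "host"

  # if the rule picks leaves of type "osd", a host-level rule can be created and assigned:
  ceph osd crush rule create-replicated replicated_host default host
  ceph osd pool set vm.pool crush_rule replicated_host

Changing the rule triggers data movement while Ceph rebalances the copies across hosts.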
>
>
> Regards,
>
> Uwe
>
> On 29.12.21 at 09:36, Сергей Цаболов wrote:
>> Hello to all.
>>
>> In my case I have a 7-node Proxmox cluster with Ceph up and running (ceph version 15.2.15 octopus
>> (stable) on all 7 nodes)
>>
>> Ceph HEALTH_OK
>>
>> ceph -s
>> cluster:
>> id: 9662e3fa-4ce6-41df-8d74-5deaa41a8dde
>> health: HEALTH_OK
>>
>> services:
>> mon: 7 daemons, quorum pve-3105,pve-3107,pve-3108,pve-3103,pve-3101,pve-3111,pve-3109 (age 17h)
>> mgr: pve-3107(active, since 41h), standbys: pve-3109, pve-3103, pve-3105, pve-3101, pve-3111,
>> pve-3108
>> mds: cephfs:1 {0=pve-3105=up:active} 6 up:standby
>> osd: 22 osds: 22 up (since 17h), 22 in (since 17h)
>>
>> task status:
>>
>> data:
>> pools: 4 pools, 1089 pgs
>> objects: 1.09M objects, 4.1 TiB
>> usage: 7.7 TiB used, 99 TiB / 106 TiB avail
>> pgs: 1089 active+clean
>>
>> ---------------------------------------------------------------------------------------------------------------------
>>
>>
>> ceph osd tree
>>
>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>> -1 106.43005 root default
>> -13 14.55478 host pve-3101
>> 10 hdd 7.27739 osd.10 up 1.00000 1.00000
>> 11 hdd 7.27739 osd.11 up 1.00000 1.00000
>> -11 14.55478 host pve-3103
>> 8 hdd 7.27739 osd.8 up 1.00000 1.00000
>> 9 hdd 7.27739 osd.9 up 1.00000 1.00000
>> -3 14.55478 host pve-3105
>> 0 hdd 7.27739 osd.0 up 1.00000 1.00000
>> 1 hdd 7.27739 osd.1 up 1.00000 1.00000
>> -5 14.55478 host pve-3107
>> 2 hdd 7.27739 osd.2 up 1.00000 1.00000
>> 3 hdd 7.27739 osd.3 up 1.00000 1.00000
>> -9 14.55478 host pve-3108
>> 6 hdd 7.27739 osd.6 up 1.00000 1.00000
>> 7 hdd 7.27739 osd.7 up 1.00000 1.00000
>> -7 14.55478 host pve-3109
>> 4 hdd 7.27739 osd.4 up 1.00000 1.00000
>> 5 hdd 7.27739 osd.5 up 1.00000 1.00000
>> -15 19.10138 host pve-3111
>> 12 hdd 10.91409 osd.12 up 1.00000 1.00000
>> 13 hdd 0.90970 osd.13 up 1.00000 1.00000
>> 14 hdd 0.90970 osd.14 up 1.00000 1.00000
>> 15 hdd 0.90970 osd.15 up 1.00000 1.00000
>> 16 hdd 0.90970 osd.16 up 1.00000 1.00000
>> 17 hdd 0.90970 osd.17 up 1.00000 1.00000
>> 18 hdd 0.90970 osd.18 up 1.00000 1.00000
>> 19 hdd 0.90970 osd.19 up 1.00000 1.00000
>> 20 hdd 0.90970 osd.20 up 1.00000 1.00000
>> 21 hdd 0.90970 osd.21 up 1.00000 1.00000
>>
>> ---------------------------------------------------------------------------------------------------------------
>>
>>
>> POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
>> vm.pool 2 1024 3.0 TiB 863.31k 6.0 TiB 6.38 44 TiB (this pool holds all the VM disks)
>>
>> ---------------------------------------------------------------------------------------------------------------
>>
>>
>> ceph osd map vm.pool vm.pool.object
>> osdmap e14319 pool 'vm.pool' (2) object 'vm.pool.object' -> pg 2.196f68d5 (2.d5) -> up ([2,4], p2)
>> acting ([2,4], p2)
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> pveversion -v
>> proxmox-ve: 6.4-1 (running kernel: 5.4.143-1-pve)
>> pve-manager: 6.4-13 (running version: 6.4-13/9f411e79)
>> pve-kernel-helper: 6.4-8
>> pve-kernel-5.4: 6.4-7
>> pve-kernel-5.4.143-1-pve: 5.4.143-1
>> pve-kernel-5.4.106-1-pve: 5.4.106-1
>> ceph: 15.2.15-pve1~bpo10
>> ceph-fuse: 15.2.15-pve1~bpo10
>> corosync: 3.1.2-pve1
>> criu: 3.11-3
>> glusterfs-client: 5.5-3
>> ifupdown: residual config
>> ifupdown2: 3.0.0-1+pve4~bpo10
>> ksm-control-daemon: 1.3-1
>> libjs-extjs: 6.0.1-10
>> libknet1: 1.22-pve1~bpo10+1
>> libproxmox-acme-perl: 1.1.0
>> libproxmox-backup-qemu0: 1.1.0-1
>> libpve-access-control: 6.4-3
>> libpve-apiclient-perl: 3.1-3
>> libpve-common-perl: 6.4-4
>> libpve-guest-common-perl: 3.1-5
>> libpve-http-server-perl: 3.2-3
>> libpve-storage-perl: 6.4-1
>> libqb0: 1.0.5-1
>> libspice-server1: 0.14.2-4~pve6+1
>> lvm2: 2.03.02-pve4
>> lxc-pve: 4.0.6-2
>> lxcfs: 4.0.6-pve1
>> novnc-pve: 1.1.0-1
>> proxmox-backup-client: 1.1.13-2
>> proxmox-mini-journalreader: 1.1-1
>> proxmox-widget-toolkit: 2.6-1
>> pve-cluster: 6.4-1
>> pve-container: 3.3-6
>> pve-docs: 6.4-2
>> pve-edk2-firmware: 2.20200531-1
>> pve-firewall: 4.1-4
>> pve-firmware: 3.3-2
>> pve-ha-manager: 3.1-1
>> pve-i18n: 2.3-1
>> pve-qemu-kvm: 5.2.0-6
>> pve-xtermjs: 4.7.0-3
>> qemu-server: 6.4-2
>> smartmontools: 7.2-pve2
>> spiceterm: 3.1-1
>> vncterm: 1.6-2
>> zfsutils-linux: 2.0.6-pve1~bpo10+1
>>
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>> And now my problem:
>>
>> For all VMs I use one pool for the VM disks.
>>
>> When node/host pve-3111 is shut down, the VMs on many of the other nodes/hosts (pve-3107, pve-3105)
>> do not shut down, but they become unreachable on the network.
>>
>> After the node/host comes back up, Ceph returns to HEALTH_OK and all the VMs become reachable on the
>> network again (without a reboot).
>>
>> Can someone suggest what I should check in Ceph?
>>
>> Thanks.
>>
>
--
-------------------------
Best regards,
Сергей Цаболов,
System Administrator
T8 LLC
Tel.: +74992716161,
Mobile: +79850334875
tsabolov at t8.ru
T8 LLC, 107076, Moscow, Krasnobogatyrskaya st. 44, bldg. 1
www.t8.ru