[PVE-User] Spillover issue
Eneko Lacunza
elacunza at binovo.es
Wed Mar 25 08:43:41 CET 2020
Hi Alwin,
On 24/3/20 at 14:54, Alwin Antreich wrote:
> On Tue, Mar 24, 2020 at 01:12:03PM +0100, Eneko Lacunza wrote:
>> Hi Alwin,
>>
>> On 24/3/20 at 12:24, Alwin Antreich wrote:
>>> On Tue, Mar 24, 2020 at 10:34:15AM +0100, Eneko Lacunza wrote:
>>>> We're seeing a spillover issue with Ceph, using 14.2.8:
>> [...]
>>>> 3. ceph health detail
>>>> HEALTH_WARN BlueFS spillover detected on 3 OSD
>>>> BLUEFS_SPILLOVER BlueFS spillover detected on 3 OSD
>>>> osd.3 spilled over 5 MiB metadata from 'db' device (556 MiB used of 6.0 GiB) to slow device
>>>> osd.4 spilled over 5 MiB metadata from 'db' device (552 MiB used of 6.0 GiB) to slow device
>>>> osd.5 spilled over 5 MiB metadata from 'db' device (551 MiB used of 6.0 GiB) to slow device
>>>>
>>>> I may be overlooking something, any idea? I also just found the following Ceph issue:
>>>>
>>>> https://tracker.ceph.com/issues/38745
>>>>
>>>> 5 MiB of metadata on the slow device isn't a big problem, but the cluster is permanently in HEALTH_WARN state... :)
>>> The DB/WAL device is too small and all the new metadata has to be written
>>> to the slow device. This will destroy performance.
>>>
>>> I think the size changes as the DB gets compacted.
>> Yes. But it isn't too small... it's 6 GiB and there's only ~560MiB of data.
> Yes, true. I meant the used size. But the message is odd.
>
> You should find the compaction stats in the OSD log files. It could be,
> as reasoned in the bug tracker, that the compaction needs too much space
> and spills over to the slow device. Additionally, if not set up separately,
> the WAL will take up 512 MB on the DB device.
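For reference, the compaction stats below come straight from the OSD log; something along these lines should pull out the dumps (assuming the default log path on the OSD node, osd.3 just as an example, and the last block being the most recent):

    grep -A 60 'DUMPING STATS' /var/log/ceph/ceph-osd.3.log
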
I don't see any indication that compaction needs too much space:
2020-03-24 14:24:04.883 7f03ffbee700 4 rocksdb: [db/db_impl.cc:777]
------- DUMPING STATS -------
2020-03-24 14:24:04.883 7f03ffbee700 4 rocksdb: [db/db_impl.cc:778]
** DB Stats **
Uptime(secs): 15000.1 total, 600.0 interval
Cumulative writes: 4646 writes, 18K keys, 4646 commit groups, 1.0 writes per commit group, ingest: 0.01 GB, 0.00 MB/s
Cumulative WAL: 4646 writes, 1891 syncs, 2.46 writes per sync, written: 0.01 GB, 0.00 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 163 writes, 637 keys, 163 commit groups, 1.0 writes per commit group, ingest: 0.63 MB, 0.00 MB/s
Interval WAL: 163 writes, 67 syncs, 2.40 writes per sync, written: 0.00 MB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent
** Compaction Stats [default] **
Level  Files  Size       Score  Read(GB)  Rn(GB)  Rnp1(GB)  Write(GB)  Wnew(GB)  Moved(GB)  W-Amp  Rd(MB/s)  Wr(MB/s)  Comp(sec)  CompMergeCPU(sec)  Comp(cnt)  Avg(sec)  KeyIn   KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0     0/0    0.00 KB     0.0       0.0     0.0       0.0        0.0       0.0        0.0    1.0       0.0      33.4       0.02               0.00          2     0.009      0         0
  L1     0/0    0.00 KB     0.0       0.0     0.0       0.0        0.0       0.0        0.0    0.8     162.1     134.6       0.09               0.06          1     0.092   127K       10K
  L2     9/0  538.64 MB     0.2       0.5     0.0       0.5        0.5       0.0        0.0   43.6     102.7     101.2       5.32               1.31          1     5.325  1496K      110K
 Sum     9/0  538.64 MB     0.0       0.5     0.0       0.5        0.5       0.0        0.0  961.1     103.3     101.5       5.43               1.37          4     1.358  1623K      121K
 Int     0/0    0.00 KB     0.0       0.0     0.0       0.0        0.0       0.0        0.0    0.0       0.0       0.0       0.00               0.00          0     0.000      0         0
** Compaction Stats [default] **
Priority  Files  Size     Score  Read(GB)  Rn(GB)  Rnp1(GB)  Write(GB)  Wnew(GB)  Moved(GB)  W-Amp  Rd(MB/s)  Wr(MB/s)  Comp(sec)  CompMergeCPU(sec)  Comp(cnt)  Avg(sec)  KeyIn   KeyDrop
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Low        0/0  0.00 KB    0.0       0.5     0.0       0.5        0.5       0.0        0.0    0.0     103.7     101.7       5.42               1.36          2     2.708  1623K      121K
High        0/0  0.00 KB    0.0       0.0     0.0       0.0        0.0       0.0        0.0    0.0       0.0      43.9       0.01               0.00          1     0.013      0         0
User        0/0  0.00 KB    0.0       0.0     0.0       0.0        0.0       0.0        0.0    0.0       0.0       0.4       0.00               0.00          1     0.004      0         0
Uptime(secs): 15000.1 total, 600.0 interval
Flush(GB): cumulative 0.001, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.54 GB write, 0.04 MB/s write, 0.55 GB read, 0.04 MB/s read, 5.4 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
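So compaction itself doesn't seem to need much space. Just in case, I might also trigger a manual compaction on one of the affected OSDs and check whether the spilled metadata moves back to the DB device afterwards; if I understand it correctly, something like this should do it (run on the node that hosts the OSD, osd.3 as an example):

    ceph daemon osd.3 compact
    ceph health detail
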
I see the following in a perf dump:
"bluefs": {
"gift_bytes": 0,
"reclaim_bytes": 0,
"db_total_bytes": 6442442752,
"db_used_bytes": 696246272,
"wal_total_bytes": 0,
"wal_used_bytes": 0,
"slow_total_bytes": 40004222976,
"slow_used_bytes": 5242880,
"num_files": 20,
"log_bytes": 41631744,
"log_compactions": 0,
"logged_bytes": 40550400,
"files_written_wal": 2,
"files_written_sst": 41,
"bytes_written_wal": 102040973,
"bytes_written_sst": 2233090674,
"bytes_written_slow": 0,
"max_bytes_wal": 0,
"max_bytes_db": 1153425408,
"max_bytes_slow": 0,
"read_random_count": 127832,
"read_random_bytes": 2761102524,
"read_random_disk_count": 19206,
"read_random_disk_bytes": 2330400597,
"read_random_buffer_count": 108844,
"read_random_buffer_bytes": 430701927,
"read_count": 21457,
"read_bytes": 1087948189,
"read_prefetch_count": 21438,
"read_prefetch_bytes": 1086853927
},
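(That's the "bluefs" section of the perf dump, taken via the admin socket on the OSD node with something like the following; jq is just a convenience and my assumption that it's installed:)

    ceph daemon osd.3 perf dump | jq .bluefs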
> If the above doesn't give any information, then you may need to export
> the BlueFS contents (the RocksDB). Then you can run the kvstore-tool on it.
I'll try this, although I'd say it's some kind of bug.
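If it comes to that, I guess the procedure would be roughly along these lines (OSD stopped first; the output paths are just examples on my side):

    systemctl stop ceph-osd@3
    ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-3 --out-dir /root/osd3-bluefs
    ceph-kvstore-tool rocksdb /root/osd3-bluefs/db stats
    systemctl start ceph-osd@3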
>
>>> The easiest way is to destroy and re-create the OSD with a bigger
>>> DB/WAL. The guideline from Facebook for RocksDB is 3/30/300 GB.
>> It's well below the 3 GiB limit in the guideline ;)
> For now. ;)
The cluster is 2 years old now and the amount of data is quite stable, so I think it will
hold for some time ;)
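In the meantime, since the permanent HEALTH_WARN is the main annoyance, I believe the spillover warning itself can be switched off cluster-wide with something like the following (with the obvious downside that it would also hide a real spillover later):

    ceph config set osd bluestore_warn_on_bluefs_spillover false
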
Thanks a lot
Eneko
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es