[pbs-devel] [PATCH v5 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates
Christian Ebner
c.ebner at proxmox.com
Wed Apr 2 21:50:57 CEST 2025
On 4/2/25 17:57, Thomas Lamprecht wrote:
> On 26.03.25 at 11:03, Christian Ebner wrote:
>> Basic benchmarking:
>>
>> Number of utimensat calls shows a significant reduction:
>> unpatched: 31591944
>> patched: 1495136
>>
>> Total GC runtime shows a significant reduction (average of 3 runs):
>> unpatched: 155.4 ± 3.5 s
>> patched: 22.8 ± 0.5 s
>
> Thanks a lot for providing these numbers, and what a nice runtime
> improvement!
>
>>
>> VmPeak measured via /proc/self/status before and after
>> `mark_used_chunks` (proxmox-backup-proxy was restarted in between
>> for normalization, average of 3 runs):
>> unpatched before: 1196028 ± 0 kB
>> unpatched after: 1196028 ± 0 kB
>>
>> patched before: 1163337 ± 28317 kB
>> patched after: 1330906 ± 29280 kB
>> delta: 167569 kB
>
> VmPeak is virtual memory though, not something like resident set size,
> or better proportional set size – but yeah that's harder to get.
> Simplest way might be polling something like `ps -o pid,rss,pss -u backup`
> in a shell alongside the GC run a few times per second, e.g.:
>
> while :; do printf '%s ' "$(date '+%T.%3N')"; ps -o pid,rss,pss -u backup --no-headers; sleep 0.5; done | tee gc-stats
>
> And then get the highest PSS values via:
>
> sort -nk4,4 gc-stats | tail
>
> I do not think this needs to be redone, or that a new revision needs to be
> sent because of it. But it might be nice to do a quick test just for a rough
> comparison to the VmPeak delta.
Ah, thanks for the explanation and suggestion, I was already a bit unsure
whether VmPeak is informative enough. I will re-check this with the
suggested metrics.
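
For context on where the utimensat reduction comes from: the idea is to only
update a chunk's atime the first time its digest is seen during the GC mark
phase, gated by a bounded cache of recently touched digests. Below is a
simplified, self-contained sketch, illustrative only and not the actual patch
code; the real implementation uses an LRU cache, here a plain FIFO-evicted
set stands in for it:

use std::collections::{HashSet, VecDeque};

// Illustrative only: a bounded "recently touched" set of chunk digests.
// The actual patch uses an LRU cache; this FIFO-evicted set just shows
// the gating idea.
struct RecentlyTouched {
    capacity: usize,
    set: HashSet<[u8; 32]>,
    order: VecDeque<[u8; 32]>,
}

impl RecentlyTouched {
    fn new(capacity: usize) -> Self {
        Self { capacity, set: HashSet::new(), order: VecDeque::new() }
    }

    // Returns true if the digest was not in the cache, i.e. the chunk's
    // atime still needs to be updated in this GC run.
    fn insert(&mut self, digest: [u8; 32]) -> bool {
        if self.set.contains(&digest) {
            return false; // already touched recently, skip the utimensat call
        }
        if self.order.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.set.remove(&oldest);
            }
        }
        self.order.push_back(digest);
        self.set.insert(digest);
        true
    }
}

fn main() {
    let mut cache = RecentlyTouched::new(1024 * 1024);
    let digest = [0u8; 32]; // stand-in for a chunk's SHA-256 digest
    assert!(cache.insert(digest));  // first reference: update atime
    assert!(!cache.insert(digest)); // repeated reference: skipped
}

That gating is what brings the utimensat count down from ~31.6 million to
~1.5 million in the numbers quoted above, since most chunks are referenced
by many indices.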
>
>>
>> Dependence on the cache capacity:
>> capacity        runtime[s]  VmPeakDiff[kB]
>> 1*1024          66.221      0
>> 10*1024         36.164      0
>> 100*1024        23.141      0
>> 1024*1024       22.188      101060
>
> Hmm, seems like we could lower the cache size to something like 128*1024
> or 256*1024 and already get most of the benefit for this workload.
>
> What do you think about applying this as is and, after doing a quick RSS
> and/or PSS benchmark, deciding whether it's worth starting out a bit smaller,
> as a 167 MiB delta is a bit much for my taste if a quarter of that is enough
> to get most of the benefit. If the actual memory used (not just virtual
> memory mappings) is rather closer to the cache size without overhead
> (32 MiB), I'd be fine with keeping this as is.
Okay, yes, I will ask Stoiko for access to the PBS instance once more, so I
can test against a similar datastore and gather some more data on this.
>
> tuning option in MiB (i.e. 32 MiB / 32 B == 1024*1024 capacity) where the
> admin can better control this themselves.
This I do not fully understand; I assume the sentence got cut off? But I
can send a patch to expose this as a datastore tuning option as well.
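
For such a tuning option, a rough sketch of how a value in MiB could map onto
the cache capacity. The option name, plumbing and default are made up for
illustration; the 32 B per entry is the SHA-256 digest size, not counting
per-entry cache overhead, and the assumed default of 1024*1024 entries
matches the 32 MiB figure above:

// Hypothetical mapping from a datastore tuning value in MiB to the
// chunk-digest cache capacity; name and default are illustrative only.
const DIGEST_SIZE: usize = 32; // SHA-256 digest, 32 bytes per entry

fn gc_cache_capacity(tuning_mib: Option<usize>) -> usize {
    // default of 32 MiB corresponds to 1024*1024 entries
    tuning_mib.unwrap_or(32) * 1024 * 1024 / DIGEST_SIZE
}

fn main() {
    assert_eq!(gc_cache_capacity(None), 1024 * 1024);   // 32 MiB default
    assert_eq!(gc_cache_capacity(Some(8)), 256 * 1024); // 8 MiB
    assert_eq!(gc_cache_capacity(Some(4)), 128 * 1024); // 4 MiB, i.e. 128*1024
}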
>
>> 10*1024*1024    23.178      689660
>> 100*1024*1024   25.135      5507292
>