[pbs-devel] [PATCH proxmox-backup v2 6/6] api: admin: datastore: implement streaming content api call

Thu Oct 9 08:36:35 CEST 2025

On 10/8/25 9:49 PM, Thomas Lamprecht wrote:
> Am 08.10.25 um 15:43 schrieb Dominik Csapak:
>> this is a new api call that utilizes `proxmox_router::Stream` to provide
>> a streaming interface to querying the datastore content.
>>
>> This can be done when a client requests this api call with the
>> `application/json-seq` Accept header.
>>
>> In contrast to the existing api calls, this one
>> * returns all types of content items (namespaces, groups, snapshots; can
>>    be filtered with a parameter)
>> * iterates over them recursively (with the range that is given with the
>>    parameter)
>>
>> The api call returns the data in the following order:
>> * first all visible namespaces
>> * then for each ns in order
>>    * each group
>>    * each snapshot
>>
>> This is done so that we can have a good way of building a tree view in
>> the ui.
> 
> I guess you did not get around to test some more performance / memory
> usage here? Might be nice to have whatever stats you did compare encoded
> in the commit message here.

no, not yet, but I wanted to do some testing today. I'll send an update
when I have more info.

> 
> I.e. that part of you and my text from patch 6/6 from the v1:
> 
> Am 03.10.25 um 13:55 schrieb Thomas Lamprecht:
>> Am 03.10.25 um 10:51 schrieb Dominik Csapak:
>>> interesting side node, in my rather large setup with ~600 groups and ~1000
>>> snapshosts per group, streaming this is faster than using the current
>>> `snapshot` api (by a lot):
>>> * `snapshot` api -> ~3 min
>>> * `content` api with streaming -> ~2:11 min
>>> * `content` api without streaming -> ~3 min
>>>
>>> It seems that either collecting such a 'large' api response (~200MiB)
>>> is expensive. My guesses what happens here are either:
>>> * frequent (re)allocation of the resulting vec
>>> * or serde's serializing code
>>
>> You could compare peak (RSS) memory usage of the daemon as side-effect,
>> and/or also use bpftrace to log bigger allocations. While I did use bpftrace
>> lots of times, I did not try this specifically to rust, but I found a
>> shorth'ish article that describes doing just that for rust, and looks like
>> it would not be _that_ much work (and could be a nice tool to have in the
>> belt in the future):
>>
>> https://readyset.io/blog/tracing-large-memory-allocations-in-rust-with-bpftrace