[pbs-devel] [PATCH proxmox-backup v6 29/37] datastore: add local datastore cache for network attached storages
Christian Ebner
c.ebner at proxmox.com
Thu Jul 10 12:30:03 CEST 2025
On 7/10/25 12:05, Thomas Lamprecht wrote:
> On 08.07.25 at 19:01, Christian Ebner wrote:
>> Use a local datastore as cache, with an LRU cache replacement policy,
>> for operations on a datastore backed by a network, e.g. by an S3
>> object store backend. The goal is to reduce the number of requests to
>> the backend and thereby save costs (monetary as well as time).
>>
>> The cache fetches missing items on cache misses via the access
>> method.
>
> FWIW, this might also be used for local datastores backed by slower
> storage as an application-aware, and thus more efficient, caching mechanism.
> E.g., setups with huge but slow storage for the actual backup pools could
> set up the cache on smaller but faster (e.g. flash-based) storage.
>
> But nothing we need to incorporate as part of this series, just an idea
> to potentially investigate after this has landed.
Yes, indeed! It could make sense to provide this as an application-level
cache for slower storage in general.
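
For readers following along, the fetch-on-miss behaviour described in the commit message can be sketched roughly as below. This is only a minimal illustration and not the code from the patch: `LruCache`, `Fetcher` and `CountingBackend` are hypothetical names, the recency list is a simple `VecDeque` with O(n) updates, and the backend is a stand-in for a slow store such as an S3 bucket.

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

/// Hypothetical trait: invoked on a cache miss to load a value from the
/// slow backend (e.g. an object store).
pub trait Fetcher<K, V> {
    fn fetch(&mut self, key: &K) -> Result<Option<V>, String>;
}

/// Toy LRU cache; `order` holds keys from least to most recently used.
pub struct LruCache<K, V> {
    capacity: usize,
    map: HashMap<K, V>,
    order: VecDeque<K>,
}

impl<K: Clone + Eq + Hash, V> LruCache<K, V> {
    /// A capacity of zero is bumped to one so the cache stays bounded.
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity: capacity.max(1),
            map: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    pub fn len(&self) -> usize {
        self.map.len()
    }

    pub fn contains(&self, key: &K) -> bool {
        self.map.contains_key(key)
    }

    /// Mark `key` as most recently used (linear scan, fine for a sketch).
    fn touch(&mut self, key: &K) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            self.order.remove(pos);
        }
        self.order.push_back(key.clone());
    }

    /// Return the cached value; on a miss, ask the fetcher and cache the
    /// result, evicting the least recently used entry if the cache is full.
    pub fn access<F: Fetcher<K, V>>(
        &mut self,
        key: K,
        fetcher: &mut F,
    ) -> Result<Option<&V>, String> {
        if !self.map.contains_key(&key) {
            match fetcher.fetch(&key)? {
                Some(value) => {
                    if self.map.len() >= self.capacity {
                        if let Some(oldest) = self.order.pop_front() {
                            self.map.remove(&oldest);
                        }
                    }
                    self.map.insert(key.clone(), value);
                }
                // The backend does not have the item either.
                None => return Ok(None),
            }
        }
        self.touch(&key);
        Ok(self.map.get(&key))
    }
}

/// Hypothetical backend that counts how often it is actually queried.
#[derive(Default)]
pub struct CountingBackend {
    pub fetches: usize,
}

impl Fetcher<u32, u32> for CountingBackend {
    fn fetch(&mut self, key: &u32) -> Result<Option<u32>, String> {
        self.fetches += 1;
        Ok(Some(*key * 10))
    }
}
```

With a capacity of two, accessing keys 1, 2, 1, 3 in that order queries the backend three times (the second access to 1 is a hit) and evicts key 2, since it is the least recently used entry at that point.
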
> Some before/after benchmarks and a few words about the chosen cache size
> might be nice here too, again just to get a rough ballpark estimate of what
> order of magnitude of change one might get from this.
Okay, will incorporate that as well in the next version of the patches.
> Also, just from reading the commit message it's not entirely clear to me
> on which medium/path the data will actually be cached. While that is
> pretty much given by the current core design of saving the indexes to
> the local storage anyway, it still might be nice to mention that
> explicitly here.
Okay, will include that as well.
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index c1ba2dcea..ef146e84a 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>
>> @@ -437,6 +471,16 @@ impl DataStore {
>> .parse_property_string(config.backend.as_deref().unwrap_or(""))?,
>> )?;
>>
>> + const LOCAL_DATASTORE_CACHE_SIZE: usize = 10_000_000;
>
> This is the number of chunks, right?
Yes. This is a bit of an oversight on my part: the initial idea was to
make this configurable, but I opted for a constant for the time being
and then never got around to actually making it configurable.
Deriving this from the available storage space, minus some headroom for
the datastore contents metadata, on cache instantiation might make even
more sense. I will look into that, as it seems the better option here.
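
As a rough sketch of what such a derivation could look like (not what the next version will necessarily do): the function below is hypothetical, takes the free space as an input rather than querying the filesystem, and assumes a per-chunk size of 4 MiB purely as a sizing figure.

```rust
/// Assumed on-disk size budget per cached chunk. This is a sizing
/// assumption for the sketch, not a value taken from the patch.
pub const ASSUMED_CHUNK_SIZE: u64 = 4 * 1024 * 1024;

/// Hypothetical helper: derive the cache capacity (in number of chunks)
/// from the free space of the local store.
///
/// * `available_bytes`: free space on the filesystem backing the cache,
///   as obtained by the caller.
/// * `metadata_headroom_bytes`: space reserved for index files and other
///   datastore metadata.
/// * `configured_limit`: optional admin-provided upper bound.
pub fn derive_cache_capacity(
    available_bytes: u64,
    metadata_headroom_bytes: u64,
    chunk_size: u64,
    configured_limit: Option<usize>,
) -> usize {
    if chunk_size == 0 {
        return 0;
    }
    // Never go below zero if the headroom exceeds the free space.
    let usable = available_bytes.saturating_sub(metadata_headroom_bytes);
    // Clamp before converting so the cast cannot truncate on 32-bit targets.
    let derived = (usable / chunk_size).min(usize::MAX as u64) as usize;
    match configured_limit {
        Some(limit) => derived.min(limit),
        None => derived,
    }
}
```

With 100 GiB free and 10 GiB of headroom, this yields 23040 chunks at the assumed chunk size, and an explicit limit lower than that takes precedence.
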
> The cache size could be derived from, or at least limited by, the available
> free space of the local datastore's backing storage. For a cache that is
> backed by storage, being able to configure the limits will probably be good
> to have as a tuning knob for admins (providing a Somewhat OK™ default value
> is still great to avoid making tuning a necessity in common setups).
>
>
>> + let lru_store_caching = if DatastoreBackendType::S3 == backend_config.ty.unwrap_or_default()
>> + {
>> + let cache =
>> + LocalDatastoreLruCache::new(LOCAL_DATASTORE_CACHE_SIZE, chunk_store.clone());
>> + Some(cache)
>> + } else {
>> + None
>> + };
>> +
>> Ok(DataStoreImpl {
>> chunk_store,
>> gc_mutex: Mutex::new(()),