[pbs-devel] [PATCH proxmox-backup v2] datastore: remove datastore from internal cache based on maintenance mode

Thomas Lamprecht t.lamprecht at proxmox.com
Mon Mar 4 11:42:28 CET 2024


On 01/03/2024 at 16:03, Hannes Laimer wrote:
> We keep a DataStore cache, so ChunkStore's and lock files are kept by
> the proxy process and don't have to be reopened every time. However, for
> specific maintenance modes, e.g. 'offline', our process should not keep
> files in that datastore open. This clears the cache entry of a datastore
> if it is in a specific maintenance mode and the last task finished, which
> also drops any files still open by the process.

One always asks oneself whether command sockets are the right approach, but
for this it seems alright.

Some code style comments inline.

> Signed-off-by: Hannes Laimer <h.laimer at proxmox.com>
> Tested-by: Gabriel Goller <g.goller at proxmox.com>
> Reviewed-by: Gabriel Goller <g.goller at proxmox.com>
> ---
> 
> v2, thanks @Gabriel:
>  - improve comments
>  - remove not needed &'s and .clone()'s
> 
>  pbs-api-types/src/maintenance.rs   |  6 +++++
>  pbs-datastore/src/datastore.rs     | 41 ++++++++++++++++++++++++++++--
>  pbs-datastore/src/task_tracking.rs | 23 ++++++++++-------
>  src/api2/config/datastore.rs       | 18 +++++++++++++
>  src/bin/proxmox-backup-proxy.rs    |  8 ++++++
>  5 files changed, 85 insertions(+), 11 deletions(-)
> 
> diff --git a/pbs-api-types/src/maintenance.rs b/pbs-api-types/src/maintenance.rs
> index 1b03ca94..a1564031 100644
> --- a/pbs-api-types/src/maintenance.rs
> +++ b/pbs-api-types/src/maintenance.rs
> @@ -77,6 +77,12 @@ pub struct MaintenanceMode {
>  }
>  
>  impl MaintenanceMode {
> +    /// Used for deciding whether the datastore is cleared from the internal cache after the last
> +    /// task finishes, so all open files are closed.
> +    pub fn clear_from_cache(&self) -> bool {

That function name makes it sound like calling it actively clears the cache,
but it only checks whether a required condition for clearing is met.

So maybe use a name that conveys that better, and maybe even avoid coupling
it to an action that one of its callers executes, as this check might be
useful for other call sites too.

Off the top of my head, one could use `is_offline` as the name. Adding a note
to the doc-comment that this is, e.g., used to check if a datastore can be
removed from the cache would still be fine though.
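
Something along these lines, as a standalone sketch rather than a drop-in
replacement: the types below are simplified stand-ins for the ones in
pbs-api-types/src/maintenance.rs (only the `ty` field and the two variants
visible in this patch are modeled), and the doc-comment wording is just a
suggestion.

```rust
/// Simplified stand-in for the real enum; other variants are omitted here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MaintenanceType {
    Offline,
    Delete,
}

/// Simplified stand-in for the real struct; only `ty` is modeled.
pub struct MaintenanceMode {
    pub ty: MaintenanceType,
}

impl MaintenanceMode {
    /// Returns `true` if the datastore is in 'offline' maintenance mode.
    ///
    /// Used, for example, to decide whether a datastore can be removed from
    /// the internal cache once its last task has finished, so that all open
    /// files are closed.
    pub fn is_offline(&self) -> bool {
        self.ty == MaintenanceType::Offline
    }
}
```

The name then describes the state being queried, and the cache removal is
only mentioned as one possible consumer of that state.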

> +        self.ty == MaintenanceType::Offline
> +    }
> +
>      pub fn check(&self, operation: Option<Operation>) -> Result<(), Error> {
>          if self.ty == MaintenanceType::Delete {
>              bail!("datastore is being deleted");
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 2f0e5279..f26dff83 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -104,8 +104,27 @@ impl Clone for DataStore {
>  impl Drop for DataStore {
>      fn drop(&mut self) {
>          if let Some(operation) = self.operation {
> -            if let Err(e) = update_active_operations(self.name(), operation, -1) {
> -                log::error!("could not update active operations - {}", e);
> +            let mut last_task = false;
> +            match update_active_operations(self.name(), operation, -1) {
> +                Err(e) => log::error!("could not update active operations - {}", e),
> +                Ok(updated_operations) => {
> +                    last_task = updated_operations.read + updated_operations.write == 0;
> +                }
> +            }
> +
> +            // remove datastore from cache iff 
> +            //  - last task finished, and
> +            //  - datastore is in a maintenance mode that mandates it
> +            let remove_from_cache = last_task
> +                && pbs_config::datastore::config()
> +                    .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", self.name()))
> +                    .map_or(false, |c| {
> +                        c.get_maintenance_mode()
> +                            .map_or(false, |m| m.clear_from_cache())
> +                    });
> +
> +            if remove_from_cache {
> +                DATASTORE_MAP.lock().unwrap().remove(self.name());
>              }
>          }
>      }
> @@ -193,6 +212,24 @@ impl DataStore {
>          Ok(())
>      }
>  
> +    /// trigger clearing cache entries based on maintenance mode. Entries will only
> +    /// be cleared iff there is no other task running, if there is, the end of the
> +    /// last running task will trigger the clearing of the cache entry.
> +    pub fn update_datastore_cache() -> Result<(), Error> {

Why does this operate on all datastores rather than a single one? After all,
we always want to remove a specific one.

> +        let (config, _digest) = pbs_config::datastore::config()?;
> +        for (store, (_, _)) in &config.sections {
> +            let datastore: DataStoreConfig = config.lookup("datastore", store)?;
> +            if datastore
> +                .get_maintenance_mode()
> +                .map_or(false, |m| m.clear_from_cache())
> +            {
> +                let _ = DataStore::lookup_datastore(store, Some(Operation::Lookup));

A comment noting that the actual removal from the cache happens through the
drop handler would be good, as this is a bit too subtle for my taste. If
someone stumbles over this a few months down the line, it might cause some
easily avoidable head scratching...

Alternatively, factor the actual check-maintenance-mode-and-remove-from-cache
logic out of the drop handler and call that explicitly here; the only outside
information you need there is the name anyway.
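
A rough, self-contained sketch of that alternative, under loud assumptions:
`Cache` stands in for DATASTORE_MAP and is passed explicitly instead of being
a global, `StoreState` stands in for the config lookup plus the
active-operations count, and `remove_from_cache_if_offline` is a hypothetical
name. None of these exist in the patch as posted.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Stand-in for DATASTORE_MAP; the real map is a global holding datastore
/// handles, here it only stores a placeholder value per datastore name.
pub type Cache = Mutex<HashMap<String, ()>>;

/// Stand-in for the maintenance-mode lookup and the active-operations count.
pub struct StoreState {
    pub offline: bool,
    pub active_tasks: usize,
}

/// Hypothetical helper: removes the entry for `name` if the datastore is
/// offline and no task is running. Returns whether an entry was removed.
pub fn remove_from_cache_if_offline(cache: &Cache, name: &str, state: &StoreState) -> bool {
    if state.offline && state.active_tasks == 0 {
        cache.lock().unwrap().remove(name).is_some()
    } else {
        false
    }
}

/// Minimal stand-in for a datastore handle.
pub struct DataStore<'a> {
    pub name: String,
    pub cache: &'a Cache,
    pub state: StoreState,
}

impl Drop for DataStore<'_> {
    fn drop(&mut self) {
        // The real handler would first decrement the operation counter;
        // this sketch only shows the shared removal check being reused.
        remove_from_cache_if_offline(self.cache, &self.name, &self.state);
    }
}
```

The drop handler and the code reacting to a maintenance-mode change would
then call the same helper for one specific datastore, and no dummy lookup
would be needed just to trigger the drop.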



