[pbs-devel] [PATCH proxmox-backup v12 06/26] datastore: add helper for checking if a removable datastore is available

Fabian Grünbichler f.gruenbichler at proxmox.com
Wed Oct 30 10:45:48 CET 2024


Quoting Hannes Laimer (2024-10-29 15:04:25)
> On Mon Oct 14, 2024 at 3:42 PM CEST, Fabian Grünbichler wrote:
> > On September 4, 2024 4:11 pm, Hannes Laimer wrote:
> > > Co-authored-by: Wolfgang Bumiller <w.bumiller at proxmox.com>
> > > Signed-off-by: Hannes Laimer <h.laimer at proxmox.com>
> > > ---
> > >  pbs-api-types/src/maintenance.rs |  2 ++
> > >  pbs-datastore/src/datastore.rs   | 58 ++++++++++++++++++++++++++++++++
> > >  pbs-datastore/src/lib.rs         |  2 +-
> > >  src/bin/proxmox-backup-proxy.rs  |  5 ++-
> > >  4 files changed, 65 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/pbs-api-types/src/maintenance.rs b/pbs-api-types/src/maintenance.rs
> > > index fd4d3416..9f51292e 100644
> > > --- a/pbs-api-types/src/maintenance.rs
> > > +++ b/pbs-api-types/src/maintenance.rs
> > > @@ -82,6 +82,8 @@ impl MaintenanceMode {
> > >      /// task finishes, so all open files are closed.
> > >      pub fn is_offline(&self) -> bool {
> > >          self.ty == MaintenanceType::Offline
> > > +            || self.ty == MaintenanceType::Unmount
> > > +            || self.ty == MaintenanceType::Delete
> > >      }
> > >  
> > >      pub fn check(&self, operation: Option<Operation>) -> Result<(), Error> {
> > > diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> > > index fb37bd5a..29f98b37 100644
> > > --- a/pbs-datastore/src/datastore.rs
> > > +++ b/pbs-datastore/src/datastore.rs
> > > @@ -1,5 +1,6 @@
> > >  use std::collections::{HashMap, HashSet};
> > >  use std::io::{self, Write};
> > > +use std::os::unix::ffi::OsStrExt;
> > >  use std::os::unix::io::AsRawFd;
> > >  use std::path::{Path, PathBuf};
> > >  use std::sync::{Arc, LazyLock, Mutex};
> > > @@ -14,6 +15,7 @@ use proxmox_schema::ApiType;
> > >  use proxmox_sys::error::SysError;
> > >  use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
> > >  use proxmox_sys::fs::{lock_dir_noblock, DirLockGuard};
> > > +use proxmox_sys::linux::procfs::MountInfo;
> > >  use proxmox_sys::process_locker::ProcessLockSharedGuard;
> > >  use proxmox_worker_task::WorkerTaskContext;
> > >  
> > > @@ -46,6 +48,52 @@ pub fn check_backup_owner(owner: &Authid, auth_id: &Authid) -> Result<(), Error>
> > >      Ok(())
> > >  }
> > >  
> > > +/// check if a removable datastore is currently available/mounted by
> > > +/// comparing the `st_rdev` values of `/dev/disk/by-uuid/<uuid>` and the source device in
> > > +/// /proc/self/mountinfo
> >
> > check if a *removable* datastore is ..
> >
> 
> The idea was to be able to work with normal and removable datastores
> the same way throughout the codebase; not having this would mean
> having to distinguish between them wherever datastores are used.

which you already do in parts, but
 
> It might make sense to add a check for the existence of `.chunks/`, so
> this wouldn't be a no-op for normal datastores.

I don't mind this helper working for both, I just care about its intended use
case and semantics being clear :)

> > > +pub fn is_datastore_available(config: &DataStoreConfig) -> bool {
> > > +    use nix::sys::stat::SFlag;
> > > +
> > > +    let uuid = match config.backing_device.as_deref() {
> > > +        Some(dev) => dev,
> > > +        None => return true,
> >
> > returns true if not a removable datastore?
> >
> > > +    };
> > > +
> > > +    let Some(store_mount_point) = config.get_mount_point() else {
> > > +        return true;
> >
> > same here. also see further below - this should either be removable
> > datastore specific, in which case it could take uuid and mountpoint as
> > parameters, or it could take any DataStoreConfig, then the doc comment
> > should reflect that and clearly describe the semantics..
> >
> 
> yes, I'll update the docs
> 
> > > +    };
> > > +    let store_mount_point = Path::new(&store_mount_point);
> > > +
> > > +    let dev_node = match nix::sys::stat::stat(format!("/dev/disk/by-uuid/{uuid}").as_str()) {
> > > +        Ok(stat) if SFlag::from_bits_truncate(stat.st_mode) == SFlag::S_IFBLK => stat.st_rdev,
> > > +        _ => return false,
> >
> > shouldn't this differentiate between:
> > - stat failed with ENOENT
> > - stat failed for other reasons
> > - stat worked but result is not as expected
> >
> > ?
> >
> 
> I guess it comes down to whether we want to differentiate between "it
> is not there" and "we can't check if it is there". Since this is
> mostly done in the background and quite frequently, I opted for
> treating the two cases the same.

I don't really see how the frequency of calls makes a difference for whether
this should differentiate between an error and an expected state.. it can be
okay to map errors to "not available" if we don't ever want to differentiate,
but then this should be clearly communicated ;)
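
to make that distinction expressible, something like the following could work.
this is a std-only sketch, not the actual patch code; the enum name and
variants are made up for illustration, and it only classifies the stat outcome
(the real check would additionally verify the node is a block device):

```rust
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical result type: separates "the device node is not there"
/// (expected for an unplugged removable datastore) from "we could not
/// check" (an actual error worth reporting).
#[derive(Debug, PartialEq)]
enum Availability {
    Available,
    NotPresent,
    CheckFailed(ErrorKind),
}

fn check_device_node(path: &Path) -> Availability {
    match std::fs::metadata(path) {
        Ok(_) => Availability::Available,
        // stat failed with ENOENT: the device is simply not plugged in
        Err(e) if e.kind() == ErrorKind::NotFound => Availability::NotPresent,
        // stat failed for some other reason: surface it instead of hiding it
        Err(e) => Availability::CheckFailed(e.kind()),
    }
}
```

callers that never want to differentiate can still collapse both non-available
variants to `false`, but the distinction is at least visible at the type level.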

> 
> > also, that code (continued below at (A))
> >
> > > +    };
> > > +
> > > +    let Ok(mount_info) = MountInfo::read() else {
> > > +        return false;
> >
> > shouldn't this be an error?
> >
> 
> ... same here
> 
> > > +    };
> > > +
> > > +    for (_, entry) in mount_info {
> > > +        let Some(source) = entry.mount_source else {
> > > +            continue;
> > > +        };
> > > +
> > > +        if entry.mount_point != store_mount_point || !source.as_bytes().starts_with(b"/") {
> > > +            continue;
> > > +        }
> > > +
> > > +        if let Ok(stat) = nix::sys::stat::stat(source.as_os_str()) {
> > > +            let sflag = SFlag::from_bits_truncate(stat.st_mode);
> > > +
> > > +            if sflag == SFlag::S_IFBLK && stat.st_rdev == dev_node {
> > > +                return true;
> > > +            }
> >
> > (A) and this code could go into a helper..
> >
> 
> yup, I'll do that
> 
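
for reference, the shared part could be factored out roughly like this. a
std-only sketch with a hypothetical helper name; the real code would keep
using nix::sys::stat, but the shape is the same:

```rust
use std::os::unix::fs::{FileTypeExt, MetadataExt};
use std::path::Path;

/// Stat `path` and return its st_rdev if it exists and is a block
/// device; `None` covers "missing", "stat failed", and "not a block
/// device" alike (hypothetical helper name).
fn block_device_rdev(path: &Path) -> Option<u64> {
    let meta = std::fs::metadata(path).ok()?;
    if meta.file_type().is_block_device() {
        Some(meta.rdev())
    } else {
        None
    }
}
```

with that, the by-uuid lookup and the mountinfo loop both reduce to comparing
`block_device_rdev(...)` results for equality.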
> > > +        }
> > > +    }
> > > +
> > > +    false
> > > +}
> > > +
> > >  /// Datastore Management
> > >  ///
> > >  /// A Datastore can store severals backups, and provides the
> > > @@ -155,6 +203,12 @@ impl DataStore {
> > >              }
> > >          }
> > >  
> > > +        if config.backing_device.is_some() && !is_datastore_available(&config) {
> > > +            let mut datastore_cache = DATASTORE_MAP.lock().unwrap();
> > > +            datastore_cache.remove(&config.name);
> > > +            bail!("Removable Datastore is not mounted");
> > > +        }
> >
> > so here the helper is only called for removable datastores..
> >
> 
> the thing we do here is only relevant for removable datastores; if we
> also needed to drop cache entries for non-removable datastores, we
> could drop the is-removable check, yes.

the first thing is_datastore_available does is check whether a backing device
is set, and return true otherwise.. so the extra condition here doesn't really
make sense, it is just checked twice in the case of a removable datastore..

> 
> > > +
> > >          let mut datastore_cache = DATASTORE_MAP.lock().unwrap();
> > >          let entry = datastore_cache.get(name);
> > >  
> > > @@ -258,6 +312,10 @@ impl DataStore {
> > >      ) -> Result<Arc<Self>, Error> {
> > >          let name = config.name.clone();
> > >  
> > > +        if !is_datastore_available(&config) {
> > > +            bail!("Datastore is not available")
> > > +        }
> >
> > but here it is called for all datastores
> >
> 
> ... here, what this check guards is relevant for both normal and
> removable datastores. In my head this made sense, I hope it also does
> outside :)
> 

yes, I just wanted to point out that it doesn't really make sense. a call for a
regular datastore will always return true, so guarding it by whether a backing
device is set doesn't make sense with the current implementation. either all
calls should be made unconditional (since it's a nop for regular datastores
anyway), or all calls should be made conditional and it should only be called
for removable datastores. then all the redundant checks within can be dropped.
a mix of both is just confusing.
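
to illustrate option (a), the call sites could look like this. a minimal
std-only sketch with stand-in types, not the real DataStore code; the stub
body of is_datastore_available only mimics the "always true for regular
datastores" property:

```rust
/// Minimal stand-in for the real DataStoreConfig, just enough to show
/// the call-site shape.
struct DataStoreConfig {
    name: String,
    backing_device: Option<String>,
}

/// Always true for a regular datastore (no backing device), so call
/// sites need no `backing_device.is_some()` guard of their own.
/// Stub: the real check would stat the device node and scan mountinfo;
/// here removable stores simply count as unavailable.
fn is_datastore_available(config: &DataStoreConfig) -> bool {
    config.backing_device.is_none()
}

fn lookup_datastore(config: &DataStoreConfig) -> Result<(), String> {
    // unconditional call: a no-op for regular datastores, meaningful
    // for removable ones, and no redundant is-removable check here
    if !is_datastore_available(config) {
        return Err(format!("datastore '{}' is not mounted", config.name));
    }
    Ok(())
}
```

the alternative (all calls conditional on `backing_device.is_some()`) would
instead let the helper drop its internal "no backing device" early returns.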



