[pbs-devel] [PATCH proxmox-backup] fix #5967: datastore lookup: check for disappeared datastores
Christian Ebner
c.ebner at proxmox.com
Fri Dec 6 15:03:35 CET 2024
Had a look at this, one comment inline.
On 12/5/24 10:12, Fabian Grünbichler wrote:
> some code paths will only read the datastore base - if the datastore
> disappeared (e.g., a lazy unmount, or some network storage disappearing without
> blocking) then we should at least detect the issue at lookup time before
> re-using an already opened datastore.
>
> any code path that tried to access the chunks themselves would already have
> failed without this additional safeguard, but in particular queries listing
> namespaces or groups in the root namespace don't, which might cause a remote
> client such as pull sync to mistakenly think that the datastore is empty.
>
> limit the additional check to `Read` or `Write` lookups, since any code path
> doing those will most likely want to do I/O on the datastore afterwards, as
> opposed to a `Lookup` lookup or one without an operation, which might not
> require the datastore to be fully functional.
>
> Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
> ---
> there is of course still a gap beween the lookup and the readdir after it.. we
> could track down all API endpoints that access the datastore but don't touch
> the chunk dir, and before returning, check a second time that the datastore
> hasn't vanished, but I am not sure that is worth the effort?
>
> pbs-datastore/src/chunk_store.rs | 2 +-
> pbs-datastore/src/datastore.rs | 6 ++++++
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 29d5874a1..189c4ebcd 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -158,7 +158,7 @@ impl ChunkStore {
>
> /// Check if the chunkstore path is absolute and that we can
> /// access it. Returns the absolute '.chunks' path on success.
> - fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
> + pub(crate) fn chunk_dir_accessible(base: &Path) -> Result<PathBuf, Error> {
`chunk_dir_accessible` tries to get the metadata for the chunk directory
in order to check if the chunk store is available.
So this could be used to not only check if there is a chunk directory,
but rather that the chunk directory has the same device ID and i-node
number. For that check, one would only need to get and store the chunk
directory metadata on `ChunkStore::open`. That would be a rather non
invasive check to also detect if the datastore has been over-mounted,
not just if it vanished.
What do you think? Is this feasible? Not fully sure about the
implications for removable datastores, but that should be fine as well,
as the device id and i-node should not change as long as the datastore
remains online.
> if !base.is_absolute() {
> bail!("expected absolute path - got {:?}", base);
> }
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 0801b4bf6..bb9b4471f 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -232,6 +232,12 @@ impl DataStore {
> // reuse chunk store so that we keep using the same process locker instance!
> let chunk_store = if let Some(datastore) = &entry {
> let last_digest = datastore.last_digest.as_ref();
> + match operation {
> + Some(Operation::Read) | Some(Operation::Write) => {
> + ChunkStore::chunk_dir_accessible(&datastore.chunk_store.base_path())?;
> + }
> + Some(Operation::Lookup) | None => {}
> + }
> if let Some(true) = last_digest.map(|last_digest| last_digest == &digest) {
> if let Some(operation) = operation {
> update_active_operations(name, operation, 1)?;
More information about the pbs-devel
mailing list