[pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored

Christian Ebner c.ebner at proxmox.com
Mon Feb 17 16:57:08 CET 2025


On 2/17/25 16:36, Fabian Grünbichler wrote:
> On February 17, 2025 2:12 pm, Christian Ebner wrote:
>> Check if the filesystem the chunk store is located on actually
>> updates the atime when performing the marking of the chunks in
>> phase 1 of the garbage collection. Checking whether the atime of a
>> single/first chunk is updated is not enough, since the filesystem can
>> be mounted with the `relatime` option. Instead, find the first chunk
>> which is outside relatime's 24 hour cutoff window and check the update
>> on that chunk only.
> 
> given that our touching should punch through relatime (and does so on
> all filesystems we tested so far), couldn't we just
> 
> - stat the first chunk
> - touch the first chunk
> - check if timestamps have been updated
> - print a warning about the filesystem being potentially broken, and
> - if the option is enabled, suggest the user report the details to us
> - only continue if the option is explicitly disabled
> 
> that way we should get a real world survey of broken file systems that
> could inform our decision to drop the 24h window (or keep it).. if we
> introduce an option (defaulting to yes for now) conditionalizing the 24h
> window, we could even tell users with semi-broken storages (see below)
> to explicitly set that option in case we later switch the default,
> although I am not sure whether such storages exist for real.

Hmm, that is a good idea!
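
To make the suggested flow concrete, here is a rough sketch of the
stat/touch/compare steps and the opt-out decision. This is only an
illustration under stated assumptions, not the code from the patch:
the names `check_atime_update`, `gc_may_proceed`, `AtimeCheck` and the
`check_enabled` flag are hypothetical, and it uses plain std
(`File::set_times`, Rust 1.75+) instead of whatever touch helper the
chunk store actually uses.

```rust
use std::fs::{self, FileTimes, OpenOptions};
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Result of the (hypothetical) atime check on a single chunk.
#[derive(Debug, PartialEq, Eq)]
pub enum AtimeCheck {
    /// The explicit timestamp update was visible afterwards.
    Updated,
    /// The atime did not change, so the filesystem may ignore updates.
    Unchanged,
}

/// Stat the chunk, set its atime explicitly, stat again and compare.
pub fn check_atime_update(chunk: &Path) -> io::Result<AtimeCheck> {
    // 1. stat the first chunk
    let before = fs::metadata(chunk)?.accessed()?;

    // 2. touch it: only the atime is set, the mtime is left alone.
    //    This needs sufficient permissions on the file.
    let file = OpenOptions::new().write(true).open(chunk)?;
    file.set_times(FileTimes::new().set_accessed(SystemTime::now()))?;

    // 3. check whether the timestamp has been updated
    let after = fs::metadata(chunk)?.accessed()?;
    Ok(if after != before {
        AtimeCheck::Updated
    } else {
        AtimeCheck::Unchanged
    })
}

/// Decide whether garbage collection may continue.
/// `check_enabled` stands in for the proposed opt-out option.
pub fn gc_may_proceed(result: &AtimeCheck, check_enabled: bool) -> bool {
    match result {
        AtimeCheck::Updated => true,
        AtimeCheck::Unchanged => {
            eprintln!("warning: atime update not honored, filesystem may be broken");
            if check_enabled {
                eprintln!("please report the storage/filesystem details");
            }
            // only continue if the check was explicitly disabled
            !check_enabled
        }
    }
}
```

Keeping the decision separate from the filesystem probe means the
opt-out handling can be reasoned about (and tested) without needing a
broken filesystem at hand.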

> the only downside would be a potential slew of reports if we missed some
> prominent filesystem that applies relatime to explicit timestamp updates
> (any prominent storage ignoring explicit timestamp updates altogether
> would have already cropped up in our support channels after causing
> fatal data loss, and we only had a handful of such reports so far, usually
> involving proprietary storage appliances).

Of course not ideal, but not too big of an issue if this check is
implemented to be opt-out. Giving the user a way to disable such a check
and generating data to potentially drop the 24h grace period altogether
in the future sounds good to me!

Will adapt the patches accordingly in version 2, thanks!



