[pbs-devel] [PATCH proxmox-backup 2/2] fix #5982: garbage collection: check atime updates are honored

Christian Ebner c.ebner at proxmox.com
Tue Feb 18 13:39:44 CET 2025


On 2/18/25 12:53, Thomas Lamprecht wrote:
> Am 17.02.25 um 16:36 schrieb Fabian Grünbichler:
>> On February 17, 2025 2:12 pm, Christian Ebner wrote:
>>> Check if the filesystem the chunk store is located on actually
>>> updates the atime when performing the marking of the chunks in
>>> phase 1 of the garbage collection. It is not enough to check whether
>>> the atime of a single/first chunk is updated, since the filesystem
>>> can be mounted with the `relatime` option. Therefore, find the first
>>> chunk which is outside relatime's 24 hour cutoff window and check
>>> the update on that chunk only.
>>
>> given that our touching should punch through relatime (and does so on
>> all filesystems we tested so far), couldn't we just
>>
>> - stat the first chunk
>> - touch the first chunk
>> - check if timestamps have been updated
>> - print a warning about the filesystem being potentially broken, and
>> - if the option is enabled, suggest the user report the details to us
>> - only continue if the option is explicitly disabled
>>
>> that way we should get a real world survey of broken file systems that
>> could inform our decision to drop the 24h window (or keep it).. if we
>> introduce an option (defaulting to yes for now) conditionalizing the 24h
>> window, we could even tell users with semi-broken storages (see below)
>> to explicitly set that option in case we later switch the default,
>> although I am not sure whether such storages exist for real.
> 
> +1; one (additional) option _might_ be to trigger such a check on
> datastore creation, e.g. create the all-zero chunk and then do that
> test. As of now that probably would not win us much, but if we make
> the 24h-wait opt-in then users would be warned early enough, or we
> could even auto-set that option in such a case.

Checking the atime update only on datastore creation is not enough, I
think, as the backing filesystem might get remounted with changed
mount parameters. Or do you mean to *also* check on datastore creation,
to detect issues early on? That said, in my testing the atime update
seems to be honored even with `noatime`, given the way the garbage
collection performs the time updates (further details below).
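
To make the check under discussion concrete, here is a minimal sketch of
the stat/touch/compare sequence. It is written in Python for brevity and
is not the implementation from the patch; the function name
`atime_update_honored` is made up for illustration, and it assumes the
file's current atime is older than the filesystem's timestamp
granularity.

```python
import os
import time


def atime_update_honored(path):
    """Return True if an explicit atime update on `path` is visible afterwards."""
    before = os.stat(path)

    # Explicitly set atime to "now" and pass the old mtime back unchanged.
    # os.utime() with ns= requests an explicit timestamp update (via
    # utimensat() on Linux where available), in contrast to the implicit
    # atime update that happens on read.
    os.utime(path, ns=(time.time_ns(), before.st_mtime_ns))

    after = os.stat(path)
    return after.st_atime_ns > before.st_atime_ns
```

A freshly created file may already carry an atime of "now", so the
result is only meaningful for a chunk that has not been touched very
recently.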

Anyway, creating the all-zero chunk and using it for the check sounds
like a good optimization to me, as it avoids conditional checking in
phase 1 of garbage collection. It comes at the cost of having to make
sure that this chunk is never cleaned up by phase 2, though...

Regarding the 24 hour waiting period: as mentioned above, I noticed that
atime updates are honored even if I set `noatime` on ext4 or
`atime=off` on zfs.
It seems that utimensat() bypasses these settings, as it calls into
vfs_utimes() [0], which marks this as an explicit time update,
followed by notify_change() [1], which then calls the setattr() of
the corresponding filesystem [2] via the given inode.
This path seems to bypass atime_needs_update() [3], which is only called
by touch_atime(). atime_needs_update() is also the place where
relatime_needs_update() [4] is checked.

This analysis is not conclusive (yet), though.
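
For reference, the relatime decision in [4] boils down to roughly the
following, as far as I read the kernel source. This is a simplified
Python model in whole seconds, assuming a `relatime` mount; it is not the
kernel code itself, and the Python function signature is only for
illustration.

```python
RELATIME_WINDOW = 24 * 60 * 60  # 24 hour cutoff, in seconds


def relatime_needs_update(atime, mtime, ctime, now):
    """Model of when an implicit (read-triggered) atime update happens."""
    # atime is not newer than the last data modification
    if mtime >= atime:
        return True
    # atime is not newer than the last status change
    if ctime >= atime:
        return True
    # atime is at least 24 hours old
    if now - atime >= RELATIME_WINDOW:
        return True
    # otherwise the implicit update is skipped
    return False
```

An explicit update via utimensat() does not appear to pass through this
decision at all, which would explain why the touch is honored regardless
of the mount options.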

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/utimes.c#n20
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/attr.c#n426
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/attr.c#n552
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/inode.c#n2139
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/inode.c#n2008



