[pbs-devel] [PATCH proxmox-backup 3/5] datastore: implement sync-level tuning for datastores

Thomas Lamprecht t.lamprecht at proxmox.com
Tue May 24 10:14:21 CEST 2022


On 23/05/2022 09:13, Fabian Grünbichler wrote:
> I am not sure whether that "in practice" statement really holds
> - we only tested the exact failure case on a few filesystems, there 
>   might be ones in use out there where a powerloss can also lead to a 
>   truncated chunk, not only an empty/missing one. granted, both will be 
>   detected on the next verification, but only the latter will be 
>   automatically cleaned up by a subsequent backup task that uploads this 
>   chunk..

I don't think partially written files can happen on a journaled FS in that
way, at least as long as the default-on write barriers are not disabled.
XFS, ext4 and btrfs are fine in that regard, FWICT; I didn't check others,
but I'd think that NFS, CIFS and ZFS would additionally be interesting.

> - the FS underlying the datastore might be used for many datastores, or 
>   even other, busy, non-datastore usage. not an ideal setup, but there 
>   might be $reasons. in this case, syncfs might have a much bigger 
>   negative effect (because of syncing out other, unrelated I/O) than 
>   fsync.

Yeah, the edge case exists, but "in practice" means the general case, in
which PBS is still its own appliance where one won't also co-host a
high-churn Apache Cassandra DB or whatever. So this still holds, and in most
cases syncfs will be more efficient than two fsyncs per chunk (which
internally often flush more than just the two inodes anyway).

If an admin still does it for $reasons they can switch to the fsync-based
level. I'd find it odd if the "in practice" of a doc comment would hinder
them from ever trying, especially as such setups are mostly created by users
who won't care for best practice anyway.

> - not sure what effect syncfs has if a datastore is really busy (as in, 
>   has tons of basically no-op backups over a short period of time)

What effects do you imagine? It just starts a writeback kernel worker that
flushes, in a lockless manner using RCU, all dirty inodes belonging to a
super block at the time syncfs was called (see sync.c and fs-writeback.c in
the kernel's fs/ tree). New I/O isn't stopped, and those inodes would have
been synced within the next 30s (the default writeback interval) anyway.

> 
> I'd rather mark 'Filesystem' as a good compromise, and the 'File' one as 
> most consistent.
