[pbs-devel] [PATCH proxmox-backup] mapped loop device: use read loop instead of read_exact
Wolfgang Bumiller
w.bumiller at proxmox.com
Tue Nov 28 11:07:15 CET 2023
On Mon, Nov 27, 2023 at 06:27:53PM +0100, Fabian Grünbichler wrote:
> > Wolfgang Bumiller <w.bumiller at proxmox.com> wrote on 27.11.2023 14:22 CET:
> >
> > On Thu, Jun 29, 2023 at 12:32:13PM +0200, Fabian Grünbichler wrote:
> > > since read_exact does not support short reads, which can easily happen if the
> > > mapped image's EOF is not aligned with the request size.
> > >
> > > Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
> > > ---
> > >
> > > Notes:
> > > reported on the forum:
> > >
> > > https://forum.proxmox.com/threads/problem-backing-up-using-backup-client.129347
> > >
> > > did a quick test reading from a mapped image full of random data, observed
> > > no performance difference..
> >
> > Do you get one if we just drop the loop logic and *actually* just
> > `read()` once? IMO this is more in line with what a read syscall
> > *should* be doing.
> > Further, we use a `CachedChunkReader` under it which actually does a
> > read loop anyway, so AFAICT this *can't* make a difference.
>
> with a plain read (+ optional truncate of the reply buf) performance is still the same. but (and I am unfortunately not sure if this is a regression in the meantime, or was also broken back when I originally wrote this patch) access via the loop device actually truncates the resulting data:
>
> - my test input image is 1701838801 bytes long (arbitrary misaligned size, straight from /dev/urandom)
it's a loop device -> these are block devices defaulting to 512-byte blocks
1701838801 % 512 = 465
if you call `losetup` manually on it you'll get a warning like:
losetup: /your/file: Warning: file does not fit into a 512-byte sector; the end of the file will be ignored.
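
to spell out the arithmetic, a minimal sketch (the helper name `visible_len` is made up for illustration, it is not part of losetup or proxmox-backup, and it assumes the device simply drops a trailing partial sector):

```rust
/// Number of bytes a block device with `sector_size`-byte sectors exposes
/// for a backing file of `file_len` bytes, assuming the trailing partial
/// sector is ignored (hypothetical helper, for illustration only).
fn visible_len(file_len: u64, sector_size: u64) -> u64 {
    // Only whole sectors are addressable, so round down to a sector boundary.
    file_len - file_len % sector_size
}
```

with the sizes from your test, `visible_len(1701838801, 512)` gives 1701838336, i.e. exactly the 465 trailing bytes go missing.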
> - the fuse session correctly gets this passed in as size
> - a regular restore restores as many (correct) bytes
> - reading via the loop device with bs=1024 or bs=512 or bs=32 only returns 1701838336 bytes (465 are missing)
> -- the fuse requests quickly ramp up to 128k request size (no matter the block size used to read from the loop device)
> -- the last fuse read request is for 16384 bytes, but the read from PBS (correctly!) only returns 16337
> -- 16337 - 31*512 = 465
> -- so it seems the short read result is lost somewhere?
> -- reading with O_DIRECT doesn't help (in fact, it tanks performance while still reproducing the issue)
>
> anyhow, this requires further analysis and fixing before being applied in whichever fashion..
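
for reference, the difference between `read_exact` and a read loop that the patch subject refers to can be sketched roughly like this. This is a simplified, synchronous illustration using only `std::io`, not the actual code from the patch; the helper name `read_up_to` is an assumption made for this example:

```rust
use std::io::{self, ErrorKind, Read};

/// Fill `buf` as far as possible and return the number of bytes read.
/// Unlike `Read::read_exact`, hitting EOF early is not an error: the
/// caller gets a short count and can truncate its reply accordingly.
/// (Hypothetical helper, for illustration only.)
fn read_up_to<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            // A zero-length read means EOF: stop and report what we have.
            Ok(0) => break,
            Ok(n) => filled += n,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

for a 16384 byte request with only 16337 bytes left, `read_up_to` returns `Ok(16337)`, whereas `read_exact` fails with `ErrorKind::UnexpectedEof`.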