[pbs-devel] [PATCH pxar 2/2] decoder: aio: improve performance of async file reads
Wolfgang Bumiller
w.bumiller at proxmox.com
Fri Aug 4 13:27:39 CEST 2023
On Thu, Jul 20, 2023 at 07:15:05PM +0200, Max Carrara wrote:
> In order to bring `aio::Decoder` on par with its `sync` counterpart
> as well as `sync::Accessor` and `aio::Accessor`, its input is now
> buffered.
>
> As the `tokio` docs mention themselves [0], it can be really
> inefficient to directly work with an (unbuffered) `AsyncRead`
> instance.
Sure, but the question is *where* it truly makes sense to do the
buffering, more below...
(...)
> ---
> src/decoder/aio.rs | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/src/decoder/aio.rs b/src/decoder/aio.rs
> index 200dd3d..174551b 100644
> --- a/src/decoder/aio.rs
> +++ b/src/decoder/aio.rs
> @@ -79,14 +79,20 @@ mod tok {
> use std::pin::Pin;
> use std::task::{Context, Poll};
>
> - /// Read adapter for `futures::io::AsyncRead`
> + use tokio::io::AsyncRead;
> +
> + /// Read adapter for `tokio::io::AsyncRead`
> pub struct TokioReader<T> {
^ This is a very generic interface here...
> - inner: T,
> + inner: tokio::io::BufReader<T>,
> }
>
> impl<T: tokio::io::AsyncRead> TokioReader<T> {
> pub fn new(inner: T) -> Self {
Note that `tokio`'s `BufReader` itself also implements `AsyncRead`, and
the user may already have a buffered reader here.
A better choice for us here would be to perform this change with the
`tokio-fs` feature and replace the
    impl Decoder<TokioReader<tokio::fs::File>> {
        fn open(...) -> io::Result<Self> { ... }
    }
(which exists only so that `Decoder::open` can be used by the crate
consumer easily, automatically producing a `Decoder` for "some file
type"...)
with:
    impl Decoder<TokioReader<BufReader<tokio::fs::File>>> {
        fn open(...) -> io::Result<Self> { ... }
    }
Since this is the place where we *actually* should be creating the
buffered reader.
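To illustrate the shape I mean, here is a minimal sketch using only
`std::io` so that it stands on its own. The `Decoder` type, `from_reader`
and `read_some` below are hypothetical stand-ins, not the actual pxar API,
and the real change would use `tokio::io::BufReader` and `tokio::fs::File`
instead. The generic constructor leaves the reader untouched, and only the
file-specific `open` adds a buffer:

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::path::Path;

/// Hypothetical stand-in for the crate's generic decoder (not the real pxar type).
pub struct Decoder<T> {
    input: T,
}

impl<T: Read> Decoder<T> {
    /// Generic constructor: takes the reader as-is and adds no buffering,
    /// since the caller may already have passed a buffered reader.
    pub fn from_reader(input: T) -> Self {
        Self { input }
    }

    /// Read up to `buf.len()` bytes from the underlying input.
    pub fn read_some(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.input.read(buf)
    }
}

impl Decoder<BufReader<File>> {
    /// Convenience constructor for the concrete file case. Here we know the
    /// input is a plain, unbuffered file, so this is where the buffer goes.
    pub fn open<P: AsRef<Path>>(path: P) -> io::Result<Self> {
        Ok(Self::from_reader(BufReader::new(File::open(path)?)))
    }
}
```

A consumer with their own reader type keeps full control via
`from_reader`, while `Decoder::open` users get buffering for free.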
> - Self { inner }
> + // buffer size "sweet spot" - larger sizes don't seem to provide any benefit
> + const BUF_SIZE: usize = 1024 * 16;
And we also wouldn't have to decide on a sane buffer size here under
the assumption that it is the right size for any possible T we
instantiate the decoder with.
There's a bit of a danger with sprinkling `BufReader`s in generic `T:
Read` APIs, as this may lead to several of them getting chained
together.
E.g. a consumer of the crate may instantiate a
`Decoder<SomeNetworkFile<TlsStreamThing>>`.
Then they read that buffering for such things can improve performance
and turn that into: `Decoder<SomeNetworkFile<BufReader<TlsStreamThing>>>`.
Little do they know that `Decoder` buffers, the creator of
`SomeNetworkFile` also thought the same thing and buffers as well, and
`TlsStreamThing` might also need buffering for a sane implementation, and
suddenly you're just chaining memcpys across 4 buffers before the data
ends up at the destination ;-)
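A rough sketch of how that stacking happens, again with `std::io` only.
`BufferingDecoder`, `read_all` and `stacked_capacity` are made-up names
for illustration, and the 8 KiB and 16 KiB capacities are just example
values. The decoder unconditionally wraps its input, so a caller who
already buffered ends up with two buffers without ever asking for that:

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical generic decoder that always adds its own buffer.
pub struct BufferingDecoder<T> {
    inner: BufReader<T>,
}

impl<T: Read> BufferingDecoder<T> {
    /// Wraps *any* reader in a 16 KiB buffer, whether or not it needs one.
    pub fn new(inner: T) -> Self {
        Self {
            inner: BufReader::with_capacity(16 * 1024, inner),
        }
    }

    /// Drain the input in small chunks, roughly like a decoder parsing
    /// small headers would.
    pub fn read_all(&mut self) -> io::Result<Vec<u8>> {
        let mut out = Vec::new();
        let mut chunk = [0u8; 512];
        loop {
            let n = self.inner.read(&mut chunk)?;
            if n == 0 {
                return Ok(out);
            }
            out.extend_from_slice(&chunk[..n]);
        }
    }
}

impl<T: Read> BufferingDecoder<BufReader<T>> {
    /// Combined capacity of the decoder's buffer and the caller's buffer
    /// when the caller passed in an already-buffered reader.
    pub fn stacked_capacity(&self) -> usize {
        self.inner.capacity() + self.inner.get_ref().capacity()
    }
}
```

Calling `BufferingDecoder::new(BufReader::with_capacity(8 * 1024, source))`
type-checks fine and reads correctly, which is exactly the problem:
nothing warns the caller that there are now two buffers in the chain.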
> + Self {
> + inner: tokio::io::BufReader::with_capacity(BUF_SIZE, inner),
> + }
> }
> }
>
> --
> 2.39.2