[pbs-devel] [RFC PATCH proxmox-backup 0/2] Introduce experimental `AsyncExtractor<T>`

Max Carrara m.carrara at proxmox.com
Mon Aug 28 16:42:02 CEST 2023


This RFC proposes an asynchronous implementation of
`pbs_client::pxar::extract::{Extractor, ExtractorIter}`.

This `AsyncExtractor<T>` has been remodeled from the ground up while
preserving the core extraction logic. Its purpose is to provide
fully concurrent extraction of pxar files or streams. It does so by
offloading every synchronous / blocking call to a separate worker
thread with an internal queue. Extraction tasks are executed
sequentially to allow for predictable behaviour.
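
To illustrate the worker-thread idea described above, here is a minimal
sketch using only the standard library. It is not the code from this
series: `Worker`, `submit` and `demo_order` are hypothetical names, and
the reply travels over a plain `std::sync::mpsc` channel, whereas an
async caller would presumably await the result on its runtime instead.
The sketch only shows that a single thread draining a FIFO queue runs
blocking tasks in submission order.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A unit of blocking work, boxed so that different closures share one queue.
type Task = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical worker: one thread draining a FIFO queue of blocking tasks.
pub struct Worker {
    queue: Option<Sender<Task>>,
    handle: Option<JoinHandle<()>>,
}

impl Worker {
    pub fn new() -> Self {
        let (queue, tasks) = mpsc::channel::<Task>();
        // The loop ends once every sender has been dropped.
        let handle = thread::spawn(move || {
            for task in tasks {
                task();
            }
        });
        Self {
            queue: Some(queue),
            handle: Some(handle),
        }
    }

    /// Enqueue a blocking closure; the returned receiver yields its result.
    pub fn submit<T, F>(&self, f: F) -> Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (reply_tx, reply_rx) = mpsc::channel();
        let task: Task = Box::new(move || {
            // Ignore send errors: the caller may have dropped the receiver.
            let _ = reply_tx.send(f());
        });
        self.queue
            .as_ref()
            .expect("queue is only taken in Drop")
            .send(task)
            .expect("worker thread has stopped");
        reply_rx
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        // Close the queue so the worker loop ends, then wait for the thread.
        drop(self.queue.take());
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

/// Submits three tasks and returns the order in which the worker ran them.
pub fn demo_order() -> Vec<u32> {
    let worker = Worker::new();
    let log = Arc::new(Mutex::new(Vec::new()));
    let replies: Vec<Receiver<u32>> = (1..=3)
        .map(|id| {
            let log = Arc::clone(&log);
            worker.submit(move || {
                log.lock().unwrap().push(id);
                id * 10
            })
        })
        .collect();
    // Wait until every task has reported back.
    for reply in replies {
        reply.recv().unwrap();
    }
    let order = log.lock().unwrap().clone();
    order
}
```

Because there is exactly one consumer thread and the channel preserves
the order of sends from a single sender, `demo_order()` records the
tasks in the order they were submitted. This is the "predictable
behaviour" trade-off: callers never block on the file system
themselves, but the blocking operations are not run in parallel.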

The async extractor is intentionally put into a separate module
(located at `pbs_client::pxar::aio`), as a complete refactor of every
existing extraction-related piece of code is beyond the scope (and
intention) of this RFC.

Its public API is nowhere near final, but serves its purpose for the
time being. Other functions found within `pbs_client::pxar::extract`
are not yet implemented.

Questions this RFC intends to resolve:
  1. In which situations would the `AsyncExtractor<T>` make sense?
     In which wouldn't it?
  2. Should the sync variant be kept around, sharing a `common`
     implementation with its async variant? If yes, why?
  3. Are there any features that the `AsyncExtractor<T>` lacks?

Although of lesser priority, these questions should also be addressed:
  4. Which parts of the `AsyncExtractor<T>` are inadequate and could
     use improvement?
  5. Which traits should the `AsyncExtractor<T>` implement, if any?
     (e.g. traits from `tokio_stream`)

Furthermore, because async code in Rust requires a runtime, the
`AsyncExtractor<T>` currently suffers from that runtime's overhead.
This difference in performance can be seen when comparing the async
version of `pxar` (see patch 2) with its current sync counterpart. In
my opinion, this points towards a common implementation that can be
used by either variant, sync or async, but I am curious what others
have to say.

Let me know what you think! :-)

Max Carrara (2):
  pbs-client: pxar: Add prototype implementation of `AsyncExtractor<T>`
  pxar-bin: Use async instead of sync extractor

 Cargo.toml                                   |   1 +
 pbs-client/Cargo.toml                        |   1 +
 pbs-client/src/pxar/aio/dir_stack.rs         | 543 +++++++++++++++++++
 pbs-client/src/pxar/aio/extract/extractor.rs | 446 +++++++++++++++
 pbs-client/src/pxar/aio/extract/mod.rs       | 220 ++++++++
 pbs-client/src/pxar/aio/extract/raw.rs       | 503 +++++++++++++++++
 pbs-client/src/pxar/aio/metadata.rs          | 412 ++++++++++++++
 pbs-client/src/pxar/aio/mod.rs               |  11 +
 pbs-client/src/pxar/aio/worker.rs            | 167 ++++++
 pbs-client/src/pxar/mod.rs                   |   1 +
 pxar-bin/src/main.rs                         |  91 ++--
 11 files changed, 2352 insertions(+), 44 deletions(-)
 create mode 100644 pbs-client/src/pxar/aio/dir_stack.rs
 create mode 100644 pbs-client/src/pxar/aio/extract/extractor.rs
 create mode 100644 pbs-client/src/pxar/aio/extract/mod.rs
 create mode 100644 pbs-client/src/pxar/aio/extract/raw.rs
 create mode 100644 pbs-client/src/pxar/aio/metadata.rs
 create mode 100644 pbs-client/src/pxar/aio/mod.rs
 create mode 100644 pbs-client/src/pxar/aio/worker.rs

--
2.39.2





