[pbs-devel] [RFC pxar proxmox-backup 00/20] fix #3174: improve file-level backup
Christian Ebner
c.ebner at proxmox.com
Fri Sep 22 09:16:01 CEST 2023
This (still rather rough) series of patches prototypes a possible
approach to improve the creation speed of pxar file-level backups.
The series is intended to gather initial feedback on the implementation
approach and to uncover possible pitfalls I might not be aware of.
The current approach is to skip encoding of regular file payloads
for which the metadata (currently mtime and size) did not change
compared to a previous backup run. Instead of re-encoding the files, a
reference to a newly introduced appendix section of the pxar archive
is written. The appendix section is created as a concatenation
of indexed chunks from the previous backup run, thereby containing the
sequential file payload at a calculated offset with respect to the
starting point of the appendix section.
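The reuse decision described above can be sketched roughly as follows.
This is only an illustration, not code from the patches: the types
FileMeta and Action and the function decide are hypothetical names, and
the previous run is assumed to provide the old metadata together with
the payload offset inside the appendix section.

```rust
/// Metadata fields compared between backup runs (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileMeta {
    pub mtime: i64,
    pub size: u64,
}

/// What the archiver does with a regular file (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Read and encode the file payload as usual.
    Encode,
    /// Write only a reference to the payload inside the appendix.
    Reuse { appendix_offset: u64 },
}

/// Decide whether a file payload can be reused.
///
/// `previous` is the entry found for the same path in the reference
/// backup, paired with the (assumed) offset of its payload relative
/// to the start of the appendix section.
pub fn decide(current: FileMeta, previous: Option<(FileMeta, u64)>) -> Action {
    match previous {
        // Unchanged mtime and size: reference the old payload.
        Some((old, appendix_offset)) if old == current => Action::Reuse { appendix_offset },
        // New file or changed metadata: encode again.
        _ => Action::Encode,
    }
}
```

Note that comparing only mtime and size cannot detect files that were
modified without changing either value.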
Metadata comparison and calculation of the chunks to be indexed for the
appendix section are performed using the catalog of a previous backup as
reference. In order to be able to calculate the offsets, the current
catalog format is extended to include the file offset with respect to
the pxar archive byte stream. This makes it possible to find the
required chunk indexes, the start padding within the concatenated chunks
and the total bytes introduced by the chunks.
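As a rough sketch of this calculation (not the code from the patches),
assume the index of the previous archive is available as a sorted,
strictly increasing list of chunk end offsets. ChunkRange and locate
are hypothetical names chosen for this example.

```rust
/// Chunks covering one file payload (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChunkRange {
    /// Index of the first chunk containing payload bytes.
    pub first: usize,
    /// Index of the last chunk containing payload bytes (inclusive).
    pub last: usize,
    /// Bytes between the start of the first chunk and the payload.
    pub start_padding: u64,
    /// Total size of all chunks in the range.
    pub total_bytes: u64,
}

/// Find the chunks that cover `file_size` bytes starting at
/// `file_start` in the archive byte stream.
///
/// `chunk_ends` holds the end offset of each chunk and is assumed to
/// be strictly increasing. Returns `None` for empty payloads and for
/// ranges that reach beyond the last chunk.
pub fn locate(chunk_ends: &[u64], file_start: u64, file_size: u64) -> Option<ChunkRange> {
    if file_size == 0 {
        return None;
    }
    let file_end = file_start.checked_add(file_size)?;
    // First chunk whose end lies beyond the payload start.
    let first = chunk_ends.partition_point(|&end| end <= file_start);
    // First chunk whose end reaches the payload end.
    let last = chunk_ends.partition_point(|&end| end < file_end);
    if last >= chunk_ends.len() {
        return None;
    }
    let first_start = if first == 0 { 0 } else { chunk_ends[first - 1] };
    Some(ChunkRange {
        first,
        last,
        start_padding: file_start - first_start,
        total_bytes: chunk_ends[last] - first_start,
    })
}
```

For chunk ends [100, 250, 400] and a payload of 200 bytes at offset 120,
this selects chunks 1 and 2 with a start padding of 20 bytes and 300
bytes added to the appendix.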
During encoding, the chunks needed for the appendix section are injected
into the pxar archive after forcing a chunk boundary once regular pxar
encoding is finished. Finally, pxar archives containing an appendix
section are marked as such by appending a final pxar goodbye lookup
table containing only the offset to the appendix section start and the
total size of that section. Both are needed for random access, e.g. for
mounting the archive via the fuse filesystem implementation.
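To illustrate why the start offset and size are sufficient for random
access, here is a minimal sketch. The AppendixTail type, its 16 byte
little-endian layout and the resolve helper are assumptions made for
this example only and do not describe the actual on-disk format of the
series.

```rust
/// Start offset and size of the appendix section (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AppendixTail {
    /// Offset of the appendix section within the archive.
    pub start: u64,
    /// Total size of the appendix section in bytes.
    pub size: u64,
}

impl AppendixTail {
    /// Serialize as two little-endian u64 values (assumed layout).
    pub fn to_bytes(self) -> [u8; 16] {
        let mut buf = [0u8; 16];
        buf[..8].copy_from_slice(&self.start.to_le_bytes());
        buf[8..].copy_from_slice(&self.size.to_le_bytes());
        buf
    }

    /// Parse the layout written by `to_bytes`.
    pub fn from_bytes(buf: [u8; 16]) -> Self {
        let mut start = [0u8; 8];
        let mut size = [0u8; 8];
        start.copy_from_slice(&buf[..8]);
        size.copy_from_slice(&buf[8..]);
        Self {
            start: u64::from_le_bytes(start),
            size: u64::from_le_bytes(size),
        }
    }

    /// Translate a reference (offset relative to the appendix start
    /// plus payload length) into an absolute archive offset.
    /// Returns `None` if the payload would not fit in the appendix.
    pub fn resolve(self, ref_offset: u64, len: u64) -> Option<u64> {
        let end = ref_offset.checked_add(len)?;
        if end > self.size {
            return None;
        }
        self.start.checked_add(ref_offset)
    }
}
```

A reader such as the fuse implementation could read the tail first and
then resolve each appendix reference to an absolute position without
scanning the archive sequentially.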
Currently, the code assumes the reference backup (for which the previous
run is used) to be a regular backup without appendix section, and the
catalog for that backup to already contain the required additional
offset information.
An invocation therefore looks like:
```bash
proxmox-backup-client backup <label>.pxar:<source-path>
proxmox-backup-client backup <label>.pxar:<source-path> --incremental
```
pxar:
Christian Ebner (8):
fix #3174: encoder: impl fn new for LinkOffset
fix #3174: decoder: factor out skip_bytes from skip_entry
fix #3174: decoder: impl skip_bytes for sync dec
fix #3174: metadata: impl fn to calc byte size
fix #3174: enc/dec: impl PXAR_APPENDIX_REF entrytype
fix #3174: enc/dec: impl PXAR_APPENDIX entrytype
fix #3174: encoder: add helper to incr encoder pos
fix #3174: enc/dec: impl PXAR_APPENDIX_TAIL entrytype
examples/mk-format-hashes.rs | 11 +++++
examples/pxarcmd.rs | 4 +-
src/accessor/mod.rs | 46 ++++++++++++++++++++
src/decoder/mod.rs | 38 +++++++++++++---
src/decoder/sync.rs | 6 +++
src/encoder/aio.rs | 36 ++++++++++++++--
src/encoder/mod.rs | 84 +++++++++++++++++++++++++++++++++++-
src/encoder/sync.rs | 32 +++++++++++++-
src/format/mod.rs | 16 +++++++
src/lib.rs | 54 +++++++++++++++++++++++
10 files changed, 312 insertions(+), 15 deletions(-)
proxmox-backup:
Christian Ebner (12):
fix #3174: index: add fn index list from start/end-offsets
fix #3174: index: add fn digest for DynamicEntry
fix #3174: api: double catalog upload size
fix #3174: catalog: incl pxar archives file offset
fix #3174: archiver/extractor: impl appendix ref
fix #3174: extractor: impl seq restore from appendix
fix #3174: archiver: store ref to previous backup
fix #3174: upload stream: impl reused chunk injector
fix #3174: chunker: add forced boundaries
fix #3174: backup writer: inject queued chunk in upload steam
fix #3174: archiver: reuse files with unchanged metadata
fix #3174: client: Add incremental flag to backup creation
examples/test_chunk_speed2.rs | 9 +-
pbs-client/src/backup_writer.rs | 88 ++++---
pbs-client/src/chunk_stream.rs | 41 +++-
pbs-client/src/inject_reused_chunks.rs | 123 ++++++++++
pbs-client/src/lib.rs | 1 +
pbs-client/src/pxar/create.rs | 217 ++++++++++++++++--
pbs-client/src/pxar/extract.rs | 141 ++++++++++++
pbs-client/src/pxar/mod.rs | 2 +-
pbs-client/src/pxar/tools.rs | 9 +
pbs-client/src/pxar_backup_stream.rs | 8 +-
pbs-datastore/src/catalog.rs | 122 ++++++++--
pbs-datastore/src/dynamic_index.rs | 38 +++
proxmox-backup-client/src/main.rs | 142 +++++++++++-
.../src/proxmox_restore_daemon/api.rs | 15 +-
pxar-bin/src/main.rs | 22 +-
src/api2/backup/upload_chunk.rs | 4 +-
src/tape/file_formats/snapshot_archive.rs | 2 +-
tests/catar.rs | 3 +
18 files changed, 886 insertions(+), 101 deletions(-)
create mode 100644 pbs-client/src/inject_reused_chunks.rs
--
2.39.2