[pbs-devel] [PATCH v7 proxmox-backup 58/69] docs: add section describing change detection mode
Christian Ebner
c.ebner at proxmox.com
Mon May 27 16:33:12 CEST 2024
Describe the motivation and basic principle of the client's change
detection mode and show an example invocation.
Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
---
changes since version 6:
- add more information on metadata being compared
- adapt and link from technical overview
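
For illustration only, the metadata comparison and the padding check described in the new section can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the client's implementation: the actual client is written in Rust and compares against entries looked up in the previous `mpxar` archive. The names `Meta`, `read_meta`, `metadata_unchanged` and `should_reuse`, the selection of fields, and the 10% default threshold are all hypothetical. ACLs and extended attributes are left out for brevity.

```python
import os
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Meta:
    """Subset of file metadata used for the comparison (assumed fields)."""
    size: int
    file_type: int  # file type bits of st_mode
    mode: int       # permission bits of st_mode
    uid: int
    gid: int
    mtime_ns: int


def read_meta(path: str) -> Meta:
    """Collect the compared fields for a path without following symlinks."""
    st = os.lstat(path)
    return Meta(
        size=st.st_size,
        file_type=stat.S_IFMT(st.st_mode),
        mode=stat.S_IMODE(st.st_mode),
        uid=st.st_uid,
        gid=st.st_gid,
        mtime_ns=st.st_mtime_ns,
    )


def metadata_unchanged(previous: Meta, current: Meta) -> bool:
    """True only if every compared field matches."""
    return previous == current


def should_reuse(reused_bytes: int, referenced_chunk_bytes: int,
                 threshold: float = 0.1) -> bool:
    """Decide between re-using cached chunks and re-encoding.

    Padding is the part of the referenced chunks that is not re-used.
    The threshold value is a hypothetical default for this sketch.
    """
    if referenced_chunk_bytes == 0:
        return False
    padding = referenced_chunk_bytes - reused_bytes
    return padding / referenced_chunk_bytes <= threshold
```

A caller would compare `read_meta(path)` with the metadata stored for the same path in the previous snapshot, and only consult `should_reuse` for entries where `metadata_unchanged` returned true.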
docs/backup-client.rst | 45 +++++++++++++++++++++++++++++++++++++
docs/technical-overview.rst | 3 +++
2 files changed, 48 insertions(+)
diff --git a/docs/backup-client.rst b/docs/backup-client.rst
index 00a1abbb3..58fcd79f0 100644
--- a/docs/backup-client.rst
+++ b/docs/backup-client.rst
@@ -280,6 +280,51 @@ Multiple paths can be excluded like this:
# proxmox-backup-client backup.pxar:./linux --exclude=/usr --exclude=/rust
+.. _client_change_detection_mode:
+
+Change Detection Mode
+~~~~~~~~~~~~~~~~~~~~~
+
+File-based backups containing a lot of data can take a long time, as the default
+behavior for the Proxmox backup client is to read all data and re-encode it.
+The encoded stream is split into variable-sized chunks for efficient
+deduplication. Based on its digest, each chunk is either uploaded or, if it
+is already available on the server, indexed without upload (and therefore
+deduplicated). For use cases where files do not change frequently,
+re-reading all data on every backup is undesirable and may not be
+feasible.
+
+The backup client's `change-detection-mode` can be switched from the default to
+`metadata` based detection to mitigate the limitation described above,
+instructing the client to avoid re-reading files with unchanged metadata
+whenever possible. When using this mode, instead of the regular pxar archive,
+the backup snapshot is stored in two separate files: the `mpxar` containing the
+archive's metadata and the `ppxar` containing a concatenation of the file
+contents. This split allows metadata lookups without the overhead of the file
+contents. Setting `change-detection-mode` to `data` creates the same split
+archive as the `metadata` mode, but without using a previous archive as
+reference, therefore re-encoding all file payloads.
+
+When creating the backup archives, the current file metadata is compared to the
+metadata looked up in the previous `mpxar` archive.
+The metadata comparison includes file size, file type, ownership and permission
+information, ACLs and attributes and, most importantly, the file's mtime. For
+details, see the :ref:`pxar metadata archive format <pxar-meta-format>`.
+
+If unchanged, the entry is cached for possible re-use of content chunks without
+re-reading, by indexing the already present chunks containing the contents from
+the previous backup snapshot. Since the file might only partially re-use chunks
+(thereby introducing wasted space in the form of padding), the decision whether
+to re-use or re-encode the currently cached entries is deferred until enough
+information is available to compare the possible padding with a threshold.
+
+The following shows an example of a client invocation using the `metadata`
+mode:
+
+.. code-block:: console
+
+    # proxmox-backup-client backup linux.pxar:./linux --change-detection-mode=metadata
+
.. _client_encryption:
Encryption
diff --git a/docs/technical-overview.rst b/docs/technical-overview.rst
index 89835a7cc..a8b1c7268 100644
--- a/docs/technical-overview.rst
+++ b/docs/technical-overview.rst
@@ -28,6 +28,9 @@ which are not chunked, e.g. the client log), or one or more indexes
When uploading an index, the client first has to read the source data, chunk it
and send the data as chunks with their identifying checksum to the server.
+When using the :ref:`change detection mode <client_change_detection_mode>`,
+payload chunks for unchanged files are reused from the previous snapshot, so
+the source data does not have to be read again.
If there is a previous Snapshot in the backup group, the client can first
download the chunk list of the previous Snapshot. If it detects a chunk that
--
2.39.2