[pbs-devel] [PATCH proxmox-backup 1/3] docs: centralise and update garbage collection description
Gabriel Goller
g.goller at proxmox.com
Fri Apr 5 12:49:26 CEST 2024
On Tue Apr 2, 2024 at 3:36 PM CEST, Hannes Duerr wrote:
> The "backup client usage" chapter describes a grace period that is 24
> hours and 5 minutes long, and unconnected to this a cut-off time is
> mentioned under "maintenance tasks", which leads to confusion. Therefore
> we summarise the entire description of garbage collection under
> "maintenance tasks" and link to it in the "backup client usage" chapter
>
> Signed-off-by: Hannes Duerr <h.duerr at proxmox.com>
> ---
> docs/backup-client.rst | 16 +++---------
> docs/maintenance.rst | 55 ++++++++++++++++++++++++++++++------------
> 2 files changed, 43 insertions(+), 28 deletions(-)
>
> diff --git a/docs/backup-client.rst b/docs/backup-client.rst
> index 00a1abbb..d015b844 100644
> --- a/docs/backup-client.rst
> +++ b/docs/backup-client.rst
> @@ -735,25 +735,15 @@ command. It is recommended to carry out garbage collection on a regular basis.
>
> The garbage collection works in two phases. In the first phase, all
> data blocks that are still in use are marked. In the second phase,
> -unused data blocks are removed.
> +unused data blocks are removed. A more detailed description of the GC
> +can be found :ref:`here <maintenance_gc>`.
> +
>
> .. note:: This command needs to read all existing backup index files
> and touches the complete chunk-store. This can take a long time
> depending on the number of chunks and the speed of the underlying
> disks.
>
> -.. note:: The garbage collection will only remove chunks that haven't been used
> - for at least one day (exactly 24h 5m). This grace period is necessary because
> - chunks in use are marked by touching the chunk which updates the ``atime``
> - (access time) property. Filesystems are mounted with the ``relatime`` option
> - by default. This results in a better performance by only updating the
> - ``atime`` property if the last access has been at least 24 hours ago. The
> - downside is that touching a chunk within these 24 hours will not always
> - update its ``atime`` property.
> -
> - Chunks in the grace period will be logged at the end of the garbage
> - collection task as *Pending removals*.
> -
> .. code-block:: console
>
> # proxmox-backup-client garbage-collect
> diff --git a/docs/maintenance.rst b/docs/maintenance.rst
> index 6dbb6941..baa1241e 100644
> --- a/docs/maintenance.rst
> +++ b/docs/maintenance.rst
> @@ -171,7 +171,7 @@ It's recommended to setup a schedule to ensure that unused space is cleaned up
> periodically. For most setups a weekly schedule provides a good interval to
> start.
>
> -GC Background
> +Overview
> ^^^^^^^^^^^^^
Small nit: adjust the length of the underline to match the length of the
title.
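For illustration, the heading could then look like this (only a sketch,
assuming the ``^`` adornment stays in place for this heading level):

```rst
Overview
^^^^^^^^
```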
>
> In `Proxmox Backup`_ Server, backup data is not saved directly, but rather as
> @@ -187,26 +187,51 @@ references to the same chunks on every snapshot deletion. Moreover, locking the
> entire datastore is not feasible because new backups would be blocked until the deletion
> process was complete.
>
> -Therefore, Proxmox Backup Server uses a garbage collection (GC) process to
> +Therefore, Proxmox Backup Server uses a `tracing garbage collection
> +<https://en.wikipedia.org/wiki/Tracing_garbage_collection>`_ algorithm to
> identify and remove the unused backup chunks that are no longer needed by any
> -snapshot in the datastore. The GC process is designed to efficiently reclaim
> +snapshot in the datastore. The GC algorithm is designed to efficiently reclaim
> the space occupied by these chunks with low impact on the performance of the
> datastore or interfering with other backups.
>
> -The garbage collection (GC) process is performed per datastore and is split
> -into two phases:
> +The GC is performed per datastore and is split into two phases:
>
> -- Phase one: Mark
> - All index files are read, and the access time of the referred chunk files is
> - updated.
> +- Phase one - Mark:
> +
> + Read all index files and update the ``atime`` (access time) of the relevant
> + chunk files.
> +
> +- Phase two - Sweep:
> +
> + Iterate over all chunks and check the ``atime`` of the files. If
> + the ``atime`` is older than the cut-off time, the chunk was neither
> + referenced in a backup index nor is it part of a running backup that
> + does not yet have an index to search. As such, safely remove the chunk.
> +
> +
> +Cut-off Time
> +^^^^^^^^^^^^
> +
> +The GC only clears the chunks that were last accessed before the
> +cut-off time. The cut-off time is determined by whichever is earlier:
> +
> +- 24 hours and 5 minutes before the start of the garbage collection
> + due to the mounting of the data storage with relatime, or
I would also make relatime an inline literal like this: ``relatime``
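Applied to the list item above, that could read as follows (a sketch of
the suggestion only, the wording is otherwise unchanged from the patch):

```rst
- 24 hours and 5 minutes before the start of the garbage collection
  due to the mounting of the data storage with ``relatime``, or
```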
> +
> +- the start time of the oldest active backup job that has been running
> + for longer than 24 hours and 5 minutes at the beginning of the
> + garbage collection. This is necessary because the newly created
> + backup could refer to blocks, but the GC would not notice this as
> + there is no index of the backup that could be searched.
> +
> +Chunks accessed after the cut-off time are marked as *Pending removals*
> +by the GC as it cannot be certain whether they are still needed.
> +
> +.. Note:: Mounting a volume with relatime means that the ``atime``
Same here: relatime should be an inline literal (``relatime``).
> + of the chunk files is not updated every time, but only when the
> + data has changed or the ``atime`` was before a certain time,
> + which is 24 hours by default.
>
> -- Phase two: Sweep
> - The task iterates over all chunks, checks their file access time, and if it
> - is older than the cutoff time (i.e., the time when GC started, plus some
> - headroom for safety and Linux file system behavior), the task knows that the
> - chunk was neither referred to in any backup index nor part of any currently
> - running backup that has no index to scan for. As such, the chunk can be
> - safely deleted.
>
> Manually Starting GC
> ^^^^^^^^^^^^^^^^^^^^