[pbs-devel] [PATCH proxmox-backup 1/3] docs: centralise and update garbage collection description

Fri Apr 5 12:49:26 CEST 2024

On Tue Apr 2, 2024 at 3:36 PM CEST, Hannes Duerr wrote:
> The "backup client usage" chapter describes a grace period that is 24
> hours and 5 minutes long, and unconnected to this a cut-off time is
> mentioned under "maintenance tasks", which leads to confusion. Therefore
> we summarise the entire description of garbage collection under
> "maintenance tasks" and link to it in the "backup client usage" chapter
>
> Signed-off-by: Hannes Duerr <h.duerr at proxmox.com>
> ---
>  docs/backup-client.rst | 16 +++---------
>  docs/maintenance.rst   | 55 ++++++++++++++++++++++++++++++------------
>  2 files changed, 43 insertions(+), 28 deletions(-)
>
> diff --git a/docs/backup-client.rst b/docs/backup-client.rst
> index 00a1abbb..d015b844 100644
> --- a/docs/backup-client.rst
> +++ b/docs/backup-client.rst
> @@ -735,25 +735,15 @@ command. It is recommended to carry out garbage collection on a regular basis.
>  
>  The garbage collection works in two phases. In the first phase, all
>  data blocks that are still in use are marked. In the second phase,
> -unused data blocks are removed.
> +unused data blocks are removed. A more detailed description of the GC
> +can be found :ref:`here <maintenance_gc>`.
> +
>  
>  .. note:: This command needs to read all existing backup index files
>    and touches the complete chunk-store. This can take a long time
>    depending on the number of chunks and the speed of the underlying
>    disks.
>  
> -.. note:: The garbage collection will only remove chunks that haven't been used
> -   for at least one day (exactly 24h 5m). This grace period is necessary because
> -   chunks in use are marked by touching the chunk which updates the ``atime``
> -   (access time) property. Filesystems are mounted with the ``relatime`` option
> -   by default. This results in a better performance by only updating the
> -   ``atime`` property if the last access has been at least 24 hours ago. The
> -   downside is that touching a chunk within these 24 hours will not always
> -   update its ``atime`` property.
> -
> -   Chunks in the grace period will be logged at the end of the garbage
> -   collection task as *Pending removals*.
> -
>  .. code-block:: console
>  
>    # proxmox-backup-client garbage-collect
> diff --git a/docs/maintenance.rst b/docs/maintenance.rst
> index 6dbb6941..baa1241e 100644
> --- a/docs/maintenance.rst
> +++ b/docs/maintenance.rst
> @@ -171,7 +171,7 @@ It's recommended to setup a schedule to ensure that unused space is cleaned up
>  periodically. For most setups a weekly schedule provides a good interval to
>  start.
>  
> -GC Background
> +Overview
>  ^^^^^^^^^^^^^

Small nit: shorten the underline to match the length of the new title
(eight ``^`` characters for "Overview").

>  
>  In `Proxmox Backup`_ Server, backup data is not saved directly, but rather as
> @@ -187,26 +187,51 @@ references to the same chunks on every snapshot deletion. Moreover, locking the
>  entire datastore is not feasible because new backups would be blocked until the deletion
>  process was complete.
>  
> -Therefore, Proxmox Backup Server uses a garbage collection (GC) process to
> +Therefore, Proxmox Backup Server uses a `tracing garbage collection
> +<https://en.wikipedia.org/wiki/Tracing_garbage_collection>`_ algorithm to
>  identify and remove the unused backup chunks that are no longer needed by any
> -snapshot in the datastore. The GC process is designed to efficiently reclaim
> +snapshot in the datastore. The GC algorithm is designed to efficiently reclaim
>  the space occupied by these chunks with low impact on the performance of the
>  datastore or interfering with other backups.
>  
> -The garbage collection (GC) process is performed per datastore and is split
> -into two phases:
> +The GC is performed per datastore and is split into two phases:
>  
> -- Phase one: Mark
> -  All index files are read, and the access time of the referred chunk files is
> -  updated.
> +- Phase one - Mark:
> +
> +  Read all index files and update the ``atime`` (access time) of the relevant
> +  chunk files.
> +
> +- Phase two - Sweep:
> +
> +  Iterate over all chunks and check the ``atime`` of the files. If
> +  the ``atime`` is older than the cut-off time, the chunk was neither
> +  referenced in a backup index nor is it part of a running backup that
> +  does not yet have an index to search. As such, safely remove the chunk.
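
As an aside, the two phases described above could be sketched roughly as
follows. This is only an illustration in Python, not the actual Proxmox
Backup Server code: the function names `mark` and `sweep`, the flat chunk
directory, and the set of referenced digests are all assumptions made for
the example.

```python
import os
import time
from pathlib import Path
from typing import Iterable, List


def mark(chunk_dir: Path, referenced: Iterable[str]) -> None:
    """Phase one: update the atime of every chunk named in an index.

    `referenced` stands in for the digests read from all index files
    (hypothetical input; reading real index files is not shown).
    """
    now = time.time()
    for digest in referenced:
        path = chunk_dir / digest
        if path.exists():
            mtime = path.stat().st_mtime
            # Set atime explicitly and leave mtime unchanged.
            os.utime(path, (now, mtime))


def sweep(chunk_dir: Path, cutoff: float) -> List[str]:
    """Phase two: remove chunks whose atime is older than the cut-off."""
    removed = []
    for path in sorted(chunk_dir.iterdir()):
        if path.stat().st_atime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

The sketch assumes all chunks sit in a single directory and are named by
their digest, which is a simplification for readability.
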
> +
> +
> +Cut-off Time
> +^^^^^^^^^^^^
> +
> +The GC only clears the chunks that were last accessed before the
> +cut-off time. The cut-off time is determined by whichever is earlier:
> +
> +- 24 hours and 5 minutes before the start of the garbage collection
> +  due to the mounting of the data storage with relatime, or

I would also make relatime an inline literal like this: ``relatime``

> +
> +- the start time of the oldest active backup job that has been running
> +  for longer than 24 hours and 5 minutes at the beginning of the
> +  garbage collection. This is necessary because the newly created
> +  backup could refer to blocks, but the GC would not notice this as
> +  there is no index of the backup that could be searched.
> +
> +Chunks accessed after the cut-off time are marked as *Pending removals*
> +by the GC as it cannot be certain whether they are still needed.
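
To make the "whichever is earlier" rule concrete, here is a minimal Python
sketch of how the cut-off could be computed from the description above. The
names `gc_cutoff`, `GRACE` and `active_backup_starts` are hypothetical and
only serve the illustration; this is not taken from the actual
implementation.

```python
from datetime import datetime, timedelta
from typing import Iterable

# Grace period from the description: 24 hours and 5 minutes.
GRACE = timedelta(hours=24, minutes=5)


def gc_cutoff(gc_start: datetime,
              active_backup_starts: Iterable[datetime]) -> datetime:
    """Return the earlier of (gc_start - GRACE) and the start time of
    the oldest active backup job."""
    cutoff = gc_start - GRACE
    for start in active_backup_starts:
        # Only a job older than the grace period can move the cut-off
        # further back; taking the minimum covers both cases.
        cutoff = min(cutoff, start)
    return cutoff


def is_removable(atime: datetime, cutoff: datetime) -> bool:
    """A chunk may be removed only if it was last accessed before the
    cut-off time."""
    return atime < cutoff
```

For example, with a GC starting on 2024-04-05 at 12:00 and no active
backups, the cut-off is 2024-04-04 at 11:55. A backup job that has been
running since 2024-04-03 at 10:00 moves the cut-off back to that time.
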
> +
> +.. Note:: Mounting a volume with relatime means that the ``atime``

Same here

> +   of the chunk files is not updated every time, but only when the
> +   data has changed or the ``atime`` was before a certain time,
> +   which is 24 hours by default.
>  
> -- Phase two: Sweep
> -  The task iterates over all chunks, checks their file access time, and if it
> -  is older than the cutoff time (i.e., the time when GC started, plus some
> -  headroom for safety and Linux file system behavior), the task knows that the
> -  chunk was neither referred to in any backup index nor part of any currently
> -  running backup that has no index to scan for. As such, the chunk can be
> -  safely deleted.
>  
>  Manually Starting GC
>  ^^^^^^^^^^^^^^^^^^^^