[pbs-devel] [PATCH proxmox-backup v2 1/3] docs: centralise and update garbage collection description
Hannes Dürr
h.duerr at proxmox.com
Wed Apr 10 15:38:23 CEST 2024
On 4/8/24 11:20, Fabian Grünbichler wrote:
> On April 5, 2024 3:05 pm, Hannes Duerr wrote:
>> The "backup client usage" chapter describes a grace period that is 24
>> hours and 5 minutes long, and unconnected to this a cut-off time is
>> mentioned under "maintenance tasks", which leads to confusion. Therefore
>> we summarise the entire description of garbage collection under
>> "maintenance tasks" and link to it in the "backup client usage" chapter
>>
>> Signed-off-by: Hannes Duerr <h.duerr at proxmox.com>
>> ---
>> docs/backup-client.rst | 16 +++---------
>> docs/maintenance.rst | 57 ++++++++++++++++++++++++++++++------------
>> 2 files changed, 44 insertions(+), 29 deletions(-)
>>
>> diff --git a/docs/backup-client.rst b/docs/backup-client.rst
>> index 00a1abbb..d015b844 100644
>> --- a/docs/backup-client.rst
>> +++ b/docs/backup-client.rst
>> @@ -735,25 +735,15 @@ command. It is recommended to carry out garbage collection on a regular basis.
>>
>> The garbage collection works in two phases. In the first phase, all
>> data blocks that are still in use are marked. In the second phase,
>> -unused data blocks are removed.
>> +unused data blocks are removed. A more detailed description of the GC
>> +can be found :ref:`here <maintenance_gc>`.
>> +
>>
>> .. note:: This command needs to read all existing backup index files
>> and touches the complete chunk-store. This can take a long time
>> depending on the number of chunks and the speed of the underlying
>> disks.
>>
>> -.. note:: The garbage collection will only remove chunks that haven't been used
>> - for at least one day (exactly 24h 5m). This grace period is necessary because
>> - chunks in use are marked by touching the chunk which updates the ``atime``
>> - (access time) property. Filesystems are mounted with the ``relatime`` option
>> - by default. This results in a better performance by only updating the
>> - ``atime`` property if the last access has been at least 24 hours ago. The
>> - downside is that touching a chunk within these 24 hours will not always
>> - update its ``atime`` property.
>> -
>> - Chunks in the grace period will be logged at the end of the garbage
>> - collection task as *Pending removals*.
>> -
>> .. code-block:: console
>>
>> # proxmox-backup-client garbage-collect
>> diff --git a/docs/maintenance.rst b/docs/maintenance.rst
>> index 6dbb6941..e25c8f19 100644
>> --- a/docs/maintenance.rst
>> +++ b/docs/maintenance.rst
>> @@ -171,8 +171,8 @@ It's recommended to setup a schedule to ensure that unused space is cleaned up
>> periodically. For most setups a weekly schedule provides a good interval to
>> start.
>>
>> -GC Background
>> -^^^^^^^^^^^^^
>> +Overview
>> +^^^^^^^^
>>
>> In `Proxmox Backup`_ Server, backup data is not saved directly, but rather as
>> chunks that are referred to by the indexes of each backup snapshot. This
>> @@ -187,26 +187,51 @@ references to the same chunks on every snapshot deletion. Moreover, locking the
>> entire datastore is not feasible because new backups would be blocked until the deletion
>> process was complete.
>>
>> -Therefore, Proxmox Backup Server uses a garbage collection (GC) process to
>> +Therefore, Proxmox Backup Server uses a `tracing garbage collection
>> +<https://en.wikipedia.org/wiki/Tracing_garbage_collection>`_ algorithm to
>> identify and remove the unused backup chunks that are no longer needed by any
>> -snapshot in the datastore. The GC process is designed to efficiently reclaim
>> +snapshot in the datastore. The GC algorithm is designed to efficiently reclaim
>> the space occupied by these chunks with low impact on the performance of the
>> datastore or interfering with other backups.
>>
>> -The garbage collection (GC) process is performed per datastore and is split
>> -into two phases:
>> +The GC is performed per datastore and is split into two phases:
>>
>> -- Phase one: Mark
>> - All index files are read, and the access time of the referred chunk files is
>> - updated.
>> +- Phase one - Mark:
>> +
>> + Read all index files and update the ``atime`` (access time) of the relevant
>> + chunk files.
> I'd replace "relevant" with "referenced" here, it is more concrete and
> matches the terminology below
>
>> +
>> +- Phase two - Sweep:
>> +
>> + Iterate over all chunks and check the ``atime`` of the files. If
>> + the ``atime`` is older than the cut-off time, the chunk was neither
>> + referenced in a backup index nor is it part of a running backup that
>> + does not yet have an index to search. As such, safely remove the chunk.
> nor was it recently created as part of a running backup task, but is not
> referenced yet by any finished index file. Such chunks can be safely
> removed since they are no longer needed.
>
> (Safely remove implies that we do some special removing that is safe ;))
>
>> +
>> +
>> +Cut-off Time
>> +^^^^^^^^^^^^
>> +
>> +The GC only clears the chunks that were last accessed before the
> s/clears/removes/
>
>> +cut-off time. The cut-off time is determined by whichever is earlier:
> is determined *at the start of the GC task*
>
> this is an important detail that helps understanding for more
> technically inclined readers
>
>> +
>> +- 24 hours and 5 minutes before the start of the garbage collection
>> + due to the mounting of the data storage with ``relatime``, or
> "before the start of .. due to" is a bit confusing. maybe:
>
> - 24 hours before the start of the garbage collection (to
> account for the datastore potentially being mounted with ``relatime``).
>
>> +
>> +- the start time of the oldest active backup job that has been running
>> + for longer than 24 hours and 5 minutes at the beginning of the
>> + garbage collection. This is necessary because the newly created
>> + backup could refer to blocks, but the GC would not notice this as
>> + there is no index of the backup that could be searched.
> the whole "that has been" can be dropped. the cut off is determined by
> whichever is earlier:
> - now - 24h
> - start time of oldest backup writer
*
>
> with an extra 5m of safety margin added in any case - not just the 24h
> one!
>
> - the start time of the oldest active backup job (to account for newly
> written chunks that are not yet referenced by any finished snapshot)
>
> is a bit shorter and IMHO conveys the same information
>
>> +
>> +Chunks accessed after the cut-off time are marked as *Pending removals*
>> +by the GC as it cannot be certain whether they are still needed.
> this is rather incomplete and a bit hard to parse as well. I'd replace
> "accessed after" with "with an atime after".
>
> pending is actually:
> - chunks with atime between the cut-off and the oldest writer (if one
> exists)
At this point I am slightly confused, as we defined earlier that
the cut-off is the start of the oldest backup writer* (if one exists).
That would lead to the following:
- chunks with atime between the cut-off (which is the start of the
oldest existing writer) and the oldest writer (if one exists)
which does not make any sense. Where is my mistake?
> - chunks with atime between the cut-off and the start of GC (if no
> writer exists at the start)
>
> this normally means chunks of snapshots which have been recently
> forgotten/pruned. it can also mean freshly uploaded chunks of recently
> aborted backup tasks.
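
To make the rule discussed above concrete, here is a minimal sketch of the
cut-off and the pending window as described in the review comments: the
earlier of "GC start minus 24h" and "start of the oldest writer", with the
5 minute margin subtracted in either case. This is not the proxmox-backup
code. The names `cutoff_time` and `classify_chunk` and the labels
"remove", "pending" and "keep" are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

RELATIME_WINDOW = timedelta(hours=24)   # relatime may skip atime updates for 24h
SAFETY_MARGIN = timedelta(minutes=5)    # extra margin, applied in any case


def cutoff_time(gc_start: datetime,
                oldest_writer_start: Optional[datetime] = None) -> datetime:
    """Return the cut-off, determined once at the start of the GC task."""
    candidate = gc_start - RELATIME_WINDOW
    # Whichever is earlier: 24h before GC start, or the oldest writer's start.
    if oldest_writer_start is not None and oldest_writer_start < candidate:
        candidate = oldest_writer_start
    return candidate - SAFETY_MARGIN


def classify_chunk(atime: datetime, gc_start: datetime,
                   oldest_writer_start: Optional[datetime] = None) -> str:
    """Classify a chunk by its atime during the sweep phase (illustrative)."""
    if atime < cutoff_time(gc_start, oldest_writer_start):
        return "remove"
    # Pending: atime between the cut-off and the oldest writer's start,
    # or between the cut-off and the GC start if no writer exists.
    upper = oldest_writer_start if oldest_writer_start is not None else gc_start
    if atime < upper:
        return "pending"
    return "keep"
```

Under this reading, the cut-off only coincides with the writer's start (minus
the margin) when that writer is older than 24 hours, in which case the pending
window shrinks to the 5 minute margin. For a writer younger than 24 hours the
cut-off is "GC start minus 24h 5m", and the window up to the writer's start is
what gets reported as pending.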
>
>> +
>> +.. Note:: Mounting a volume with ``relatime`` means that the ``atime``
>> + of the chunk files is not updated every time, but only when the
>> + data has changed or the ``atime`` was before a certain time,
>> + which is 24 hours by default.
>>
>> -- Phase two: Sweep
>> - The task iterates over all chunks, checks their file access time, and if it
>> - is older than the cutoff time (i.e., the time when GC started, plus some
>> - headroom for safety and Linux file system behavior), the task knows that the
>> - chunk was neither referred to in any backup index nor part of any currently
>> - running backup that has no index to scan for. As such, the chunk can be
>> - safely deleted.
>>
>> Manually Starting GC
>> ^^^^^^^^^^^^^^^^^^^^
>> --
>> 2.39.2
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>>