[pbs-devel] [PATCH proxmox-backup v2 1/3] docs: centralise and update garbage collection description

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Apr 11 09:22:17 CEST 2024


On April 10, 2024 3:38 pm, Hannes Dürr wrote:
> 
> On 4/8/24 11:20, Fabian Grünbichler wrote:
>> On April 5, 2024 3:05 pm, Hannes Duerr wrote:
>>> +Chunks accessed after the cut-off time are marked as *Pending removals*
>>> +by the GC as it cannot be certain whether they are still needed.
>> this is rather incomplete and a bit hard to parse as well. I'd replace
>> "accessed after" with "with an atime after".
>>
>> pending is actually:
>> - chunks with atime between the cut-off and the oldest writer (if one
>>    exists)
> At this point i am slightly confused as we defined earlier:
> the cut-off is the start of oldest backup writer* (if one exists)

the cut-off is the minimum of (oldest worker start, now - 24h), minus 5
minutes. the oldest worker may have been started after the cut-off
timestamp, in which case GC might find pending chunks (see below).
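
to make the arithmetic concrete, here is a minimal sketch of that
calculation. the function and constant names are made up for illustration
and are not taken from the proxmox-backup code; timestamps are plain epoch
seconds, and I'm assuming the 5 minutes are subtracted from the minimum.

```python
from typing import Optional

DAY = 24 * 60 * 60       # the 24h window, in seconds
SAFETY_MARGIN = 5 * 60   # the extra 5 minutes, in seconds


def gc_cutoff(gc_start: int, oldest_writer_start: Optional[int]) -> int:
    """Return the cut-off as epoch seconds (hypothetical helper).

    Assumption: the 5 minute margin is applied to the minimum of
    (oldest writer start, gc_start - 24h).
    """
    candidates = [gc_start - DAY]
    if oldest_writer_start is not None:
        candidates.append(oldest_writer_start)
    return min(candidates) - SAFETY_MARGIN


gc_start = 1_000_000
# no writer: the cut-off is 24h + 5min before GC start
print(gc_cutoff(gc_start, None))
# writer started 1h before GC: the 24h bound still wins
print(gc_cutoff(gc_start, gc_start - 3600))
# writer started 48h before GC: the writer start determines the cut-off
print(gc_cutoff(gc_start, gc_start - 2 * DAY))
```

in the second call the writer start lies after the cut-off, which is the
situation where pending chunks can show up.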

> 
> Which would lead to the following:
> 
> - chunks with atime between the cut-off (which is the start of the 
> oldest existing writer) and the oldest writer (if one exists)
> 
> which does not make any sense, where is my mistake ?

if the cut-off is determined by a worker started more than 24h before
the start of the GC, then there cannot be any pending chunks: every
chunk that might otherwise be considered pending could have been
written by that worker, and we can't tell.

pending chunks can only occur if either
- there is no backup writer, or
- the oldest writer was started less than 24h before the GC

then any chunks written in the time frame between the cut-off and the
writer start (or GC start, if no writer exists), which are not yet
referenced by any snapshot/index, are considered pending. those chunks
are neither referenced, nor can they have been written by any
still-running writer, so they are most likely "garbage".
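
the rule above can be sketched as a small classification function. this
is only an illustration under a few assumptions: all names are
hypothetical, timestamps are epoch seconds, "referenced" is passed in as
a flag instead of being derived from the atime, and the exact
inclusive/exclusive handling of the boundaries is my guess.

```python
from typing import Optional


def classify_chunk(
    atime: int,
    referenced: bool,
    cutoff: int,
    gc_start: int,
    oldest_writer_start: Optional[int] = None,
) -> str:
    """Classify one chunk as seen by GC (hypothetical helper)."""
    if referenced:
        # still used by some snapshot/index
        return "in use"
    if atime < cutoff:
        # older than the cut-off and unreferenced
        return "removable"
    # the pending window ends at the oldest writer start,
    # or at GC start if no writer exists
    window_end = gc_start if oldest_writer_start is None else oldest_writer_start
    if atime < window_end:
        # unreferenced and not written by a still-running writer
        return "pending"
    # may have been written by a still-running writer, we can't tell
    return "kept"


gc_start = 1_000_000
cutoff = gc_start - 24 * 60 * 60 - 5 * 60  # 24h + 5min before GC start
writer = gc_start - 3600                   # writer started 1h before GC

print(classify_chunk(900_000, False, cutoff, gc_start, writer))
print(classify_chunk(950_000, False, cutoff, gc_start, writer))
print(classify_chunk(998_000, False, cutoff, gc_start, writer))
print(classify_chunk(950_000, False, cutoff, gc_start))
```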



