[pbs-devel] [PATCH proxmox-backup 1/3] fix #3786: api: add resync-corrupt option to sync jobs

Thomas Lamprecht t.lamprecht at proxmox.com
Thu Oct 17 16:09:50 CEST 2024


Am 17/10/2024 um 15:20 schrieb Gabriel Goller:
> On 17.10.2024 08:15, Thomas Lamprecht wrote:
>> In general, it would be _really_ nice to get some regression testing for
>> sync jobs covering the selection with different matching and options.
>> The lack of that doesn't have to block your/shannon's patch, but IMO it
>> should be added rather sooner than later. Sync is becoming more and more
>> complex, and the selection of snapshots is a very important and fundamental
>> part of it.
> 
> Do you think we could mock ourselves out of this and unit-test it
> somehow? For full regression tests we will need Lukas's CI work.

I hope so, as that is mostly pure application logic. Albeit I'm I currently
can't give you tips for what the best/simplest way for mocking this would be.
Obvious candidates would probably be to pull out the parts for getting local and
remote backup snapshot lists and info into a trait or alternatively
refactoring more logic out that gets the info passed as parameter and call
that in the tests with the test data.

I have no strong preference here and there might be even better options for
this specific cases, but it'd be great if the existing "real" code does not
need to bend backwards to support the mocking and that defining or expanding
the tests isn't too tedious.
FWIW, one option might even be to have the list of snapshots defined and
generating two directory trees during build that mimic the actual datastores
more closely and might require less mocking and thus have bigger code coverage.

>>> +    /// Load the verify state from the manifest.
>>> +    pub fn verify_state(&self) -> Option<VerifyState> {
>>> +        self.load_manifest().ok().and_then(|(m, _)| {
>>
>> hmm, just ignoring errors here seems a bit odd, that might be a case where one
>> might want to re-sync. So I'd prefer wrapping this a result, even if it's a bit
>> tedious.
> 
> Converted to (and returned) a `Result<VerifyState, anyhow::Error>` for now, 
> but I'd be wary about treating a failed `load_manifest` the same as a
> `VerifyState::Failed`. Especially because currently an unverified
> snapshot simply returns `Err(null...)`. So we would first need to extend
> `VerifyState` to have a `Unknown` variant.

Yeah, I agree with you not treating a failure to load/parse the manifest
the same as a failed verification state, but the caller can log the error
and thus at least make the user aware of something unexpected happening.

>> And would it make sense to check the downloaded manifest's verify state, no point
>> in re-pulling if the other side is also corrupt, or is that possibility already
>> excluded somewhere else in this code path?
> 
> This should be covered, because:
>   - if the index is broken, we can't even open it
>   - if an unencrypted chunk is corrupted, the crc is checked when reading
>     it (pbs-client/src/remote_chunk_reader.rs:55)
>   - if an encrypted chunk is corrupted, the crc is checked as well 
>     (pbs-datastore/src/data_blob.rs:81)

ok, thanks for confirming this.




More information about the pbs-devel mailing list