[pbs-devel] [PATCH proxmox-backup] verify: handle manifest update errors as non-fatal

Christian Ebner c.ebner at proxmox.com
Wed Jan 29 09:15:21 CET 2025


On 1/28/25 13:43, Gabriel Goller wrote:
> On 28.01.2025 12:47, Christian Ebner wrote:
>> Since commit 8ea00f6e ("allow to abort verify jobs") errors
>> propagated up to the verify jobs worker call side are interpreted as
>> job aborts.
>>
>> The manifest update did not honor this, leading to the verify job
>> being aborted with the misleading log entry:
>> `verification failed - job aborted`
>>
>> Instead, handle the manifest update error non-fatal just like any
>> other verification related error, log it including the error message
>> and continue verification with the next item.
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>> src/backup/verify.rs | 18 +++++++++++++-----
>> 1 file changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/backup/verify.rs b/src/backup/verify.rs
>> index 840a37859..02478b165 100644
>> --- a/src/backup/verify.rs
>> +++ b/src/backup/verify.rs
>> @@ -3,7 +3,7 @@ use std::sync::atomic::{AtomicUsize, Ordering};
>> use std::sync::{Arc, Mutex};
>> use std::time::Instant;
>>
>> -use anyhow::{bail, format_err, Error};
>> +use anyhow::{bail, Error};
>> use nix::dir::Dir;
>> use tracing::{error, info, warn};
>>
>> @@ -399,12 +399,20 @@ pub fn verify_backup_dir_with_lock(
>>         state: verify_result,
>>         upid,
>>     };
>> -    let verify_state = serde_json::to_value(verify_state)?;
>> -    backup_dir
>> -        .update_manifest(|manifest| {
>> +
>> +    if let Err(err) = {
>> +        let verify_state = serde_json::to_value(verify_state)?;
>> +        backup_dir.update_manifest(|manifest| {
>>             manifest.unprotected["verify_state"] = verify_state;
>>         })
>> -        .map_err(|err| format_err!("unable to update manifest blob - {}", err))?;
>> +    } {
>> +        info!(
>> +            "verify {}:{} - manifest update error: {err}",
>> +            verify_worker.datastore.name(),
>> +            backup_dir.dir(),
>> +        );
> 
> Is there any reason for not using tracing::error? This would be nice to
> find in the syslog as well. Also using "{err:#}" would print the whole
> error chain/context.

I used `info` rather than `error` here because all other verification
errors are logged the same way, so logging only this one case as an
error would be inconsistent. If we do decide to use `error` instead of
`info`, I think that should be applied to all logged verification
errors, for consistency.

Regarding error formatting: since the errors returned by update_manifest
do not add error context, "{err:#}" currently makes no difference as far
as I can see, but you are right that it would be more future-proof.
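
To illustrate the difference between "{err}" and "{err:#}", here is a
minimal std-only sketch. It is not the anyhow implementation and not code
from verify.rs: `WriteFailed`, `WithContext` and `sample_error` are
hypothetical names made up for this example. It only mimics the behavior
anyhow documents for its alternate format, where the outermost message is
followed by each cause, separated by ": ".

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error, standing in for e.g. a failed write.
#[derive(Debug)]
struct WriteFailed;

impl fmt::Display for WriteFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "write failed")
    }
}

impl Error for WriteFailed {}

/// Hypothetical wrapper that attaches a context message to an inner error.
#[derive(Debug)]
struct WithContext {
    context: &'static str,
    inner: Box<dyn Error + 'static>,
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.inner)
    }
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Plain "{}" prints only the outermost message.
        write!(f, "{}", self.context)?;
        // "{:#}" sets the alternate flag; then walk the source() chain.
        if f.alternate() {
            let mut cause = self.source();
            while let Some(err) = cause {
                write!(f, ": {err}")?;
                cause = err.source();
            }
        }
        Ok(())
    }
}

/// Builds a two-level error: a context message wrapping a root cause.
fn sample_error() -> WithContext {
    WithContext {
        context: "unable to update manifest blob",
        inner: Box::new(WriteFailed),
    }
}
```

With this sketch, `format!("{}", sample_error())` yields
"unable to update manifest blob", while `format!("{:#}", sample_error())`
yields "unable to update manifest blob: write failed". For an error without
any attached context, both forms print the same text.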

Should I send a v2 for that?



