[pbs-devel] [PATCH proxmox-backup 2/2] tape: write informational MAM attributes on tapes

Dominik Csapak d.csapak at proxmox.com
Thu May 23 10:22:39 CEST 2024


On 5/23/24 10:10, Thomas Lamprecht wrote:
> Am 23/05/2024 um 08:09 schrieb Dominik Csapak:
>> On 5/22/24 19:24, Thomas Lamprecht wrote:
>>> What your commit did not mention is why you skip setting a few others, like I
>>> could imagine that the following would have some use:
>>>
>>> - DATE AND TIME LAST WRITTEN
>>
>> i hesitated with this one as there is no timezone included and it does not
>> specify one either, but i guess we could just use UTC (although that might
>> be confusing for some people?)
> 
> Ok, that is a good point, and yeah a shame that the spec didn't make this
> field 17 bytes wide, allowing to encode a [+-]ZZZZ UTC-timezone difference.
> 
> While for small setups in just one location, or set to the same timezone
> in the whole organisation, independent of where the servers are located,
> it can even be nice to use the local timezone, that still is a loss of
> information. Using UTC and documenting that is the only way to allow
> being sure of what the actual time the tape was written to is in any
> timezone.
> 
> So, I'd go for UTC here and document that. If we ever show this in the
> UI or CLI we can render it correctly as ISO 8601 indicating that this is
> UTC time.

sounds reasonable
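
a rough sketch of how the UTC value could be built, assuming the field takes
12 ASCII digits laid out as YYYYMMDDHHMM (that layout is my assumption here,
it fits the "17 bytes with offset" remark above). the function name is made
up for illustration and is not from the patch:

```rust
/// Hypothetical helper, not part of the patch: renders a Unix epoch (seconds)
/// as 12 ASCII digits "YYYYMMDDHHMM" in UTC. The field layout is an assumption.
fn mam_utc_timestamp(epoch: i64) -> String {
    let days = epoch.div_euclid(86_400);
    let secs = epoch.rem_euclid(86_400);

    // civil-from-days conversion (proleptic Gregorian calendar),
    // counting years from March 1st so the leap day comes last
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // 0..=365
    let mp = (5 * doy + 2) / 153; // 0..=11, March = 0
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };

    format!(
        "{:04}{:02}{:02}{:02}{:02}",
        year,
        month,
        day,
        secs / 3_600,
        (secs % 3_600) / 60
    )
}
```

since no offset is stored, a reader has to know from the documentation that
the value is UTC, which is why documenting it matters.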

> 
>>
>>> - TEXT LOCALIZATION IDENTIFIER (Strings are UTF-8 in rust, and we do not
>>>     explicitly keep them in ASCII or the like FWICT)
>>
>> that one i explicitly left out because we (currently) only write ascii,
>> but yes, we could simply set that to utf-8 for "future-proof"ness
> 
> This is currently mostly enforced indirectly, right? As the label depends on
> the pool name and that one is enforced to match the "safe" regex?

yes it's only enforced by our regexes (though i don't expect them to
get less restrictive over time)

> 
> In that case it might be good to either future-proof or alternatively, IMO
> not really better, to at least enforce/check that the saved text is ascii
> directly here, as the data is a String, which is utf-8 in rust, so coupling
> the assumption here to the rather distanced API schema format seems not
> ideal to me.

i sent a patch to also include that field with the utf-8 value

> 
> btw. enforcing the length might be nice too, what would actually happen if
> one writes more data than reserved by the spec, does it spill into the
> next field, does something catches this and errors out?

we do enforce this already and fail with an error on writing
(before sending at all)


but my (educated) guess would be that if we'd try, the drive
would answer with a scsi error since they are relatively strict
in what they accept before doing anything
e.g. using the wrong type (ASCII vs TEXT) in the type field also does not work
even if it makes no difference in the actual data
so sending an invalid size probably wouldn't work either
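
to illustrate the kind of check meant here (length, plus the ascii check
suggested above), a minimal sketch. this is not the code from the patch;
the function name and signature are hypothetical:

```rust
/// Hypothetical pre-send check, not the code from the patch: rejects a value
/// that does not fit the field or that is not ASCII where ASCII is required.
fn check_mam_value(value: &str, max_len: usize, ascii_only: bool) -> Result<(), String> {
    if ascii_only && !value.is_ascii() {
        return Err(format!("value {:?} contains non-ASCII characters", value));
    }
    // str::len() counts bytes, which is what matters for a fixed-size field;
    // a UTF-8 string can be longer in bytes than in characters
    if value.len() > max_len {
        return Err(format!(
            "value is {} bytes, field allows at most {}",
            value.len(),
            max_len
        ));
    }
    Ok(())
}
```

checking the byte length instead of the character count matters once the
text is not restricted to ascii anymore.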


> 
> 
>>> - APPLICATION FORMAT VERSION (always good to have)
>>
>> isn't that implied by the application version?
> 
> Not necessarily, there can be a new way to write tapes from PBS in the
> future and the version to use might be selectable, or the newer one
> backported to an older stable version (at least as option).
> 
> IME, tracking format and program versions as separate things makes life
> only easier in the long run.
> 

makes sense

>>
>> we don't really have a 'format version' for the tape format, but each
>> archive on it has its own version, e.g. the snapshot archives
>> have version 1.2 while the chunk archive and catalog archive have 1.1
>> and the labels have only 1.0
> 
> You could combine those atoms that make out the whole tape format into a
> full version by concatenating them with a separator like semicolon or a
> plus or the like.
> 
> As this field has 16 characters you could even prefix each version with a
> letter to make it slightly simpler to read, e.g.:
> 
> A1.2;C1.1;L1.0
> 
> Or use the letter as separator, making a bit more space for future version
> extension:
> 
> A1.2C1.1L1.0
> 
> We could even use that now to define a global tape version or a, well,
> versioning-version:
> 
> T1.0A1.2C1.1L1.0
> 
> A bit crowded but any (future) command of ours that outputs this information
> could format that nicely and documenting it should cover third party tools.

ok, i'll look into that. makes sense to combine the versions
(hopefully we don't get too many new archive types and our versions don't
get too long ;) )
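
a quick sketch of the letter-prefixed scheme, mainly to show the 16 byte
limit being checked. the function and the letter assignments (taken from
the example above) are only illustrative, nothing here is decided:

```rust
/// Hypothetical helper: joins (letter, version) pairs into one string such as
/// "T1.0A1.2C1.1L1.0" and fails if the result does not fit into `max_len` bytes.
fn combined_format_version(parts: &[(char, &str)], max_len: usize) -> Result<String, String> {
    let mut out = String::new();
    for (letter, version) in parts {
        out.push(*letter); // the letter doubles as the separator
        out.push_str(version);
    }
    if out.len() > max_len {
        return Err(format!(
            "combined version {:?} is {} bytes, limit is {}",
            out,
            out.len(),
            max_len
        ));
    }
    Ok(out)
}
```

with the four parts from the example the result is exactly 16 bytes, so any
additional archive type or longer version would already hit the error path.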

> 
>>>
>>> Not so sure from top of my head about the UIDS, i.e., if we even have something
>>> that can be easily mapped to this.
>>
>> not sure which field you mean here? in LTO-5 there is only one standardized
>> field left and that is the VOLUME COHERENCY INFORMATION
>> and i don't think we'll need that
> 
> At least in LTO-9 there would be "MEDIUM GLOBALLY UNIQUE IDENTIFIER" and "MEDIA
> POOL GLOBALLY UNIQUE IDENTIFIER", differing per LTO version shouldn't (hopefully)
> be an issue, but probably not _that_ important, at least if we do not have
> existing information that can be mapped 1:1 to those two fields already.

for medium we'd actually have already a uuid, but i don't really want
to deviate for different lto versions, this would make it a bit more cumbersome
since we'd first have to query and parse the version of the tape each time

so i'd omit that for now until we decide we actually need it if that's
ok with you ?

> 
> btw. there's also BARCODE, as we support barcode labeling, it might be good
> to write that out too I guess?

well at least some vtls already set that automatically, and that might be used by
tape libs themselves, so I'm a bit hesitant to touch that field.
Though it could work out fine...




