[pbs-devel] applied: [PATCH proxmox v2] fix #3618: proxmox-async: zip: add conditional EFS flag to zip files

Dominik Csapak d.csapak at proxmox.com
Tue Jan 11 08:40:12 CET 2022


On 1/11/22 06:49, Thomas Lamprecht wrote:
> On 10.01.22 12:23, Dominik Csapak wrote:
>> this flag marks the file names as 'UTF-8' encoded if they are valid UTF-8.
>>
>> By default, the encoding of file names in zips is defined as code page 437,
>> but we save the filenames as bytes (like in linux fs).
>>
>> For linux systems this would not be a problem since most tools
>> simply use the filenames as bytes, but for the zip utility under
>> windows it's important since NTFS uses UTF-16 for file names.
>>
>> For filenames that are valid UTF-8, they are decoded as UTF-8 everywhere
>> correctly (Linux as UTF-8 bytes, Windows as correct UTF-16 sequence) and
>> for other filenames with a high bit set, it depends on the OS/Software
>> what exactly happens. Some cases below:
>>
>> * Windows + Built-in/7zip: decoded as CP437
>> * Debian + zip: Bytes taken as-is
>> * Debian + 7z: interpreted as Windows1252, decoded as UTF-8
>>
>> Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
>> ---
>> changes from v1:
>> * moved to proxmox/proxmox-async from proxmox-backup/pbs-tools
>> * included bug# in the subject
>> * removed two spurious newlines
>>
>>   proxmox-async/src/zip.rs | 22 +++++++++++++++++++---
>>   1 file changed, 19 insertions(+), 3 deletions(-)
>>
>>
> 
> applied, thanks!
> 
> Out of interest, did you benchmark if this change makes an impact in zip-streaming?
> I'd think that, if at all, then only for the case with many small files?

no, i did not benchmark it, but during zip streaming i am almost
always disk limited here (accessing random chunks), so i don't think i
would have gotten interesting results...

ofc i can do some benchmarks with/without the patch this week
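
For context on the performance question: the conditional flag described in
the quoted commit message amounts to one UTF-8 validity check per file name,
so any overhead would scale with the number of entries rather than with the
amount of data. A minimal sketch of that check follows. It is not taken from
the patch: the helper name `entry_flags` is hypothetical, and using
`std::str::from_utf8` for the validity test is an assumption. The only fixed
fact is that the EFS (language encoding) flag is bit 11 of the zip general
purpose bit flag.

```rust
/// Bit 11 of the zip "general purpose bit flag": the language encoding
/// flag (EFS). When set, readers treat the file name as UTF-8.
const EFS_FLAG: u16 = 1 << 11;

/// Hypothetical helper, not the code from the patch: returns `base` with
/// the EFS bit added only if `file_name` is valid UTF-8. Names that are
/// not valid UTF-8 keep the flags unchanged, so readers fall back to
/// their own default (e.g. CP437).
fn entry_flags(base: u16, file_name: &[u8]) -> u16 {
    if std::str::from_utf8(file_name).is_ok() {
        base | EFS_FLAG
    } else {
        base
    }
}
```

With a base value of 0x0008 (chosen here only as an example), a valid UTF-8
name such as "grüße.txt" yields 0x0808, while the Latin-1 bytes
b"gr\xfc\xdfe.txt" leave the value at 0x0008.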




