[pve-devel] [PATCH v2 access-control] fix #5136: ldap: Decode non-ASCII characters in attributes

Wed Feb 28 19:58:25 CET 2024

On Wed, 28 Feb 2024 16:00:48 +0100
Fiona Ebner <f.ebner at proxmox.com> wrote:

> On 28.02.24 at 15:41, Thomas Lamprecht wrote:
> > On 09/01/2024 at 14:35, Filip Schauer wrote:
> >> UTF8 decode non-ASCII characters when syncing user attributes, since
> >> those will be encoded later on. Without this fix the attributes were
> >> encoded twice, resulting in cases such as 'ü' turning into 'Ã¼'.
> >>
> >> Signed-off-by: Filip Schauer <f.schauer at proxmox.com>
> >> ---
> >> Changes since v1:
> >> * Do not try to URI unescape the user attributes, since we do that later
> >>   in PVE::AccessControl::parse_user_config anyways.
> >>
> >>  src/PVE/Auth/LDAP.pm | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/src/PVE/Auth/LDAP.pm b/src/PVE/Auth/LDAP.pm
> >> index b958f2b..06177db 100755
> >> --- a/src/PVE/Auth/LDAP.pm
> >> +++ b/src/PVE/Auth/LDAP.pm
> >> @@ -301,7 +301,7 @@ sub get_users {
> >>  
> >>  	foreach my $attr (keys %$user_attributes) {
> >>  	    if (my $ours = $ldap_attribute_map->{$attr}) {
> >> -		$ret->{$username}->{$ours} = $user_attributes->{$attr}->[0];
> >> +		$ret->{$username}->{$ours} = Encode::decode('utf8', $user_attributes->{$attr}->[0]);  
> 
> Note: missing use Encode; at the beginning of the file.
> 
> >>  	    }
> >>  	}
> >>    
> > 
> > this would need a rebase, oh, and would be great if the original testers
> > could reconfirm the v2 approach of doing utf-8 decoding only.
> >   
> 
> Gave it a quick test and fixes issues with special characters for me.
> Don't forget to also use the latest master of pve-cluster, otherwise
> writing the user config will still do the wrong thing [0]! Both are
> needed to fix the issue here. I'm just wondering if we are guaranteed
> that the LDAP server sends UTF-8 encoded data?

Sadly (or luckily) I don't have much experience with the validity of LDAP
data out in the wild. I quickly searched online and followed the RFC chain
until there was no "Obsoleted by" link anymore, then went through all the
RFCs indexed there [0].
The (~18 year old) standard indicates that strings should be UTF-8
encoded:
https://datatracker.ietf.org/doc/html/rfc4511#section-4.1.2
(and points out the, by now probably insignificant, difference between
Unicode and ISO 10646, see [1]).

However, as with probably any protocol that has been around for 30+ years,
guarantees are hard to come by:
https://datatracker.ietf.org/doc/html/rfc4512#section-7.2

Anyway, if I understand correctly, we can just skip syncing the attribute
in this part if its value is not valid UTF-8? If we add a warning to the
log, that sounds OK to me (but I only very quickly skimmed through what
the code does).
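
A minimal sketch of the "skip and warn" idea, not a tested patch. The
helper name decode_ldap_attribute and the sample values are made up for
illustration and are not part of PVE::Auth::LDAP. It also assumes strict
'UTF-8' decoding with Encode::FB_CROAK, which is stricter than the plain
Encode::decode('utf8', ...) call in the patch:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not in PVE::Auth::LDAP): strictly decode one raw
# attribute value. Returns undef and logs a warning if the bytes are not
# valid UTF-8.
sub decode_ldap_attribute {
    my ($attr, $raw) = @_;

    # FB_CROAK makes decode() die on malformed input instead of silently
    # substituting U+FFFD; LEAVE_SRC keeps $raw unmodified.
    my $decoded = eval {
        Encode::decode('UTF-8', $raw, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    if ($@) {
        warn "skipping attribute '$attr': value is not valid UTF-8\n";
        return undef;
    }
    return $decoded;
}

# Made-up sample data as raw bytes, standing in for what the server sends.
my %sample = (
    firstname => "J\xc3\xbcrgen", # valid UTF-8 encoding of 'ü'
    lastname => "M\xfcller", # Latin-1 'ü', malformed as UTF-8
);

my $ret = {};
for my $attr (sort keys %sample) {
    my $value = decode_ldap_attribute($attr, $sample{$attr});
    next if !defined($value); # skip the attribute, warning already logged
    $ret->{$attr} = $value;
}
```

With these sample values, firstname is kept and lastname is skipped with
a warning. Whether skipping is preferable to syncing the raw bytes as-is
is still the open question from above.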


[0] https://datatracker.ietf.org/doc/html/rfc4510
[1] https://www.unicode.org/versions/Unicode15.0.0/appC.pdf
> 
> [0]:
> https://git.proxmox.com/?p=pve-cluster.git;a=commit;h=2e276ccd9beb2004ddd72396b2a9b72a288771d8
> 
> 