[pbs-devel] Authentication performance
Mark Schouten
mark at tuxis.nl
Mon Jan 6 20:07:43 CET 2025
https://bugzilla.proxmox.com/show_bug.cgi?id=6049 has been created for
this.
Thanks!
—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / mark at tuxis.nl
------ Original Message ------
From "Shannon Sterz" <s.sterz at proxmox.com>
To "Mark Schouten" <mark at tuxis.nl>
Cc "Proxmox Backup Server development discussion"
<pbs-devel at lists.proxmox.com>
Date 20/12/2024 14:22:18
Subject Re: Re[4]: [pbs-devel] Authentication performance
>On Thu Dec 19, 2024 at 10:56 AM CET, Mark Schouten wrote:
>> Hi,
>>
>> We upgraded to 3.3 yesterday, not much gain to notice with regard to
>> the new version or the change in keying. It’s still (obviously) pretty
>> busy.
>
>just be aware that the patch i linked to in my last mail has not been
>packaged yet, so you wouldn't see the impact of that patch yet.
>
>> However, I also tried to remove some datastores, which failed with
>> timeouts. PBS even stopped authenticating (so probably just working)
>> altogether for about 10 seconds, which was an unpleasant surprise.
>>
>> So looking into that further, I noticed the following logging:
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff]:42104] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>> 4)
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff]:42144] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>> 4)
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff]:47286] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>> 4)
>> Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>> /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>> [::ffff:]:45994] Unable to acquire lock
>> "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>> 4)
>>
>> Which surprised me, since this is a ’status’ call, which should not need
>> locking of the datastore-config.
>>
>> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=c611f593624977defc49d6e4de2ab8185cfe09e9;hb=HEAD#l687
>> does not lock the config, but
>>
>> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-datastore/src/datastore.rs;h=0801b4bf6b25eaa6f306c7b39ae2cfe81b4782e1;hb=HEAD#l204
>> does.
>>
>> So if I understand this correctly, every ’status’ call (30 per second in
>> our case) locks the datastore-config exclusively. And also, every time
>> ’status’ gets called, the whole datastore-config gets loaded?
>
>probably, there are some comments about that there already, it might
>make sense to open a bugzilla issue to discuss this further [1].
>
>[1]: https://bugzilla.proxmox.com/
>
>> Is that something that could use some performance tuning?
>>
>> —
>> Mark Schouten
>> CTO, Tuxis B.V.
>> +31 318 200208 / mark at tuxis.nl
>>
>>
>> ------ Original Message ------
>> From "Shannon Sterz" <s.sterz at proxmox.com>
>> To "Mark Schouten" <mark at tuxis.nl>
>> Cc "Proxmox Backup Server development discussion"
>> <pbs-devel at lists.proxmox.com>
>> Date 16/12/2024 12:51:47
>> Subject Re: Re[2]: [pbs-devel] Authentication performance
>>
>> >On Mon Dec 16, 2024 at 12:23 PM CET, Mark Schouten wrote:
>> >> Hi,
>> >>
>> >> >
>> >> >would you mind sharing either `authkey.pub` or the output of the
>> >> >following commands:
>> >> >
>> >> >head --lines=1 /etc/proxmox-backup/authkey.key
>> >> >cat /etc/proxmox-backup/authkey.key | wc -l
>> >>
>> >> -----BEGIN RSA PRIVATE KEY-----
>> >> 51
>> >>
>> >> So that is indeed the legacy method. We are going to upgrade our PBS’es
>> >> on Wednesday.
>> >>
>> >> >
>> >> >The first should give the PEM header of the authkey whereas the second
>> >> >provides the number of lines that the key takes up in the file. Both
>> >> >give an indication whether you are using the legacy RSA keys or newer
>> >> >Ed25519 keys. The latter should provide more performance, security should
>> >> >not be affected much by this change. If the output of the commands looks
>> >> >like this:
>> >> >
>> >> >-----BEGIN PRIVATE KEY-----
>> >> >3
>> >> >
>> >> >Then you are using the newer keys. There currently isn't a recommended
>> >> >way to upgrade the keys. However, in theory you should be able to remove
>> >> >the old keys, re-start PBS and it should just generate keys in the new
>> >> >format. Note that this will log out anyone that is currently
>> >> >authenticated and they'll have to re-authenticate.
>> >>
>> >> Seems like a good moment to update those keys as well.
>> >
>> >Sure, just be aware that you have to manually delete the key before
>> >restarting the PBS. Upgrading alone won't affect the key. Ideally you'd
>> >test this before rolling it out, if you can.
>> >
>> >> >In general, tokens should still be faster to authenticate so we'd
>> >> >recommend that you try to get your users to switch to token-based
>> >> >authentication where possible. Improving performance there is a bit
>> >> >trickier though, as it often comes with a security trade-off (in the
>> >> >background we use yescrypt for the authentication there, that
>> >> >deliberately adds a work factor). However, we may be able to improve
>> >> >performance a bit via caching methods or similar.
>> >>
>> >> Yes, that might help. I’m also not sure if it actually is
>> >> authentication, or if it is the datastore-call that the PVE-environments
>> >> call. As you can see in your support issue 3153557, it looks like some
>> >> requests loop through all datastores, before responding with a limited
>> >> set of datastores.
>> >
>> >I looked at that ticket and yes, that is probably unrelated to
>> >authentication.
>> >
>> >> For instance (and I’m a complete noob wrt Rust) but if I understand
>> >> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=11d2641b9ca2d2c92da1a85e4cb16d780368abd3;hb=HEAD#l1315
>> >> correctly, PBS loops through all the datastores, checks mount-status and
>> >> config, and only starts filtering at line 1347. If I understand that
>> >> correctly, in our case with over 1100 datastores, that might cause quite
>> >> some load?
>> >
>> >Possible, yes, that would depend on your configuration. Are all of these
>> >datastores defined with a backing device? Because if not, then this
>> >should be fairly fast (as in, this should not actually touch the disks).
>> >If they are, then yes this could be slow as each store would trigger at
>> >least 2 stat calls afaict.
>> >
>> >In any case, it should be fine to move the `mount_status` check after
>> >the `if allowed || allow_id` check from what I can tell. Not sure why
>> >we'd need to check the mount_status for a datastore we won't include in
>> >the results anyway. Same goes for parsing the store config imo. Sent a
>> >patch for that [1].
>> >
>> >[1]: https://lore.proxmox.com/pbs-devel/20241216115044.208595-1-s.sterz@proxmox.com/T/#u
>> >
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> —
>> >> Mark Schouten
>> >> CTO, Tuxis B.V.
>> >> +31 318 200208 / mark at tuxis.nl
>> >
>> >
>
>
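
For reference, the key-format check from the quoted thread above (PEM header
plus line count) could be scripted roughly as follows. This is only a sketch:
the function name `classify_authkey` and the returned labels are made up for
illustration, and the three-line threshold is a heuristic taken from the
example output in the thread, not something PBS itself does.

```python
# Hypothetical helper, not part of PBS: classify an authkey by the same two
# signals used in the thread (first PEM line and number of lines).

LEGACY_HEADER = "-----BEGIN RSA PRIVATE KEY-----"  # header seen on the legacy key
PKCS8_HEADER = "-----BEGIN PRIVATE KEY-----"       # header seen on the newer key


def classify_authkey(pem_text: str) -> str:
    """Return 'legacy-rsa', 'ed25519' or 'unknown' for the given PEM text."""
    lines = pem_text.splitlines()
    if not lines:
        return "unknown"

    header = lines[0].strip()
    if header == LEGACY_HEADER:
        return "legacy-rsa"

    if header == PKCS8_HEADER:
        # Assumption: the generic PKCS#8 header alone does not name the
        # algorithm, so the short length (3 lines in the thread's example)
        # is used as the hint for Ed25519.
        return "ed25519" if len(lines) <= 3 else "unknown"

    return "unknown"
```

To use it, read /etc/proxmox-backup/authkey.key as text (root is normally
needed) and pass the contents to `classify_authkey`. Anything reported as
"unknown" should be inspected by hand rather than trusted.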