[pbs-devel] Authentication performance

Mark Schouten mark at tuxis.nl
Mon Jan 6 20:07:43 CET 2025


https://bugzilla.proxmox.com/show_bug.cgi?id=6049 has been created for 
this.

Thanks!

—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / mark at tuxis.nl


------ Original Message ------
From "Shannon Sterz" <s.sterz at proxmox.com>
To "Mark Schouten" <mark at tuxis.nl>
Cc "Proxmox Backup Server development discussion" 
<pbs-devel at lists.proxmox.com>
Date 20/12/2024 14:22:18
Subject Re: Re[4]: [pbs-devel] Authentication performance

>On Thu Dec 19, 2024 at 10:56 AM CET, Mark Schouten wrote:
>>  Hi,
>>
>>  We upgraded to 3.3 yesterday; not much gain to notice with regard to
>>  the new version or the change in keying. It’s still (obviously) pretty
>>  busy.
>
>just be aware that the patch i linked to in my last mail has not been
>packaged yet, so you wouldn't see the impact of that patch yet.
>
>>  However, I also tried to remove some datastores, which failed with
>>  timeouts. PBS even stopped authenticating (and probably working)
>>  altogether for about 10 seconds, which was an unpleasant surprise.
>>
>>  So looking into that further, I noticed the following logging:
>>  Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>>  /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>>  [::ffff]:42104] Unable to acquire lock
>>  "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>>  4)
>>  Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>>  /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>>  [::ffff]:42144] Unable to acquire lock
>>  "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>>  4)
>>  Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>>  /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>>  [::ffff]:47286] Unable to acquire lock
>>  "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>>  4)
>>  Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET
>>  /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client
>>  [::ffff:]:45994] Unable to acquire lock
>>  "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error
>>  4)
>>
>>  Which surprised me, since this is a ’status’ call, which should not need
>>  locking of the datastore-config.
>>
>>https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=c611f593624977defc49d6e4de2ab8185cfe09e9;hb=HEAD#l687
>>  does not lock the config, but
>>
>>https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-datastore/src/datastore.rs;h=0801b4bf6b25eaa6f306c7b39ae2cfe81b4782e1;hb=HEAD#l204
>>  does.
>>
>>  So if I understand this correctly, every ’status’ call (30 per second in
>>  our case) locks the datastore-config exclusively. And also, every time
>>  ’status’ gets called, the whole datastore-config gets loaded?
>
>Probably. There are some comments about that in the code already; it might
>make sense to open a Bugzilla issue to discuss this further [1].
>
>[1]: https://bugzilla.proxmox.com/
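The "Unable to acquire lock" errors in the log excerpt above are a symptom of contention on a single lock file. The effect can be reproduced in miniature with flock(1) on a scratch file; this is only an illustrative sketch of exclusive-lock contention, not PBS's actual locking code:

```shell
lockfile=$(mktemp)

# Hold an exclusive lock for 2 seconds, standing in for a long-running
# operation that locks the shared config.
flock "$lockfile" sleep 2 &
sleep 0.3   # give the background process time to take the lock

# A non-blocking attempt, standing in for a concurrent "status" request,
# fails while the exclusive lock is held.
if flock --nonblock "$lockfile" true; then
    result="acquired"
else
    result="blocked"
fi
echo "$result"
wait
rm -f "$lockfile"
```

With 30 such requests per second all contending for the same exclusive lock, requests queue up behind whichever process holds it, which matches the observed stall.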
>
>>  Is that something that could use some performance tuning?
>>
>>  —
>>  Mark Schouten
>>  CTO, Tuxis B.V.
>>  +31 318 200208 / mark at tuxis.nl
>>
>>
>>  ------ Original Message ------
>>  From "Shannon Sterz" <s.sterz at proxmox.com>
>>  To "Mark Schouten" <mark at tuxis.nl>
>>  Cc "Proxmox Backup Server development discussion"
>>  <pbs-devel at lists.proxmox.com>
>>  Date 16/12/2024 12:51:47
>>  Subject Re: Re[2]: [pbs-devel] Authentication performance
>>
>>  >On Mon Dec 16, 2024 at 12:23 PM CET, Mark Schouten wrote:
>>  >>  Hi,
>>  >>
>>  >>  >
>>  >>  >would you mind sharing either `authkey.pub` or the output of the
>>  >>  >following commands:
>>  >>  >
>>  >>  >head --lines=1 /etc/proxmox-backup/authkey.key
>>  >>  >cat /etc/proxmox-backup/authkey.key | wc -l
>>  >>
>>  >>  -----BEGIN RSA PRIVATE KEY-----
>>  >>  51
>>  >>
>>  >>  So that is indeed the legacy method. We are going to upgrade our PBS’es
>>  >>  on Wednesday.
>>  >>
>>  >>  >
>>  >>  >The first should give the PEM header of the authkey, whereas the second
>>  >>  >counts the number of lines that the key takes up in the file. Both
>>  >>  >give an indication of whether you are using the legacy RSA keys or newer
>>  >>  >Ed25519 keys. The latter should provide more performance; security should
>>  >>  >not be affected much by this change. If the output of the commands looks
>>  >>  >like this:
>>  >>  >
>>  >>  >-----BEGIN PRIVATE KEY-----
>>  >>  >3
>>  >>  >
>>  >>  >Then you are using the newer keys. There currently isn't a recommended
>>  >>  >way to upgrade the keys. However, in theory you should be able to remove
>>  >>  >the old keys and restart PBS, and it should just generate keys in the new
>>  >>  >format. Note that this will log out anyone who is currently
>>  >>  >authenticated, and they'll have to re-authenticate.
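The manual header check described above can be wrapped in a small helper; a sketch, where the `authkey_kind` function is hypothetical and not part of PBS:

```shell
# Classify the auth key by its PEM header: "RSA PRIVATE KEY" marks the
# legacy format; a plain "PRIVATE KEY" header is the newer Ed25519 key.
authkey_kind() {
    case "$(head --lines=1 "$1")" in
        '-----BEGIN RSA PRIVATE KEY-----') echo "legacy-rsa" ;;
        '-----BEGIN PRIVATE KEY-----')     echo "ed25519" ;;
        *)                                 echo "unknown" ;;
    esac
}

# Demonstrated on a scratch file; the real path would be
# /etc/proxmox-backup/authkey.key.
fake_key=$(mktemp)
printf '%s\n' '-----BEGIN RSA PRIVATE KEY-----' > "$fake_key"
authkey_kind "$fake_key"    # prints: legacy-rsa
```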
>>  >>
>>  >>    Seems like a good moment to update those keys as well.
>>  >
>>  >Sure, just be aware that you have to manually delete the key before
>>  >restarting PBS. Upgrading alone won't affect the key. Ideally you'd
>>  >test this before rolling it out, if you can.
>>  >
>>  >>  >In general, tokens should still be faster to authenticate, so we'd
>>  >>  >recommend that you try to get your users to switch to token-based
>>  >>  >authentication where possible. Improving performance there is a bit
>>  >>  >trickier though, as it often comes with a security trade-off (in the
>>  >>  >background we use yescrypt for authentication there, which
>>  >>  >deliberately adds a work factor). However, we may be able to improve
>>  >>  >performance a bit via caching methods or similar.
>>  >>
>>  >>  Yes, that might help. I’m also not sure if it actually is
>>  >>  authentication, or if it is the datastore call that the PVE environments
>>  >>  make. As you can see in your support issue 3153557, it looks like some
>>  >>  requests loop through all datastores before responding with a limited
>>  >>  set of datastores.
>>  >
>>  >I looked at that ticket and yes, that is probably unrelated to
>>  >authentication.
>>  >
>>  >>  For instance (and I’m a complete noob wrt Rust) but if I understand
>>  >>https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=11d2641b9ca2d2c92da1a85e4cb16d780368abd3;hb=HEAD#l1315
>>  >>  correctly, PBS loops through all the datastores, checks mount-status and
>>  >>  config, and only starts filtering at line 1347. If I understand that
>>  >>  correctly, in our case with over 1100 datastores, that might cause quite
>>  >>  some load?
>>  >
>>  >Possible, yes; that would depend on your configuration. Are all of these
>>  >datastores defined with a backing device? Because if not, then this
>>  >should be fairly fast (as in, this should not actually touch the disks).
>>  >If they are, then yes, this could be slow, as each store would trigger at
>>  >least 2 stat calls afaict.
>>  >
>>  >In any case, it should be fine to move the `mount_status` check after
>>  >the `if allowed || allow_id` check from what I can tell. Not sure why
>>  >we'd need to check the mount_status for a datastore we won't include in
>>  >the results anyway. Same goes for parsing the store config imo. Sent a
>>  >patch for that [1].
>>  >
>>  >[1]: https://lore.proxmox.com/pbs-devel/20241216115044.208595-1-s.sterz@proxmox.com/T/#u
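The reordering suggested above amounts to running the cheap permission check first and only doing the expensive per-store work (the stat calls) for stores that will actually appear in the result. A toy sketch of that ordering; the store names and both helper functions are made up for illustration:

```shell
# Cheap check: would this store appear in the result at all?
allowed() { case "$1" in store1|store3) return 0 ;; *) return 1 ;; esac; }

# Stands in for the expensive mount-status / stat work.
mount_status() { echo "mounted"; }

for store in store1 store2 store3 store4; do
    allowed "$store" || continue          # filter before any stat work
    echo "$store: $(mount_status "$store")"
done
```

With 1100 datastores and only a small allowed subset, this skips the disk-touching work for every store the caller cannot see anyway.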
>>  >
>>  >>
>>  >>
>>  >>  Thanks,
>>  >>
>>  >>  —
>>  >>    Mark Schouten
>>  >>  CTO, Tuxis B.V.
>>  >>  +31 318 200208 / mark at tuxis.nl
>>  >
>>  >
>
>