<html><head>
<style id="css_styles" type="text/css"><!--blockquote.cite { margin-left: 5px; margin-right: 0px; padding-left: 10px; padding-right:0px; border-left: 1px solid #cccccc }
blockquote.cite2 {margin-left: 5px; margin-right: 0px; padding-left: 10px; padding-right:0px; border-left: 1px solid #cccccc; margin-top: 3px; padding-top: 0px; }
a img { border: 0px; }
table { border-collapse: collapse; }
li[style='text-align: center;'], li[style='text-align: center; '], li[style='text-align: right;'], li[style='text-align: right; '] { list-style-position: inside;}
body { font-family: Helvetica; font-size: 9pt; }
.quote { margin-left: 1em; margin-right: 1em; border-left: 5px #ebebeb solid; padding-left: 0.3em; }
a.em-mention[href] { text-decoration: none; color: inherit; border-radius: 3px; padding-left: 2px; padding-right: 2px; background-color: #e2e2e2; }
--></style></head>
<body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div>Hi,</div><div><br /></div><div>We upgraded to 3.3 yesterday, but there is not much gain to notice from the new version or the change in keying. It’s still (obviously) pretty busy.</div><div><br /></div><div>However, I also tried to remove some datastores, which failed with timeouts. PBS even stopped authenticating (so probably stopped working altogether) for about 10 seconds, which was an unpleasant surprise.</div><div><br /></div><div>So looking into that further, I noticed the following logging:</div><div>Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client [::ffff]:42104] Unable to acquire lock "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)</div><div>Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client [::ffff]:42144] Unable to acquire lock "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)</div><div>Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client [::ffff]:47286] Unable to acquire lock "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)</div><div>Dec 18 16:14:32 pbs005 proxmox-backup-proxy[39143]: GET /api2/json/admin/datastore/XXXXXX/status: 400 Bad Request: [client [::ffff:]:45994] Unable to acquire lock "/etc/proxmox-backup/.datastore.lck" - Interrupted system call (os error 4)</div><div><br /></div><div>This surprised me, since this is a ’status’ call, which should not need to lock the datastore-config.</div><div><br /></div><div><a 
href="https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=c611f593624977defc49d6e4de2ab8185cfe09e9;hb=HEAD#l687">https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=c611f593624977defc49d6e4de2ab8185cfe09e9;hb=HEAD#l687</a> does not lock the config, but </div>
<div><br /></div><div><a href="https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-datastore/src/datastore.rs;h=0801b4bf6b25eaa6f306c7b39ae2cfe81b4782e1;hb=HEAD#l204">https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-datastore/src/datastore.rs;h=0801b4bf6b25eaa6f306c7b39ae2cfe81b4782e1;hb=HEAD#l204</a> does.</div><div><br /></div><div>So if I understand this correctly, every ’status’ call (30 per second in our case) locks the datastore-config exclusively. And also, every time ’status’ gets called, the whole datastore-config gets loaded?</div><div><br /></div><div>Is that something that could use some performance tuning?</div><div><br /></div><div id="signature_old" style="clear:both"><div style="margin: 0px; padding: 0px; box-sizing: content-box;">— </div><div style="margin: 0px; padding: 0px; box-sizing: content-box;">Mark Schouten</div><div style="margin: 0px; padding: 0px; box-sizing: content-box;">CTO, Tuxis B.V.</div><div style="margin: 0px; padding: 0px; box-sizing: content-box;">+31 318 200208 / mark@tuxis.nl</div></div><div><br /></div>
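<div>For illustration only, one possible direction for the tuning asked about above: reuse a parsed copy of the config across ’status’ calls and reload it only when a change marker moves. This is a minimal sketch under stated assumptions, not a description of how PBS handles its config. <code>ConfigCache</code>, the generation number and the <code>load</code> callback are all hypothetical names.</div>
<pre>
```rust
/// Hypothetical cache for a parsed config; not a PBS type.
struct ConfigCache {
    loaded: bool,     // false until the first load has happened
    generation: u64,  // change marker the cached copy was loaded at
    parsed: String,   // stand-in for the parsed datastore config
    loads: usize,     // counts real loads, to make the saving visible
}

impl ConfigCache {
    fn new() -> Self {
        ConfigCache {
            loaded: false,
            generation: 0,
            parsed: String::new(),
            loads: 0,
        }
    }

    /// Return the cached config, reloading only if the change marker moved.
    /// `current_generation` is assumed to be bumped by whoever writes the config.
    fn get(&mut self, current_generation: u64, load: fn() -> String) -> &str {
        if !self.loaded || self.generation != current_generation {
            self.parsed = load();
            self.generation = current_generation;
            self.loaded = true;
            self.loads += 1;
        }
        &self.parsed
    }
}
```
</pre>
<div>With such a cache, 30 ’status’ calls against an unchanged config would trigger a single load; only a write that bumps the marker forces a second one.</div>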
<div x-em-replyforwardheader=""><br /></div>
<div>
<div>------ Original Message ------</div>
<div>From "Shannon Sterz" &lt;<a href="mailto:s.sterz@proxmox.com">s.sterz@proxmox.com</a>&gt;</div>
<div>To "Mark Schouten" &lt;<a href="mailto:mark@tuxis.nl">mark@tuxis.nl</a>&gt;</div>
<div>Cc "Proxmox Backup Server development discussion" &lt;<a href="mailto:pbs-devel@lists.proxmox.com">pbs-devel@lists.proxmox.com</a>&gt;</div>
<div>Date 16/12/2024 12:51:47</div>
<div>Subject Re: Re[2]: [pbs-devel] Authentication performance</div></div><div x-em-quote=""><br /></div>
<div id="x6df669de829a4a9" class="plain"><blockquote cite="D6D3QC6Y5H4S.1QHYHPHXK6RVR@proxmox.com" type="cite" class="cite2">
<div class="plain_line">On Mon Dec 16, 2024 at 12:23 PM CET, Mark Schouten wrote:</div>
<blockquote type="cite" class="cite">
<div class="plain_line"> Hi,</div>
<div class="plain_line"> </div>
<div class="plain_line"> ></div>
<div class="plain_line"> >would you mind sharing either `authkey.pub` or the output of the</div>
<div class="plain_line"> >following commands:</div>
<div class="plain_line"> ></div>
<div class="plain_line"> >head --lines=1 /etc/proxmox-backup/authkey.key</div>
<div class="plain_line"> >cat /etc/proxmox-backup/authkey.key | wc -l</div>
<div class="plain_line"> </div>
<div class="plain_line"> -----BEGIN RSA PRIVATE KEY-----</div>
<div class="plain_line"> 51</div>
<div class="plain_line"> </div>
<div class="plain_line"> So that is indeed the legacy method. We are going to upgrade our PBS’es</div>
<div class="plain_line"> on Wednesday.</div>
<div class="plain_line"> </div>
<div class="plain_line"> ></div>
<div class="plain_line"> >The first should give the PEM header of the authkey whereas the second</div>
<div class="plain_line"> >provides the amount of lines that the key takes up in the file. Both</div>
<div class="plain_line"> >give an indication whether you are using the legacy RSA keys or newer</div>
<div class="plain_line"> >Ed25519 keys. The latter should provide more performance; security should</div>
<div class="plain_line"> >not be affected much by this change. If the output of the commands looks</div>
<div class="plain_line"> >like this:</div>
<div class="plain_line"> ></div>
<div class="plain_line"> >-----BEGIN PRIVATE KEY-----</div>
<div class="plain_line"> >3</div>
<div class="plain_line"> ></div>
<div class="plain_line"> >Then you are using the newer keys. There currently isn't a recommended</div>
<div class="plain_line"> >way to upgrade the keys. However, in theory you should be able to remove</div>
<div class="plain_line"> >the old keys, re-start PBS and it should just generate keys in the new</div>
<div class="plain_line"> >format. Note that this will logout anyone that is currently</div>
<div class="plain_line"> >authenticated and they'll have to re-authenticate.</div>
<div class="plain_line"> </div>
<div class="plain_line"> Seems like a good moment to update those keys as well.</div>
</blockquote>
<div class="plain_line"> </div>
<div class="plain_line">Sure, just be aware that you have to manually delete the key before</div>
<div class="plain_line">restarting the PBS. Upgrading alone won't affect the key. Ideally you'd</div>
<div class="plain_line">test this before rolling it out, if you can.</div>
<div class="plain_line"> </div>
<blockquote type="cite" class="cite2">
<div class="plain_line"> >In general, tokens should still be faster to authenticate so we'd</div>
<div class="plain_line"> >recommend that you try to get your users to switch to token-based</div>
<div class="plain_line"> >authentication where possible. Improving performance there is a bit</div>
<div class="plain_line"> >trickier though, as it often comes with a security trade-off (in the</div>
<div class="plain_line"> >background we use yescrypt for the authentication there, that</div>
<div class="plain_line"> >deliberately adds a work factor). However, we may be able to improve</div>
<div class="plain_line"> >performance a bit via caching methods or similar.</div>
<div class="plain_line"> </div>
<div class="plain_line"> Yes, that might help. I’m also not sure if it actually is</div>
<div class="plain_line"> authentication, or if it is the datastore-call that the PVE-environments</div>
<div class="plain_line"> call. As you can see in your support issue 3153557, it looks like some</div>
<div class="plain_line"> requests loop through all datastores, before responding with a limited</div>
<div class="plain_line"> set of datastores.</div>
</blockquote>
<div class="plain_line"> </div>
<div class="plain_line">I looked at that ticket and yes, that is probably unrelated to</div>
<div class="plain_line">authentication.</div>
<div class="plain_line"> </div>
<blockquote type="cite" class="cite2">
<div class="plain_line"> For instance (and I’m a complete noob wrt Rust) but if I understand</div>
<div class="plain_line"> <a href="https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=11d2641b9ca2d2c92da1a85e4cb16d780368abd3;hb=HEAD#l1315">https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=src/api2/admin/datastore.rs;h=11d2641b9ca2d2c92da1a85e4cb16d780368abd3;hb=HEAD#l1315</a></div>
<div class="plain_line"> correctly, PBS loops through all the datastores, checks mount-status and</div>
<div class="plain_line"> config, and only starts filtering at line 1347. If I understand that</div>
<div class="plain_line"> correctly, in our case with over 1100 datastores, that might cause quite</div>
<div class="plain_line"> some load?</div>
</blockquote>
<div class="plain_line"> </div>
<div class="plain_line">Possible, yes, that would depend on your configuration. Are all of these</div>
<div class="plain_line">datastores defined with a backing device? Because if not, then this</div>
<div class="plain_line">should be fairly fast (as in, this should not actually touch the disks).</div>
<div class="plain_line">If they are, then yes this could be slow as each store would trigger at</div>
<div class="plain_line">least 2 stat calls afaict.</div>
<div class="plain_line"> </div>
<div class="plain_line">In any case, it should be fine to move the `mount_status` check after</div>
<div class="plain_line">the `if allowed || allow_id` check from what i can tell. Not sure why</div>
<div class="plain_line">we'd need to check the mount_status for a datastore we won't include in</div>
<div class="plain_line">the results anyway. Same goes for parsing the store config imo. Sent a</div>
<div class="plain_line">patch for that [1].</div>
<div class="plain_line"> </div>
<div class="plain_line">[1]: <a href="https://lore.proxmox.com/pbs-devel/20241216115044.208595-1-s.sterz@proxmox.com/T/#u">https://lore.proxmox.com/pbs-devel/20241216115044.208595-1-s.sterz@proxmox.com/T/#u</a></div>
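<div class="plain_line"> </div>
<div class="plain_line">To illustrate the reordering described above, here is a minimal sketch. It is not the actual patch: `Store`, `mount_status` and `list_visible` are hypothetical stand-ins for the PBS code, and the call counter exists only to make the saving visible.</div>
<pre>
```rust
/// Hypothetical stand-in for a datastore entry; not the PBS type.
struct Store {
    name: String,
    mounted: bool,
}

/// Stand-in for the expensive per-store check (assumed to cost stat calls).
/// `calls` counts invocations so the effect of the reordering is visible.
fn mount_status(store: &Store, calls: &mut usize) -> bool {
    *calls += 1;
    store.mounted
}

/// Apply the permission filter first, and only then run the expensive check.
fn list_visible(
    stores: &[Store],
    allowed: fn(&str) -> bool,
    calls: &mut usize,
) -> Vec<(String, bool)> {
    let mut visible = Vec::new();
    for store in stores {
        if !allowed(&store.name) {
            continue; // filtered-out stores never reach mount_status
        }
        let mounted = mount_status(store, calls);
        visible.push((store.name.clone(), mounted));
    }
    visible
}
```
</pre>
<div class="plain_line">With 1100 stores of which a caller may see two, this runs the check twice instead of 1100 times.</div>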
<div class="plain_line"> </div>
<blockquote type="cite" class="cite2">
<div class="plain_line"> </div>
<div class="plain_line"> </div>
<div class="plain_line"> Thanks,</div>
<div class="plain_line"> </div>
<div class="plain_line"> —</div>
<div class="plain_line"> Mark Schouten</div>
<div class="plain_line"> CTO, Tuxis B.V.</div>
<div class="plain_line"> +31 318 200208 / <a href="mailto:mark@tuxis.nl">mark@tuxis.nl</a></div>
</blockquote>
<div class="plain_line"> </div>
<div class="plain_line"> </div>
</blockquote></div>
</body></html>