[pbs-devel] [PATCH proxmox-backup v3] etc: raise nofile soft limit to hard limit for proxmox-backup-proxy

Fabian Grünbichler f.gruenbichler at proxmox.com
Fri Nov 21 10:06:37 CET 2025


On November 21, 2025 9:00 am, Christian Ebner wrote:
> On 11/21/25 8:42 AM, Fabian Grünbichler wrote:
>> On November 20, 2025 6:23 pm, Thomas Lamprecht wrote:
>>> Am 20.11.25 um 16:12 schrieb Christian Ebner:
>>>> On 11/20/25 4:05 PM, Thomas Lamprecht wrote:
>>>>> Am 20.11.25 um 15:32 schrieb Christian Ebner:
>>>>>> This is acceptable since PBS does not directly depend on problematic
>>>>>> select() calls as verified via `nm` and does not use it in linked
>>>>>> libraries to the best of my knowledge.
>>>>>>
>>>>>
>>>>> Isn't above and
>>>>
>>>> With above I intended to state that the PBS code itself does not call into select(), while below are dependencies on shared objects which might call into select() according to their symbols.
>>>>
>>>
>>> And the systemd news entry you link to in the commit message clearly states:
>>>
>>> ----8<----
>>> Programs that want to take benefit of the increased limit have to "opt-in" into
>>> high file descriptors explicitly by raising their soft limit. Of course, when
>>> they do that they must acknowledge that they cannot use select() anymore (and
>>> **neither can any shared library they use — or any shared library used by any
>>> shared library they use and so on**).
>>> ---->8----
>>>
>>> I just checked the apt repo, and it includes various select calls. Most seem
>>> to center around downloading packages and such, but I'd not bet on it that
>>> no such select is anywhere in the code paths we use.
>>>
>>> PAM uses select in the pam_loginuid, which might be part of the login call,
>>> albeit it uses it only if require_auditd is enabled (which I don't think it is).
>>> I did not yet check the others out.
>>>
>>> I mean, one option might be to provide our own select wrapper preloaded
>>> overriding the glibc one and keep some FDs below 1024 reserved for that, but
>>> I really really dislike doing such things. Similar in spirit would be providing
>>> a select-compatible implementation using poll and LD_PRELOAD that, but also far
>>> from great..
>>>
>>> Moving either GC, or all the things that might call select as per your list,
>>> into a dedicated process might be the nicer thing to do. But as mentioned offlist
>>> I'll try to walk through the problem and code again tomorrow and see if I can
>>> find some other viable options (or you/fabian got some ideas), as of my current
>>> knowledge I cannot really accept doing this bump.
>> 
>> if we move something, we should move the things (potentially) calling
>> select, as we can then benefit from higher FD limits for all the regular
>> operations. 1k open FDs is not much even without the newly added locks,
>> and we had users running into issues already before that fixed them by
>> raising the limit with a systemd override or other means (or not at
>> all):
>> 
>> https://forum.proxmox.com/threads/too-many-open-files-os-error-24.73094/
>> https://forum.proxmox.com/threads/garbage-collect-job-fails-with-emfile-too-many-open-files.152687/
>> https://forum.proxmox.com/threads/tasks-fail-with-too-many-open-files-os-error-24.126770/
>> https://forum.proxmox.com/threads/sync-from-pbs-to-pbs-failed-too-many-open-files.113036/
>> https://forum.proxmox.com/threads/another-sync-error.73417/
>> 
>> the only alternative I see at the moment would be to either
>> - reduce the lock granularity of the newly introduced lock (e.g.,
>>    lock-per-chunk-prefix)
> 
> This however does not necessarily solve the issue at hand? Many of these 
> chunks will have different prefixes... So worst case one ends up in the 
> exact same spot we are in now.

yes, that's true. it makes things more complicated, and might reduce the
number of open locks by virtue of lock contention, but doesn't ensure we
don't run into the issue..
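
to put a rough number on that, here is a minimal sketch (not PBS code).
it assumes one open FD per held lock, SHA-256 chunk digests, and a
hypothetical lock granularity of the first four hex digits of the
digest; neither `distinct_prefixes` nor the prefix length is taken from
the patch. it counts how many distinct prefix locks a batch of
unrelated chunks would need:

```python
import hashlib

SOFT_NOFILE_LIMIT = 1024  # default soft limit discussed in this thread


def distinct_prefixes(num_chunks, prefix_hex_len=4):
    """Count the locks needed if each lock covers one digest prefix.

    Assumption: chunk digests behave like SHA-256 output, so the
    prefixes of unrelated chunks are spread roughly uniformly.
    """
    prefixes = set()
    for i in range(num_chunks):
        # Stand-in for a real chunk digest: hash a counter.
        digest = hashlib.sha256(i.to_bytes(8, "little")).hexdigest()
        prefixes.add(digest[:prefix_hex_len])
    return len(prefixes)


if __name__ == "__main__":
    for n in (500, 2000, 10000):
        locks = distinct_prefixes(n)
        print(f"{n} chunks -> {locks} prefix locks, "
              f"over soft limit: {locks > SOFT_NOFILE_LIMIT}")
```

with a prefix space that large, almost every chunk lands on its own
prefix, so the lock count stays close to the chunk count. a much
shorter prefix would cap the number of locks, but at the cost of more
contention.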