[pve-devel] [PATCH container 2/2] update_lxc_config: mount /sys read-only for CONTAINER_INTERFACE compat

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Mar 17 14:24:00 CET 2020


On 3/17/20 2:10 PM, Wolfgang Bumiller wrote:
> On 3/17/20 12:31 PM, Thomas Lamprecht wrote:
>> On 3/17/20 10:27 AM, Wolfgang Bumiller wrote:
>>> On 3/17/20 7:35 AM, Thomas Lamprecht wrote:
>>>> CONTAINER_INTERFACE[0] is something systemd people call their API and
>>>> we need to adapt to it a bit, even if it means doing stupid
>>>> unnecessary things, as otherwise systemd decides to regress and suddenly
>>>> breaks the network stack in the CT after an upgrade[1].
>>>>
>>>> This mounts the parent /sys as ro, child mounts can be whatever.
>>>> Fixes the system regression introduced by[2].
>>>>
>>>> [0]: https://systemd.io/CONTAINER_INTERFACE/
>>>> [1]: https://github.com/systemd/systemd/issues/15101#issuecomment-598607582
>>>> [2]: https://github.com/systemd/systemd/commit/bf331d87171b7750d1c72ab0b140a240c0cf32c3
>>>>
>>>> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
>>>> ---
>>>>
>>>> I hate it.
>>>>
>>>> Just a POC for commenting or picking up, probably belongs in a LXC config or in
>>>> a "per distro, per systemd version" specific thing
>>>
>>> Could `sys:mixed` be enough?
>>>
>>> We might need to explicitly rw-mount /sys/kernel/security for nested apparmor with either of them.
>>>
>>> Since we're effectively reducing access this will surely annoy some users. We probably want this to be configurable at first at least. We can make it default/opt-out IMO, at least for archlinux containers, but I don't like the idea of a more "complex" version check for this.
>>
>> Sooner or later you need that anyway; we already get warnings for
>> the v4/v6 DHCP systemd-network settings, which will be dropped in a future
>> release, but the new variants ("ipv4", "ipv6") are only available since
>> systemd version v219 or v220, and other settings will surely also get
>> replaced sometime in the future.
>>
>> IMO, a "get CT systemd version" helper allows differentiating between
>> old and new methods easily without much hassle.
> 
> I suppose so. Need to figure out a *good* way to do that then.


I mean, the single and, for me, obvious way was to find the systemd binary
by checking the common /lib and /usr path(s) and then just run `systemd --version`
on that.
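
A rough sketch of that idea, in Python only for illustration (the real
helper would live in our Perl code). The candidate paths, the function names
and the assumption that the first line of the `systemd --version` output
looks like "systemd 245 (245.2-1-arch)" are all just assumptions here, not
something that exists in pve-container. Actually executing the binary is left
out on purpose.

```python
import os
import re

# Hypothetical candidate locations, relative to the container's root
# filesystem; the actual list would need to be checked per distribution.
CANDIDATE_PATHS = (
    "lib/systemd/systemd",
    "usr/lib/systemd/systemd",
)


def find_systemd_binary(rootfs, candidates=CANDIDATE_PATHS):
    """Return the first candidate path that exists below rootfs, else None.

    Note: absolute symlinks inside the rootfs are not resolved relative to
    the rootfs here; a real helper would have to take care of that.
    """
    for rel in candidates:
        path = os.path.join(rootfs, rel)
        if os.path.isfile(path):
            return path
    return None


def parse_systemd_version(output):
    """Extract the major version number from `systemd --version` output.

    Assumes the first line starts with "systemd <number>"; returns None if
    it does not.
    """
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"systemd\s+(\d+)", first_line)
    return int(match.group(1)) if match else None
```

With something like that, a caller could compare the returned number
against a threshold (e.g., the v219/v220 mentioned above) to pick the old or
new variant of a setting.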


> 
>>>
>>> I do wonder though if we should just remove the auto sys mount and mount it in our hooks together with the rest manually. They do say a read-only tmpfs works fine, and then we can skip some mounts, or selectively make some ro/rw as mentioned in your [0].
>>
>> Not too sure, could you take a closer look at this and at a sane and (hopefully)
>> future-proof fix for this debacle?
> 
> Yeah.
> 





