[pve-devel] [PATCH container 2/2] update_lxc_config: mount /sys read-only for CONTAINER_INTERFACE compat

Wolfgang Bumiller w.bumiller at proxmox.com
Tue Mar 17 14:10:18 CET 2020


On 3/17/20 12:31 PM, Thomas Lamprecht wrote:
> On 3/17/20 10:27 AM, Wolfgang Bumiller wrote:
>> On 3/17/20 7:35 AM, Thomas Lamprecht wrote:
>>> CONTAINER_INTERFACE[0] is something systemd people call their API and
>>> we need to adapt to it a bit, even if it means doing stupid
>>> unnecessary things, as otherwise systemd decides to regress and suddenly
>>> breaks the network stack in CTs after an upgrade[1].
>>>
>>> This mounts the parent /sys as ro, child mounts can be whatever.
>>> Fixes the system regression introduced by[2].
>>>
>>> [0]: https://systemd.io/CONTAINER_INTERFACE/
>>> [1]: https://github.com/systemd/systemd/issues/15101#issuecomment-598607582
>>> [2]: https://github.com/systemd/systemd/commit/bf331d87171b7750d1c72ab0b140a240c0cf32c3
>>>
>>> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
>>> ---
>>>
>>> I hate it.
>>>
>>> Just a POC for commenting or picking up, probably belongs in a LXC config or in
>>> a "per distro, per systemd version" specific thing
>>
>> Could `sys:mixed` be enough?
>>
>> We might need to explicitly rw-mount /sys/kernel/security for nested apparmor with either of them.
>>
>> Since we're effectively reducing access this will surely annoy some users. We probably want this to be configurable at first at least. We can make it default/opt-out IMO, at least for archlinux containers, but I don't like the idea of a more "complex" version check for this.
> 
> Sooner or later you need that anyway. We already get warnings for
> the v4/v6 DHCP systemd-network settings; they will be dropped in a future
> release, but the new variants ("ipv4", "ipv6") are only available since
> systemd version v219 or v220, and other settings will surely also get
> replaced sometime in the future.
> 
> IMO, a "get CT systemd version" helper would let us differentiate between
> old and new methods easily and without much hassle.

I suppose so. Need to figure out a *good* way to do that then.
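
One possible direction, purely as a rough sketch and not what pve-container
does: read the version off the file name of systemd's private shared library
inside the CT's rootfs, without running anything in the container. This
assumes the library is named like `libsystemd-shared-245.so` and lives under
`lib/systemd` or `usr/lib/systemd`; the function names, the search paths and
the `new_since=220` threshold are all made up for illustration (the thread
only says "v219 or v220").

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: the file name embeds the version, e.g. "libsystemd-shared-245.so"
# or, with a distro suffix, "libsystemd-shared-245.4-1.so".
_SHARED_LIB_RE = re.compile(r"^libsystemd-shared-(\d+)(?:[.\-~].*)?\.so$")

# Hypothetical search locations, relative to the container's root filesystem.
_CANDIDATE_DIRS = ("lib/systemd", "usr/lib/systemd")


def parse_systemd_version(filename: str) -> Optional[int]:
    """Return the major systemd version encoded in a library file name."""
    match = _SHARED_LIB_RE.match(filename)
    return int(match.group(1)) if match else None


def find_systemd_version(rootfs: str) -> Optional[int]:
    """Scan the candidate directories below rootfs; None if nothing matches."""
    for rel in _CANDIDATE_DIRS:
        directory = Path(rootfs) / rel
        if not directory.is_dir():
            continue
        for entry in sorted(directory.iterdir()):
            version = parse_systemd_version(entry.name)
            if version is not None:
                return version
    return None


def dhcp_setting(version: Optional[int], family: str, new_since: int = 220) -> str:
    """Pick the old ("v4"/"v6") or new ("ipv4"/"ipv6") DHCP value.

    new_since is a placeholder threshold, not a verified version number.
    An unknown version falls back to the old spelling.
    """
    if family not in ("4", "6"):
        raise ValueError("family must be '4' or '6'")
    if version is not None and version >= new_since:
        return "ipv" + family
    return "v" + family
```

Callers would then do something like
`dhcp_setting(find_systemd_version(rootfs), "4")`. Whether an unknown version
should fall back to the old or the new behaviour is a separate question.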

>>
>> I do wonder though if we should just remove the auto sys mount and mount it in our hooks together with the rest manually. They do say a read-only tmpfs works fine, and then we can skip some mounts, or selectively make some ro/rw as mentioned in your [0].
> 
> Not too sure. Could you take a closer look at this and find a sane and
> (hopefully) future-proof fix for this debacle?

Yeah.




