[pve-devel] [PATCH container 2/2] update_lxc_config: mount /sys read-only for CONTAINER_INTERFACE compat

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Mar 17 12:31:49 CET 2020


On 3/17/20 10:27 AM, Wolfgang Bumiller wrote:
> On 3/17/20 7:35 AM, Thomas Lamprecht wrote:
>> CONTAINER_INTERFACE[0] is something systemd people call their API and
>> we need to adapt to it a bit, even if it means doing stupid
>> unnecessary things, as else systemd decides to regress and suddenly
>> break network stack in CT after an upgrade[1].
>>
>> This mounts the parent /sys as ro, child mounts can be whatever.
>> Fixes the system regression introduced by[2].
>>
>> [0]: https://systemd.io/CONTAINER_INTERFACE/
>> [1]: https://github.com/systemd/systemd/issues/15101#issuecomment-598607582
>> [2]: https://github.com/systemd/systemd/commit/bf331d87171b7750d1c72ab0b140a240c0cf32c3
>>
>> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
>> ---
>>
>> I hate it.
>>
>> Just a POC for commenting or picking up, probably belongs in an LXC config or in
>> a "per distro, per systemd version" specific thing
> 
> Could `sys:mixed` be enough?
> 
> We might need to explicitly rw-mount /sys/kernel/security for nested apparmor with either of them.
> 
> Since we're effectively reducing access this will surely annoy some users. We probably want this to be configurable at first at least. We can make it default/opt-out IMO, at least for archlinux containers, but I don't like the idea of a more "complex" version check for this.

Sooner or later you need that anyway. We already get warnings for
the v4/v6 DHCP systemd-network settings; they will be dropped in a future
release, but the new variants ("ipv4", "ipv6") are only available since
systemd version v219 or v220, and other settings will surely also get
replaced sometime in the future.

IMO, a "get CT systemd version" helper would let us differentiate between
old and new methods without much hassle.
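
A rough sketch of what such a helper could look like, in Python purely for
illustration. It assumes the version is read from the first line of
`systemctl --version` output taken inside the CT. The function names and the
cut-off of 220 are hypothetical; the exact version that introduced the new
DHCP values would still need to be checked.

```python
import re

# Assumption: the new "ipv4"/"ipv6" values arrived in v219 or v220; the upper
# end is used here as the conservative choice and must be verified.
NEW_DHCP_SYNTAX_SINCE = 220


def parse_systemd_version(version_output):
    """Return the major version from `systemctl --version` output, or None."""
    lines = version_output.splitlines()
    if not lines:
        return None
    # The first line looks like: "systemd 245 (245.4-4ubuntu3)"
    match = re.match(r"systemd\s+(\d+)", lines[0].strip())
    return int(match.group(1)) if match else None


def dhcp_value(version, family, threshold=NEW_DHCP_SYNTAX_SINCE):
    """Pick the DHCP= value for a single address family (4 or 6)."""
    if family not in (4, 6):
        raise ValueError("family must be 4 or 6")
    # Unknown versions fall back to the old spelling, which old and current
    # releases both still accept (with a warning on newer ones).
    use_new = version is not None and version >= threshold
    return ("ipv%d" if use_new else "v%d") % family
```

With this, `dhcp_value(parse_systemd_version(output), 4)` would pick the
spelling to write into the CT's network config. Other per-version differences
could hang off the same parsed number.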

> 
> I do wonder though if we should just remove the auto sys mount and mount it in our hooks together with the rest manually. They do say a read-only tmpfs works fine, and then we can skip some mounts, or selectively make some ro/rw as mentioned in your [0].

Not too sure. Could you take a closer look at this and at a sane and (hopefully)
future-proof fix for this debacle?




