2024-05-24 08:24:59

by Javier Martinez Canillas

[permalink] [raw]
Subject: [PATCH] userns: Default to 'yes' when CONFIG_MEMCG option is enabled

The default value for the CONFIG_USER_NS Kconfig symbol changed over time.

When first was introduced by commit acce292c82d4 ("user namespace: add the
framework"), the default was 'no'. But then it was changed to 'yes' if the
CONFIG_NAMESPACES option was enabled, by commit 17a6d4411a4d ("namespaces:
default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected").

Then, commit 5673a94c1457 ("userns: Add a Kconfig option to enforce strict
kuid and kgid type checks") changed the default to 'no' again and selected
the (now defunct) UIDGID_STRICT_TYPE_CHECKS option.

This selected option was removed by commit 261000a56b63 ("userns: Remove
UIDGID_STRICT_TYPE_CHECKS"), but CONFIG_USER_NS default was left to 'no'.

Finally, the commit e11f0ae388f2 ("userns: Recommend use of memory control
groups") added to the Kconfig symbol's help text a recommendation that the
memory control groups should be used, to limit the amount of memory that a
user who can create user namespaces can consume.

Looking at the changes' history, a default to 'yes' when the CONFIG_MEMCG
option is enabled seems like a sane thing to do. Specially since systemd
requires user namespaces support for services that use the PrivateUsers=
property in their unit files (e.g: the UPower daemon).

Signed-off-by: Javier Martinez Canillas <[email protected]>
---

init/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/init/Kconfig b/init/Kconfig
index 72404c1f2157..208e2f500ef0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1239,6 +1239,7 @@ config IPC_NS

config USER_NS
bool "User namespace"
+ default y if MEMCG
default n
help
This allows containers, i.e. vservers, to use user namespaces
--
2.45.1



2024-05-24 11:48:52

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] userns: Default to 'yes' when CONFIG_MEMCG option is enabled

On Fri, May 24, 2024 at 10:24:16AM +0200, Javier Martinez Canillas wrote:
> The default value for the CONFIG_USER_NS Kconfig symbol changed over time.
>
> When first was introduced by commit acce292c82d4 ("user namespace: add the
> framework"), the default was 'no'. But then it was changed to 'yes' if the
> CONFIG_NAMESPACES option was enabled, by commit 17a6d4411a4d ("namespaces:
> default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected").
>
> Then, commit 5673a94c1457 ("userns: Add a Kconfig option to enforce strict
> kuid and kgid type checks") changed the default to 'no' again and selected
> the (now defunct) UIDGID_STRICT_TYPE_CHECKS option.
>
> This selected option was removed by commit 261000a56b63 ("userns: Remove
> UIDGID_STRICT_TYPE_CHECKS"), but CONFIG_USER_NS default was left to 'no'.
>
> Finally, the commit e11f0ae388f2 ("userns: Recommend use of memory control
> groups") added to the Kconfig symbol's help text a recommendation that the
> memory control groups should be used, to limit the amount of memory that a
> user who can create user namespaces can consume.
>
> Looking at the changes' history, a default to 'yes' when the CONFIG_MEMCG
> option is enabled seems like a sane thing to do. Specially since systemd
> requires user namespaces support for services that use the PrivateUsers=
> property in their unit files (e.g: the UPower daemon).

Fyi, user namespaces are an entirely optional feature in systemd and it
gracefully falls back if they are not available with PrivateUsers= set.
If that isn't the case then it's a bug in systemd with PrivateUsers=
handling and should be reported.

But specifically to you change, afair CONFIG_MEMCG and userns are
unrelated so tying them together like this in the kconfig seems
misguided.

2024-05-24 12:33:41

by Javier Martinez Canillas

[permalink] [raw]
Subject: Re: [PATCH] userns: Default to 'yes' when CONFIG_MEMCG option is enabled

Christian Brauner <[email protected]> writes:

Hello Christian,

Thanks a lot for your feedback.

> On Fri, May 24, 2024 at 10:24:16AM +0200, Javier Martinez Canillas wrote:
>> The default value for the CONFIG_USER_NS Kconfig symbol changed over time.
>>
>> When first was introduced by commit acce292c82d4 ("user namespace: add the
>> framework"), the default was 'no'. But then it was changed to 'yes' if the
>> CONFIG_NAMESPACES option was enabled, by commit 17a6d4411a4d ("namespaces:
>> default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected").
>>
>> Then, commit 5673a94c1457 ("userns: Add a Kconfig option to enforce strict
>> kuid and kgid type checks") changed the default to 'no' again and selected
>> the (now defunct) UIDGID_STRICT_TYPE_CHECKS option.
>>
>> This selected option was removed by commit 261000a56b63 ("userns: Remove
>> UIDGID_STRICT_TYPE_CHECKS"), but CONFIG_USER_NS default was left to 'no'.
>>
>> Finally, the commit e11f0ae388f2 ("userns: Recommend use of memory control
>> groups") added to the Kconfig symbol's help text a recommendation that the
>> memory control groups should be used, to limit the amount of memory that a
>> user who can create user namespaces can consume.
>>
>> Looking at the changes' history, a default to 'yes' when the CONFIG_MEMCG
>> option is enabled seems like a sane thing to do. Specially since systemd
>> requires user namespaces support for services that use the PrivateUsers=
>> property in their unit files (e.g: the UPower daemon).
>
> Fyi, user namespaces are an entirely optional feature in systemd and it
> gracefully falls back if they are not available with PrivateUsers= set.
> If that isn't the case then it's a bug in systemd with PrivateUsers=
> handling and should be reported.
>

Interesting, it definitely failed for me:

$ systemctl status upower
● upower.service - Daemon for power management
Loaded: loaded (/lib/systemd/system/upower.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2024-05-24 12:23:49 UTC; 34s ago
Docs: man:upowerd(8)
Process: 390 ExecStart=/usr/libexec/upowerd (code=exited, status=217/USER)
Main PID: 390 (code=exited, status=217/USER)
CPU: 122ms

May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 5.
May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
May 24 12:23:49 igep systemd[1]: upower.service: Start request repeated too quickly.
May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.

$ journalctl -u upower
May 24 12:23:49 igep systemd[1]: Starting Daemon for power management...
May 24 12:23:49 igep systemd[404]: upower.service: Failed to set up user namespacing: Invalid argument
May 24 12:23:49 igep systemd[404]: upower.service: Failed at step USER spawning /usr/libexec/upowerd: Invalid argument
May 24 12:23:49 igep systemd[1]: upower.service: Main process exited, code=exited, status=217/USER
May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 1.
May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.

That lead me to https://gitlab.freedesktop.org/upower/upower/-/issues/104
and finally to systemd's README:

https://github.com/systemd/systemd/blob/main/README#L89C22-L89C34

But I'll investigate more if is upower or systemd to be blamed here...

> But specifically to you change, afair CONFIG_MEMCG and userns are
> unrelated so tying them together like this in the kconfig seems
> misguided.
>

Yes, but the config USER_NS help text already tieds them toghether:

help
This allows containers, i.e. vservers, to use user namespaces
to provide different user info for different servers.

When user namespaces are enabled in the kernel it is
recommended that the MEMCG option also be enabled and that
user-space use the memory control groups to limit the amount
of memory a memory unprivileged users can use.

And as mentioned in the commit message, it seems to be the reason why the
default for this Kconfig symbol is no. Maybe I misunderstood though or do
you think that could be switched unconditionally to 'default y' ?

Or is there a reason to be the only namespace to be default to no instead
of yes? Specially since important system services are trying to use it.

--
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat


2024-05-24 13:19:38

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] userns: Default to 'yes' when CONFIG_MEMCG option is enabled

On Fri, May 24, 2024 at 02:33:19PM +0200, Javier Martinez Canillas wrote:
> Christian Brauner <[email protected]> writes:
>
> Hello Christian,
>
> Thanks a lot for your feedback.
>
> > On Fri, May 24, 2024 at 10:24:16AM +0200, Javier Martinez Canillas wrote:
> >> The default value for the CONFIG_USER_NS Kconfig symbol changed over time.
> >>
> >> When first was introduced by commit acce292c82d4 ("user namespace: add the
> >> framework"), the default was 'no'. But then it was changed to 'yes' if the
> >> CONFIG_NAMESPACES option was enabled, by commit 17a6d4411a4d ("namespaces:
> >> default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected").
> >>
> >> Then, commit 5673a94c1457 ("userns: Add a Kconfig option to enforce strict
> >> kuid and kgid type checks") changed the default to 'no' again and selected
> >> the (now defunct) UIDGID_STRICT_TYPE_CHECKS option.
> >>
> >> This selected option was removed by commit 261000a56b63 ("userns: Remove
> >> UIDGID_STRICT_TYPE_CHECKS"), but CONFIG_USER_NS default was left to 'no'.
> >>
> >> Finally, the commit e11f0ae388f2 ("userns: Recommend use of memory control
> >> groups") added to the Kconfig symbol's help text a recommendation that the
> >> memory control groups should be used, to limit the amount of memory that a
> >> user who can create user namespaces can consume.
> >>
> >> Looking at the changes' history, a default to 'yes' when the CONFIG_MEMCG
> >> option is enabled seems like a sane thing to do. Specially since systemd
> >> requires user namespaces support for services that use the PrivateUsers=
> >> property in their unit files (e.g: the UPower daemon).
> >
> > Fyi, user namespaces are an entirely optional feature in systemd and it
> > gracefully falls back if they are not available with PrivateUsers= set.
> > If that isn't the case then it's a bug in systemd with PrivateUsers=
> > handling and should be reported.
> >
>
> Interesting, it definitely failed for me:
>
> $ systemctl status upower
> ● upower.service - Daemon for power management
> Loaded: loaded (/lib/systemd/system/upower.service; disabled; vendor preset: enabled)
> Active: failed (Result: exit-code) since Fri 2024-05-24 12:23:49 UTC; 34s ago
> Docs: man:upowerd(8)
> Process: 390 ExecStart=/usr/libexec/upowerd (code=exited, status=217/USER)
> Main PID: 390 (code=exited, status=217/USER)
> CPU: 122ms
>
> May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 5.
> May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
> May 24 12:23:49 igep systemd[1]: upower.service: Start request repeated too quickly.
> May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
> May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
>
> $ journalctl -u upower
> May 24 12:23:49 igep systemd[1]: Starting Daemon for power management...
> May 24 12:23:49 igep systemd[404]: upower.service: Failed to set up user namespacing: Invalid argument
> May 24 12:23:49 igep systemd[404]: upower.service: Failed at step USER spawning /usr/libexec/upowerd: Invalid argument
> May 24 12:23:49 igep systemd[1]: upower.service: Main process exited, code=exited, status=217/USER
> May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
> May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
> May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 1.
> May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
>
> That lead me to https://gitlab.freedesktop.org/upower/upower/-/issues/104
> and finally to systemd's README:
>
> https://github.com/systemd/systemd/blob/main/README#L89C22-L89C34
>
> But I'll investigate more if is upower or systemd to be blamed here...
>
> > But specifically to you change, afair CONFIG_MEMCG and userns are
> > unrelated so tying them together like this in the kconfig seems
> > misguided.
> >
>
> Yes, but the config USER_NS help text already tieds them toghether:
>
> help
> This allows containers, i.e. vservers, to use user namespaces
> to provide different user info for different servers.
>
> When user namespaces are enabled in the kernel it is
> recommended that the MEMCG option also be enabled and that

But then the patch your patch is the wrong way around and you should
make CONFIG_USER_NS select CONFIG_MEMCG. IOW, if you do userns, do memcg
but _not_ if you do memcg, do userns.

> user-space use the memory control groups to limit the amount
> of memory a memory unprivileged users can use.
>
> And as mentioned in the commit message, it seems to be the reason why the
> default for this Kconfig symbol is no. Maybe I misunderstood though or do
> you think that could be switched unconditionally to 'default y' ?

No, definitely not.

>
> Or is there a reason to be the only namespace to be default to no instead
> of yes? Specially since important system services are trying to use it.

Yes, it has a lot more security implications than all of the other ones
and makes them available to unprivileged users by default. So that
definitely requires a conscious choice.

>
> --
> Best regards,
>
> Javier Martinez Canillas
> Core Platforms
> Red Hat
>

2024-05-24 15:39:33

by Javier Martinez Canillas

[permalink] [raw]
Subject: Re: [PATCH] userns: Default to 'yes' when CONFIG_MEMCG option is enabled

Christian Brauner <[email protected]> writes:

> On Fri, May 24, 2024 at 02:33:19PM +0200, Javier Martinez Canillas wrote:
>> Christian Brauner <[email protected]> writes:
>>
>> Hello Christian,
>>
>> Thanks a lot for your feedback.
>>
>> > On Fri, May 24, 2024 at 10:24:16AM +0200, Javier Martinez Canillas wrote:
>> >> The default value for the CONFIG_USER_NS Kconfig symbol changed over time.
>> >>
>> >> When first was introduced by commit acce292c82d4 ("user namespace: add the
>> >> framework"), the default was 'no'. But then it was changed to 'yes' if the
>> >> CONFIG_NAMESPACES option was enabled, by commit 17a6d4411a4d ("namespaces:
>> >> default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected").
>> >>
>> >> Then, commit 5673a94c1457 ("userns: Add a Kconfig option to enforce strict
>> >> kuid and kgid type checks") changed the default to 'no' again and selected
>> >> the (now defunct) UIDGID_STRICT_TYPE_CHECKS option.
>> >>
>> >> This selected option was removed by commit 261000a56b63 ("userns: Remove
>> >> UIDGID_STRICT_TYPE_CHECKS"), but CONFIG_USER_NS default was left to 'no'.
>> >>
>> >> Finally, the commit e11f0ae388f2 ("userns: Recommend use of memory control
>> >> groups") added to the Kconfig symbol's help text a recommendation that the
>> >> memory control groups should be used, to limit the amount of memory that a
>> >> user who can create user namespaces can consume.
>> >>
>> >> Looking at the changes' history, a default to 'yes' when the CONFIG_MEMCG
>> >> option is enabled seems like a sane thing to do. Specially since systemd
>> >> requires user namespaces support for services that use the PrivateUsers=
>> >> property in their unit files (e.g: the UPower daemon).
>> >
>> > Fyi, user namespaces are an entirely optional feature in systemd and it
>> > gracefully falls back if they are not available with PrivateUsers= set.
>> > If that isn't the case then it's a bug in systemd with PrivateUsers=
>> > handling and should be reported.
>> >
>>
>> Interesting, it definitely failed for me:
>>
>> $ systemctl status upower
>> ● upower.service - Daemon for power management
>> Loaded: loaded (/lib/systemd/system/upower.service; disabled; vendor preset: enabled)
>> Active: failed (Result: exit-code) since Fri 2024-05-24 12:23:49 UTC; 34s ago
>> Docs: man:upowerd(8)
>> Process: 390 ExecStart=/usr/libexec/upowerd (code=exited, status=217/USER)
>> Main PID: 390 (code=exited, status=217/USER)
>> CPU: 122ms
>>
>> May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 5.
>> May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
>> May 24 12:23:49 igep systemd[1]: upower.service: Start request repeated too quickly.
>> May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
>> May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
>>
>> $ journalctl -u upower
>> May 24 12:23:49 igep systemd[1]: Starting Daemon for power management...
>> May 24 12:23:49 igep systemd[404]: upower.service: Failed to set up user namespacing: Invalid argument
>> May 24 12:23:49 igep systemd[404]: upower.service: Failed at step USER spawning /usr/libexec/upowerd: Invalid argument
>> May 24 12:23:49 igep systemd[1]: upower.service: Main process exited, code=exited, status=217/USER
>> May 24 12:23:49 igep systemd[1]: upower.service: Failed with result 'exit-code'.
>> May 24 12:23:49 igep systemd[1]: Failed to start Daemon for power management.
>> May 24 12:23:49 igep systemd[1]: upower.service: Scheduled restart job, restart counter is at 1.
>> May 24 12:23:49 igep systemd[1]: Stopped Daemon for power management.
>>
>> That lead me to https://gitlab.freedesktop.org/upower/upower/-/issues/104
>> and finally to systemd's README:
>>
>> https://github.com/systemd/systemd/blob/main/README#L89C22-L89C34
>>
>> But I'll investigate more if is upower or systemd to be blamed here...
>>
>> > But specifically to you change, afair CONFIG_MEMCG and userns are
>> > unrelated so tying them together like this in the kconfig seems
>> > misguided.
>> >
>>
>> Yes, but the config USER_NS help text already tieds them toghether:
>>
>> help
>> This allows containers, i.e. vservers, to use user namespaces
>> to provide different user info for different servers.
>>
>> When user namespaces are enabled in the kernel it is
>> recommended that the MEMCG option also be enabled and that
>
> But then the patch your patch is the wrong way around and you should
> make CONFIG_USER_NS select CONFIG_MEMCG. IOW, if you do userns, do memcg
> but _not_ if you do memcg, do userns.
>

You are right.

>> user-space use the memory control groups to limit the amount
>> of memory a memory unprivileged users can use.
>>
>> And as mentioned in the commit message, it seems to be the reason why the
>> default for this Kconfig symbol is no. Maybe I misunderstood though or do
>> you think that could be switched unconditionally to 'default y' ?
>
> No, definitely not.
>

Ok.

>>
>> Or is there a reason to be the only namespace to be default to no instead
>> of yes? Specially since important system services are trying to use it.
>
> Yes, it has a lot more security implications than all of the other ones
> and makes them available to unprivileged users by default. So that
> definitely requires a conscious choice.
>

Got it. Thanks for the explanation. Maybe the option help text could be a
little bit more verbose about it? Since it wasn't clear, at least to me.

Let's discard this patch then and sorry for the noise.

--
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat