2020-04-24 15:41:57

by Chris Down

Subject: PSI poll() support for unprivileged users

Hi Suren,

I noticed that one restriction of the PSI poll() interface is that currently
only root can set up new triggers. Talking to Johannes, it seems the reason for
this was that you end up with a realtime kernel thread for every cgroup where a
trigger is set, and this could be used by unprivileged users to sap resources.

I'm building a userspace daemon for desktop users which notifies based on
pressure events, and it's particularly janky to ask people to run such a
notifier as root: the notification mechanism is usually tied to the user's
display server auth, and the surrounding environment is generally pretty
important to maintain. In addition to this, just in general this doesn't feel
like the kind of feature that by its nature needs to be restricted to root --
it seems reasonable that there would be unprivileged users which want to use
this, and that not using RT threads would be acceptable in that scenario.

Have you considered making the per-cgroup RT threads optional? If the
processing isn't done in the FIFO kthread for unprivileged users, I think it
should be safe to allow them to write to pressure files (perhaps with some
additional limits or restrictions on things like the interval, as needed).

Thanks!

Chris


2020-04-24 19:47:31

by Suren Baghdasaryan

Subject: Re: PSI poll() support for unprivileged users

On Fri, Apr 24, 2020 at 8:39 AM Chris Down wrote:
>
> Hi Suren,

Hi Chris,

>
> I noticed that one restriction of the PSI poll() interface is that currently
> only root can set up new triggers. Talking to Johannes, it seems the reason for
> this was that you end up with a realtime kernel thread for every cgroup where a
> trigger is set, and this could be used by unprivileged users to sap resources.
>

This reasoning is correct and IIRC the enforcement of this is just the
way /proc/pressure files are created:

proc_create("pressure/io", 0, NULL, &psi_io_fops);
proc_create("pressure/memory", 0, NULL, &psi_memory_fops);
proc_create("pressure/cpu", 0, NULL, &psi_cpu_fops);

IOW there are no additional capability checks performed on the PSI
trigger users.
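
For readers unfamiliar with the interface, here is a minimal userspace sketch of what "setting up a trigger" involves: open a pressure file for writing, write a spec of the form "<some|full> <stall threshold in us> <window in us>", then poll() the descriptor for POLLPRI. The helper names (psi_format_trigger, psi_register_trigger, psi_wait_event) are made up for illustration and are not part of any kernel or libc API. The sketch assumes that an unprivileged caller simply fails at open() because the file is not writable for it.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: build a trigger spec such as "some 150000 1000000"
 * (stall type, stall threshold in us, window in us).
 * Returns the string length, or -1 if the buffer is too small. */
int psi_format_trigger(char *buf, size_t len, const char *type,
                       unsigned int threshold_us, unsigned int window_us)
{
    assert(buf != NULL && type != NULL);
    int n = snprintf(buf, len, "%s %u %u", type, threshold_us, window_us);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Hypothetical helper: open a pressure file and register one trigger on it.
 * Returns the open fd, or -1 on failure (for example EACCES when the caller
 * has no write permission on the file). */
int psi_register_trigger(const char *path, const char *type,
                         unsigned int threshold_us, unsigned int window_us)
{
    char spec[64];
    int n = psi_format_trigger(spec, sizeof(spec), type,
                               threshold_us, window_us);
    if (n < 0)
        return -1;

    int fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* The spec is written including its terminating NUL byte. */
    if (write(fd, spec, (size_t)n + 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait for one trigger event.
 * Returns 1 on an event, 0 on timeout, -1 on error. */
int psi_wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int n = poll(&pfd, 1, timeout_ms);

    if (n <= 0)
        return n;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would use something like psi_register_trigger("/proc/pressure/memory", "some", 150000, 1000000) and then loop on psi_wait_event(). The trigger stays registered for as long as the descriptor is open.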

> I'm building a userspace daemon for desktop users which notifies based on
> pressure events, and it's particularly janky to ask people to run such a
> notifier as root: the notification mechanism is usually tied to the user's
> display server auth, and the surrounding environment is generally pretty
> important to maintain. In addition to this, just in general this doesn't feel
> like the kind of feature that by its nature needs to be restricted to root --
> it seems reasonable that there would be unprivileged users which want to use
> this, and that not using RT threads would be acceptable in that scenario.

For these cases you can provide a userspace privileged daemon that
will relay pressure notifications to its unprivileged clients. This is
what we do on Android - Android Management Server registers its PSI
triggers and then relays low memory notifications to unprivileged
apps.
Another approach is taken by Android Low Memory Killer Daemon (lmkd),
which is an unprivileged process but registers its own PSI triggers. The
trick is that the init process executes "chmod 0664
/proc/pressure/memory" from its init script, and further restrictions
are enforced by an SELinux policy granting only LMKD write access to
this file.
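
As a rough illustration of the relay approach, the sketch below shows only the fan-out step a privileged daemon would perform once its own trigger fires: it writes a short notification to each connected client descriptor. The function name relay_pressure_event and the message format are invented for this example. How clients connect (Unix socket, binder, D-Bus and so on) is deliberately left out, and this is not how the Android services are actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical fan-out step of a privileged relay daemon: send one
 * notification to every unprivileged client.
 * Returns the number of clients that received the whole message. */
size_t relay_pressure_event(const int *client_fds, size_t nr_clients,
                            const char *msg)
{
    assert(client_fds != NULL && msg != NULL);

    size_t len = strlen(msg);
    size_t delivered = 0;

    for (size_t i = 0; i < nr_clients; i++) {
        /* A failed or short write just skips that client; a real daemon
         * would probably drop the client and ignore SIGPIPE. */
        ssize_t n = write(client_fds[i], msg, len);
        if (n == (ssize_t)len)
            delivered++;
    }
    return delivered;
}
```

Only the relay needs write access to the pressure file. Clients see nothing but the descriptors the daemon hands them.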

Would any of these options work for you?

> Have you considered making the per-cgroup RT threads optional? If the
> processing isn't done in the FIFO kthread for unprivileged users, I think it
> should be safe to allow them to write to pressure files (perhaps with some
> additional limits or restrictions on things like the interval, as needed).

I didn't consider that as I viewed memory condition tracking that
consumes kernel resources as being potentially exploitable. RT threads
did make that more of an issue but even without them I'm not sure we
should allow unprivileged processes to create unlimited numbers of
triggers each of which is not really free.

>
> Thanks!
>
> Chris

Thanks,
Suren.

2020-04-24 22:50:52

by Suren Baghdasaryan

Subject: Re: PSI poll() support for unprivileged users

On Fri, Apr 24, 2020 at 12:43 PM Suren Baghdasaryan wrote:
>
> On Fri, Apr 24, 2020 at 8:39 AM Chris Down wrote:
> >
> > Hi Suren,
>
> Hi Chris,
>
> >
> > I noticed that one restriction of the PSI poll() interface is that currently
> > only root can set up new triggers. Talking to Johannes, it seems the reason for
> > this was that you end up with a realtime kernel thread for every cgroup where a
> > trigger is set, and this could be used by unprivileged users to sap resources.
> >
>
> This reasoning is correct and IIRC the enforcement of this is just the
> way /proc/pressure files are created:
>
> proc_create("pressure/io", 0, NULL, &psi_io_fops);
> proc_create("pressure/memory", 0, NULL, &psi_memory_fops);
> proc_create("pressure/cpu", 0, NULL, &psi_cpu_fops);
>
> IOW there are no additional capability checks performed on the PSI
> trigger users.
>
> > I'm building a userspace daemon for desktop users which notifies based on
> > pressure events, and it's particularly janky to ask people to run such a
> > notifier as root: the notification mechanism is usually tied to the user's
> > display server auth, and the surrounding environment is generally pretty
> > important to maintain. In addition to this, just in general this doesn't feel
> > like the kind of feature that by its nature needs to be restricted to root --
> > it seems reasonable that there would be unprivileged users which want to use
> > this, and that not using RT threads would be acceptable in that scenario.
>
> For these cases you can provide a userspace privileged daemon that
> will relay pressure notifications to its unprivileged clients. This is
> what we do on Android - Android Management Server registers its PSI
> triggers and then relays low memory notifications to unprivileged
> apps.
> Another approach is taken by Android Low Memory Killer Daemon (lmkd)
> which is an unprivileged process but registers its PSI triggers. The
> trick is that the init process executes "chmod 0664
> /proc/pressure/memory" from its init script and further restrictions
> are enforced by selinux policy granting only LMKD write access to this
> file.
>
> Would any of these options work for you?
>
> > Have you considered making the per-cgroup RT threads optional? If the
> > processing isn't done in the FIFO kthread for unprivileged users, I think it
> > should be safe to allow them to write to pressure files (perhaps with some
> > additional limits or restrictions on things like the interval, as needed).
>
> I didn't consider that as I viewed memory condition tracking that
> consumes kernel resources as being potentially exploitable. RT threads
> did make that more of an issue but even without them I'm not sure we
> should allow unprivileged processes to create unlimited numbers of
> triggers each of which is not really free.

Thinking some more about this. LMKD in the above-mentioned use case is
not a privileged process, but it is granted access to PSI triggers by a
privileged init process plus sepolicy, and it needs RT threads to react
to memory pressure promptly without being preempted. If we allowed only
privileged users to have RT threads for PSI triggers, that requirement
would break this scenario: LMKD would no longer be able to use RT
threads.

>
> >
> > Thanks!
> >
> > Chris
>
> Thanks,
> Suren.

2020-04-28 11:37:46

by Chris Down

Subject: Re: PSI poll() support for unprivileged users

Hey Suren,

Suren Baghdasaryan writes:
>> > I'm building a userspace daemon for desktop users which notifies based on
>> > pressure events, and it's particularly janky to ask people to run such a
>> > notifier as root: the notification mechanism is usually tied to the user's
>> > display server auth, and the surrounding environment is generally pretty
>> > important to maintain. In addition to this, just in general this doesn't feel
>> > like the kind of feature that by its nature needs to be restricted to root --
>> > it seems reasonable that there would be unprivileged users which want to use
>> > this, and that not using RT threads would be acceptable in that scenario.
>>
>> For these cases you can provide a userspace privileged daemon that
>> will relay pressure notifications to its unprivileged clients. This is
>> what we do on Android - Android Management Server registers its PSI
>> triggers and then relays low memory notifications to unprivileged
>> apps.
>> Another approach is taken by Android Low Memory Killer Daemon (lmkd)
>> which is an unprivileged process but registers its PSI triggers. The
>> trick is that the init process executes "chmod 0664
>> /proc/pressure/memory" from its init script and further restrictions
>> are enforced by selinux policy granting only LMKD write access to this
>> file.
>>
>> Would any of these options work for you?

Hmm, I think these are reasonable options when you have control over the
system, but not so great if you don't. For example, I want to get pressure
notifications for my logind seat, but that doesn't necessarily imply that I
have administrative access to the machine.

>> > Have you considered making the per-cgroup RT threads optional? If the
>> > processing isn't done in the FIFO kthread for unprivileged users, I think it
>> > should be safe to allow them to write to pressure files (perhaps with some
>> > additional limits or restrictions on things like the interval, as needed).
>>
>> I didn't consider that as I viewed memory condition tracking that
>> consumes kernel resources as being potentially exploitable. RT threads
>> did make that more of an issue but even without them I'm not sure we
>> should allow unprivileged processes to create unlimited numbers of
>> triggers each of which is not really free.

There's precedent for similar issues elsewhere in the kernel, e.g. rates
for some ICMP packets, where we enforce a static limit in the kernel for
unprivileged users. I'd imagine we can do something similar here, too.
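
To make the idea of a static limit concrete, here is a small userspace model of the kind of check that could gate trigger creation for unprivileged users. Everything in it is hypothetical: the constants, the psi_user_quota structure and both function names are invented for illustration, and nothing like this exists in the kernel as discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical static limits for unprivileged users. */
#define PSI_UNPRIV_MAX_TRIGGERS   4u        /* triggers per user */
#define PSI_UNPRIV_MIN_WINDOW_US  2000000u  /* shortest window: 2 s */

/* Hypothetical per-user accounting. */
struct psi_user_quota {
    unsigned int nr_triggers;
};

/* Decide whether a new trigger may be created. Privileged callers are
 * not limited and not counted; unprivileged callers must respect the
 * minimum window and the per-user trigger count. */
bool psi_trigger_allowed(struct psi_user_quota *q, bool privileged,
                         unsigned int window_us)
{
    assert(q != NULL);

    if (privileged)
        return true;
    if (window_us < PSI_UNPRIV_MIN_WINDOW_US)
        return false;
    if (q->nr_triggers >= PSI_UNPRIV_MAX_TRIGGERS)
        return false;

    q->nr_triggers++;
    return true;
}

/* Give one slot back when an unprivileged trigger is destroyed. */
void psi_trigger_released(struct psi_user_quota *q)
{
    assert(q != NULL);
    if (q->nr_triggers > 0)
        q->nr_triggers--;
}
```

A cap like this bounds how many triggers an unprivileged user can hold and how often they are evaluated. It does not address the separate question of which users get RT threads.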

>Thinking some more about this. LMKD in the above-mentioned usecase is
>not a privileged process but it is granted access to PSI triggers by a
>privileged init process+sepolicy and it needs RT threads to react to
>memory pressure promptly without being preempted. If we allow only the
>privileged users to have RT threads for PSI triggers then that
>requirement would break this scenario and LMKD won't be able to use RT
>threads.

Well, fiddlesticks :-)

If we needed to have both, I don't know what the interface would look like, but
yes, it sounds overcomplicated. I'll think about it some more.

Thanks,

Chris

2020-04-28 18:33:53

by Suren Baghdasaryan

Subject: Re: PSI poll() support for unprivileged users

On Tue, Apr 28, 2020 at 4:34 AM Chris Down wrote:
>
> Hey Suren,
>
> Suren Baghdasaryan writes:
> >> > I'm building a userspace daemon for desktop users which notifies based on
> >> > pressure events, and it's particularly janky to ask people to run such a
> >> > notifier as root: the notification mechanism is usually tied to the user's
> >> > display server auth, and the surrounding environment is generally pretty
> >> > important to maintain. In addition to this, just in general this doesn't feel
> >> > like the kind of feature that by its nature needs to be restricted to root --
> >> > it seems reasonable that there would be unprivileged users which want to use
> >> > this, and that not using RT threads would be acceptable in that scenario.
> >>
> >> For these cases you can provide a userspace privileged daemon that
> >> will relay pressure notifications to its unprivileged clients. This is
> >> what we do on Android - Android Management Server registers its PSI
> >> triggers and then relays low memory notifications to unprivileged
> >> apps.
> >> Another approach is taken by Android Low Memory Killer Daemon (lmkd)
> >> which is an unprivileged process but registers its PSI triggers. The
> >> trick is that the init process executes "chmod 0664
> >> /proc/pressure/memory" from its init script and further restrictions
> >> are enforced by selinux policy granting only LMKD write access to this
> >> file.
> >>
> >> Would any of these options work for you?
>
> Hmm, I think these are reasonable options when you have control over the
> system, but not so great if you don't. For example, I want to get pressure
> notifications for my logind seat, but that doesn't necessarily imply that I
> have administrative access to the machine.
>
> >> > Have you considered making the per-cgroup RT threads optional? If the
> >> > processing isn't done in the FIFO kthread for unprivileged users, I think it
> >> > should be safe to allow them to write to pressure files (perhaps with some
> >> > additional limits or restrictions on things like the interval, as needed).
> >>
> >> I didn't consider that as I viewed memory condition tracking that
> >> consumes kernel resources as being potentially exploitable. RT threads
> >> did make that more of an issue but even without them I'm not sure we
> >> should allow unprivileged processes to create unlimited numbers of
> >> triggers each of which is not really free.
>
> There's precedent for other similar issues like this in the kernel, eg. rates
> for some ICMP packets, where we enforce a static limit in the kernel for
> unprivileged users. I'd imagine we can do something similar here, too.
>
> >Thinking some more about this. LMKD in the above-mentioned usecase is
> >not a privileged process but it is granted access to PSI triggers by a
> >privileged init process+sepolicy and it needs RT threads to react to
> >memory pressure promptly without being preempted. If we allow only the
> >privileged users to have RT threads for PSI triggers then that
> >requirement would break this scenario and LMKD won't be able to use RT
> >threads.
>
> Well, fiddlesticks :-)
>
> If we needed to have both, I don't know what the interface would look like, but
> yes, it sounds overcomplicated. I'll think about it some more.

Yeah, the only idea I could come up with was to tie RT thread usage to
some SELinux policy instead of using file permissions or being root.
But I have too little experience with SELinux to tell you whether
there might be issues with such an approach.

>
> Thanks,
>
> Chris