2021-08-20 04:36:43

by Jordy Zomer

[permalink] [raw]
Subject: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

When a secret memory region is active, memfd_secret disables
hibernation. One of the goals is to keep the secret data from being
written to persistent-storage.

It accomplishes this by maintaining a reference count to
`secretmem_users`. Once this reference is held your system can not be
hibernated due to the check in `hibernation_available()`. However,
because `secretmem_users` is of type `atomic_t`, reference counter
overflows are possible.

As you can see there's an `atomic_inc` for each `memfd` that is opened
in the `memfd_secret` syscall. If a local attacker succeeds to open 2^32
memfd's, the counter will wrap around to 0. This implies that you may
hibernate again, even though there are still regions of this secret
memory, thereby bypassing the security check.

In an attempt to fix this I have used `refcount_t` instead of `atomic_t`
which prevents reference counter overflows.

Signed-off-by: Jordy Zomer <[email protected]>
---
mm/secretmem.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 030f02ddc7c1..1fea68b8d5a6 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -18,6 +18,7 @@
#include <linux/secretmem.h>
#include <linux/set_memory.h>
#include <linux/sched/signal.h>
+#include <linux/refcount.h>

#include <uapi/linux/magic.h>

@@ -40,11 +41,11 @@ module_param_named(enable, secretmem_enable, bool, 0400);
MODULE_PARM_DESC(secretmem_enable,
"Enable secretmem and memfd_secret(2) system call");

-static atomic_t secretmem_users;
+static refcount_t secretmem_users;

bool secretmem_active(void)
{
- return !!atomic_read(&secretmem_users);
+ return !!refcount_read(&secretmem_users);
}

static vm_fault_t secretmem_fault(struct vm_fault *vmf)
@@ -103,7 +104,7 @@ static const struct vm_operations_struct secretmem_vm_ops = {

static int secretmem_release(struct inode *inode, struct file *file)
{
- atomic_dec(&secretmem_users);
+ refcount_dec(&secretmem_users);
return 0;
}

@@ -217,7 +218,7 @@ SYSCALL_DEFINE1(memfd_secret, unsigned int, flags)
file->f_flags |= O_LARGEFILE;

fd_install(fd, file);
- atomic_inc(&secretmem_users);
+ refcount_inc(&secretmem_users);
return fd;

err_put_fd:
--
2.27.0


2021-08-20 05:36:13

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

On Fri, Aug 20, 2021 at 06:33:38AM +0200, Jordy Zomer wrote:
> When a secret memory region is active, memfd_secret disables
> hibernation. One of the goals is to keep the secret data from being
> written to persistent-storage.
>
> It accomplishes this by maintaining a reference count to
> `secretmem_users`. Once this reference is held your system can not be
> hibernated due to the check in `hibernation_available()`. However,
> because `secretmem_users` is of type `atomic_t`, reference counter
> overflows are possible.

It's an unlikely condition to hit given max-open-fds, etc, but there's
no reason to leave this weakness. Changing this to refcount_t is easy
and better than using atomic_t.

Reviewed-by: Kees Cook <[email protected]>

> As you can see there's an `atomic_inc` for each `memfd` that is opened
> in the `memfd_secret` syscall. If a local attacker succeeds to open 2^32
> memfd's, the counter will wrap around to 0. This implies that you may
> hibernate again, even though there are still regions of this secret
> memory, thereby bypassing the security check.

IMO, this hibernation check is also buggy, since it looks to be
vulnerable to ToCToU: processes aren't frozen when
hibernation_available() checks secretmem_users(), so a process could add
one and fill it before the process freezer stops it.

And of course, there's still the ptrace hole[1], which is think is quite
serious as it renders the entire defense moot.

-Kees

[1] https://lore.kernel.org/lkml/202105071620.E834B1FA92@keescook/

--
Kees Cook

2021-08-20 15:05:05

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

On Fri, 2021-08-20 at 06:33 +0200, Jordy Zomer wrote:
> As you can see there's an `atomic_inc` for each `memfd` that is
> opened in the `memfd_secret` syscall. If a local attacker succeeds to
> open 2^32 memfd's, the counter will wrap around to 0. This implies
> that you may hibernate again, even though there are still regions of
> this secret memory, thereby bypassing the security check.

This isn't a possible attack, is it? secret memory is per process and
each process usually has an open fd limit of 1024. That's not to say
we shouldn't have overflow protection just in case, but I think today
we don't have a problem.

James


2021-08-20 16:13:26

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

On Fri, Aug 20, 2021 at 07:57:25AM -0700, James Bottomley wrote:
> On Fri, 2021-08-20 at 06:33 +0200, Jordy Zomer wrote:
> > As you can see there's an `atomic_inc` for each `memfd` that is
> > opened in the `memfd_secret` syscall. If a local attacker succeeds to
> > open 2^32 memfd's, the counter will wrap around to 0. This implies
> > that you may hibernate again, even though there are still regions of
> > this secret memory, thereby bypassing the security check.
>
> This isn't a possible attack, is it? secret memory is per process and
> each process usually has an open fd limit of 1024. That's not to say
> we shouldn't have overflow protection just in case, but I think today
> we don't have a problem.

But it's a _global_ setting, so it's still possible, though likely
impractical today. But refcount_t mitigates it and is a trivial change.
:)

--
Kees Cook

2021-08-20 16:41:00

by Jordy Zomer

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

Hi There!

Because this is a global variable, it appears to be exploitable. Either we generate a sufficient number of processes to achieve this counter, or you increase the open file limit with ulimit or sysctl. Unless the kernel has a hard restriction on the number of potential file descriptors that I'm not aware of.

In any case, it's probably a good idea to patch this to make it explicitly secure. If you discover a hard-limit in the kernel for open file descriptors, please let me know. I'm genuinely ​interested :D!

Best Regards,

Jordy

> On 08/20/2021 12:05 PM Kees Cook <[email protected]> wrote:
>
>
> On Fri, Aug 20, 2021 at 07:57:25AM -0700, James Bottomley wrote:
> > On Fri, 2021-08-20 at 06:33 +0200, Jordy Zomer wrote:
> > > As you can see there's an `atomic_inc` for each `memfd` that is
> > > opened in the `memfd_secret` syscall. If a local attacker succeeds to
> > > open 2^32 memfd's, the counter will wrap around to 0. This implies
> > > that you may hibernate again, even though there are still regions of
> > > this secret memory, thereby bypassing the security check.
> >
> > This isn't a possible attack, is it? secret memory is per process and
> > each process usually has an open fd limit of 1024. That's not to say
> > we shouldn't have overflow protection just in case, but I think today
> > we don't have a problem.
>
> But it's a _global_ setting, so it's still possible, though likely
> impractical today. But refcount_t mitigates it and is a trivial change.
> :)
>
> --
> Kees Cook

2021-08-20 19:41:23

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

On Fri, 2021-08-20 at 12:38 -0400, Jordy Zomer wrote:
> Hi There!
>
> Because this is a global variable, it appears to be exploitable.
> Either we generate a sufficient number of processes to achieve this
> counter, or you increase the open file limit with ulimit or sysctl.
> Unless the kernel has a hard restriction on the number of potential
> file descriptors that I'm not aware of.

There's no direct global counter for file descriptors, no; however,
there is an indirect limit: the number of processes per user which is
now defaulting to around 65535, so even a fork bomb opening the max
number of fds won't get you a wrap.

> In any case, it's probably a good idea to patch this to make it
> explicitly secure. If you discover a hard-limit in the kernel for
> open file descriptors, please let me know. I'm genuinely interested
> :D!

I didn't disagree it might be a useful think to update ... I just
didn't think it was currently exploitable.

James


2021-08-24 14:07:20

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

On Thu, Aug 19, 2021 at 10:33:49PM -0700, Kees Cook wrote:
> On Fri, Aug 20, 2021 at 06:33:38AM +0200, Jordy Zomer wrote:
> > When a secret memory region is active, memfd_secret disables
> > hibernation. One of the goals is to keep the secret data from being
> > written to persistent-storage.
> >
> > It accomplishes this by maintaining a reference count to
> > `secretmem_users`. Once this reference is held your system can not be
> > hibernated due to the check in `hibernation_available()`. However,
> > because `secretmem_users` is of type `atomic_t`, reference counter
> > overflows are possible.
>
> It's an unlikely condition to hit given max-open-fds, etc, but there's
> no reason to leave this weakness. Changing this to refcount_t is easy
> and better than using atomic_t.
>
> Reviewed-by: Kees Cook <[email protected]>
>
> > As you can see there's an `atomic_inc` for each `memfd` that is opened
> > in the `memfd_secret` syscall. If a local attacker succeeds to open 2^32
> > memfd's, the counter will wrap around to 0. This implies that you may
> > hibernate again, even though there are still regions of this secret
> > memory, thereby bypassing the security check.
>
> IMO, this hibernation check is also buggy, since it looks to be
> vulnerable to ToCToU: processes aren't frozen when
> hibernation_available() checks secretmem_users(), so a process could add
> one and fill it before the process freezer stops it.
>
> And of course, there's still the ptrace hole[1], which is think is quite
> serious as it renders the entire defense moot.

I thought about what can be done here and could not come up with anything
better that prevent PTRACE on a process with secretmem, but this seems to
me too much from usability vs security POV.

Protecting against root is always hard and secretmem anyway does not
provide 100% guarantee by itself but rather makes an accidental data leak
or non-target attack much harder.

To be effective it also presumes that other hardening features are turned
on by the system administrator on production systems, so it's not
unrealistic to rely on ptrace being disabled.

--
Sincerely yours,
Mike.

2021-10-21 09:03:36

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

On Tue, 24 Aug 2021 at 16:06, Mike Rapoport <[email protected]> wrote:
>
> On Thu, Aug 19, 2021 at 10:33:49PM -0700, Kees Cook wrote:
> > On Fri, Aug 20, 2021 at 06:33:38AM +0200, Jordy Zomer wrote:
> > > When a secret memory region is active, memfd_secret disables
> > > hibernation. One of the goals is to keep the secret data from being
> > > written to persistent-storage.
> > >
> > > It accomplishes this by maintaining a reference count to
> > > `secretmem_users`. Once this reference is held your system can not be
> > > hibernated due to the check in `hibernation_available()`. However,
> > > because `secretmem_users` is of type `atomic_t`, reference counter
> > > overflows are possible.
> >
> > It's an unlikely condition to hit given max-open-fds, etc, but there's
> > no reason to leave this weakness. Changing this to refcount_t is easy
> > and better than using atomic_t.
> >
> > Reviewed-by: Kees Cook <[email protected]>
> >
> > > As you can see there's an `atomic_inc` for each `memfd` that is opened
> > > in the `memfd_secret` syscall. If a local attacker succeeds to open 2^32
> > > memfd's, the counter will wrap around to 0. This implies that you may
> > > hibernate again, even though there are still regions of this secret
> > > memory, thereby bypassing the security check.
> >
> > IMO, this hibernation check is also buggy, since it looks to be
> > vulnerable to ToCToU: processes aren't frozen when
> > hibernation_available() checks secretmem_users(), so a process could add
> > one and fill it before the process freezer stops it.
> >
> > And of course, there's still the ptrace hole[1], which is think is quite
> > serious as it renders the entire defense moot.
>
> I thought about what can be done here and could not come up with anything
> better that prevent PTRACE on a process with secretmem, but this seems to
> me too much from usability vs security POV.
>
> Protecting against root is always hard and secretmem anyway does not
> provide 100% guarantee by itself but rather makes an accidental data leak
> or non-target attack much harder.
>
> To be effective it also presumes that other hardening features are turned
> on by the system administrator on production systems, so it's not
> unrealistic to rely on ptrace being disabled.

Hi,

The issue existed before this change, but I think refcount_inc needs
to be done before fd_install. After fd_install finishes, the fd can be
used by userspace and we can have secret data in memory before the
refcount_inc.

A straightforward mis-use where a user will predict the returned fd in
another thread before the syscall returns and will use it to store
secret data is somewhat dubious because such a user just shoots
themself in the foot.

But a more interesting mis-used would be to close the predicted fd and
decrement the refcount before the corresponding refcount_inc, this way
one can briefly drop the refcount to zero while there are other users
of secretmem.