2018-03-16 21:50:33

by Dave Hansen

Subject: [PATCH 0/3] x86, pkeys: make pkey 0 more normal

This restores pkey 0 to more of a state of normalcy and adds a
new test in the pkeys selftest to make sure it stays that way.

Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>


2018-03-16 21:50:46

by Dave Hansen

Subject: [PATCH 2/3] x86, pkeys, selftests: save off 'prot' for allocations


From: Dave Hansen <[email protected]>

This makes it possible to tell what 'prot' a given allocation
is supposed to have. That way, if we want to change just the
pkey, we know what 'prot' to pass to mprotect_pkey().

Also, keep a record of the most recent allocation so the tests
can easily find it.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-store-malloc-record tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-store-malloc-record 2018-03-16 14:46:39.582285474 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-16 14:46:39.585285474 -0700
@@ -702,10 +702,12 @@ int mprotect_pkey(void *ptr, size_t size
struct pkey_malloc_record {
void *ptr;
long size;
+ int prot;
};
struct pkey_malloc_record *pkey_malloc_records;
+struct pkey_malloc_record *pkey_last_malloc_record;
long nr_pkey_malloc_records;
-void record_pkey_malloc(void *ptr, long size)
+void record_pkey_malloc(void *ptr, long size, int prot)
{
long i;
struct pkey_malloc_record *rec = NULL;
@@ -737,6 +739,8 @@ void record_pkey_malloc(void *ptr, long
(int)(rec - pkey_malloc_records), rec, ptr, size);
rec->ptr = ptr;
rec->size = size;
+ rec->prot = prot;
+ pkey_last_malloc_record = rec;
nr_pkey_malloc_records++;
}

@@ -781,7 +785,7 @@ void *malloc_pkey_with_mprotect(long siz
pkey_assert(ptr != (void *)-1);
ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
pkey_assert(!ret);
- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);
rdpkru();

dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
@@ -802,7 +806,7 @@ void *malloc_pkey_anon_huge(long size, i
size = ALIGN_UP(size, HPAGE_SIZE * 2);
ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
pkey_assert(ptr != (void *)-1);
- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);
mprotect_pkey(ptr, size, prot, pkey);

dprintf1("unaligned ptr: %p\n", ptr);
@@ -875,7 +879,7 @@ void *malloc_pkey_hugetlb(long size, int
pkey_assert(ptr != (void *)-1);
mprotect_pkey(ptr, size, prot, pkey);

- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);

dprintf1("mmap()'d hugetlbfs for pkey %d @ %p\n", pkey, ptr);
return ptr;
@@ -897,7 +901,7 @@ void *malloc_pkey_mmap_dax(long size, in

mprotect_pkey(ptr, size, prot, pkey);

- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);

dprintf1("mmap()'d for pkey %d @ %p\n", pkey, ptr);
close(fd);
_
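
The bookkeeping this patch adds can be sketched outside the selftest as follows. This is a minimal model, not the selftest code: the fixed-size table, MAX_RECORDS, and the names records, last_record, nr_records and record_alloc() are hypothetical stand-ins for the dynamically grown pkey_malloc_records array and record_pkey_malloc().

```c
#include <stddef.h>

#define MAX_RECORDS 16	/* hypothetical fixed capacity for this sketch */

struct pkey_malloc_record {
	void *ptr;
	long size;
	int prot;	/* protection bits the allocation was created with */
};

static struct pkey_malloc_record records[MAX_RECORDS];
static struct pkey_malloc_record *last_record;	/* most recent allocation */
static long nr_records;

/* Remember an allocation; returns the record, or NULL if the table is full. */
static struct pkey_malloc_record *record_alloc(void *ptr, long size, int prot)
{
	struct pkey_malloc_record *rec;

	if (nr_records >= MAX_RECORDS)
		return NULL;
	rec = &records[nr_records++];
	rec->ptr = ptr;
	rec->size = size;
	rec->prot = prot;	/* saved so a later pkey-only change can reuse it */
	last_record = rec;	/* lets a test find the newest allocation */
	return rec;
}
```

A test that only wants to change the pkey can then read last_record->prot and last_record->size rather than guessing how the region was mapped.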

2018-03-16 21:51:34

by Dave Hansen

Subject: [PATCH 3/3] x86, pkeys, selftests: add a test for pkey 0


From: Dave Hansen <[email protected]>

Protection key 0 is the default key for all memory and will
not normally come back from pkey_alloc(). But, you might
still want to pass it to mprotect_pkey().

This check ensures that you can use pkey 0.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 30 ++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-with-pkey-0-test tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-with-pkey-0-test 2018-03-16 14:46:40.121285473 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-16 14:46:40.125285473 -0700
@@ -1197,6 +1197,35 @@ void test_pkey_alloc_exhaust(int *ptr, u
}
}

+/*
+ * pkey 0 is special. It is allocated by default, so you do not
+ * have to call pkey_alloc() to use it first. Make sure that it
+ * is usable.
+ */
+void test_mprotect_with_pkey_0(int *ptr, u16 pkey)
+{
+ long size;
+ int prot;
+
+ assert(pkey_last_malloc_record);
+ size = pkey_last_malloc_record->size;
+ /*
+ * This is a bit of a hack. But mprotect() requires
+ * huge-page-aligned sizes when operating on hugetlbfs.
+ * So, make sure that we use something that's a multiple
+ * of a huge page when we can.
+ */
+ if (size >= HPAGE_SIZE)
+ size = HPAGE_SIZE;
+ prot = pkey_last_malloc_record->prot;
+
+ /* Use pkey 0 */
+ mprotect_pkey(ptr, size, prot, 0);
+
+ /* Make sure that we can set it back to the original pkey. */
+ mprotect_pkey(ptr, size, prot, pkey);
+}
+
void test_ptrace_of_child(int *ptr, u16 pkey)
{
__attribute__((__unused__)) int peek_result;
@@ -1334,6 +1363,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey)
test_kernel_gup_of_access_disabled_region,
test_kernel_gup_write_to_write_disabled_region,
test_executing_on_unreadable_memory,
+ test_mprotect_with_pkey_0,
test_ptrace_of_child,
test_pkey_syscalls_on_non_allocated_pkey,
test_pkey_syscalls_bad_args,
_
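
The size selection in test_mprotect_with_pkey_0() can be read as a small pure function. The sketch below assumes 2 MiB huge pages; MODEL_HPAGE_SIZE and pkey0_test_size() are hypothetical names used only for illustration.

```c
#define MODEL_HPAGE_SIZE (1UL << 21)	/* assumption: 2 MiB huge pages */

/*
 * Pick the length to pass to mprotect_pkey(): small allocations are
 * used whole, anything at least one huge page long is cut down to
 * exactly one huge page so the length stays huge-page-aligned.
 */
static long pkey0_test_size(long recorded_size)
{
	if (recorded_size >= (long)MODEL_HPAGE_SIZE)
		return (long)MODEL_HPAGE_SIZE;
	return recorded_size;
}
```

Under this assumption a 4 KiB allocation is passed through unchanged, while an 8 MiB hugetlbfs mapping is exercised over its first 2 MiB only.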

2018-03-16 21:51:55

by Dave Hansen

Subject: [PATCH 1/3] x86, pkeys: do not special case protection key 0


From: Dave Hansen <[email protected]>

mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
inconsistent with the manpages, and also inconsistent with
mm->context.pkey_allocation_map. Stop special casing it and only
disallow values that are actually bad (< 0).

The end-user visible effect of this is that you can now use
mprotect_pkey() to set pkey=0.

This is a bit nicer than what Ram proposed because it is simpler
and removes special-casing for pkey 0. On the other hand, it does
allow applications to pkey_free() pkey-0, but that's just a silly
thing to do, so we are not going to protect against it.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/arch/x86/include/asm/mmu_context.h | 2 +-
b/arch/x86/include/asm/pkeys.h | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)

diff -puN arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated 2018-03-16 14:46:39.023285476 -0700
+++ b/arch/x86/include/asm/mmu_context.h 2018-03-16 14:46:39.028285476 -0700
@@ -191,7 +191,7 @@ static inline int init_new_context(struc

#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
- /* pkey 0 is the default and always allocated */
+ /* pkey 0 is the default and allocated implicitly */
mm->context.pkey_allocation_map = 0x1;
/* -1 means unallocated or invalid */
mm->context.execute_only_pkey = -1;
diff -puN arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated 2018-03-16 14:46:39.025285476 -0700
+++ b/arch/x86/include/asm/pkeys.h 2018-03-16 14:46:39.028285476 -0700
@@ -49,10 +49,10 @@ bool mm_pkey_is_allocated(struct mm_stru
{
/*
* "Allocated" pkeys are those that have been returned
- * from pkey_alloc(). pkey 0 is special, and never
- * returned from pkey_alloc().
+ * from pkey_alloc() or pkey 0 which is allocated
+ * implicitly when the mm is created.
*/
- if (pkey <= 0)
+ if (pkey < 0)
return false;
if (pkey >= arch_max_pkey())
return false;
_
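
The effect of the one-character change in mm_pkey_is_allocated() can be modeled in userspace. This is only a sketch, not kernel code: the bitmap test on the last line of each function, the 16-key limit standing in for arch_max_pkey(), and the names pkey_is_allocated_old() and pkey_is_allocated_new() are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_PKEY 16	/* assumption: stands in for arch_max_pkey() */

/* Before the patch: pkey 0 is rejected even though its bit is set. */
static bool pkey_is_allocated_old(uint16_t allocation_map, int pkey)
{
	if (pkey <= 0)
		return false;
	if (pkey >= MODEL_MAX_PKEY)
		return false;
	return allocation_map & (1U << pkey);
}

/* After the patch: only negative values are rejected up front. */
static bool pkey_is_allocated_new(uint16_t allocation_map, int pkey)
{
	if (pkey < 0)
		return false;
	if (pkey >= MODEL_MAX_PKEY)
		return false;
	return allocation_map & (1U << pkey);
}
```

With the initial map of 0x1 set in init_new_context(), the old check reports pkey 0 as unallocated while the new one reports it as allocated. The two checks agree for every other key.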

2018-03-17 09:18:24

by Thomas Gleixner

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On Fri, 16 Mar 2018, Dave Hansen wrote:

>
> From: Dave Hansen <[email protected]>
>
> mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
> inconsistent with the manpages, and also inconsistent with
> mm->context.pkey_allocation_map. Stop special casing it and only
> disallow values that are actually bad (< 0).
>
> The end-user visible effect of this is that you can now use
> mprotect_pkey() to set pkey=0.
>
> This is a bit nicer than what Ram proposed because it is simpler
> and removes special-casing for pkey 0. On the other hand, it does
> allow applications to pkey_free() pkey-0, but that's just a silly
> thing to do, so we are not going to protect against it.

What's the consequence of that? Application crashing and burning itself or
something more subtle?

Thanks,

tglx

2018-03-17 16:03:05

by Dave Hansen

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On 03/17/2018 02:12 AM, Thomas Gleixner wrote:
>> This is a bit nicer than what Ram proposed because it is simpler
>> and removes special-casing for pkey 0. On the other hand, it does
>> allow applications to pkey_free() pkey-0, but that's just a silly
>> thing to do, so we are not going to protect against it.
> What's the consequence of that? Application crashing and burning itself or
> something more subtle?

You would have to:

pkey_free(0)
... later
new_key = pkey_alloc();
// now new_key=0
pkey_deny_access(new_key); // or whatever

At which point most apps would probably croak because its stack is
inaccessible. The free itself does not make the key inaccessible, *but*
we could also do that within the existing ABI if we want. I think I
called out that behavior as undefined in the manpage.
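
The sequence above can be traced with a small userspace model of the allocation bitmap. This is a sketch under stated assumptions, not the kernel implementation: model_pkey_alloc() and model_pkey_free() are hypothetical names, the real pkey_alloc() takes flags and access-rights arguments that are ignored here, and the "hand out the lowest clear bit" policy is assumed because it matches the "now new_key=0" step.

```c
#include <stdint.h>

#define MODEL_NR_PKEYS 16	/* assumption: 16 keys, one bit each */

/* Hand out the lowest clear bit, or -1 when all keys are taken. */
static int model_pkey_alloc(uint16_t *map)
{
	for (int pkey = 0; pkey < MODEL_NR_PKEYS; pkey++) {
		if (!(*map & (1U << pkey))) {
			*map |= 1U << pkey;
			return pkey;
		}
	}
	return -1;
}

/* Free any allocated key, including 0; -1 if it was not allocated. */
static int model_pkey_free(uint16_t *map, int pkey)
{
	if (pkey < 0 || pkey >= MODEL_NR_PKEYS || !(*map & (1U << pkey)))
		return -1;
	*map &= ~(1U << pkey);
	return 0;
}
```

Starting from a map of 0x1, freeing key 0 clears bit 0, so the next allocation returns 0 again. The free alone changes nothing about memory access. The damage would only come from the follow-up step, which with the glibc wrappers would be something like pkey_set(new_key, PKEY_DISABLE_ACCESS).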

2018-03-17 19:07:17

by Thomas Gleixner

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On Sat, 17 Mar 2018, Dave Hansen wrote:
> On 03/17/2018 02:12 AM, Thomas Gleixner wrote:
> >> This is a bit nicer than what Ram proposed because it is simpler
> >> and removes special-casing for pkey 0. On the other hand, it does
> >> allow applications to pkey_free() pkey-0, but that's just a silly
> >> thing to do, so we are not going to protect against it.
> > What's the consequence of that? Application crashing and burning itself or
> > something more subtle?
>
> You would have to:
>
> pkey_free(0)
> ... later
> new_key = pkey_alloc();
> // now new_key=0
> pkey_deny_access(new_key); // or whatever
>
> At which point most apps would probably croak because its stack is
> inaccessible. The free itself does not make the key inaccessible, *but*
> we could also do that within the existing ABI if we want. I think I
> called out that behavior as undefined in the manpage.

As long as it's documented and the change only allows people to shoot
themselves in the foot more, we're all good.

Thanks,

tglx


2018-03-17 23:26:43

by Ram Pai

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On Fri, Mar 16, 2018 at 02:46:56PM -0700, Dave Hansen wrote:
>
> From: Dave Hansen <[email protected]>
>
> mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
> inconsistent with the manpages, and also inconsistent with
> mm->context.pkey_allocation_map. Stop special casing it and only
> disallow values that are actually bad (< 0).
>
> The end-user visible effect of this is that you can now use
> mprotect_pkey() to set pkey=0.
>
> This is a bit nicer than what Ram proposed because it is simpler
> and removes special-casing for pkey 0. On the other hand, it does
> allow applications to pkey_free() pkey-0, but that's just a silly
> thing to do, so we are not going to protect against it.

So your proposal
(a) allocates pkey 0 implicitly,
(b) does not stop anyone from freeing pkey-0
(c) and allows pkey-0 to be explicitly associated with any address range.
correct?

My proposal
(a) allocates pkey 0 implicitly,
(b) stops anyone from freeing pkey-0
(c) and allows pkey-0 to be explicitly associated with any address range.

So the difference between the two proposals is just the freeing part i.e (b).
Did I get this right?

It's a philosophical debate: allow the user to shoot themselves
in the foot, or stop them from doing so. There is no
clear answer either way. I am fine either way.

So here is my

Reviewed-by: Ram Pai <[email protected]>

I will write a corresponding patch for powerpc.
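
Point (b), the one difference between the two proposals listed above, can be modeled as a single extra check in the free path. This is a hedged userspace sketch: model_pkey_free_policy() and its protect_pkey0 flag are hypothetical and do not correspond to any kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_PKEYS 16	/* assumption: 16 keys, one bit each */

/*
 * Free a key from the allocation bitmap.
 * protect_pkey0 == false models the first proposal (pkey 0 may be freed),
 * protect_pkey0 == true models the second (freeing pkey 0 is refused).
 */
static int model_pkey_free_policy(uint16_t *map, int pkey, bool protect_pkey0)
{
	if (pkey < 0 || pkey >= MODEL_NR_PKEYS)
		return -1;
	if (protect_pkey0 && pkey == 0)
		return -1;	/* the only place pkey 0 is special-cased */
	if (!(*map & (1U << pkey)))
		return -1;	/* not allocated */
	*map &= ~(1U << pkey);
	return 0;
}
```

In this model the two variants behave identically for keys 1 and up. The cost of the second variant is the one remaining special case for pkey 0.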

>
> Signed-off-by: Dave Hansen <[email protected]>
> Cc: Ram Pai <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Shuah Khan <[email protected]>
> ---
>
> b/arch/x86/include/asm/mmu_context.h | 2 +-
> b/arch/x86/include/asm/pkeys.h | 6 +++---
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff -puN arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated arch/x86/include/asm/mmu_context.h
> --- a/arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated 2018-03-16 14:46:39.023285476 -0700
> +++ b/arch/x86/include/asm/mmu_context.h 2018-03-16 14:46:39.028285476 -0700
> @@ -191,7 +191,7 @@ static inline int init_new_context(struc
>
> #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
> if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
> - /* pkey 0 is the default and always allocated */
> + /* pkey 0 is the default and allocated implicitly */
> mm->context.pkey_allocation_map = 0x1;
> /* -1 means unallocated or invalid */
> mm->context.execute_only_pkey = -1;
> diff -puN arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated arch/x86/include/asm/pkeys.h
> --- a/arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated 2018-03-16 14:46:39.025285476 -0700
> +++ b/arch/x86/include/asm/pkeys.h 2018-03-16 14:46:39.028285476 -0700
> @@ -49,10 +49,10 @@ bool mm_pkey_is_allocated(struct mm_stru
> {
> /*
> * "Allocated" pkeys are those that have been returned
> - * from pkey_alloc(). pkey 0 is special, and never
> - * returned from pkey_alloc().
> + * from pkey_alloc() or pkey 0 which is allocated
> + * implicitly when the mm is created.
> */
> - if (pkey <= 0)
> + if (pkey < 0)
> return false;
> if (pkey >= arch_max_pkey())
> return false;
> _

--
Ram Pai


2018-03-18 00:53:52

by Dave Hansen

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On 03/17/2018 04:24 PM, Ram Pai wrote:
> So the difference between the two proposals is just the freeing part i.e (b).
> Did I get this right?

Yeah, I think that's the only difference.

2018-03-18 09:33:03

by Thomas Gleixner

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On Sat, 17 Mar 2018, Ram Pai wrote:
> On Fri, Mar 16, 2018 at 02:46:56PM -0700, Dave Hansen wrote:
> >
> > From: Dave Hansen <[email protected]>
> >
> > mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
> > inconsistent with the manpages, and also inconsistent with
> > mm->context.pkey_allocation_map. Stop special casing it and only
> > disallow values that are actually bad (< 0).
> >
> > The end-user visible effect of this is that you can now use
> > mprotect_pkey() to set pkey=0.
> >
> > This is a bit nicer than what Ram proposed because it is simpler
> > and removes special-casing for pkey 0. On the other hand, it does
> > allow applications to pkey_free() pkey-0, but that's just a silly
> > thing to do, so we are not going to protect against it.
>
> So your proposal
> (a) allocates pkey 0 implicitly,
> (b) does not stop anyone from freeing pkey-0
> (c) and allows pkey-0 to be explicitly associated with any address range.
> correct?
>
> My proposal
> (a) allocates pkey 0 implicitly,
> (b) stops anyone from freeing pkey-0
> (c) and allows pkey-0 to be explicitly associated with any address range.
>
> So the difference between the two proposals is just the freeing part i.e (b).
> Did I get this right?

Yes, and that's consistent with the other pkeys.

> It's a philosophical debate: allow the user to shoot themselves in the
> foot, or stop them from doing so. There is no clear answer either way. I
> am fine either way.

The user can shoot himself already with the other pkeys, so adding another
one does not matter and is again consistent.

Thanks,

tglx

2018-03-18 23:48:45

by Ram Pai

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

On Sun, Mar 18, 2018 at 10:30:48AM +0100, Thomas Gleixner wrote:
> On Sat, 17 Mar 2018, Ram Pai wrote:
> > On Fri, Mar 16, 2018 at 02:46:56PM -0700, Dave Hansen wrote:
> > >
> > > From: Dave Hansen <[email protected]>
> > >
> > > mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
> > > inconsistent with the manpages, and also inconsistent with
> > > mm->context.pkey_allocation_map. Stop special casing it and only
> > > disallow values that are actually bad (< 0).
> > >
> > > The end-user visible effect of this is that you can now use
> > > mprotect_pkey() to set pkey=0.
> > >
> > > This is a bit nicer than what Ram proposed because it is simpler
> > > and removes special-casing for pkey 0. On the other hand, it does
> > > allow applications to pkey_free() pkey-0, but that's just a silly
> > > thing to do, so we are not going to protect against it.
> >
> > So your proposal
> > (a) allocates pkey 0 implicitly,
> > (b) does not stop anyone from freeing pkey-0
> > (c) and allows pkey-0 to be explicitly associated with any address range.
> > correct?
> >
> > My proposal
> > (a) allocates pkey 0 implicitly,
> > (b) stops anyone from freeing pkey-0
> > (c) and allows pkey-0 to be explicitly associated with any address range.
> >
> > So the difference between the two proposals is just the freeing part i.e (b).
> > Did I get this right?
>
> Yes, and that's consistent with the other pkeys.
>

ok.

Yes, it makes pkey-0 even more consistent with the other keys, but not
entirely consistent. pkey-0 still has the privilege of being
allocated by default.


RP


2018-03-19 05:52:21

by Michael Ellerman

Subject: Re: [PATCH 1/3] x86, pkeys: do not special case protection key 0

Dave Hansen <[email protected]> writes:

> On 03/17/2018 02:12 AM, Thomas Gleixner wrote:
>>> This is a bit nicer than what Ram proposed because it is simpler
>>> and removes special-casing for pkey 0. On the other hand, it does
>>> allow applications to pkey_free() pkey-0, but that's just a silly
>>> thing to do, so we are not going to protect against it.
>> What's the consequence of that? Application crashing and burning itself or
>> something more subtle?
>
> You would have to:
>
> pkey_free(0)
> ... later
> new_key = pkey_alloc();
> // now new_key=0
> pkey_deny_access(new_key); // or whatever
>
> At which point most apps would probably croak because its stack is
> inaccessible. The free itself does not make the key inaccessible, *but*
> we could also do that within the existing ABI if we want. I think I
> called out that behavior as undefined in the manpage.

Allowing key 0 to be freed introduces some pretty weird API IMHO. For
example this part of the manpage:

An application should not call pkey_free() on any protection key
which has been assigned to an address range by pkey_mprotect(2)
and which is still in use. The behavior in this case is undefined
and may result in an error.

You basically can't avoid hitting undefined behaviour with pkey 0,
because even if you never assigned pkey 0 to an address range, it *is
still in use* - because it's used as the default key for every address
range that doesn't have another key.

So I don't really think it makes sense to allow pkey 0 to be freed. But
I won't die in a ditch over it, I just look forward to a manpage update
that can sensibly describe the semantics.

cheers