2018-04-27 17:54:02

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes

Hi x86 maintainers,

This set is basically unchanged from the last post. There was
some previous discussion about other ways to fix this with the ppc
folks (Ram Pai), but we've concluded that this x86-specific fix is
fine. I think Ram had a different fix for ppc.

Changes from v2:
* Clarified commit message in patch 1/9 taking some feedback from
Shuah.

Changes from v1:
* Added Fixes: and cc'd stable. No code changes.

--

This fixes two bugs, and adds selftests to make sure they stay fixed:

1. pkey 0 was not usable via mprotect_pkey() because it had never
been explicitly allocated.
2. mprotect(PROT_EXEC) memory could sometimes be left with the
implicit exec-only protection key assigned.

I already posted #1 previously. I'm including them both here because
I don't think it's been picked up in case folks want to pull these
all in a single bundle.

Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>p
Cc: Shuah Khan <[email protected]>
Cc: Shakeel Butt <[email protected]>


2018-04-27 17:51:28

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 7/9] x86, pkeys, selftests: factor out "instruction page"


From: Dave Hansen <[email protected]>

We currently have an execute-only test, but it is for
the explicit mprotect_pkey() interface. We will soon
add a test for the implicit mprotect(PROT_EXEC)
enterface. We need this code in both tests.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-get_pointer_to_instructions tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-get_pointer_to_instructions 2018-03-26 10:22:37.012170189 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-26 10:22:37.015170189 -0700
@@ -1277,12 +1277,9 @@ void test_ptrace_of_child(int *ptr, u16
free(plain_ptr_unaligned);
}

-void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
+void *get_pointer_to_instructions(void)
{
void *p1;
- int scratch;
- int ptr_contents;
- int ret;

p1 = ALIGN_PTR_UP(&lots_o_noops_around_write, PAGE_SIZE);
dprintf3("&lots_o_noops: %p\n", &lots_o_noops_around_write);
@@ -1292,7 +1289,23 @@ void test_executing_on_unreadable_memory
/* Point 'p1' at the *second* page of the function: */
p1 += PAGE_SIZE;

+ /*
+ * Try to ensure we fault this in on next touch to ensure
+ * we get an instruction fault as opposed to a data one
+ */
madvise(p1, PAGE_SIZE, MADV_DONTNEED);
+
+ return p1;
+}
+
+void test_executing_on_unreadable_memory(int *ptr, u16 pkey)
+{
+ void *p1;
+ int scratch;
+ int ptr_contents;
+ int ret;
+
+ p1 = get_pointer_to_instructions();
lots_o_noops_around_write(&scratch);
ptr_contents = read_ptr(p1);
dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
_

2018-04-27 17:51:33

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 9/9] x86, pkeys, selftests: add PROT_EXEC test


From: Dave Hansen <[email protected]>

Under the covers, implement executable-only memory with
protection keys when userspace calls mprotect(PROT_EXEC).

But, we did not have a selftest for that. Now we do.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 51 ++++++++++++++++++++++--
1 file changed, 47 insertions(+), 4 deletions(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-prot_exec tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-prot_exec 2018-04-26 11:24:12.572481103 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-04-26 11:24:12.575481103 -0700
@@ -930,10 +930,10 @@ void expected_pk_fault(int pkey)
dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
pkey_assert(last_pkru_faults + 1 == pkru_faults);

- /*
- * For exec-only memory, we do not know the pkey in
- * advance, so skip this check.
- */
+ /*
+ * For exec-only memory, we do not know the pkey in
+ * advance, so skip this check.
+ */
if (pkey != UNKNOWN_PKEY)
pkey_assert(last_si_pkey == pkey);

@@ -1335,6 +1335,49 @@ void test_executing_on_unreadable_memory
expected_pk_fault(pkey);
}

+void test_implicit_mprotect_exec_only_memory(int *ptr, u16 pkey)
+{
+ void *p1;
+ int scratch;
+ int ptr_contents;
+ int ret;
+
+ dprintf1("%s() start\n", __func__);
+
+ p1 = get_pointer_to_instructions();
+ lots_o_noops_around_write(&scratch);
+ ptr_contents = read_ptr(p1);
+ dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+
+ /* Use a *normal* mprotect(), not mprotect_pkey(): */
+ ret = mprotect(p1, PAGE_SIZE, PROT_EXEC);
+ pkey_assert(!ret);
+
+ dprintf2("pkru: %x\n", rdpkru());
+
+ /* Make sure this is an *instruction* fault */
+ madvise(p1, PAGE_SIZE, MADV_DONTNEED);
+ lots_o_noops_around_write(&scratch);
+ do_not_expect_pk_fault();
+ ptr_contents = read_ptr(p1);
+ dprintf2("ptr (%p) contents@%d: %x\n", p1, __LINE__, ptr_contents);
+ expected_pk_fault(UNKNOWN_PKEY);
+
+ /*
+ * Put the memory back to non-PROT_EXEC. Should clear the
+ * exec-only pkey off the VMA and allow it to be readable
+ * again. Go to PROT_NONE first to check for a kernel bug
+ * that did not clear the pkey when doing PROT_NONE.
+ */
+ ret = mprotect(p1, PAGE_SIZE, PROT_NONE);
+ pkey_assert(!ret);
+
+ ret = mprotect(p1, PAGE_SIZE, PROT_READ|PROT_EXEC);
+ pkey_assert(!ret);
+ ptr_contents = read_ptr(p1);
+ do_not_expect_pk_fault();
+}
+
void test_mprotect_pkey_on_unsupported_cpu(int *ptr, u16 pkey)
{
int size = PAGE_SIZE;
_

2018-04-27 17:52:01

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 6/9] x86, pkeys, selftests: fix pkey exhaustion test off-by-one


From: Dave Hansen <[email protected]>

In our "exhaust all pkeys" test, we make sure that there
is the expected number available. Turns out that the
test did not cover the execute-only key, but discussed
it anyway. It did *not* discuss the test-allocated
key.

Now that we have a test for the mprotect(PROT_EXEC) case,
this off-by-one issue showed itself. Correct the off-by-
one and add the explanation for the case we missed.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-exhaust-off-by-one tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-exhaust-off-by-one 2018-03-26 10:22:36.477170190 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-26 10:22:36.480170190 -0700
@@ -1155,12 +1155,15 @@ void test_pkey_alloc_exhaust(int *ptr, u
pkey_assert(i < NR_PKEYS*2);

/*
- * There are 16 pkeys supported in hardware. One is taken
- * up for the default (0) and another can be taken up by
- * an execute-only mapping. Ensure that we can allocate
- * at least 14 (16-2).
+ * There are 16 pkeys supported in hardware. Three are
+ * allocated by the time we get here:
+ * 1. The default key (0)
+ * 2. One possibly consumed by an execute-only mapping.
+ * 3. One allocated by the test code and passed in via
+ * 'pkey' to this function.
+ * Ensure that we can allocate at least another 13 (16-3).
*/
- pkey_assert(i >= NR_PKEYS-2);
+ pkey_assert(i >= NR_PKEYS-3);

for (i = 0; i < nr_allocated_pkeys; i++) {
err = sys_pkey_free(allocated_pkeys[i]);
_

2018-04-27 17:52:10

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 8/9] x86, pkeys, selftests: add allow faults on unknown keys


From: Dave Hansen <[email protected]>

The exec-only pkey is allocated inside the kernel and userspace
is not told what it is. So, allow PK faults to occur that have
an unknown key.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-unknown-exec-only-key tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-unknown-exec-only-key 2018-03-26 10:22:37.549170187 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-26 10:22:37.553170187 -0700
@@ -922,13 +922,21 @@ void *malloc_pkey(long size, int prot, u
}

int last_pkru_faults;
+#define UNKNOWN_PKEY -2
void expected_pk_fault(int pkey)
{
dprintf2("%s(): last_pkru_faults: %d pkru_faults: %d\n",
__func__, last_pkru_faults, pkru_faults);
dprintf2("%s(%d): last_si_pkey: %d\n", __func__, pkey, last_si_pkey);
pkey_assert(last_pkru_faults + 1 == pkru_faults);
- pkey_assert(last_si_pkey == pkey);
+
+ /*
+ * For exec-only memory, we do not know the pkey in
+ * advance, so skip this check.
+ */
+ if (pkey != UNKNOWN_PKEY)
+ pkey_assert(last_si_pkey == pkey);
+
/*
* The signal handler shold have cleared out PKRU to let the
* test program continue. We now have to restore it.
_

2018-04-27 17:52:34

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 3/9] x86, pkeys, selftests: add a test for pkey 0


From: Dave Hansen <[email protected]>

Protection key 0 is the default key for all memory and will
not normally come back from pkey_alloc(). But, you might
still want pass it to mprotect_pkey().

This check ensures that you can use pkey 0.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 30 ++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-with-pkey-0-test tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-with-pkey-0-test 2018-03-26 10:22:34.841170194 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-26 10:22:34.844170194 -0700
@@ -1169,6 +1169,35 @@ void test_pkey_alloc_exhaust(int *ptr, u
}
}

+/*
+ * pkey 0 is special. It is allocated by default, so you do not
+ * have to call pkey_alloc() to use it first. Make sure that it
+ * is usable.
+ */
+void test_mprotect_with_pkey_0(int *ptr, u16 pkey)
+{
+ long size;
+ int prot;
+
+ assert(pkey_last_malloc_record);
+ size = pkey_last_malloc_record->size;
+ /*
+ * This is a bit of a hack. But mprotect() requires
+ * huge-page-aligned sizes when operating on hugetlbfs.
+ * So, make sure that we use something that's a multiple
+ * of a huge page when we can.
+ */
+ if (size >= HPAGE_SIZE)
+ size = HPAGE_SIZE;
+ prot = pkey_last_malloc_record->prot;
+
+ /* Use pkey 0 */
+ mprotect_pkey(ptr, size, prot, 0);
+
+ /* Make sure that we can set it back to the original pkey. */
+ mprotect_pkey(ptr, size, prot, pkey);
+}
+
void test_ptrace_of_child(int *ptr, u16 pkey)
{
__attribute__((__unused__)) int peek_result;
@@ -1306,6 +1335,7 @@ void (*pkey_tests[])(int *ptr, u16 pkey)
test_kernel_gup_of_access_disabled_region,
test_kernel_gup_write_to_write_disabled_region,
test_executing_on_unreadable_memory,
+ test_mprotect_with_pkey_0,
test_ptrace_of_child,
test_pkey_syscalls_on_non_allocated_pkey,
test_pkey_syscalls_bad_args,
_

2018-04-27 17:52:39

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 5/9] x86, pkeys, selftests: fix pointer math


From: Dave Hansen <[email protected]>

We dump out the entire area of the siginfo where the si_pkey_ptr is
supposed to be. But, we do some math on the poitner, which is a u32.
We intended to do byte math, not u32 math on the pointer.

Cast it over to a u8* so it works.

Also, move this block of code to below th si_code check. It doesn't
hurt anything, but the si_pkey field is gibberish for other signal
types.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-fix-pointer-math tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-selftests-fix-pointer-math 2018-04-26 11:23:51.588481155 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-04-26 11:23:51.592481155 -0700
@@ -289,13 +289,6 @@ void signal_handler(int signum, siginfo_
dump_mem(pkru_ptr - 128, 256);
pkey_assert(*pkru_ptr);

- si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
- dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
- dump_mem(si_pkey_ptr - 8, 24);
- siginfo_pkey = *si_pkey_ptr;
- pkey_assert(siginfo_pkey < NR_PKEYS);
- last_si_pkey = siginfo_pkey;
-
if ((si->si_code == SEGV_MAPERR) ||
(si->si_code == SEGV_ACCERR) ||
(si->si_code == SEGV_BNDERR)) {
@@ -303,6 +296,13 @@ void signal_handler(int signum, siginfo_
exit(4);
}

+ si_pkey_ptr = (u32 *)(((u8 *)si) + si_pkey_offset);
+ dprintf1("si_pkey_ptr: %p\n", si_pkey_ptr);
+ dump_mem((u8 *)si_pkey_ptr - 8, 24);
+ siginfo_pkey = *si_pkey_ptr;
+ pkey_assert(siginfo_pkey < NR_PKEYS);
+ last_si_pkey = siginfo_pkey;
+
dprintf1("signal pkru from xsave: %08x\n", *pkru_ptr);
/* need __rdpkru() version so we do not do shadow_pkru checking */
dprintf1("signal pkru from pkru: %08x\n", __rdpkru());
_

2018-04-27 17:52:57

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 4/9] x86, pkeys: override pkey when moving away from PROT_EXEC


From: Dave Hansen <[email protected]>

I got a bug report that the following code (roughly) was
causing a SIGSEGV:

mprotect(ptr, size, PROT_EXEC);
mprotect(ptr, size, PROT_NONE);
mprotect(ptr, size, PROT_READ);
*ptr = 100;

The problem is hit when the mprotect(PROT_EXEC)
is implicitly assigned a protection key to the VMA, and made
that key ACCESS_DENY|WRITE_DENY. The PROT_NONE mprotect()
failed to remove the protection key, and the PROT_NONE->
PROT_READ left the PTE usable, but the pkey still in place
and left the memory inaccessible.

To fix this, we ensure that we always "override" the pkee
at mprotect() if the VMA does not have execute-only
permissions, but the VMA has the execute-only pkey.

We had a check for PROT_READ/WRITE, but it did not work
for PROT_NONE. This entirely removes the PROT_* checks,
which ensures that PROT_NONE now works.

Signed-off-by: Dave Hansen <[email protected]>
Fixes: 62b5f7d013f ("mm/core, x86/mm/pkeys: Add execute-only protection keys support")
Reported-by: Shakeel Butt <[email protected]>
Cc: [email protected]
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/arch/x86/include/asm/pkeys.h | 12 +++++++++++-
b/arch/x86/mm/pkeys.c | 21 +++++++++++----------
2 files changed, 22 insertions(+), 11 deletions(-)

diff -puN arch/x86/include/asm/pkeys.h~pkeys-abandon-exec-only-pkey-more-aggressively arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~pkeys-abandon-exec-only-pkey-more-aggressively 2018-04-26 10:42:18.971487371 -0700
+++ b/arch/x86/include/asm/pkeys.h 2018-04-26 10:42:18.977487371 -0700
@@ -2,6 +2,8 @@
#ifndef _ASM_X86_PKEYS_H
#define _ASM_X86_PKEYS_H

+#define ARCH_DEFAULT_PKEY 0
+
#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)

extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
@@ -15,7 +17,7 @@ extern int __execute_only_pkey(struct mm
static inline int execute_only_pkey(struct mm_struct *mm)
{
if (!boot_cpu_has(X86_FEATURE_OSPKE))
- return 0;
+ return ARCH_DEFAULT_PKEY;

return __execute_only_pkey(mm);
}
@@ -56,6 +58,14 @@ bool mm_pkey_is_allocated(struct mm_stru
return false;
if (pkey >= arch_max_pkey())
return false;
+ /*
+ * The exec-only pkey is set in the allocation map, but
+ * is not available to any of the user interfaces like
+ * mprotect_pkey().
+ */
+ if (pkey == mm->context.execute_only_pkey)
+ return false;
+
return mm_pkey_allocation_map(mm) & (1U << pkey);
}

diff -puN arch/x86/mm/pkeys.c~pkeys-abandon-exec-only-pkey-more-aggressively arch/x86/mm/pkeys.c
--- a/arch/x86/mm/pkeys.c~pkeys-abandon-exec-only-pkey-more-aggressively 2018-04-26 10:42:18.973487371 -0700
+++ b/arch/x86/mm/pkeys.c 2018-04-26 10:47:34.806486584 -0700
@@ -94,26 +94,27 @@ int __arch_override_mprotect_pkey(struct
*/
if (pkey != -1)
return pkey;
- /*
- * Look for a protection-key-drive execute-only mapping
- * which is now being given permissions that are not
- * execute-only. Move it back to the default pkey.
- */
- if (vma_is_pkey_exec_only(vma) &&
- (prot & (PROT_READ|PROT_WRITE))) {
- return 0;
- }
+
/*
* The mapping is execute-only. Go try to get the
* execute-only protection key. If we fail to do that,
* fall through as if we do not have execute-only
- * support.
+ * support in this mm.
*/
if (prot == PROT_EXEC) {
pkey = execute_only_pkey(vma->vm_mm);
if (pkey > 0)
return pkey;
+ } else if (vma_is_pkey_exec_only(vma)) {
+ /*
+ * Protections are *not* PROT_EXEC, but the mapping
+ * is using the exec-only pkey. This mapping was
+ * PROT_EXEC and will no longer be. Move back to
+ * the default pkey.
+ */
+ return ARCH_DEFAULT_PKEY;
}
+
/*
* This is a vanilla, non-pkey mprotect (or we failed to
* setup execute-only), inherit the pkey from the VMA we
_

2018-04-27 17:53:11

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 1/9] x86, pkeys: do not special case protection key 0


From: Dave Hansen <[email protected]>

mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
inconsistent with the manpages, and also inconsistent with
mm->context.pkey_allocation_map. Stop special casing it and only
disallow values that are actually bad (< 0).

The end-user visible effect of this is that you can now use
mprotect_pkey() to set pkey=0.

This is a bit nicer than what Ram proposed because it is simpler
and removes special-casing for pkey 0. On the other hand, it does
allow applciations to pkey_free() pkey-0, but that's just a silly
thing to do, so we are not going to protect against it.

Signed-off-by: Dave Hansen <[email protected]>
Fixes: 58ab9a088dda ("x86/pkeys: Check against max pkey to avoid overflows")
Cc: [email protected]
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>p
Cc: Shuah Khan <[email protected]>
---

b/arch/x86/include/asm/mmu_context.h | 2 +-
b/arch/x86/include/asm/pkeys.h | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)

diff -puN arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~x86-pkey-0-default-allocated 2018-03-26 10:22:33.742170197 -0700
+++ b/arch/x86/include/asm/mmu_context.h 2018-03-26 10:22:33.747170197 -0700
@@ -192,7 +192,7 @@ static inline int init_new_context(struc

#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
- /* pkey 0 is the default and always allocated */
+ /* pkey 0 is the default and allocated implicitly */
mm->context.pkey_allocation_map = 0x1;
/* -1 means unallocated or invalid */
mm->context.execute_only_pkey = -1;
diff -puN arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~x86-pkey-0-default-allocated 2018-03-26 10:22:33.744170197 -0700
+++ b/arch/x86/include/asm/pkeys.h 2018-03-26 10:22:33.747170197 -0700
@@ -49,10 +49,10 @@ bool mm_pkey_is_allocated(struct mm_stru
{
/*
* "Allocated" pkeys are those that have been returned
- * from pkey_alloc(). pkey 0 is special, and never
- * returned from pkey_alloc().
+ * from pkey_alloc() or pkey 0 which is allocated
+ * implicitly when the mm is created.
*/
- if (pkey <= 0)
+ if (pkey < 0)
return false;
if (pkey >= arch_max_pkey())
return false;
_

2018-04-27 17:53:36

by Dave Hansen

[permalink] [raw]
Subject: [PATCH 2/9] x86, pkeys, selftests: save off 'prot' for allocations


From: Dave Hansen <[email protected]>

This makes it possible to to tell what 'prot' a given allocation
is supposed to have. That way, if we want to change just the
pkey, we know what 'prot' to pass to mprotect_pkey().

Also, keep a record of the most recent allocation so the tests
can easily find it.

Signed-off-by: Dave Hansen <[email protected]>
Cc: Ram Pai <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michael Ellermen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Shuah Khan <[email protected]>
---

b/tools/testing/selftests/x86/protection_keys.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff -puN tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-store-malloc-record tools/testing/selftests/x86/protection_keys.c
--- a/tools/testing/selftests/x86/protection_keys.c~pkeys-update-selftests-store-malloc-record 2018-03-26 10:22:34.301170195 -0700
+++ b/tools/testing/selftests/x86/protection_keys.c 2018-03-26 10:22:34.305170195 -0700
@@ -674,10 +674,12 @@ int mprotect_pkey(void *ptr, size_t size
struct pkey_malloc_record {
void *ptr;
long size;
+ int prot;
};
struct pkey_malloc_record *pkey_malloc_records;
+struct pkey_malloc_record *pkey_last_malloc_record;
long nr_pkey_malloc_records;
-void record_pkey_malloc(void *ptr, long size)
+void record_pkey_malloc(void *ptr, long size, int prot)
{
long i;
struct pkey_malloc_record *rec = NULL;
@@ -709,6 +711,8 @@ void record_pkey_malloc(void *ptr, long
(int)(rec - pkey_malloc_records), rec, ptr, size);
rec->ptr = ptr;
rec->size = size;
+ rec->prot = prot;
+ pkey_last_malloc_record = rec;
nr_pkey_malloc_records++;
}

@@ -753,7 +757,7 @@ void *malloc_pkey_with_mprotect(long siz
pkey_assert(ptr != (void *)-1);
ret = mprotect_pkey((void *)ptr, PAGE_SIZE, prot, pkey);
pkey_assert(!ret);
- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);
rdpkru();

dprintf1("%s() for pkey %d @ %p\n", __func__, pkey, ptr);
@@ -774,7 +778,7 @@ void *malloc_pkey_anon_huge(long size, i
size = ALIGN_UP(size, HPAGE_SIZE * 2);
ptr = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
pkey_assert(ptr != (void *)-1);
- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);
mprotect_pkey(ptr, size, prot, pkey);

dprintf1("unaligned ptr: %p\n", ptr);
@@ -847,7 +851,7 @@ void *malloc_pkey_hugetlb(long size, int
pkey_assert(ptr != (void *)-1);
mprotect_pkey(ptr, size, prot, pkey);

- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);

dprintf1("mmap()'d hugetlbfs for pkey %d @ %p\n", pkey, ptr);
return ptr;
@@ -869,7 +873,7 @@ void *malloc_pkey_mmap_dax(long size, in

mprotect_pkey(ptr, size, prot, pkey);

- record_pkey_malloc(ptr, size);
+ record_pkey_malloc(ptr, size, prot);

dprintf1("mmap()'d for pkey %d @ %p\n", pkey, ptr);
close(fd);
_

2018-04-28 07:07:47

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes


* Dave Hansen <[email protected]> wrote:

> Hi x86 maintainers,
>
> This set is basically unchanged from the last post. There was
> some previous discussion about other ways to fix this with the ppc
> folks (Ram Pai), but we've concluded that this x86-specific fix is
> fine. I think Ram had a different fix for ppc.
>
> Changes from v2:
> * Clarified commit message in patch 1/9 taking some feedback from
> Shuah.
>
> Changes from v1:
> * Added Fixes: and cc'd stable. No code changes.
>
> --
>
> This fixes two bugs, and adds selftests to make sure they stay fixed:
>
> 1. pkey 0 was not usable via mprotect_pkey() because it had never
> been explicitly allocated.
> 2. mprotect(PROT_EXEC) memory could sometimes be left with the
> implicit exec-only protection key assigned.
>
> I already posted #1 previously. I'm including them both here because
> I don't think it's been picked up in case folks want to pull these
> all in a single bundle.

A couple of observations:

1)

Minor patch series organization requests:

- please include the shortlog and diffstat in the cover letter in the future, as
it makes it easier to see the overall structure and makes it easier to reply to
certain commits as a group.

- please capitalize commit titles as is usually done in arch/x86/ and change the
change the subsystem tags to the usual ones:

d76eeb1914c8: x86/pkeys: Override pkey when moving away from PROT_EXEC
f30f10248200: x86/pkeys/selftests: Add PROT_EXEC test
0530ebfefcdc: x86/pkeys/selftests: Add allow faults on unknown keys
e81c40e33818: x86/pkeys/selftests: Factor out "instruction page"
57042882631c: x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
6b833e9d3171: x86/pkeys/selftests: Fix pointer math
d16f12e3c4ca: x86/pkeys: Do not special case protection key 0
1cb7691d0ee4: x86/pkeys/selftests: Add a test for pkey 0
273ae5cde423: x86/pkeys/selftests: Save off 'prot' for allocations

- please re-order the series to first introduce a unit test which specifically
tests for the failure, ascertain that it indeed fails, and then apply the
kernel fix. I.e. please use the order I used above for future versions of this
patch-set.

2)

The new self-test you added does not fail overly nicely, it does the following on
older kernels:

deimos:~/tip/tools/testing/selftests/x86> ./protection_keys_64
has pku: 1
startup pkru: 55555554
WARNING: not run as root, can not do hugetlb test
test 0 PASSED (iteration 1)
test 1 PASSED (iteration 1)
test 2 PASSED (iteration 1)
test 3 PASSED (iteration 1)
test 4 PASSED (iteration 1)
test 5 PASSED (iteration 1)
test 6 PASSED (iteration 1)
test 7 PASSED (iteration 1)
test 8 PASSED (iteration 1)
assert() at protection_keys.c::668 test_nr: 9 iteration: 1
errno at assert: 22running abort_hooks()...
protection_keys_64: protection_keys.c:668: mprotect_pkey: Assertion `!ret' failed.
Aborted (core dumped)

It would be nice to catch the crash or the error in a more obvious way and turn it
into a proper test failure - and maybe print an indication that this is probably
an older kernel or so?

This, beyond being less scary to users, would also allow the other tests to be run
on older kernels. (It would also be helpful to us should we (accidentally)
reintroduce a similar bug in the future.)

I.e. x86 unit tests should never 'crash' in a way that suggests that the testing
itself might be buggy - the crashes/failures should always be well controlled.

3)

When the first kernel bug fix is applied but not the second, then I don't see the
new PROT_EXEC test catching the bug:

deimos:~/tip/tools/testing/selftests/x86> ./protection_keys_64
has pku: 1
startup pkru: 55555554
WARNING: not run as root, can not do hugetlb test
test 0 PASSED (iteration 1)
test 1 PASSED (iteration 1)
...
done (all tests OK)

I.e. in the booted kernel I didn't have this kernel fix applied:

x86/pkeys: Override pkey when moving away from PROT_EXEC

But I had these applied:

f30f10248200 x86/pkeys/selftests: Add PROT_EXEC test
0530ebfefcdc x86/pkeys/selftests: Add allow faults on unknown keys
e81c40e33818 x86/pkeys/selftests: Factor out "instruction page"
57042882631c x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
6b833e9d3171 x86/pkeys/selftests: Fix pointer math
d16f12e3c4ca x86/pkeys: Do not special case protection key 0
1cb7691d0ee4 x86/pkeys/selftests: Add a test for pkey 0
273ae5cde423 x86/pkeys/selftests: Save off 'prot' for allocations

(Note that the key-0 kernel fix is applied, so that test passes.)

4)

In the above kernel that was missing the PROT_EXEC fix I was repeatedly running
the 64-bit and 32-bit testcases as non-root and as root as well, until I got a
hang in the middle of a 32-bit test running as root:

test 7 PASSED (iteration 19)
test 8 PASSED (iteration 19)
test 9 PASSED (iteration 19)

< test just hangs here >

this is what it looked like in ps:

3954 pts/0 S 0:00 bash
3987 pts/0 S+ 0:00 ./protection_keys_32
4006 pts/0 t+ 0:00 ./protection_keys_32

And when attaching to it via gdb the main process was hanging here:

(gdb) bt
#0 0xf7f7ac79 in __kernel_vsyscall ()
#1 0xf7e69b11 in ?? () from /lib32/libc.so.6
#2 0xf7ddc1fb in ?? () from /lib32/libc.so.6
#3 0xf7ddc5b6 in _IO_flush_all () from /lib32/libc.so.6
#4 0x0804bc63 in sig_chld (x=17) at protection_keys.c:342
#5 <signal handler called>
#6 0xf7ddc1fb in ?? () from /lib32/libc.so.6
#7 0xf7ddc5b6 in _IO_flush_all () from /lib32/libc.so.6
#8 0x0804c5b2 in __wrpkru (pkru=4) at pkey-helpers.h:93
#9 pkey_set (pkey=1, rights=1, flags=0) at protection_keys.c:437
#10 0x0804c687 in pkey_disable_set (pkey=1, flags=1) at protection_keys.c:463
#11 0x0804f286 in pkey_access_deny (pkey=1) at protection_keys.c:525
#12 test_ptrace_of_child (ptr=0xf7800000, pkey=1) at protection_keys.c:1248
#13 0x0804fd18 in run_tests_once () at protection_keys.c:1429
#14 0x08049145 in main () at protection_keys.c:1476

the child task could not be attached to, because it was already a ptrace child of
the main task. Then I killed the main task (while it was still being ptraced by
gdb), which allowed me to attach gdb to the child task:

(gdb) bt
#0 0xf7f7ac79 in __kernel_vsyscall ()
#1 0xf7e25233 in nanosleep () from /lib32/libc.so.6
#2 0xf7e2516d in sleep () from /lib32/libc.so.6
#3 0x0804c476 in fork_lazy_child () at protection_keys.c:390
#4 0x0804f20b in test_ptrace_of_child (ptr=0xf7800000, pkey=1) at protection_keys.c:1231
#5 0x0804fd18 in run_tests_once () at protection_keys.c:1429
#6 0x08049145 in main () at protection_keys.c:1476

After I got the GDB backtraces I tried to clean up leftover tasks, but the main
thread would not go away:

4006 pts/0 00:00:00 protection_keys <defunct>

neither SIGCONT nor SIGKILL appears to help:

root@deimos:/home/mingo/tip/tools/testing/selftests/x86# kill -CONT 4006
root@deimos:/home/mingo/tip/tools/testing/selftests/x86# kill -9 4006
root@deimos:/home/mingo/tip/tools/testing/selftests/x86# ps
PID TTY TIME CMD
3953 pts/0 00:00:00 su
3954 pts/0 00:00:00 bash
4006 pts/0 00:00:00 protection_keys <defunct>
4307 pts/0 00:00:00 ps

This task stayed zombie until the next reboot. There were no suspicious kernel
messages in the log during or after the test.

I ran the tests based on tip:x86/urgent (which is v4.17-rc2 based), on top of a
pretty vanilla installation of Ubuntu:

# cat /etc/os-release
NAME="Ubuntu"
VERSION="17.10 (Artful Aardvark)"

Thanks,

Ingo

2018-04-28 07:17:05

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes


* Ingo Molnar <[email protected]> wrote:

> After I got the GDB backtraces I tried to clean up leftover tasks, but the main
> thread would not go away:
>
> 4006 pts/0 00:00:00 protection_keys <defunct>
>
> neither SIGCONT nor SIGKILL appears to help:

Just seconds after I sent this I found out that this was user error: I forgot
about a gdb session I still had running, which understandably blocked the task
from being cleaned up. Once I exited GDB it all got cleaned up properly.

The hang problem is still there, if I run a script like this:

while :; do date; echo -n "32-bit: "; ./protection_keys_32 >/dev/null; date; echo -n "64-bit: "; ./protection_keys_64 >/dev/null; done

then within a minute one of the testcases hangs reliably.

Out of 4 attempts so far one hang was in the 32-bit testcase, 3 hangs were in the
64-bit testcase - so 64-bit appears to trigger it more frequently.

Thanks,

Ingo

2018-04-28 08:30:31

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes


* Ingo Molnar <[email protected]> wrote:

> The hang problem is still there, if I run a script like this:
>
> while :; do date; echo -n "32-bit: "; ./protection_keys_32 >/dev/null; date; echo -n "64-bit: "; ./protection_keys_64 >/dev/null; done
>
> then within a minute one of the testcases hangs reliably.
>
> Out of 4 attempts so far one hang was in the 32-bit testcase, 3 hangs were in the
> 64-bit testcase - so 64-bit appears to trigger it more frequently.

Note that even with all fixes in this series applied the self-test hang still
triggers.

Thanks,

Ingo

2018-04-30 15:32:37

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes

On 04/28/2018 12:05 AM, Ingo Molnar wrote:
> In the above kernel that was missing the PROT_EXEC fix I was repeatedly running
> the 64-bit and 32-bit testcases as non-root and as root as well, until I got a
> hang in the middle of a 32-bit test running as root:
>
> test 7 PASSED (iteration 19)
> test 8 PASSED (iteration 19)
> test 9 PASSED (iteration 19)
>
> < test just hangs here >

For the hang, there is a known issue with the use of printf() in the
signal handler and a resulting deadlock. I *thought* there was a patch
merged to fix this from Ram Pai or one of the other IBM folks.

2018-04-30 16:30:24

by Ram Pai

[permalink] [raw]
Subject: Re: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes

On Mon, Apr 30, 2018 at 08:30:43AM -0700, Dave Hansen wrote:
> On 04/28/2018 12:05 AM, Ingo Molnar wrote:
> > In the above kernel that was missing the PROT_EXEC fix I was repeatedly running
> > the 64-bit and 32-bit testcases as non-root and as root as well, until I got a
> > hang in the middle of a 32-bit test running as root:
> >
> > test 7 PASSED (iteration 19)
> > test 8 PASSED (iteration 19)
> > test 9 PASSED (iteration 19)
> >
> > < test just hangs here >
>
> For the hang, there is a known issue with the use of printf() in the
> signal handler and a resulting deadlock. I *thought* there was a patch
> merged to fix this from Ram Pai or one of the other IBM folks.

Yes. there is a patch. unfortunately that patch assumes the selftest has
been moved into selftests/vm directory. One option is -- I merge your
changes in my selftest patchset, and send the entire series for upstream
merge.

Or you can manually massage-in the specific fix.
The patch is "selftests/vm: Fix deadlock in protection_keys.c"
https://patchwork.ozlabs.org/patch/864394/

Let me know,
--
Ram Pai


2018-05-08 22:52:26

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 0/9] [v3] x86, pkeys: two protection keys bug fixes

> 1)
>
> Minor patch series organization requests:
>
> - please include the shortlog and diffstat in the cover letter in the future, as
> it makes it easier to see the overall structure and makes it easier to reply to
> certain commits as a group.

Will do.

> - please capitalize commit titles as is usually done in arch/x86/ and change the
> change the subsystem tags to the usual ones:
>
> d76eeb1914c8: x86/pkeys: Override pkey when moving away from PROT_EXEC
> f30f10248200: x86/pkeys/selftests: Add PROT_EXEC test
> 0530ebfefcdc: x86/pkeys/selftests: Add allow faults on unknown keys
> e81c40e33818: x86/pkeys/selftests: Factor out "instruction page"
> 57042882631c: x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
> 6b833e9d3171: x86/pkeys/selftests: Fix pointer math
> d16f12e3c4ca: x86/pkeys: Do not special case protection key 0
> 1cb7691d0ee4: x86/pkeys/selftests: Add a test for pkey 0
> 273ae5cde423: x86/pkeys/selftests: Save off 'prot' for allocations
>
> - please re-order the series to first introduce a unit test which specifically
> tests for the failure, ascertain that it indeed fails, and then apply the
> kernel fix. I.e. please use the order I used above for future versions of this
> patch-set.

I can't _quite_ use this order, but I get your point and I'll do as you
suggest, conceptually.


> 2)
>
> The new self-test you added does not fail overly nicely, it does the following on
> older kernels:
...
> I.e. x86 unit tests should never 'crash' in a way that suggests that the testing
> itself might be buggy - the crashes/failures should always be well controlled.

I've tried to make this nicer. I never abort() any more, for instance.

> 3)
>
> When the first kernel bug fix is applied but not the second, then I don't see the
> new PROT_EXEC test catching the bug:

Thanks for catching this. I forgot to add the test function to the
pkey_tests[] array. It's fixed up now.

> 4)
>
> In the above kernel that was missing the PROT_EXEC fix I was repeatedly running
> the 64-bit and 32-bit testcases as non-root and as root as well, until I got a
> hang in the middle of a 32-bit test running as root:

I believe this is all my stupidity from not being careful about using
signal-safe functions in the signal handlers. There's no pretty
solution for this, but I've at least made it stop hanging. The fixes
for that will be in the beginning of the next series.