This version incorporates feedback from Ingo Molnar and Dave Hansen,
and adds a few test cases that exercise this flow.
v3 updates:
- Split the original patch into 3:
- function interface changes
- helper functions
- functional change to write pkru on sigframe
- Enable all pkeys before XSAVE - i.e. wrpkru(0), rather than assuming
that the alt sig stack is always protected by pkey 0.
- Add a few test cases in pkey_sighandler_tests.c.
I had some trouble adding these tests to
tools/testing/selftests/mm/protection_keys.c, so they're in a separate
file.
Aruna Ramakrishna (4):
x86/pkeys: Signal handling function interface changes to accept PKRU
as a parameter
x86/pkeys: Add helper functions to update PKRU on sigframe
x86/pkeys: Update PKRU to enable all pkeys before XSAVE
selftests/mm: Add new testcases for pkeys
arch/x86/include/asm/fpu/signal.h | 3 +-
arch/x86/include/asm/sighandling.h | 10 +-
arch/x86/kernel/fpu/signal.c | 45 ++-
arch/x86/kernel/fpu/xstate.c | 13 +
arch/x86/kernel/fpu/xstate.h | 1 +
arch/x86/kernel/signal.c | 33 +-
arch/x86/kernel/signal_32.c | 8 +-
arch/x86/kernel/signal_64.c | 9 +-
tools/testing/selftests/mm/Makefile | 2 +
tools/testing/selftests/mm/pkey-helpers.h | 11 +-
.../selftests/mm/pkey_sighandler_tests.c | 315 ++++++++++++++++++
tools/testing/selftests/mm/protection_keys.c | 10 -
12 files changed, 422 insertions(+), 38 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_sighandler_tests.c
--
2.39.3
Problem description:
Let's assume there's a multithreaded application that runs untrusted
user code. Each thread has its stack/code protected by a non-zero pkey,
and the PKRU register is set up such that only that particular non-zero
pkey is enabled. Each thread also sets up an alternate signal stack to
handle signals, which is protected by pkey zero. The pkeys man page
documents that the PKRU will be reset to init_pkru when the signal
handler is invoked, which means that pkey zero access will be enabled.
But this reset happens after the kernel attempts to push fpu state
to the alternate stack, which is not (yet) accessible by the kernel,
which leads to a new SIGSEGV being sent to the application, terminating
it.
Enabling both the non-zero pkey (for the thread) and pkey zero (in
userspace) will not work for this use case. We cannot have the alt stack
writeable by all - the rationale here is that the code running in that
thread (using a non-zero pkey) is untrusted and should not have access
to the alternate signal stack (that uses pkey zero), to prevent the
return address of a function from being changed. The expectation is that
kernel should be able to set up the alternate signal stack and deliver
the signal to the application even if pkey zero is explicitly disabled by
the application. The signal handler accessibility should not be dictated
by whatever PKRU value the thread sets up.
Solution:
The PKRU register is managed by XSAVE, which means the sigframe contents
must match the register contents - which is not the case here. We want
the sigframe to contain the user-defined PKRU value (so that it is
restored correctly from sigcontext) but the actual register must be
reset to init_pkru so that the alt stack is accessible and the signal
can be delivered to the application. It seems that the proper fix here
would be to remove PKRU from the XSAVE framework and manage it
separately, which is quite complicated. As a workaround, this patchset
does something like this:
orig_pkru = rdpkru();
wrpkru(0);
xsave_to_user_sigframe();
put_user(pkru_sigframe_addr, orig_pkru)
This change is split over multiple patches. This patch adds pkru as an
additional parameter that's passed down the chain from handle_signal:
setup_rt_frame()
xxx_setup_rt_frame()
get_sigframe()
copy_fpstate_to_sigframe()
copy_fpregs_to_sigframe()
There are no functional changes in this patch.
In a later patch, this pkru value will be written onto the sigframe, so
that it is correctly restored from the sigcontext.
Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/include/asm/fpu/signal.h | 3 ++-
arch/x86/include/asm/sighandling.h | 10 +++++-----
arch/x86/kernel/fpu/signal.c | 8 +++++---
arch/x86/kernel/signal.c | 18 ++++++++++--------
arch/x86/kernel/signal_32.c | 8 ++++----
arch/x86/kernel/signal_64.c | 9 +++++----
6 files changed, 31 insertions(+), 25 deletions(-)
diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 611fa41711af..deaa81f44c2a 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -29,7 +29,8 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
unsigned long fpu__get_fpstate_size(void);
-extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
+extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size,
+ u32 pkru);
extern void fpu__clear_user_states(struct fpu *fpu);
extern bool fpu__restore_sig(void __user *buf, int ia32_frame);
diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index e770c4fc47f4..de458354a3ea 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -17,11 +17,11 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
void __user *
get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
- void __user **fpstate);
+ void __user **fpstate, u32 pkru);
-int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs);
-int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
-int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
-int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
+int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
+int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
+int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
+int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
#endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 247f2225aa9f..7654f368accd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -156,7 +156,8 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
return !err;
}
-static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
+static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf,
+ u32 pkru)
{
if (use_xsave())
return xsave_to_user_sigframe(buf);
@@ -185,7 +186,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
* For [f]xsave state, update the SW reserved fields in the [f]xsave frame
* indicating the absence/presence of the extended state to the user.
*/
-bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
+bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size,
+ u32 pkru)
{
struct task_struct *tsk = current;
struct fpstate *fpstate = tsk->thread.fpu.fpstate;
@@ -228,7 +230,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
fpregs_restore_userregs();
pagefault_disable();
- ret = copy_fpregs_to_sigframe(buf_fx);
+ ret = copy_fpregs_to_sigframe(buf_fx, pkru);
pagefault_enable();
fpregs_unlock();
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 31b6f5dddfc2..d1c84b7f6852 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -74,7 +74,7 @@ static inline int is_x32_frame(struct ksignal *ksig)
*/
void __user *
get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
- void __user **fpstate)
+ void __user **fpstate, u32 pkru)
{
struct k_sigaction *ka = &ksig->ka;
int ia32_frame = is_ia32_frame(ksig);
@@ -139,7 +139,8 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
}
/* save i387 and extended state */
- if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size))
+ if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx,
+ math_size, pkru))
return (void __user *)-1L;
return (void __user *)sp;
@@ -206,7 +207,7 @@ unsigned long get_sigframe_size(void)
}
static int
-setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
/* Perform fixup for the pre-signal frame. */
rseq_signal_deliver(ksig, regs);
@@ -214,13 +215,13 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
/* Set up the stack frame */
if (is_ia32_frame(ksig)) {
if (ksig->ka.sa.sa_flags & SA_SIGINFO)
- return ia32_setup_rt_frame(ksig, regs);
+ return ia32_setup_rt_frame(ksig, regs, pkru);
else
- return ia32_setup_frame(ksig, regs);
+ return ia32_setup_frame(ksig, regs, pkru);
} else if (is_x32_frame(ksig)) {
- return x32_setup_rt_frame(ksig, regs);
+ return x32_setup_rt_frame(ksig, regs, pkru);
} else {
- return x64_setup_rt_frame(ksig, regs);
+ return x64_setup_rt_frame(ksig, regs, pkru);
}
}
@@ -229,6 +230,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
bool stepping, failed;
struct fpu *fpu = ¤t->thread.fpu;
+ u32 pkru = read_pkru();
if (v8086_mode(regs))
save_v86_state((struct kernel_vm86_regs *) regs, VM86_SIGNAL);
@@ -264,7 +266,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
if (stepping)
user_disable_single_step(current);
- failed = (setup_rt_frame(ksig, regs) < 0);
+ failed = (setup_rt_frame(ksig, regs, pkru) < 0);
if (!failed) {
/*
* Clear the direction flag as per the ABI for function entry.
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index c12624bc82a3..68f2bfd7d6e7 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -228,7 +228,7 @@ do { \
goto label; \
} while(0)
-int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs)
+int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
sigset32_t *set = (sigset32_t *) sigmask_to_save();
struct sigframe_ia32 __user *frame;
@@ -246,7 +246,7 @@ int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs)
0x80cd, /* int $0x80 */
};
- frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru);
if (ksig->ka.sa.sa_flags & SA_RESTORER) {
restorer = ksig->ka.sa.sa_restorer;
@@ -299,7 +299,7 @@ int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs)
return -EFAULT;
}
-int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
sigset32_t *set = (sigset32_t *) sigmask_to_save();
struct rt_sigframe_ia32 __user *frame;
@@ -319,7 +319,7 @@ int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
0,
};
- frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru);
if (!user_access_begin(frame, sizeof(*frame)))
return -EFAULT;
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index 23d8aaf8d9fd..b5c9535ff08b 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -161,7 +161,7 @@ static unsigned long frame_uc_flags(struct pt_regs *regs)
return flags;
}
-int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
sigset_t *set = sigmask_to_save();
struct rt_sigframe __user *frame;
@@ -172,7 +172,8 @@ int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
if (!(ksig->ka.sa.sa_flags & SA_RESTORER))
return -EFAULT;
- frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp,
+ pkru);
uc_flags = frame_uc_flags(regs);
if (!user_access_begin(frame, sizeof(*frame)))
@@ -300,7 +301,7 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *to,
return __copy_siginfo_to_user32(to, from);
}
-int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
compat_sigset_t *set = (compat_sigset_t *) sigmask_to_save();
struct rt_sigframe_x32 __user *frame;
@@ -311,7 +312,7 @@ int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
if (!(ksig->ka.sa.sa_flags & SA_RESTORER))
return -EFAULT;
- frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru);
uc_flags = frame_uc_flags(regs);
--
2.39.3
If the alternate signal stack is protected by a different pkey than the
current execution stack, copying xsave data to the altsigstack will fail
if its pkey is not enabled. This commit enables all pkeys before xsave,
so that the signal handler accessibility is not dictated by the PKRU
value that the thread sets up. It then writes the original PKRU value
onto the sigframe so that it's restored correctly from sigcontext.
Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 11 +++++++++--
arch/x86/kernel/signal.c | 3 +++
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index dce84cce7cf8..5d52c7fde43b 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -185,8 +185,15 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf,
u32 pkru)
{
- if (use_xsave())
- return xsave_to_user_sigframe(buf);
+ int err = 0;
+
+ if (use_xsave()) {
+ err = xsave_to_user_sigframe(buf);
+ if (!err && cpu_feature_enabled(X86_FEATURE_OSPKE))
+ err = __update_pkru_in_sigframe(buf, pkru);
+ return err;
+ }
+
if (use_fxsr())
return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
else
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 75dfd05c59aa..c985bdfd855a 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -278,6 +278,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
if (stepping)
user_disable_single_step(current);
+ pkru = sig_prepare_pkru();
failed = (setup_rt_frame(ksig, regs, pkru) < 0);
if (!failed) {
/*
@@ -295,6 +296,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
* Ensure the signal handler starts with the new fpu state.
*/
fpu__clear_user_states(fpu);
+ } else {
+ write_pkru(pkru);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.39.3
This patch adds helper functions that will update PKRU value on the
sigframe after XSAVE.
These functions will be called in a later patch; this patch does not
change any behavior as yet.
Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 26 ++++++++++++++++++++++++++
arch/x86/kernel/fpu/xstate.c | 13 +++++++++++++
arch/x86/kernel/fpu/xstate.h | 1 +
arch/x86/kernel/signal.c | 12 ++++++++++++
4 files changed, 52 insertions(+)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 7654f368accd..dce84cce7cf8 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -63,6 +63,32 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
return true;
}
+/*
+ * Update the value of PKRU register that was already pushed
+ * onto the signal frame.
+ */
+static inline int
+__update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
+{
+ int err = -EFAULT;
+ struct _fpx_sw_bytes fx_sw;
+ struct pkru_state *pk = NULL;
+
+ if (unlikely(!check_xstate_in_sigframe((void __user *) buf, &fx_sw)))
+ goto out;
+
+ pk = get_xsave_addr_user(buf, XFEATURE_PKRU);
+ if (!pk || !user_write_access_begin(buf, sizeof(struct xregs_state)))
+ goto out;
+ unsafe_put_user(pkru, (unsigned int __user *) pk, uaccess_end);
+
+ err = 0;
+uaccess_end:
+ user_access_end();
+out:
+ return err;
+}
+
/*
* Signal frame handlers.
*/
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 33a214b1a4ce..8395a3633709 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -992,6 +992,19 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
return __raw_xsave_addr(xsave, xfeature_nr);
}
+/*
+ * Given an xstate feature nr, calculate where in the xsave
+ * buffer the state is. The xsave buffer should be in standard
+ * format, not compacted (e.g. user mode signal frames).
+ */
+void *get_xsave_addr_user(struct xregs_state __user *xsave, int xfeature_nr)
+{
+ if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
+ return NULL;
+
+ return (void *)xsave + xstate_offsets[xfeature_nr];
+}
+
#ifdef CONFIG_ARCH_HAS_PKEYS
/*
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 19ca623ffa2a..6511797940ad 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -55,6 +55,7 @@ extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system_xstate(unsigned int legacy_size);
extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
+extern void *get_xsave_addr_user(struct xregs_state *xsave, int xfeature_nr);
static inline u64 xfeatures_mask_supervisor(void)
{
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index d1c84b7f6852..75dfd05c59aa 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -225,6 +225,18 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
}
}
+/*
+ * Enable all pkeys to ensure that both the current stack and the alternate
+ * signal stack are always writeable.
+ */
+static inline u32 sig_prepare_pkru(void)
+{
+ u32 orig_pkru = read_pkru();
+
+ write_pkru(0);
+ return orig_pkru;
+}
+
static void
handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
--
2.39.3
This commit adds a few new tests to exercise the signal handler flow,
especially with pkey 0 disabled.
Signed-off-by: Keith Lucas <[email protected]>
Signed-off-by: Aruna Ramakrishna <[email protected]>
---
tools/testing/selftests/mm/Makefile | 2 +
tools/testing/selftests/mm/pkey-helpers.h | 11 +-
.../selftests/mm/pkey_sighandler_tests.c | 315 ++++++++++++++++++
tools/testing/selftests/mm/protection_keys.c | 10 -
4 files changed, 327 insertions(+), 11 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_sighandler_tests.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index eb5f39a2668b..2f90344ad11e 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -82,6 +82,7 @@ CAN_BUILD_X86_64 := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_64bit_pr
CAN_BUILD_WITH_NOPIE := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_program.c -no-pie)
VMTARGETS := protection_keys
+VMTARGETS := pkey_sighandler_tests
BINARIES_32 := $(VMTARGETS:%=%_32)
BINARIES_64 := $(VMTARGETS:%=%_64)
@@ -100,6 +101,7 @@ else
ifneq (,$(findstring $(ARCH),ppc64))
TEST_GEN_FILES += protection_keys
+TEST_GEN_FILES += pkey_sighandler_tests
endif
endif
diff --git a/tools/testing/selftests/mm/pkey-helpers.h b/tools/testing/selftests/mm/pkey-helpers.h
index 1af3156a9db8..a0766e3d9ab6 100644
--- a/tools/testing/selftests/mm/pkey-helpers.h
+++ b/tools/testing/selftests/mm/pkey-helpers.h
@@ -79,7 +79,16 @@ extern void abort_hooks(void);
} \
} while (0)
-__attribute__((noinline)) int read_ptr(int *ptr);
+#define barrier() __asm__ __volatile__("": : :"memory")
+__attribute__((noinline)) int read_ptr(int *ptr)
+{
+ /*
+ * * Keep GCC from optimizing this away somehow
+ * */
+ barrier();
+ return *ptr;
+}
+
void expected_pkey_fault(int pkey);
int sys_pkey_alloc(unsigned long flags, unsigned long init_val);
int sys_pkey_free(unsigned long pkey);
diff --git a/tools/testing/selftests/mm/pkey_sighandler_tests.c b/tools/testing/selftests/mm/pkey_sighandler_tests.c
new file mode 100644
index 000000000000..eb4155859846
--- /dev/null
+++ b/tools/testing/selftests/mm/pkey_sighandler_tests.c
@@ -0,0 +1,315 @@
+#define _GNU_SOURCE
+#define __SANE_USERSPACE_TYPES__
+#include <errno.h>
+#include <sys/syscall.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <limits.h>
+
+#include "pkey-helpers.h"
+
+/*
+ * Compile with:
+ * gcc -mxsave -o pkey_sighandler_tests -O2 -g -std=gnu99 -pthread -Wall pkey_sighandler_tests.c -lrt -ldl -lm
+ */
+
+#define STACK_SIZE PTHREAD_STACK_MIN
+
+void expected_pkey_fault(int pkey) {}
+
+pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
+siginfo_t siginfo = {0};
+
+static inline __attribute__((always_inline)) long
+syscall_raw(long n, long a1, long a2, long a3, long a4, long a5, long a6)
+{
+ unsigned long ret;
+ register long r10 asm("r10") = a4;
+ register long r8 asm("r8") = a5;
+ register long r9 asm("r9") = a6;
+ asm volatile ("syscall"
+ : "=a"(ret)
+ : "a"(n), "D"(a1), "S"(a2), "d"(a3), "r"(r10), "r"(r8), "r"(r9)
+ : "rcx", "r11", "memory");
+ return ret;
+}
+
+
+static void sigsegv_handler(int signo, siginfo_t *info, void *ucontext)
+{
+ pthread_mutex_lock(&mutex);
+
+ memcpy(&siginfo, info, sizeof(siginfo_t));
+
+ pthread_cond_signal(&cond);
+ pthread_mutex_unlock(&mutex);
+
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+}
+
+static void sigusr1_handler(int signo, siginfo_t *info, void *ucontext)
+{
+ pthread_mutex_lock(&mutex);
+
+ memcpy(&siginfo, info, sizeof(siginfo_t));
+
+ pthread_cond_signal(&cond);
+ pthread_mutex_unlock(&mutex);
+}
+
+static void *thread_segv_with_pkey0_disabled(void *ptr)
+{
+ /* Disable MPK 0 (and all others too) */
+ __write_pkey_reg(0x55555555);
+
+ /* Segfault (with SEGV_MAPERR) */
+ *(int *) (0x1) = 1;
+ return NULL;
+}
+
+static void *thread_segv_pkuerr_stack(void *ptr)
+{
+ /* Disable MPK 0 (and all others too) */
+ __write_pkey_reg(0x55555555);
+
+ /* After we disable MPK 0, we can't access the stack to return */
+ return NULL;
+}
+
+static void *thread_segv_maperr_ptr(void *ptr)
+{
+ stack_t *stack = ptr;
+ int *bad = (int *) 1;
+
+ /*
+ * Setup alternate signal stack, which should be pkey_mprotect()ed by
+ * MPK 0. The thread's stack cannot be used for signals because it is
+ * not accessible by the default init_pkru value of 0x55555554.
+ */
+ syscall_raw(SYS_sigaltstack, (long)stack, 0, 0, 0, 0, 0);
+
+ /* Disable MPK 0. Only MPK 1 is enabled. */
+ __write_pkey_reg(0x55555551);
+
+ /* Segfault */
+ *bad = 1;
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+ return NULL;
+}
+
+/*
+ * Verify that the sigsegv handler is invoked when pkey 0 is disabled.
+ * Note that the new thread stack and the alternate signal stack is
+ * protected by MPK 0.
+ */
+static void test_sigsegv_handler_with_pkey0_disabled(void)
+{
+ struct sigaction sa;
+
+ sa.sa_flags = SA_SIGINFO;
+
+ sa.sa_sigaction = sigsegv_handler;
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ pthread_t thr;
+ pthread_attr_t attr;
+ pthread_attr_init(&attr);
+ pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
+
+ pthread_create(&thr, &attr, thread_segv_with_pkey0_disabled, NULL);
+
+ pthread_mutex_lock(&mutex);
+ while(siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ assert(siginfo.si_signo == SIGSEGV);
+ assert(siginfo.si_code == SEGV_MAPERR);
+ assert(siginfo.si_addr == (void *)1);
+ printf("%s passed!\n", __func__);
+}
+
+/*
+ * Verify that the sigsegv handler is invoked when pkey 0 is disabled.
+ * Note that the new thread stack and the alternate signal stack is
+ * protected by MPK 0, which renders them inaccessible when MPK 0
+ * is disabled. So just the return from the thread should cause a
+ * segfault with SEGV_PKUERR.
+ */
+static void test_sigsegv_handler_cannot_access_stack(void)
+{
+ struct sigaction sa;
+
+ sa.sa_flags = SA_SIGINFO;
+
+ sa.sa_sigaction = sigsegv_handler;
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ pthread_t thr;
+ pthread_attr_t attr;
+ pthread_attr_init(&attr);
+ pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
+
+ pthread_create(&thr, &attr, thread_segv_pkuerr_stack, NULL);
+
+ pthread_mutex_lock(&mutex);
+ while(siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ assert(siginfo.si_signo == SIGSEGV);
+ assert(siginfo.si_code == SEGV_PKUERR);
+ printf("%s passed!\n", __func__);
+}
+
+/*
+ * Verify that the sigsegv handler that uses an alternate signal stack
+ * is correctly invoked for a thread which uses a non-zero MPK to protect
+ * its own stack, and disables all other MPKs (including 0).
+ */
+static void test_sigsegv_handler_with_different_pkey_for_stack(void)
+{
+ struct sigaction sa;
+ static stack_t sigstack;
+ void *stack;
+ int pkey;
+ int parentPid = 0;
+ int childPid = 0;
+
+ sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
+
+ sa.sa_sigaction = sigsegv_handler;
+
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ stack = mmap(0, STACK_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+ assert(stack != MAP_FAILED);
+
+ /* Allow access to MPK 0 and MPK 1 */
+ __write_pkey_reg(0x55555550);
+
+ /* Protect the new stack with MPK 1 */
+ pkey = pkey_alloc(0, 0);
+ pkey_mprotect(stack, STACK_SIZE, PROT_READ | PROT_WRITE, pkey);
+
+ /* Set up alternate signal stack that will use the default MPK */
+ sigstack.ss_sp = mmap(0, STACK_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ sigstack.ss_flags = 0;
+ sigstack.ss_size = STACK_SIZE;
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ /* Use clone to avoid newer glibcs using rseq on new threads */
+ long ret = syscall_raw(SYS_clone,
+ CLONE_VM | CLONE_FS | CLONE_FILES |
+ CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
+ CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
+ CLONE_DETACHED,
+ (long)((char *)(stack) + STACK_SIZE),
+ (long)&parentPid,
+ (long)&childPid, 0, 0);
+
+ if (ret < 0) {
+ errno = -ret;
+ perror("clone");
+ } else if (ret == 0) {
+ thread_segv_maperr_ptr(&sigstack);
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+ }
+
+ pthread_mutex_lock(&mutex);
+ while(siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ assert(siginfo.si_signo == SIGSEGV);
+ assert(siginfo.si_code == SEGV_MAPERR);
+ assert(siginfo.si_addr == (void *)1);
+ printf("%s passed!\n", __func__);
+}
+
+/*
+ * Verify that the PKRU value set by the application is correctly
+ * restored upon return from signal handling.
+ */
+static void test_pkru_preserved_after_sigusr1(void)
+{
+ struct sigaction sa;
+ unsigned long pkru = 0x45454544;
+
+ sa.sa_flags = SA_SIGINFO;
+
+ sa.sa_sigaction = sigusr1_handler;
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGUSR1, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ __write_pkey_reg(pkru);
+
+ raise(SIGUSR1);
+
+ pthread_mutex_lock(&mutex);
+ while(siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ /* Ensure the pkru value is the same after returning from signal. */
+ assert(pkru == __read_pkey_reg());
+ assert(siginfo.si_signo == SIGUSR1);
+ printf("%s passed!\n", __func__);
+}
+
+
+void (*pkey_tests[])(void) = {
+ test_sigsegv_handler_with_pkey0_disabled,
+ test_sigsegv_handler_cannot_access_stack,
+ test_sigsegv_handler_with_different_pkey_for_stack,
+ test_pkru_preserved_after_sigusr1,
+};
+
+int main(int argc, char *argv[])
+{
+ size_t i;
+
+ for (i = 0; i < sizeof(pkey_tests)/sizeof(pkey_tests[0]); i++) {
+ (*pkey_tests[i])();
+ }
+
+ printf("All pkey-signal-handling tests PASSED!\n");
+ return 0;
+}
+
diff --git a/tools/testing/selftests/mm/protection_keys.c b/tools/testing/selftests/mm/protection_keys.c
index 374a308174d2..ec1e1d30ea6f 100644
--- a/tools/testing/selftests/mm/protection_keys.c
+++ b/tools/testing/selftests/mm/protection_keys.c
@@ -951,16 +951,6 @@ void close_test_fds(void)
nr_test_fds = 0;
}
-#define barrier() __asm__ __volatile__("": : :"memory")
-__attribute__((noinline)) int read_ptr(int *ptr)
-{
- /*
- * Keep GCC from optimizing this away somehow
- */
- barrier();
- return *ptr;
-}
-
void test_pkey_alloc_free_attach_pkey0(int *ptr, u16 pkey)
{
int i, err;
--
2.39.3
Hi Aruna,
kernel test robot noticed the following build errors:
[auto build test ERROR on tip/x86/core]
[also build test ERROR on akpm-mm/mm-everything tip/master linus/master v6.9-rc5]
[cannot apply to tip/auto-latest next-20240426]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Aruna-Ramakrishna/x86-pkeys-Signal-handling-function-interface-changes-to-accept-PKRU-as-a-parameter/20240426-020800
base: tip/x86/core
patch link: https://lore.kernel.org/r/20240425180542.1042933-5-aruna.ramakrishna%40oracle.com
patch subject: [PATCH v3 4/4] selftests/mm: Add new testcases for pkeys
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/r/[email protected]/
All errors:
linux/tools/testing/selftests/mm$ make
..
gcc -Wall -I /root/linux/tools/testing/selftests/../../.. -isystem /root/linux/tools/testing/selftests/../../../usr/include -no-pie -m32 -mxsave pkey_sighandler_tests.c vm_util.c thp_settings.c -lrt -lpthread -lm -lrt -ldl -lm -o /root/linux/tools/testing/selftests/mm/pkey_sighandler_tests_32
pkey_sighandler_tests.c: In function 'sigsegv_handler':
pkey_sighandler_tests.c:38:23: error: the register specified for 'r10' cannot be accessed by the current target
38 | register long r10 asm("r10") = a4;
| ^~~
pkey_sighandler_tests.c:39:23: error: the register specified for 'r8' cannot be accessed by the current target
39 | register long r8 asm("r8") = a5;
| ^~
pkey_sighandler_tests.c:40:23: error: the register specified for 'r9' cannot be accessed by the current target
40 | register long r9 asm("r9") = a6;
| ^~
In function 'syscall_raw',
inlined from 'sigsegv_handler' at pkey_sighandler_tests.c:58:2:
pkey_sighandler_tests.c:41:9: error: the register 'r11' cannot be clobbered in 'asm' for the current target
41 | asm volatile ("syscall"
| ^~~
pkey_sighandler_tests.c: In function 'thread_segv_maperr_ptr':
pkey_sighandler_tests.c:38:23: error: the register specified for 'r10' cannot be accessed by the current target
38 | register long r10 asm("r10") = a4;
| ^~~
pkey_sighandler_tests.c:39:23: error: the register specified for 'r8' cannot be accessed by the current target
39 | register long r8 asm("r8") = a5;
| ^~
pkey_sighandler_tests.c:40:23: error: the register specified for 'r9' cannot be accessed by the current target
40 | register long r9 asm("r9") = a6;
| ^~
pkey_sighandler_tests.c:38:23: error: the register specified for 'r10' cannot be accessed by the current target
38 | register long r10 asm("r10") = a4;
| ^~~
pkey_sighandler_tests.c:39:23: error: the register specified for 'r8' cannot be accessed by the current target
39 | register long r8 asm("r8") = a5;
| ^~
pkey_sighandler_tests.c:40:23: error: the register specified for 'r9' cannot be accessed by the current target
40 | register long r9 asm("r9") = a6;
| ^~
In function 'syscall_raw',
inlined from 'thread_segv_maperr_ptr' at pkey_sighandler_tests.c:100:9:
pkey_sighandler_tests.c:41:9: error: the register 'r11' cannot be clobbered in 'asm' for the current target
41 | asm volatile ("syscall"
| ^~~
In function 'syscall_raw',
inlined from 'thread_segv_maperr_ptr' at pkey_sighandler_tests.c:107:2:
pkey_sighandler_tests.c:41:9: error: the register 'r11' cannot be clobbered in 'asm' for the current target
41 | asm volatile ("syscall"
| ^~~
pkey_sighandler_tests.c: In function 'test_sigsegv_handler_with_different_pkey_for_stack':
pkey_sighandler_tests.c:38:23: error: the register specified for 'r10' cannot be accessed by the current target
38 | register long r10 asm("r10") = a4;
| ^~~
pkey_sighandler_tests.c:39:23: error: the register specified for 'r8' cannot be accessed by the current target
39 | register long r8 asm("r8") = a5;
| ^~
pkey_sighandler_tests.c:40:23: error: the register specified for 'r9' cannot be accessed by the current target
40 | register long r9 asm("r9") = a6;
| ^~
pkey_sighandler_tests.c:38:23: error: the register specified for 'r10' cannot be accessed by the current target
38 | register long r10 asm("r10") = a4;
| ^~~
pkey_sighandler_tests.c:39:23: error: the register specified for 'r8' cannot be accessed by the current target
39 | register long r8 asm("r8") = a5;
| ^~
pkey_sighandler_tests.c:40:23: error: the register specified for 'r9' cannot be accessed by the current target
40 | register long r9 asm("r9") = a6;
| ^~
In function 'syscall_raw',
inlined from 'test_sigsegv_handler_with_different_pkey_for_stack' at pkey_sighandler_tests.c:233:13:
pkey_sighandler_tests.c:41:9: error: the register 'r11' cannot be clobbered in 'asm' for the current target
41 | asm volatile ("syscall"
| ^~~
In function 'syscall_raw',
inlined from 'test_sigsegv_handler_with_different_pkey_for_stack' at pkey_sighandler_tests.c:247:3:
pkey_sighandler_tests.c:41:9: error: the register 'r11' cannot be clobbered in 'asm' for the current target
41 | asm volatile ("syscall"
| ^~~
make: *** [Makefile:146: /root/linux/tools/testing/selftests/mm/pkey_sighandler_tests_32] Error 1
linux/tools/testing/selftests/mm$ make LLVM=1
..
clang --target=x86_64-linux-gnu -fintegrated-as -Wall -I /root/linux/tools/testing/selftests/../../.. -isystem /root/linux/tools/testing/selftests/../../../usr/include -no-pie -m32 -mxsave pkey_sighandler_tests.c vm_util.c thp_settings.c -lrt -lpthread -lm -lrt -ldl -lm -o /root/linux/tools/testing/selftests/mm/pkey_sighandler_tests_32
pkey_sighandler_tests.c:41:16: error: couldn't allocate input reg for constraint '{r10}'
41 | asm volatile ("syscall"
| ^
pkey_sighandler_tests.c:41:16: error: couldn't allocate input reg for constraint '{r10}'
pkey_sighandler_tests.c:41:16: error: couldn't allocate input reg for constraint '{r10}'
pkey_sighandler_tests.c:41:16: error: couldn't allocate input reg for constraint '{r10}'
pkey_sighandler_tests.c:41:16: error: couldn't allocate input reg for constraint '{r10}'
5 errors generated.
make: *** [Makefile:146: /root/linux/tools/testing/selftests/mm/pkey_sighandler_tests_32] Error 1
vim +41 tools/testing/selftests/mm/pkey_sighandler_tests.c
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 33
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 34 static inline __attribute__((always_inline)) long
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 35 syscall_raw(long n, long a1, long a2, long a3, long a4, long a5, long a6)
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 36 {
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 37 unsigned long ret;
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 38 register long r10 asm("r10") = a4;
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 39 register long r8 asm("r8") = a5;
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 40 register long r9 asm("r9") = a6;
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 @41 asm volatile ("syscall"
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 42 : "=a"(ret)
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 43 : "a"(n), "D"(a1), "S"(a2), "d"(a3), "r"(r10), "r"(r8), "r"(r9)
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 44 : "rcx", "r11", "memory");
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 45 return ret;
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 46 }
6b81215d7e8a34 Aruna Ramakrishna 2024-04-25 47
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
> This commit adds a few new tests to exercise the signal handler flow,
"This commit" is equally useless as "This patch". See
Documentation/process/ and grep for "This patch".
> especially with pkey 0 disabled.
>
> Signed-off-by: Keith Lucas <[email protected]>
> Signed-off-by: Aruna Ramakrishna <[email protected]>
> ---
> tools/testing/selftests/mm/Makefile | 2 +
> tools/testing/selftests/mm/pkey-helpers.h | 11 +-
> .../selftests/mm/pkey_sighandler_tests.c | 315 ++++++++++++++++++
> tools/testing/selftests/mm/protection_keys.c | 10 -
> 4 files changed, 327 insertions(+), 11 deletions(-)
> create mode 100644 tools/testing/selftests/mm/pkey_sighandler_tests.c
>
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index eb5f39a2668b..2f90344ad11e 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -82,6 +82,7 @@ CAN_BUILD_X86_64 := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_64bit_pr
> CAN_BUILD_WITH_NOPIE := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_program.c -no-pie)
>
> VMTARGETS := protection_keys
> +VMTARGETS := pkey_sighandler_tests
> BINARIES_32 := $(VMTARGETS:%=%_32)
> BINARIES_64 := $(VMTARGETS:%=%_64)
>
> @@ -100,6 +101,7 @@ else
>
> ifneq (,$(findstring $(ARCH),ppc64))
> TEST_GEN_FILES += protection_keys
> +TEST_GEN_FILES += pkey_sighandler_tests
> endif
>
> endif
> diff --git a/tools/testing/selftests/mm/pkey-helpers.h b/tools/testing/selftests/mm/pkey-helpers.h
> index 1af3156a9db8..a0766e3d9ab6 100644
> --- a/tools/testing/selftests/mm/pkey-helpers.h
> +++ b/tools/testing/selftests/mm/pkey-helpers.h
> @@ -79,7 +79,16 @@ extern void abort_hooks(void);
> } \
> } while (0)
>
> -__attribute__((noinline)) int read_ptr(int *ptr);
> +#define barrier() __asm__ __volatile__("": : :"memory")
#include <linux/compiler.h>
> +__attribute__((noinline)) int read_ptr(int *ptr)
> +{
> + /*
> + * * Keep GCC from optimizing this away somehow
> + * */
That comment is completely malformatted.
> + barrier();
> + return *ptr;
White space damage.
> +
> +static inline __attribute__((always_inline)) long
> +syscall_raw(long n, long a1, long a2, long a3, long a4, long a5, long a6)
> +{
What for? syscall(2) provides exactly what you want, no?
> + unsigned long ret;
> + register long r10 asm("r10") = a4;
> + register long r8 asm("r8") = a5;
> + register long r9 asm("r9") = a6;
> + asm volatile ("syscall"
> + : "=a"(ret)
> + : "a"(n), "D"(a1), "S"(a2), "d"(a3), "r"(r10), "r"(r8), "r"(r9)
> + : "rcx", "r11", "memory");
Aside of that this breaks on 32-bit builds.
> + return ret;
> +}
> +
> +static void *thread_segv_with_pkey0_disabled(void *ptr)
> +{
> + /* Disable MPK 0 (and all others too) */
> + __write_pkey_reg(0x55555555);
> +
> + /* Segfault (with SEGV_MAPERR) */
Please use tabs for indentation. (All over the place)
> + *(int *) (0x1) = 1;
> + return NULL;
> +}
> +
> +/*
> + * Verify that the sigsegv handler is invoked when pkey 0 is disabled.
> + * Note that the new thread stack and the alternate signal stack is
> + * protected by MPK 0.
> + */
> +static void test_sigsegv_handler_with_pkey0_disabled(void)
> +{
> + struct sigaction sa;
> +
> + sa.sa_flags = SA_SIGINFO;
> +
> + sa.sa_sigaction = sigsegv_handler;
> + sigemptyset(&sa.sa_mask);
> + if (sigaction(SIGSEGV, &sa, NULL) == -1) {
> + perror("sigaction");
> + exit(EXIT_FAILURE);
> + }
> +
> + memset(&siginfo, 0, sizeof(siginfo));
> +
> + pthread_t thr;
> + pthread_attr_t attr;
> + pthread_attr_init(&attr);
> + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
> +
> + pthread_create(&thr, &attr, thread_segv_with_pkey0_disabled, NULL);
> +
> + pthread_mutex_lock(&mutex);
> + while(siginfo.si_signo == 0)
> + pthread_cond_wait(&cond, &mutex);
> + pthread_mutex_unlock(&mutex);
> +
> + assert(siginfo.si_signo == SIGSEGV);
> + assert(siginfo.si_code == SEGV_MAPERR);
> + assert(siginfo.si_addr == (void *)1);
This should not use assert(). This wants to use proper kselftest result
and exit mechanisms all over the place.
> + printf("%s passed!\n", __func__);
Ditto for printf().
> +}
> +/*
> + * Verify that the sigsegv handler that uses an alternate signal stack
> + * is correctly invoked for a thread which uses a non-zero MPK to protect
> + * its own stack, and disables all other MPKs (including 0).
> + */
> +static void test_sigsegv_handler_with_different_pkey_for_stack(void)
> +{
> + struct sigaction sa;
> + static stack_t sigstack;
> + void *stack;
> + int pkey;
> + int parentPid = 0;
> + int childPid = 0;
No camel case please
> +int main(int argc, char *argv[])
> +{
> + size_t i;
size_t? What's wrong with int?
> +
> + for (i = 0; i < sizeof(pkey_tests)/sizeof(pkey_tests[0]); i++) {
ARRAY_SIZE() ?
> + (*pkey_tests[i])();
> + }
Pointless brackets.
Thanks,
tglx
On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
>
> -extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
> +extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size,
> + u32 pkru);
No line break required. Line length is 100 characters. All over the place.
> @@ -229,6 +230,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
> {
> bool stepping, failed;
> struct fpu *fpu = ¤t->thread.fpu;
> + u32 pkru = read_pkru();
Please use reverse fir tree ordering of variables:
https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#variable-declarations
Thanks,
tglx
On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
> This patch adds helper functions that will update PKRU value on the
git grep 'This patch' Documentation/process/
Also please explain WHY this is needed and not just what.
> sigframe after XSAVE.
..
> +/*
> + * Update the value of PKRU register that was already pushed
> + * onto the signal frame.
> + */
> +static inline int
> +__update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
No line break and why does this need two underscores in the function name?
> +{
> + int err = -EFAULT;
> + struct _fpx_sw_bytes fx_sw;
> + struct pkru_state *pk = NULL;
Why assign NULL to pk?
Also this wants to have a
if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE)))
return 0;
Instead of doing it at the call site.
> + if (unlikely(!check_xstate_in_sigframe((void __user *) buf, &fx_sw)))
What is this check for?
More interesting: How is this check supposed to succeed at all?
copy_fpstate_to_sigframe()
....
copy_fpregs_to_sigframe()
xsave_to_user_sigframe();
__update_pkru_in_sigframe();
save_xstate_epilog();
check_xstate_in_sigframe() validates the full frame including what
save_xstate_epilog() writes afterwards. So this clearly cannot work.
> + goto out;
What's wrong with 'return -EFAULT;'?
> + pk = get_xsave_addr_user(buf, XFEATURE_PKRU);
> + if (!pk || !user_write_access_begin(buf, sizeof(struct xregs_state)))
> + goto out;
Why user_write_access_begin()?
1) The access to the FPU frame on the stack has been validated
already in copy_fpstate_to_sigframe() _before_
copy_fpregs_to_sigframe() is invoked.
2) This does not require the nospec_barrier() as this is not a user
controlled potentially malicious access.
> + unsafe_put_user(pkru, (unsigned int __user *) pk, uaccess_end);
This type case would need __force to be valid for make C=1.
But that's not required at all because get_xsave_addr_user() should
return a user pointer in the first place.
> +
> + err = 0;
> +uaccess_end:
> + user_access_end();
> +out:
> + return err;
So none of the above voodoo is required at all.
return __put_user(pkru, get_xsave_addr_user(buf, XFEATURE_PKRU));
Is all what's needed, no?
> +/*
> + * Given an xstate feature nr, calculate where in the xsave
> + * buffer the state is. The xsave buffer should be in standard
> + * format, not compacted (e.g. user mode signal frames).
> + */
> +void *get_xsave_addr_user(struct xregs_state __user *xsave, int xfeature_nr)
void __user *
> +{
> + if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
> + return NULL;
> +
> + return (void *)xsave + xstate_offsets[xfeature_nr];
return (void __user *)....
Thanks,
tglx
On Tue, May 07 2024 at 14:05, Thomas Gleixner wrote:
> On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
>> especially with pkey 0 disabled.
>>
>> Signed-off-by: Keith Lucas <[email protected]>
>> Signed-off-by: Aruna Ramakrishna <[email protected]>
Forgot to mention that this Signed-off-by chain is invalid.
On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
> If the alternate signal stack is protected by a different pkey than the
> current execution stack, copying xsave data to the altsigstack will fail
> if its pkey is not enabled. This commit enables all pkeys before
> xsave,
This commit (patch) ....
Also this lacks any justification why this enables all pkeys and how
that is the right thing to do instead of using init_pkru_value which
is what is set by fpu__clear_user_states() before going back to user
space. For signal handling this can be the only valid PKEY state unless
I'm missing something here.
> static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf,
> u32 pkru)
> {
> - if (use_xsave())
> - return xsave_to_user_sigframe(buf);
> + int err = 0;
> +
> + if (use_xsave()) {
> + err = xsave_to_user_sigframe(buf);
> + if (!err && cpu_feature_enabled(X86_FEATURE_OSPKE))
The CPU feature check really wants to be in update_pkru_in_sigframe()
> @@ -278,6 +278,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
> if (stepping)
> user_disable_single_step(current);
>
> + pkru = sig_prepare_pkru();
pkru is defined in the first patch:
> + u32 pkru = read_pkru();
Why do we need a read and then another read in sig_prepare_pkru()?
Also this lacks a comment what the sig_prepare_pkru() invocation is for ...
> failed = (setup_rt_frame(ksig, regs, pkru) < 0);
> if (!failed) {
> /*
> @@ -295,6 +296,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
> * Ensure the signal handler starts with the new fpu state.
> */
> fpu__clear_user_states(fpu);
> + } else {
> + write_pkru(pkru);
.. and a corresponding comment why this needs to be restored here.
> }
> signal_setup_done(failed, ksig, stepping);
> }
Thanks,
tglx
> On May 7, 2024, at 9:16 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
>> This patch adds helper functions that will update PKRU value on the
>
> git grep 'This patch' Documentation/process/
>
> Also please explain WHY this is needed and not just what.
Patch 1/4 has the “why”. Should I expand more in that commit message?
Or are you specifically asking about why the helper functions are needed;
I can certainly elaborate on that here.
Thanks very much for your detailed review. I will make the required corrections.
Thanks,
Aruna
>
>> sigframe after XSAVE.
>
> ...
>
>> +/*
>> + * Update the value of PKRU register that was already pushed
>> + * onto the signal frame.
>> + */
>> +static inline int
>> +__update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
>
> No line break and why does this need two underscores in the function name?
>
>> +{
>> + int err = -EFAULT;
>> + struct _fpx_sw_bytes fx_sw;
>> + struct pkru_state *pk = NULL;
>
> Why assign NULL to pk?
>
> Also this wants to have a
>
> if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE)))
> return 0;
>
> Instead of doing it at the call site.
>
>> + if (unlikely(!check_xstate_in_sigframe((void __user *) buf, &fx_sw)))
>
> What is this check for?
>
> More interesting: How is this check supposed to succeed at all?
>
> copy_fpstate_to_sigframe()
> ....
> copy_fpregs_to_sigframe()
> xsave_to_user_sigframe();
> __update_pkru_in_sigframe();
> save_xstate_epilog();
>
> check_xstate_in_sigframe() validates the full frame including what
> save_xstate_epilog() writes afterwards. So this clearly cannot work.
>
>> + goto out;
>
> What's wrong with 'return -EFAULT;'?
>
>> + pk = get_xsave_addr_user(buf, XFEATURE_PKRU);
>> + if (!pk || !user_write_access_begin(buf, sizeof(struct xregs_state)))
>> + goto out;
>
> Why user_write_access_begin()?
>
> 1) The access to the FPU frame on the stack has been validated
> already in copy_fpstate_to_sigframe() _before_
> copy_fpregs_to_sigframe() is invoked.
>
> 2) This does not require the nospec_barrier() as this is not a user
> controlled potentially malicious access.
>
>> + unsafe_put_user(pkru, (unsigned int __user *) pk, uaccess_end);
>
> This type case would need __force to be valid for make C=1.
>
> But that's not required at all because get_xsave_addr_user() should
> return a user pointer in the first place.
>
>> +
>> + err = 0;
>> +uaccess_end:
>> + user_access_end();
>> +out:
>> + return err;
>
> So none of the above voodoo is required at all.
>
> return __put_user(pkru, get_xsave_addr_user(buf, XFEATURE_PKRU));
>
> Is all what's needed, no?
>
>> +/*
>> + * Given an xstate feature nr, calculate where in the xsave
>> + * buffer the state is. The xsave buffer should be in standard
>> + * format, not compacted (e.g. user mode signal frames).
>> + */
>> +void *get_xsave_addr_user(struct xregs_state __user *xsave, int xfeature_nr)
>
> void __user *
>
>> +{
>> + if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
>> + return NULL;
>> +
>> + return (void *)xsave + xstate_offsets[xfeature_nr];
>
> return (void __user *)....
>
> Thanks,
>
> tglx
> On May 7, 2024, at 9:47 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
>
>> If the alternate signal stack is protected by a different pkey than the
>> current execution stack, copying xsave data to the altsigstack will fail
>> if its pkey is not enabled. This commit enables all pkeys before
>> xsave,
>
> This commit (patch) ....
>
> Also this lacks any justification why this enables all pkeys and how
> that is the right thing to do instead of using init_pkru_value which
> is what is set by fpu__clear_user_states() before going back to user
> space. For signal handling this can be the only valid PKEY state unless
> I'm missing something here.
If the alt sig stack is protected by a different pkey (other than pkey 0), then
this flow would need to enable that, along with the pkey for the thread’s
stack. Since the code has no way of knowing what pkey the altstack needs,
it enables all for this brief window.
>
>> static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf,
>> u32 pkru)
>> {
>> - if (use_xsave())
>> - return xsave_to_user_sigframe(buf);
>> + int err = 0;
>> +
>> + if (use_xsave()) {
>> + err = xsave_to_user_sigframe(buf);
>> + if (!err && cpu_feature_enabled(X86_FEATURE_OSPKE))
>
> The CPU feature check really wants to be in update_pkru_in_sigframe()
>
>> @@ -278,6 +278,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
>> if (stepping)
>> user_disable_single_step(current);
>>
>> + pkru = sig_prepare_pkru();
>
> pkru is defined in the first patch:
>
>> + u32 pkru = read_pkru();
>
> Why do we need a read and then another read in sig_prepare_pkru()?
sig_prepare_pkru() is where the pkru value is updated to make sure the alt stack is
accessible for XSAVE, while capturing the user-defined orig_pkru so that it can be
correctly restored later. Two reads are obviously redundant - I will remove the first one
(introduced in patch 1) here.
Thanks,
Aruna
> On May 7, 2024, at 9:56 AM, Thomas Gleixner <[email protected]> wrote:
>
> On Tue, May 07 2024 at 14:05, Thomas Gleixner wrote:
>> On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
>>> especially with pkey 0 disabled.
>>>
>>> Signed-off-by: Keith Lucas <[email protected]>
>>> Signed-off-by: Aruna Ramakrishna <[email protected]>
>
> Forgot to mention that this Signed-off-by chain is invalid.
Apologies for what is probably an obvious question.
Keith Lucas developed the testcases in the new file pkey_sighandler_tests.c.
I added them to tools/testing/selftests/mm (i.e. just the changes
to Makefile, protection_keys.c and some comments etc.).
So should I change this to:
Co-developed-by: Keith Lucas <[email protected]>
Signed-off-by: Keith Lucas <[email protected]>
Signed-off-by: Aruna Ramakrishna <[email protected]>
Thanks,
Aruna
On Tue, May 07 2024 at 17:34, Aruna Ramakrishna wrote:
>> On May 7, 2024, at 9:47 AM, Thomas Gleixner <[email protected]> wrote:
>>
>> Also this lacks any justification why this enables all pkeys and how
>> that is the right thing to do instead of using init_pkru_value which
>> is what is set by fpu__clear_user_states() before going back to user
>> space. For signal handling this can be the only valid PKEY state unless
>> I'm missing something here.
>
> If the alt sig stack is protected by a different pkey (other than pkey 0), then
> this flow would need to enable that, along with the pkey for the thread’s
> stack. Since the code has no way of knowing what pkey the altstack needs,
> it enables all for this brief window.
Again. The flow here is:
handle_signal()
enable_access_to_altstack()
....
fpu__clear_user_states()
restore_fpregs_from_init_fpstate(XFEATURE_MASK_USER_RESTORE)
os_xrstor(&init_fpstate, features_mask)
pkru_write_default()
write_pkru(init_pkru_value); <- Loads the default PKRU value
return_to_user_space()
User space resumes with the default PKRU value and the first thing user
space does when entering the signal handler is to push stuff on the
signal stack.
If the signal stack is protected by a key which is not contained in
init_pkru_value then the application segfaults in a non recoverable way,
no?
So arguably it is sufficient to ensure that PKRU has the keys in
init_pkru_value enabled:
sigpkru = read_pkru() & init_pkru_value;
If user space protects the task stack or the sigalt stack with a key
which is not in init_pkru_value then it does not matter at all whether
it dies in handle_signal() or later when returning to user space, no?
I'm not fundamentaly opposed to enable all keys, but I don't buy this
without a proper explanation why this has been chosen over enabling only
the absolute minimum access rights.
Thanks,
tglx
On Tue, May 07 2024 at 18:04, Aruna Ramakrishna wrote:
>> On May 7, 2024, at 9:56 AM, Thomas Gleixner <[email protected]> wrote:
>>
>> On Tue, May 07 2024 at 14:05, Thomas Gleixner wrote:
>>> On Thu, Apr 25 2024 at 18:05, Aruna Ramakrishna wrote:
>>>> especially with pkey 0 disabled.
>>>>
>>>> Signed-off-by: Keith Lucas <[email protected]>
>>>> Signed-off-by: Aruna Ramakrishna <[email protected]>
>>
>> Forgot to mention that this Signed-off-by chain is invalid.
>
> Apologies for what is probably an obvious question.
>
> Keith Lucas developed the testcases in the new file pkey_sighandler_testsc.
>
> I added them to tools/testing/selftests/mm (i.e. just the changes
> to Makefile, protection_keys.c and some comments etc.).
>
> So should I change this to:
>
> Co-developed-by: Keith Lucas <[email protected]>
> Signed-off-by: Keith Lucas <[email protected]>
> Signed-off-by: Aruna Ramakrishna <[email protected]>
If Keith developed them, then you probably want to credit him as author.
[ aruna: Adopted to upstream ]
Signed-off-by: Keith Lucas <[email protected]>
Signed-off-by: Aruna Ramakrishna <[email protected]>
Or split out the pkey helper changes into a separate patch which is
authored by you.
Either way works just fine.
Thanks,
tglx