2024-06-06 22:40:59

by Aruna Ramakrishna

[permalink] [raw]
Subject: [PATCH v5 0/5] x86/pkeys: update PKRU to enable pkey 0 before XSAVE

v5 updates:
- No major changes, mostly a resend of v4 - except for updating the
commit description for patch 5/5

v4 updates (based on review feedback from Thomas Gleixner):
- Simplified update_pkru_in_sigframe()
- Changed sigpkru to enable minimally required keys (init_pkru and
current
pkru)
- Modified pkey_sighandler_tests.c to use kselfttest framework
- Fixed commit descriptions
- Fixed sigreturn use case (pointed out by Jeff Xu)
- Added a new sigreturn test case

v3 updates (based on review feedback from Ingo Molnar and Dave Hansen):
- Split the original patch into 3:
- function interface changes
- helper functions
- functional change to write pkru on sigframe
- Enabled all pkeys before XSAVE - i.e. wrpkru(0), rather than assuming
that the alt sig stack is always protected by pkey 0.
- Added a few test cases in pkey_sighandler_tests.c.

I had some trouble adding these tests to
tools/testing/selftests/mm/protection_keys.c, so they're in a separate
file.

Aruna Ramakrishna (4):
x86/pkeys: Add PKRU as a parameter in signal handling functions
x86/pkeys: Add helper functions to update PKRU on sigframe
x86/pkeys: Update PKRU to enable minimally required pkeys before XSAVE
x86/pkeys: Restore altstack before sigcontext

Keith Lucas (1):
selftests/mm: Add new testcases for pkeys

arch/x86/include/asm/fpu/signal.h | 2 +-
arch/x86/include/asm/sighandling.h | 10 +-
arch/x86/kernel/fpu/signal.c | 27 +-
arch/x86/kernel/fpu/xstate.c | 13 +
arch/x86/kernel/fpu/xstate.h | 1 +
arch/x86/kernel/signal.c | 42 +-
arch/x86/kernel/signal_32.c | 12 +-
arch/x86/kernel/signal_64.c | 14 +-
tools/testing/selftests/mm/Makefile | 5 +-
tools/testing/selftests/mm/pkey-helpers.h | 11 +-
.../selftests/mm/pkey_sighandler_tests.c | 479 ++++++++++++++++++
tools/testing/selftests/mm/protection_keys.c | 10 -
12 files changed, 581 insertions(+), 45 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_sighandler_tests.c


base-commit: a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
prerequisite-patch-id: d84439301b44c03df2555d3722ec512001ae52f2
--
2.39.3



2024-06-06 22:41:03

by Aruna Ramakrishna

[permalink] [raw]
Subject: [PATCH v5 2/5] x86/pkeys: Add helper functions to update PKRU on sigframe

In the case where a user thread sets up an alternate signal stack
protected by the default pkey (i.e. pkey 0), while the thread's stack
is protected by a non-zero pkey, both these pkeys have to be enabled in
the PKRU register for the signal to be delivered to the application
correctly. However, the PKRU value restored after handling the signal
must not enable this extra pkey (i.e. pkey 0), so the PKRU value on the
on the sigframe should be overwritten with the user-defined value.

Add helper functions that will update PKRU value on the sigframe after
XSAVE. These functions will be called in a later patch; this patch does not
change any behavior as yet.

Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 11 +++++++++++
arch/x86/kernel/fpu/xstate.c | 13 +++++++++++++
arch/x86/kernel/fpu/xstate.h | 1 +
arch/x86/kernel/signal.c | 15 +++++++++++++++
4 files changed, 40 insertions(+)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 2b3b9e140dd4..b0b254b931fd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -63,6 +63,16 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
return true;
}

+/*
+ * Update the value of PKRU register that was already pushed onto the signal frame.
+ */
+static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
+{
+ if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE)))
+ return 0;
+ return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU));
+}
+
/*
* Signal frame handlers.
*/
@@ -160,6 +170,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pk
{
if (use_xsave())
return xsave_to_user_sigframe(buf);
+
if (use_fxsr())
return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
else
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 33a214b1a4ce..e257478a0962 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -992,6 +992,19 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
return __raw_xsave_addr(xsave, xfeature_nr);
}

+/*
+ * Given an xstate feature nr, calculate where in the xsave buffer the state is.
+ * The xsave buffer should be in standard format, not compacted (e.g. user mode
+ * signal frames).
+ */
+void __user *get_xsave_addr_user(struct xregs_state __user *xsave, int xfeature_nr)
+{
+ if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
+ return NULL;
+
+ return (void __user *)xsave + xstate_offsets[xfeature_nr];
+}
+
#ifdef CONFIG_ARCH_HAS_PKEYS

/*
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 19ca623ffa2a..236742db69fa 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -55,6 +55,7 @@ extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system_xstate(unsigned int legacy_size);

extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
+extern void __user *get_xsave_addr_user(struct xregs_state *xsave, int xfeature_nr);

static inline u64 xfeatures_mask_supervisor(void)
{
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 94b894437327..3fa66b2fe753 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -224,6 +224,21 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
}
}

+/*
+ * Enable init_pkru pkey as well as the user-defined pkey to ensure that both
+ * the current stack and the alternate signal stack are writeable.
+ * Note: this function assumes that the alternate signal stack is accessible
+ * with the init_pkru_value. If the sigaltstack is protected by a different,
+ * non-zero pkey, then the application will segfault.
+ */
+static inline u32 sig_prepare_pkru(void)
+{
+ u32 orig_pkru = read_pkru();
+
+ write_pkru(orig_pkru & pkru_get_init_value());
+ return orig_pkru;
+}
+
static void
handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
--
2.39.3


2024-06-06 22:41:27

by Aruna Ramakrishna

[permalink] [raw]
Subject: [PATCH v5 1/5] x86/pkeys: Add PKRU as a parameter in signal handling functions

Problem description:
Let's assume there's a multithreaded application that runs untrusted
user code. Each thread has its stack/code protected by a non-zero pkey,
and the PKRU register is set up such that only that particular non-zero
pkey is enabled. Each thread also sets up an alternate signal stack to
handle signals, which is protected by pkey zero. The pkeys man page
documents that the PKRU will be reset to init_pkru when the signal
handler is invoked, which means that pkey zero access will be enabled.
But this reset happens after the kernel attempts to push fpu state
to the alternate stack, which is not (yet) accessible by the kernel,
which leads to a new SIGSEGV being sent to the application, terminating
it.

Enabling both the non-zero pkey (for the thread) and pkey zero in
userspace will not work for this use case. We cannot have the alt stack
writeable by all - the rationale here is that the code running in that
thread (using a non-zero pkey) is untrusted and should not have access
to the alternate signal stack (that uses pkey zero), to prevent the
return address of a function from being changed. The expectation is that
kernel should be able to set up the alternate signal stack and deliver
the signal to the application even if pkey zero is explicitly disabled
by the application. The signal handler accessibility should not be dictated
by whatever PKRU value the thread sets up.

Solution:
The PKRU register is managed by XSAVE, which means the sigframe contents
must match the register contents - which is not the case here. We want
the sigframe to contain the user-defined PKRU value (so that it is
restored correctly from sigcontext) but the actual register must be
reset to init_pkru so that the alt stack is accessible and the signal
can be delivered to the application. It seems that the proper fix here
would be to remove PKRU from the XSAVE framework and manage it
separately, which is quite complicated. As a workaround, do this:

orig_pkru = rdpkru();
wrpkru(orig_pkru & init_pkru_value);
xsave_to_user_sigframe();
put_user(pkru_sigframe_addr, orig_pkru)

This change is split over multiple patches.

In preparation for writing PKRU to sigframe in a later patch, pass in PKRU as
an additional parameter down the chain from handle_signal:
setup_rt_frame()
xxx_setup_rt_frame()
get_sigframe()
copy_fpstate_to_sigframe()
copy_fpregs_to_sigframe()

There are no functional changes in this patch.

Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/include/asm/fpu/signal.h | 2 +-
arch/x86/include/asm/sighandling.h | 10 +++++-----
arch/x86/kernel/fpu/signal.c | 6 +++---
arch/x86/kernel/signal.c | 19 ++++++++++---------
arch/x86/kernel/signal_32.c | 8 ++++----
arch/x86/kernel/signal_64.c | 8 ++++----
6 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 611fa41711af..eccc75bc9c4f 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -29,7 +29,7 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,

unsigned long fpu__get_fpstate_size(void);

-extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
+extern bool copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size, u32 pkru);
extern void fpu__clear_user_states(struct fpu *fpu);
extern bool fpu__restore_sig(void __user *buf, int ia32_frame);

diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index e770c4fc47f4..de458354a3ea 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -17,11 +17,11 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where);

void __user *
get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
- void __user **fpstate);
+ void __user **fpstate, u32 pkru);

-int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs);
-int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
-int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
-int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
+int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
+int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
+int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
+int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);

#endif /* _ASM_X86_SIGHANDLING_H */
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 247f2225aa9f..2b3b9e140dd4 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -156,7 +156,7 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
return !err;
}

-static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
+static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
{
if (use_xsave())
return xsave_to_user_sigframe(buf);
@@ -185,7 +185,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
* For [f]xsave state, update the SW reserved fields in the [f]xsave frame
* indicating the absence/presence of the extended state to the user.
*/
-bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
+bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size, u32 pkru)
{
struct task_struct *tsk = current;
struct fpstate *fpstate = tsk->thread.fpu.fpstate;
@@ -228,7 +228,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
fpregs_restore_userregs();

pagefault_disable();
- ret = copy_fpregs_to_sigframe(buf_fx);
+ ret = copy_fpregs_to_sigframe(buf_fx, pkru);
pagefault_enable();
fpregs_unlock();

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 31b6f5dddfc2..94b894437327 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -74,7 +74,7 @@ static inline int is_x32_frame(struct ksignal *ksig)
*/
void __user *
get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
- void __user **fpstate)
+ void __user **fpstate, u32 pkru)
{
struct k_sigaction *ka = &ksig->ka;
int ia32_frame = is_ia32_frame(ksig);
@@ -139,7 +139,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
}

/* save i387 and extended state */
- if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size))
+ if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
return (void __user *)-1L;

return (void __user *)sp;
@@ -206,7 +206,7 @@ unsigned long get_sigframe_size(void)
}

static int
-setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
/* Perform fixup for the pre-signal frame. */
rseq_signal_deliver(ksig, regs);
@@ -214,21 +214,22 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
/* Set up the stack frame */
if (is_ia32_frame(ksig)) {
if (ksig->ka.sa.sa_flags & SA_SIGINFO)
- return ia32_setup_rt_frame(ksig, regs);
+ return ia32_setup_rt_frame(ksig, regs, pkru);
else
- return ia32_setup_frame(ksig, regs);
+ return ia32_setup_frame(ksig, regs, pkru);
} else if (is_x32_frame(ksig)) {
- return x32_setup_rt_frame(ksig, regs);
+ return x32_setup_rt_frame(ksig, regs, pkru);
} else {
- return x64_setup_rt_frame(ksig, regs);
+ return x64_setup_rt_frame(ksig, regs, pkru);
}
}

static void
handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- bool stepping, failed;
struct fpu *fpu = &current->thread.fpu;
+ u32 pkru = read_pkru();
+ bool stepping, failed;

if (v8086_mode(regs))
save_v86_state((struct kernel_vm86_regs *) regs, VM86_SIGNAL);
@@ -264,7 +265,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
if (stepping)
user_disable_single_step(current);

- failed = (setup_rt_frame(ksig, regs) < 0);
+ failed = (setup_rt_frame(ksig, regs, pkru) < 0);
if (!failed) {
/*
* Clear the direction flag as per the ABI for function entry.
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index c12624bc82a3..68f2bfd7d6e7 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -228,7 +228,7 @@ do { \
goto label; \
} while(0)

-int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs)
+int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
sigset32_t *set = (sigset32_t *) sigmask_to_save();
struct sigframe_ia32 __user *frame;
@@ -246,7 +246,7 @@ int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs)
0x80cd, /* int $0x80 */
};

- frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru);

if (ksig->ka.sa.sa_flags & SA_RESTORER) {
restorer = ksig->ka.sa.sa_restorer;
@@ -299,7 +299,7 @@ int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs)
return -EFAULT;
}

-int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
sigset32_t *set = (sigset32_t *) sigmask_to_save();
struct rt_sigframe_ia32 __user *frame;
@@ -319,7 +319,7 @@ int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
0,
};

- frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru);

if (!user_access_begin(frame, sizeof(*frame)))
return -EFAULT;
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index 23d8aaf8d9fd..6b189de005b5 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -161,7 +161,7 @@ static unsigned long frame_uc_flags(struct pt_regs *regs)
return flags;
}

-int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
sigset_t *set = sigmask_to_save();
struct rt_sigframe __user *frame;
@@ -172,7 +172,7 @@ int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
if (!(ksig->ka.sa.sa_flags & SA_RESTORER))
return -EFAULT;

- frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp, pkru);
uc_flags = frame_uc_flags(regs);

if (!user_access_begin(frame, sizeof(*frame)))
@@ -300,7 +300,7 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *to,
return __copy_siginfo_to_user32(to, from);
}

-int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
+int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
{
compat_sigset_t *set = (compat_sigset_t *) sigmask_to_save();
struct rt_sigframe_x32 __user *frame;
@@ -311,7 +311,7 @@ int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
if (!(ksig->ka.sa.sa_flags & SA_RESTORER))
return -EFAULT;

- frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
+ frame = get_sigframe(ksig, regs, sizeof(*frame), &fp, pkru);

uc_flags = frame_uc_flags(regs);

--
2.39.3


2024-06-06 22:41:37

by Aruna Ramakrishna

[permalink] [raw]
Subject: [PATCH v5 4/5] x86/pkeys: Restore altstack before sigcontext

A process can disable access to the alternate signal stack and still
expect signals to be delivered correctly. handle_signal() updates the
PKRU value to enable access to the atlstack, and makes sure that the
value on the sigframe is the user-defined PKRU value so that it is
correctly restored. However, in sigreturn(), restore_altstack() needs
read access to the alt stack. But the PKRU is already restored from the
sigframe (in restore_sigcontext()) which will disable access to the
alt stack, resulting in a SIGSEGV.

Fix this by restoring altstack before restoring PKRU.

Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/kernel/signal_32.c | 4 ++--
arch/x86/kernel/signal_64.c | 6 +++---
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index 68f2bfd7d6e7..c7a489f0d943 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -160,10 +160,10 @@ SYSCALL32_DEFINE0(rt_sigreturn)

set_current_blocked(&set);

- if (!ia32_restore_sigcontext(regs, &frame->uc.uc_mcontext))
+ if (restore_altstack32(&frame->uc.uc_stack))
goto badframe;

- if (restore_altstack32(&frame->uc.uc_stack))
+ if (!ia32_restore_sigcontext(regs, &frame->uc.uc_mcontext))
goto badframe;

return regs->ax;
diff --git a/arch/x86/kernel/signal_64.c b/arch/x86/kernel/signal_64.c
index 6b189de005b5..2d053333fcf5 100644
--- a/arch/x86/kernel/signal_64.c
+++ b/arch/x86/kernel/signal_64.c
@@ -260,13 +260,13 @@ SYSCALL_DEFINE0(rt_sigreturn)

set_current_blocked(&set);

- if (!restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
+ if (restore_altstack(&frame->uc.uc_stack))
goto badframe;

- if (restore_signal_shadow_stack())
+ if (!restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
goto badframe;

- if (restore_altstack(&frame->uc.uc_stack))
+ if (restore_signal_shadow_stack())
goto badframe;

return regs->ax;
--
2.39.3


2024-06-06 22:48:19

by Aruna Ramakrishna

[permalink] [raw]
Subject: [PATCH v5 3/5] x86/pkeys: Update PKRU to enable minimally required pkeys before XSAVE

If the alternate signal stack is protected by a different pkey than the
current execution stack, copying xsave data to the sigaltstack will fail
if its pkey is not enabled. Enable the extra pkey needed, before xsave,
so that the signal handler accessibility is not dictated by the PKRU
value that the thread sets up. But this updated PKRU value is also
pushed onto the sigframe, so overwrite that with the original, user-defined
PKRU value so that the value restored from sigcontext does not have the extra
pkey enabled.

Signed-off-by: Aruna Ramakrishna <[email protected]>
---
arch/x86/kernel/fpu/signal.c | 10 ++++++++--
arch/x86/kernel/signal.c | 10 +++++++++-
2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index b0b254b931fd..1065ab995305 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -168,8 +168,14 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,

static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
{
- if (use_xsave())
- return xsave_to_user_sigframe(buf);
+ int err = 0;
+
+ if (use_xsave()) {
+ err = xsave_to_user_sigframe(buf);
+ if (!err)
+ err = update_pkru_in_sigframe(buf, pkru);
+ return err;
+ }

if (use_fxsr())
return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 3fa66b2fe753..659faf076b48 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -243,8 +243,8 @@ static void
handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
struct fpu *fpu = &current->thread.fpu;
- u32 pkru = read_pkru();
bool stepping, failed;
+ u32 pkru;

if (v8086_mode(regs))
save_v86_state((struct kernel_vm86_regs *) regs, VM86_SIGNAL);
@@ -280,6 +280,8 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
if (stepping)
user_disable_single_step(current);

+ /* Update PKRU to enable access to the alternate signal stack. */
+ pkru = sig_prepare_pkru();
failed = (setup_rt_frame(ksig, regs, pkru) < 0);
if (!failed) {
/*
@@ -297,6 +299,12 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
* Ensure the signal handler starts with the new fpu state.
*/
fpu__clear_user_states(fpu);
+ } else {
+ /*
+ * Restore PKRU to the original, user-defined value; disable
+ * extra pkeys enabled for the alternate signal stack, if any.
+ */
+ write_pkru(pkru);
}
signal_setup_done(failed, ksig, stepping);
}
--
2.39.3


2024-06-06 22:48:30

by Aruna Ramakrishna

[permalink] [raw]
Subject: [PATCH v5 5/5] selftests/mm: Add new testcases for pkeys

From: Keith Lucas <[email protected]>

Add a few new tests to exercise the signal handler flow,
especially with pkey 0 disabled.

There are 5 new tests added:
- test_sigsegv_handler_with_pkey0_disabled
- test_sigsegv_handler_cannot_access_stack
- test_sigsegv_handler_with_different_pkey_for_stack
- test_pkru_preserved_after_sigusr1
- test_pkru_sigreturn

[ Aruna: Adapted to upstream ]

Signed-off-by: Keith Lucas <[email protected]>
Signed-off-by: Aruna Ramakrishna <[email protected]>
---
tools/testing/selftests/mm/Makefile | 5 +-
tools/testing/selftests/mm/pkey-helpers.h | 11 +-
.../selftests/mm/pkey_sighandler_tests.c | 479 ++++++++++++++++++
tools/testing/selftests/mm/protection_keys.c | 10 -
4 files changed, 493 insertions(+), 12 deletions(-)
create mode 100644 tools/testing/selftests/mm/pkey_sighandler_tests.c

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 410495e0a611..1bb95960d28b 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -2,6 +2,7 @@
# Makefile for mm selftests

LOCAL_HDRS += $(selfdir)/mm/local_config.h $(top_srcdir)/mm/gup_test.h
+LINUX_TOOL_INCLUDE = $(top_srcdir)/tools/include

include local_config.mk

@@ -32,7 +33,7 @@ endif
# LDLIBS.
MAKEFLAGS += --no-builtin-rules

-CFLAGS = -Wall -I $(top_srcdir) $(EXTRA_CFLAGS) $(KHDR_INCLUDES)
+CFLAGS = -Wall -I $(top_srcdir) $(EXTRA_CFLAGS) -I$(LINUX_TOOL_INCLUDE) $(KHDR_INCLUDES)
LDLIBS = -lrt -lpthread -lm

TEST_GEN_FILES = cow
@@ -82,6 +83,7 @@ CAN_BUILD_X86_64 := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_64bit_pr
CAN_BUILD_WITH_NOPIE := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_program.c -no-pie)

VMTARGETS := protection_keys
+VMTARGETS := pkey_sighandler_tests
BINARIES_32 := $(VMTARGETS:%=%_32)
BINARIES_64 := $(VMTARGETS:%=%_64)

@@ -100,6 +102,7 @@ else

ifneq (,$(findstring $(ARCH),powerpc))
TEST_GEN_FILES += protection_keys
+TEST_GEN_FILES += pkey_sighandler_tests
endif

endif
diff --git a/tools/testing/selftests/mm/pkey-helpers.h b/tools/testing/selftests/mm/pkey-helpers.h
index 1af3156a9db8..2b1189c27167 100644
--- a/tools/testing/selftests/mm/pkey-helpers.h
+++ b/tools/testing/selftests/mm/pkey-helpers.h
@@ -12,6 +12,7 @@
#include <stdlib.h>
#include <ucontext.h>
#include <sys/mman.h>
+#include <linux/compiler.h>

#include "../kselftest.h"

@@ -79,7 +80,15 @@ extern void abort_hooks(void);
} \
} while (0)

-__attribute__((noinline)) int read_ptr(int *ptr);
+noinline int read_ptr(int *ptr)
+{
+ /*
+ * Keep GCC from optimizing this away somehow
+ */
+ barrier();
+ return *ptr;
+}
+
void expected_pkey_fault(int pkey);
int sys_pkey_alloc(unsigned long flags, unsigned long init_val);
int sys_pkey_free(unsigned long pkey);
diff --git a/tools/testing/selftests/mm/pkey_sighandler_tests.c b/tools/testing/selftests/mm/pkey_sighandler_tests.c
new file mode 100644
index 000000000000..c43030c7056d
--- /dev/null
+++ b/tools/testing/selftests/mm/pkey_sighandler_tests.c
@@ -0,0 +1,479 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Tests Memory Protection Keys (see Documentation/core-api/protection-keys.rst)
+ *
+ * The testcases in this file exercise various flows related to signal handling,
+ * using an alternate signal stack, with the default pkey (pkey 0) disabled.
+ *
+ * Compile with:
+ * gcc -mxsave -o pkey_sighandler_tests -O2 -g -std=gnu99 -pthread -Wall pkey_sighandler_tests.c -I../../../../tools/include -lrt -ldl -lm
+ * gcc -mxsave -m32 -o pkey_sighandler_tests -O2 -g -std=gnu99 -pthread -Wall pkey_sighandler_tests.c -I../../../../tools/include -lrt -ldl -lm
+ */
+#define _GNU_SOURCE
+#define __SANE_USERSPACE_TYPES__
+#include <errno.h>
+#include <sys/syscall.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <signal.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <limits.h>
+
+#include "pkey-helpers.h"
+
+#define STACK_SIZE PTHREAD_STACK_MIN
+
+void expected_pkey_fault(int pkey) {}
+
+pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
+pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
+siginfo_t siginfo = {0};
+
+/*
+ * We need to use inline assembly instead of glibc's syscall because glibc's
+ * syscall will attempt to access the PLT in order to call a library function
+ * which is protected by MPK 0 which we don't have access to.
+ */
+static inline __always_inline
+long syscall_raw(long n, long a1, long a2, long a3, long a4, long a5, long a6)
+{
+ unsigned long ret;
+#ifdef __x86_64__
+ register long r10 asm("r10") = a4;
+ register long r8 asm("r8") = a5;
+ register long r9 asm("r9") = a6;
+ asm volatile ("syscall"
+ : "=a"(ret)
+ : "a"(n), "D"(a1), "S"(a2), "d"(a3), "r"(r10), "r"(r8), "r"(r9)
+ : "rcx", "r11", "memory");
+#elif defined __i386__
+ asm volatile ("int $0x80"
+ : "=a"(ret)
+ : "a"(n), "b"(a1), "c"(a2), "d"(a3), "S"(a4), "D"(a5)
+ : "memory");
+#endif
+ return ret;
+}
+
+static void sigsegv_handler(int signo, siginfo_t *info, void *ucontext)
+{
+ pthread_mutex_lock(&mutex);
+
+ memcpy(&siginfo, info, sizeof(siginfo_t));
+
+ pthread_cond_signal(&cond);
+ pthread_mutex_unlock(&mutex);
+
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+}
+
+static void sigusr1_handler(int signo, siginfo_t *info, void *ucontext)
+{
+ pthread_mutex_lock(&mutex);
+
+ memcpy(&siginfo, info, sizeof(siginfo_t));
+
+ pthread_cond_signal(&cond);
+ pthread_mutex_unlock(&mutex);
+}
+
+static void sigusr2_handler(int signo, siginfo_t *info, void *ucontext)
+{
+ /*
+ * pkru should be the init_pkru value which enabled MPK 0 so
+ * we can use library functions.
+ */
+ printf("%s invoked.\n", __func__);
+}
+
+static void raise_sigusr2(void)
+{
+ pid_t tid = 0;
+
+ tid = syscall_raw(SYS_gettid, 0, 0, 0, 0, 0, 0);
+
+ syscall_raw(SYS_tkill, tid, SIGUSR2, 0, 0, 0, 0);
+
+ /*
+ * We should return from the signal handler here and be able to
+ * return to the interrupted thread.
+ */
+}
+
+static void *thread_segv_with_pkey0_disabled(void *ptr)
+{
+ /* Disable MPK 0 (and all others too) */
+ __write_pkey_reg(0x55555555);
+
+ /* Segfault (with SEGV_MAPERR) */
+ *(int *) (0x1) = 1;
+ return NULL;
+}
+
+static void *thread_segv_pkuerr_stack(void *ptr)
+{
+ /* Disable MPK 0 (and all others too) */
+ __write_pkey_reg(0x55555555);
+
+ /* After we disable MPK 0, we can't access the stack to return */
+ return NULL;
+}
+
+static void *thread_segv_maperr_ptr(void *ptr)
+{
+ stack_t *stack = ptr;
+ int *bad = (int *)1;
+
+ /*
+ * Setup alternate signal stack, which should be pkey_mprotect()ed by
+ * MPK 0. The thread's stack cannot be used for signals because it is
+ * not accessible by the default init_pkru value of 0x55555554.
+ */
+ syscall_raw(SYS_sigaltstack, (long)stack, 0, 0, 0, 0, 0);
+
+ /* Disable MPK 0. Only MPK 1 is enabled. */
+ __write_pkey_reg(0x55555551);
+
+ /* Segfault */
+ *bad = 1;
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+ return NULL;
+}
+
+/*
+ * Verify that the sigsegv handler is invoked when pkey 0 is disabled.
+ * Note that the new thread stack and the alternate signal stack is
+ * protected by MPK 0.
+ */
+static void test_sigsegv_handler_with_pkey0_disabled(void)
+{
+ struct sigaction sa;
+ pthread_attr_t attr;
+ pthread_t thr;
+
+ sa.sa_flags = SA_SIGINFO;
+
+ sa.sa_sigaction = sigsegv_handler;
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ pthread_attr_init(&attr);
+ pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
+
+ pthread_create(&thr, &attr, thread_segv_with_pkey0_disabled, NULL);
+
+ pthread_mutex_lock(&mutex);
+ while (siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ ksft_test_result(siginfo.si_signo == SIGSEGV &&
+ siginfo.si_code == SEGV_MAPERR &&
+ siginfo.si_addr == (void *)1,
+ "%s\n", __func__);
+}
+
+/*
+ * Verify that the sigsegv handler is invoked when pkey 0 is disabled.
+ * Note that the new thread stack and the alternate signal stack is
+ * protected by MPK 0, which renders them inaccessible when MPK 0
+ * is disabled. So just the return from the thread should cause a
+ * segfault with SEGV_PKUERR.
+ */
+static void test_sigsegv_handler_cannot_access_stack(void)
+{
+ struct sigaction sa;
+ pthread_attr_t attr;
+ pthread_t thr;
+
+ sa.sa_flags = SA_SIGINFO;
+
+ sa.sa_sigaction = sigsegv_handler;
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ pthread_attr_init(&attr);
+ pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
+
+ pthread_create(&thr, &attr, thread_segv_pkuerr_stack, NULL);
+
+ pthread_mutex_lock(&mutex);
+ while (siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ ksft_test_result(siginfo.si_signo == SIGSEGV &&
+ siginfo.si_code == SEGV_PKUERR,
+ "%s\n", __func__);
+}
+
+/*
+ * Verify that the sigsegv handler that uses an alternate signal stack
+ * is correctly invoked for a thread which uses a non-zero MPK to protect
+ * its own stack, and disables all other MPKs (including 0).
+ */
+static void test_sigsegv_handler_with_different_pkey_for_stack(void)
+{
+ struct sigaction sa;
+ static stack_t sigstack;
+ void *stack;
+ int pkey;
+ int parent_pid = 0;
+ int child_pid = 0;
+
+ sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
+
+ sa.sa_sigaction = sigsegv_handler;
+
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ stack = mmap(0, STACK_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+ assert(stack != MAP_FAILED);
+
+ /* Allow access to MPK 0 and MPK 1 */
+ __write_pkey_reg(0x55555550);
+
+ /* Protect the new stack with MPK 1 */
+ pkey = pkey_alloc(0, 0);
+ pkey_mprotect(stack, STACK_SIZE, PROT_READ | PROT_WRITE, pkey);
+
+ /* Set up alternate signal stack that will use the default MPK */
+ sigstack.ss_sp = mmap(0, STACK_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ sigstack.ss_flags = 0;
+ sigstack.ss_size = STACK_SIZE;
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ /* Use clone to avoid newer glibcs using rseq on new threads */
+ long ret = syscall_raw(SYS_clone,
+ CLONE_VM | CLONE_FS | CLONE_FILES |
+ CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
+ CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
+ CLONE_DETACHED,
+ (long) ((char *)(stack) + STACK_SIZE),
+ (long) &parent_pid,
+ (long) &child_pid, 0, 0);
+
+ if (ret < 0) {
+ errno = -ret;
+ perror("clone");
+ } else if (ret == 0) {
+ thread_segv_maperr_ptr(&sigstack);
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+ }
+
+ pthread_mutex_lock(&mutex);
+ while (siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ ksft_test_result(siginfo.si_signo == SIGSEGV &&
+ siginfo.si_code == SEGV_MAPERR &&
+ siginfo.si_addr == (void *)1,
+ "%s\n", __func__);
+}
+
+/*
+ * Verify that the PKRU value set by the application is correctly
+ * restored upon return from signal handling.
+ */
+static void test_pkru_preserved_after_sigusr1(void)
+{
+ struct sigaction sa;
+ unsigned long pkru = 0x45454544;
+
+ sa.sa_flags = SA_SIGINFO;
+
+ sa.sa_sigaction = sigusr1_handler;
+ sigemptyset(&sa.sa_mask);
+ if (sigaction(SIGUSR1, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(&siginfo, 0, sizeof(siginfo));
+
+ __write_pkey_reg(pkru);
+
+ raise(SIGUSR1);
+
+ pthread_mutex_lock(&mutex);
+ while (siginfo.si_signo == 0)
+ pthread_cond_wait(&cond, &mutex);
+ pthread_mutex_unlock(&mutex);
+
+ /* Ensure the pkru value is the same after returning from signal. */
+ ksft_test_result(pkru == __read_pkey_reg() &&
+ siginfo.si_signo == SIGUSR1,
+ "%s\n", __func__);
+}
+
+static noinline void *thread_sigusr2_self(void *ptr)
+{
+ /*
+ * A const char array like "Resuming after SIGUSR2" won't be stored on
+ * the stack and the code could access it via an offset from the program
+ * counter. This makes sure it's on the function's stack frame.
+ */
+ char str[] = {'R', 'e', 's', 'u', 'm', 'i', 'n', 'g', ' ',
+ 'a', 'f', 't', 'e', 'r', ' ',
+ 'S', 'I', 'G', 'U', 'S', 'R', '2',
+ '.', '.', '.', '\n', '\0'};
+ stack_t *stack = ptr;
+
+ /*
+ * Setup alternate signal stack, which should be pkey_mprotect()ed by
+ * MPK 0. The thread's stack cannot be used for signals because it is
+ * not accessible by the default init_pkru value of 0x55555554.
+ */
+ syscall(SYS_sigaltstack, (long)stack, 0, 0, 0, 0, 0);
+
+ /* Disable MPK 0. Only MPK 2 is enabled. */
+ __write_pkey_reg(0x55555545);
+
+ raise_sigusr2();
+
+ /* Do something, to show the thread resumed execution after the signal */
+ syscall_raw(SYS_write, 1, (long) str, sizeof(str) - 1, 0, 0, 0);
+
+ /*
+ * We can't return to test_pkru_sigreturn because it
+ * will attempt to use a %rbp value which is on the stack
+ * of the main thread.
+ */
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+ return NULL;
+}
+
+/*
+ * Verify that sigreturn is able to restore altstack even if the thread had
+ * disabled pkey 0.
+ */
+static void test_pkru_sigreturn(void)
+{
+ struct sigaction sa = {0};
+ static stack_t sigstack;
+ void *stack;
+ int pkey;
+ int parent_pid = 0;
+ int child_pid = 0;
+
+ sa.sa_handler = SIG_DFL;
+ sa.sa_flags = 0;
+ sigemptyset(&sa.sa_mask);
+
+ /*
+ * For this testcase, we do not want to handle SIGSEGV. Reset handler
+ * to default so that the application can crash if it receives SIGSEGV.
+ */
+ if (sigaction(SIGSEGV, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
+ sa.sa_sigaction = sigusr2_handler;
+ sigemptyset(&sa.sa_mask);
+
+ if (sigaction(SIGUSR2, &sa, NULL) == -1) {
+ perror("sigaction");
+ exit(EXIT_FAILURE);
+ }
+
+ stack = mmap(0, STACK_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+ assert(stack != MAP_FAILED);
+
+ /*
+ * Allow access to MPK 0 and MPK 2. The child thread (to be created
+ * later in this flow) will have its stack protected by MPK 2, whereas
+ * the current thread's stack is protected by the default MPK 0. Hence
+ * both need to be enabled.
+ */
+ __write_pkey_reg(0x55555544);
+
+ /* Protect the stack with MPK 2 */
+ pkey = pkey_alloc(0, 0);
+ pkey_mprotect(stack, STACK_SIZE, PROT_READ | PROT_WRITE, pkey);
+
+ /* Set up alternate signal stack that will use the default MPK */
+ sigstack.ss_sp = mmap(0, STACK_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+ sigstack.ss_flags = 0;
+ sigstack.ss_size = STACK_SIZE;
+
+ /* Use clone to avoid newer glibcs using rseq on new threads */
+ long ret = syscall_raw(SYS_clone,
+ CLONE_VM | CLONE_FS | CLONE_FILES |
+ CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
+ CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
+ CLONE_DETACHED,
+ (long) ((char *)(stack) + STACK_SIZE),
+ (long) &parent_pid,
+ (long) &child_pid, 0, 0);
+
+ if (ret < 0) {
+ errno = -ret;
+ perror("clone");
+ } else if (ret == 0) {
+ thread_sigusr2_self(&sigstack);
+ syscall_raw(SYS_exit, 0, 0, 0, 0, 0, 0);
+ }
+
+ child_pid = ret;
+ /* Check that thread exited */
+ do {
+ sched_yield();
+ ret = syscall_raw(SYS_tkill, child_pid, 0, 0, 0, 0, 0);
+ } while (ret != -ESRCH && ret != -EINVAL);
+
+ ksft_test_result_pass("%s\n", __func__);
+}
+
+void (*pkey_tests[])(void) = {
+ test_sigsegv_handler_with_pkey0_disabled,
+ test_sigsegv_handler_cannot_access_stack,
+ test_sigsegv_handler_with_different_pkey_for_stack,
+ test_pkru_preserved_after_sigusr1,
+ test_pkru_sigreturn
+};
+
+int main(int argc, char *argv[])
+{
+ int i;
+
+ ksft_print_header();
+ ksft_set_plan(ARRAY_SIZE(pkey_tests));
+
+ for (i = 0; i < ARRAY_SIZE(pkey_tests); i++)
+ (*pkey_tests[i])();
+
+ ksft_finished();
+ return 0;
+}
diff --git a/tools/testing/selftests/mm/protection_keys.c b/tools/testing/selftests/mm/protection_keys.c
index 48dc151f8fca..2af344e55d37 100644
--- a/tools/testing/selftests/mm/protection_keys.c
+++ b/tools/testing/selftests/mm/protection_keys.c
@@ -950,16 +950,6 @@ void close_test_fds(void)
nr_test_fds = 0;
}

-#define barrier() __asm__ __volatile__("": : :"memory")
-__attribute__((noinline)) int read_ptr(int *ptr)
-{
- /*
- * Keep GCC from optimizing this away somehow
- */
- barrier();
- return *ptr;
-}
-
void test_pkey_alloc_free_attach_pkey0(int *ptr, u16 pkey)
{
int i, err;
--
2.39.3


2024-06-10 21:22:19

by Jeff Xu

[permalink] [raw]
Subject: Re [PATCH v5 0/5] x86/pkeys: update PKRU to enable pkey 0 before XSAVE

Hi

On Thu, Jun 06, 2024 at 10:40:30PM +0000, Aruna Ramakrishna wrote:
> v5 updates:
> - No major changes, mostly a resend of v4 - except for updating the
> commit description for patch 5/5
>
> v4 updates (based on review feedback from Thomas Gleixner):
> - Simplified update_pkru_in_sigframe()
> - Changed sigpkru to enable minimally required keys (init_pkru and
> current
> pkru)
> - Modified pkey_sighandler_tests.c to use kselfttest framework
> - Fixed commit descriptions
> - Fixed sigreturn use case (pointed out by Jeff Xu)
> - Added a new sigreturn test case
>
> v3 updates (based on review feedback from Ingo Molnar and Dave Hansen):
> - Split the original patch into 3:
> - function interface changes
> - helper functions
> - functional change to write pkru on sigframe
> - Enabled all pkeys before XSAVE - i.e. wrpkru(0), rather than assuming
> that the alt sig stack is always protected by pkey 0.
> - Added a few test cases in pkey_sighandler_tests.c.
>
> I had some trouble adding these tests to
> tools/testing/selftests/mm/protection_keys.c, so they're in a separate
> file.
>

Please add me to the future thread of this patch ! Thanks

I added a few more people to the email list. We've dealt with similar
situations in the past ([1] [2]), so it might be a good idea to
keep them updated. Also, ChromeOS and Chrome will use the functionality
provided by this patch series.

[1] https://lore.kernel.org/all/202208221331.71C50A6F@keescook/

[2] https://docs.google.com/document/d/1DjPhBq-5gRKtTeaknQDTWfvqRBCONmYqkU1I6k-3Ai8/edit?resourcekey=0-GGQta3_yhKqK7xV5SxIrVQ&tab=t.0

Thanks!
-Jeff

2024-06-10 21:31:31

by Jeff Xu

[permalink] [raw]
Subject: Re [PATCH v5 1/5] x86/pkeys: Add PKRU as a parameter in signal handling functions

On Thu, Jun 06, 2024 at 10:40:31PM +0000, Aruna Ramakrishna wrote:

> diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
> index e770c4fc47f4..de458354a3ea 100644
> --- a/arch/x86/include/asm/sighandling.h
> +++ b/arch/x86/include/asm/sighandling.h
> @@ -17,11 +17,11 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
>
> void __user *
> get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> - void __user **fpstate);
> + void __user **fpstate, u32 pkru);
>
> -int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs);
> -int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
> -int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
> -int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
> +int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> +int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> +int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> +int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);

Does ia32 support PKEY ? I thought it is X64 only.
It might be possible to not to change any of ia32 code.

>
> #endif /* _ASM_X86_SIGHANDLING_H */
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index 247f2225aa9f..2b3b9e140dd4 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -156,7 +156,7 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
> return !err;
> }
>
> -static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
> +static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
> {
> if (use_xsave())
> return xsave_to_user_sigframe(buf);
> @@ -185,7 +185,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
> * For [f]xsave state, update the SW reserved fields in the [f]xsave frame
> * indicating the absence/presence of the extended state to the user.
> */
> -bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
> +bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size, u32 pkru)
> {
> struct task_struct *tsk = current;
> struct fpstate *fpstate = tsk->thread.fpu.fpstate;
> @@ -228,7 +228,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
> fpregs_restore_userregs();
>
> pagefault_disable();
> - ret = copy_fpregs_to_sigframe(buf_fx);
> + ret = copy_fpregs_to_sigframe(buf_fx, pkru);
> pagefault_enable();
> fpregs_unlock();
>
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index 31b6f5dddfc2..94b894437327 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -74,7 +74,7 @@ static inline int is_x32_frame(struct ksignal *ksig)
> */
> void __user *
> get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> - void __user **fpstate)
> + void __user **fpstate, u32 pkru)
> {
> struct k_sigaction *ka = &ksig->ka;
> int ia32_frame = is_ia32_frame(ksig);
> @@ -139,7 +139,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> }
>
> /* save i387 and extended state */
> - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size))
> + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))

You might find that we only need to update PKRU right before copy_fpstate_to_sigframe,
so no need to pass pkru all the way from top.

-Jeff

2024-06-10 21:39:45

by Jeff Xu

[permalink] [raw]
Subject: Re [PATCH v5 2/5] x86/pkeys: Add helper functions to update PKRU on sigframe

The orig_pkru & init_pkru_value is quite difficult to understand.

case 1> init_pkru: 00 (allow all)
orig_pkru all cases => allow all

case 2> init_pkru: 01 (disable all)
Orig_pkru:
allow all 00 => 00 allow all.
disable all 01 => 01 disable all.
disable write 10 => 00 allow all <--- *** odd ***
disable all 11 => 01 disable all

case 3> init pkru: 10 (disable write)
allow all 00 => 00 allow all.
disable all 01 => 00 (allow all) <----*** odd ***
disable write 10 => 10 allow all
disable all 11 => 10 disable write <--- *** odd ***

case 4> init pkru: 11 (disable all)
orig_pkru all cases => unchanged.

set PKRU(0) seems to be better, easy to understand.

In addition, kernel overwrites the PKRU during the
signal handleing is a new ABI, it might be the best
to add a flag during sigaltstack(), similar to
how SS_AUTODISARM is set.

> + return orig_pkru;
> +}
> +

-Jeff

2024-06-10 21:45:26

by Jeff Xu

[permalink] [raw]
Subject: Re [PATCH v5 4/5] x86/pkeys: Restore altstack before sigcontext

Can we we move this patch to the first of the series? this can be an
independent patch, the problem not only affect PKRU, but also other
scenarios, as Rick pointed out in [1]

[1] https://lore.kernel.org/lkml/[email protected]/

-Jeff

2024-06-11 14:02:58

by Aruna Ramakrishna

[permalink] [raw]
Subject: Re: Re [PATCH v5 1/5] x86/pkeys: Add PKRU as a parameter in signal handling functions



> On Jun 10, 2024, at 2:31 PM, [email protected] wrote:
>
> On Thu, Jun 06, 2024 at 10:40:31PM +0000, Aruna Ramakrishna wrote:
>
>> diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
>> index e770c4fc47f4..de458354a3ea 100644
>> --- a/arch/x86/include/asm/sighandling.h
>> +++ b/arch/x86/include/asm/sighandling.h
>> @@ -17,11 +17,11 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
>>
>> void __user *
>> get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
>> - void __user **fpstate);
>> + void __user **fpstate, u32 pkru);
>>
>> -int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs);
>> -int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
>> -int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
>> -int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
>> +int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
>> +int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
>> +int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
>> +int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
>
> Does ia32 support PKEY ? I thought it is X64 only.
> It might be possible to not to change any of ia32 code.

From the Intel manual, it did not seem that pkey support was x64 only.

>
>
>>
>> #endif /* _ASM_X86_SIGHANDLING_H */
>> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
>> index 247f2225aa9f..2b3b9e140dd4 100644
>> --- a/arch/x86/kernel/fpu/signal.c
>> +++ b/arch/x86/kernel/fpu/signal.c
>> @@ -156,7 +156,7 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
>> return !err;
>> }
>>
>> -static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
>> +static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
>> {
>> if (use_xsave())
>> return xsave_to_user_sigframe(buf);
>> @@ -185,7 +185,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
>> * For [f]xsave state, update the SW reserved fields in the [f]xsave frame
>> * indicating the absence/presence of the extended state to the user.
>> */
>> -bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
>> +bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size, u32 pkru)
>> {
>> struct task_struct *tsk = current;
>> struct fpstate *fpstate = tsk->thread.fpu.fpstate;
>> @@ -228,7 +228,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
>> fpregs_restore_userregs();
>>
>> pagefault_disable();
>> - ret = copy_fpregs_to_sigframe(buf_fx);
>> + ret = copy_fpregs_to_sigframe(buf_fx, pkru);
>> pagefault_enable();
>> fpregs_unlock();
>>
>> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
>> index 31b6f5dddfc2..94b894437327 100644
>> --- a/arch/x86/kernel/signal.c
>> +++ b/arch/x86/kernel/signal.c
>> @@ -74,7 +74,7 @@ static inline int is_x32_frame(struct ksignal *ksig)
>> */
>> void __user *
>> get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
>> - void __user **fpstate)
>> + void __user **fpstate, u32 pkru)
>> {
>> struct k_sigaction *ka = &ksig->ka;
>> int ia32_frame = is_ia32_frame(ksig);
>> @@ -139,7 +139,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
>> }
>>
>> /* save i387 and extended state */
>> - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size))
>> + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
>
> You might find that we only need to update PKRU right before copy_fpstate_to_sigframe,
> so no need to pass pkru all the way from top.
>

The PKRU is updated in copy_fpregs_to_sigframe(), right after xsave_to_user_sigframe().
But the code that decides or cares about what pkru is written to the sigframe is handle_signal(), so
it seemed cleaner to do those checks there and pass in pkru as a parameter down the stack.

Thanks,
Aruna

2024-06-11 14:06:23

by Aruna Ramakrishna

[permalink] [raw]
Subject: Re: Re [PATCH v5 2/5] x86/pkeys: Add helper functions to update PKRU on sigframe



> On Jun 10, 2024, at 2:39 PM, [email protected] wrote:
>
> The orig_pkru & init_pkru_value is quite difficult to understand.
>
> case 1> init_pkru: 00 (allow all)
> orig_pkru all cases => allow all
>
> case 2> init_pkru: 01 (disable all)
> Orig_pkru:
> allow all 00 => 00 allow all.
> disable all 01 => 01 disable all.
> disable write 10 => 00 allow all <--- *** odd ***
> disable all 11 => 01 disable all
>
> case 3> init pkru: 10 (disable write)
> allow all 00 => 00 allow all.
> disable all 01 => 00 (allow all) <----*** odd ***
> disable write 10 => 10 allow all
> disable all 11 => 10 disable write <--- *** odd ***
>
> case 4> init pkru: 11 (disable all)
> orig_pkru all cases => unchanged.
>
> set PKRU(0) seems to be better, easy to understand.
>

I’m not sure I follow.

The default init_pkru is 0x55555554 (enable only pkey 0). Let’s assume the application
sets up PKRU = 0x55555545 (i.e. enable only pkey 2). We want to set up the PKRU
to enable both pkey 0 and pkey 2, before the XSAVE, so that both the current stack as
well as the alternate signal stack are writable.

So with
write_pkru(orig_pkru & pkru_get_init_value());

It changes PKRU to 0x55555544 - enabling both pkey 0 and pkey 2.

After the XSAVE, it calls update_pkru_in_sigframe(), which overwrites this (new)
PKRU saved on the sigframe with orig_pkru, which is 0x55555545 in this example.

Setting PKRU to 0 would be simpler, it would enable all pkeys - 0 through 15 - which,
as Thomas pointed out, seems unnecessary. The application needs the pkey it
enabled for access to its own stack, and we need to enable pkey 0 under the hood
to enable access to the alternate signal stack.

Thanks,
Aruna

2024-06-11 14:09:10

by Aruna Ramakrishna

[permalink] [raw]
Subject: Re: Re [PATCH v5 4/5] x86/pkeys: Restore altstack before sigcontext



> On Jun 10, 2024, at 2:44 PM, [email protected] wrote:
>
> Can we we move this patch to the first of the series? this can be an
> independent patch, the problem not only affect PKRU, but also other
> scenarios, as Rick pointed out in [1]
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>
> -Jeff

For this patch set, the issue with rt_sigreturn() is only exposed after patch 3/5 - i.e., when
copy_fpregs_to_sigframe() calls update_pkru_in_sigframe() to update the PKRU value to
user-specified PKRU that might disable pkey 0 access, thus breaking restore_altstack().
So it seemed logical to me to have this fix as patch 4/5 of this series. I’m not strongly
opposed to moving this to patch 1/5, but this ordering is easier to understand (I think).
But if this patch needs to be broken out of this patchset and submitted independently
(for the other scenarios you mentioned) - I can do that.

Thanks,
Aruna

2024-06-11 21:26:37

by Jeff Xu

[permalink] [raw]
Subject: Re: Re [PATCH v5 1/5] x86/pkeys: Add PKRU as a parameter in signal handling functions

On Tue, Jun 11, 2024 at 6:56 AM Aruna Ramakrishna
<[email protected]> wrote:
>
>
>
> > On Jun 10, 2024, at 2:31 PM, [email protected] wrote:
> >
> > On Thu, Jun 06, 2024 at 10:40:31PM +0000, Aruna Ramakrishna wrote:
> >
> >> diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
> >> index e770c4fc47f4..de458354a3ea 100644
> >> --- a/arch/x86/include/asm/sighandling.h
> >> +++ b/arch/x86/include/asm/sighandling.h
> >> @@ -17,11 +17,11 @@ void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
> >>
> >> void __user *
> >> get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> >> - void __user **fpstate);
> >> + void __user **fpstate, u32 pkru);
> >>
> >> -int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs);
> >> -int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
> >> -int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
> >> -int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs);
> >> +int ia32_setup_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> >> +int ia32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> >> +int x64_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> >> +int x32_setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru);
> >
> > Does ia32 support PKEY ? I thought it is X64 only.
> > It might be possible to not to change any of ia32 code.
>
> From the Intel manual, it did not seem that pkey support was x64 only.
>
https://docs.kernel.org/core-api/protection-keys.html#memory-protection-keys

"The feature is only available in 64-bit mode, even though there is
theoretically space in the PAE PTEs."

> >
> >
> >>
> >> #endif /* _ASM_X86_SIGHANDLING_H */
> >> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> >> index 247f2225aa9f..2b3b9e140dd4 100644
> >> --- a/arch/x86/kernel/fpu/signal.c
> >> +++ b/arch/x86/kernel/fpu/signal.c
> >> @@ -156,7 +156,7 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
> >> return !err;
> >> }
> >>
> >> -static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
> >> +static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
> >> {
> >> if (use_xsave())
> >> return xsave_to_user_sigframe(buf);
> >> @@ -185,7 +185,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
> >> * For [f]xsave state, update the SW reserved fields in the [f]xsave frame
> >> * indicating the absence/presence of the extended state to the user.
> >> */
> >> -bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
> >> +bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size, u32 pkru)
> >> {
> >> struct task_struct *tsk = current;
> >> struct fpstate *fpstate = tsk->thread.fpu.fpstate;
> >> @@ -228,7 +228,7 @@ bool copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
> >> fpregs_restore_userregs();
> >>
> >> pagefault_disable();
> >> - ret = copy_fpregs_to_sigframe(buf_fx);
> >> + ret = copy_fpregs_to_sigframe(buf_fx, pkru);
> >> pagefault_enable();
> >> fpregs_unlock();
> >>
> >> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> >> index 31b6f5dddfc2..94b894437327 100644
> >> --- a/arch/x86/kernel/signal.c
> >> +++ b/arch/x86/kernel/signal.c
> >> @@ -74,7 +74,7 @@ static inline int is_x32_frame(struct ksignal *ksig)
> >> */
> >> void __user *
> >> get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> >> - void __user **fpstate)
> >> + void __user **fpstate, u32 pkru)
> >> {
> >> struct k_sigaction *ka = &ksig->ka;
> >> int ia32_frame = is_ia32_frame(ksig);
> >> @@ -139,7 +139,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> >> }
> >>
> >> /* save i387 and extended state */
> >> - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size))
> >> + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
> >
> > You might find that we only need to update PKRU right before copy_fpstate_to_sigframe,
> > so no need to pass pkru all the way from top.
> >
>
> The PKRU is updated in copy_fpregs_to_sigframe(), right after xsave_to_user_sigframe().
> But the code that decides or cares about what pkru is written to the sigframe is handle_signal(), so
> it seemed cleaner to do those checks there and pass in pkru as a parameter down the stack.
>
IIUC, the kernel need the read/write access from
get_sigframe (arch/x86/kernel/signal.c), it is possible to have the
PKRU check/update PKRU inside get_sigframe, and there is no need to
pass orig_pkru from handle_signal().

Thanks
-Jeff

2024-06-11 22:13:43

by Jeff Xu

[permalink] [raw]
Subject: Re: Re [PATCH v5 2/5] x86/pkeys: Add helper functions to update PKRU on sigframe

On Tue, Jun 11, 2024 at 7:05 AM Aruna Ramakrishna
<[email protected]> wrote:
>
>
>
> > On Jun 10, 2024, at 2:39 PM, [email protected] wrote:
> >
> > The orig_pkru & init_pkru_value is quite difficult to understand.
> >
> > case 1> init_pkru: 00 (allow all)
> > orig_pkru all cases => allow all
> >
> > case 2> init_pkru: 01 (disable all)
> > Orig_pkru:
> > allow all 00 => 00 allow all.
> > disable all 01 => 01 disable all.
> > disable write 10 => 00 allow all <--- *** odd ***
> > disable all 11 => 01 disable all
> >
> > case 3> init pkru: 10 (disable write)
> > allow all 00 => 00 allow all.
> > disable all 01 => 00 (allow all) <----*** odd ***
> > disable write 10 => 10 allow all
> > disable all 11 => 10 disable write <--- *** odd ***
> >
> > case 4> init pkru: 11 (disable all)
> > orig_pkru all cases => unchanged.
> >
> > set PKRU(0) seems to be better, easy to understand.
> >
>
> I’m not sure I follow.
>
> The default init_pkru is 0x55555554 (enable only pkey 0). Let’s assume the application
> sets up PKRU = 0x55555545 (i.e. enable only pkey 2). We want to set up the PKRU
> to enable both pkey 0 and pkey 2, before the XSAVE, so that both the current stack as
> well as the alternate signal stack are writable.
>
> So with
> write_pkru(orig_pkru & pkru_get_init_value());
>
> It changes PKRU to 0x55555544 - enabling both pkey 0 and pkey 2.
>
Consider below examples:

1>
The default init_pkru is 0x55555554 (pkey 0 has all access, 1-15 disable all)
and thread has PKRU of 0xaaaaaaa8 (pkey 0 has all access, 1-15 disable write)
init_pkru & curr_pkru will have 0x0
If altstack is protected by pkey 1, your code will change PKRU to 0,
so the kernel is able to read/write to altstack.

2>
However when the thread's PKRU disable all access to 1-15:
The default init_pkru is 0x55555554 (pkey 0 has all access, 1-15 disable all)
and thread has PKRU of 0x5555554 (pkey 0 has all access, 1-15 disable all)
init_pkru & curr_pkru will have 0x55555554
If altstack is protected by pkey 1, kernel doesn't change PKRU, so
still not able
to access altstack.

3> This algorithm is stranger if inti_pkru is configured differently:
The init_pkru is 0xaaaaaaa8 (pkey 0 has all access, and 1-15 disables write.)
and thread has PKRU of 0x55555554 (pkey 0 has all access, 1-15 disable all)
init_pkru & curr_pkru will have 0x0 (0-15 has all access).

Overall I think this is a confusing algorithm to decide the new PKRU to use.

> After the XSAVE, it calls update_pkru_in_sigframe(), which overwrites this (new)
> PKRU saved on the sigframe with orig_pkru, which is 0x55555545 in this example.
>
> Setting PKRU to 0 would be simpler, it would enable all pkeys - 0 through 15 - which,
> as Thomas pointed out, seems unnecessary. The application needs the pkey it
> enabled for access to its own stack, and we need to enable pkey 0 under the hood
> to enable access to the alternate signal stack.
>

I think you are referring to Thomas's comments in V3, copy here for
ease of response:

>User space resumes with the default PKRU value and the first thing user
>space does when entering the signal handler is to push stuff on the
>signal stack.
...
> If user space protects the task stack or the sigalt stack with a key
> which is not in init_pkru_value then it does not matter at all whether
> it dies in handle_signal() or later when returning to user space, no?

The userspace could register a custom handler (written in assembly) to
change PKRU and allow access to the stack, this could be written in such
that it is before pushing stuff to the stack. So all it requires is
that the kernel
doesn't SIGSEGV when preparing the signal frame in sigaltstack, this is
where PKRU=0 inside the kernel path helps.

Even today, without patch, one can already do following:
1> use PKEY 1 to protect sigaltstack
3> let the thread have all access to PKEY 1
3> send a signal to the thread, kernel will save PKRU to the altstack correctly.
4> kernel set init_pkur before hands over control to userspace
5> userspace set PKRU to allow access to PKEY 1 as the first thing to do.
6> on sig_return, threads have PKRU restored correctly from the value
in sigframe.

That is why I consider "let kernel to modify PKRU to give access to
altstack" is
an ABI change, i.e. compare with current behavior that kernel deny
such access.
It might be good to consider adding a new flag in ss_flags.

Thanks.
-Jeff

2024-06-11 22:17:13

by Jeff Xu

[permalink] [raw]
Subject: Re: Re [PATCH v5 4/5] x86/pkeys: Restore altstack before sigcontext

On Tue, Jun 11, 2024 at 7:08 AM Aruna Ramakrishna
<[email protected]> wrote:
>
>
>
> > On Jun 10, 2024, at 2:44 PM, [email protected] wrote:
> >
> > Can we we move this patch to the first of the series? this can be an
> > independent patch, the problem not only affect PKRU, but also other
> > scenarios, as Rick pointed out in [1]
> >
> > [1] https://lore.kernel.org/lkml/[email protected]/
> >
> > -Jeff
>
> For this patch set, the issue with rt_sigreturn() is only exposed after patch 3/5 - i.e., when
> copy_fpregs_to_sigframe() calls update_pkru_in_sigframe() to update the PKRU value to
> user-specified PKRU that might disable pkey 0 access, thus breaking restore_altstack().
> So it seemed logical to me to have this fix as patch 4/5 of this series. I’m not strongly
> opposed to moving this to patch 1/5, but this ordering is easier to understand (I think).
> But if this patch needs to be broken out of this patchset and submitted independently
> (for the other scenarios you mentioned) - I can do that.
>
My main thought is that Rick mentioned this can be a cross-arch changes [1]
and we can start with x86/64 and other arches might follow.

[1] https://lore.kernel.org/lkml/[email protected]/

Thanks
-Jeff

> Thanks,
> Aruna