2021-01-13 17:46:20

by Piotr Figiel

Subject: [PATCH] fs/proc: Expose RSEQ configuration

For userspace checkpoint and restore (C/R) some way of getting process
state containing RSEQ configuration is needed.

There are two ways this information is going to be used:
- to re-enable RSEQ for threads which had it enabled before C/R
- to detect if a thread was in a critical section during C/R

Since C/R preserves TLS memory and addresses, the RSEQ ABI will be
restored using the address registered before C/R.

Detecting whether the thread is in a critical section during C/R is
needed to enforce RSEQ abort behavior during C/R. Attaching with
ptrace() before the registers are dumped does not by itself cause an
RSEQ abort. Restoring the instruction pointer within the critical
section is problematic because rseq_cs may get cleared before control
is passed to the migrated application code, leading to RSEQ invariants
not being preserved.

Signed-off-by: Piotr Figiel <[email protected]>
---
fs/proc/base.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index b3422cda2a91..3d4712ac4370 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -662,6 +662,20 @@ static int proc_pid_syscall(struct seq_file *m, struct pid_namespace *ns,

return 0;
}
+
+#ifdef CONFIG_RSEQ
+static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
+			 struct pid *pid, struct task_struct *task)
+{
+	int res = lock_trace(task);
+
+	if (res)
+		return res;
+	seq_printf(m, "0x%llx 0x%x\n", (uint64_t)task->rseq, task->rseq_sig);
+	unlock_trace(task);
+	return 0;
+}
+#endif /* CONFIG_RSEQ */
#endif /* CONFIG_HAVE_ARCH_TRACEHOOK */

/************************************************************************/
@@ -3182,6 +3196,9 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
ONE("syscall", S_IRUSR, proc_pid_syscall),
+#ifdef CONFIG_RSEQ
+ ONE("rseq", S_IRUSR, proc_pid_rseq),
+#endif
#endif
REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
ONE("stat", S_IRUGO, proc_tgid_stat),
@@ -3522,6 +3539,9 @@ static const struct pid_entry tid_base_stuff[] = {
&proc_pid_set_comm_operations, {}),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
ONE("syscall", S_IRUSR, proc_pid_syscall),
+#ifdef CONFIG_RSEQ
+ ONE("rseq", S_IRUSR, proc_pid_rseq),
+#endif
#endif
REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
ONE("stat", S_IRUGO, proc_tid_stat),
--
2.30.0.284.gd98b1dd5eaa7-goog


2021-01-13 21:40:01

by Alexey Dobriyan

Subject: Re: [PATCH] fs/proc: Expose RSEQ configuration

On Wed, Jan 13, 2021 at 06:41:27PM +0100, Piotr Figiel wrote:
> For userspace checkpoint and restore (C/R) some way of getting process
> state containing RSEQ configuration is needed.

> + seq_printf(m, "0x%llx 0x%x\n", (uint64_t)task->rseq, task->rseq_sig);

%llx is too much on 32-bit. "%tx %x" is better (or even %08x)

2021-01-14 02:17:19

by Alexey Dobriyan

Subject: Re: [PATCH] fs/proc: Expose RSEQ configuration

On Wed, Jan 13, 2021 at 06:41:27PM +0100, Piotr Figiel wrote:
> +static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
> + struct pid *pid, struct task_struct *task)
> +{
> + int res = lock_trace(task);
> +
> + if (res)
> + return res;
> + seq_printf(m, "0x%llx 0x%x\n", (uint64_t)task->rseq, task->rseq_sig);

may I suggest

"%tx", (uintptr_t) // or %lx

Mandatory 64-bit is too much on 32-bit.

Or even "%tx %08x" ?

2021-01-14 19:25:42

by Piotr Figiel

Subject: Re: [PATCH] fs/proc: Expose RSEQ configuration

On Thu, Jan 14, 2021 at 12:32:30AM +0300, Alexey Dobriyan wrote:
> On Wed, Jan 13, 2021 at 06:41:27PM +0100, Piotr Figiel wrote:
> > For userspace checkpoint and restore (C/R) some way of getting process
> > state containing RSEQ configuration is needed.
> > + seq_printf(m, "0x%llx 0x%x\n", (uint64_t)task->rseq, task->rseq_sig);
> %llx is too much on 32-bit. "%tx %x" is better (or even %08x)

Hi, many thanks for the suggestion. I applied this in v2:
https://lore.kernel.org/linux-fsdevel/[email protected]
I had to cast the pointer via uintptr_t to cast away the user address
space without warnings. Could you please take a look?

Best regards, Piotr.
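
For readers following the thread, here is a minimal userspace sketch of how
a C/R tool might consume the proposed file. It assumes the interface exactly
as posted in this patch: one line with two hexadecimal fields (the registered
rseq address, then the signature), available at /proc/<pid>/rseq and
/proc/<pid>/task/<tid>/rseq, readable only by the owner and subject to the
same lock_trace() check as "syscall". The names struct rseq_conf,
parse_rseq_line() and read_rseq_conf() are hypothetical and not part of any
existing tool. Because scanf's hex conversions accept an optional "0x"
prefix, the parser should cope with both the "0x%llx 0x%x" format from v1
and an unprefixed "%tx %08x" style as suggested in review.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the two fields printed by the proposed file. */
struct rseq_conf {
	uint64_t addr;	/* user address of the registered struct rseq, 0 if none */
	uint32_t sig;	/* signature supplied when rseq was registered */
};

/*
 * Parse one line of the proposed format, e.g. "0x7f12a4c0e9a0 0x53053053".
 * Returns 0 on success, -1 if the line does not hold two hex fields.
 * *out is only written on success.
 */
int parse_rseq_line(const char *line, struct rseq_conf *out)
{
	uint64_t addr;
	uint32_t sig;

	assert(line != NULL && out != NULL);
	if (sscanf(line, "%" SCNx64 " %" SCNx32, &addr, &sig) != 2)
		return -1;
	out->addr = addr;
	out->sig = sig;
	return 0;
}

/*
 * Read the per-thread file. The path is an assumption taken from the
 * patch (the "rseq" entry added to tid_base_stuff).
 */
int read_rseq_conf(int pid, int tid, struct rseq_conf *out)
{
	char path[64], line[64];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path), "/proc/%d/task/%d/rseq", pid, tid);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* no such file, or no ptrace access to the task */
	if (fgets(line, sizeof(line), f))
		ret = parse_rseq_line(line, out);
	fclose(f);
	return ret;
}
```

A restorer could call read_rseq_conf() for each thread at dump time and, if
addr is non-zero, re-register rseq with the same address and signature after
restore. The address and signature in the comment are example values only.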