2020-07-06 20:51:12

by Mathieu Desnoyers

[permalink] [raw]
Subject: [RFC PATCH for 5.8 0/4] rseq cpu_id ABI fix

Hi,

Recent integration of rseq into glibc unearthed an issue with inaccurate
cpu_id field for newly created tasks. This series includes a fix for the
underlying issue (meant to be backported to stable), as well as new rseq
flags to let user-space know that the kernel implements this fix, so
glibc and other rseq users can use this flag to know whether they can
safely use rseq without risk of corrupting their per-cpu data. This new
flag could either be added only to the master branch (no stable
backport) or backported to stable, depending on what seems the most
appropriate.

This is an RFC aiming for quick inclusion into the Linux kernel, unless
we prefer reverting the entire rseq glibc integration and try again in 6
months. Their upcoming release is on August 3rd, so we need to take a
decision on this matter quickly.

Thanks,

Mathieu

Mathieu Desnoyers (4):
sched: Fix unreliable rseq cpu_id for new tasks
rseq: Introduce RSEQ_FLAG_REGISTER
rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID
rseq: selftests: Expect reliable cpu_id field

include/uapi/linux/rseq.h | 15 +++++-
kernel/rseq.c | 81 ++++++++++++++++-------------
kernel/sched/core.c | 2 +
tools/testing/selftests/rseq/rseq.c | 10 +++-
4 files changed, 71 insertions(+), 37 deletions(-)

--
2.17.1


2020-07-06 20:52:52

by Mathieu Desnoyers

[permalink] [raw]
Subject: [RFC PATCH for 5.8 1/4] sched: Fix unreliable rseq cpu_id for new tasks

While integrating rseq into glibc and replacing glibc's sched_getcpu
implementation with rseq, glibc's tests discovered an issue with
incorrect __rseq_abi.cpu_id field value right after the first time
a newly created process issues sched_setaffinity.

For the records, it triggers after building glibc and running tests, and
then issuing:

for x in {1..2000} ; do posix/tst-affinity-static & done

and shows up as:

error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0

This is caused by the scheduler invoking __set_task_cpu() directly from
sched_fork() and wake_up_new_task(), thus bypassing rseq_migrate() which
is done by set_task_cpu().

Add the missing rseq_migrate() to both functions. The only other direct
use of __set_task_cpu() is done by init_idle(), which does not involve a
user-space task.

Based on my testing with the glibc test-case, just adding rseq_migrate()
to wake_up_new_task() is sufficient to fix the observed issue. Also add
it to sched_fork() to keep things consistent.

The reason why this never triggered so far with the rseq/basic_test
selftest is unclear.

The current use of sched_getcpu(3) does not typically require it to be
always accurate. However, use of the __rseq_abi.cpu_id field within rseq
critical sections requires it to be accurate. If it is not accurate, it
can cause corruption in the per-cpu data targeted by rseq critical
sections in user-space.

Link: https://sourceware.org/pipermail/libc-alpha/2020-July/115816.html
Signed-off-by: Mathieu Desnoyers <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Neel Natu <[email protected]>
Cc: [email protected]
Cc: [email protected] # v4.18+
---
kernel/sched/core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ca5db40392d4..86a855bd4d90 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2962,6 +2962,7 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
* Silence PROVE_RCU.
*/
raw_spin_lock_irqsave(&p->pi_lock, flags);
+ rseq_migrate(p);
/*
* We're setting the CPU for the first time, we don't migrate,
* so use __set_task_cpu().
@@ -3026,6 +3027,7 @@ void wake_up_new_task(struct task_struct *p)
* as we're not fully set-up yet.
*/
p->recent_used_cpu = task_cpu(p);
+ rseq_migrate(p);
__set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
#endif
rq = __task_rq_lock(p, &rf);
--
2.17.1

2020-07-07 06:28:12

by Florian Weimer

[permalink] [raw]
Subject: Re: [RFC PATCH for 5.8 0/4] rseq cpu_id ABI fix

* Mathieu Desnoyers:

> This is an RFC aiming for quick inclusion into the Linux kernel, unless
> we prefer reverting the entire rseq glibc integration and try again in 6
> months. Their upcoming release is on August 3rd, so we need to take a
> decision on this matter quickly.

Just to clarify here, I don't think it's necessary to change glibc so
that it only enables the rseq functionality if this particular bug is
not present. From our perspective, it's just another kernel bug.

If you truly feel we must not enable this feature in its present
kernel state, then we need a working patch on all sides by the end of
the week because the hard ABI freeze for glibc 2.32 is coming up, and
at without any other patches, the only safe choice to prevent glibc
from using slightly broken rseq support would be to back out the rseq
patches.

But again, I don't think such drastic action is necessary.

2020-07-07 07:32:48

by Florian Weimer

[permalink] [raw]
Subject: Re: [RFC PATCH for 5.8 1/4] sched: Fix unreliable rseq cpu_id for new tasks

* Mathieu Desnoyers:

> While integrating rseq into glibc and replacing glibc's sched_getcpu
> implementation with rseq, glibc's tests discovered an issue with
> incorrect __rseq_abi.cpu_id field value right after the first time
> a newly created process issues sched_setaffinity.
>
> For the records, it triggers after building glibc and running tests, and
> then issuing:
>
> for x in {1..2000} ; do posix/tst-affinity-static & done
>
> and shows up as:
>
> error: Unexpected CPU 2, expected 0
> error: Unexpected CPU 2, expected 0
> error: Unexpected CPU 2, expected 0
> error: Unexpected CPU 2, expected 0
> error: Unexpected CPU 138, expected 0
> error: Unexpected CPU 138, expected 0
> error: Unexpected CPU 138, expected 0
> error: Unexpected CPU 138, expected 0

As far as I can tell, the glibc reproducer no longer shows the issue
with this patch applied.

Tested-By: Florian Weimer <[email protected]>

2020-07-07 10:52:24

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [RFC PATCH for 5.8 1/4] sched: Fix unreliable rseq cpu_id for new tasks

----- On Jul 7, 2020, at 3:30 AM, Florian Weimer [email protected] wrote:

> * Mathieu Desnoyers:
>
>> While integrating rseq into glibc and replacing glibc's sched_getcpu
>> implementation with rseq, glibc's tests discovered an issue with
>> incorrect __rseq_abi.cpu_id field value right after the first time
>> a newly created process issues sched_setaffinity.
>>
>> For the records, it triggers after building glibc and running tests, and
>> then issuing:
>>
>> for x in {1..2000} ; do posix/tst-affinity-static & done
>>
>> and shows up as:
>>
>> error: Unexpected CPU 2, expected 0
>> error: Unexpected CPU 2, expected 0
>> error: Unexpected CPU 2, expected 0
>> error: Unexpected CPU 2, expected 0
>> error: Unexpected CPU 138, expected 0
>> error: Unexpected CPU 138, expected 0
>> error: Unexpected CPU 138, expected 0
>> error: Unexpected CPU 138, expected 0
>
> As far as I can tell, the glibc reproducer no longer shows the issue
> with this patch applied.
>
> Tested-By: Florian Weimer <[email protected]>

Thanks a lot Florian for your thorough review and testing !

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

2020-07-07 14:56:03

by Florian Weimer

[permalink] [raw]
Subject: Re: [RFC PATCH for 5.8 0/4] rseq cpu_id ABI fix

I would like to point out that the subject is misleading: This is not
an ABI change. It fixes the contents of the __rseq_abi TLS variable
(as glibc calls it), but that's it.

(Sorry, I should have mentioned this earlier.)