2023-06-14 12:56:49

by Wander Lairson Costa

[permalink] [raw]
Subject: [PATCH v10 0/2] kernel/fork: beware of __put_task_struct calling context

Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping
locks. Therefore, it can't be called from an non-preemptible context.

Instead of calling __put_task_struct() directly, we defer it using
call_rcu(). A more natural approach would use a workqueue, but since
in PREEMPT_RT, we can't allocate dynamic memory from atomic context,
the code would become more complex because we would need to put the
work_struct instance in the task_struct and initialize it when we
allocate a new task_struct.

Changelog
=========

v1:
* Initial implementation fixing the splat.

v2:
* Isolate the logic in its own function.
* Fix two more cases caught in review.

v3:
* Change __put_task_struct() to handle the issue internally.

v4:
* Explain why call_rcu() is safe to call from interrupt context.

v5:
* Explain why __put_task_struct() doesn't conflict with
put_task_sruct_rcu_user.

v6:
* As per Sebastian's review, revert back the implementation of v2
with a distinct function.
* Add a check in put_task_struct() to warning when called from a
non-sleepable context.
* Address more call sites.

v7:
* Fix typos.
* Add an explanation why the new function doesn't conflict with
delayed_free_task().

v8:
* Bring back v5.
* Fix coding style.

v9:
* Reorganize to not need ___put_task_struct() by Oleg's suggestion.

v10:
* Add a patch preventing a splat when compile with
CONFIG_PROVE_RAW_LOCK_NESTING.

Reported-by: Hu Chunyu <[email protected]>
Suggested-by: Oleg Nesterov <[email protected]>
Suggested-by: Valentin Schneider <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Paul McKenney <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Luis Goncalves <[email protected]>

Wander Lairson Costa (2):
kernel/fork: beware of __put_task_struct calling context
sched: avoid false lockdep splat in put_task_struct()

include/linux/sched/task.h | 38 +++++++++++++++++++++++++++++++++++++-
kernel/fork.c | 8 ++++++++
2 files changed, 45 insertions(+), 1 deletion(-)

--
2.40.1



2023-06-14 13:05:15

by Wander Lairson Costa

[permalink] [raw]
Subject: [PATCH v10 2/2] sched: avoid false lockdep splat in put_task_struct()

In put_task_struct(), a spin_lock is indirectly acquired under the kernel
stock. When running the kernel in real-time (RT) configuration, the
operation is dispatched to a preemptible context call to ensure
guaranteed preemption. However, if PROVE_RAW_LOCK_NESTING is enabled
and __put_task_struct() is called while holding a raw_spinlock, lockdep
incorrectly reports an "Invalid lock context" in the stock kernel.

This false splat occurs because lockdep is unaware of the different
route taken under RT. To address this issue, override the inner wait
type to prevent the false lockdep splat.

Signed-off-by: Wander Lairson Costa <[email protected]>
Suggested-by: Oleg Nesterov <[email protected]>
Suggested-by: Sebastian Andrzej Siewior <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Luis Goncalves <[email protected]>
---
include/linux/sched/task.h | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index d20de91e3b95..b53909027771 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -125,6 +125,19 @@ static inline void put_task_struct(struct task_struct *t)
if (!refcount_dec_and_test(&t->usage))
return;

+ /*
+ * In !RT, it is always safe to call __put_task_struct().
+ * Under RT, we can only call it in preemptible context.
+ */
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
+ static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
+
+ lock_map_acquire_try(&put_task_map);
+ __put_task_struct(t);
+ lock_map_release(&put_task_map);
+ return;
+ }
+
/*
* under PREEMPT_RT, we can't call put_task_struct
* in atomic context because it will indirectly
@@ -145,10 +158,7 @@ static inline void put_task_struct(struct task_struct *t)
* when it fails to fork a process. Therefore, there is no
* way it can conflict with put_task_struct().
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
- call_rcu(&t->rcu, __put_task_struct_rcu_cb);
- else
- __put_task_struct(t);
+ call_rcu(&t->rcu, __put_task_struct_rcu_cb);
}

static inline void put_task_struct_many(struct task_struct *t, int nr)
--
2.40.1


2023-06-19 12:24:17

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v10 0/2] kernel/fork: beware of __put_task_struct calling context

On Wed, Jun 14, 2023 at 09:23:20AM -0300, Wander Lairson Costa wrote:
> Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping
> locks. Therefore, it can't be called from an non-preemptible context.
>
> Instead of calling __put_task_struct() directly, we defer it using
> call_rcu(). A more natural approach would use a workqueue, but since
> in PREEMPT_RT, we can't allocate dynamic memory from atomic context,
> the code would become more complex because we would need to put the
> work_struct instance in the task_struct and initialize it when we
> allocate a new task_struct.

>
> Wander Lairson Costa (2):
> kernel/fork: beware of __put_task_struct calling context
> sched: avoid false lockdep splat in put_task_struct()

Thanks!

Subject: [tip: sched/core] sched: avoid false lockdep splat in put_task_struct()

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 893cdaaa3977be6afb3a7f756fbfd7be83f68d8c
Gitweb: https://git.kernel.org/tip/893cdaaa3977be6afb3a7f756fbfd7be83f68d8c
Author: Wander Lairson Costa <[email protected]>
AuthorDate: Wed, 14 Jun 2023 09:23:22 -03:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 13 Jul 2023 15:21:48 +02:00

sched: avoid false lockdep splat in put_task_struct()

In put_task_struct(), a spin_lock is indirectly acquired under the kernel
stock. When running the kernel in real-time (RT) configuration, the
operation is dispatched to a preemptible context call to ensure
guaranteed preemption. However, if PROVE_RAW_LOCK_NESTING is enabled
and __put_task_struct() is called while holding a raw_spinlock, lockdep
incorrectly reports an "Invalid lock context" in the stock kernel.

This false splat occurs because lockdep is unaware of the different
route taken under RT. To address this issue, override the inner wait
type to prevent the false lockdep splat.

Suggested-by: Oleg Nesterov <[email protected]>
Suggested-by: Sebastian Andrzej Siewior <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Wander Lairson Costa <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/linux/sched/task.h | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 6b687c1..a23af22 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -126,6 +126,19 @@ static inline void put_task_struct(struct task_struct *t)
return;

/*
+ * In !RT, it is always safe to call __put_task_struct().
+ * Under RT, we can only call it in preemptible context.
+ */
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
+ static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
+
+ lock_map_acquire_try(&put_task_map);
+ __put_task_struct(t);
+ lock_map_release(&put_task_map);
+ return;
+ }
+
+ /*
* under PREEMPT_RT, we can't call put_task_struct
* in atomic context because it will indirectly
* acquire sleeping locks.
@@ -145,10 +158,7 @@ static inline void put_task_struct(struct task_struct *t)
* when it fails to fork a process. Therefore, there is no
* way it can conflict with put_task_struct().
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
- call_rcu(&t->rcu, __put_task_struct_rcu_cb);
- else
- __put_task_struct(t);
+ call_rcu(&t->rcu, __put_task_struct_rcu_cb);
}

DEFINE_FREE(put_task, struct task_struct *, if (_T) put_task_struct(_T))