2022-08-08 19:38:15

by Rik van Riel

Subject: [PATCH v5] livepatch: fix race between fork and KLP transition

The KLP transition code depends on the TIF_PATCH_PENDING and
the task->patch_state to stay in sync. On a normal (forward)
transition, TIF_PATCH_PENDING will be set on every task in
the system, while on a reverse transition (after a failed
forward one) first TIF_PATCH_PENDING will be cleared from
every task, followed by it being set on tasks that need to
be transitioned back to the original code.

However, the fork code copies over the TIF_PATCH_PENDING flag
from the parent to the child early on, in dup_task_struct and
setup_thread_stack. Much later, klp_copy_process will set
child->patch_state to match that of the parent.

However, the parent's patch_state may have been changed by KLP loading
or unloading since it was initially copied over into the child.

This results in the KLP code occasionally hitting this warning in
klp_complete_transition:

	for_each_process_thread(g, task) {
		WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
		task->patch_state = KLP_UNDEFINED;
	}
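
The interleaving described above can be modelled in userspace. The
following is only a sketch: struct model_task, enum model_state and
every model_* function are hypothetical stand-ins invented for
illustration, not kernel APIs, and the model ignores locking entirely.
It shows a child ending up with a pending flag that no longer matches
the state it inherited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
enum model_state { MODEL_UNDEFINED = -1, MODEL_UNPATCHED = 0, MODEL_PATCHED = 1 };

struct model_task {
	bool patch_pending;		/* models TIF_PATCH_PENDING */
	enum model_state patch_state;	/* models task->patch_state */
};

/* Early fork step: the flag travels with the copied task data. */
static void model_dup_task(struct model_task *child,
			   const struct model_task *parent)
{
	assert(child && parent);
	child->patch_pending = parent->patch_pending;
	child->patch_state = MODEL_UNDEFINED;
}

/* Late fork step as it was before the patch: only the state is copied. */
static void model_copy_process_old(struct model_task *child,
				   const struct model_task *parent)
{
	assert(child && parent);
	child->patch_state = parent->patch_state;
}

/* The parent switches to the target state and drops its pending flag. */
static void model_switch_task(struct model_task *task, enum model_state target)
{
	task->patch_state = target;
	task->patch_pending = false;
}

/* Counts tasks that would trip a pending-flag check at completion time. */
static size_t model_count_stale(const struct model_task *tasks, size_t n)
{
	size_t stale = 0;

	for (size_t i = 0; i < n; i++)
		if (tasks[i].patch_pending)
			stale++;
	return stale;
}

/* Runs the racy interleaving and returns the resulting child. */
static struct model_task model_fork_race(void)
{
	struct model_task parent = { true, MODEL_UNPATCHED };
	struct model_task child;

	model_dup_task(&child, &parent);		/* flag copied early */
	model_switch_task(&parent, MODEL_PATCHED);	/* transition in between */
	model_copy_process_old(&child, &parent);	/* state copied late */
	return child;
}
```

Calling model_fork_race() yields a child whose state already matches
the parent's new state while its pending flag is still set, which is
the kind of task model_count_stale() counts.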

Set, or clear, the TIF_PATCH_PENDING flag in the child task
depending on whether or not it is needed at the time
klp_copy_process is called, at a point in copy_process where the
tasklist_lock is held exclusively, preventing races with the KLP
code.
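
As a rough userspace illustration of that approach (again with
hypothetical names: struct sketch_task and the sketch_* helpers are
not kernel code, and the lock that serializes the real operation is
only assumed here, not modelled), the late fork step refreshes both
fields from the parent at the same point:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the KLP-related task fields. */
struct sketch_task {
	bool patch_pending;	/* models TIF_PATCH_PENDING */
	int patch_state;	/* models task->patch_state */
};

/*
 * Models the late fork step after the fix. The caller is assumed to
 * hold whatever lock serializes transitions, so the parent cannot
 * change between the two assignments.
 */
static void sketch_copy_process(struct sketch_task *child,
				const struct sketch_task *parent)
{
	assert(child && parent);
	child->patch_pending = parent->patch_pending;	/* set or clear */
	child->patch_state = parent->patch_state;
}

/* Returns true when the child's flag and state both match the parent's. */
static bool sketch_in_sync(const struct sketch_task *child,
			   const struct sketch_task *parent)
{
	return child->patch_pending == parent->patch_pending &&
	       child->patch_state == parent->patch_state;
}
```

Whatever value the child's flag picked up from an earlier copy, it is
overwritten here, so a transition that completed in between no longer
leaves a stale flag behind in this model.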

The KLP code does have a few places where the state is changed
without the tasklist_lock held, but those should not cause
problems: klp_update_patch_state(current) cannot be called while
the current task is in the middle of fork;
klp_check_and_switch_task() is called under the pi_lock, which
prevents rescheduling; and the patch state of idle tasks is
manipulated directly, but idle tasks do not fork.

This should prevent this warning from triggering again in the
future, and close the race for both normal and reverse transitions.

Signed-off-by: Rik van Riel <[email protected]>
Reported-by: Breno Leitao <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Fixes: d83a7cb375ee ("livepatch: change to a per-task consistency model")
Cc: [email protected]
---
v5: incorporate changelog suggestions by Petr (thank you)

kernel/livepatch/transition.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 5d03a2ad1066..30187b1d8275 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -610,9 +610,23 @@ void klp_reverse_transition(void)
 /* Called from copy_process() during fork */
 void klp_copy_process(struct task_struct *child)
 {
-	child->patch_state = current->patch_state;
 
-	/* TIF_PATCH_PENDING gets copied in setup_thread_stack() */
+	/*
+	 * The parent process may have gone through a KLP transition since
+	 * the thread flag was copied in setup_thread_stack earlier. Bring
+	 * the task flag up to date with the parent here.
+	 *
+	 * The operation is serialized against all klp_*_transition()
+	 * operations by the tasklist_lock. The only exception is
+	 * klp_update_patch_state(current), but we cannot race with
+	 * that because we are current.
+	 */
+	if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
+		set_tsk_thread_flag(child, TIF_PATCH_PENDING);
+	else
+		clear_tsk_thread_flag(child, TIF_PATCH_PENDING);
+
+	child->patch_state = current->patch_state;
 }
 
 /*
--
2.37.1



2022-09-01 15:03:48

by Petr Mladek

Subject: Re: [PATCH v5] livepatch: fix race between fork and KLP transition

On Mon 2022-08-08 15:00:19, Rik van Riel wrote:
> The KLP transition code depends on the TIF_PATCH_PENDING and
> the task->patch_state to stay in sync. On a normal (forward)
> transition, TIF_PATCH_PENDING will be set on every task in
> the system, while on a reverse transition (after a failed
> forward one) first TIF_PATCH_PENDING will be cleared from
> every task, followed by it being set on tasks that need to
> be transitioned back to the original code.
>
> However, the fork code copies over the TIF_PATCH_PENDING flag
> from the parent to the child early on, in dup_task_struct and
> setup_thread_stack. Much later, klp_copy_process will set
> child->patch_state to match that of the parent.
>
> However, the parent's patch_state may have been changed by KLP loading
> or unloading since it was initially copied over into the child.
>
> This results in the KLP code occasionally hitting this warning in
> klp_complete_transition:
>
> 	for_each_process_thread(g, task) {
> 		WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> 		task->patch_state = KLP_UNDEFINED;
> 	}
>
> Set, or clear, the TIF_PATCH_PENDING flag in the child task
> depending on whether or not it is needed at the time
> klp_copy_process is called, at a point in copy_process where the
> tasklist_lock is held exclusively, preventing races with the KLP
> code.
>
> The KLP code does have a few places where the state is changed
> without the tasklist_lock held, but those should not cause
> problems: klp_update_patch_state(current) cannot be called while
> the current task is in the middle of fork;
> klp_check_and_switch_task() is called under the pi_lock, which
> prevents rescheduling; and the patch state of idle tasks is
> manipulated directly, but idle tasks do not fork.
>
> This should prevent this warning from triggering again in the
> future, and close the race for both normal and reverse transitions.
>
> Signed-off-by: Rik van Riel <[email protected]>
> Reported-by: Breno Leitao <[email protected]>
> Reviewed-by: Petr Mladek <[email protected]>
> Acked-by: Josh Poimboeuf <[email protected]>
> Fixes: d83a7cb375ee ("livepatch: change to a per-task consistency model")
> Cc: [email protected]

The patch has been pushed to livepatching/livepatching.git,
branch for-6.1/fixes.

Best Regards,
Petr