2016-10-25 11:05:53

by Roman Pen

[permalink] [raw]
Subject: [PATCH 1/1] kthread: fix possible infinite wait for parking when kthread exits meanwhile

The patch handles the case, when someone waits on parked completion but
kthread exits meanwhile. To avoid infinite wait the waiter has to spin
once more in a loop and simply try to get an alive kthread. If the
kthread has been died, put_kthread_cb() wakes up possible waiter when
kthread->vfork_done is already NULL, so next attempt to grab alive
kthread pointer will fail.

Signed-off-by: Roman Pen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: [email protected]
---
kernel/kthread.c | 41 ++++++++++++++++++++++++++++++++++++-----
1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index e8adc10556e0..a001c1ec489b 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -77,6 +77,12 @@ static void put_kthread_cb(struct callback_head *work)
struct kthread *kthread;

kthread = container_of(work, struct kthread, put_work);
+ /*
+ * Kick out possible waiter on a parked completion before
+ * ref put. That will force them to spin in a loop once
+ * more and eventually get the NULL kthread pointer.
+ */
+ complete(&kthread->parked);
put_kthread(kthread);
}

@@ -449,6 +455,11 @@ static void __kthread_unpark(struct task_struct *k, struct kthread *kthread)
}
}

+static bool __kthread_isparked(struct kthread *kthread)
+{
+ return test_bit(KTHREAD_IS_PARKED, &kthread->flags);
+}
+
/**
* kthread_unpark - unpark a thread created by kthread_create().
* @k: thread created by kthread_create().
@@ -479,23 +490,43 @@ EXPORT_SYMBOL_GPL(kthread_unpark);
*
* Returns 0 if the thread is parked, -ENOSYS if the thread exited.
* If called by the kthread itself just the park bit is set.
+ *
+ * BEWARE: The caller is responsible for ensuring the validity of @k when
+ * calling this function.
+ *
+ * BEWARE: Only one simultaneous caller is possible. Others will hang
+ * forever. You have been warned.
*/
int kthread_park(struct task_struct *k)
{
- struct kthread *kthread = to_live_kthread_and_get(k);
+ struct kthread *kthread;
int ret = -ENOSYS;

- if (kthread) {
- if (!test_bit(KTHREAD_IS_PARKED, &kthread->flags)) {
+ /*
+ * Here we try to be careful and handle the case, when kthread
+ * is going to die and will never park. In that particular case
+ * put_kthread_cb() is called when kthread->vfork_done is already
+ * NULL. put_kthread_cb() does the last completion on kthread->parked,
+ * thus we will spin once more and next attempt to get an alive
+ * kthread will fail.
+ */
+ do {
+ kthread = to_live_kthread_and_get(k);
+ if (!kthread)
+ break;
+ if (!__kthread_isparked(kthread)) {
set_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
if (k != current) {
wake_up_process(k);
wait_for_completion(&kthread->parked);
}
}
+ if (k == current || __kthread_isparked(kthread))
+ /* The way out */
+ ret = 0;
put_kthread(kthread);
- ret = 0;
- }
+ } while (ret);
+
return ret;
}
EXPORT_SYMBOL_GPL(kthread_park);
--
2.9.3


2016-10-25 15:18:18

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/1] kthread: fix possible infinite wait for parking when kthread exits meanwhile

Well. I was going to ignore this patch, I will leave this to Thomas
anyway...

But can't resist, because I have to admit I dislike this one too ;)

On 10/25, Roman Pen wrote:
>
> int kthread_park(struct task_struct *k)
> {
> - struct kthread *kthread = to_live_kthread_and_get(k);
> + struct kthread *kthread;
> int ret = -ENOSYS;
>
> - if (kthread) {
> - if (!test_bit(KTHREAD_IS_PARKED, &kthread->flags)) {
> + /*
> + * Here we try to be careful and handle the case, when kthread
> + * is going to die and will never park.

But why?

IMO we should not complicate this code for the case when the kernel
thread crashes.

> + do {
> + kthread = to_live_kthread_and_get(k);
> + if (!kthread)
> + break;
> + if (!__kthread_isparked(kthread)) {
> set_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
> if (k != current) {
> wake_up_process(k);
> wait_for_completion(&kthread->parked);
> }
> }
> + if (k == current || __kthread_isparked(kthread))
> + /* The way out */
> + ret = 0;

And why do we need to restart if __kthread_isparked() is false? In this
case we know that we were woken up by the exiting kthread which won't
park anyway.

Oleg.

2016-10-25 17:34:38

by Roman Pen

[permalink] [raw]
Subject: Re: [PATCH 1/1] kthread: fix possible infinite wait for parking when kthread exits meanwhile

On Tue, Oct 25, 2016 at 5:16 PM, Oleg Nesterov <[email protected]> wrote:
> Well. I was going to ignore this patch, I will leave this to Thomas
> anyway...
>
> But can't resist, because I have to admit I dislike this one too ;)
>
> On 10/25, Roman Pen wrote:
>>
>> int kthread_park(struct task_struct *k)
>> {
>> - struct kthread *kthread = to_live_kthread_and_get(k);
>> + struct kthread *kthread;
>> int ret = -ENOSYS;
>>
>> - if (kthread) {
>> - if (!test_bit(KTHREAD_IS_PARKED, &kthread->flags)) {
>> + /*
>> + * Here we try to be careful and handle the case, when kthread
>> + * is going to die and will never park.
>
> But why?
>
> IMO we should not complicate this code for the case when the kernel
> thread crashes.

that is not only about crashes, or let's say this is *only* about exit
of a thread. exit, which asynchronously happens. the kthread dies or
you have a bug in your kthread, say, in handling some pending signals,
or whatever, but the result is the same: the thread exits before you
have invoked kthread_park(), then you call kthread_park() and you hang
forever.

so the complication for me comes from the fact, that kthread.h is an
API, which should provide some guarantees and if it does not do that,
at least the API should be explicit on that and I (as a user) must
take care.

e.g. kthread_data() says 'The caller is responsible for ensuring the
validity of @task when calling this function.'. That is clear. As a
user of the kthread API I can do it myself, doing get_task_struct().

but with kthread_park() call I can't do anything. I can rely only on
the fact, that my thread is still running.

I understand, that kthread_park() is used only in special cases, I've
counted only 5 callers, which are probably completely aware of what
they are doing. so of course it does not make a lot of sense even to
touch that code. but I did not like that particular place, which is
easy to fix with rather small and painless changes.

>
>> + do {
>> + kthread = to_live_kthread_and_get(k);
>> + if (!kthread)
>> + break;
>> + if (!__kthread_isparked(kthread)) {
>> set_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
>> if (k != current) {
>> wake_up_process(k);
>> wait_for_completion(&kthread->parked);
>> }
>> }
>> + if (k == current || __kthread_isparked(kthread))
>> + /* The way out */
>> + ret = 0;
>
> And why do we need to restart if __kthread_isparked() is false? In this
> case we know that we were woken up by the exiting kthread which won't
> park anyway.

That's true. Loop is an overkill.

--
Roman