2013-03-14 07:59:19

by Li Guang

[permalink] [raw]
Subject: [PATCH 1/2] task_work: make FIFO task_work list

Signed-off-by: liguang <[email protected]>
---
kernel/task_work.c | 15 +++------------
1 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/kernel/task_work.c b/kernel/task_work.c
index 65bd3c9..0bf4258 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -13,11 +13,12 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
head = ACCESS_ONCE(task->task_works);
if (unlikely(head == &work_exited))
return -ESRCH;
- work->next = head;
- } while (cmpxchg(&task->task_works, head, work) != head);
+ head = head->next;
+ } while (cmpxchg(&head, NULL, work) == head);

if (notify)
set_notify_resume(task);
+
return 0;
}

@@ -72,16 +73,6 @@ void task_work_run(void)
raw_spin_unlock_wait(&task->pi_lock);
smp_mb();

- /* Reverse the list to run the works in fifo order */
- head = NULL;
- do {
- next = work->next;
- work->next = head;
- head = work;
- work = next;
- } while (work);
-
- work = head;
do {
next = work->next;
work->func(work);
--
1.7.2.5


2013-03-14 07:59:20

by Li Guang

[permalink] [raw]
Subject: [PATCH 2/2] task_work: check callback if it's NULL

Signed-off-by: liguang <[email protected]>
---
kernel/task_work.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/task_work.c b/kernel/task_work.c
index 0bf4258..f458b08 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -75,7 +75,8 @@ void task_work_run(void)

do {
next = work->next;
- work->func(work);
+ if (unlikely(work->func))
+ work->func(work);
work = next;
cond_resched();
} while (work);
--
1.7.2.5

2013-03-14 14:42:22

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/2] task_work: make FIFO task_work list

On 03/14, liguang wrote:
>
> Signed-off-by: liguang <[email protected]>

Changelog please...

> ---
> kernel/task_work.c | 15 +++------------
> 1 files changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index 65bd3c9..0bf4258 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -13,11 +13,12 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
> head = ACCESS_ONCE(task->task_works);
> if (unlikely(head == &work_exited))
> return -ESRCH;
> - work->next = head;
> - } while (cmpxchg(&task->task_works, head, work) != head);
> + head = head->next;
> + } while (cmpxchg(&head, NULL, work) == head);

I simply can't understand how this can work... The patch assumes
that head->next == NULL after head = head->next, why? And then
compares the result with head and succeeds if not equal.

Could you please explain how it was supposed to work? If nothing
else, Suppose we have task->task_works -> W1 -> W2 -> W3. How this
code can add W4 after W3?

And cmpxchg(&head) should be cmpxchg(&head->next)....



Anyway, whatever I missed this is racy.

head = head->next;

nothing protects "head" after this. Say, it can be task_work_cancel'ed
and freed. So,

cmpxchg(&head, ...)

can modify the freed and reused memory.

Oleg.

2013-03-14 14:45:07

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 2/2] task_work: check callback if it's NULL

On 03/14, liguang wrote:
>
> Signed-off-by: liguang <[email protected]>
> ---
> kernel/task_work.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index 0bf4258..f458b08 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -75,7 +75,8 @@ void task_work_run(void)
>
> do {
> next = work->next;
> - work->func(work);
> + if (unlikely(work->func))
> + work->func(work);

Why?

Oleg.

2013-03-15 00:18:03

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH 1/2] task_work: make FIFO task_work list

在 2013-03-14四的 15:40 +0100,Oleg Nesterov写道:
> On 03/14, liguang wrote:
> >
> > Signed-off-by: liguang <[email protected]>
>
> Changelog please...
>

OK.

> > ---
> > kernel/task_work.c | 15 +++------------
> > 1 files changed, 3 insertions(+), 12 deletions(-)
> >
> > diff --git a/kernel/task_work.c b/kernel/task_work.c
> > index 65bd3c9..0bf4258 100644
> > --- a/kernel/task_work.c
> > +++ b/kernel/task_work.c
> > @@ -13,11 +13,12 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
> > head = ACCESS_ONCE(task->task_works);
> > if (unlikely(head == &work_exited))
> > return -ESRCH;
> > - work->next = head;
> > - } while (cmpxchg(&task->task_works, head, work) != head);
> > + head = head->next;
> > + } while (cmpxchg(&head, NULL, work) == head);
>
> I simply can't understand how this can work... The patch assumes
> that head->next == NULL after head = head->next, why? And then
> compares the result with head and succeeds if not equal.
>

then ->next filed was not initialized, so I think it will
be 0'ed by compiler, is it unreliable?.

> Could you please explain how it was supposed to work? If nothing
> else, Suppose we have task->task_works -> W1 -> W2 -> W3. How this
> code can add W4 after W3?
>

1. head = task_works
2. head = head->next
3. if head == NULL
/* it's next node of list tail (w3->next) */
head = work
else
goto 1



> And cmpxchg(&head) should be cmpxchg(&head->next)....
>
>
>
> Anyway, whatever I missed this is racy.
>
> head = head->next;
>
> nothing protects "head" after this. Say, it can be task_work_cancel'ed
> and freed. So,
>
> cmpxchg(&head, ...)
>
> can modify the freed and reused memory.
>
> Oleg.
>

Thanks Oleg,
Hmm, at first, I think even it was changed, it can't happened to be
NULL, but ... maybe it need more deliberation.

The motivation it make the list FIFO at task_work_add, so you don't
need to reverse it at task_work_run, and it's a time-saver if the list
doesn't have too many nodes I think.


2013-03-15 00:22:04

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH 2/2] task_work: check callback if it's NULL

在 2013-03-14四的 15:43 +0100,Oleg Nesterov写道:
> On 03/14, liguang wrote:
> >
> > Signed-off-by: liguang <[email protected]>
> > ---
> > kernel/task_work.c | 3 ++-
> > 1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/task_work.c b/kernel/task_work.c
> > index 0bf4258..f458b08 100644
> > --- a/kernel/task_work.c
> > +++ b/kernel/task_work.c
> > @@ -75,7 +75,8 @@ void task_work_run(void)
> >
> > do {
> > next = work->next;
> > - work->func(work);
> > + if (unlikely(work->func))
> > + work->func(work);
>
> Why?
>
> Oleg.
>

can we believe a callback always be call-able?
can it happened to be 0? e.g. wrong initialized.
of course, we can complain the caller, be why don't
we easily make it more safer?

2013-03-15 01:02:22

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] task_work: check callback if it's NULL

On 2013/3/15 8:20, li guang wrote:
> 在 2013-03-14四的 15:43 +0100,Oleg Nesterov写道:
>> On 03/14, liguang wrote:
>>>
>>> Signed-off-by: liguang <[email protected]>
>>> ---
>>> kernel/task_work.c | 3 ++-
>>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/kernel/task_work.c b/kernel/task_work.c
>>> index 0bf4258..f458b08 100644
>>> --- a/kernel/task_work.c
>>> +++ b/kernel/task_work.c
>>> @@ -75,7 +75,8 @@ void task_work_run(void)
>>>
>>> do {
>>> next = work->next;
>>> - work->func(work);
>>> + if (unlikely(work->func))
>>> + work->func(work);
>>
>> Why?
>>
>> Oleg.
>>
>
> can we believe a callback always be call-able?
> can it happened to be 0? e.g. wrong initialized.
> of course, we can complain the caller, be why don't
> we easily make it more safer?
>

Because you're not making things safer, but your're trying
to cover up bugs...

2013-03-15 01:27:40

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH 2/2] task_work: check callback if it's NULL

在 2013-03-15五的 09:01 +0800,Li Zefan写道:
> On 2013/3/15 8:20, li guang wrote:
> > 在 2013-03-14四的 15:43 +0100,Oleg Nesterov写道:
> >> On 03/14, liguang wrote:
> >>>
> >>> Signed-off-by: liguang <[email protected]>
> >>> ---
> >>> kernel/task_work.c | 3 ++-
> >>> 1 files changed, 2 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/kernel/task_work.c b/kernel/task_work.c
> >>> index 0bf4258..f458b08 100644
> >>> --- a/kernel/task_work.c
> >>> +++ b/kernel/task_work.c
> >>> @@ -75,7 +75,8 @@ void task_work_run(void)
> >>>
> >>> do {
> >>> next = work->next;
> >>> - work->func(work);
> >>> + if (unlikely(work->func))
> >>> + work->func(work);
> >>
> >> Why?
> >>
> >> Oleg.
> >>
> >
> > can we believe a callback always be call-able?
> > can it happened to be 0? e.g. wrong initialized.
> > of course, we can complain the caller, be why don't
> > we easily make it more safer?
> >
>
> Because you're not making things safer, but your're trying
> to cover up bugs...
>

Oh, that's a little harsh to a normal programmer like me :-)
for it seems you are asking me to be coding without any bug.
are you? or it is the theory of kernel coding?


2013-03-15 01:43:50

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH 2/2] task_work: check callback if it's NULL

On 2013/3/15 9:26, li guang wrote:
> 在 2013-03-15五的 09:01 +0800,Li Zefan写道:
>> On 2013/3/15 8:20, li guang wrote:
>>> 在 2013-03-14四的 15:43 +0100,Oleg Nesterov写道:
>>>> On 03/14, liguang wrote:
>>>>>
>>>>> Signed-off-by: liguang <[email protected]>
>>>>> ---
>>>>> kernel/task_work.c | 3 ++-
>>>>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/kernel/task_work.c b/kernel/task_work.c
>>>>> index 0bf4258..f458b08 100644
>>>>> --- a/kernel/task_work.c
>>>>> +++ b/kernel/task_work.c
>>>>> @@ -75,7 +75,8 @@ void task_work_run(void)
>>>>>
>>>>> do {
>>>>> next = work->next;
>>>>> - work->func(work);
>>>>> + if (unlikely(work->func))
>>>>> + work->func(work);
>>>>
>>>> Why?
>>>>
>>>> Oleg.
>>>>
>>>
>>> can we believe a callback always be call-able?
>>> can it happened to be 0? e.g. wrong initialized.
>>> of course, we can complain the caller, be why don't
>>> we easily make it more safer?
>>>
>>
>> Because you're not making things safer, but your're trying
>> to cover up bugs...
>>
>
> Oh, that's a little harsh to a normal programmer like me :-)
> for it seems you are asking me to be coding without any bug.
> are you? or it is the theory of kernel coding?
>

And you make a bug, and you want the kernel to cover up the bug
instead of crash on a null pointer deref so you'll know you've
made a bug?

Why we check if a callback is NULL before calling it? Because
it's allowed to be. Why we don't check if a callback is NULL?
Because it's not supposed to be.

2013-03-15 02:31:16

by Li Guang

[permalink] [raw]
Subject: Re: [PATCH 2/2] task_work: check callback if it's NULL

在 2013-03-15五的 09:43 +0800,Li Zefan写道:
> On 2013/3/15 9:26, li guang wrote:
> > 在 2013-03-15五的 09:01 +0800,Li Zefan写道:
> >> On 2013/3/15 8:20, li guang wrote:
> >>> 在 2013-03-14四的 15:43 +0100,Oleg Nesterov写道:
> >>>> On 03/14, liguang wrote:
> >>>>>
> >>>>> Signed-off-by: liguang <[email protected]>
> >>>>> ---
> >>>>> kernel/task_work.c | 3 ++-
> >>>>> 1 files changed, 2 insertions(+), 1 deletions(-)
> >>>>>
> >>>>> diff --git a/kernel/task_work.c b/kernel/task_work.c
> >>>>> index 0bf4258..f458b08 100644
> >>>>> --- a/kernel/task_work.c
> >>>>> +++ b/kernel/task_work.c
> >>>>> @@ -75,7 +75,8 @@ void task_work_run(void)
> >>>>>
> >>>>> do {
> >>>>> next = work->next;
> >>>>> - work->func(work);
> >>>>> + if (unlikely(work->func))
> >>>>> + work->func(work);
> >>>>
> >>>> Why?
> >>>>
> >>>> Oleg.
> >>>>
> >>>
> >>> can we believe a callback always be call-able?
> >>> can it happened to be 0? e.g. wrong initialized.
> >>> of course, we can complain the caller, be why don't
> >>> we easily make it more safer?
> >>>
> >>
> >> Because you're not making things safer, but your're trying
> >> to cover up bugs...
> >>
> >
> > Oh, that's a little harsh to a normal programmer like me :-)
> > for it seems you are asking me to be coding without any bug.
> > are you? or it is the theory of kernel coding?
> >
>
> And you make a bug, and you want the kernel to cover up the bug
> instead of crash on a null pointer deref so you'll know you've
> made a bug?
>
> Why we check if a callback is NULL before calling it? Because
> it's allowed to be. Why we don't check if a callback is NULL?
> Because it's not supposed to be.
>

OK, Thanks for your reminder.

2013-03-15 14:36:21

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/2] task_work: make FIFO task_work list

On 03/15, li guang wrote:
>
> 在 2013-03-14四的 15:40 +0100,Oleg Nesterov写道:
> > > --- a/kernel/task_work.c
> > > +++ b/kernel/task_work.c
> > > @@ -13,11 +13,12 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
> > > head = ACCESS_ONCE(task->task_works);
> > > if (unlikely(head == &work_exited))
> > > return -ESRCH;
> > > - work->next = head;
> > > - } while (cmpxchg(&task->task_works, head, work) != head);
> > > + head = head->next;
> > > + } while (cmpxchg(&head, NULL, work) == head);
> >
> > I simply can't understand how this can work... The patch assumes
> > that head->next == NULL after head = head->next, why? And then
> > compares the result with head and succeeds if not equal.
> >
>
> then ->next filed was not initialized, so I think it will
> be 0'ed by compiler, is it unreliable?.

work->next is not necessarily initialized, but this is not the main
problem...

> > Could you please explain how it was supposed to work? If nothing
> > else, Suppose we have task->task_works -> W1 -> W2 -> W3. How this
> > code can add W4 after W3?
> >
>
> 1. head = task_works

head == &W1

> 2. head = head->next

head == &W2

> 3. if head == NULL
> /* it's next node of list tail (w3->next) */
> head = work

No,

> else
> goto 1

And? You restart from ->task_works again.

> > Anyway, whatever I missed this is racy.
> >
> > head = head->next;
> >
> > nothing protects "head" after this. Say, it can be task_work_cancel'ed
> > and freed. So,
> >
> > cmpxchg(&head, ...)
> >
> > can modify the freed and reused memory.
> >
> > Oleg.
>
> Thanks Oleg,
> Hmm, at first, I think even it was changed, it can't happened to be
> NULL, but ... maybe it need more deliberation.

My point was, even if it is not NULL nothing protects this element. It can
be freed/reused before you do cmpxchg(&head).

> The motivation it make the list FIFO at task_work_add, so you don't
> need to reverse it at task_work_run,

I understand, but this is not easy and unlikely possible without the
locking.

> and it's a time-saver if the list

Yes, but compared to the next loop which does do/while again _and_
calls the work->func() "Reverse the list" doesn't add too much.

Oleg.