2008-11-05 14:31:49

by Jiri Pirko

[permalink] [raw]
Subject: [PATCH] atkbd: cancel delayed work before freeing its structure

Pointed out by Oleg Nesterov. Since delayed work is used here, use of
flush_scheduled_work() is not sufficient in atkbd_disconnect(). It does
not wait for scheduled delayed work to finish. This patch prevents
delayed work to be processed after freeing atkbd structure (used struct
delayed_work is part of atkbd) by cancelling this delayed work.

Jirka

Signed-off-by: Jiri Pirko <[email protected]>
---
drivers/input/keyboard/atkbd.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
index 22016ca..f3bbf49 100644
--- a/drivers/input/keyboard/atkbd.c
+++ b/drivers/input/keyboard/atkbd.c
@@ -824,7 +824,7 @@ static void atkbd_disconnect(struct serio *serio)
atkbd_disable(atkbd);

/* make sure we don't have a command in flight */
- flush_scheduled_work();
+ cancel_delayed_work_sync(&atkbd->event_work);

sysfs_remove_group(&serio->dev.kobj, &atkbd_attribute_group);
input_unregister_device(atkbd->dev);
--
1.5.6.5


2008-11-07 14:42:46

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] atkbd: cancel delayed work before freeing its structure

On 11/05, Jiri Pirko wrote:
>
> diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
> index 22016ca..f3bbf49 100644
> --- a/drivers/input/keyboard/atkbd.c
> +++ b/drivers/input/keyboard/atkbd.c
> @@ -824,7 +824,7 @@ static void atkbd_disconnect(struct serio *serio)
> atkbd_disable(atkbd);
>
> /* make sure we don't have a command in flight */
> - flush_scheduled_work();
> + cancel_delayed_work_sync(&atkbd->event_work);

Ping. Dmitry, could you take a look?


While we are here, what is the reason for atkbd_schedule_event_work()->wmb() ?
It looks absolutely bogus. Is it for atkbd_event_work() ? In that case it
is not needed, it must see all previous STOREs because both queue_work() and
run_workqueue() take cwq->lock. And in any case,
test_and_set_bit(WORK_STRUCT_PENDING) implies mb(). If schedule_delayed_work()
fails we can race with the soon-to-be-executed atkbd_event_work(), in that
case that test_and_set_bit() + test_and_clear_bit(->event_mask) save us,
but wmb() can't help again.

Another question is why do we need ->event_mutex? OK, it can serialize
multiple instances of atkbd_event_work() running on the different CPUs,
but in that case atkbd_reconnect() needs this lock too? It also calls
atkbd_set_repeat_rate/atkbd_set_leds.

I don't understand this code, don't take my words too seriously, just
curious.

Oleg.

2008-11-11 14:52:18

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] atkbd: cancel delayed work before freeing its structure

On Fri, Nov 07, 2008 at 04:43:25PM +0100, Oleg Nesterov wrote:
> On 11/05, Jiri Pirko wrote:
> >
> > diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
> > index 22016ca..f3bbf49 100644
> > --- a/drivers/input/keyboard/atkbd.c
> > +++ b/drivers/input/keyboard/atkbd.c
> > @@ -824,7 +824,7 @@ static void atkbd_disconnect(struct serio *serio)
> > atkbd_disable(atkbd);
> >
> > /* make sure we don't have a command in flight */
> > - flush_scheduled_work();
> > + cancel_delayed_work_sync(&atkbd->event_work);
>
> Ping. Dmitry, could you take a look?
>

Applied, thank you.

>
> While we are here, what is the reason for atkbd_schedule_event_work()->wmb() ?
> It looks absolutely bogus. Is it for atkbd_event_work() ? In that case it
> is not needed, it must see all previous STOREs because both queue_work() and
> run_workqueue() take cwq->lock. And in any case,
> test_and_set_bit(WORK_STRUCT_PENDING) implies mb().

I wanted to be sure that event_mask is set before we schedule event_work
and I don't want to rely on details of queue_delayed_work
implementation. If the fact that queue_delayed_work acts as a barrier
would be listed part of its published spec I would gladly remove wmb()
from atkbd.

> If schedule_delayed_work()
> fails we can race with the soon-to-be-executed atkbd_event_work(), in that
> case that test_and_set_bit() + test_and_clear_bit(->event_mask) save us,
> but wmb() can't help again.
>
> Another question is why do we need ->event_mutex? OK, it can serialize
> multiple instances of atkbd_event_work() running on the different CPUs,
> but in that case atkbd_reconnect() needs this lock too? It also calls
> atkbd_set_repeat_rate/atkbd_set_leds.

Probably, I will need to thiknk about it a bit more.

--
Dmitry

2008-11-11 16:20:26

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] atkbd: cancel delayed work before freeing its structure

On 11/11, Dmitry Torokhov wrote:
>
> On Fri, Nov 07, 2008 at 04:43:25PM +0100, Oleg Nesterov wrote:
> >
> > While we are here, what is the reason for atkbd_schedule_event_work()->wmb() ?
> > It looks absolutely bogus. Is it for atkbd_event_work() ? In that case it
> > is not needed, it must see all previous STOREs because both queue_work() and
> > run_workqueue() take cwq->lock. And in any case,
> > test_and_set_bit(WORK_STRUCT_PENDING) implies mb().
>
> I wanted to be sure that event_mask is set before we schedule event_work
> and I don't want to rely on details of queue_delayed_work
> implementation. If the fact that queue_delayed_work acts as a barrier
> would be listed part of its published spec I would gladly remove wmb()
> from atkbd.

Yes, queue_delayed_work() acts as a barrier for the work->func(), otherwise
almost any code which uses wqs is broken.

But let me repeat, if queue_delayed_work() fails becuase this work is
already queued we (in this particular case) need mb(), not wmb(). Or
atkbd_schedule_event_work() can miss a bit in ->event_mask. So I think
this wmb() is misleading. And unneeded because queue_work() implies mb(),
but this is not really documented.

Oleg.

2008-11-11 16:30:51

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] atkbd: cancel delayed work before freeing its structure

On Tue, Nov 11, 2008 at 06:20:50PM +0100, Oleg Nesterov wrote:
> On 11/11, Dmitry Torokhov wrote:
> >
> > On Fri, Nov 07, 2008 at 04:43:25PM +0100, Oleg Nesterov wrote:
> > >
> > > While we are here, what is the reason for atkbd_schedule_event_work()->wmb() ?
> > > It looks absolutely bogus. Is it for atkbd_event_work() ? In that case it
> > > is not needed, it must see all previous STOREs because both queue_work() and
> > > run_workqueue() take cwq->lock. And in any case,
> > > test_and_set_bit(WORK_STRUCT_PENDING) implies mb().
> >
> > I wanted to be sure that event_mask is set before we schedule event_work
> > and I don't want to rely on details of queue_delayed_work
> > implementation. If the fact that queue_delayed_work acts as a barrier
> > would be listed part of its published spec I would gladly remove wmb()
> > from atkbd.
>
> Yes, queue_delayed_work() acts as a barrier for the work->func(), otherwise
> almost any code which uses wqs is broken.
>
> But let me repeat, if queue_delayed_work() fails becuase this work is
> already queued we (in this particular case) need mb(), not wmb(). Or
> atkbd_schedule_event_work() can miss a bit in ->event_mask. So I think
> this wmb() is misleading.

Could you please explain why wmb() is not enough and full mb() is
needed in this case? I thought that if write happens before we decide
whether to schedule event_work or not it would be enough.

> And unneeded because queue_work() implies mb(),
> but this is not really documented.
>

It would be great if we can get it documented and then i'd drop *mb()
from atkbd.

Thanks.

--
Dmitry

2008-11-11 17:23:54

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] atkbd: cancel delayed work before freeing its structure

On 11/11, Dmitry Torokhov wrote:
>
> On Tue, Nov 11, 2008 at 06:20:50PM +0100, Oleg Nesterov wrote:
> > On 11/11, Dmitry Torokhov wrote:
> > >
> > But let me repeat, if queue_delayed_work() fails becuase this work is
> > already queued we (in this particular case) need mb(), not wmb(). Or
> > atkbd_schedule_event_work() can miss a bit in ->event_mask. So I think
> > this wmb() is misleading.
>
> Could you please explain why wmb() is not enough and full mb() is
> needed in this case? I thought that if write happens before we decide
> whether to schedule event_work or not it would be enough.

Yes, but how we decide whether to schedule or not? Let's suppose we do
this without mb(). say, queue_work() starts with

if (test_bit(WORK_STRUCT_PENDING)) // no barrier semantics
return;

In that case the code in atkbd_schedule_event_work()

set_bit(event_bit, &atkbd->event_mask);
wmb();
schedule_delayed_work(atkbd->event_work);

can be reordered (if ->event_work is queued) as

schedule_delayed_work(atkbd->event_work);
set_bit(event_bit, &atkbd->event_mask);

wmb() can only serialize STOREs, not STORE vs LOAD. The result of
set_bit() can be "delayed".

Now, run_workqueue() does

// again, no barrier semantics, but this doesn't matter
clear_bit(WORK_STRUCT_PENDING);

call atkbd_schedule_event_work()
if (test_and_clear_bit(atkbd->event_mask))
atkbd_set_xxx();

and we can miss an event.

> > And unneeded because queue_work() implies mb(),
> > but this is not really documented.
>
> It would be great if we can get it documented and then i'd drop *mb()
> from atkbd.

It is not easy document the current behaviour. Actually, perhaps
run_workqueue() needs smp_mb__after_clear_bit()...

But for this particular case this doesn't matter. Note that
atkbd_event_work() does test_and_clear_bit(), it can't be re-ordered
with clear_bit(WORK_STRUCT_PENDING), otherwise even mb() can't help.

Oleg.