2007-10-25 04:13:42

by Maxim Levitsky

[permalink] [raw]
Subject: QUESTION: How to fix race between .suspend routine and watchdog timer

Hi,

Recently, trying to fix saa7134 suspend/resume problems I found that there
is a race between IRQ handler and .suspend , and that I cant let driver access the device
while its in D3 since it can lock up some systems.

Now I am looking to fix those issues in two drivers that have my .suspend/.resume routines.
the saa7134 capture chip and dmfe, the davicom network driver.

Looking through the dmfe code, I noticed yet another possible race.
A race between the .suspend, and a timer that serves both as a watchdog, and link state detector.
Again I need to prevent it from running during the suspend/resume, but how?

I can use del_timer in .suspend, and mod_timer in .resume, but that doesn't protect against
race with already running timer.
I can use del_timer_sync, but it states that it is useless if timer re-enables itself, and I agree with that.
In dmfe case the timer does re-enable itself.

I can put checks in the timer for ->insuspend, and don't re enable it if set,
but that opens a new can of worms with memory barriers, etc...

So please tell me how properly to do that.

By the way, this problem, together with synchronize_irq it very generic, since most drivers
have and irq handler, and a timeout timer.

Best regards,
Maxim Levitsky


2007-10-25 17:02:22

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] QUESTION: How to fix race between .suspend routine and watchdog timer

On Thu, 25 Oct 2007, Maxim Levitsky wrote:

> Hi,
>
> Recently, trying to fix saa7134 suspend/resume problems I found that there
> is a race between IRQ handler and .suspend , and that I cant let driver access the device
> while its in D3 since it can lock up some systems.
>
> Now I am looking to fix those issues in two drivers that have my .suspend/.resume routines.
> the saa7134 capture chip and dmfe, the davicom network driver.
>
> Looking through the dmfe code, I noticed yet another possible race.
> A race between the .suspend, and a timer that serves both as a watchdog, and link state detector.
> Again I need to prevent it from running during the suspend/resume, but how?
>
> I can use del_timer in .suspend, and mod_timer in .resume, but that doesn't protect against
> race with already running timer.
> I can use del_timer_sync, but it states that it is useless if timer re-enables itself, and I agree with that.
> In dmfe case the timer does re-enable itself.

That comment isn't right. del_timer_sync works perfectly well even if
the timer routine re-enables itself, provided it stops doing so after a
small number of iterations.

> I can put checks in the timer for ->insuspend, and don't re enable it if set,
> but that opens a new can of worms with memory barriers, etc...

You don't have to worry about any of that stuff. Just check the
insuspend flag and don't re-enable the timer if it is set. Even
without any memory barriers, the timer routine won't iterate more than
once or twice.

Alan Stern

2007-10-26 11:49:48

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [linux-pm] QUESTION: How to fix race between .suspend routine and watchdog timer

On Thursday 25 October 2007 19:02:12 Alan Stern wrote:
> On Thu, 25 Oct 2007, Maxim Levitsky wrote:
>
> > Hi,
> >
> > Recently, trying to fix saa7134 suspend/resume problems I found that there
> > is a race between IRQ handler and .suspend , and that I cant let driver access the device
> > while its in D3 since it can lock up some systems.
> >
> > Now I am looking to fix those issues in two drivers that have my .suspend/.resume routines.
> > the saa7134 capture chip and dmfe, the davicom network driver.
> >
> > Looking through the dmfe code, I noticed yet another possible race.
> > A race between the .suspend, and a timer that serves both as a watchdog, and link state detector.
> > Again I need to prevent it from running during the suspend/resume, but how?
> >
> > I can use del_timer in .suspend, and mod_timer in .resume, but that doesn't protect against
> > race with already running timer.
> > I can use del_timer_sync, but it states that it is useless if timer re-enables itself, and I agree with that.
> > In dmfe case the timer does re-enable itself.
>
> That comment isn't right. del_timer_sync works perfectly well even if
> the timer routine re-enables itself, provided it stops doing so after a
> small number of iterations.
Thanks for the info. but....
Due to the "don't access the hardware, while powered-off" rule, I must know that the timer isn't running.
and won't be.
So what function to use (if possible) to be sure that the timer won't run anymore?
(Taking in the account the fact that it re-enables itself)
>
> > I can put checks in the timer for ->insuspend, and don't re enable it if set,
> > but that opens a new can of worms with memory barriers, etc...
>
> You don't have to worry about any of that stuff. Just check the
> insuspend flag and don't re-enable the timer if it is set. Even
> without any memory barriers, the timer routine won't iterate more than
> once or twice.
>
> Alan Stern
>
>

Thanks,
Best regards,
` Maxim Levitsky

2007-10-27 19:18:13

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] QUESTION: How to fix race between .suspend routine and watchdog timer

On Fri, 26 Oct 2007, Maxim Levitsky wrote:

> > > Looking through the dmfe code, I noticed yet another possible race.
> > > A race between the .suspend, and a timer that serves both as a watchdog, and link state detector.
> > > Again I need to prevent it from running during the suspend/resume, but how?
> > >
> > > I can use del_timer in .suspend, and mod_timer in .resume, but that doesn't protect against
> > > race with already running timer.
> > > I can use del_timer_sync, but it states that it is useless if timer re-enables itself, and I agree with that.
> > > In dmfe case the timer does re-enable itself.
> >
> > That comment isn't right. del_timer_sync works perfectly well even if
> > the timer routine re-enables itself, provided it stops doing so after a
> > small number of iterations.
> Thanks for the info. but....
> Due to the "don't access the hardware, while powered-off" rule, I must know that the timer isn't running.
> and won't be.
> So what function to use (if possible) to be sure that the timer won't run anymore?
> (Taking in the account the fact that it re-enables itself)

Use del_timer_sync(). It guarantees that when it returns, the timer
will be stopped and the timer routine will no longer be running on any
CPU.

Alan Stern

2007-10-27 19:22:55

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [linux-pm] QUESTION: How to fix race between .suspend routine and watchdog timer

On Saturday 27 October 2007 21:17:55 Alan Stern wrote:
> On Fri, 26 Oct 2007, Maxim Levitsky wrote:
>
> > > > Looking through the dmfe code, I noticed yet another possible race.
> > > > A race between the .suspend, and a timer that serves both as a watchdog, and link state detector.
> > > > Again I need to prevent it from running during the suspend/resume, but how?
> > > >
> > > > I can use del_timer in .suspend, and mod_timer in .resume, but that doesn't protect against
> > > > race with already running timer.
> > > > I can use del_timer_sync, but it states that it is useless if timer re-enables itself, and I agree with that.
> > > > In dmfe case the timer does re-enable itself.
> > >
> > > That comment isn't right. del_timer_sync works perfectly well even if
> > > the timer routine re-enables itself, provided it stops doing so after a
> > > small number of iterations.
> > Thanks for the info. but....
> > Due to the "don't access the hardware, while powered-off" rule, I must know that the timer isn't running.
> > and won't be.
> > So what function to use (if possible) to be sure that the timer won't run anymore?
> > (Taking in the account the fact that it re-enables itself)
>
> Use del_timer_sync(). It guarantees that when it returns, the timer
> will be stopped and the timer routine will no longer be running on any
> CPU.
>
Even if the timer re-enables itself, are you sure?

> Alan Stern
>
>

Best regards,
Maxim Levitsky

2007-10-27 22:24:22

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] QUESTION: How to fix race between .suspend routine and watchdog timer

On Sat, 27 Oct 2007, Maxim Levitsky wrote:

> > Use del_timer_sync(). It guarantees that when it returns, the timer
> > will be stopped and the timer routine will no longer be running on any
> > CPU.
> >
> Even if the timer re-enables itself, are you sure?

Last time I looked at the source code, that's what it did. I'll look
again... Yep, it still does. It checks to see if the timer routine is
currently running; if so then it waits a little while and tries again.

Alan Stern

2007-10-28 19:19:26

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [linux-pm] QUESTION: How to fix race between .suspend routine and watchdog timer

On Sunday 28 October 2007 00:24:10 Alan Stern wrote:
> On Sat, 27 Oct 2007, Maxim Levitsky wrote:
>
> > > Use del_timer_sync(). It guarantees that when it returns, the timer
> > > will be stopped and the timer routine will no longer be running on any
> > > CPU.
> > >
> > Even if the timer re-enables itself, are you sure?
>
> Last time I looked at the source code, that's what it did. I'll look
> again... Yep, it still does. It checks to see if the timer routine is
> currently running; if so then it waits a little while and tries again.
>
> Alan Stern
>
>

Thanks, a lot,
Best regards,
Maxim Levitsky