2008-03-21 01:10:06

by Andrew Morton

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <[email protected]> wrote:

> Well, so far so good for LEDs, but what about the other users of in_atomic
> that apparently should not be doing it either?

Ho hum. Lots of cc's added.



./arch/x86/mm/pageattr.c

Looks wrong.

./arch/m68k/atari/time.c

Possibly buggy: deadlockable

./sound/core/seq/seq_virmidi.c

Possibly buggy

./net/iucv/iucv.c
./kernel/power/process.c

Just a debug check.

./drivers/s390/char/sclp_tty.c

Possibly buggy: deadlockable

./drivers/s390/char/sclp_vt220.c

Possibly buggy: deadlockable

./drivers/s390/net/netiucv.c

Possibly buggy: deadlockable

./drivers/char/isicom.c

Possibly buggy: deadlockable

./drivers/usb/misc/sisusbvga/sisusb_con.c

Possibly buggy: deadlockable

./drivers/net/usb/pegasus.c

Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/airo.c

Possibly buggy: deadlockable

./drivers/net/wireless/rt2x00/rt73usb.c

Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/rt2x00/rt2500usb.c

Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/hostap/hostap_ioctl.c

Possibly buggy: deadlockable (I assume)

./drivers/net/wireless/zd1211rw/zd_usb.c

Possibly buggy: deadlockable (I assume)

./drivers/net/irda/sir_dev.c

Possibly buggy: deadlockable

./drivers/net/netxen/netxen_nic_niu.c

Possibly buggy: deadlockable

./drivers/net/netxen/netxen_nic_init.c

Possibly buggy: deadlockable

./drivers/ieee1394/ieee1394_transactions.c

Possibly buggy: deadlockable

./drivers/video/amba-clcd.c

Possibly buggy: deadlockable

./drivers/i2c/i2c-core.c

Possibly buggy: deadlockable


The usual pattern for most of the above is

	if (!in_atomic())
		do_something_which_might_sleep();

The problem is that in_atomic() returns false inside a spinlock on
non-preemptible kernels. So if anyone calls those functions while holding a
spinlock, they will incorrectly schedule, and another task can then come in
and try to take the already-held lock.

Now, it happens that in_atomic() returns true on non-preemptible kernels
when running in interrupt or softirq context. But if the above code really
is using in_atomic() to detect am-i-called-from-interrupt and NOT
am-i-called-from-inside-spinlock, they should be using in_irq(),
in_softirq() or in_interrupt().
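
As a rough illustration of the behaviour described above, here is a toy
userspace model. It is only a sketch: struct toy_cpu and every toy_*()
function are invented for this example and are not kernel APIs. The
"preemptible" field stands in for CONFIG_PREEMPT, and the counter is only
touched by the lock helpers when that field is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_cpu {
	bool preemptible;           /* stands in for CONFIG_PREEMPT */
	unsigned int preempt_count; /* spinlock nesting, if it is tracked */
	unsigned int irq_depth;     /* hardirq/softirq nesting */
};

static void toy_spin_lock(struct toy_cpu *cpu)
{
	/* Only preemptible builds pay for the counter update. */
	if (cpu->preemptible)
		cpu->preempt_count++;
}

static void toy_spin_unlock(struct toy_cpu *cpu)
{
	if (cpu->preemptible) {
		assert(cpu->preempt_count > 0);
		cpu->preempt_count--;
	}
}

static void toy_irq_enter(struct toy_cpu *cpu)
{
	cpu->irq_depth++;
}

static void toy_irq_exit(struct toy_cpu *cpu)
{
	assert(cpu->irq_depth > 0);
	cpu->irq_depth--;
}

/* Interrupt context is visible in both configurations. */
static bool toy_in_interrupt(const struct toy_cpu *cpu)
{
	return cpu->irq_depth != 0;
}

/* Sees a held spinlock only when the lock helpers updated the counter. */
static bool toy_in_atomic(const struct toy_cpu *cpu)
{
	return cpu->preempt_count != 0 || cpu->irq_depth != 0;
}
```

With preemptible set to false, toy_in_atomic() stays false between
toy_spin_lock() and toy_spin_unlock(), which is the case where the
"if (!in_atomic())" pattern would go on to sleep with the lock held.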



2008-03-21 03:07:17

by Alan Stern

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Thu, 20 Mar 2008, Andrew Morton wrote:

> > > > Now, it happens that in_atomic() returns true on non-preemptible kernels
> > > > when running in interrupt or softirq context. But if the above code really
> > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > in_softirq() or in_interrupt().
> > >
> > > Presumably most of these places are actually trying to detect
> > > am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?
> >
> > No, I think there is no such check in the kernel. Most likely for performance
> > reasons, as it would require a global flag that is set on each spinlock.
>
> Yup. non-preemptible kernels avoid the inc/dec of
> current_thread_info->preempt_count on spin_lock/spin_unlock

So then what's the point of having in_atomic() at all? Is it nothing
more than a shorthand form of (in_irq() | in_softirq() |
in_interrupt())?

In short, you are saying that there is _no_ reliable way to determine
am-i-called-from-inside-spinlock. Well, why isn't there? Would it be
so terrible if non-preemptible kernels did adjust preempt_count on
spin_lock/unlock?

Alan Stern


2008-03-21 16:55:19

by Greg KH

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Fri, Mar 21, 2008 at 02:47:50PM +0100, Heiko Carstens wrote:
> On Thu, Mar 20, 2008 at 07:27:19PM -0700, Andrew Morton wrote:
> > On Fri, 21 Mar 2008 02:36:51 +0100 Michael Buesch <[email protected]> wrote:
> > > On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> > > > On Thu, 20 Mar 2008, Andrew Morton wrote:
> > > > > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <[email protected]> wrote:
> > > > >
> > > > > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > > > > that apparently should not be doing it either?
> > > > >
> > > > > Ho hum. Lots of cc's added.
> > > >
> > > > ...
> > > >
> > > > > The usual pattern for most of the above is
> > > > >
> > > > > 	if (!in_atomic())
> > > > > 		do_something_which_might_sleep();
> > > > >
> > > > > problem is, in_atomic() returns false inside spinlock on non-preemptible
> > > > > kernels. So if anyone calls those functions inside spinlock they will
> > > > > incorrectly schedule and another task can then come in and try take the
> > > > > already-held lock.
> > > > >
> > > > > Now, it happens that in_atomic() returns true on non-preemptible kernels
> > > > > when running in interrupt or softirq context. But if the above code really
> > > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > > in_softirq() or in_interrupt().
> > > >
> > > > Presumably most of these places are actually trying to detect
> > > > am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?
> > >
> > > No, I think there is no such check in the kernel. Most likely for performance
> > > reasons, as it would require a global flag that is set on each spinlock.
> >
> > Yup. non-preemptible kernels avoid the inc/dec of
> > current_thread_info->preempt_count on spin_lock/spin_unlock
> >
> > > You simply must always _know_, if you are allowed to sleep or not. This is
> > > done by defining an API. The call-context is part of any kernel API.
> >
> > Yup. 99.99% of kernel code manages to do this...
>
> This is difficult for console drivers. They get called and are supposed to
> print something and don't have the slightest clue which context they are
> running in and if they are allowed to schedule.
> This is the problem with e.g. s390's sclp driver. If there are no write
> buffers available anymore it tries to allocate memory if schedule is allowed
> or otherwise has to wait until finally a request finished and memory is
> available again.
> And now we have to always busy wait if we are out of buffers, since we
> cannot tell which context we are in?

This is the reason why the drivers/usb/misc/sisusbvga driver is trying
to test for in_atomic:
	/* We can't handle console calls in non-schedulable
	 * context due to our locks and the USB transport.
	 * So we simply ignore them. This should only affect
	 * some calls to printk.
	 */
	if (in_atomic())
		return NULL;


So how should this be "fixed" if in_atomic() is not a valid test?

thanks,

greg k-h

2008-03-21 20:02:11

by Andrew Morton

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Fri, 21 Mar 2008 09:54:05 -0700 Greg KH <[email protected]> wrote:

> > > > You simply must always _know_, if you are allowed to sleep or not. This is
> > > > done by defining an API. The call-context is part of any kernel API.
> > >
> > > Yup. 99.99% of kernel code manages to do this...
> >
> > This is difficult for console drivers. They get called and are supposed to
> > print something and don't have the slightest clue which context they are
> > running in and if they are allowed to schedule.
> > This is the problem with e.g. s390's sclp driver. If there are no write
> > buffers available anymore it tries to allocate memory if schedule is allowed
> > or otherwise has to wait until finally a request finished and memory is
> > available again.
> > And now we have to always busy wait if we are out of buffers, since we
> > cannot tell which context we are in?
>
> This is the reason why the drivers/usb/misc/sisusbvga driver is trying
> to test for in_atomic:
> 	/* We can't handle console calls in non-schedulable
> 	 * context due to our locks and the USB transport.
> 	 * So we simply ignore them. This should only affect
> 	 * some calls to printk.
> 	 */
> 	if (in_atomic())
> 		return NULL;
>
>
> So how should this be "fixed" if in_atomic() is not a valid test?

Well. The kernel has traditionally assumed that console writes are atomic.

But we now have complex sleepy drivers acting as consoles. Presumably this
means that large amounts of device driver code, page allocator code, etc
cannot have printks in them without going recursive. Except printk itself
internally handles that, due to its need to be able to handle
printk-from-interrupt-when-this-cpu-is-already-running-printk.

The typical fix is for these console drivers to just assume that they
cannot sleep: pass GFP_ATOMIC down into the device driver code. But I bet
the device driver code was designed assuming that it could sleep,
oops-bad-we-lose.

And it's not just sleep-in-spinlock. If any of that device driver code
uses alloc_pages(GFP_KERNEL) then it can deadlock if we do a printk from
within the page allocator (and hence a lot of the block and storage layer).
Because in those code paths we must use GFP_NOFS or GFP_NOIO to allocate
memory.

So I think the right fix here is to switch those drivers to being
unconditionally atomic: don't schedule, don't take mutexes, don't use
__GFP_WAIT allocations.

They could of course be switched to using
kmalloc(GFP_ATOMIC)+memcpy()+schedule_task(). That's rather slow, but this
is not a performance-sensitive area. But more seriously, this could lead
to messages getting lost from a dying machine.
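
A minimal userspace sketch of that copy-now, transmit-later split follows.
It is an illustration under assumptions, not driver code: all toy_* names are
made up, and a small preallocated ring stands in for kmalloc(GFP_ATOMIC) so
that the atomic half never allocates at all. The dropped counter makes the
lost-message risk visible: anything queued after the ring fills, or still
queued when the machine dies, never reaches the device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS   4
#define TOY_MSG_MAX 64

/* Hypothetical deferred-console state; not a kernel structure. */
struct toy_deferred_console {
	char   slot[TOY_SLOTS][TOY_MSG_MAX];
	size_t len[TOY_SLOTS];
	size_t head;    /* index of the oldest queued message */
	size_t count;   /* number of queued messages */
	size_t dropped; /* messages lost because the ring was full */
};

/* Atomic-safe half: copies into preallocated storage and never blocks. */
static int toy_console_queue(struct toy_deferred_console *con,
			     const char *msg, size_t len)
{
	size_t tail;

	assert(con != NULL && msg != NULL);

	if (len > TOY_MSG_MAX)
		len = TOY_MSG_MAX; /* truncate oversized messages */

	if (con->count == TOY_SLOTS) {
		con->dropped++;
		return -ENOSPC;
	}

	tail = (con->head + con->count) % TOY_SLOTS;
	memcpy(con->slot[tail], msg, len);
	con->len[tail] = len;
	con->count++;
	return 0;
}

/* Sleepable half: meant to run later from a worker. Returns messages sent. */
static size_t toy_console_flush(struct toy_deferred_console *con,
				void (*xmit)(const char *buf, size_t len))
{
	size_t sent = 0;

	while (con->count > 0) {
		xmit(con->slot[con->head], con->len[con->head]);
		con->head = (con->head + 1) % TOY_SLOTS;
		con->count--;
		sent++;
	}
	return sent;
}

/* Stand-in transport for experiments: only counts the bytes handed to it. */
static size_t toy_bytes_sent;

static void toy_xmit_count(const char *buf, size_t len)
{
	(void)buf;
	toy_bytes_sent += len;
}
```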

One possibility would be to do current->called_for_console_output=1 and
then test that in various places. But a) ugh and b) that's only useful for
memory allocations - it doesn't help if sleeping locks need to be taken.

Another possibility might be:

	if (current->called_for_console_output == false) {
		mutex_lock(lock);
	} else {
		if (!mutex_trylock(lock))
			return -EAGAIN;
	}

and then teach the console-calling code to requeue the message for later.
But that's hard, because the straightforward implementation would result in
the output being queued for _all_ the currently active consoles, but some
of them might already have displayed this output - there's only one
log_buf.
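
The lock-or-trylock branch can be tried out in userspace with a pthread
mutex. This is only a sketch under assumptions: toy_console_write() is a
made-up name, the bool argument stands in for the proposed
current->called_for_console_output flag, and the requeueing that the caller
would have to do on -EAGAIN is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: block on the lock in normal context, but only try
 * the lock when called on behalf of console output.
 */
static int toy_console_write(pthread_mutex_t *lock,
			     bool called_for_console_output)
{
	assert(lock != NULL);

	if (!called_for_console_output) {
		pthread_mutex_lock(lock);
	} else if (pthread_mutex_trylock(lock) != 0) {
		/* Lock is busy: the caller would have to requeue. */
		return -EAGAIN;
	}

	/* ... hand the data to the transport here ... */

	pthread_mutex_unlock(lock);
	return 0;
}
```

A console-path call made while the lock is already held returns -EAGAIN
instead of sleeping; a normal-path call simply waits for the lock.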


An interesting problem ;)


2008-03-21 20:21:38

by Michael Büsch

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Friday 21 March 2008 21:16:48 Michael Buesch wrote:
> On Friday 21 March 2008 20:59:50 Andrew Morton wrote:
> > They could of course be switched to using
> > kmalloc(GFP_ATOMIC)+memcpy()+schedule_task(). That's rather slow, but this
> > is not a performance-sensitive area. But more seriously, this could lead
> > to messages getting lost from a dying machine.
>
> Well, IMO drivers that need to sleep to transmit some data (to whatever,
> the screen or something) are not useful for debugging a crashing kernel anyway.
> Or how high is the possibility that it'd survive the actual sleep in the
> memory allocation? I'd say almost zero.
> So that schedule_task() is not that bad.

and

	transmit_data_func()
	{
		if (!oops_in_progress) {
			schedule_transmission_for_later();
		} else {
			/* We crash anyway, so we don't care about
			 * possible deadlocks from memory alloc sleeps
			 * or whatever. */
			close_eyes_and_transmit_it_now();
		}
	}


--
Greetings Michael.

2008-03-21 01:37:34

by Michael Büsch

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> On Thu, 20 Mar 2008, Andrew Morton wrote:
>
> > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <[email protected]> wrote:
> >
> > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > that apparently should not be doing it either?
> >
> > Ho hum. Lots of cc's added.
>
> ...
>
> > The usual pattern for most of the above is
> >
> > 	if (!in_atomic())
> > 		do_something_which_might_sleep();
> >
> > problem is, in_atomic() returns false inside spinlock on non-preemptible
> > kernels. So if anyone calls those functions inside spinlock they will
> > incorrectly schedule and another task can then come in and try take the
> > already-held lock.
> >
> > Now, it happens that in_atomic() returns true on non-preemptible kernels
> > when running in interrupt or softirq context. But if the above code really
> > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > in_softirq() or in_interrupt().
>
> Presumably most of these places are actually trying to detect
> am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?

No, I think there is no such check in the kernel. Most likely for performance
reasons, as it would require a global flag that is set on each spinlock.
You simply must always _know_, if you are allowed to sleep or not. This is
done by defining an API. The call-context is part of any kernel API.

--
Greetings Michael.

2008-03-21 03:19:47

by Andrew Morton

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Thu, 20 Mar 2008 23:07:16 -0400 (EDT) Alan Stern <[email protected]> wrote:

> On Thu, 20 Mar 2008, Andrew Morton wrote:
>
> > > > > Now, it happens that in_atomic() returns true on non-preemptible kernels
> > > > > when running in interrupt or softirq context. But if the above code really
> > > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > > in_softirq() or in_interrupt().
> > > >
> > > > Presumably most of these places are actually trying to detect
> > > > am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?
> > >
> > > No, I think there is no such check in the kernel. Most likely for performance
> > > reasons, as it would require a global flag that is set on each spinlock.
> >
> > Yup. non-preemptible kernels avoid the inc/dec of
> > current_thread_info->preempt_count on spin_lock/spin_unlock
>
> So then what's the point of having in_atomic() at all? Is it nothing
> more than a shorthand form of (in_irq() | in_softirq() |
> in_interrupt())?

in_atomic() is for core kernel use only. Because in special circumstances
(ie: kmap_atomic()) we run inc_preempt_count() even on non-preemptible
kernels to tell the per-arch fault handler that it was invoked by
copy_*_user() inside kmap_atomic(), and it must fail.

> In short, you are saying that there is _no_ reliable way to determine
> am-i-called-from-inside-spinlock.

That's correct.

> Well, why isn't there?

The reasons I identified: it adds additional overhead and it encourages
poorly-thought-out design.

Now we _could_ change kernel design principles from
caller-knows-whats-going-on over to callee-works-out-whats-going-on. But
that would affect more than this particular thing.

> Would it be
> so terrible if non-preemptible kernels did adjust preempt_count on
> spin_lock/unlock?

The vast, vast majority of kernel code has managed to get through life
without needing this hidden-argument-passing. The handful of errant
callsites should be able to do so as well...


2008-03-21 17:10:47

by David Brownell

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Thursday 20 March 2008, Andrew Morton wrote:
> ./drivers/net/usb/pegasus.c
>
>   Possibly buggy: deadlockable (I assume)

Looks just unnecessary to me ... ethtool MII ops get called from
a task context, as I recall, and other drivers just rely on that.

- Dave

========= CUT HERE
Remove superfluous in_atomic() check; ethtool MII ops are called
from task context.

Signed-off-by: David Brownell <[email protected]>
---
 drivers/net/usb/pegasus.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- g26.orig/drivers/net/usb/pegasus.c	2008-03-21 08:53:28.000000000 -0700
+++ g26/drivers/net/usb/pegasus.c	2008-03-21 08:54:07.000000000 -0700
@@ -1128,12 +1128,8 @@ pegasus_get_settings(struct net_device *
 {
 	pegasus_t *pegasus;
 
-	if (in_atomic())
-		return 0;
-
 	pegasus = netdev_priv(dev);
 	mii_ethtool_gset(&pegasus->mii, ecmd);
-
 	return 0;
 }
 

2008-03-21 01:31:45

by Alan Stern

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Thu, 20 Mar 2008, Andrew Morton wrote:

> On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <[email protected]> wrote:
>
> > Well, so far so good for LEDs, but what about the other users of in_atomic
> > that apparently should not be doing it either?
>
> Ho hum. Lots of cc's added.

...

> The usual pattern for most of the above is
>
> 	if (!in_atomic())
> 		do_something_which_might_sleep();
>
> problem is, in_atomic() returns false inside spinlock on non-preemptible
> kernels. So if anyone calls those functions inside spinlock they will
> incorrectly schedule and another task can then come in and try take the
> already-held lock.
>
> Now, it happens that in_atomic() returns true on non-preemptible kernels
> when running in interrupt or softirq context. But if the above code really
> is using in_atomic() to detect am-i-called-from-interrupt and NOT
> am-i-called-from-inside-spinlock, they should be using in_irq(),
> in_softirq() or in_interrupt().

Presumably most of these places are actually trying to detect
am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?
Why doesn't it do that in non-preemptible kernels?

For that matter, isn't it also the sort of thing that might_sleep() is
supposed to check? But looking at the definitions in
include/linux/kernel.h, it appears that might_sleep() does nothing at
all when neither CONFIG_PREEMPT_VOLUNTARY nor
CONFIG_DEBUG_SPINLOCK_SLEEP is set.

Alan Stern


Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Fri, 21 Mar 2008, Stefan Richter wrote:
> and eth1394 to deal with temporary lack of tlabels. Alas I just
> recently received a report that eth1394's workaround is unsuccessful on
> non-preemptible uniprocessor kernels. I suspect the same issue exists

Which, I think, is exactly the config where in_atomic() can't be used to
mean "in_scheduleable_context()" ?

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-03-21 09:29:09

by Stefan Richter

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

I wrote:
> Andrew Morton wrote:
>> ./drivers/ieee1394/ieee1394_transactions.c
>>
>> Possibly buggy: deadlockable
>
> That's in hpsb_get_tlabel(), an exported symbol of the ieee1394 core.
>
> The in_atomic() there didn't cause problems yet and is unlikely to do so
> in the future, because there are no plans for substantial changes to the
> whole drivers/ieee1394/ anymore (because of drivers/firewire/).
>
> Nevertheless I shall look into replacing the in_atomic() by in_softirq()
> or something like that.

Or extend the API to have separate calls for callers which can sleep and
callers which can't. But that may be thwarted by deep call chains.
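
A userspace sketch of what such a split API could look like, under stated
assumptions: the toy_tlabel_* names are invented for illustration and are not
the ieee1394 interface, the pool size of 64 assumes the 6-bit tlabel field,
and the pthread mutex merely stands in for whatever short-held lock protects
the pool. The point is only that the caller picks the entry point that
matches its context, so the callee never has to guess.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define TOY_NUM_TLABELS 64

/* Hypothetical pool; bit n of in_use is set while label n is taken. */
struct toy_tlabel_pool {
	pthread_mutex_t lock;
	pthread_cond_t  nonempty;
	uint64_t        in_use;
};

#define TOY_TLABEL_POOL_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Caller holds pool->lock. Returns a free label, or -EAGAIN if none. */
static int toy_tlabel_take_locked(struct toy_tlabel_pool *pool)
{
	for (int n = 0; n < TOY_NUM_TLABELS; n++) {
		uint64_t bit = UINT64_C(1) << n;

		if (!(pool->in_use & bit)) {
			pool->in_use |= bit;
			return n;
		}
	}
	return -EAGAIN;
}

/* For callers that must not sleep: fails fast when the pool is empty. */
static int toy_tlabel_get_nowait(struct toy_tlabel_pool *pool)
{
	int n;

	pthread_mutex_lock(&pool->lock);
	n = toy_tlabel_take_locked(pool);
	pthread_mutex_unlock(&pool->lock);
	return n;
}

/* For callers that may sleep: waits until some label is released. */
static int toy_tlabel_get_wait(struct toy_tlabel_pool *pool)
{
	int n;

	pthread_mutex_lock(&pool->lock);
	while ((n = toy_tlabel_take_locked(pool)) < 0)
		pthread_cond_wait(&pool->nonempty, &pool->lock);
	pthread_mutex_unlock(&pool->lock);
	return n;
}

/* Returns a label to the pool and wakes one sleeping waiter, if any. */
static void toy_tlabel_put(struct toy_tlabel_pool *pool, int n)
{
	assert(n >= 0 && n < TOY_NUM_TLABELS);

	pthread_mutex_lock(&pool->lock);
	pool->in_use &= ~(UINT64_C(1) << n);
	pthread_cond_signal(&pool->nonempty);
	pthread_mutex_unlock(&pool->lock);
}
```

The deep-call-chain problem remains: every intermediate function between the
top-level caller and the pool would need the same pair of variants, or an
explicit argument saying which one to use.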

> Touching this legacy code is dangerous though.
--
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

2008-03-21 13:47:54

by Heiko Carstens

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Thu, Mar 20, 2008 at 07:27:19PM -0700, Andrew Morton wrote:
> On Fri, 21 Mar 2008 02:36:51 +0100 Michael Buesch <[email protected]> wrote:
> > On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> > > On Thu, 20 Mar 2008, Andrew Morton wrote:
> > > > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <[email protected]> wrote:
> > > >
> > > > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > > > that apparently should not be doing it either?
> > > >
> > > > Ho hum. Lots of cc's added.
> > >
> > > ...
> > >
> > > > The usual pattern for most of the above is
> > > >
> > > > 	if (!in_atomic())
> > > > 		do_something_which_might_sleep();
> > > >
> > > > problem is, in_atomic() returns false inside spinlock on non-preemptible
> > > > kernels. So if anyone calls those functions inside spinlock they will
> > > > incorrectly schedule and another task can then come in and try take the
> > > > already-held lock.
> > > >
> > > > Now, it happens that in_atomic() returns true on non-preemptible kernels
> > > > when running in interrupt or softirq context. But if the above code really
> > > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > > in_softirq() or in_interrupt().
> > >
> > > Presumably most of these places are actually trying to detect
> > > am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?
> >
> > No, I think there is no such check in the kernel. Most likely for performance
> > reasons, as it would require a global flag that is set on each spinlock.
>
> Yup. non-preemptible kernels avoid the inc/dec of
> current_thread_info->preempt_count on spin_lock/spin_unlock
>
> > You simply must always _know_, if you are allowed to sleep or not. This is
> > done by defining an API. The call-context is part of any kernel API.
>
> Yup. 99.99% of kernel code manages to do this...

This is difficult for console drivers. They get called and are supposed to
print something and don't have the slightest clue which context they are
running in and if they are allowed to schedule.
This is the problem with e.g. s390's sclp driver. If there are no write
buffers available anymore it tries to allocate memory if schedule is allowed
or otherwise has to wait until finally a request finished and memory is
available again.
And now we have to always busy wait if we are out of buffers, since we
cannot tell which context we are in?

2008-03-21 09:23:16

by Stefan Richter

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

Andrew Morton wrote:
> ./drivers/ieee1394/ieee1394_transactions.c
>
> Possibly buggy: deadlockable

That's in hpsb_get_tlabel(), an exported symbol of the ieee1394 core.

The in_atomic() there didn't cause problems yet and is unlikely to do so
in the future, because there are no plans for substantial changes to the
whole drivers/ieee1394/ anymore (because of drivers/firewire/).

Nevertheless I shall look into replacing the in_atomic() by in_softirq()
or something like that. Touching this legacy code is dangerous though.


Some background:

This in_atomic() is just one symptom of one of the fundamental design
flaws of the ieee1394 stack: The "tlabels" (transaction labels, a
limited resource) are acquired not only in process context but also in
soft IRQ context --- but they are released only in process context.
Unsurprisingly (in hindsight), the stack used to run out of tlabels
simply because the tlabel consumers were scheduled more frequently than
the tlabel recycler. This resulted in IO failures in sbp2 and eth1394.

This is one of the design problems which inspired the submission of a
new alternative driver stack. (Though this particular one of the
ieee1394 stack's problems could of course also be solved by a rework of
the stack --- with a respective need of resources for testing and some
danger of regressions.)

In the meantime (Linux 2.6.19 and 2.6.22) I added workarounds in sbp2
and eth1394 to deal with temporary lack of tlabels. Alas I just
recently received a report that eth1394's workaround is unsuccessful on
non-preemptible uniprocessor kernels. I suspect the same issue exists
with sbp2's workaround, it just isn't as likely to happen there.

The new drivers/firewire/ recycle tlabels in bottom halves context and
in timer context, which is the appropriate approach. Alas
drivers/firewire/ don't have an eth1394 equivalent yet...
--
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

2008-03-22 11:31:26

by Stefan Richter

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

I wrote:
>>> and eth1394 to deal with temporary lack of tlabels. Alas I just
>>> recently received a report that eth1394's workaround is unsuccessful
>>> on non-preemptible uniprocessor kernels.
> (I haven't started working on a fix, or opened a bugzilla
> ticket for it yet. The reporter currently switched his kernel to
> PREEMPT which is not affected.)

now logged as http://bugzilla.kernel.org/show_bug.cgi?id=10306

> The failure in the workaround is *not* about the in_atomic() being the
> wrong question asked in hpsb_get_tlabel() --- no, ieee1394's in_atomic()
> abuse works just fine even on UP PREEMPT_NONE. Instead, the failure is
> about kthreads not being scheduled in the way that I thought they would.
--
Stefan Richter
-=====-==--- --== =-==-
http://arcgraph.de/sr/

2008-03-21 02:29:28

by Andrew Morton

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Fri, 21 Mar 2008 02:36:51 +0100 Michael Buesch <[email protected]> wrote:

> On Friday 21 March 2008 02:31:44 Alan Stern wrote:
> > On Thu, 20 Mar 2008, Andrew Morton wrote:
> >
> > > On Thu, 20 Mar 2008 21:36:04 -0300 Henrique de Moraes Holschuh <[email protected]> wrote:
> > >
> > > > Well, so far so good for LEDs, but what about the other users of in_atomic
> > > > that apparently should not be doing it either?
> > >
> > > Ho hum. Lots of cc's added.
> >
> > ...
> >
> > > The usual pattern for most of the above is
> > >
> > > 	if (!in_atomic())
> > > 		do_something_which_might_sleep();
> > >
> > > problem is, in_atomic() returns false inside spinlock on non-preemptible
> > > kernels. So if anyone calls those functions inside spinlock they will
> > > incorrectly schedule and another task can then come in and try take the
> > > already-held lock.
> > >
> > > Now, it happens that in_atomic() returns true on non-preemptible kernels
> > > when running in interrupt or softirq context. But if the above code really
> > > is using in_atomic() to detect am-i-called-from-interrupt and NOT
> > > am-i-called-from-inside-spinlock, they should be using in_irq(),
> > > in_softirq() or in_interrupt().
> >
> > Presumably most of these places are actually trying to detect
> > am-i-allowed-to-sleep. Isn't that what in_atomic() is supposed to do?
>
> No, I think there is no such check in the kernel. Most likely for performance
> reasons, as it would require a global flag that is set on each spinlock.

Yup. non-preemptible kernels avoid the inc/dec of
current_thread_info->preempt_count on spin_lock/spin_unlock

> You simply must always _know_, if you are allowed to sleep or not. This is
> done by defining an API. The call-context is part of any kernel API.

Yup. 99.99% of kernel code manages to do this...

2008-03-21 13:18:41

by Stefan Richter

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

Henrique de Moraes Holschuh wrote:
> On Fri, 21 Mar 2008, Stefan Richter wrote:
>> and eth1394 to deal with temporary lack of tlabels. Alas I just
>> recently received a report that eth1394's workaround is unsuccessful on
>> non-preemptible uniprocessor kernels.
>
> Which, I think, is exactly the config where in_atomic() can't be used to
> mean "in_scheduleable_context()" ?

That's coincidence.

The mentioned workaround fails this way:
- tlabel consumer eth1394 (IPv4 over FireWire) grabs lots of tlabels
in soft IRQ context.
- tlabel recycler khpsbpkt (a kthread of ieee1394) sleeps even though
it could start putting tlabels back into the pool.
- eth1394 can't get tlabels anymore, stops the transmit queue,
schedules a workqueue job.
- eth1394's workqueue job (run by the events kthread) tries to acquire
a tlabel. It does so in non-atomic context and hence sleeps in
hpsb_get_tlabel() until the tlabel pool is nonempty again. It would
then wake up the eth1394 transmit queue again.
- Normally, khpsbpkt would have been woken up by now and would have
released a lot of now unused tlabels back into the pool again.
However, on UP preempt_none kernels, khpsbpkt continues to sleep.
(The 1394 stack's lower level running in IRQ context or perhaps
tasklet context wakes up khpsbpkt.)
- Since it doesn't get a tlabel, eth1394's workqueue job sleeps
forever as well.

Result is that all other tasks of the shared workqueue can't be
serviced, notably the keyboard is stuck, and that the eth1394 connection
breaks down. (I haven't started working on a fix, or opened a bugzilla
ticket for it yet. The reporter currently switched his kernel to
PREEMPT which is not affected.)

IOW:
The failure in the workaround is *not* about the in_atomic() being the
wrong question asked in hpsb_get_tlabel() --- no, ieee1394's in_atomic()
abuse works just fine even on UP PREEMPT_NONE. Instead, the failure is
about kthreads not being scheduled in the way that I thought they would.
--
Stefan Richter
-=====-==--- --== =-=-=
http://arcgraph.de/sr/

2008-03-21 20:17:40

by Michael Büsch

Subject: Re: use of preempt_count instead of in_atomic() at leds-gpio.c

On Friday 21 March 2008 20:59:50 Andrew Morton wrote:
> They could of course be switched to using
> kmalloc(GFP_ATOMIC)+memcpy()+schedule_task(). That's rather slow, but this
> is not a performance-sensitive area. But more seriously, this could lead
> to messages getting lost from a dying machine.

Well, IMO drivers that need to sleep to transmit some data (to whatever,
the screen or something) are not useful for debugging a crashing kernel anyway.
Or how high is the possibility that it'd survive the actual sleep in the
memory allocation? I'd say almost zero.
So that schedule_task() is not that bad.

--
Greetings Michael.