LinuxLists.cc - 2.5 kernel + hostap_cs + X11 = scheduling while atomic

2003-02-05 07:27:06

Subject: 2.5 kernel + hostap_cs + X11 = scheduling while atomic

Hi Jouni and LKML folks,

In kernel 2.5.59 I built the hostap modules as stated in the Makefile.
(make -C /usr/src/linux SUBDIRS=$PWD/driver/modules modules, etc.)

However, a combination of running said kernel, hostap_cs, and X11 produces
this nasty infinite string of errors:

bad: scheduling while atomic!
Call Trace:
[<c011bafd>] do_schedule+0x33a/0x33f
[<c01264a1>] schedule_timeout+0x5f/0xb3
[<c0175a7f>] proc_alloc_inode+0x4c/0x75
[<c0126439>] process_timeout+0x0/0x9
[<d294853d>] hfa384x_cmd+0x3eb/0x2d6b7eae [hostap_cs]
[<c016474d>] get_new_inode_fast+0x48/0xf0
[<c011bb52>] default_wake_function+0x0/0x3e
[<c011bb52>] default_wake_function+0x0/0x3e
[<c0137b46>] get_page_cache_size+0x12/0x1d
[<d2948a30>] hfa384x_get_rid+0x36/0x2d6b7606 [hostap_cs]
[<d29388b5>] hostap_get_wireless_stats+0xa6/0x2d6c77f1 [hostap]
[<c0168f45>] seq_printf+0x45/0x56
[<c02bd9b6>] wireless_seq_show+0xd6/0xf7
[<c0141dd5>] do_mmap_pgoff+0x40e/0x6dc
[<c0168a56>] seq_read+0x1c9/0x2ee
[<c014d20f>] vfs_read+0xbc/0x127
[<c014d496>] sys_read+0x3e/0x55
[<c01093cb>] syscall_call+0x7/0xb

This repeats over and over in the same order, with identical contents.
The second I kill X, the messages stop completely. I use the accelerated radeon X
server with an Intel AGP bridge (supported by the kernel).

Any ideas? It seems like it would be a problem in hostap_cs's main loop. Or it
could be a kernel problem, which is why I'm forwarding it to LKML :) Why would
it only happen when X11 is active though?

Thanks for any insight anyone might have on this issue. Recently I left my machine
on and went to lunch. When I came back, /var/log/syslog was overflowing with
these errors and had filled up /var completely. I had to purge them all manually :)

Regards

Josh

2003-02-06 05:19:23

by Jouni Malinen

Subject: Re: 2.5 kernel + hostap_cs + X11 = scheduling while atomic

On Tue, Feb 04, 2003 at 11:36:37PM -0800, Joshua Kwan wrote:

> However, a combination of running said kernel, hostap_cs, and X11 produces
> this nasty infinite string of errors:
>
> bad: scheduling while atomic!
> Call Trace:

> [<d2948a30>] hfa384x_get_rid+0x36/0x2d6b7606 [hostap_cs]

That will sleep, so it better not be called while in interrupt context
or apparently also, while atomic with preemptive kernels(?).

> [<d29388b5>] hostap_get_wireless_stats+0xa6/0x2d6c77f1 [hostap]

That's the dev->get_wireless_stats handler. I have assumed that it is
allowed to sleep there, but apparently that is not the case with Linux
2.5.x (at least with CONFIG_PREEMPT). I added a workaround for this into
Host AP CVS, but you will not get signal quality statistics in that
case. I'll do a proper fix if that function is indeed not allowed to
sleep (e.g., by collecting the statistics before and just copying the
values here).
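
To make the "collect before, copy here" idea concrete, here is a minimal
userspace sketch in C. It is not the Host AP code: every name in it
(struct wstats, struct demo_dev, demo_collect_stats, demo_get_stats) is
hypothetical, and the slow RID read is replaced by plain arguments. It only
illustrates splitting the work into a slow path that may sleep and a fast
path that merely copies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a wireless statistics record. */
struct wstats {
    int link_quality;
    int signal_level;
    int noise_level;
};

/* Hypothetical per-device state holding the last collected values. */
struct demo_dev {
    struct wstats cached;  /* filled in by the slow path */
    int have_cached;       /* nonzero once one collection has completed */
};

void demo_init(struct demo_dev *dev)
{
    dev->cached.link_quality = 0;
    dev->cached.signal_level = 0;
    dev->cached.noise_level = 0;
    dev->have_cached = 0;
}

/*
 * Slow path: stands in for the device read that may sleep. In a real
 * driver this would only be called where sleeping is allowed.
 */
void demo_collect_stats(struct demo_dev *dev, int quality, int signal, int noise)
{
    dev->cached.link_quality = quality;
    dev->cached.signal_level = signal;
    dev->cached.noise_level = noise;
    dev->have_cached = 1;
}

/*
 * Fast path: what a non-sleeping statistics handler could do. It only
 * copies the cached values. Returns 0 on success, -1 if nothing has
 * been collected yet.
 */
int demo_get_stats(const struct demo_dev *dev, struct wstats *out)
{
    assert(dev != NULL && out != NULL);
    if (!dev->have_cached)
        return -1;
    *out = dev->cached;
    return 0;
}
```

The trade-off is that the reader sees values that may be slightly stale, in
exchange for a handler that never waits on the hardware.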

Jean, do you have a comment on this? This happens, e.g., when executing
'cat /proc/net/wireless':

> [<c0168f45>] seq_printf+0x45/0x56
> [<c02bd9b6>] wireless_seq_show+0xd6/0xf7
> [<c0141dd5>] do_mmap_pgoff+0x40e/0x6dc
> [<c0168a56>] seq_read+0x1c9/0x2ee
> [<c014d20f>] vfs_read+0xbc/0x127
> [<c014d496>] sys_read+0x3e/0x55
> [<c01093cb>] syscall_call+0x7/0xb

--
Jouni Malinen PGP id EFC895FA

2003-02-06 17:27:56

by Jean Tourrilhes

Subject: Re: 2.5 kernel + hostap_cs + X11 = scheduling while atomic

On Wed, Feb 05, 2003 at 09:28:49PM -0800, Jouni Malinen wrote:
> On Tue, Feb 04, 2003 at 11:36:37PM -0800, Joshua Kwan wrote:
>
> > However, a combination of running said kernel, hostap_cs, and X11 produces
> > this nasty infinite string of errors:
> >
> > bad: scheduling while atomic!
> > Call Trace:
>
> > [<d2948a30>] hfa384x_get_rid+0x36/0x2d6b7606 [hostap_cs]
>
> That will sleep, so it better not be called while in interrupt context
> or apparently also, while atomic with preemptive kernels(?).
>
> > [<d29388b5>] hostap_get_wireless_stats+0xa6/0x2d6c77f1 [hostap]
>
> That's the dev->get_wireless_stats handler. I have assumed that it is
> allowed to sleep there, but apparently that is not the case with Linux
> 2.5.x (at least with CONFIG_PREEMPT). I added a workaround for this into
> Host AP CVS, but you will not get signal quality statistics in that
> case. I'll do a proper fix if that function is indeed not allowed to
> sleep (e.g., by collecting the statistics before and just copying the
> values here).
>
> Jean, do you have a comment on this? This happens, e.g., when executing
> 'cat /proc/net/wireless':
>
> > [<c0168f45>] seq_printf+0x45/0x56
> > [<c02bd9b6>] wireless_seq_show+0xd6/0xf7
> > [<c0141dd5>] do_mmap_pgoff+0x40e/0x6dc
> > [<c0168a56>] seq_read+0x1c9/0x2ee
> > [<c014d20f>] vfs_read+0xbc/0x127
> > [<c014d496>] sys_read+0x3e/0x55
> > [<c01093cb>] syscall_call+0x7/0xb

I had an argument with David a few months ago on the subject
(you can ask him how it ended). I believe that it's not good
practice to "schedule" in any of the ioctls, and that seems to also
apply to get_wireless_stats. On the other hand, you can perfectly well
take a spinlock, disable irqs and do your job.

For the ioctls, on the way down you grab the rtnetlink
semaphore, which means that all ioctl and rtnetlink operations will be
blocked until you return. That's the reason I deprecated the old
APLIST ioctl and designed the SCAN support as a *pair* of ioctls (+ an
event).

For get_wireless_stats, check what I did in Orinoco. I
basically get (under spinlock) the stuff I can get immediately (RSSI),
start a request for the counters and return the result of the
*previous* request. That's the best I can think of, because I don't want
to permanently run a thread that polls the counters, and this way the
counter polling rate is adapted to what the user does.

Note that in your case, the issue is slightly different. I
believe that the probability of having the BAP busy is not that
high. When it is busy, just return the last polled value, and set the
updated flag to 0 (now you understand why there is an updated flag).
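
The scheme described above (return the previous result, start a new request,
clear the updated flag when nothing fresh is available) can be sketched in
userspace C roughly as follows. This is not the Orinoco or Host AP code. All
names (struct demo_card, demo_request_done, demo_card_get_stats) are
hypothetical, the hardware is simulated by plain fields, and the spinlock is
only indicated by a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* What the caller receives (hypothetical layout). */
struct demo_stats {
    int rssi;          /* available immediately */
    long rx_packets;   /* counter that needs a slow device request */
    bool updated;      /* true if a request completed since the last read */
};

/* Simulated card state (hypothetical). */
struct demo_card {
    int current_rssi;
    long hw_rx_packets;    /* value held by the simulated hardware */
    long last_rx_packets;  /* result of the previous completed request */
    bool have_fresh;       /* a request completed since the last read */
    bool request_pending;  /* a request was started and has not completed */
    bool busy;             /* simulated "device access path is busy" */
};

/* Completion of a counter request; a real driver might do this from
 * its interrupt handler. */
void demo_request_done(struct demo_card *c)
{
    if (!c->request_pending)
        return;
    c->last_rx_packets = c->hw_rx_packets;
    c->have_fresh = true;
    c->request_pending = false;
}

/* Non-sleeping getter: never waits for the device. */
struct demo_stats demo_card_get_stats(struct demo_card *c)
{
    struct demo_stats s;

    /* A real driver would hold a spinlock around this section. */
    s.rssi = c->current_rssi;           /* immediate value */
    s.rx_packets = c->last_rx_packets;  /* result of the previous request */
    s.updated = c->have_fresh;          /* false means stale data */
    c->have_fresh = false;

    /* Start the next request only if the device is free; do not wait. */
    if (!c->busy && !c->request_pending)
        c->request_pending = true;

    return s;
}
```

With this shape, the counters are polled only as often as someone actually
reads the statistics, and a busy device simply yields the last known values
with updated set to false.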

> Jouni Malinen PGP id EFC895FA

Have fun...

Jean

2003-02-06 23:59:30

by David Gibson

Subject: Re: 2.5 kernel + hostap_cs + X11 = scheduling while atomic

On Thu, Feb 06, 2003 at 09:27:59AM -0800, Jean Tourrilhes wrote:
> On Wed, Feb 05, 2003 at 09:28:49PM -0800, Jouni Malinen wrote:
> > On Tue, Feb 04, 2003 at 11:36:37PM -0800, Joshua Kwan wrote:
> >
> > > However, a combination of running said kernel, hostap_cs, and X11 produces
> > > this nasty infinite string of errors:
> > >
> > > bad: scheduling while atomic!
> > > Call Trace:
> >
> > > [<d2948a30>] hfa384x_get_rid+0x36/0x2d6b7606 [hostap_cs]
> >
> > That will sleep, so it better not be called while in interrupt context
> > or apparently also, while atomic with preemptive kernels(?).
> >
> > > [<d29388b5>] hostap_get_wireless_stats+0xa6/0x2d6c77f1 [hostap]
> >
> > That's the dev->get_wireless_stats handler. I have assumed that it is
> > allowed to sleep there, but apparently that is not the case with Linux
> > 2.5.x (at least with CONFIG_PREEMPT). I added a workaround for this into
> > Host AP CVS, but you will not get signal quality statistics in that
> > case. I'll do a proper fix if that function is indeed not allowed to
> > sleep (e.g., by collecting the statistics before and just copying the
> > values here).
> >
> > Jean, do you have a comment on this? This happens, e.g., when executing
> > 'cat /proc/net/wireless':
> >
> > > [<c0168f45>] seq_printf+0x45/0x56
> > > [<c02bd9b6>] wireless_seq_show+0xd6/0xf7
> > > [<c0141dd5>] do_mmap_pgoff+0x40e/0x6dc
> > > [<c0168a56>] seq_read+0x1c9/0x2ee
> > > [<c014d20f>] vfs_read+0xbc/0x127
> > > [<c014d496>] sys_read+0x3e/0x55
> > > [<c01093cb>] syscall_call+0x7/0xb
>
> I had an argument with David a few months ago on the subject
> (you can ask him how it ended). I believe that it's not good
> practice to "schedule" in any of the ioctls, and that seems to also
> apply to get_wireless_stats. On the other hand, you can perfectly well
> take a spinlock, disable irqs and do your job.

Yes, this is because most of the device ioctl() calls are made with
one or more spinlocks held by the network layer.
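
As a rough illustration of why a sleeping handler trips the check when its
caller holds a spinlock, here is a small userspace model in C. It is only a
sketch under simplifying assumptions: the counter demo_atomic_depth is a
hypothetical stand-in for the kernel's notion of being atomic, and none of
these functions are kernel APIs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for "are we in an atomic section?". */
static int demo_atomic_depth;
/* Number of times a sleep was attempted while atomic. */
static int demo_bad_schedules;

void demo_spin_lock(void)
{
    demo_atomic_depth++;
}

void demo_spin_unlock(void)
{
    assert(demo_atomic_depth > 0);
    demo_atomic_depth--;
}

/* Stand-in for schedule(): sleeping inside an atomic section is reported. */
void demo_schedule(void)
{
    if (demo_atomic_depth != 0) {
        demo_bad_schedules++;
        fprintf(stderr, "bad: scheduling while atomic!\n");
    }
}

/* A handler that sleeps, like the statistics handler in the trace. */
void demo_sleeping_handler(void)
{
    demo_schedule();
}

/* A caller that invokes the handler with a lock held. */
void demo_call_handler_locked(void (*handler)(void))
{
    demo_spin_lock();
    handler();
    demo_spin_unlock();
}
```

Calling demo_sleeping_handler() directly is silent in this model, while
calling it through demo_call_handler_locked() prints the warning once per
call, which matches the pattern of one trace per read of the statistics.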

--
David Gibson | For every complex problem there is a
[email protected] | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson

2003-02-07 01:30:36

by Jean Tourrilhes

Subject: Re: 2.5 kernel + hostap_cs + X11 = scheduling while atomic

On Fri, Feb 07, 2003 at 11:07:33AM +1100, David Gibson wrote:
> >
> > I had an argument with David a few months ago on the subject
> > (you can ask him how it ended). I believe that it's not good
> > practice to "schedule" in any of the ioctls, and that seems to also
> > apply to get_wireless_stats. On the other hand, you can perfectly well
> > take a spinlock, disable irqs and do your job.
>
> Yes, this is because most of the device ioctl() calls are made with
> one or more spinlocks held by the network layer.
>
> --
> David Gibson

Thanks for the clarification, appreciated...

Jean