2004-06-21 18:24:37

by walt

Subject: [2.6.7-bk] NFS-related kernel panic

Starting just today with the latest bk changesets I get a kernel
panic when starting the rpc.statd daemon (NFS):

Kernel panic: Aiee, killing interrupt handler
In interrupt handler - not syncing

Anyone else seeing problems with NFS?


2004-06-21 18:39:50

by Grzegorz Kulewski

Subject: Re: [2.6.7-bk] NFS-related kernel panic

On Wed, 23 Jun 2004, walt wrote:

> Starting just today with the latest bk changesets I get a kernel
> panic when starting the rpc.statd daemon (NFS):
>
> Kernel panic: Aiee, killing interrupt handler
> In interrupt handler - not syncing
>
> Anyone else seeing problems with NFS?

This is probably not an NFS problem, since I have no NFS and I get the same
panic. See my report; I sent it to this list a minute ago.


Thanks,

Grzegorz Kulewski

2004-06-22 02:44:43

by bob

Subject: Re: [2.6.7-bk] NFS-related kernel panic

Ok. I get (a very similar) error with 2.6.7-bk4.
The error message I get is:
Starting portmapper: [OK]
Starting NFS statd: Kernel panic: Aiee, killing interrupt handler!
bad: scheduling while atomic!
[<c02a2c37>] schedule+0x483/0x488
[<c0133298>] __get_free_pages+0x33/0x3f
[<c026974c>] tcp_poll+0x34/0x15a
[<c02a30b5>] schedule_timeout+0xb5/0xb7
[<c0248fb3>] sock_poll+0x29/0x31
[<c015da2e>] do_pollfd+0x4f/0x90
[<c015db10>] do_poll+0xa1/0xc0
[<c016dc7e>] sys_poll+0x14f/0x211
[<c015d09d>] __pollwait+0x0/0xc6
[<c014afd2>] sys_close+0x63/0x96
[<c0103eb1>] sysenter_past_esp+0x52/0x71
In interrupt handler - not syncing

And that's where it dies.
I don't have NFS as part of the build either. My build script shows the
following:
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set

Something is borken.

Bob

2004-06-22 03:35:37

by Chris Wright

Subject: Re: [2.6.7-bk] NFS-related kernel panic

* Bob Gill ([email protected]) wrote:
> Ok. I get (a very similar) error with 2.6.7-bk4.
> The error message I get is:
> Starting portmapper: [OK]
> Starting NFS statd: Kernel panic: Aiee, killing interrupt handler!
> bad: scheduling while atomic!
> [<c02a2c37>] schedule+0x483/0x488
> [<c0133298>] __get_free_pages+0x33/0x3f
> [<c026974c>] tcp_poll+0x34/0x15a
> [<c02a30b5>] schedule_timeout+0xb5/0xb7
> [<c0248fb3>] sock_poll+0x29/0x31
> [<c015da2e>] do_pollfd+0x4f/0x90
> [<c015db10>] do_poll+0xa1/0xc0
> [<c016dc7e>] sys_poll+0x14f/0x211
> [<c015d09d>] __pollwait+0x0/0xc6
> [<c014afd2>] sys_close+0x63/0x96
> [<c0103eb1>] sysenter_past_esp+0x52/0x71
> In interrupt handler - not syncing

The lockless loopback transmission patch mucks up the preempt count.
Can you give this patch a try?

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

===== loopback.c 1.15 vs edited =====
--- 1.15/drivers/net/loopback.c 2004-06-20 17:35:52 -07:00
+++ edited/loopback.c 2004-06-21 20:23:06 -07:00
@@ -143,10 +143,11 @@

 	dev->last_rx = jiffies;
 	if (likely(loopback_stats)) {
-		get_cpu_ptr(loopback_stats)->rx_bytes += skb->len;
-		get_cpu_ptr(loopback_stats)->tx_bytes += skb->len;
-		get_cpu_ptr(loopback_stats)->rx_packets++;
-		get_cpu_ptr(loopback_stats)->tx_packets++;
+		struct net_device_stats *stats = get_cpu_ptr(loopback_stats);
+		stats->rx_bytes += skb->len;
+		stats->tx_bytes += skb->len;
+		stats->rx_packets++;
+		stats->tx_packets++;
 		put_cpu_ptr(loopback_stats);
 	}
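
For readers following the thread: the "scheduling while atomic" report is what an unbalanced preempt count looks like. As I understand the 2.6.7-era definitions, get_cpu_ptr() disables preemption (through get_cpu()) and put_cpu_ptr() re-enables it, so four get_cpu_ptr() calls matched by a single put_cpu_ptr() would leave the count three too high after every loopback packet. The userspace sketch below models only that counting. Every name ending in _sim is a hypothetical stand-in invented for illustration, not kernel code.

```c
/* Userspace model of the preempt-count imbalance; NOT kernel code.
 * All *_sim names are hypothetical stand-ins for the kernel primitives. */

struct net_device_stats_sim {
    unsigned long rx_bytes, tx_bytes, rx_packets, tx_packets;
};

static int preempt_count_sim;                  /* models the preempt count */
static struct net_device_stats_sim stats_sim;  /* models this CPU's stats slot */

/* Assumed behaviour of get_cpu_ptr(): disable preemption, return the slot. */
static struct net_device_stats_sim *get_cpu_ptr_sim(void)
{
    preempt_count_sim++;
    return &stats_sim;
}

/* Assumed behaviour of put_cpu_ptr(): re-enable preemption once. */
static void put_cpu_ptr_sim(void)
{
    preempt_count_sim--;
}

/* Shape of the code before the patch: four gets, one put.
 * Returns the net change in the simulated preempt count. */
int xmit_before_patch_sim(unsigned int len)
{
    int before = preempt_count_sim;

    get_cpu_ptr_sim()->rx_bytes += len;
    get_cpu_ptr_sim()->tx_bytes += len;
    get_cpu_ptr_sim()->rx_packets++;
    get_cpu_ptr_sim()->tx_packets++;
    put_cpu_ptr_sim();

    return preempt_count_sim - before;
}

/* Shape of the code after the patch: one get, one put. */
int xmit_after_patch_sim(unsigned int len)
{
    int before = preempt_count_sim;
    struct net_device_stats_sim *stats = get_cpu_ptr_sim();

    stats->rx_bytes += len;
    stats->tx_bytes += len;
    stats->rx_packets++;
    stats->tx_packets++;
    put_cpu_ptr_sim();

    return preempt_count_sim - before;
}
```

In this model the pre-patch shape leaves the count 3 higher per packet, while the patched shape returns it to where it started. A count that never drops back to zero is consistent with the scheduler complaining once the task tries to sleep in poll().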

2004-06-22 13:51:18

by Stewart Smith

Subject: Re: [2.6.7-bk] NFS-related kernel panic

On Tue, 2004-06-22 at 13:35, Chris Wright wrote:
> > bad: scheduling while atomic!
> The lockless loopback transmission patch mucks up the preempt count.
> Can you give this patch a try?

This seems to fix the problem, thanks.

--
Stewart Smith ([email protected])
http://www.flamingspork.com/



2004-06-22 14:08:01

by Grzegorz Kulewski

Subject: Re: [2.6.7-bk] NFS-related kernel panic

Unfortunately, this does not fix my (similar?) problem described in
"Network related(?) kernel panic (2.6.7-bk4)".


Thanks,

Grzegorz Kulewski


On Tue, 22 Jun 2004, Stewart Smith wrote:

> On Tue, 2004-06-22 at 13:35, Chris Wright wrote:
> > > bad: scheduling while atomic!
> > The lockless loopback transmission patch mucks up the preempt count.
> > Can you give this patch a try?
>
> This seems to fix the problem, thanks.
>
> --
> Stewart Smith ([email protected])
> http://www.flamingspork.com/
>
>

2004-06-22 16:58:37

by Arthur Kepner

Subject: Re: [2.6.7-bk] NFS-related kernel panic


On Mon, 21 Jun 2004, Chris Wright wrote:

>
> The lockless loopback transmission patch mucks up the preempt count.
> Can you give this patch a try?
>
> thanks,
> -chris
> --
> Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
>

Yes, there's a problem with the preempt count, and this patch is
the minimal fix for it.

However, Andrew Morton already noted this problem and posted
a patch for testing yesterday. I'd like to suggest that we go
with his patch (very slightly modified), which I'll post to netdev
in a few minutes. (I will cc Chris and Bob on this too.)

--

Arthur


> ===== loopback.c 1.15 vs edited =====
> --- 1.15/drivers/net/loopback.c 2004-06-20 17:35:52 -07:00
> +++ edited/loopback.c 2004-06-21 20:23:06 -07:00
> @@ -143,10 +143,11 @@
>
>  	dev->last_rx = jiffies;
>  	if (likely(loopback_stats)) {
> -		get_cpu_ptr(loopback_stats)->rx_bytes += skb->len;
> -		get_cpu_ptr(loopback_stats)->tx_bytes += skb->len;
> -		get_cpu_ptr(loopback_stats)->rx_packets++;
> -		get_cpu_ptr(loopback_stats)->tx_packets++;
> +		struct net_device_stats *stats = get_cpu_ptr(loopback_stats);
> +		stats->rx_bytes += skb->len;
> +		stats->tx_bytes += skb->len;
> +		stats->rx_packets++;
> +		stats->tx_packets++;
>  		put_cpu_ptr(loopback_stats);
>  	}
>
>

2004-06-22 21:17:30

by walt

Subject: Re: [2.6.7-bk] NFS-related kernel panic

Chris Wright wrote:

> The lockless loopback transmission patch mucks up the preempt count.
> Can you give this patch a try?

I saw an update to loopback.c from Linus, which I assume was yours.
Either way, my panic is fixed. Thanks.

2004-06-22 21:29:05

by Chris Wright

Subject: Re: [2.6.7-bk] NFS-related kernel panic

* walt ([email protected]) wrote:
> Chris Wright wrote:
>
> > The lockless loopback transmission patch mucks up the preempt count.
> > Can you give this patch a try?
>
> I saw an update to loopback.c from linus which I assume was yours --
> anyway my panic is fixed. Thanks.

Actually, that's from Andrew, and Arthur sent in a minor tweak which has
yet to hit mainline. Glad it's working.

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net