2003-02-19 05:47:27

by Shaya Potter

Subject: hard lockup on 2.4.20 w/ nfs over frees/wan

I'm trying to use FreeS/WAN 1.99 w/ NFSv3. I've been testing it w/
large rsize and wsize values (32k each). When used w/o ipsec, it seems
to work fine. When used w/ ipsec, make dep on a kernel source tree has
consistently frozen up these IBM Netfinity boxes (2*933MHz P3s w/ an SMP
kernel). One time the last thing the kernel printk'd was

pcnet32.c: printk(KERN_ERR "%s: Bus master arbitration failure,
status %4.4x.\n",

but I didn't record the status number (well, it was eth0: Bus master....,
and it's using a pcnet32 controller, so I assume that's the line).
Usually it locks up w/o printk'ing anything; the last things I see on
the console are the normal ipsec printk's.

Is it possible that the rsize/wsize settings are causing issues when used
in conjunction w/ ipsec? Am I triggering some sort of race condition? The
NFS client is running the exact same kernel on the exact same hardware
and hasn't had an issue yet.

any ideas on what I can do to debug it?

thanks,

shaya


2003-02-19 20:27:08

by Shaya Potter

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

Didn't get any responses on this, but it crashed a few more times
today; the status code it printed out was ffff.

On Wed, 2003-02-19 at 00:56, Shaya Potter wrote:
> I'm trying to use frees/wan 1.99 w/ NFSv3. I've been testing it w/
> large r and wsize's (32k each). When used w/o ipsec, it seems to work
> fine. When used w/ ipsec, make dep on a kernel source tree has
> consistently frozen up these IBM Netfinity boxes (2*933mhz P3s w/ smp
> kernel). One time the last thing the kernel printk'd was
>
> pcnet32.c: printk(KERN_ERR "%s: Bus master arbitration failure,
> status %4.4x.\n",
>
> but didn't record the status number (well it was eth0: Bus master....,
> and it's using a pcnet32 controller, so assume that's the line).
> Usually it's locked up w/o printk'ing anything, last things I see on
> console are the normal ipsec printk's
>
> Is it possible that the r/w size's are causing issues when used in
> conjuction w/ ipsec? Am I triggering some sort of race condition? The
> NFS client is running the exact same kernel on the same exact hardware
> and hasn't had an issue yet.
>
> any ideas on what I can do to debug it?
>
> thanks,
>
> shaya
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2003-02-20 16:07:13

by Shaya Potter

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

Moved from the Netfinity's onboard pcnet32 adapter to an IBM-branded
Intel EEPro/100 w/ the Intel driver in 2.4.20, and it appears very
stable. Is it possible the pcnet32 adapter is broken or the driver is
buggy?

On Wed, 2003-02-19 at 15:36, Shaya Potter wrote:
> didn't get any responses on this, but its crashes again a few times
> again today, the status code it printed out was ffff.

2003-02-20 16:34:21

by Jeff Garzik

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

On Thu, Feb 20, 2003 at 11:16:13AM -0500, Shaya Potter wrote:
> moved from the netfinity's onboard pcnet32 adapter to an IBM branded
> Intel epro/100 w/ the intel driver in 2.4.20 and it appears very
> stable. Is it possible the pcnet/32 adapter is broken or the driver is
> buggy?

I have gotten reports that the 2.4.20 pcnet32 is buggy.

Can you test 2.4.20 with the 2.4.19 version of pcnet32.c?

2003-02-21 08:56:31

by Shaya Potter

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

On Thu, 2003-02-20 at 11:44, Jeff Garzik wrote:
> On Thu, Feb 20, 2003 at 11:16:13AM -0500, Shaya Potter wrote:
> > moved from the netfinity's onboard pcnet32 adapter to an IBM branded
> > Intel epro/100 w/ the intel driver in 2.4.20 and it appears very
> > stable. Is it possible the pcnet/32 adapter is broken or the driver is
> > buggy?
>
> I have gotten reports the 2.4.20 pcnet32 is buggy.
>
> Can you test 2.4.20 with 2.4.19 version of pcnet32.c?

I'll do it at the beginning of next week, as I'm not going into the lab
tomorrow.

I'm using 2 basically identical (as in, both have been serviced and had
parts replaced) Netfinitys. One seems to work perfectly well w/ the
pcnet32 driver/card, while the other one was having serious issues.
Strangely enough, I took about a 10% hit in performance on my informal
NFS benchmarks when I went from the "broken" pcnet32 card to the Intel
EEPro/100 (at least for the ones that would complete w/o hanging the
computer).

2003-02-21 15:12:13

by Bill Davidsen

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

On 20 Feb 2003, Shaya Potter wrote:

> moved from the netfinity's onboard pcnet32 adapter to an IBM branded
> Intel epro/100 w/ the intel driver in 2.4.20 and it appears very
> stable. Is it possible the pcnet/32 adapter is broken or the driver is
> buggy?

I've seen other reports of evil in that driver. I have the same Netfinity
hardware (5000s and 5100s) and I'm not even tempted to try a 2.5 kernel
on it as yet. I do have 2.5.59 and 2.5.61-ac1 kernels running on
non-critical systems, however.

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-02-24 22:00:51

by Shaya Potter

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

Seems to be stable w/ the 2.4.19 driver. All the tests I ran (basically
kernel builds over NFS over ipsec) that consistently hung it hard
before aren't hanging it now.

shaya

On Thu, 2003-02-20 at 11:44, Jeff Garzik wrote:
> On Thu, Feb 20, 2003 at 11:16:13AM -0500, Shaya Potter wrote:
> > moved from the netfinity's onboard pcnet32 adapter to an IBM branded
> > Intel epro/100 w/ the intel driver in 2.4.20 and it appears very
> > stable. Is it possible the pcnet/32 adapter is broken or the driver is
> > buggy?
>
> I have gotten reports the 2.4.20 pcnet32 is buggy.
>
> Can you test 2.4.20 with 2.4.19 version of pcnet32.c?

2003-02-24 22:41:43

by Jeff Garzik

Subject: Re: hard lockup on 2.4.20 w/ nfs over frees/wan

Shaya Potter wrote:
> seems to be stable w/ the 2.4.19 driver. All the tests that I ran be
> (basically kernel building over nfs over ipsec) that hung it hard
> consistently b4 aren't hanging it now.



Thanks. This tells me what I need to know...