2000-11-13 23:45:19

by Tom Leete

Subject: Hard lockups solved

Hi,

My lockup problems started increasing in frequency, and it
became obvious that they were independent of the kernel I
booted. The shoe dropped: the NIC was failing. It's salvage now.

The bizarre shift errors on ftp are gone, so the data I sent
is irrelevant to the kernel.

The soft hangs I was getting were real, though perhaps
encouraged by the NIC failure. Your net/ipv4/tcp.c patch from
the NE2000 thread cured them even before I found the
hardware fault. Has that patch gone to the queue? I
recommend it.

Thanks,
Tom


2000-11-14 01:59:58

by David Miller

Subject: Re: Hard lockups solved

> Date: Mon, 13 Nov 2000 18:05:24 -0500
> From: Tom Leete <[email protected]>
>
> Your net/ipv4/tcp.c patch from the NE2000 thread cured them even
> before I found the hardware fault. Has that patch gone to the
> queue? I recommend it.

The bugs I was "fixing" there were due to problems in wait queue
exclusivity nesting. We instead fixed wait queue exclusivity nesting
so it actually worked in test11-pre3. Can you see if by itself that
kernel does not show your problems too?

Thanks.

Later,
David S. Miller
[email protected]

2000-11-15 04:48:58

by Tom Leete

Subject: Re: Hard lockups solved

"David S. Miller" wrote:
>
> > Date: Mon, 13 Nov 2000 18:05:24 -0500
> > From: Tom Leete <[email protected]>
> >
> > Your net/ipv4/tcp.c patch from the NE2000 thread cured them even
> > before I found the hardware fault. Has that patch gone to the
> > queue? I recommend it.
>
> The bugs I was "fixing" there were due to problems in wait queue
> exclusivity nesting. We instead fixed wait queue exclusivity nesting
> so it actually worked in test11-pre3. Can you see if by itself that
> kernel does not show your problems too?
>
> Thanks.
>
> Later,
> David S. Miller
> [email protected]

Done. Yes, it's fixed in vanilla test11-pre3, to go by
limited testing: ftp, 15 MB in 4 files, no deathlike
sleep, md5sums agree. That load would certainly have
triggered the problem before. On to pre5.

Thanks again,
Tom