2002-11-22 01:57:21

by Matthew Dobson

Subject: 2.5.48 hangs during boot

Hello all,
2.5.48 + Bill/Martin's noearlyirq patch hangs on boot on our NUMA-Q
machines. It boots normally up to

TCP: Hash tables configured (established 524288 bind 65536)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 268k freed

Then it *VERY* slowly proceeds to output a few more lines before hanging
completely. The lines come out one at a time, with large time delays
between each line. The last bit of output I get is the enabling swap line.

The -mm1 patch fixes this problem, and I'm in the process of determining
exactly what fixes it. Any input/ideas would be greatly appreciated.

Thanks!

-Matt


2002-11-22 02:07:19

by William Lee Irwin III

Subject: Re: 2.5.48 hangs during boot

On Thu, Nov 21, 2002 at 05:58:37PM -0800, Matthew Dobson wrote:
> Hello all,
> 2.5.48 + Bill/Martin's noearlyirq patch hangs on boot on our NUMA-Q
> machines. It boots normally up to
> TCP: Hash tables configured (established 524288 bind 65536)
> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
> VFS: Mounted root (ext2 filesystem) readonly.
> Freeing unused kernel memory: 268k freed
> Then it *VERY* slowly proceeds to output a few more lines before hanging
> completely. The lines come out one at a time, with large time delays
> between each line. The last bit of output I get is the enabling swap line.
> The -mm1 patch fixes this problem, and I'm in the process of determining
> exactly what fixes it. Any input/ideas would be greatly appreciated.
> Thanks!
> -Matt

get the axboe/akpm fixes for the elevator deadlock and/or an intermediate
bk tree. This is an io scheduling issue.


Bill

2002-11-22 19:30:10

by Matthew Dobson

Subject: Re: 2.5.48 hangs during boot

William Lee Irwin III wrote:
> On Thu, Nov 21, 2002 at 05:58:37PM -0800, Matthew Dobson wrote:
>
>>Hello all,
>> 2.5.48 + Bill/Martin's noearlyirq patch hangs on boot on our NUMA-Q
>>machines. It boots normally up to
>>TCP: Hash tables configured (established 524288 bind 65536)
>>NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
>>VFS: Mounted root (ext2 filesystem) readonly.
>>Freeing unused kernel memory: 268k freed
>>Then it *VERY* slowly proceeds to output a few more lines before hanging
>>completely. The lines come out one at a time, with large time delays
>>between each line. The last bit of output I get is the enabling swap line.
>>The -mm1 patch fixes this problem, and I'm in the process of determining
>>exactly what fixes it. Any input/ideas would be greatly appreciated.
>>Thanks!
>>-Matt
>
>
> get the axboe/akpm fixes for the elevator deadlock and/or an intermediate
> bk tree. This is an io scheduling issue.
>
>
> Bill

Yep.. the axboe-scsi patch from the mm1 tree fixes our problem...

Linus, you'll make a bunch of NUMA-Q developers (and likely many other
people) really happy if you add that patch to the mainline.

Cheers!

-Matt

2002-11-22 19:39:19

by Andrew Morton

Subject: Re: 2.5.48 hangs during boot

Matthew Dobson wrote:
>
> ...
> > get the axboe/akpm fixes for the elevator deadlock and/or an intermediate
> > bk tree. This is an io scheduling issue.
> >
> >
> > Bill
>
> Yep.. the axboe-scsi patch from the mm1 tree fixes our problem...
>
> Linus, you'll make a bunch of NUMA-Q developers (and likely many other
> people) really happy if you add that patch to the mainline.
>

A different fix was merged a couple of days ago.

You can get updates from
http://www.kernel.org/pub/linux/kernel/people/dwmw2/bk-2.5/ - the
"Gzipped full patch" at the top. It's updated hourly (I think). I'd
die without that web page.