2003-06-28 01:40:27

by Michael Frank

Subject: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

Changes were recently made to the nbd.c in 2.5.73-mm1

When using nbd.c ex 2.5.73 boot OK.
acpi=off no effect

----------------------------
dmesg using nbd.c ex 2.5.73:

loop: loaded (max 8 devices)
anticipatory scheduling elevator

(Using nbd.c ex 2.5.73-mm1
nbd: registered device at major 43
hang)

PPP generic driver version 2.4.2
PPP Deflate Compression module registered
PPP BSD Compression module registered
Universal TUN/TAP device driver 1.5 (C)1999-2002 Maxim Krasnyansky
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ALI15X3: IDE controller at PCI slot 00:10.0
pci_irq-0294 [19] acpi_pci_irq_derive : Unable to derive IRQ for device 00:10.0
ACPI: No IRQ known for interrupt pin A of device 00:10.0 - using IRQ 255
ALI15X3: chipset revision 195
ALI15X3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xedb0-0xedb7, BIOS settings: hda:DMA, hdb:pio
ALI15X3: simplex device: DMA forced
ide1: BM-DMA at 0xedb8-0xedbf, BIOS settings: hdc:DMA, hdd:DMA
hda: IBM-DARA-212000, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: max request size: 128KiB
hda: host protected area => 1
hda: 23579136 sectors (12073 MB) w/418KiB Cache, CHS=23392/16/63, UDMA(33)
hda: hda1 hda2 hda3 hda4
mice: PS/2 mouse device common for all mice
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
serio: i8042 AUX port at 0x60,0x64 irq 12
input: AT Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
NET4: Linux TCP/IP 1.0 for NET4.0


-----------------------------------------------------------
lspci
00:00.0 Host bridge: Transmeta Corporation LongRun Northbridge (rev 01)
00:00.1 RAM memory: Transmeta Corporation SDRAM controller
00:00.2 RAM memory: Transmeta Corporation BIOS scratchpad
00:04.0 VGA compatible controller: S3 Inc. 86C270-294 Savage/IX-MV (rev 13)
00:06.0 Multimedia audio controller: ALi Corporation M5451 PCI AC-Link Controller Audio Device (rev 01)
00:07.0 ISA bridge: ALi Corporation M1533 PCI to ISA Bridge [Aladdin IV]
00:0e.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:10.0 IDE interface: ALi Corporation M5229 IDE (rev c3)
00:11.0 Bridge: ALi Corporation M7101 PMU
00:12.0 CardBus bridge: Toshiba America Info Systems ToPIC95 PCI to Cardbus Bridge with ZV Support (rev 32)
00:14.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)

-----------------------------------------------------------
hdparm -iI /dev/hda:
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=DualPortCache, BuffSize=418kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=23579136
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 *udma2 udma3 udma4
AdvancedPM=yes: mode=0x80 (128) WriteCache=enabled
Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17: 1 2 3 4


ATA device, with non-removable media
Model Number: IBM-DARA-212000
Serial Number: AH0AHG94390
Firmware Revision: AR4OA51A
Standards:
Used: ATA/ATAPI-4 T13 1153D revision 17
Supported: 4 3 2 1 & some of 5
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 23579136
device size with M = 1024*1024: 11513 MBytes
device size with M = 1000*1000: 12072 MBytes (12 GB)
Capabilities:
LBA, IORDY(can be disabled)
Buffer size: 418.0kB bytes avail on r/w long: 4 Queue depth: 1
Standby timer values: spec'd by Vendor, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
Advanced power management level: 128 (0x80)
DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=240ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* NOP cmd
* READ BUFFER cmd
* WRITE BUFFER cmd
* Host Protected Area feature set
* Look-ahead
* Write cache
* Power Management feature set
Security Mode feature set
SMART feature set
Address Offset Reserved Area Boot
* Advanced Power Management feature set

Regards
Michael

--
Powered by linux-2.5.73, compiled with gcc-2.95-3 - not fancy but rock solid

My current linux related activities:
- Test script development and testing of swsusp
- Everyday usage of 2.5 kernel

More info on the 2.5 kernel: http://www.codemonkey.org.uk/post-halloween-2.5.txt


2003-06-28 02:27:28

by Andrew Morton

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

Michael Frank <[email protected]> wrote:
>
> Changes were recently made to the nbd.c in 2.5.73-mm1

And tons more will be in -mm2, which I shall prepare right now.
Please retest on that and if it still hangs, capture the output
from pressing alt-sysrq-T.

Thanks.

2003-06-28 04:44:23

by Michael Frank

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

On Saturday 28 June 2003 10:41, Andrew Morton wrote:
> Michael Frank <[email protected]> wrote:
> > Changes were recently made to the nbd.c in 2.5.73-mm1
>
> And tons more will be in -mm2, which I shall prepare right now.
> Please retest on that and if it still hangs, capture the output
> from pressing alt-sysrq-T.

Legacy free, no serial port.

>

Sorry, -mm2 hang at booting kernel on 2 machines.

Regards
Michael

--
Powered by linux-2.5.73, compiled with gcc-2.95-3 - not fancy but rock solid

My current linux related activities:
- Test script development and testing of swsusp
- Everyday usage of 2.5 kernel

More info on the 2.5 kernel: http://www.codemonkey.org.uk/post-halloween-2.5.txt

2003-06-28 05:40:27

by Michael Frank

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

On Saturday 28 June 2003 12:55, Michael Frank wrote:
> On Saturday 28 June 2003 10:41, Andrew Morton wrote:
> > Michael Frank <[email protected]> wrote:
> > > Changes were recently made to the nbd.c in 2.5.73-mm1
> >
> > And tons more will be in -mm2, which I shall prepare right now.
> > Please retest on that and if it still hangs, capture the output
> > from pressing alt-sysrq-T.
>
> Legacy free, no serial port.
>
>
>
> Sorry, -mm2 hang at booting kernel on 2 machines.
>

Oh Murphy! Bug: 250K log buffer causes a hang on boot.

Sorry for the shock. I configured the log buffer bigger - 250K and it hangs on boot.

Default 14K log buffer all OK, the NBD hang is fixed too.

This was my only config change besides that driver which didn't compile ;)

I want a bigger log buffer in preparation for testing swsusp on 2.5. On 2.4,
the test I/O load prevents the big swsusp logs from making it to disk...

Thank you

Regards
Michael

--
Powered by linux-2.5.73-mm2, compiled with gcc-2.95-3 - not fancy but rock solid

My current linux related activities:
- Test development and testing of swsusp
- Everyday usage of 2.5 kernel

More info on the 2.5 kernel: http://www.codemonkey.org.uk/post-halloween-2.5.txt

2003-06-28 17:06:12

by Randy.Dunlap

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

> On Saturday 28 June 2003 12:55, Michael Frank wrote:
>> On Saturday 28 June 2003 10:41, Andrew Morton wrote:
>> > Michael Frank <[email protected]> wrote:
>> > > Changes were recently made to the nbd.c in 2.5.73-mm1
>> >
>> > And tons more will be in -mm2, which I shall prepare right now. Please
>> retest on that and if it still hangs, capture the output from pressing
>> alt-sysrq-T.
>>
>> Legacy free, no serial port.
>>
>>
>>
>> Sorry, -mm2 hang at booting kernel on 2 machines.
>>
>
> Oh Murphy! Bug: 250K log buffer causes a hang on boot.
>
> Sorry for the shock. I configured the log buffer bigger - 250K and it hangs
> on boot.
>
> Default 14K log buffer all OK, the NBD hang is fixed too.

Default value of 14 is a shift count (1 << 14), which gives a
16 KB buffer.
Did you enter '250' for the shift value?
Yes, that wouldn't boot.
Maybe consult the help text??
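
To make the shift arithmetic concrete, here is a minimal C sketch. It is not the kernel's code: the helper names log_buf_len and shift_for_bytes are hypothetical, and the only assumption taken from this thread is that the buffer size is a power of two given by the shift count.

```c
#include <assert.h>
#include <stddef.h>

/* Buffer size in bytes for a given shift count: 1 << shift.
 * Hypothetical helper, shown only to illustrate the arithmetic. */
static size_t log_buf_len(unsigned int shift)
{
    /* Shifting by the type width or more is undefined in C, which is
     * why a value such as 250 cannot describe a usable buffer. */
    assert(shift < sizeof(size_t) * 8);
    return (size_t)1 << shift;
}

/* Smallest shift count whose buffer holds at least `bytes` bytes.
 * Hypothetical helper; rounds up to the next power of two. */
static unsigned int shift_for_bytes(size_t bytes)
{
    unsigned int shift = 0;

    assert(bytes > 0);
    while (shift < sizeof(size_t) * 8 - 1 && log_buf_len(shift) < bytes)
        shift++;
    return shift;
}
```

Under these assumptions, log_buf_len(14) is 16384 bytes (16 KB), and a request for roughly 250 KB, shift_for_bytes(250 * 1024), rounds up to a shift of 18, which is a 256 KB buffer.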

> This was my only config change besides that driver which didn't compile ;)
>
> I want a bigger log buffer in preparation for testing swsusp on 2.5. On 2.4,
> the test io load prevent the big swsusp logs from making it to disk...

Andrew, do you want a min/max limit on the LOG_BUF_SHIFT value,
now that Roman has added that feature for Kconfig?

~Randy



2003-06-29 07:26:33

by Michael Frank

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

On Sunday 29 June 2003 01:20, Randy.Dunlap wrote:
> > On Saturday 28 June 2003 12:55, Michael Frank wrote:
> >> On Saturday 28 June 2003 10:41, Andrew Morton wrote:
> >> > Michael Frank <[email protected]> wrote:
> >> > > Changes were recently made to the nbd.c in 2.5.73-mm1
> >> >
> >> > And tons more will be in -mm2, which I shall prepare right now. Please
> >>
> >> retest on that and if it still hangs, capture the output from pressing
> >> alt-sysrq-T.
> >>
> >> Legacy free, no serial port.
> >>
> >>
> >>
> >> Sorry, -mm2 hang at booting kernel on 2 machines.
> >
> > Oh Murphy! Bug: 250K log buffer causes a hang on boot.
> >
> > Sorry for the shock. I configured the log buffer bigger - 250K and it
> > hangs on boot.
> >
> > Default 14K log buffer all OK, the NBD hang is fixed too.
>
> Default value of 14 is a shift count (1 << 14), which gives a

No sh**,

> 16 KB buffer.
> Did you enter '250' for the shift value?

Yes, I meant 250K bytes.

> Yes, that wouldn't boot.
> Maybe consult the help text??

I'll put it on a CD under my pillow tonight....

>
> > This was my only config change besides that driver which didn't compile
> > ;)
> >
> > I want a bigger log buffer in preparation for testing swsusp on 2.5. On
> > 2.4, the test io load prevent the big swsusp logs from making it to
> > disk...
>
> Andrew, do you want a min/max limit on the LOG_BUF_SHIFT value,
> now that Roman has added that feature for Kconfig?
>

Making this a shift count is a brilliant trap designed to humble buffalos
who do not bother to read the documentation. To put a check there would
just spoil the fun ;)

--
Powered by linux-2.5.73-mm2, compiled with gcc-2.95-3 - not fancy but rock solid

My current linux related activities:
- Test development and testing of swsusp
- Everyday usage of 2.5 kernel

More info on the 2.5 kernel: http://www.codemonkey.org.uk/post-halloween-2.5.txt


2003-06-30 15:10:02

by Lou Langholtz

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

Michael Frank wrote:

>Changes were recently made to the nbd.c in 2.5.73-mm1
>
>When using nbd.c ex 2.5.73 boot OK.
>acpi=off no effect
>
>----------------------------
>dmesg using nbd.c ex 2.5.73:
>
>loop: loaded (max 8 devices)
>anticipatory scheduling elevator
>
>(Using nbd.c ex 2.5.73-mm1
> nbd: registered device at major 43
> hang) . . .
>
Thank you for reporting this. A few others have also found this same
problem and a patch that fixes this has been submitted to Andrew. I
haven't had the chance yet to figure out what release of mm this fix may
have made it into. The reason for this problem in the first place was
that the patch which caused the problem was tested against 2.5.73 then
applied into Andrew's 2.5.73-mm1 release. Some other changes that made
it into 2.5.73-mm1 (in a non-nbd system that also hadn't been in 2.5.73
yet) interacted with the nbd change in the unexpected way you've seen.
If there are still problems you can track back to nbd please let me know.

2003-06-30 15:37:23

by Lou Langholtz

Subject: Re: 2.5.73-mm1 nbd: boot hang in add_disk at first call from nbd_init

Michael Frank wrote:

>On Saturday 28 June 2003 10:41, Andrew Morton wrote:
>
>
>>Michael Frank <[email protected]> wrote:
>>
>>
>>>Changes were recently made to the nbd.c in 2.5.73-mm1
>>>
>>>
>>And tons more will be in -mm2, which I shall prepare right now.
>>Please retest on that and if it still hangs, capture the output
>>from pressing alt-sysrq-T.
>>
>>
>
>Legacy free, no serial port.
>Sorry, -mm2 hang at booting kernel on 2 machines.
>
Just catching up on email after being away for a few days... so nbd
isn't at fault then in mm2, correct? Your follow on emails at least
seemed to indicate that there was a different problem that you in fact
were encountering in mm2. If you find a problem though in mm2 w.r.t. nbd
please let me know directly and I'll haul butt to get a fix out to you.

All the best ;-)