2004-06-22 06:45:30

by David Balazic

[permalink] [raw]
Subject: EDD code causes long boot delay

Hi!

I first posted this under the title "Grub 1 troubleshooting, linux boot
delayed problem" to the LKML
( see http://lkml.org/lkml/2004/6/11/29 ), but since the discovered that
this is a kernel problem,
the CONFIG_EDD code, to be more exact. Here I repeat the simptoms :

I use the commands ( in the grub shell that boots from my HD ):
root (hd0,0)
kernel /vmlinuz-2.6.xxx ro root=dev/hda2
initrd /initrd-2.xxx
boot

After entering the "boot" command, the screen is cleared and then nothing
happens for 93 seconds.
After that linux boot normally ( beginning with the message "Uncompressing
linux..." and so on ).
If I trim the kernel command line to only "kernel /vmlinuz-2.6.xxx ro" ,
then the delay is 10 seconds.

If I change the operating mode of the IDE adapter in BIOS to RAID, then the
delay is infinite ( I waited 8 hours and it still did not print
"Uncompressing
linx ...." ). Note that with this option, there is no real difference, since
the
disks are not mirrored or striped or anything. GRUB can read everything just
fine.

I tested all kernels ( vanilla Linus release from kernel.org ) from 2.6.0 to
2.6.7
and saw that the problem appeared in 2.6.3. Then I found out that that the
cause
is the CONFIG_EDD option, if I turn it off, then linux-2.6.7 boot without
the mentioned
delay.

So, is there some bug in the EDD code ? A BIOS bug ? Any other comment ?

My hardware is a Gigabyte GA-7 VAXP Ultra motherboard ( see
http://www.giga-byte.com/MotherBoard/Products/Products_GA-7VAXP%20Ultra.htm
)
BIOS version is F6.
Disks :
- Quantum lct20 20 GB
- IBM Deskstar 120GXP 60 GB
Both on the Promise PDC 20276 on-board controller, each on its own
channel(cable).

Regards,
David Balazic
----------------------------------------------------------------------------
-----------
http://noepatents.org/ Innovation, not litigation !
---
David Balazic mailto:[email protected]
HERMES Softlab http://www.hermes-softlab.com
Zagrebska cesta 104 Phone: +386 2 450 8851
SI-2000 Maribor
Slovenija
----------------------------------------------------------------------------
-----------
"Be excellent to each other." -
Bill S. Preston, Esq. & "Ted" Theodore Logan
----------------------------------------------------------------------------
-----------









2004-06-22 12:51:16

by Matt Domsch

[permalink] [raw]
Subject: Re: EDD code causes long boot delay

On Tue, Jun 22, 2004 at 08:45:00AM +0200, David Balazic wrote:
> I use the commands ( in the grub shell that boots from my HD ):
> root (hd0,0)
> kernel /vmlinuz-2.6.xxx ro root=dev/hda2
> initrd /initrd-2.xxx
> boot
>
> After entering the "boot" command, the screen is cleared and then nothing
> happens for 93 seconds.
> After that linux boot normally ( beginning with the message "Uncompressing
> linux..." and so on ).
> If I trim the kernel command line to only "kernel /vmlinuz-2.6.xxx ro" ,
> then the delay is 10 seconds.

Wow. That's really interesting, and gives a good starting point to
look at, where the command line is stored. But the EDD code shouldn't
be using any of the space that the command line occupies, ever...

> If I change the operating mode of the IDE adapter in BIOS to RAID,
> then the delay is infinite ( I waited 8 hours and it still did not
> print "Uncompressing linx ...." ). Note that with this option, there
> is no real difference, since the disks are not mirrored or striped
> or anything. GRUB can read everything just fine.

> I tested all kernels ( vanilla Linus release from kernel.org ) from
> 2.6.0 to 2.6.7 and saw that the problem appeared in 2.6.3. Then I
> found out that that the cause is the CONFIG_EDD option, if I turn it
> off, then linux-2.6.7 boot without the mentioned delay.
>
> So, is there some bug in the EDD code ? A BIOS bug ? Any other comment ?
> My hardware is a Gigabyte GA-7 VAXP Ultra motherboard ( see
> http://www.giga-byte.com/MotherBoard/Products/Products_GA-7VAXP%20Ultra.htm
> )
> BIOS version is F6.
> Disks :
> - Quantum lct20 20 GB
> - IBM Deskstar 120GXP 60 GB
> Both on the Promise PDC 20276 on-board controller, each on its own
> channel(cable).

It *feels* like a BIOS bug (or bugs), as you're the first to report
such a problem. I've got one success report with this motherboard
dated 8 Feb 2004.

Motherboard: Gigabyte GA-7IXE4
BIOS version: FAd (that's a beta version)
Kernel: 2.6.2-mm1

The question is, which of the int13 calls is messing up your BIOS?
The "read MBR sector and save the 4-byte MBR signature" code was added
there at 2.6.3, which is quite coincident.

This is the code, in arch/i386/boot/setup.S, which makes that int13
call. Can you try #if 0'ing this chunk out and see if this helps?

+# Read the first sector of device 80h and store the 4-byte signature
+ movl $0xFFFFFFFF, %eax
+ movl %eax, (DISK80_SIG_BUFFER) # assume failure
+ movb $READ_SECTORS, %ah
+ movb $1, %al # read 1 sector
+ movb $0x80, %dl # from device 80
+ movb $0, %dh # at head 0
+ movw $1, %cx # cylinder 0, sector 0
+ pushw %es
+ pushw %ds
+ popw %es
+ movw $EDDBUF, %bx
+ int $0x13
+ jc disk_sig_done
+ movl (EDDBUF+MBR_SIG_OFFSET), %eax
+ movl %eax, (DISK80_SIG_BUFFER) # store success
+disk_sig_done:
+ popw %es
+

If so, then I'd like to believe that it's definitely a BIOS bug, as
int13 reads to the MBR had better work (gee, that's how the boot
loader got loaded in the first place, right?)

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & http://www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


Attachments:
(No filename) (3.34 kB)
(No filename) (189.00 B)
Download all attachments

2004-06-22 12:57:05

by David Balazic

[permalink] [raw]
Subject: RE: EDD code causes long boot delay


> From: Matt Domsch[SMTP:[email protected]]
>
> On Tue, Jun 22, 2004 at 08:45:00AM +0200, David Balazic wrote:
> > I use the commands ( in the grub shell that boots from my HD ):
> > root (hd0,0)
> > kernel /vmlinuz-2.6.xxx ro root=dev/hda2
> > initrd /initrd-2.xxx
> > boot
> >
> > After entering the "boot" command, the screen is cleared and then
> nothing
> > happens for 93 seconds.
> > After that linux boot normally ( beginning with the message
> "Uncompressing
> > linux..." and so on ).
> > If I trim the kernel command line to only "kernel /vmlinuz-2.6.xxx ro" ,
> > then the delay is 10 seconds.
>
> Wow. That's really interesting, and gives a good starting point to
> look at, where the command line is stored. But the EDD code shouldn't
> be using any of the space that the command line occupies, ever...
>
Yes, it is really weird. I will look into this some more ...
>
> > If I change the operating mode of the IDE adapter in BIOS to RAID,
> > then the delay is infinite ( I waited 8 hours and it still did not
> > print "Uncompressing linx ...." ). Note that with this option, there
> > is no real difference, since the disks are not mirrored or striped
> > or anything. GRUB can read everything just fine.
>
> > I tested all kernels ( vanilla Linus release from kernel.org ) from
> > 2.6.0 to 2.6.7 and saw that the problem appeared in 2.6.3. Then I
> > found out that that the cause is the CONFIG_EDD option, if I turn it
> > off, then linux-2.6.7 boot without the mentioned delay.
> >
> > So, is there some bug in the EDD code ? A BIOS bug ? Any other comment ?
> > My hardware is a Gigabyte GA-7 VAXP Ultra motherboard ( see
> >
> http://www.giga-byte.com/MotherBoard/Products/Products_GA-7VAXP%20Ultra.ht
> m
> > )
> > BIOS version is F6.
> > Disks :
> > - Quantum lct20 20 GB
> > - IBM Deskstar 120GXP 60 GB
> > Both on the Promise PDC 20276 on-board controller, each on its own
> > channel(cable).
>
> It *feels* like a BIOS bug (or bugs), as you're the first to report
> such a problem. I've got one success report with this motherboard
> dated 8 Feb 2004.
>
> Motherboard: Gigabyte GA-7IXE4
> BIOS version: FAd (that's a beta version)
> Kernel: 2.6.2-mm1
>
> The question is, which of the int13 calls is messing up your BIOS?
> The "read MBR sector and save the 4-byte MBR signature" code was added
> there at 2.6.3, which is quite coincident.
>
> This is the code, in arch/i386/boot/setup.S, which makes that int13
> call. Can you try #if 0'ing this chunk out and see if this helps?
>
Will do ...

> +# Read the first sector of device 80h and store the 4-byte signature
> + movl $0xFFFFFFFF, %eax
> + movl %eax, (DISK80_SIG_BUFFER) # assume failure
> + movb $READ_SECTORS, %ah
> + movb $1, %al # read 1 sector
> + movb $0x80, %dl # from device 80
> + movb $0, %dh # at head 0
> + movw $1, %cx # cylinder 0, sector 0
> + pushw %es
> + pushw %ds
> + popw %es
> + movw $EDDBUF, %bx
> + int $0x13
> + jc disk_sig_done
> + movl (EDDBUF+MBR_SIG_OFFSET), %eax
> + movl %eax, (DISK80_SIG_BUFFER) # store success
> +disk_sig_done:
> + popw %es
> +
>
> If so, then I'd like to believe that it's definitely a BIOS bug, as
> int13 reads to the MBR had better work (gee, that's how the boot
> loader got loaded in the first place, right?)
>
Yes, it(grub) loads the kernel and everything just fine ...

> Thanks,
> Matt
>
Can you send mail to [email protected] ?
It got refused for me :
--begin quote--
This message was created automatically by mail delivery software.

A message that you sent could not be delivered to one or more of its
recipients. This is a permanent error. The following address(es) failed:

[email protected]
retry time not reached for any host after a long failure period

--end quote--
No other information was present :-(

Regards,
David Balazic

2004-06-22 13:03:07

by Matt Domsch

[permalink] [raw]
Subject: Re: EDD code causes long boot delay

On Tue, Jun 22, 2004 at 02:56:30PM +0200, David Balazic wrote:
> Can you send mail to [email protected] ?
> It got refused for me :
> --begin quote--
> This message was created automatically by mail delivery software.
>
> A message that you sent could not be delivered to one or more of its
> recipients. This is a permanent error. The following address(es) failed:
>
> [email protected]
> retry time not reached for any host after a long failure period
>
> --end quote--
> No other information was present :-(

Yes, I got the same bounce message, immediately (not after a long
failure period).

--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & http://www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


Attachments:
(No filename) (774.00 B)
(No filename) (189.00 B)
Download all attachments

2004-06-23 08:07:07

by David Balazic

[permalink] [raw]
Subject: RE: EDD code causes long boot delay

> From: Matt Domsch[SMTP:[email protected]]
>
> On Tue, Jun 22, 2004 at 08:45:00AM +0200, David Balazic wrote:
> > I use the commands ( in the grub shell that boots from my HD ):
> > root (hd0,0)
> > kernel /vmlinuz-2.6.xxx ro root=dev/hda2
> > initrd /initrd-2.xxx
> > boot
> >
> > After entering the "boot" command, the screen is cleared and then
> nothing
> > happens for 93 seconds.
> > After that linux boot normally ( beginning with the message
> "Uncompressing
> > linux..." and so on ).
> > If I trim the kernel command line to only "kernel /vmlinuz-2.6.xxx ro" ,
> > then the delay is 10 seconds.
>
> Wow. That's really interesting, and gives a good starting point to
> look at, where the command line is stored. But the EDD code shouldn't
> be using any of the space that the command line occupies, ever...
>
Did some investigation in this area :

grub line | delay between "boot" and
"Uncompressing linux...."
----------------------------------------------------------------------------
-------------
"kernel (hd0,0)/bzImage-2.6.7" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 a" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 a b" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 a b c d" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 a b c d e" 1 minute 54 seconds
"kernel (hd0,0)/bzImage-2.6.7 a b c de" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 a b c def" 1 minute 54 seconds
"kernel (hd0,0)/bzImage-2.6.7 abcdefgh" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 abcdefghi" 10 seconds
"kernel (hd0,0)/bzImage-2.6.7 abcdefghij" 1 minute 30 seconds
"kernel (hd0,0)/bzImage-2.6.7 abcdefghijk" 1 minute 30 seconds


>
> > If I change the operating mode of the IDE adapter in BIOS to RAID,
> > then the delay is infinite ( I waited 8 hours and it still did not
> > print "Uncompressing linx ...." ). Note that with this option, there
> > is no real difference, since the disks are not mirrored or striped
> > or anything. GRUB can read everything just fine.
>
> > I tested all kernels ( vanilla Linus release from kernel.org ) from
> > 2.6.0 to 2.6.7 and saw that the problem appeared in 2.6.3. Then I
> > found out that that the cause is the CONFIG_EDD option, if I turn it
> > off, then linux-2.6.7 boot without the mentioned delay.
> >
> > So, is there some bug in the EDD code ? A BIOS bug ? Any other comment ?
> > My hardware is a Gigabyte GA-7 VAXP Ultra motherboard ( see
> >
> http://www.giga-byte.com/MotherBoard/Products/Products_GA-7VAXP%20Ultra.ht
> m
> > )
> > BIOS version is F6.
> > Disks :
> > - Quantum lct20 20 GB
> > - IBM Deskstar 120GXP 60 GB
> > Both on the Promise PDC 20276 on-board controller, each on its own
> > channel(cable).
>
> It *feels* like a BIOS bug (or bugs), as you're the first to report
> such a problem. I've got one success report with this motherboard
> dated 8 Feb 2004.
>
I downgraded thje BIOS to version F4 and then there is no delay :-(

> Motherboard: Gigabyte GA-7IXE4
> BIOS version: FAd (that's a beta version)
> Kernel: 2.6.2-mm1
>
> The question is, which of the int13 calls is messing up your BIOS?
> The "read MBR sector and save the 4-byte MBR signature" code was added
> there at 2.6.3, which is quite coincident.
>
> This is the code, in arch/i386/boot/setup.S, which makes that int13
> call. Can you try #if 0'ing this chunk out and see if this helps?
>
> +# Read the first sector of device 80h and store the 4-byte signature
> + movl $0xFFFFFFFF, %eax
> + movl %eax, (DISK80_SIG_BUFFER) # assume failure
> + movb $READ_SECTORS, %ah
> + movb $1, %al # read 1 sector
> + movb $0x80, %dl # from device 80
> + movb $0, %dh # at head 0
> + movw $1, %cx # cylinder 0, sector 0
> + pushw %es
> + pushw %ds
> + popw %es
> + movw $EDDBUF, %bx
> + int $0x13
> + jc disk_sig_done
> + movl (EDDBUF+MBR_SIG_OFFSET), %eax
> + movl %eax, (DISK80_SIG_BUFFER) # store success
> +disk_sig_done:
> + popw %es
> +
>
My version (linux-2.6.7) has three additional lines around the int 13 call,
commented as a workaround for buggy BIOS-es ( they are mov, cli and sti
IIRC).

If I #if 0 out this block, then the delay disappears. I also tried to
comment out the buggy-BIOS
workaround code, but that did not make any difference ( the boot delay was
still
there ).

> If so, then I'd like to believe that it's definitely a BIOS bug, as
> int13 reads to the MBR had better work (gee, that's how the boot
> loader got loaded in the first place, right?)
>
Maybe the linux code does something that confuses the BIOS before the EDD
code
is run ??? Guessing ...

> Thanks,
> Matt
>
> --
> Matt Domsch
> Sr. Software Engineer, Lead Engineer
> Dell Linux Solutions linux.dell.com & http://www.dell.com/linux
> Linux on Dell mailing lists @ http://lists.us.dell.com
>