2007-12-16 15:34:57

by Roland

Subject: [PATCH] [RFC] be more verbose when probing EDD

Hello!

i have been a sysadmin for quite some time, and over the years i have come across a few systems which refused to boot a linux kernel. the typical symptom is a blinking cursor in the upper left corner just after kernel/initrd were loaded.

i never spent much time on that and either chose another system for linux or chose the "failsafe" option, if the installer of my favourite distro offered one.

since a colleague of mine was hit by that problem some weeks ago and i also came across it again recently, i started to investigate deeper and found that the EDD BIOS probe is the problem here.

i found more than a handful of references on the net where people report similar problems. many discussion threads contained the usual half-informed guessing, and seldom did somebody give the essential hint "try edd=off", which i'm sure would have helped many times.

that's why i started to think about how to make this easier/better for the user.

so

- it seems there are buggy Bios implementations out there which have problems with EDD
- your favourite distro may have set CONFIG_EDD=y|m , so EDD probe is on by default quite often nowadays.
- setting "edd=off" when you get that hang on boot is _not_ obvious.
- addressing this issue may come a little late, since i have mostly seen the problem on older machines and not on recently bought ones
- i have at least two different systems with different types of chipsets to demonstrate this

on one of those, i added some printfs to edd.c; the following call seems to be the problematic one and never returns:

    /* Extended Get Device Parameters */

    ei->params.length = sizeof(ei->params);
    ax = 0x4800;
    dx = devno;
    asm("pushfl; int $0x13; popfl"
        : "+a" (ax), "+d" (dx), "=m" (ei->params)
        : "S" (&ei->params)
        : "ebx", "ecx", "edi");

i had a short conversation with matt domsch and hpa, who both think that additional printf`s would be an easy solution and not too bad to be added.

here is a quick and dirty initial patch from me:

--- linux-2.6.23/arch/x86/boot/main.c.orig 2007-12-09 11:40:24.315346712 +0100
+++ linux-2.6.23/arch/x86/boot/main.c 2007-12-09 16:11:43.644512504 +0100
@@ -152,7 +152,10 @@

/* Query EDD information */
#if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
- query_edd();
+ printf("Probing EDD (query Bios for boot-device information)\n");
+ printf("If boot hangs here, you may have a buggy Bios. Try edd=skipmbr or edd=off");
+ query_edd();
+ printf("\rOK \n");
#endif
/* Do the last things and invoke protected mode */
go_to_protected_mode();


ok, for sure it's better to do that in query_edd() itself, but before making a better version, i'd like to discuss whether such a patch would be accepted at all, and whether it's a valid approach to let the kernel print a line which gets overwritten ("\rOK" plus lots of whitespace) milliseconds later on successful function return.

regards
roland


2007-12-16 17:34:53

by Parag Warudkar

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

On Sun, 16 Dec 2007, [email protected] wrote:

>
> - it seems there are buggy Bios implementations out there which have problems with EDD
> - your favourite distro may have set CONFIG_EDD=y|m , so EDD probe is on by default quite often nowadays.
> - setting "edd=off" when you get that hang on boot is _not_ obvious.

It does not look like this issue is common - googling for "Linux EDD Boot
hang" does not bring up relevant and recent results - in fact this post of
yours comes on top. Furthermore, there do not seem to be problems with
newer BIOSes, so we would be irritating a lot of users needlessly.

Adding 3 printks for each such obscure problem would make it even more
complex to parse and make sense of the boot log - I, for example, already
dislike the mostly-useless-to-end-user stuff it spews on a normal boot.

If there are known chipsets / BIOSes that have this problem - applying
quirks - something like this quirk for pmtmr [1]- (if they work this
early) or even special casing them with forced edd=off may be the right and more useful thing to do.

[1] http://www.webservertalk.com/archive242-2006-3-1447442.html

Parag
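
A minimal sketch of what such a quirk table could look like, written as plain userspace C. Everything here is hypothetical: the struct, the function name and both table entries are placeholders for illustration and not real BIOS data, and it simply assumes that vendor and version strings are available at the point where the probe is made (which, as noted above, may not be the case this early).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk entry: BIOS vendor plus a version prefix. */
struct edd_quirk {
    const char *vendor;
    const char *version_prefix;
};

/* Placeholder entries for illustration only; not real BIOS data. */
static const struct edd_quirk edd_quirks[] = {
    { "ExampleVendor", "1.0" },
    { "OtherVendor",   "A0"  },
};

/* Returns true if the EDD probe should be skipped for this BIOS. */
static bool edd_is_blacklisted(const char *vendor, const char *version)
{
    size_t i;

    for (i = 0; i < sizeof(edd_quirks) / sizeof(edd_quirks[0]); i++) {
        const struct edd_quirk *q = &edd_quirks[i];

        /* Exact vendor match, version only needs to start with the prefix. */
        if (strcmp(vendor, q->vendor) == 0 &&
            strncmp(version, q->version_prefix,
                    strlen(q->version_prefix)) == 0)
            return true;
    }
    return false;
}
```

A caller would check edd_is_blacklisted() first and behave as if edd=off had been given when it returns true. The table only helps for firmware that somebody has already identified and reported.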

2007-12-16 18:17:23

by H. Peter Anvin

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

[email protected] wrote:
> /* Query EDD information */
> #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
> - query_edd();
> + printf("Probing EDD (query Bios for boot-device information)\n");
> + printf("If boot hangs here, you may have a buggy Bios. Try edd=skipmbr or edd=off");
> + query_edd();
> + printf("\rOK \n");
> #endif

This is awfully verbose.

I'd prefer something like:

printf("Probing EDD BIOS... ");
query_edd();
printf("ok\n");

[or even "failed"...]

Also, you really should look for the "quiet" command-line option and
squelch the message.

-hpa
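
A rough userspace sketch of the "quiet" check suggested above. The helper names cmdline_has_word(), fake_query_edd() and probe_edd_verbose() are made up for this example and are not the kernel's own command-line parsing routines; the real setup code would also read the command line from the boot parameters rather than take it as a C string.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: true if `opt` appears in `cmdline` as a whole,
 * space-delimited word, so "quiet" matches neither "noquiet" nor "quietly".
 */
static bool cmdline_has_word(const char *cmdline, const char *opt)
{
    size_t len = strlen(opt);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, opt)) != NULL) {
        bool starts = (p == cmdline) || (p[-1] == ' ');
        bool ends = (p[len] == '\0') || (p[len] == ' ');

        if (starts && ends)
            return true;
        p++;    /* keep searching after this partial match */
    }
    return false;
}

/* Stand-in for the real BIOS probe; does nothing here. */
static void fake_query_edd(void)
{
}

/*
 * Print the before/after messages unless "quiet" was given.
 * Returns how many messages were printed (for demonstration only).
 */
static int probe_edd_verbose(const char *cmdline)
{
    bool quiet = cmdline_has_word(cmdline, "quiet");
    int printed = 0;

    if (!quiet) {
        printf("Probing EDD BIOS... ");
        fflush(stdout);    /* must be visible before a possible hang */
        printed++;
    }
    fake_query_edd();
    if (!quiet) {
        printf("ok\n");
        printed++;
    }
    return printed;
}
```

With "ro root=/dev/sda1 quiet" nothing is printed; without "quiet" the line reads "Probing EDD BIOS... ok" on success and stops after the dots if the probe never returns.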

2007-12-16 19:11:44

by Roland

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> It does not look like this issue is common - googling for "Linux EDD Boot
> hang" does not bring up relevant and recent results - in fact this post of
> yours comes on top.

that doesn't surprise me - how should one know that it's an edd problem if all he sees is a blank screen with a blinking cursor in the upper left?
so i assume "Linux EDD Boot hang" isn't the best search term here.

i think, there is no universal search phrase - i found some references with different words (linux install, blank screen, blinking cursor....)

some of them
http://forums.suselinuxsupport.de/lofiversion/index.php/t61581.html
http://suseforums.net/index.php?showtopic=1302
http://forums.suselinuxsupport.de/lofiversion/index.php/t3157.html
http://www.linuxforums.org/forum/installation/73984-cant-boot-past-loading-kernel-fedora-4-a.html
http://ubuntuforums.org/archive/index.php/t-146180.html
http://ubuntuforums.org/showthread.php?t=450630
http://www.linuxquestions.org/questions/linux-software-2/grub-freezes-before-uncompressing-kernel-464619/

not sure if really all of them are the edd problem i have, but i`m quite sure for some of those.
can provide more links if somebody likes.

> Adding 3 printks for each such obscure problem would make it even more
i wouldn't call that problem obscure.
i have seen this problem more than 5 times since that code went into linux, so it's at least worth discussing, even if nobody else thinks it's necessary :) just the fact that it hasn't been discussed often (here or elsewhere) doesn't mean that it doesn't exist or that users don't come across it. it's more the "mhh, let's try linux instead of windows. oh, doesn't boot. ok, let's stay with windows then..." type of situation. you don't expect people like that to post the right question to the right forum, do you?


> If there are known chipsets / BIOSes that have this problem - applying
> quirks - something like this quirk for pmtmr [1]- (if they work this
> early) or even special casing them with forced edd=off may be the right and more useful thing to do.
mhh. i'm unsure. ok, i'm no kernel hacker, but there are so many variants of bios/hardware out there that blacklisting certain bios versions may never be more than half a solution.

roland



2007-12-16 19:19:18

by Roland

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> This is awfully verbose.
yes, this was meant as an example.
also mind that the last printf with "\rOK...." is meant to overwrite the hint message, so nobody should ever notice that message on a successful boot. i'm just unsure whether this is a good approach. i have found two occurrences in the linux kernel where drivers do something similar.

> [or even "failed"...]
yes, but how to detect if we got stuck forever inside that bios probe ?
i assume that`s hard to do, so that`s why i`m printing that hint "into the future", to be overwritten on successful return.

> Also, you really should look for the "quiet" command-line option and
> squelch the message.
sure! will add that if there is some positive feedback and someone finds this patch useful.

regards
roland



2007-12-16 19:59:52

by Parag Warudkar

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

On Dec 16, 2007 2:11 PM, <[email protected]> wrote:
> some of them
> http://forums.suselinuxsupport.de/lofiversion/index.php/t61581.html
> http://suseforums.net/index.php?showtopic=1302
> http://forums.suselinuxsupport.de/lofiversion/index.php/t3157.html
> http://www.linuxforums.org/forum/installation/73984-cant-boot-past-loading-kernel-fedora-4-a.html
> http://ubuntuforums.org/archive/index.php/t-146180.html
> http://ubuntuforums.org/showthread.php?t=450630
> http://www.linuxquestions.org/questions/linux-software-2/grub-freezes-before-uncompressing-kernel-464619/
>
> not sure if really all of them are the edd problem i have, but i`m quite sure for some of those.
> can provide more links if somebody likes.

None of them appears to have been diagnosed as an EDD problem - some
did not even work with edd=skipmbr.
But that is not the point. The problem is not widely known, likely happens
only with very old BIOSes, and people who use those systems should
know enough to do more than simply boot back to Win98.
And if those people are smart enough to figure out the right place to
ask - they will likely get suggestions to use edd=off, as some of the
links you posted above prove - without the message.

Why tax other people with a warning/hang etc. in printk when the
problem is very unlikely on their systems?

>
> > Adding 3 printks for each such obscure problem would make it even more
> i wouldn`t call that problem obscure.
> i have seen this problem more than 5 times since that code went into linux, so it's at least worth discussing, even if nobody else thinks it's necessary :) just the fact that it hasn't been discussed often (here or elsewhere) doesn't mean that it doesn't exist or that users don't come across it. it's more the "mhh, let's try linux instead of windows. oh, doesn't boot. ok, let's stay with windows then..." type of situation. you don't expect people like that to post the right question to the right forum, do you?

So you think those people with such low tech knowledge will figure out
the right mailing list, make sense of the output of EDD message and
then post the question? Or that they will figure out how to specify
the edd=off to kernel command line? In that use case - not doing EDD
on their boxes like I said would be the most useful thing to do. A
message or 3 is useless in this case.

>
> > If there are known chipsets / BIOSes that have this problem - applying
> > quirks - something like this quirk for pmtmr [1]- (if they work this
> > early) or even special casing them with forced edd=off may be the right and more useful thing to do.
> mhh. i`m unsure. ok, i`m no kernel-hacker but there are so many variants of bios/hardware out there so blacklisting certain bios versions may be never more than half of a solution.

We have no data though to say this is a widespread problem. And the
type of people you are catering to in your patch need further hand
holding to keep their system running Linux anyway.
(Hey my modem won't work, the resolution is too low etc. let's go back
to Windows.)

But yeah whatever - if you cut it down to one printk line and remove
the hang word that would at least be bearable :)

Parag

2007-12-16 20:25:33

by Alan

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> Why tax other people with a warning/hang etc. in printk when the
> problem is very unlikely on their systems?

I think there is sense in it if you do it subtly differently.

printk(".. if this hangs do ... \r");
edd_stuff();
printk(" \r");


So that we display it, do the EDD call, then write over it with whatever
is next that matters.

That way you'd only see it when it hung - and that might be worth a patch
and test from someone.
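
A small userspace sketch of this show-then-erase idea. The function names and the hint text are made up for illustration, and fake_edd_probe() stands in for the BIOS call; it writes to a stdio stream rather than through printk.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "\r" + `width` spaces + "\r" into `buf`, which blanks a hint of
 * `width` characters and leaves the cursor at column 0.
 * Returns 0 on success, -1 if `buf` is too small.
 */
static int build_line_eraser(char *buf, size_t bufsize, size_t width)
{
    if (bufsize < width + 3)    /* two '\r', the spaces, and '\0' */
        return -1;

    buf[0] = '\r';
    memset(buf + 1, ' ', width);
    buf[width + 1] = '\r';
    buf[width + 2] = '\0';
    return 0;
}

/* Stand-in for the BIOS call; returns immediately here. */
static void fake_edd_probe(void)
{
}

/* Show the hint, run the probe, then erase the hint again. */
static int probe_with_transient_hint(FILE *out)
{
    static const char hint[] = "Probing EDD, if this hangs try edd=off";
    char eraser[128];

    if (build_line_eraser(eraser, sizeof(eraser), strlen(hint)) != 0)
        return -1;

    fputs(hint, out);
    fflush(out);        /* the hint must be on screen before a possible hang */
    fake_edd_probe();
    fputs(eraser, out);
    fflush(out);
    return 0;
}
```

If the probe returns, the line is blank again and the next message starts at column 0; if it hangs, the hint is the last thing left on the screen.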

2007-12-16 21:05:03

by Roland

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> > not sure if really all of them are the edd problem i have, but i`m quite sure for some of those.
> > can provide more links if somebody likes.
>
> None of them seemed like they were determined as EDD problems -

so - see how difficult it is to determine what the problem is? ;)

>some even did not work with edd=skipmbr.

on my 2 problematic systems, edd=skipmbr doesn`t help. only edd=off helps.

> But that is not the point. Problem is not widely known, likely happens
> with very old BIOSes and people who use those systems should be
> knowing more than simply booting back to Win98.

i don't see any relation between what a user knows and the kind or age of the system he is using.


> And if those people are smart enough to figure out the right place to
> ask - they will likely get suggestions to do edd=off like some of the
> above links that you posted prove - without the message.

see - and THAT'S where your opinion probably differs very much from mine.
i think users don't want an operating system at all.
they want their computer to just work so that they can use their apps.
we linux fans have fun debugging things and making them work, helping others and so on - but others just don't want to search for the magic token that makes their computer boot. if windows just boots on that system, so should linux. if it doesn't, they should be able to fix this without being an expert.

> Why tax other people with a warning/hang etc. in printk when the
> problem is very unlikely on their systems?

i don't have a clue how likely or unlikely the problem is. i have seen it more than once, and i know people who can tell you about it as well.

> So you think those people with such low tech knowledge will figure out
> the right mailing list, make sense of the output of EDD message and
> then post the question? Or that they will figure out how to specify
> the edd=off to kernel command line? In that use case - not doing EDD
> on their boxes like I said would be the most useful thing to do. A
> message or 3 is useless in this case.

at least they could get help from any other linux user with average experience.
but a blinking cursor in the upper left is nothing an average linux user would be able to associate with EDD.

> But yeah whatever - if you cut it down to one printk line and remove
> the hang word that would at least be bearable :)

ok. i think i'm making too much noise for too few lines of code, so let's stop discussing.
but let's wait for some more comments.
maybe a simple "probing edd" message is a diplomatic solution and at least better than nothing.

regards
roland



2007-12-16 21:08:11

by Alan

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> > But that is not the point. Problem is not widely known, likely happens
> > with very old BIOSes and people who use those systems should be
> > knowing more than simply booting back to Win98.
>
> i don't see any relation between what a user knows and the kind or age of the system he is using.

It's much, much uglier than that, alas. All the cases I've traced down have
been buggy BIOSes on add-in cards, not the system BIOS - the EDD hang
actually moved with the controller card.

So it is useful to know.

2007-12-17 23:25:29

by Jan Engelhardt

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD


On Dec 16 2007 20:18, Alan Cox wrote:
>
>> Why tax other people with a warning/hang etc. in printk when the
>> problem is very unlikely on their systems?
>
>I think there is sense in it if you do it subtly differently.
>
> printk(".. if this hangs do ... \r");
> edd_stuff();
> printk(" \r");
>
>
>So that we display it, do the EDD call, then write over it with whatever
>is next that matters.

Does printk support escape sequences? The last time I tried
printk("\e[1;35m omg ponies \e[0m"); that did not go too well.

>That way you'd only see it when it hung - and that might be worth a patch
>and test from someone.

2007-12-17 23:27:57

by H. Peter Anvin

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

Jan Engelhardt wrote:
> On Dec 16 2007 20:18, Alan Cox wrote:
>>> Why tax other people with a warning/hang etc. in printk when the
>>> problem is very unlikely on their systems?
>> I think there is sense in it if you do it subtly differently.
>>
>> printk(".. if this hangs do ... \r");
>> edd_stuff();
>> printk(" \r");
>>
>>
>> So that we display it, do the EDD call, then write over it with whatever
>> is next that matters.
>
> Does printk support escape sequences? The last time I tried
> printk("\e[1;35m omg ponies \e[0m"); that did not go too well.
>

Uh, no. Do that and anyone trying to interpret logs will beat you to
death with a pickled herring.

-hpa

2007-12-18 00:17:18

by Alan

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> Does printk support escape sequences? The last time I tried
> printk("\e[1;35m omg ponies \e[0m"); that did not go too well.

It should handle \r correctly. If not that is easy to fix.

Escape codes are bad and should not be used - you may have a serial
console and not be on a Linux console

2007-12-18 00:22:49

by H. Peter Anvin

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

Alan Cox wrote:
>> Does printk support escape sequences? The last time I tried
>> printk("\e[1;35m omg ponies \e[0m"); that did not go too well.
>
> It should handle \r correctly. If not that is easy to fix.
>
> Escape codes are bad and should not be used - you may have a serial
> console and not be on a Linux console

It's not printk(), or the console, that's the issue...
Consider logging to a file, which is quite common.

Dumping formatting characters in there is a bit evil.

Instead of using \r and overwrite, something like

Testing foo... ok

printk(KERN_INFO "Testing foo... ");
foo();
printk("ok\n");

... really is a lot better. We used to do that a lot more.

-hpa
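
To illustrate the logging concern, here is a small userspace sketch that compares what a terminal would display with what a log file would store for the same byte stream. Both function names are made up for this example, and the terminal model is deliberately simplified to a single line in which '\r' only resets the column.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simulate what a terminal shows for one line of output: '\r' moves the
 * cursor back to column 0 and later characters overwrite earlier ones.
 * `out` must hold at least strlen(raw) + 1 bytes.
 */
static void render_like_terminal(const char *raw, char *out)
{
    size_t col = 0, len = 0;

    for (; *raw != '\0'; raw++) {
        if (*raw == '\r') {
            col = 0;
            continue;
        }
        out[col++] = *raw;
        if (col > len)
            len = col;
    }
    out[len] = '\0';
}

/* Count the '\r' bytes that a log file would store verbatim. */
static size_t count_carriage_returns(const char *raw)
{
    size_t n = 0;

    for (; *raw != '\0'; raw++)
        if (*raw == '\r')
            n++;
    return n;
}
```

For the stream "hint\r    \rok" the simulated terminal ends up showing "ok" followed by two spaces, while a log file keeps all twelve bytes, including the hint and both carriage returns. The "Testing foo... ok" form renders and logs identically.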

2007-12-22 01:15:32

by Andi Kleen

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

"H. Peter Anvin" <[email protected]> writes:

> [email protected] wrote:
>> /* Query EDD information */
>> #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
>> - query_edd();
>> + printf("Probing EDD (query Bios for boot-device information)\n");
>> + printf("If boot hangs here, you may have a buggy Bios. Try edd=skipmbr or edd=off");
>> + query_edd();
>> + printf("\rOK \n");
>> #endif
>
> This is awfully verbose.

But useful. On the other hand we have lots of IMNSHO
quite useless printks in early bootup that could be removed
to make up for that.

Candidates would be the
"CPU: After/before generic identify"
or
"Entering add_active_range...."
printks; removing those would easily make up for the added verbosity.

-Andi

2007-12-22 01:49:20

by H. Peter Anvin

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

Andi Kleen wrote:
> "H. Peter Anvin" <[email protected]> writes:
>
>> [email protected] wrote:
>>> /* Query EDD information */
>>> #if defined(CONFIG_EDD) || defined(CONFIG_EDD_MODULE)
>>> - query_edd();
>>> + printf("Probing EDD (query Bios for boot-device information)\n");
>>> + printf("If boot hangs here, you may have a buggy Bios. Try edd=skipmbr or edd=off");
>>> + query_edd();
>>> + printf("\rOK \n");
>>> #endif
>> This is awfully verbose.
>
> But useful. On the other hand we have lots of IMNSHO
> quite useless printks in early bootup that could be removed
> to make up for that
>
> (e.g. candidates would be the
> "CPU: After/before generic identify"
> or
> Entering add_active_range....)
>
> printks and you can easily make up for the added verbosity.
>

Those don't live in an area of memory which is hard-limited to 32K.

-hpa

2007-12-22 01:57:16

by Andi Kleen

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

> Those don't live in an area of memory which is hard-limited to 32K.

Why not 64k?

Ok, that's a different argument than before. Ok. Although it's
only a few bytes.

I would lobby for any message to at least contain the suggestion to try
edd=off. That could save users a lot of time.

-Andi

2007-12-22 02:22:58

by H. Peter Anvin

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

Andi Kleen wrote:
>> Those don't live in an area of memory which is hard-limited to 32K.
>
> Why not 64k?
>

Because the bootloader needs some memory in the same segment that it
controls. Furthermore, since there were some residual uses of the
0x9000 segment (now removed, but not all bootloaders could be easily
fixed), we were limited to about 40K for everything.

Unfortunately we already have the problem that some bootloaders (notably
LOADLIN and mknbi) hardcoded an arbitrary limit which was even smaller
than that (about 16K, which we're already pushing up against.)

> Ok, that's a different argument than before. Ok. Although it's
> only a few bytes.
>
> I would lobby for any message to at least contain the suggestion to try
> edd=off. That could save users a lot of time.

The important thing is that there is a message before and after. The
rest can be dealt with by Google or in documentation. That's what the
patch currently in x86setup.git does.

-hpa

2007-12-22 02:40:59

by Andi Kleen

Subject: Re: [PATCH] [RFC] be more verbose when probing EDD

On Fri, Dec 21, 2007 at 06:22:41PM -0800, H. Peter Anvin wrote:
> >Ok, that's a different argument than before. Ok. Although it's
> >only a few bytes.
> >
> >I would lobby for any message to at least contain the suggestion to try
> >edd=off. That could save users a lot of time.
>
> The important thing is that there is a message before and after. The
> rest can be dealt with by Google or in documentation. That's what the

That means every user needs to waste at least 5 minutes to google/read
docs and be annoyed by someone's ads. Collective waste.

Just adding "Try edd=off if this hangs" is ~23 bytes and adding
that should hardly be a problem even in your 32k. Or perhaps if you're more
worried about code size than grammar "edd=off when hang" @)

-Andi
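
The size estimate can be checked with sizeof on the two candidate strings. This is a plain C sketch with made-up variable and function names; it counts only the string data including the terminating NUL, not any code needed to print it.

```c
#include <stddef.h>

/* Candidate messages from the thread; sizeof includes the trailing NUL. */
static const char msg_long[]  = "Try edd=off if this hangs";
static const char msg_short[] = "edd=off when hang";

/* Bytes of string data each message would add. */
static size_t msg_long_bytes(void)
{
    return sizeof(msg_long);
}

static size_t msg_short_bytes(void)
{
    return sizeof(msg_short);
}
```

The longer message comes to 25 characters plus the NUL, so 26 bytes, which is in line with the rough estimate above; the shorter one is 18 bytes. Either is well under 0.1% of a 32K budget.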