2008-07-01 08:33:59

by Thomas Renninger

[permalink] [raw]
Subject: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS

On Friday 20 June 2008 09:43:25 Len Brown wrote:
> From: Bob Moore <[email protected]>
>
> Some BIOSs erroneously reverse the _PRT SourceName and the
> SourceIndex. Detect and repair this problem. MS ACPI also allows
> and repairs this problem, thus ACPICA must also.

It would be great to have an interface to report this as a BIOS defect.

Something like:

FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT source_name", ACPI_
Bug);

FIRMWARE_BUG_ON(severity, description, component);

If say CONFIG_FIRMWARE_DEBUG is set this could send a netlink event to
userspace and e.g. linuxfirmwarekit can grab it.
linuxfirmwarekit currently implements it's own scripts and even worse, parses
dmesg to find BIOS bugs, which is not really maintenable.

IMO the kernel developers themselves need an interface where they can report
bugs while seeing and fixing them.

Huang Cheng posted a nice patch on the linuxfirmwarekit list a while ago
(unfortunately the patch was attached not inlined):
http://www.bughost.org/pipermail/firmwarekit-discuss/2008-March/000166.html

I mean it needs some cleanup (e.g. the file name was the bug number, there
need to be a general interface created, e.g. as shown above, etc.).
But this is IMO a good idea and might be worth adding to the kernel (when
written :) ).
Comments?

Thomas

PS: I added Bjorn as in the PNP parts BIOSes are pretty much messed up.
One reason is that AFAIK ACPI parts are glued together. Having dozens/hundreds
of device declarations in a BIOS of which huge parts have not been written by
yourself, it is very hard for vendors to check for correctness.
While Bjorn and others are adding quirks and workarounds for specific devices,
machines or general BIOS bugs it would be great to be able to point vendors
to these bugs by a simple tool.

> Signed-off-by: Bob Moore <[email protected]>
> Signed-off-by: Lin Ming <[email protected]>
> Signed-off-by: Len Brown <[email protected]>
> ---
> drivers/acpi/resources/rscreate.c | 17 +++++++++++++++++
> 1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/acpi/resources/rscreate.c
> b/drivers/acpi/resources/rscreate.c index faddaee..70c84ec 100644
> --- a/drivers/acpi/resources/rscreate.c
> +++ b/drivers/acpi/resources/rscreate.c
> @@ -285,6 +285,23 @@ acpi_rs_create_pci_routing_table(union
> acpi_operand_object *package_object, }
>
> /*
> + * If the BIOS has erroneously reversed the _PRT source_name (index 2)
> + * and the source_index (index 3), fix it. _PRT is important enough to
> + * workaround this BIOS error. This also provides compatibility with
> + * other ACPI implementations.
> + */
> + obj_desc = sub_object_list[3];
> + if (!obj_desc
> + || (ACPI_GET_OBJECT_TYPE(obj_desc) != ACPI_TYPE_INTEGER)) {
> + sub_object_list[3] = sub_object_list[2];
> + sub_object_list[2] = obj_desc;
> +
> + ACPI_WARNING((AE_INFO,
> + "(PRT[%X].Source) SourceName and SourceIndex are reversed,
> fixed", + index));
> + }
> +
> + /*
> * 3) Third subobject: Dereference the PRT.source_name
> * The name may be unresolved (slack mode), so allow a null object
> */


2008-07-08 13:39:29

by Pavel Machek

[permalink] [raw]
Subject: Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS

Hi!

> > Some BIOSs erroneously reverse the _PRT SourceName and the
> > SourceIndex. Detect and repair this problem. MS ACPI also allows
> > and repairs this problem, thus ACPICA must also.
>
> It would be great to have an interface to report this as a BIOS defect.
>
> Something like:
>
> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT source_name", ACPI_
> Bug);
>
> FIRMWARE_BUG_ON(severity, description, component);

Yes, please.

I'd also like HARDWARE_BUG_ON(), with similar usage.

With all the preload-linux-on-foo project, we have some chance to make
BIOS vendors fix their stuff if we can easily diagnose errors in
there.

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-07-08 15:40:30

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS

Pavel Machek wrote:
> Hi!
>
>>> Some BIOSs erroneously reverse the _PRT SourceName and the
>>> SourceIndex. Detect and repair this problem. MS ACPI also allows
>>> and repairs this problem, thus ACPICA must also.
>> It would be great to have an interface to report this as a BIOS defect.
>>
>> Something like:
>>
>> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT source_name", ACPI_
>> Bug);
>>
>> FIRMWARE_BUG_ON(severity, description, component);
>
> Yes, please.
>
> I'd also like HARDWARE_BUG_ON(), with similar usage.
>
> With all the preload-linux-on-foo project, we have some chance to make
> BIOS vendors fix their stuff if we can easily diagnose errors in
> there.

also we can tie this to kerneloops.org to make a list of top known firmware/hardware bugs, seperate from the kernel code bugs
>

2008-07-08 19:35:11

by Brown, Len

[permalink] [raw]
Subject: RE: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS



>>>> Some BIOSs erroneously reverse the _PRT SourceName and the
>>>> SourceIndex. Detect and repair this problem. MS ACPI also allows
>>>> and repairs this problem, thus ACPICA must also.
>>> It would be great to have an interface to report this as a
>BIOS defect.
>>>
>>> Something like:
>>>
>>> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT
>source_name", ACPI_
>>> Bug);
>>>
>>> FIRMWARE_BUG_ON(severity, description, component);
>>
>> Yes, please.

I'm not excited about maintaining
maintaining linux-as-a-firmware-diagnostic --
particularly when...

1. it clutters the code for normail machines
2. finding the bug is pointless, because even
if you fix one machine, you are guaranteed to
not fix all machines and thus must maintain
the workaround anyway.

>> I'd also like HARDWARE_BUG_ON(), with similar usage.
>>
>> With all the preload-linux-on-foo project, we have some
>> chance to make
>> BIOS vendors fix their stuff if we can easily diagnose errors in
>> there.

These customers should be running
http://linuxfirmwarekit.org/

We do maintain some degree of "high-road ACPI spec checking"
with the "acpi=strict" boot option. If we do more of this,
I think it should stay under that option.

thanks,
-Len

2008-07-08 22:10:38

by Pavel Machek

[permalink] [raw]
Subject: Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS

Hi!

> >>> FIRMWARE_BUG_ON(severity, description, component);
> >>
> >> Yes, please.
>
> I'm not excited about maintaining
> maintaining linux-as-a-firmware-diagnostic --
> particularly when...
>
> 1. it clutters the code for normail machines
> 2. finding the bug is pointless, because even
> if you fix one machine, you are guaranteed to
> not fix all machines and thus must maintain
> the workaround anyway.

Well, at least we can make sure new machines are okay.

Plus, it is nice to know how common hw/BIOS problems are.

> >> I'd also like HARDWARE_BUG_ON(), with similar usage.
> >>
> >> With all the preload-linux-on-foo project, we have some
> >> chance to make
> >> BIOS vendors fix their stuff if we can easily diagnose errors in
> >> there.
>
> These customers should be running
> http://linuxfirmwarekit.org/
>
> We do maintain some degree of "high-road ACPI spec checking"
> with the "acpi=strict" boot option. If we do more of this,
> I think it should stay under that option.

That's okay with me, but it would be nice to have printk() markup, so
that we can tell BIOS/hw bugs from normal kernel messages.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-07-09 08:52:17

by Thomas Renninger

[permalink] [raw]
Subject: Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS

On Tuesday 08 July 2008 21:34:55 Brown, Len wrote:
> >>>> Some BIOSs erroneously reverse the _PRT SourceName and the
> >>>> SourceIndex. Detect and repair this problem. MS ACPI also allows
> >>>> and repairs this problem, thus ACPICA must also.
> >>>
> >>> It would be great to have an interface to report this as a
> >
> >BIOS defect.
> >
> >>> Something like:
> >>>
> >>> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT
> >
> >source_name", ACPI_
> >
> >>> Bug);
> >>>
> >>> FIRMWARE_BUG_ON(severity, description, component);
> >>
> >> Yes, please.
>
> I'm not excited about maintaining
> maintaining linux-as-a-firmware-diagnostic --
> particularly when...
Above is not for ACPI only. But ACPI is probably a candidate which should make
use of it.
>
> 1. it clutters the code for normail machines
It is the only disadvantage, "cluttering" the code.
On the other hand, this is an advantage.
While comments like "/* this must not happen, out of spec, etc. */"
should normally be added,
kernel developers tend to return -ENODEV and that's it.
A FIRMWARE_BUG("Description") would be compiled out on a
production kernel, compiled in for a kernel on which linuxfirmwarekit
boots or other debug kernels and would nicely document,
not clutter the code.

> 2. finding the bug is pointless,
No.
> because even
> if you fix one machine, you are guaranteed to
> not fix all machines and thus must maintain
> the workaround anyway.
I expect you want HP to fix their critical shut down temperature of 256 C?
If we would have known this, we had made them fix this and a lot other things
also.
This is one of dozens of examples...

> >> I'd also like HARDWARE_BUG_ON(), with similar usage.
> >>
> >> With all the preload-linux-on-foo project, we have some
> >> chance to make
> >> BIOS vendors fix their stuff if we can easily diagnose errors in
> >> there.
>
> These customers should be running
> http://linuxfirmwarekit.org/
It is not worth much.
At least currently and a firmware bug interface to the kernel is about to
change this.

1) linuxfirmwarekit is parsing dmesg..., these messages are changing
all the time in the kernel.
S390 AFAIK currently indexes all printks to avoid a simple space cleanup
to mix up messges. It is not needed to go that far...
2) The rest is done through userspace plugins written in shell/perl or C.
But most firmware bugs are absorbed by the kernel already. So there
is nothing we could tell OEMs they should correct.
3) 2+3 is unmaintainable. There is enough work in the kernel, now we should
write a linuxfirmwarekit plugin for every firmwarebug we find?!?
4) I tried to convince our kernel people to write linuxfirmwarekit plugins
with exactly zero response. Hmm wrong, Frank wrote a thermal zone
checker, some hundred lines of fragile shell code, it would have been less
than ten lines in the kernel, just add some FIRMWARE_BUG(..) in the right
if/else conditions. And it's not worth much anymore as soon as
thermal_zone is moving to sysfs.

> We do maintain some degree of "high-road ACPI spec checking"
> with the "acpi=strict" boot option. If we do more of this,
> I think it should stay under that option.
In the linuxfirmwarekit the ACPI tables are dumped, disassembled and
recompiled. The errors and warnings are printed out as ACPI BIOS errors.
This works out.
The iasl compiler still is a bit too noisy and cannot disassemble SSDTs
correctly (maybe it got fixed, don't know), but this can work out.

The problem is that you only find syntactical bugs.
But there are *a lot* bugs like e.g. critical temperature of 256 C. Or If this
object exists, that object must exist. Logic ACPI BIOS bugs can only be found
in the kernel itself at runtime.
Ahh Bjorn is in CC..., Wrong opregion declarations and PNP resource
declarations are rather common. I mean theoretically you can find this in
userspace. Not with disassembling and recompiling but with interpreting the
tables in userspace... Again unmaintainable and who should do the checking?
The guy who writes the kernel code...

I really like the idea of the linuxfirmwarekit and tried to push it, but I do
not see any future for it without kernel support.

Thomas

2008-07-17 20:31:41

by Brown, Len

[permalink] [raw]
Subject: RE: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS


>> I'm not excited about maintaining
>> maintaining linux-as-a-firmware-diagnostic --
>> particularly when...

>Above is not for ACPI only. But ACPI is probably a candidate
>which should make use of it.

Sure, there may be justification for doing something like this
in the kernel, but the issue that started this thread isn't it.

The issue that started this thread can be diagnosed by user-space
in linuxfirmwarekit. Multiple machines have this bug, which means
that it is "common industry practice" and the kernel has to work
around it (silently) if we like it or not.
ie. the issue, now that it is debugged and worked around,
is effectly just firmware lint.

thanks,
-Len