2016-04-27 12:50:23

by Josh Boyer

[permalink] [raw]
Subject: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

The promise of pretty boot splashes from firmware via BGRT was at
best only that; a promise. The kernel diligently checks to make
sure the BGRT data firmware gives it is valid, and dutifully warns
the user when it isn't. However, it does so via the pr_err log
level which seems unnecessary. The user cannot do anything about
this and there really isn't an error on the part of Linux to
correct.

This lowers the log level by using pr_debug instead. Users will
no longer have their boot process uglified by the kernel reminding
us that firmware can and often is broken. Ironic, considering
BGRT is supposed to make boot pretty to begin with.

Signed-off-by: Josh Boyer <[email protected]>
---
arch/x86/platform/efi/efi-bgrt.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
index a2433817c987..6f70d2ac8029 100644
--- a/arch/x86/platform/efi/efi-bgrt.c
+++ b/arch/x86/platform/efi/efi-bgrt.c
@@ -43,40 +43,40 @@ void __init efi_bgrt_init(void)
return;

if (bgrt_tab->header.length < sizeof(*bgrt_tab)) {
- pr_err("Ignoring BGRT: invalid length %u (expected %zu)\n",
+ pr_debug("Ignoring BGRT: invalid length %u (expected %zu)\n",
bgrt_tab->header.length, sizeof(*bgrt_tab));
return;
}
if (bgrt_tab->version != 1) {
- pr_err("Ignoring BGRT: invalid version %u (expected 1)\n",
+ pr_debug("Ignoring BGRT: invalid version %u (expected 1)\n",
bgrt_tab->version);
return;
}
if (bgrt_tab->status & 0xfe) {
- pr_err("Ignoring BGRT: reserved status bits are non-zero %u\n",
+ pr_debug("Ignoring BGRT: reserved status bits are non-zero %u\n",
bgrt_tab->status);
return;
}
if (bgrt_tab->image_type != 0) {
- pr_err("Ignoring BGRT: invalid image type %u (expected 0)\n",
+ pr_debug("Ignoring BGRT: invalid image type %u (expected 0)\n",
bgrt_tab->image_type);
return;
}
if (!bgrt_tab->image_address) {
- pr_err("Ignoring BGRT: null image address\n");
+ pr_debug("Ignoring BGRT: null image address\n");
return;
}

image = memremap(bgrt_tab->image_address, sizeof(bmp_header), MEMREMAP_WB);
if (!image) {
- pr_err("Ignoring BGRT: failed to map image header memory\n");
+ pr_debug("Ignoring BGRT: failed to map image header memory\n");
return;
}

memcpy(&bmp_header, image, sizeof(bmp_header));
memunmap(image);
if (bmp_header.id != 0x4d42) {
- pr_err("Ignoring BGRT: Incorrect BMP magic number 0x%x (expected 0x4d42)\n",
+ pr_debug("Ignoring BGRT: Incorrect BMP magic number 0x%x (expected 0x4d42)\n",
bmp_header.id);
return;
}
@@ -84,14 +84,14 @@ void __init efi_bgrt_init(void)

bgrt_image = kmalloc(bgrt_image_size, GFP_KERNEL | __GFP_NOWARN);
if (!bgrt_image) {
- pr_err("Ignoring BGRT: failed to allocate memory for image (wanted %zu bytes)\n",
+ pr_debug("Ignoring BGRT: failed to allocate memory for image (wanted %zu bytes)\n",
bgrt_image_size);
return;
}

image = memremap(bgrt_tab->image_address, bmp_header.size, MEMREMAP_WB);
if (!image) {
- pr_err("Ignoring BGRT: failed to map image memory\n");
+ pr_debug("Ignoring BGRT: failed to map image memory\n");
kfree(bgrt_image);
bgrt_image = NULL;
return;
--
2.5.5


2016-04-27 13:36:36

by Môshe van der Sterre

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

(additionally CC-ing Josh Triplett)

On 04/27/2016 02:50 PM, Josh Boyer wrote:
> The promise of pretty boot splashes from firmware via BGRT was at
> best only that; a promise. The kernel diligently checks to make
> sure the BGRT data firmware gives it is valid, and dutifully warns
> the user when it isn't. However, it does so via the pr_err log
> level which seems unnecessary. The user cannot do anything about
> this and there really isn't an error on the part of Linux to
> correct.
>
> This lowers the log level by using pr_debug instead. Users will
> no longer have their boot process uglified by the kernel reminding
> us that firmware can and often is broken. Ironic, considering
> BGRT is supposed to make boot pretty to begin with.
Hi Josh Boyer,

Are you seeing these errors somewhere? I recently fixed the error
"Ignoring BGRT: invalid status 0 (expected 1)" because Linux apparently
interpreted that part of the specification differently than others.
If that's the error you are seeing, perhaps your problem is already
solved in recent kernels? (fixed in commit 66dbe99)

Personally I agree that BGRT messages should not annoy actual users of
production firmwares.
However I also agree with the previous consensus that these checks (for
actual spec violations) should remain pr_err unless some production
firmware is triggering them. What do you think?

Greetings,
M?she van der Sterre

> Signed-off-by: Josh Boyer <[email protected]>
> ---
> arch/x86/platform/efi/efi-bgrt.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
> index a2433817c987..6f70d2ac8029 100644
> --- a/arch/x86/platform/efi/efi-bgrt.c
> +++ b/arch/x86/platform/efi/efi-bgrt.c
> @@ -43,40 +43,40 @@ void __init efi_bgrt_init(void)
> return;
>
> if (bgrt_tab->header.length < sizeof(*bgrt_tab)) {
> - pr_err("Ignoring BGRT: invalid length %u (expected %zu)\n",
> + pr_debug("Ignoring BGRT: invalid length %u (expected %zu)\n",
> bgrt_tab->header.length, sizeof(*bgrt_tab));
> return;
> }
> if (bgrt_tab->version != 1) {
> - pr_err("Ignoring BGRT: invalid version %u (expected 1)\n",
> + pr_debug("Ignoring BGRT: invalid version %u (expected 1)\n",
> bgrt_tab->version);
> return;
> }
> if (bgrt_tab->status & 0xfe) {
> - pr_err("Ignoring BGRT: reserved status bits are non-zero %u\n",
> + pr_debug("Ignoring BGRT: reserved status bits are non-zero %u\n",
> bgrt_tab->status);
> return;
> }
> if (bgrt_tab->image_type != 0) {
> - pr_err("Ignoring BGRT: invalid image type %u (expected 0)\n",
> + pr_debug("Ignoring BGRT: invalid image type %u (expected 0)\n",
> bgrt_tab->image_type);
> return;
> }
> if (!bgrt_tab->image_address) {
> - pr_err("Ignoring BGRT: null image address\n");
> + pr_debug("Ignoring BGRT: null image address\n");
> return;
> }
>
> image = memremap(bgrt_tab->image_address, sizeof(bmp_header), MEMREMAP_WB);
> if (!image) {
> - pr_err("Ignoring BGRT: failed to map image header memory\n");
> + pr_debug("Ignoring BGRT: failed to map image header memory\n");
> return;
> }
>
> memcpy(&bmp_header, image, sizeof(bmp_header));
> memunmap(image);
> if (bmp_header.id != 0x4d42) {
> - pr_err("Ignoring BGRT: Incorrect BMP magic number 0x%x (expected 0x4d42)\n",
> + pr_debug("Ignoring BGRT: Incorrect BMP magic number 0x%x (expected 0x4d42)\n",
> bmp_header.id);
> return;
> }
> @@ -84,14 +84,14 @@ void __init efi_bgrt_init(void)
>
> bgrt_image = kmalloc(bgrt_image_size, GFP_KERNEL | __GFP_NOWARN);
> if (!bgrt_image) {
> - pr_err("Ignoring BGRT: failed to allocate memory for image (wanted %zu bytes)\n",
> + pr_debug("Ignoring BGRT: failed to allocate memory for image (wanted %zu bytes)\n",
> bgrt_image_size);
> return;
> }
>
> image = memremap(bgrt_tab->image_address, bmp_header.size, MEMREMAP_WB);
> if (!image) {
> - pr_err("Ignoring BGRT: failed to map image memory\n");
> + pr_debug("Ignoring BGRT: failed to map image memory\n");
> kfree(bgrt_image);
> bgrt_image = NULL;
> return;

2016-04-27 13:57:03

by Josh Boyer

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

On Wed, Apr 27, 2016 at 9:26 AM, Môshe van der Sterre <[email protected]> wrote:
> (additionally CC-ing Josh Triplett)

Thanks for doing so. I completely forgot.

> On 04/27/2016 02:50 PM, Josh Boyer wrote:
>>
>> The promise of pretty boot splashes from firmware via BGRT was at
>> best only that; a promise. The kernel diligently checks to make
>> sure the BGRT data firmware gives it is valid, and dutifully warns
>> the user when it isn't. However, it does so via the pr_err log
>> level which seems unnecessary. The user cannot do anything about
>> this and there really isn't an error on the part of Linux to
>> correct.
>>
>> This lowers the log level by using pr_debug instead. Users will
>> no longer have their boot process uglified by the kernel reminding
>> us that firmware can and often is broken. Ironic, considering
>> BGRT is supposed to make boot pretty to begin with.
>
> Hi Josh Boyer,
>
> Are you seeing these errors somewhere? I recently fixed the error "Ignoring

We have a user that reports seeing:

"Ignoring BGRT: Invalid version 0 (expected 1)"

on a Lenovo T430 machine. We've had a few other scattered reports on
various machine types since BGRT went into the kernel as well.

> BGRT: invalid status 0 (expected 1)" because Linux apparently interpreted
> that part of the specification differently than others.
> If that's the error you are seeing, perhaps your problem is already solved
> in recent kernels? (fixed in commit 66dbe99)
>
> Personally I agree that BGRT messages should not annoy actual users of
> production firmwares.
> However I also agree with the previous consensus that these checks (for
> actual spec violations) should remain pr_err unless some production firmware
> is triggering them. What do you think?

Production firmware is literally the only firmware end users will ever
see. I don't see much point in leaving scary error messages in the
kernel to complain about things the user has no chance of fixing or in
almost all cases even reporting to people who could fix it.

josh

>> Signed-off-by: Josh Boyer <[email protected]>
>> ---
>> arch/x86/platform/efi/efi-bgrt.c | 18 +++++++++---------
>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/platform/efi/efi-bgrt.c
>> b/arch/x86/platform/efi/efi-bgrt.c
>> index a2433817c987..6f70d2ac8029 100644
>> --- a/arch/x86/platform/efi/efi-bgrt.c
>> +++ b/arch/x86/platform/efi/efi-bgrt.c
>> @@ -43,40 +43,40 @@ void __init efi_bgrt_init(void)
>> return;
>> if (bgrt_tab->header.length < sizeof(*bgrt_tab)) {
>> - pr_err("Ignoring BGRT: invalid length %u (expected
>> %zu)\n",
>> + pr_debug("Ignoring BGRT: invalid length %u (expected
>> %zu)\n",
>> bgrt_tab->header.length, sizeof(*bgrt_tab));
>> return;
>> }
>> if (bgrt_tab->version != 1) {
>> - pr_err("Ignoring BGRT: invalid version %u (expected 1)\n",
>> + pr_debug("Ignoring BGRT: invalid version %u (expected
>> 1)\n",
>> bgrt_tab->version);
>> return;
>> }
>> if (bgrt_tab->status & 0xfe) {
>> - pr_err("Ignoring BGRT: reserved status bits are non-zero
>> %u\n",
>> + pr_debug("Ignoring BGRT: reserved status bits are non-zero
>> %u\n",
>> bgrt_tab->status);
>> return;
>> }
>> if (bgrt_tab->image_type != 0) {
>> - pr_err("Ignoring BGRT: invalid image type %u (expected
>> 0)\n",
>> + pr_debug("Ignoring BGRT: invalid image type %u (expected
>> 0)\n",
>> bgrt_tab->image_type);
>> return;
>> }
>> if (!bgrt_tab->image_address) {
>> - pr_err("Ignoring BGRT: null image address\n");
>> + pr_debug("Ignoring BGRT: null image address\n");
>> return;
>> }
>> image = memremap(bgrt_tab->image_address, sizeof(bmp_header),
>> MEMREMAP_WB);
>> if (!image) {
>> - pr_err("Ignoring BGRT: failed to map image header
>> memory\n");
>> + pr_debug("Ignoring BGRT: failed to map image header
>> memory\n");
>> return;
>> }
>> memcpy(&bmp_header, image, sizeof(bmp_header));
>> memunmap(image);
>> if (bmp_header.id != 0x4d42) {
>> - pr_err("Ignoring BGRT: Incorrect BMP magic number 0x%x
>> (expected 0x4d42)\n",
>> + pr_debug("Ignoring BGRT: Incorrect BMP magic number 0x%x
>> (expected 0x4d42)\n",
>> bmp_header.id);
>> return;
>> }
>> @@ -84,14 +84,14 @@ void __init efi_bgrt_init(void)
>> bgrt_image = kmalloc(bgrt_image_size, GFP_KERNEL | __GFP_NOWARN);
>> if (!bgrt_image) {
>> - pr_err("Ignoring BGRT: failed to allocate memory for image
>> (wanted %zu bytes)\n",
>> + pr_debug("Ignoring BGRT: failed to allocate memory for
>> image (wanted %zu bytes)\n",
>> bgrt_image_size);
>> return;
>> }
>> image = memremap(bgrt_tab->image_address, bmp_header.size,
>> MEMREMAP_WB);
>> if (!image) {
>> - pr_err("Ignoring BGRT: failed to map image memory\n");
>> + pr_debug("Ignoring BGRT: failed to map image memory\n");
>> kfree(bgrt_image);
>> bgrt_image = NULL;
>> return;
>
>

2016-04-27 14:57:51

by Môshe van der Sterre

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT


On 04/27/2016 03:56 PM, Josh Boyer wrote:
> On Wed, Apr 27, 2016 at 9:26 AM, Môshe van der Sterre <[email protected]> wrote:
>> (additionally CC-ing Josh Triplett)
> Thanks for doing so. I completely forgot.
>
>> On 04/27/2016 02:50 PM, Josh Boyer wrote:
>>> The promise of pretty boot splashes from firmware via BGRT was at
>>> best only that; a promise. The kernel diligently checks to make
>>> sure the BGRT data firmware gives it is valid, and dutifully warns
>>> the user when it isn't. However, it does so via the pr_err log
>>> level which seems unnecessary. The user cannot do anything about
>>> this and there really isn't an error on the part of Linux to
>>> correct.
>>>
>>> This lowers the log level by using pr_debug instead. Users will
>>> no longer have their boot process uglified by the kernel reminding
>>> us that firmware can and often is broken. Ironic, considering
>>> BGRT is supposed to make boot pretty to begin with.
>> Hi Josh Boyer,
>>
>> Are you seeing these errors somewhere? I recently fixed the error "Ignoring
> We have a user that reports seeing:
>
> "Ignoring BGRT: Invalid version 0 (expected 1)"
>
> on a Lenovo T430 machine. We've had a few other scattered reports on
> various machine types since BGRT went into the kernel as well.
Ok. With this information, I think pr_debug is indeed better.
>> BGRT: invalid status 0 (expected 1)" because Linux apparently interpreted
>> that part of the specification differently than others.
>> If that's the error you are seeing, perhaps your problem is already solved
>> in recent kernels? (fixed in commit 66dbe99)
>>
>> Personally I agree that BGRT messages should not annoy actual users of
>> production firmwares.
>> However I also agree with the previous consensus that these checks (for
>> actual spec violations) should remain pr_err unless some production firmware
>> is triggering them. What do you think?
> Production firmware is literally the only firmware end users will ever
> see. I don't see much point in leaving scary error messages in the
> kernel to complain about things the user has no chance of fixing or in
> almost all cases even reporting to people who could fix it.
In principle I can understand the wish to show big scary error messages
to firmware developers doing it wrong.

With that said:
The patch looks good to me, but Josh Triplett and Matt Fleming their
opinions might be better informed than mine.

> josh
>
>>> Signed-off-by: Josh Boyer <[email protected]>
>>> ---
>>> arch/x86/platform/efi/efi-bgrt.c | 18 +++++++++---------
>>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/arch/x86/platform/efi/efi-bgrt.c
>>> b/arch/x86/platform/efi/efi-bgrt.c
>>> index a2433817c987..6f70d2ac8029 100644
>>> --- a/arch/x86/platform/efi/efi-bgrt.c
>>> +++ b/arch/x86/platform/efi/efi-bgrt.c
>>> @@ -43,40 +43,40 @@ void __init efi_bgrt_init(void)
>>> return;
>>> if (bgrt_tab->header.length < sizeof(*bgrt_tab)) {
>>> - pr_err("Ignoring BGRT: invalid length %u (expected
>>> %zu)\n",
>>> + pr_debug("Ignoring BGRT: invalid length %u (expected
>>> %zu)\n",
>>> bgrt_tab->header.length, sizeof(*bgrt_tab));
>>> return;
>>> }
>>> if (bgrt_tab->version != 1) {
>>> - pr_err("Ignoring BGRT: invalid version %u (expected 1)\n",
>>> + pr_debug("Ignoring BGRT: invalid version %u (expected
>>> 1)\n",
>>> bgrt_tab->version);
>>> return;
>>> }
>>> if (bgrt_tab->status & 0xfe) {
>>> - pr_err("Ignoring BGRT: reserved status bits are non-zero
>>> %u\n",
>>> + pr_debug("Ignoring BGRT: reserved status bits are non-zero
>>> %u\n",
>>> bgrt_tab->status);
>>> return;
>>> }
>>> if (bgrt_tab->image_type != 0) {
>>> - pr_err("Ignoring BGRT: invalid image type %u (expected
>>> 0)\n",
>>> + pr_debug("Ignoring BGRT: invalid image type %u (expected
>>> 0)\n",
>>> bgrt_tab->image_type);
>>> return;
>>> }
>>> if (!bgrt_tab->image_address) {
>>> - pr_err("Ignoring BGRT: null image address\n");
>>> + pr_debug("Ignoring BGRT: null image address\n");
>>> return;
>>> }
>>> image = memremap(bgrt_tab->image_address, sizeof(bmp_header),
>>> MEMREMAP_WB);
>>> if (!image) {
>>> - pr_err("Ignoring BGRT: failed to map image header
>>> memory\n");
>>> + pr_debug("Ignoring BGRT: failed to map image header
>>> memory\n");
>>> return;
>>> }
>>> memcpy(&bmp_header, image, sizeof(bmp_header));
>>> memunmap(image);
>>> if (bmp_header.id != 0x4d42) {
>>> - pr_err("Ignoring BGRT: Incorrect BMP magic number 0x%x
>>> (expected 0x4d42)\n",
>>> + pr_debug("Ignoring BGRT: Incorrect BMP magic number 0x%x
>>> (expected 0x4d42)\n",
>>> bmp_header.id);
>>> return;
>>> }
>>> @@ -84,14 +84,14 @@ void __init efi_bgrt_init(void)
>>> bgrt_image = kmalloc(bgrt_image_size, GFP_KERNEL | __GFP_NOWARN);
>>> if (!bgrt_image) {
>>> - pr_err("Ignoring BGRT: failed to allocate memory for image
>>> (wanted %zu bytes)\n",
>>> + pr_debug("Ignoring BGRT: failed to allocate memory for
>>> image (wanted %zu bytes)\n",
>>> bgrt_image_size);
>>> return;
>>> }
>>> image = memremap(bgrt_tab->image_address, bmp_header.size,
>>> MEMREMAP_WB);
>>> if (!image) {
>>> - pr_err("Ignoring BGRT: failed to map image memory\n");
>>> + pr_debug("Ignoring BGRT: failed to map image memory\n");
>>> kfree(bgrt_image);
>>> bgrt_image = NULL;
>>> return;
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-efi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2016-04-27 15:20:32

by Josh Boyer

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

On Wed, Apr 27, 2016 at 10:57 AM, Môshe van der Sterre <[email protected]> wrote:
>
> On 04/27/2016 03:56 PM, Josh Boyer wrote:
>>
>> On Wed, Apr 27, 2016 at 9:26 AM, Môshe van der Sterre <[email protected]> wrote:
>>>
>>> (additionally CC-ing Josh Triplett)
>>
>> Thanks for doing so. I completely forgot.
>>
>>> On 04/27/2016 02:50 PM, Josh Boyer wrote:
>>>>
>>>> The promise of pretty boot splashes from firmware via BGRT was at
>>>> best only that; a promise. The kernel diligently checks to make
>>>> sure the BGRT data firmware gives it is valid, and dutifully warns
>>>> the user when it isn't. However, it does so via the pr_err log
>>>> level which seems unnecessary. The user cannot do anything about
>>>> this and there really isn't an error on the part of Linux to
>>>> correct.
>>>>
>>>> This lowers the log level by using pr_debug instead. Users will
>>>> no longer have their boot process uglified by the kernel reminding
>>>> us that firmware can and often is broken. Ironic, considering
>>>> BGRT is supposed to make boot pretty to begin with.
>>>
>>> Hi Josh Boyer,
>>>
>>> Are you seeing these errors somewhere? I recently fixed the error
>>> "Ignoring
>>
>> We have a user that reports seeing:
>>
>> "Ignoring BGRT: Invalid version 0 (expected 1)"
>>
>> on a Lenovo T430 machine. We've had a few other scattered reports on
>> various machine types since BGRT went into the kernel as well.
>
> Ok. With this information, I think pr_debug is indeed better.
>>>
>>> BGRT: invalid status 0 (expected 1)" because Linux apparently interpreted
>>> that part of the specification differently than others.
>>> If that's the error you are seeing, perhaps your problem is already
>>> solved
>>> in recent kernels? (fixed in commit 66dbe99)
>>>
>>> Personally I agree that BGRT messages should not annoy actual users of
>>> production firmwares.
>>> However I also agree with the previous consensus that these checks (for
>>> actual spec violations) should remain pr_err unless some production
>>> firmware
>>> is triggering them. What do you think?
>>
>> Production firmware is literally the only firmware end users will ever
>> see. I don't see much point in leaving scary error messages in the
>> kernel to complain about things the user has no chance of fixing or in
>> almost all cases even reporting to people who could fix it.
>
> In principle I can understand the wish to show big scary error messages to
> firmware developers doing it wrong.

Yes, that is theoretically possible. However, my best guess is that
firmware developers aren't typically testing with Linux distributions
during firmware development. They test with Windows, which is why we
see warnings in Linux that Windows doesn't show. By then the firmware
is in production and it is too late.

We see this in lots of areas, which is why we have weird quirks for
devices all over the kernel, but I don't think there's value in doing
quirk mechanisms around BGRT.

josh

2016-04-27 17:05:36

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

On Wed, Apr 27, 2016 at 11:20:26AM -0400, Josh Boyer wrote:
> On Wed, Apr 27, 2016 at 10:57 AM, M?she van der Sterre <[email protected]> wrote:
> >
> > On 04/27/2016 03:56 PM, Josh Boyer wrote:
> >>
> >> On Wed, Apr 27, 2016 at 9:26 AM, M?she van der Sterre <[email protected]> wrote:
> >>>
> >>> (additionally CC-ing Josh Triplett)
> >>
> >> Thanks for doing so. I completely forgot.
> >>
> >>> On 04/27/2016 02:50 PM, Josh Boyer wrote:
> >>>>
> >>>> The promise of pretty boot splashes from firmware via BGRT was at
> >>>> best only that; a promise. The kernel diligently checks to make
> >>>> sure the BGRT data firmware gives it is valid, and dutifully warns
> >>>> the user when it isn't. However, it does so via the pr_err log
> >>>> level which seems unnecessary. The user cannot do anything about
> >>>> this and there really isn't an error on the part of Linux to
> >>>> correct.
> >>>>
> >>>> This lowers the log level by using pr_debug instead. Users will
> >>>> no longer have their boot process uglified by the kernel reminding
> >>>> us that firmware can and often is broken. Ironic, considering
> >>>> BGRT is supposed to make boot pretty to begin with.
> >>>
> >>> Hi Josh Boyer,
> >>>
> >>> Are you seeing these errors somewhere? I recently fixed the error
> >>> "Ignoring
> >>
> >> We have a user that reports seeing:
> >>
> >> "Ignoring BGRT: Invalid version 0 (expected 1)"
> >>
> >> on a Lenovo T430 machine. We've had a few other scattered reports on
> >> various machine types since BGRT went into the kernel as well.
> >
> > Ok. With this information, I think pr_debug is indeed better.
> >>>
> >>> BGRT: invalid status 0 (expected 1)" because Linux apparently interpreted
> >>> that part of the specification differently than others.
> >>> If that's the error you are seeing, perhaps your problem is already
> >>> solved
> >>> in recent kernels? (fixed in commit 66dbe99)
> >>>
> >>> Personally I agree that BGRT messages should not annoy actual users of
> >>> production firmwares.
> >>> However I also agree with the previous consensus that these checks (for
> >>> actual spec violations) should remain pr_err unless some production
> >>> firmware
> >>> is triggering them. What do you think?
> >>
> >> Production firmware is literally the only firmware end users will ever
> >> see. I don't see much point in leaving scary error messages in the
> >> kernel to complain about things the user has no chance of fixing or in
> >> almost all cases even reporting to people who could fix it.
> >
> > In principle I can understand the wish to show big scary error messages to
> > firmware developers doing it wrong.
>
> Yes, that is theoretically possible. However, my best guess is that
> firmware developers aren't typically testing with Linux distributions
> during firmware development.

Speaking from experience, firmware developers absolutely do test with
Linux distributions these days.

> We see this in lots of areas, which is why we have weird quirks for
> devices all over the kernel, but I don't think there's value in doing
> quirk mechanisms around BGRT.

I do; I think it makes sense to flag these issues, and making them
pr_debug means they *will* be missed on pre-production devices. If you
want to downgrade them to pr_warn, I don't have any objection there, but
they shouldn't be any lower than that.

I'd also suggest adding FW_BUG to them. (And if you want to implement a
mechanism to help end users downgrade the priority of FW_BUG messages,
such as if you already have automated reporting of such issues, feel
free; however, in the absence of such automated reporting, this hides
real problems and makes it less likely that such issues will be caught
and fixed.)

This seems consistent with how the rest of the kernel handles firmware
bugs:

~/src/linux$ git grep -h FW_BUG | grep -Eo 'pr_[a-z]*' | sort | uniq -c | sort -rn
22 pr_err
13 pr_warn
8 pr_warning
2 pr_info
1 pr_debug

2016-04-27 17:23:59

by Josh Boyer

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

On Wed, Apr 27, 2016 at 1:05 PM, Josh Triplett <[email protected]> wrote:
> On Wed, Apr 27, 2016 at 11:20:26AM -0400, Josh Boyer wrote:
>> On Wed, Apr 27, 2016 at 10:57 AM, Môshe van der Sterre <[email protected]> wrote:
>> >
>> > On 04/27/2016 03:56 PM, Josh Boyer wrote:
>> >>
>> >> On Wed, Apr 27, 2016 at 9:26 AM, Môshe van der Sterre <[email protected]> wrote:
>> >>>
>> >>> (additionally CC-ing Josh Triplett)
>> >>
>> >> Thanks for doing so. I completely forgot.
>> >>
>> >>> On 04/27/2016 02:50 PM, Josh Boyer wrote:
>> >>>>
>> >>>> The promise of pretty boot splashes from firmware via BGRT was at
>> >>>> best only that; a promise. The kernel diligently checks to make
>> >>>> sure the BGRT data firmware gives it is valid, and dutifully warns
>> >>>> the user when it isn't. However, it does so via the pr_err log
>> >>>> level which seems unnecessary. The user cannot do anything about
>> >>>> this and there really isn't an error on the part of Linux to
>> >>>> correct.
>> >>>>
>> >>>> This lowers the log level by using pr_debug instead. Users will
>> >>>> no longer have their boot process uglified by the kernel reminding
>> >>>> us that firmware can and often is broken. Ironic, considering
>> >>>> BGRT is supposed to make boot pretty to begin with.
>> >>>
>> >>> Hi Josh Boyer,
>> >>>
>> >>> Are you seeing these errors somewhere? I recently fixed the error
>> >>> "Ignoring
>> >>
>> >> We have a user that reports seeing:
>> >>
>> >> "Ignoring BGRT: Invalid version 0 (expected 1)"
>> >>
>> >> on a Lenovo T430 machine. We've had a few other scattered reports on
>> >> various machine types since BGRT went into the kernel as well.
>> >
>> > Ok. With this information, I think pr_debug is indeed better.
>> >>>
>> >>> BGRT: invalid status 0 (expected 1)" because Linux apparently interpreted
>> >>> that part of the specification differently than others.
>> >>> If that's the error you are seeing, perhaps your problem is already
>> >>> solved
>> >>> in recent kernels? (fixed in commit 66dbe99)
>> >>>
>> >>> Personally I agree that BGRT messages should not annoy actual users of
>> >>> production firmwares.
>> >>> However I also agree with the previous consensus that these checks (for
>> >>> actual spec violations) should remain pr_err unless some production
>> >>> firmware
>> >>> is triggering them. What do you think?
>> >>
>> >> Production firmware is literally the only firmware end users will ever
>> >> see. I don't see much point in leaving scary error messages in the
>> >> kernel to complain about things the user has no chance of fixing or in
>> >> almost all cases even reporting to people who could fix it.
>> >
>> > In principle I can understand the wish to show big scary error messages to
>> > firmware developers doing it wrong.
>>
>> Yes, that is theoretically possible. However, my best guess is that
>> firmware developers aren't typically testing with Linux distributions
>> during firmware development.
>
> Speaking from experience, firmware developers absolutely do test with
> Linux distributions these days.

Clearly not all and not enough.

>> We see this in lots of areas, which is why we have weird quirks for
>> devices all over the kernel, but I don't think there's value in doing
>> quirk mechanisms around BGRT.
>
> I do; I think it makes sense to flag these issues, and making them
> pr_debug means they *will* be missed on pre-production devices. If you
> want to downgrade them to pr_warn, I don't have any objection there, but
> they shouldn't be any lower than that.

pr_warn still shows up on the console for most distros, which then
runs into the problem described in the commit log in the patch.

> I'd also suggest adding FW_BUG to them. (And if you want to implement a
> mechanism to help end users downgrade the priority of FW_BUG messages,
> such as if you already have automated reporting of such issues, feel
> free; however, in the absence of such automated reporting, this hides
> real problems and makes it less likely that such issues will be caught
> and fixed.)

How is an end user supposed to see such a message and report it to the
people that can fix it? They can't. So they report it in their
distributions bug tracker and it either gets closed as "yeah, firmware
sucks" or it sits there and rots in the hope that some day someone
will do something.

I understand where you're coming from in a pre-production, development
environment but to be quite clear that is not the default environment
Linux is run in most of the time. If this were a kernel warning, that
could be fixed with a kernel patch, then maybe it would be worth it.
It isn't though.

> This seems consistent with how the rest of the kernel handles firmware
> bugs:

Well, to be honest I think those are all wrong too. There's no
recourse for the user to report them to the firmware developers and no
incentive for the firmware developers to fix them once the firmware is
shipped. Either the kernel can do something about it and work around
the firmware issue (most likely already done before the warning spew),
it cannot but it doesn't matter, or it cannot and it panics. The only
situation where added FW_BUG or pr_warn messages actually help
anything is the panic case.

josh

2016-04-30 22:35:20

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

(Adding Colin and Ricardo)

On Wed, 27 Apr, at 01:23:55PM, Josh Boyer wrote:
>
> How is an end user supposed to see such a message and report it to the
> people that can fix it? They can't. So they report it in their
> distributions bug tracker and it either gets closed as "yeah, firmware
> sucks" or it sits there and rots in the hope that some day someone
> will do something.
>
> I understand where you're coming from in a pre-production, development
> environment but to be quite clear that is not the default environment
> Linux is run in most of the time. If this were a kernel warning, that
> could be fixed with a kernel patch, then maybe it would be worth it.
> It isn't though.

If the error messages in the BGRT driver make it impossible for end
users to achieve a pretty boot experience then I agree, that is a
kernel bug. BGRT is an exception to the usual rule about complaining
loudly when we encounter firmware bugs simply because we're dealing
with UIs in this case.

That's not to say we should give up reporting these kinds of invalid
table issues to firmware developers altogether. There are other means
of doing it, and comprising the wants of many end users for the
benefit of few firmware developers (relatively) is just not sensible.

Colin, Ricardo, I haven't checked recently, are there ACPI BGRT
validations tests in FWTS and LUV? Josh (Triplett), BITS would seem
like a very good place to include these tests since it already has a
bunch of ACPI table checks.

2016-04-30 22:42:47

by Colin King

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

On 30/04/16 23:35, Matt Fleming wrote:
> (Adding Colin and Ricardo)
>
> On Wed, 27 Apr, at 01:23:55PM, Josh Boyer wrote:
>>
>> How is an end user supposed to see such a message and report it to the
>> people that can fix it? They can't. So they report it in their
>> distributions bug tracker and it either gets closed as "yeah, firmware
>> sucks" or it sits there and rots in the hope that some day someone
>> will do something.
>>
>> I understand where you're coming from in a pre-production, development
>> environment but to be quite clear that is not the default environment
>> Linux is run in most of the time. If this were a kernel warning, that
>> could be fixed with a kernel patch, then maybe it would be worth it.
>> It isn't though.
>
> If the error messages in the BGRT driver make it impossible for end
> users to achieve a pretty boot experience then I agree, that is a
> kernel bug. BGRT is an exception to the usual rule about complaining
> loudly when we encounter firmware bugs simply because we're dealing
> with UIs in this case.
>
> That's not to say we should give up reporting these kinds of invalid
> table issues to firmware developers altogether. There are other means
> of doing it, and comprising the wants of many end users for the
> benefit of few firmware developers (relatively) is just not sensible.
>
> Colin, Ricardo, I haven't checked recently, are there ACPI BGRT
> validations tests in FWTS and LUV? Josh (Triplett), BITS would seem
> like a very good place to include these tests since it already has a
> bunch of ACPI table checks.
>
fwts does have a BGRT test, although it is fairly trivial:

http://kernel.ubuntu.com/git/hwe/fwts.git/tree/src/acpi/bgrt/bgrt.c

Colin

2016-04-30 23:13:52

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT

On Sat, Apr 30, 2016 at 11:35:14PM +0100, Matt Fleming wrote:
> (Adding Colin and Ricardo)
>
> On Wed, 27 Apr, at 01:23:55PM, Josh Boyer wrote:
> >
> > How is an end user supposed to see such a message and report it to the
> > people that can fix it? They can't. So they report it in their
> > distributions bug tracker and it either gets closed as "yeah, firmware
> > sucks" or it sits there and rots in the hope that some day someone
> > will do something.
> >
> > I understand where you're coming from in a pre-production, development
> > environment but to be quite clear that is not the default environment
> > Linux is run in most of the time. If this were a kernel warning, that
> > could be fixed with a kernel patch, then maybe it would be worth it.
> > It isn't though.
>
> If the error messages in the BGRT driver make it impossible for end
> users to achieve a pretty boot experience then I agree, that is a
> kernel bug. BGRT is an exception to the usual rule about complaining
> loudly when we encounter firmware bugs simply because we're dealing
> with UIs in this case.

Fine. What's the highest priority message that will *not* cause splash
screens to go into text mode? With the default boot argument of
"quiet", pr_notice or pr_info should still remain hidden, right? So,
could we make these pr_notice, rather than pr_debug? That way they'll
at least show up in logs, even though they don't show up on the console.

> That's not to say we should give up reporting these kinds of invalid
> table issues to firmware developers altogether. There are other means
> of doing it, and comprising the wants of many end users for the
> benefit of few firmware developers (relatively) is just not sensible.
>
> Colin, Ricardo, I haven't checked recently, are there ACPI BGRT
> validations tests in FWTS and LUV? Josh (Triplett), BITS would seem
> like a very good place to include these tests since it already has a
> bunch of ACPI table checks.

BITS doesn't, but should; I've added it to the TODO list.

- Josh Triplett