Date: Wed, 27 Apr 2016 13:23:55 -0400
Subject: Re: [PATCH] x86/efi-bgrt: Switch all pr_err() to pr_debug() for invalid BGRT
From: Josh Boyer
To: Josh Triplett
Cc: Môshe van der Sterre, Matt Fleming, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org

On Wed, Apr 27, 2016 at 1:05 PM, Josh Triplett wrote:
> On Wed, Apr 27, 2016 at 11:20:26AM -0400, Josh Boyer wrote:
>> On Wed, Apr 27, 2016 at 10:57 AM, Môshe van der Sterre wrote:
>> >
>> > On 04/27/2016 03:56 PM, Josh Boyer wrote:
>> >>
>> >> On Wed, Apr 27, 2016 at 9:26 AM, Môshe van der Sterre wrote:
>> >>>
>> >>> (additionally CC-ing Josh Triplett)
>> >>
>> >> Thanks for doing so. I completely forgot.
>> >>
>> >>> On 04/27/2016 02:50 PM, Josh Boyer wrote:
>> >>>>
>> >>>> The promise of pretty boot splashes from firmware via BGRT was at
>> >>>> best only that; a promise. The kernel diligently checks to make
>> >>>> sure the BGRT data firmware gives it is valid, and dutifully warns
>> >>>> the user when it isn't. However, it does so via the pr_err log
>> >>>> level which seems unnecessary.
>> >>>> The user cannot do anything about
>> >>>> this and there really isn't an error on the part of Linux to
>> >>>> correct.
>> >>>>
>> >>>> This lowers the log level by using pr_debug instead. Users will
>> >>>> no longer have their boot process uglified by the kernel reminding
>> >>>> us that firmware can and often is broken. Ironic, considering
>> >>>> BGRT is supposed to make boot pretty to begin with.
>> >>>
>> >>> Hi Josh Boyer,
>> >>>
>> >>> Are you seeing these errors somewhere? I recently fixed the error
>> >>> "Ignoring
>> >>
>> >> We have a user that reports seeing:
>> >>
>> >> "Ignoring BGRT: Invalid version 0 (expected 1)"
>> >>
>> >> on a Lenovo T430 machine. We've had a few other scattered reports on
>> >> various machine types since BGRT went into the kernel as well.
>> >
>> > Ok. With this information, I think pr_debug is indeed better.
>> >>>
>> >>> BGRT: invalid status 0 (expected 1)" because Linux apparently interpreted
>> >>> that part of the specification differently than others.
>> >>> If that's the error you are seeing, perhaps your problem is already
>> >>> solved
>> >>> in recent kernels? (fixed in commit 66dbe99)
>> >>>
>> >>> Personally I agree that BGRT messages should not annoy actual users of
>> >>> production firmwares.
>> >>> However I also agree with the previous consensus that these checks (for
>> >>> actual spec violations) should remain pr_err unless some production
>> >>> firmware
>> >>> is triggering them. What do you think?
>> >>
>> >> Production firmware is literally the only firmware end users will ever
>> >> see. I don't see much point in leaving scary error messages in the
>> >> kernel to complain about things the user has no chance of fixing or in
>> >> almost all cases even reporting to people who could fix it.
>> >
>> > In principle I can understand the wish to show big scary error messages to
>> > firmware developers doing it wrong.
>>
>> Yes, that is theoretically possible.
>> However, my best guess is that
>> firmware developers aren't typically testing with Linux distributions
>> during firmware development.
>
> Speaking from experience, firmware developers absolutely do test with
> Linux distributions these days.

Clearly not all and not enough.

>> We see this in lots of areas, which is why we have weird quirks for
>> devices all over the kernel, but I don't think there's value in doing
>> quirk mechanisms around BGRT.
>
> I do; I think it makes sense to flag these issues, and making them
> pr_debug means they *will* be missed on pre-production devices. If you
> want to downgrade them to pr_warn, I don't have any objection there, but
> they shouldn't be any lower than that.

pr_warn still shows up on the console for most distros, which then
runs into the problem described in the commit log in the patch.

> I'd also suggest adding FW_BUG to them. (And if you want to implement a
> mechanism to help end users downgrade the priority of FW_BUG messages,
> such as if you already have automated reporting of such issues, feel
> free; however, in the absence of such automated reporting, this hides
> real problems and makes it less likely that such issues will be caught
> and fixed.)

How is an end user supposed to see such a message and report it to the
people that can fix it? They can't. So they report it in their
distribution's bug tracker and it either gets closed as "yeah, firmware
sucks" or it sits there and rots in the hope that some day someone
will do something.

I understand where you're coming from in a pre-production, development
environment, but to be quite clear, that is not the default environment
Linux is run in most of the time.

If this were a kernel warning that could be fixed with a kernel
patch, then maybe it would be worth it. It isn't though.

> This seems consistent with how the rest of the kernel handles firmware
> bugs:

Well, to be honest I think those are all wrong too.
There's no recourse for the user to report them to the firmware
developers and no incentive for the firmware developers to fix them
once the firmware is shipped. Either the kernel can do something about
it and work around the firmware issue (most likely already done before
the warning spew), it cannot but it doesn't matter, or it cannot and
it panics. The only situation where added FW_BUG or pr_warn messages
actually help anything is the panic case.

josh