From: Daniel Axtens
To: Mahesh J Salgaonkar, linuxppc-dev, Linux Kernel, Michael Ellerman
Subject: Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early().
In-Reply-To: <148765089349.20315.11180051017586952500.stgit@jupiter.in.ibm.com>
References: <148765088309.20315.15300624012053746538.stgit@jupiter.in.ibm.com> <148765089349.20315.11180051017586952500.stgit@jupiter.in.ibm.com>
User-Agent: Notmuch/0.22.1 (http://notmuchmail.org) Emacs/24.5.1 (x86_64-pc-linux-gnu)
Date: Mon, 17 Apr 2017 20:39:48 +1000
Message-ID: <87o9vvuxl7.fsf@possimpible.ozlabs.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Mahesh,

> Fixes: 27ea2c420cad powerpc: Set the correct kernel taint on machine check errors.

I notice this fixes a commit I introduced. Could you please Cc me when
you do this? I am likely to miss it otherwise, especially since I have
now left IBM.

Being Cc'd allows me to provide an Ack or a review, and getting feedback
on my changes is very helpful in becoming a better programmer.

In this case, as per Michael's comment, why don't we just move the
add_taint() call from machine_check_early() to
machine_check_process_queued_event(), the other side of the work queue?
The work queue system is supposed to provide us with a safe place to do
printing, etc., so it's an appropriate place. Also, we already call
machine_check_print_event_info() there, and adding the taint doesn't
need to be done synchronously.
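
To make the suggestion concrete, here is a rough userspace model of the
idea, not a kernel patch. It assumes the early handler only queues an
event and that the taint (with its first-time warning) is applied when
the queue is drained. Every model_* name, the queue depth and the
printed strings are made up for illustration; only the kernel function
names in the comments come from this thread.

```c
/*
 * Userspace model only, not kernel code. Every model_* name is a
 * hypothetical stand-in for the kernel function named in its comment.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_EVENTS 10	/* arbitrary queue depth for the model */

struct model_mce_event {
	int cpu;
};

static struct model_mce_event model_queue[MODEL_MAX_EVENTS];
static int model_queue_count;
static bool model_tainted;	/* stands in for the TAINT_MACHINE_CHECK bit */
static int model_lines_printed;	/* counts console writes */

/* Stand-in for add_taint(): warns on the first call, then sets the flag. */
static void model_add_taint(void)
{
	if (!model_tainted) {
		printf("model: first taint, printing a warning\n");
		model_lines_printed++;
	}
	model_tainted = true;
}

/*
 * Stand-in for machine_check_early(): runs in the restricted context, so it
 * only records the event and never prints. Returns false if the queue is full.
 */
static bool model_machine_check_early(int cpu)
{
	if (model_queue_count >= MODEL_MAX_EVENTS)
		return false;
	model_queue[model_queue_count++].cpu = cpu;
	return true;
}

/*
 * Stand-in for machine_check_process_queued_event(): runs later, where
 * printing is safe, so the taint and the event report both happen here.
 * Returns the number of events handled.
 */
static int model_process_queued_events(void)
{
	int handled = 0;

	assert(model_queue_count <= MODEL_MAX_EVENTS);
	while (model_queue_count > 0) {
		const struct model_mce_event *evt =
			&model_queue[--model_queue_count];

		model_add_taint();
		printf("model: machine check event on cpu %d\n", evt->cpu);
		model_lines_printed++;
		handled++;
	}
	return handled;
}
```

In this model nothing is printed, and the taint flag stays clear, until
the queued events are processed. That is the property we want from the
real-mode path.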

Regards,
Daniel

Mahesh J Salgaonkar writes:

> From: Mahesh Salgaonkar
>
> machine_check_early() gets called in real mode. The very first time
> add_taint() is called, it prints a warning, which ends up making an OPAL
> call (using the OPAL_CALL wrapper) to write it to the console. If we get
> the very first machine check while we are in OPAL, we are doomed. OPAL_CALL
> overwrites the PACASAVEDMSR in r13, and in this case, when we are done with
> MCE handling, the original OPAL call will use this new MSR on its way
> back to opal_return. This usually leads to unexpected behaviour or a kernel
> panic. Instead, use add_taint_no_warn(), which does not call printk.
>
> This is broken with the current FW level. We have been lucky so far in not
> getting the very first MCE hit while in OPAL, but it is easily reproducible
> on Mambo. This should go to stable as well, along with patch 1/2.
>
> Signed-off-by: Mahesh Salgaonkar
> ---
>  arch/powerpc/kernel/traps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 62b587f..4a048dc 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -306,7 +306,7 @@ long machine_check_early(struct pt_regs *regs)
>
>  	__this_cpu_inc(irq_stat.mce_exceptions);
>
> -	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
> +	add_taint_no_warn(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
>
>  	/*
>  	 * See if platform is capable of handling machine check. (e.g. PowerNV