Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753301AbdDQLTw (ORCPT ); Mon, 17 Apr 2017 07:19:52 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:37607 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750776AbdDQLTt (ORCPT ); Mon, 17 Apr 2017 07:19:49 -0400 Subject: Re: [PATCH 2/2] powerpc/book3s: mce: Use add_taint_no_warn() in machine_check_early(). To: Daniel Axtens , linuxppc-dev , Linux Kernel , Michael Ellerman References: <148765088309.20315.15300624012053746538.stgit@jupiter.in.ibm.com> <148765089349.20315.11180051017586952500.stgit@jupiter.in.ibm.com> <87o9vvuxl7.fsf@possimpible.ozlabs.ibm.com> From: Mahesh Jagannath Salgaonkar Date: Mon, 17 Apr 2017 16:49:41 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <87o9vvuxl7.fsf@possimpible.ozlabs.ibm.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-TM-AS-MML: disable x-cbid: 17041711-7323-0000-0000-000000EFADAC X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17041711-7324-0000-0000-000002955EC2 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-04-17_10:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1702020001 definitions=main-1704170105 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2750 Lines: 71 On 04/17/2017 04:09 PM, Daniel Axtens wrote: > Hi Mahesh, > >> Fixes: 27ea2c420cad powerpc: Set the correct kernel taint on machine check errors. > > I notice this Fixes a commit I introduced. Please could you cc me when > you do this? I am likely to miss it otherwise, especially since I have > now left IBM. Sure will do. :-) > > Being cced allows me to provide an Ack or a review. And getting feedback > on my changes is very helpful in becoming a better programmer. > > In this case, as per Michael's comment, why don't we just move the > add_taint from machine_check_early to > machine_check_process_queued_event - the other side of the work queue. Yes. That is what my plan is. Also, that is not the only place. add_taint() need to be called from machine_check_exception() as well. So it will be called from two places. Thanks, -Mahesh. > > The work queue system is supposed to provide us with a safe place to do > printing, etc., so it's an appropriate place. Also, we already do > machine_check_print_event_info there, and adding the taint doesn't need > to be done synchronously. > > Regards, > Daniel > > Mahesh J Salgaonkar writes: > >> From: Mahesh Salgaonkar >> >> machine_check_early() gets called in real mode. The very first time when >> add_taint() is called, it prints a warning which ends up calling opal >> call (that uses OPAL_CALL wrapper) for writing it to console. If we get a >> very first machine check while we are in opal we are doomed. OPAL_CALL >> overwrites the PACASAVEDMSR in r13 and in this case when we are done with >> MCE handling the original opal call will use this new MSR on it's way >> back to opal_return. This usually leads unexpected behaviour or kernel >> to panic. Instead use the add_taint_no_warn() that does not call printk. >> >> This is broken with current FW level. We got lucky so far for not getting >> very first MCE hit while in OPAL. But easily reproducible on Mambo. >> This should go to stable as well alongwith patch 1/2. >> >> Signed-off-by: Mahesh Salgaonkar >> --- >> arch/powerpc/kernel/traps.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c >> index 62b587f..4a048dc 100644 >> --- a/arch/powerpc/kernel/traps.c >> +++ b/arch/powerpc/kernel/traps.c >> @@ -306,7 +306,7 @@ long machine_check_early(struct pt_regs *regs) >> >> __this_cpu_inc(irq_stat.mce_exceptions); >> >> - add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE); >> + add_taint_no_warn(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE); >> >> /* >> * See if platform is capable of handling machine check. (e.g. PowerNV >