From: "Luck, Tony"
To: Borislav Petkov
CC: "Raj, Ashok", "linux-kernel@vger.kernel.org", "linux-edac@vger.kernel.org", "Andy Lutomirski (luto@amacapital.net)"
Subject: RE: [Patch V0] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.
Date: Fri, 4 Dec 2015 17:53:33 +0000
Message-ID: <3908561D78D1C84285E8C5FCA982C28F39F78D9F@ORSMSX114.amr.corp.intel.com>
In-Reply-To: <20151204173633.GK21177@pd.tnic>
References: <1449188170-3909-1-git-send-email-ashok.raj@intel.com> <20151204143404.GF21177@pd.tnic> <20151204171419.GA4870@otc-brkl-03.jf.intel.com> <20151204165112.GI21177@pd.tnic> <3908561D78D1C84285E8C5FCA982C28F39F78AD9@ORSMSX114.amr.corp.intel.com> <20151204173633.GK21177@pd.tnic>

> I don't mean that - I mean the stuff we do before we call
> cpu_is_offline() like ist_enter, this_cpu_inc(mce_exception_count),
> etc.
>
> Then we do a whole another bunch of stuff at the "out:" label like
> printk and whatnot which shouldn't run on an offlined CPU.

ist_enter() is black magic to me. Andy? Would you be worried about
executing ist_{enter,exit}() on a cpu that was once online, but is
currently marked offline by Linux?

Bumping mce_exception_count doesn't look like a big deal either way.
It is visible in /proc/interrupts so I'd like to keep that honest
(if the cpu comes back online again). But we could do the offline
check before this.

There will be no printk() executed in the tail of the function.
After we clear MCG_STATUS at the (new location of) the out: label,
we will see recover_paddr is still ~0ull and "goto done".

-Tony
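
The control flow argued for above can be modelled in user space. This is
only a sketch under assumptions, not the real do_machine_check():
struct mce_model, model_do_machine_check(), NO_PADDR and the fault_paddr
field are hypothetical stand-ins, and ist_enter()/ist_exit() and the
rendezvous itself are left out entirely. It shows an offline CPU still
bumping the exception count, jumping to the relocated out: label,
clearing MCG_STATUS, and leaving through done: without printing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sentinel meaning "no recoverable address found", as in ~0ull above. */
#define NO_PADDR (~0ull)

/* Hypothetical model of the per-CPU state touched by the handler. */
struct mce_model {
	bool cpu_offline;               /* stand-in for cpu_is_offline() */
	unsigned long long fault_paddr; /* address an online scan would find */
	unsigned long exception_count;  /* stand-in for mce_exception_count */
	bool mcg_status_cleared;        /* stand-in for clearing MCG_STATUS */
	int printk_calls;               /* counts messages in the tail */
};

static void model_do_machine_check(struct mce_model *m)
{
	unsigned long long recover_paddr = NO_PADDR;

	assert(m != NULL);

	/* Counted on every CPU so the /proc/interrupts figure stays honest. */
	m->exception_count++;

	/* Offline CPUs skip the rendezvous and bank scan entirely. */
	if (m->cpu_offline)
		goto out;

	/* Online path (heavily simplified): the scan may find an address. */
	recover_paddr = m->fault_paddr;

out:
	/* Assumed new location of the label: MCG_STATUS is cleared after it. */
	m->mcg_status_cleared = true;

	/* An offline CPU never set recover_paddr, so it leaves here. */
	if (recover_paddr == NO_PADDR)
		goto done;

	/* Only reached when an online CPU found a recoverable address. */
	m->printk_calls++;

done:
	return;
}
```

With cpu_offline set, the model ends with exception_count == 1,
mcg_status_cleared == true and printk_calls == 0, which is the behaviour
described for the tail of the function.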