Date: Wed, 12 Apr 2017 23:47:49 +0200
From: Borislav Petkov
To: "Luck, Tony"
Cc: Thomas Gleixner, Dan Williams, "Verma, Vishal L", "linux-kernel@vger.kernel.org", "linux-nvdimm@lists.01.org", "ross.zwisler@linux.intel.com", "x86@kernel.org"
Subject: Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'
Message-ID: <20170412214749.jyt7cmyhovivtb2m@pd.tnic>
In-Reply-To: <20170412211931.GA15771@intel.com>

On Wed, Apr 12, 2017 at 02:19:32PM -0700, Luck, Tony wrote:
> On Wed, Apr 12, 2017 at 11:12:21PM +0200, Thomas Gleixner wrote:
> > There is another solution:
> >
> > Convert the notifier to a blocking notifier and in the panic case, ignore
> > the locking and invoke the notifier chain directly. That needs some minimal
> > surgery in the notifier code to allow that, but that's certainly less ugly
> > than splitting stuff up into two chains.
>
> But I wonder whether we actually want two chains. We've been adding a bunch
> of general run-time logging and recovery stuff to this chain.
> So now we have things there that aren't needed or useful in the panic
> case. E.g. srao_decode_notifier() (which tries to offline a page that
> reported an uncorrected error out of the execution path) and Boris's new
> CEC code.

I guess we'll have to. The CEC thing does mutex_lock() too and the atomic
notifier disables preemption:

__atomic_notifier_call_chain()
  rcu_read_lock()
    __rcu_read_lock()
      if (IS_ENABLED(CONFIG_PREEMPT_COUNT))
        preempt_disable();

so we need to think about something better to handle events down the
whole chain. Maybe route events from the atomic path to the blocking
path where the sleeping notifier callbacks can sleep as much as they
want to...

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)