Date: Tue, 12 Jan 2021 17:50:53 -0800
From: "Luck, Tony"
To: Andy Lutomirski
Cc: Andy Lutomirski, Borislav Petkov, X86 ML, Andrew Morton, Peter Zijlstra, Darren Hart, LKML, linux-edac, Linux-MM
Subject: Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery
Message-ID: <20210113015053.GA21587@agluck-desk2.amr.corp.intel.com>
References: <20210112205207.GA18195@agluck-desk2.amr.corp.intel.com> <38AF04BE-7F39-450F-8C26-879C9934E3D6@amacapital.net>
In-Reply-To: <38AF04BE-7F39-450F-8C26-879C9934E3D6@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 12, 2021 at 02:04:55PM -0800, Andy Lutomirski wrote:
> > But we know that the fault happened in a get_user() or copy_from_user() call
> > (i.e. an RIP with an extable recovery address). Does context switch
> > access user memory?
>
> No, but NMI can.
>
> The case that would be very very hard to deal with is if we get an NMI just before IRET/SYSRET and get #MC inside that NMI.
>
> What we should probably do is have a percpu list of pending memory failure cleanups and just accept that we’re going to sometimes get a second MCE (or third or fourth) before we can get to it.
>
> Can we do the cleanup from an interrupt? IPI-to-self might be a credible approach, if so.

You seem to be looking for a solution that is entirely contained within
the machine check handling code, and you are willing to allow for repeated
machine checks from the same poison address in order to achieve that.

I'm opposed to multiple machine checks. I am willing to make some changes
in core code to avoid repeated access to the same poison location.

We need a tie-breaker.

-Tony
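
For reference, the "percpu list of pending memory failure cleanups" idea quoted above could be modelled roughly as below. This is a user-space sketch only, not kernel code and not taken from any posted patch. Every name in it (pending_poison, queue_poison, drain_poison, NR_CPUS_SKETCH, MAX_PENDING) is hypothetical, and the fixed-size array stands in for whatever per-CPU storage a real implementation would use. It shows only the bookkeeping: a repeated machine check on an already queued page is tolerated, and the cleanup runs later from a safer context.

```c
/* User-space model only; all names and sizes here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* assumed CPU count for this model */
#define MAX_PENDING    8   /* assumed per-CPU queue capacity */

enum queue_result {
    POISON_QUEUED,           /* new page recorded for later cleanup */
    POISON_ALREADY_PENDING,  /* repeat machine check on a queued page */
    POISON_QUEUE_FULL        /* no room; a real design needs a fallback */
};

struct pending_poison {
    uint64_t pfn[MAX_PENDING];  /* poisoned page frames awaiting cleanup */
    size_t   count;
};

/* One list per CPU, so the handler never touches another CPU's list. */
static struct pending_poison pending[NR_CPUS_SKETCH];

/* Called from the simulated machine check handler on the given CPU. */
static enum queue_result queue_poison(unsigned int cpu, uint64_t pfn)
{
    assert(cpu < NR_CPUS_SKETCH);
    struct pending_poison *p = &pending[cpu];

    /* A second (or third) hit on the same page is accepted and ignored. */
    for (size_t i = 0; i < p->count; i++)
        if (p->pfn[i] == pfn)
            return POISON_ALREADY_PENDING;

    if (p->count == MAX_PENDING)
        return POISON_QUEUE_FULL;

    p->pfn[p->count++] = pfn;
    return POISON_QUEUED;
}

/*
 * Called later from a context where cleanup is safe (the quoted text
 * suggests an IPI-to-self as one candidate). Calls handle() once per
 * queued page, empties the list, and returns the number of pages drained.
 */
static size_t drain_poison(unsigned int cpu, void (*handle)(uint64_t pfn))
{
    assert(cpu < NR_CPUS_SKETCH);
    struct pending_poison *p = &pending[cpu];
    size_t n = p->count;

    for (size_t i = 0; i < n; i++)
        if (handle)
            handle(p->pfn[i]);

    p->count = 0;
    return n;
}
```

In this model the objection above is still visible: the duplicate check makes a repeated machine check harmless to the bookkeeping, but it does nothing to stop the repeated access to the poison location in the first place.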