From: Andy Lutomirski
Subject: Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery
Date: Tue, 12 Jan 2021 14:04:55 -0800
Message-Id: <38AF04BE-7F39-450F-8C26-879C9934E3D6@amacapital.net>
References: <20210112205207.GA18195@agluck-desk2.amr.corp.intel.com>
In-Reply-To: <20210112205207.GA18195@agluck-desk2.amr.corp.intel.com>
To: "Luck, Tony"
Cc: Andy Lutomirski, Borislav Petkov, X86 ML, Andrew Morton, Peter Zijlstra, Darren Hart, LKML, linux-edac, Linux-MM
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jan 12, 2021, at 12:52 PM, Luck, Tony wrote:
>
> On Tue, Jan 12, 2021 at 10:57:07AM -0800, Andy Lutomirski wrote:
>>> On Tue, Jan 12, 2021 at 10:24 AM Luck, Tony wrote:
>>>
>>> On Tue, Jan 12, 2021 at 09:21:21AM -0800, Andy Lutomirski wrote:
>>>> Well, we need to do *something* when the first __get_user() trips the
>>>> #MC. It would be nice if we could actually fix up the page tables
>>>> inside the #MC handler, but, if we're in a pagefault_disable() context
>>>> we might have locks held. Heck, we could have the pagetable lock
>>>> held, be inside NMI, etc. Skipping the task_work_add() might actually
>>>> make sense if we get a second one.
>>>>
>>>> We won't actually infinite loop in pagefault_disable() context -- if
>>>> we would, then we would also infinite loop just from a regular page
>>>> fault, too.
>>>
>>> Fixing the page tables inside the #MC handler to unmap the poison
>>> page would indeed be a good solution. But, as you point out, not possible
>>> because of locks.
>>>
>>> Could we take a more drastic approach? We know that in this case the kernel
>>> is accessing a user address for the current process. Could the machine
>>> check handler just re-write %cr3 to point to a kernel-only page table[1]?
>>> I.e. unmap the entire current user process.
>>
>> That seems scary, especially if we're in the middle of a context
>> switch when this happens. We *could* make it work, but I'm not at all
>> convinced it's wise.
>
> Scary? It's terrifying!
>
> But we know that the fault happened in a get_user() or copy_from_user() call
> (i.e. an RIP with an extable recovery address). Does context switch
> access user memory?

No, but NMI can.

The case that would be very very hard to deal with is if we get an NMI just before IRET/SYSRET and get #MC inside that NMI.

What we should probably do is have a percpu list of pending memory failure cleanups and just accept that we’re going to sometimes get a second MCE (or third or fourth) before we can get to it.

Can we do the cleanup from an interrupt? IPI-to-self might be a credible approach, if so.
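
To make the per-CPU pending list idea concrete, here is a rough userspace model in C. It is only a sketch under stated assumptions, not kernel code and not anything from the patch series: every name in it (model_queue_cleanup(), model_drain_cleanups(), the MODEL_* constants) is hypothetical, the "per-CPU" storage is a plain array indexed by CPU number, and it ignores the reentrancy and atomicity problems a real #MC/NMI path would have to solve. What it does show is the shape of the proposal: the handler side only records the poisoned page in a fixed-size list (no locks, no allocation), a repeat MCE on the same page is tolerated as a no-op, and the real cleanup runs later from a context where it is safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes for this model; not taken from the kernel. */
#define MODEL_NR_CPUS     4
#define MODEL_MAX_PENDING 8

enum model_queue_result {
	MODEL_QUEUED,          /* page newly recorded for later cleanup */
	MODEL_ALREADY_PENDING, /* second (or third...) MCE on the same page */
	MODEL_LIST_FULL,       /* no room left; caller needs a fallback */
};

/* One fixed-size list per CPU, so queueing never allocates memory. */
struct model_pending {
	unsigned long pfn[MODEL_MAX_PENDING];
	size_t count;
};

static struct model_pending model_pending[MODEL_NR_CPUS];

/*
 * Models what the machine check handler would do: remember the
 * poisoned page frame and return immediately. No locks are taken.
 */
static enum model_queue_result
model_queue_cleanup(unsigned int cpu, unsigned long pfn)
{
	assert(cpu < MODEL_NR_CPUS);
	struct model_pending *p = &model_pending[cpu];

	for (size_t i = 0; i < p->count; i++)
		if (p->pfn[i] == pfn)
			return MODEL_ALREADY_PENDING;

	if (p->count == MODEL_MAX_PENDING)
		return MODEL_LIST_FULL;

	p->pfn[p->count++] = pfn;
	return MODEL_QUEUED;
}

/*
 * Models the deferred side (for example, something triggered by an
 * IPI-to-self): run the cleanup callback for every pending page on
 * this CPU, then empty the list. Returns how many entries were handled.
 * A NULL callback just discards the entries.
 */
static size_t
model_drain_cleanups(unsigned int cpu, void (*cleanup)(unsigned long pfn))
{
	assert(cpu < MODEL_NR_CPUS);
	struct model_pending *p = &model_pending[cpu];
	size_t done = p->count;

	for (size_t i = 0; i < done; i++)
		if (cleanup)
			cleanup(p->pfn[i]);

	p->count = 0;
	return done;
}
```

Queueing the same page twice on one CPU returns MODEL_ALREADY_PENDING the second time, which is the "accept a second MCE" behaviour; a later model_drain_cleanups() call then handles each distinct page once. A full list is reported separately because a real implementation would need some fallback policy there, and this sketch does not pick one.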