MIME-Version: 1.0
References: <20200218173150.GK14449@zn.tnic>
In-Reply-To: <20200218173150.GK14449@zn.tnic>
From: Andy Lutomirski
Date: Tue, 18 Feb 2020 16:15:57 -0800
Subject: Re: [RFC] #MC mess
To: Borislav Petkov
Cc: Peter Zijlstra, Steven Rostedt, Andy Lutomirski, Tony Luck, x86-ml, lkml
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 18, 2020 at 9:31 AM Borislav Petkov wrote:
>
> Ok,
>
> so Peter raised this question on IRC today, that the #MC handler needs
> to disable all kinds of tracing/kprobing and etc exceptions happening
> while handling an #MC. And I guess we can talk about supporting some
> exceptions but #MC is usually nasty enough to not care about tracing
> when former happens.
>

It's worth noting that MCE is utterly, terminally screwed under high
load. In particular:

Step 1: NMI (due to perf).

immediately thereafter (before any of the entry asm runs)

Step 2: MCE (due to recoverable memory failure or remote CPU MCE)

Step 3: MCE does its thing and does IRET

Step 4: NMI

We are toast.

Tony, etc, can you ask your Intel contacts who care about this kind of
thing to stop twiddling their thumbs and FIX IT? The easy fix is
utterly trivial. Add a new instruction IRET_NON_NMI. It does
*exactly* the same thing as IRET except that it does not unmask NMIs.
(It also doesn't unmask NMIs if it faults.) No fancy design work.
Future improvements can still happen on top of this.

(One other improvement that may or may not have happened: the CPU
should be configurable so that it never even sends #MC unless it
literally cannot continue executing without OS help. No remote MCE
triggering #MC, no notifications of corrected errors, no nothing. If
the CPU *cannot* continue execution in its current context, for
example because a load could not be satisfied, send #MC. If a cache
line cannot be written back, then *which* CPU should get the MCE is an
interesting question.)

If Intel cares about memory failure recovery, then this design problem
needs to be fixed. Without a fix, we're just duct taping little holes
and ignoring the giant gaping hole in front of our faces.
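
For anyone who wants Steps 1-4 spelled out, here is a toy model in Python. It is only a sketch under stated assumptions, not kernel code: ToyCPU, deliver_nmi(), iret() and run_scenario() are made-up names, the NMI stack is modeled as one fixed slot that every NMI delivery writes to, and the unmask_nmi=False path stands in for the hypothetical IRET_NON_NMI instruction proposed above, which no CPU implements.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyCPU:
    """Toy model of NMI masking. Hypothetical; not a real CPU or kernel API."""

    nmi_masked: bool = False          # hardware masks NMIs after delivering one
    nmi_frame: Optional[str] = None   # assumption: one fixed stack slot for NMIs
    nmi_pending: bool = False         # an NMI arrived while masked
    corrupted: bool = False           # a live NMI frame was overwritten
    log: List[str] = field(default_factory=list)

    def deliver_nmi(self, interrupted: str) -> bool:
        """Try to deliver an NMI; return True if it was actually taken."""
        if self.nmi_masked:
            self.nmi_pending = True
            self.log.append("NMI held pending (masked)")
            return False
        if self.nmi_frame is not None:
            # The earlier NMI has not finished with its frame yet.
            self.corrupted = True
            self.log.append("NMI overwrote live frame: toast")
        self.nmi_frame = interrupted
        self.nmi_masked = True
        self.log.append("NMI taken, interrupted: " + interrupted)
        return True

    def iret(self, unmask_nmi: bool = True) -> None:
        """Plain IRET unmasks NMIs. unmask_nmi=False models IRET_NON_NMI."""
        if unmask_nmi:
            self.nmi_masked = False
        self.log.append("IRET" if unmask_nmi else "IRET_NON_NMI (hypothetical)")


def run_scenario(mce_return_unmasks_nmi: bool) -> ToyCPU:
    cpu = ToyCPU()
    # Step 1: NMI (due to perf). No entry asm has run yet.
    cpu.deliver_nmi("kernel code")
    # Step 2: MCE arrives. It is modeled as running on its own stack,
    # so it does not touch the NMI frame by itself.
    cpu.log.append("MCE taken")
    # Step 3: MCE does its thing and returns.
    cpu.iret(unmask_nmi=mce_return_unmasks_nmi)
    # Step 4: another NMI arrives before the first one has run any entry asm.
    cpu.deliver_nmi("first NMI's entry code")
    return cpu


if __name__ == "__main__":
    for unmasks in (True, False):
        cpu = run_scenario(unmasks)
        print("MCE return unmasks NMIs:", unmasks, "-> corrupted:", cpu.corrupted)
        for line in cpu.log:
            print("   ", line)
```

With the plain IRET path the second NMI is taken on top of the first one's live frame and the model flags corruption. With the hypothetical IRET_NON_NMI path the second NMI stays pending until the first NMI handler does its own IRET, which is the whole point of the proposal.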