Date: Wed, 16 Feb 2022 19:52:58 +0100
From: Borislav Petkov
To: "Luck, Tony"
Cc: Jue Wang, "x86@kernel.org", "linux-kernel@vger.kernel.org", "patches@lists.linux.dev"
Subject: Re: [PATCH] x86/mce: Add workaround for SKX/CLX/CPX spurious machine checks
References: <20220208150945.266978-1-juew@google.com> <9f402331d25c47b69349c8171bbd49c1@intel.com>
In-Reply-To: <9f402331d25c47b69349c8171bbd49c1@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 16, 2022 at 06:41:58PM +0000, Luck, Tony wrote:
> > Well, we could try to decode the instructions around rIP when the #MC
> > is raised and see what caused the MCE and perhaps pick apart which insn
> > caused it, is it accessing behind the buffer boundaries, etc.
>
> Is this a case of "perfect is the enemy of good enough"?

Well, you guys sounded like this happens left and right...

> It is a rare scenario (only a pain point for Jue because Google has
> billions and billions of cores running this code). You need:
>
> 1) An uncorrected error
> 2) That error must be in first cache line of a page
> 3) Kernel must execute page_copy from the page immediately before that page
>
> When all three happen, kernel crashes because we don't
> have a recover path from kernel page_copy

You should've led with that - this is basically one of those "under a
complex set of conditions" things.

Anything against me adding them to the commit message?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette