Received: by 2002:a25:b794:0:0:0:0:0 with SMTP id n20csp2496497ybh; Mon, 5 Aug 2019 02:00:32 -0700 (PDT) X-Google-Smtp-Source: APXvYqxCGAP1EDMBXfmHoe3UwVdH7VthbKQrjJdoiIreZkXzn6kSUM+x+L2TQEew+7z70B+YxAhy X-Received: by 2002:a65:680b:: with SMTP id l11mr40996635pgt.35.1564995632650; Mon, 05 Aug 2019 02:00:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564995632; cv=none; d=google.com; s=arc-20160816; b=Gqnfxt2E47WNSz9gzbWu5LXFmYzvIn0eFLCMSAxbKgRJ5+0HlCMvXfrcKylZVuCq/s l/s1WAnCZhzTS/5v93FchmWqrWko5qJ8yU2V0I5v9tuGzfIH9K4Xci8W95k2ZffWEiBl saOS9KwxBZaPPX49VVC//VdisFs3YjMk+hi7oi9x/J2/q22z+JHySlbruWkaBU+K4so6 ixzitMvyW0v27bw7sjqmL7zRc54BJhZhfGmLlenRXjzAs2oUM+KXQItkk3cPudIapbXa QWfJ3kDqd5Ignakbz43NUl25w+JPNfh48eth21OOt9KKFdcbQFq0dhfXOvQsanz2WM51 UlaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:cc:to:from :dkim-signature; bh=3ZX7vMu/4NbD1ABzKMngh4re81+FpB5xERWJgyHbVpA=; b=VxidMetFGi2nazMEHV6i0OsSL+WjTnwnrjlsXDcbA3TMeioAwLpx6m8j6afGnDYz2x cc5FS+qjDtrnhcEMenYv1IG5XOF35fZorOct9X9L+aomQmdMXpfKFuCPZDKCGkAB5MRF nXf8B59g0KDL0dU/XsrPbncK/EGLkKM6eLy0TEj5Iy8uGMhXgGrMjHp7izYIL5njKriG E1P2jassin9C9U4JoRGGQx1eDxu0kM6Cd5H0YOTLBuPYpysI49jnLDt25EFYHOJnK14E J/5YgtYTtS4gHfFw6No9emmFutiCWwXbpZqhEgmyRiy6Fv9zHp997lDneQs+SgBViewn tvpA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=qxUvV3Qx; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l191si43751288pge.94.2019.08.05.02.00.16; Mon, 05 Aug 2019 02:00:32 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=qxUvV3Qx; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727349AbfHEI7j (ORCPT + 99 others); Mon, 5 Aug 2019 04:59:39 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:46631 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726423AbfHEI7j (ORCPT ); Mon, 5 Aug 2019 04:59:39 -0400 Received: by mail-pl1-f193.google.com with SMTP id c2so36184809plz.13 for ; Mon, 05 Aug 2019 01:59:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=3ZX7vMu/4NbD1ABzKMngh4re81+FpB5xERWJgyHbVpA=; b=qxUvV3QxBX8svEaYjw4uGCTkZAUZViNO79y/2mxpYwa5a4wMBOziU41n3j9Zd++oN3 0xN8Y+2fVxlsQBEsilV0bbIyJumft7XBTf9ynzyJVF9R8dUxu5yiArMBNg5zF/D1RmBC Nt4kxOwUDXZlVDtyDG5cXktlF4rgBJGqV07rnIyxoIHYBRa3eCcMxOPLurY4HlLNMn1+ cV6z5ZuDlPUENxt8OeYwMYnuZpUIjqttP5PvlNZuNLfuGjrkpv5kNjeLM7v37oMBQcUp 8NpS8j41QbWgIPwAADCMiBvCUcnOce1Cr/knnB6EjNrutyNDsHJZXaOFTIr3cAMS9hXj CNeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=3ZX7vMu/4NbD1ABzKMngh4re81+FpB5xERWJgyHbVpA=; b=dZg6qMHzkh83s8Eulya8OUspvperBSbgiCHfKzMxOkGVRxkHQrdt5ggWR/o+K8LBOK ubGdtv/Y/h8vZUOr3Nzn+SZrk49q9wFOQCyix8NvlCd96OWxHLes82z4UqKgXCQVd129 AF+1H8ixCk6u7nD7o1dhbowqrFTdJeqcosjohsQPF/CPyqogWcyoG/mK/BgDPyP3d/jR 3ieE+FkPaY+RVmMjf0HFgscGHkCcCseOQlQFa0El6oR5hDQHHUiWSrpjf6xgKvzZKBZw JSFubzWAYfM8mqnwoZmJR7D+9wRsej/E2yCNw29EN/2jX/h7DD6LRLRXFI6xGMrVNm1V 8nnw== X-Gm-Message-State: APjAAAUgOxwJOYlfIJsxfKQLl0E4/a7bHT6gd6npIE7odvS9WpeXTfqr om0kFBqAqu9agChdbjDmwQ== X-Received: by 2002:a17:902:7d8b:: with SMTP id a11mr89435992plm.306.1564995578334; Mon, 05 Aug 2019 01:59:38 -0700 (PDT) Received: from mylaptop.redhat.com ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id v184sm82428375pfb.82.2019.08.05.01.59.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 05 Aug 2019 01:59:37 -0700 (PDT) From: Pingfan Liu To: Thomas Gleixner , Andy Lutomirski , x86@kernel.org Cc: Pingfan Liu , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Dave Hansen , Peter Zijlstra , Masami Hiramatsu , Qian Cai , Vlastimil Babka , Daniel Drake , Jacob Pan , Michal Hocko , Eric Biederman , linux-kernel@vger.kernel.org, Dave Young , Baoquan He , kexec@lists.infradead.org Subject: [PATCH 0/4] x86/mce: protect nr_cpus from rebooting by broadcast mce Date: Mon, 5 Aug 2019 16:58:55 +0800 Message-Id: <1564995539-29609-1-git-send-email-kernelfans@gmail.com> X-Mailer: git-send-email 2.7.5 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This series include two related groups: [1-3/4]: protect nr_cpus from rebooting by broadcast mce [4/4]: improve "kexec -l" robustness against broadcast mce When I tried to fix [1], Thomas raised concern about the nr_cpus' vulnerability to unexpected rebooting by broadcast mce. After analysis, I think only the following first case suffers from the rebooting by broadcast mce. [1-3/4] aims to fix that issue. *** Back ground *** On x86 it's required to have all logical CPUs set CR4.MCE=1. Otherwise, a broadcast MCE observing CR4.MCE=0b on any core will shutdown the machine. The option 'nosmt' has already complied with the above rule by Thomas's patch. For detail, refer to 506a66f3748 (Revert "x86/apic: Ignore secondary threads if nosmt=force") But for nr_cpus option, the exposure to broadcast MCE is a little complicated, and can be categorized into three cases. -1. boot up by BIOS. Since no one set CR4.MCE=1, nr_cpus risks rebooting by broadcast MCE. -2. boot up by "kexec -p nr_cpus=". Since the 1st kernel has all cpus' CR4.MCE=1 set before kexec -p, nr_cpus is free of rebooting by broadcast MCE. Furthermore, the crashed kernel's wreckage, including page table and text, is not touched by capture kernel. Hence if MCE event happens on capped cpu, do_machine_check->__mc_check_crashing_cpu() runs smoothly and returns immediately, the capped cpu is still pinned on "halt". -3. boot up by "kexec -l nr_cpus=". As "kexec -p", it is free of rebooting by broadcast MCE. But the 1st kernel's wreckage is discarded and changed. when capped cpus execute do_machine_check(), they may crack the new kernel. But this is not related with broadcast MCE, and need an extra fix. *** Solution *** "nr_cpus" can not follow the same way as "nosmt". Because nr_cpus limits the allocation of percpu area and some other kthread memory, which is critical to cpu hotplug framework. Instead, developing a dedicated SIPI callback make_capped_cpu_stable() for capped cpu, which does not lean on percpu area to work. [1]: https://lkml.org/lkml/2019/7/5/3 To: Gleixner To: Andy Lutomirski Cc: Ingo Molnar Cc: Borislav Petkov Cc: "H. Peter Anvin" Cc: Dave Hansen Cc: Peter Zijlstra To: x86@kernel.org Cc: Masami Hiramatsu Cc: Qian Cai Cc: Vlastimil Babka Cc: Daniel Drake Cc: Jacob Pan Cc: Michal Hocko Cc: Eric Biederman Cc: linux-kernel@vger.kernel.org Cc: Dave Young Cc: Baoquan He Cc: kexec@lists.infradead.org --- Pingfan Liu (4): x86/apic: correct the ENO in generic_processor_info() x86/apic: record capped cpu in generic_processor_info() x86/smp: send capped cpus to a stable state when smp_init() x86/smp: disallow MCE handler on rebooting AP arch/x86/include/asm/apic.h | 1 + arch/x86/include/asm/smp.h | 3 ++ arch/x86/kernel/apic/apic.c | 23 ++++++++---- arch/x86/kernel/cpu/common.c | 7 ++++ arch/x86/kernel/smp.c | 8 +++++ arch/x86/kernel/smpboot.c | 83 ++++++++++++++++++++++++++++++++++++++++++++ kernel/smp.c | 6 ++++ 7 files changed, 124 insertions(+), 7 deletions(-) -- 2.7.5