From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Grzegorz Halat, Thomas Gleixner, Don Zickus, Sasha Levin
Subject: [PATCH 4.4 41/99] x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails
Date: Thu, 3 Oct 2019 17:53:04 +0200
Message-Id: <20191003154315.752475207@linuxfoundation.org>
In-Reply-To:
 <20191003154252.297991283@linuxfoundation.org>
References: <20191003154252.297991283@linuxfoundation.org>

From: Grzegorz Halat

[ Upstream commit 747d5a1bf293dcb33af755a6d285d41b8c1ea010 ]

A reboot request sends an IPI via the reboot vector and waits for all other
CPUs to stop. If one or more CPUs are in critical regions with interrupts
disabled then the IPI is not handled on those CPUs and the shutdown hangs
if native_stop_other_cpus() is called with the wait argument set.

Such a situation can happen when one CPU was stopped within a lock held
section and another CPU is trying to acquire that lock with interrupts
disabled. There are other scenarios which can cause such a lockup as well.

In theory the shutdown should be attempted by an NMI IPI after the timeout
period elapsed. Though the wait loop after sending the reboot vector IPI
prevents this. It checks the wait request argument and the timeout. If wait
is set, which is true for sys_reboot() then it won't fall through to the
NMI shutdown method after the timeout period has finished.

This was an oversight when the NMI shutdown mechanism was added to handle
the 'reboot IPI is not working' situation. The mechanism was added to deal
with stuck panic shutdowns, which do not have the wait request set, so the
'wait request' case was probably not considered.

Remove the wait check from the post reboot vector IPI wait loop and enforce
that the wait loop in the NMI fallback path is invoked even if NMI IPIs are
disabled or the registration of the NMI handler fails. That second wait
loop will then hang if not all CPUs shutdown and the wait argument is set.

[ tglx: Avoid the hard to parse line break in the NMI fallback path,
  add comments and massage the changelog ]

Fixes: 7d007d21e539 ("x86/reboot: Use NMI to assist in shutting down if IRQ fails")
Signed-off-by: Grzegorz Halat
Signed-off-by: Thomas Gleixner
Cc: Don Zickus
Link: https://lkml.kernel.org/r/20190628122813.15500-1-ghalat@redhat.com
Signed-off-by: Sasha Levin
---
 arch/x86/kernel/smp.c | 46 +++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 12c8286206ce2..6a0ba9d09b0ed 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -176,6 +176,12 @@ asmlinkage __visible void smp_reboot_interrupt(void)
 	irq_exit();
 }
 
+static int register_stop_handler(void)
+{
+	return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
+				    NMI_FLAG_FIRST, "smp_stop");
+}
+
 static void native_stop_other_cpus(int wait)
 {
 	unsigned long flags;
@@ -209,39 +215,41 @@ static void native_stop_other_cpus(int wait)
 		apic->send_IPI_allbutself(REBOOT_VECTOR);
 
 		/*
-		 * Don't wait longer than a second if the caller
-		 * didn't ask us to wait.
+		 * Don't wait longer than a second for IPI completion. The
+		 * wait request is not checked here because that would
+		 * prevent an NMI shutdown attempt in case that not all
+		 * CPUs reach shutdown state.
 		 */
 		timeout = USEC_PER_SEC;
-		while (num_online_cpus() > 1 && (wait || timeout--))
+		while (num_online_cpus() > 1 && timeout--)
 			udelay(1);
 	}
-
-	/* if the REBOOT_VECTOR didn't work, try with the NMI */
-	if ((num_online_cpus() > 1) && (!smp_no_nmi_ipi))  {
-		if (register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
-					 NMI_FLAG_FIRST, "smp_stop"))
-			/* Note: we ignore failures here */
-			/* Hope the REBOOT_IRQ is good enough */
-			goto finish;
-
-		/* sync above data before sending IRQ */
-		wmb();
 
-		pr_emerg("Shutting down cpus with NMI\n");
+	/* if the REBOOT_VECTOR didn't work, try with the NMI */
+	if (num_online_cpus() > 1) {
+		/*
+		 * If NMI IPI is enabled, try to register the stop handler
+		 * and send the IPI. In any case try to wait for the other
+		 * CPUs to stop.
+		 */
+		if (!smp_no_nmi_ipi && !register_stop_handler()) {
+			/* Sync above data before sending IRQ */
+			wmb();
 
-		apic->send_IPI_allbutself(NMI_VECTOR);
+			pr_emerg("Shutting down cpus with NMI\n");
 
+			apic->send_IPI_allbutself(NMI_VECTOR);
+		}
 		/*
-		 * Don't wait longer than a 10 ms if the caller
-		 * didn't ask us to wait.
+		 * Don't wait longer than 10 ms if the caller didn't
+		 * request it. If wait is true, the machine hangs here if
+		 * one or more CPUs do not reach shutdown state.
 		 */
 		timeout = USEC_PER_MSEC * 10;
 		while (num_online_cpus() > 1 && (wait || timeout--))
 			udelay(1);
 	}
 
-finish:
 	local_irq_save(flags);
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
-- 
2.20.1
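
For readers who want to see the control-flow change in isolation, here is a
rough user-space sketch of the two wait loops before and after this patch.
It is not kernel code. struct sim, enum outcome and both function names are
made up for illustration, the single nmi_available flag stands in for
"!smp_no_nmi_ipi and the NMI handler registered", and the model assumes that
an NMI always stops whatever CPUs ignored the reboot vector IPI. Since no
CPU changes state while a loop spins in this model, a loop that would never
exit is reported as HUNG instead of being executed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical machine model, not a kernel structure. */
struct sim {
	int online_cpus;        /* online CPUs, including the calling CPU */
	int stop_on_reboot_ipi; /* CPUs that handle the reboot vector IPI */
	bool nmi_available;     /* NMI IPI enabled and handler registered */
};

enum outcome { ALL_STOPPED, GAVE_UP, HUNG };

/*
 * Models "while (num_online_cpus() > 1 && (wait || timeout--))" when the
 * CPU count no longer changes: it spins forever only if CPUs remain and
 * the caller asked to wait.
 */
static bool wait_loop_hangs(const struct sim *s, bool wait)
{
	return s->online_cpus > 1 && wait;
}

/* Old flow: the first loop honors 'wait', so the NMI path may be unreachable. */
enum outcome stop_cpus_before_fix(struct sim s, bool wait)
{
	assert(s.stop_on_reboot_ipi < s.online_cpus);
	s.online_cpus -= s.stop_on_reboot_ipi;  /* reboot vector IPI */
	if (wait_loop_hangs(&s, wait))
		return HUNG;                    /* never falls through to NMI */
	if (s.online_cpus > 1 && s.nmi_available)
		s.online_cpus = 1;              /* NMI IPI (model assumption) */
	return s.online_cpus > 1 ? GAVE_UP : ALL_STOPPED;
}

/* New flow: the first loop is bounded by the timeout only. */
enum outcome stop_cpus_after_fix(struct sim s, bool wait)
{
	assert(s.stop_on_reboot_ipi < s.online_cpus);
	s.online_cpus -= s.stop_on_reboot_ipi;  /* reboot vector IPI */
	if (s.online_cpus > 1) {
		if (s.nmi_available)
			s.online_cpus = 1;      /* NMI IPI (model assumption) */
		/* The second loop runs whether or not the NMI was sent. */
		if (wait_loop_hangs(&s, wait))
			return HUNG;
	}
	return s.online_cpus > 1 ? GAVE_UP : ALL_STOPPED;
}
```

With four CPUs of which one ignores the reboot vector IPI, the sketch reports
HUNG for stop_cpus_before_fix(s, true) and ALL_STOPPED for
stop_cpus_after_fix(s, true). With nmi_available cleared, the patched flow
still reports HUNG when wait is set, which matches the last paragraph of the
changelog.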