Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1761081AbcLAVV7 (ORCPT ); Thu, 1 Dec 2016 16:21:59 -0500
Received: from cloudserver094114.home.net.pl ([79.96.170.134]:64073 "EHLO
        cloudserver094114.home.net.pl" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1760810AbcLAVVk (ORCPT );
        Thu, 1 Dec 2016 16:21:40 -0500
From: "Rafael J. Wysocki"
To: Borislav Petkov
Cc: Prarit Bhargava , linux-kernel@vger.kernel.org,
        Borislav Petkov , "Rafael J. Wysocki" , Len Brown ,
        Paul Gortmaker , Tyler Baicar , Punit Agrawal ,
        Don Zickus , Linux ACPI
Subject: Re: [PATCH v2] ACPI / APEI: Fix NMI notification handling
Date: Thu, 01 Dec 2016 22:17:54 +0100
Message-ID: <3848089.cIkXoRipBG@aspire.rjw.lan>
User-Agent: KMail/4.14.10 (Linux/4.9.0-rc5+; KDE/4.14.9; x86_64; ; )
In-Reply-To: <20161201200739.qcibekpe37podnmu@pd.tnic>
References: <20161129193624.krjz2bpinl2ioi7o@pd.tnic>
        <1480511979-11722-1-git-send-email-prarit@redhat.com>
        <20161201200739.qcibekpe37podnmu@pd.tnic>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2547
Lines: 59

On Thursday, December 01, 2016 09:07:39 PM Borislav Petkov wrote:
> On Wed, Nov 30, 2016 at 08:19:39AM -0500, Prarit Bhargava wrote:
> > When removing and adding cpu 0 on a system with GHES NMI the following stack
> > trace is seen when re-adding the cpu:
> >
> > WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
> > Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
> > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
> > Call Trace:
> >  dump_stack+0x63/0x8e
> >  __warn+0xd1/0xf0
> >  warn_slowpath_null+0x1d/0x20
> >  setup_local_APIC+0x275/0x370
> >  apic_ap_setup+0xe/0x20
> >  start_secondary+0x48/0x180
> >  set_init_arg+0x55/0x55
> >  early_idt_handler_array+0x120/0x120
> >  x86_64_start_reservations+0x2a/0x2c
> >  x86_64_start_kernel+0x13d/0x14c
> >
> > During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
> > NMI on CPU 0. The GHES NMI handler, ghes_notify_nmi() runs the
> > ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR
> > (0xf6). The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
> > 0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
> > that something has set the IRQ_WORK_VECTOR line prior to the APIC being
> > initialized.
> >
> > Commit 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
> > incorrectly modified the behavior such that the handler returns
> > NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
> > work queue for every NMI.
> >
> > This patch modifies the ghes_proc_irq_work() to run as it did prior to
> > 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler") by
> > properly returning NMI_HANDLED and only calling the work queue if
> > NMI_HANDLED has been set.
> >
> > v2: Borislav, setting of NMI_HANDLED moved & cleaned up changelog.
> >
> > Fixes: 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
> > Signed-off-by: Prarit Bhargava
> > Cc: Borislav Petkov
> > Cc: Rafael J. Wysocki
> > Cc: Len Brown
> > Cc: Paul Gortmaker
> > Cc: Tyler Baicar
> > Cc: Punit Agrawal
> > Cc: Don Zickus
> > ---
> >  drivers/acpi/apei/ghes.c | 7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
>
> Reviewed-by: Borislav Petkov

I guess I should pick up this one, then?

Thanks,
Rafael