Date: Thu, 1 Dec 2016 21:07:39 +0100
From: Borislav Petkov
To: Prarit Bhargava
Cc: linux-kernel@vger.kernel.org, Borislav Petkov, "Rafael J. Wysocki",
    Len Brown, Paul Gortmaker, Tyler Baicar, Punit Agrawal, Don Zickus
Subject: Re: [PATCH v2] ACPI / APEI: Fix NMI notification handling
Message-ID: <20161201200739.qcibekpe37podnmu@pd.tnic>
References: <20161129193624.krjz2bpinl2ioi7o@pd.tnic>
    <1480511979-11722-1-git-send-email-prarit@redhat.com>
In-Reply-To: <1480511979-11722-1-git-send-email-prarit@redhat.com>

On Wed, Nov 30, 2016 at 08:19:39AM -0500, Prarit Bhargava wrote:
> When removing and adding cpu 0 on a system with GHES NMI the following stack
> trace is seen when re-adding the cpu:
>
> WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
> Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
> Call Trace:
>  dump_stack+0x63/0x8e
>  __warn+0xd1/0xf0
>  warn_slowpath_null+0x1d/0x20
>  setup_local_APIC+0x275/0x370
>  apic_ap_setup+0xe/0x20
>  start_secondary+0x48/0x180
>  set_init_arg+0x55/0x55
>  early_idt_handler_array+0x120/0x120
>  x86_64_start_reservations+0x2a/0x2c
>  x86_64_start_kernel+0x13d/0x14c
>
> During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
> NMI on CPU 0.  The GHES NMI handler, ghes_notify_nmi() runs the
> ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR
> (0xf6).  The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
> 0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
> that something has set the IRQ_WORK_VECTOR line prior to the APIC being
> initialized.
>
> Commit 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
> incorrectly modified the behavior such that the handler returns
> NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
> work queue for every NMI.
>
> This patch modifies the ghes_proc_irq_work() to run as it did prior to
> 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler") by
> properly returning NMI_HANDLED and only calling the work queue if
> NMI_HANDLED has been set.
>
> v2: Borislav, setting of NMI_HANDLED moved & cleaned up changelog.
>
> Fixes: 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
> Signed-off-by: Prarit Bhargava
> Cc: Borislav Petkov
> Cc: Rafael J. Wysocki
> Cc: Len Brown
> Cc: Paul Gortmaker
> Cc: Tyler Baicar
> Cc: Punit Agrawal
> Cc: Don Zickus
> ---
>  drivers/acpi/apei/ghes.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Reviewed-by: Borislav Petkov

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
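
For readers following the changelog, here is a minimal userspace sketch in C of the two points it makes. It is not the actual ghes.c change. All function names (nmi_handler_sketch, queue_work_stub, irr_mask_for_vector) and the error_pending parameter are hypothetical and exist only for illustration. The first part shows why vector 0xf6 shows up as 0x400000 in the IRR word covering vectors 224 to 255. The second part shows the intended behaviour of returning NMI_HANDLED only when an error was processed and queueing the deferred work only in that case.

```c
#include <assert.h> /* for callers that want to check the results */

/* Mirrors the kernel's NMI_DONE / NMI_HANDLED return codes. */
enum { NMI_DONE = 0, NMI_HANDLED = 1 };

/* Vector number quoted in the changelog. */
#define IRQ_WORK_VECTOR 0xf6u

/* Counts how often the deferred work was queued (illustration only). */
static int work_queued;

/* Hypothetical stand-in for queueing the GHES irq work. */
static void queue_work_stub(void)
{
    work_queued++;
}

/*
 * Each 32-bit IRR word covers 32 vectors, so the bit for a vector
 * within its word is (vector % 32). For 0xf6 (246) that is bit 22
 * of the word covering vectors 224..255.
 */
static unsigned int irr_mask_for_vector(unsigned int vector)
{
    return 1u << (vector % 32u);
}

/*
 * Hypothetical handler: error_pending models "some GHES source
 * reported an error during this NMI". Work is queued only when
 * the NMI was actually handled.
 */
static int nmi_handler_sketch(int error_pending)
{
    int ret = NMI_DONE;

    if (error_pending)
        ret = NMI_HANDLED;

    if (ret == NMI_HANDLED)
        queue_work_stub();

    return ret;
}
```

With this sketch, an NMI that carries no GHES error (such as the wakeup NMI sent during CPU 0 bringup) returns NMI_DONE and queues nothing, so nothing would raise IRQ_WORK_VECTOR before the APIC is set up.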