Subject: Re: False positive "do_IRQ: #.55 No irq handler for vector" messages on AMD ryzen based laptops
To: "Lendacky, Thomas", Thomas Gleixner
Cc: Linux Kernel Mailing List, "Rafael J. Wysocki", Borislav Petkov
References: <95e76875-f6b2-cbea-cd74-dc14ee77b2f8@redhat.com> <13dbe818-a364-4cd4-3ac4-78bd7e8d28e3@amd.com> <9f17f1aa-f258-fb18-0736-04a5c03cf40e@redhat.com> <57b32bc1-8ef2-1e1e-a70f-04444f5919a2@amd.com>
From: Hans de Goede
Message-ID: <6fbcd261-f9e2-1685-1ef7-f148007aab9d@redhat.com>
Date: Tue, 5 Mar 2019 17:02:02 +0100
In-Reply-To: <57b32bc1-8ef2-1e1e-a70f-04444f5919a2@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 05-03-19 15:06, Lendacky, Thomas wrote:
> On 3/3/19 4:57 AM, Hans de Goede wrote:
>> Hi,
>>
>> On 21-02-19 13:30, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 19-02-19 22:47, Lendacky, Thomas wrote:
>>>> On 2/19/19 3:01 PM, Thomas Gleixner wrote:
>>>>> Hans,
>>>>>
>>>>> On Tue, 19 Feb 2019, Hans de Goede wrote:
>>>>>
>>>>> Cc+: ACPI/AMD folks
>>>>>
>>>>>> Various people are reporting false positive "do_IRQ: #.55 No irq
>>>>>> handler for vector" messages on AMD ryzen based laptops, see e.g.:
>>>>>>
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1551605
>>>>>>
>>>>>> Which contains this dmesg snippet:
>>>>>>
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smp: Bringing up secondary CPUs ...
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: x86: Booting SMP configuration:
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: .... node  #0, CPUs:      #1
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: do_IRQ: 1.55 No irq handler for vector
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel:  #2
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: do_IRQ: 2.55 No irq handler for vector
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel:  #3
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: do_IRQ: 3.55 No irq handler for vector
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smp: Brought up 1 node, 4 CPUs
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smpboot: Max logical packages: 1
>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smpboot: Total of 4 processors activated (15968.49 BogoMIPS)
>>>>>>
>>>>>> It seems that we get an IRQ for each CPU as we bring it online,
>>>>>> which feels to me like it is some sorta false-positive.
>>>>>
>>>>> Sigh, that looks like BIOS value add again.
>>>>>
>>>>> It's not a false positive. Something _IS_ sending a vector 55 to these
>>>>> CPUs for whatever reason.
>>>>>
>>>>
>>>> I remember seeing something like this in the past and it turned out to be
>>>> a BIOS issue.  BIOS was enabling the APs to interact with the legacy 8259
>>>> interrupt controller when only the BSP should. During POST the APs were
>>>> exposed to ExtINT/INTR events as a result of the mis-configuration
>>>> (probably due to a UEFI timer-tick using the 8259) and this left a pending
>>>> ExtINT/INTR interrupt latched on the APs.
>>>>
>>>> When the APs were started by the OS, the latched ExtINT/INTR interrupt is
>>>> processed shortly after the OS enables interrupts. The AP then queries the
>>>> 8259 to identify the vector number (which is the value of the 8259's ICW2
>>>> register + the IRQ level).
>>>> The master 8259's ICW2 was set to 0x30 and,
>>>> since no interrupts are actually pending, the 8259 will respond with IRQ7
>>>> (spurious interrupt) yielding a vector of 0x37 or 55.
>>>>
>>>> The OS was not expecting vector 55 and printed the message.
>>>>
>>>> From the Intel Developer's Manual: Vol 3a, Section 10.5.1:
>>>> "Only one processor in the system should have an LVT entry configured to
>>>> use the ExtINT delivery mode."
>>>>
>>>> Not saying this is the problem, but very well could be.
>>>
>>> That sounds like a likely candidate, esp. also since this only happens
>>> once per CPU when we first online the CPU.
>>>
>>> Can you provide me with a patch with some printk-s / pr_debugs to
>>> test for this, then I can build a kernel with that patch added and
>>> we can see if your hypothesis is right.
>>
>> Ping? I like your theory, can you provide some help with debugging this
>> further (to prove that your theory is correct)?
>
> It's been a very long time since I dealt with this and I was only on the
> periphery. You might be able to print the LVT entries from the APIC and
> see if any of them have an un-masked ExtINT delivery mode. You would need
> to do this very early before Linux modifies any values.

I'm afraid I'm not familiar enough with the interrupt / APIC parts of
the kernel to do something like this myself.

> Or you can report the issue to the OEM and have them check their BIOS
> code to see if they are doing this.

I will try to go this route, but I'm not really hopeful that will
lead to a solution.

Regards,

Hans
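
For reference, the two checks discussed in the quoted text above can be
sketched in plain C. This is only an illustration under stated assumptions,
not a kernel patch: the helper names i8259_vector() and
lvt_is_unmasked_extint() are hypothetical, and the sketch assumes the raw
32-bit LVT register value (e.g. LINT0 or LINT1) has already been read by
some other means early in boot. The bit positions follow the LVT layout
in the Intel manual (delivery mode in bits 10:8, mask in bit 16).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local APIC LVT register fields (per the Intel SDM LVT layout). */
#define LVT_DELIVERY_MODE_SHIFT  8
#define LVT_DELIVERY_MODE_MASK   0x7u
#define LVT_DELIVERY_MODE_EXTINT 0x7u        /* 111b selects ExtINT */
#define LVT_MASK_BIT             (1u << 16)  /* 1 = entry is masked */

/*
 * Hypothetical helper: vector the 8259 reports for a given IRQ level.
 * The upper five bits come from ICW2, the lower three are the IRQ level,
 * which for an aligned base is the "ICW2 + IRQ level" sum from the thread.
 */
static inline unsigned int i8259_vector(uint8_t icw2, unsigned int irq)
{
    assert(irq < 8); /* one 8259 handles IRQ levels 0..7 */
    return (icw2 & 0xF8u) | irq;
}

/*
 * Hypothetical helper: true if a raw LVT value is configured for
 * ExtINT delivery and is not masked.
 */
static inline bool lvt_is_unmasked_extint(uint32_t lvt)
{
    uint32_t mode = (lvt >> LVT_DELIVERY_MODE_SHIFT) & LVT_DELIVERY_MODE_MASK;

    return mode == LVT_DELIVERY_MODE_EXTINT && (lvt & LVT_MASK_BIT) == 0;
}
```

With the values from the thread, i8259_vector(0x30, 7) evaluates to 0x37
(decimal 55), which matches the ".55" in the do_IRQ messages. An LVT value
of 0x00000700 counts as un-masked ExtINT, while 0x00010700 (same delivery
mode, mask bit set) does not.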