Date: Tue, 5 Nov 2019 00:41:11 +0100 (CET)
From: Thomas Gleixner
To: Bjorn Helgaas
cc: Kar Hin Ong, linux-rt-users, LKML, linux-x86_64@vger.kernel.org, linux-pci@vger.kernel.org, "H.
    Peter Anvin", Dave Hansen
Subject: Re: "oneshot" interrupt causes another interrupt to be fired erroneously in Haswell system
In-Reply-To: <20191031230532.GA170712@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 31 Oct 2019, Bjorn Helgaas wrote:
> On Thu, Oct 31, 2019 at 03:53:50AM +0000, Kar Hin Ong wrote:
> > I've an Intel Haswell system running Linux kernel v4.14 with
> > preempt_rt patch. The system contain 2 IOAPICs: IOAPIC 1 is on the
> > PCH where IOAPIC 2 is on the CPU.
> >
> > I observed that whenever a PCI device is firing interrupt (INTx) to
> > Pin 20 of IOAPIC 2 (GSI 44); the kernel will receives 2 interrupts:
> >    1. Interrupt from Pin 20 of IOAPIC 2  -> Expected
> >    2. Interrupt from Pin 19 of IOAPIC 1  -> UNEXPECTED, erroneously
> >       triggered
> >
> > The unexpected interrupt is unhandled eventually. When this scenario
> > happen more than 99,000 times, kernel disables the interrupt line
> > (Pin 19 of IOAPIC 1) and causing device that has requested it become
> > malfunction.
> >
> > I managed to also reproduced this issue on RHEL 8 and Ubuntu 19-04
> > (without preempt_rt patch) after added "threadirqs" to the kernel
> > command line.
> >
> > After digging further, I noticed that the said issue is happened
> > whenever an interrupt pin on IOAPIC 2 is masked:
> >   - Masking Pin 20 of IOAPIC 2 triggers Pin 19 of IOAPIC 1
> >   - Masking Pin 22 of IOAPIC 2 triggers Pin 18 of IOAPIC 1

This is pretty much the same problem which we had analyzed and worked
around years ago.
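
The behaviour described in the report above can be pictured with a small toy model. This is only a sketch for illustration, not kernel code: the pin pairs (20 -> 19, 22 -> 18) are taken from the report, while the class name, method names and the simplified ordering of a threaded ("oneshot") handling cycle are assumptions made for the example.

```python
# Toy model only. Pin pairs come from the report quoted above; the class,
# its methods and the event ordering are hypothetical, not kernel code.

# IOAPIC 2 (CPU) pin -> IOAPIC 1 (PCH) pin that was seen firing instead
BOOT_IRQ_PIN = {20: 19, 22: 18}


class ToyIoApicPair:
    def __init__(self):
        self.masked = set()   # pins currently masked on IOAPIC 2
        self.delivered = []   # (ioapic, pin) tuples seen by the "kernel"

    def mask(self, pin):
        self.masked.add(pin)

    def unmask(self, pin):
        self.masked.discard(pin)

    def assert_intx(self, pin):
        """Device asserts the INTx line that is routed to IOAPIC 2."""
        if pin in self.masked:
            # Masked entry: the interrupt shows up on the PCH IOAPIC.
            self.delivered.append((1, BOOT_IRQ_PIN[pin]))
        else:
            self.delivered.append((2, pin))


def oneshot_cycle(apics, pin):
    """One simplified threaded-handler cycle with the line still asserted."""
    apics.assert_intx(pin)  # interrupt arrives on IOAPIC 2 as expected
    apics.mask(pin)         # line stays masked until the thread has run
    apics.assert_intx(pin)  # device still asserts INTx while masked
    apics.unmask(pin)       # thread finished, line is unmasked again
    return apics.delivered
```

In this model, one cycle on pin 20 yields the expected interrupt on IOAPIC 2 followed by an extra one on pin 19 of IOAPIC 1, which matches the two interrupts listed in the report.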

> > From Intel Xeon Processor E5/E7 v3 Product Family External Design
> > Specification (EDS), Volume One: Architecture, section 13.1 (Legacy
> > PCI Interrupt Handling), it mention: "If the I/OxAPIC entry is
> > masked (via the 'mask' bit in the corresponding Redirection Table
> > Entry), then the corresponding PCI Express interrupt(s) is forwarded
> > to the legacy PCH"

Oh well. Really useful behaviour - NOT!

> > I would like to understand if my interpretation is make sense. If
> > yes, should the "oneshot" algorithm need to be updated to support
> > Haswell system?

No. You cannot change the oneshot algorithm.

The workarounds for this are enabled by PCI quirks and either
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y or 'ioapicreroute' on the
command line.

It might be worth a try to add the PCI ID of that box to the quirk list,
i.e. the PCI ID matches in drivers/pci/quirks.c which belong to the
function: quirk_reroute_to_boot_interrupts_intel().

Thanks,

	tglx
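
As a rough illustration of what "adding the PCI ID to the quirk list" means, the sketch below models a quirk list as a set of (vendor, device) pairs and shows that a device is only covered once its ID is on the list. It is not kernel code: the device IDs are made-up placeholders, the helper names are hypothetical, and only the Intel vendor ID 0x8086 is a real value. In the kernel the workaround additionally depends on CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y or 'ioapicreroute', as stated above.

```python
# Illustrative sketch, not kernel code. The device IDs are placeholders;
# only the Intel vendor ID (0x8086) is a real value.
INTEL_VENDOR_ID = 0x8086

# Hypothetical stand-in for the ID matches that belong to
# quirk_reroute_to_boot_interrupts_intel() in drivers/pci/quirks.c.
REROUTE_QUIRK_IDS = {
    (INTEL_VENDOR_ID, 0x1111),  # placeholder device ID
    (INTEL_VENDOR_ID, 0x2222),  # placeholder device ID
}


def needs_boot_irq_reroute(vendor, device, quirk_ids=REROUTE_QUIRK_IDS):
    """Return True if the (vendor, device) pair is on the quirk list."""
    return (vendor, device) in quirk_ids


def with_added_id(quirk_ids, vendor, device):
    """Return a new quirk list that also matches the given PCI ID."""
    return set(quirk_ids) | {(vendor, device)}
```

A device with the placeholder ID 0x3333 is not matched by the initial list, but is matched by the list returned from with_added_id(REROUTE_QUIRK_IDS, 0x8086, 0x3333).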