Date: Sun, 21 Jan 2018 11:35:34 +0000
Message-ID: <86po635trt.wl-marc.zyngier@arm.com>
From: Marc Zyngier
To: Jayachandran C
Cc: Ganapatrao Kulkarni
Subject: Re: [PATCH v2] irqchip/gic-v3-its: Add workaround for ThunderX2 erratum #174
In-Reply-To: <20180121070038.GA4450@jc-sabre>
References: <20180118052820.30286-1-ganapatrao.kulkarni@cavium.com> <20180121070038.GA4450@jc-sabre>
Organization: ARM Ltd
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 21 Jan 2018 07:00:48 +0000,
Jayachandran C wrote:
>
> On Thu, Jan 18, 2018 at 10:58:20AM +0530, Ganapatrao Kulkarni wrote:
> > This erratum is observed on the ThunderX2 GICv3 ITS. When a
> > MOVI command is used to change affinity of a LPI to a collection/cpu
> > on another node, the LPI is not delivered to the cpu.
> > An additional INV command is required after the MOVI to deliver
> > the LPI to the new destination.
> >
> > If we add INV after MOVI, there is a chance that we lose LPIs which
> > are raised when the affinity is changed. So for now, adding workaround fix
> > to disable inter node affinity change.
> >
> > Signed-off-by: Ganapatrao Kulkarni
> > ---
> >
> > v2: Added workaround to avoid inter node affinity change.
> >
> > v1: Initial patch
> >
> >  Documentation/arm64/silicon-errata.txt |  1 +
> >  arch/arm64/Kconfig                     | 10 ++++++++++
> >  drivers/irqchip/irq-gic-v3-its.c       | 21 ++++++++++++++++++++-
> >  3 files changed, 31 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
> > index fc1c884..fb27cb5 100644
> > --- a/Documentation/arm64/silicon-errata.txt
> > +++ b/Documentation/arm64/silicon-errata.txt
> > @@ -63,6 +63,7 @@ stable kernels.
> >  | Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456        |
> >  | Cavium         | ThunderX Core   | #30115          | CAVIUM_ERRATUM_30115        |
> >  | Cavium         | ThunderX SMMUv2 | #27704          | N/A                         |
> > +| Cavium         | ThunderX2 ITS   | #174            | CAVIUM_ERRATUM_174          |
> >  | Cavium         | ThunderX2 SMMUv3| #74             | N/A                         |
> >  | Cavium         | ThunderX2 SMMUv3| #126            | N/A                         |
> >  |                |                 |                 |                             |
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index c9a7e9e..0dbf3bd 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -461,6 +461,16 @@ config ARM64_ERRATUM_843419
> >
> >  	  If unsure, say Y.
> >
> > +config CAVIUM_ERRATUM_174
> > +	bool "Cavium ThunderX2 erratum 174"
> > +	default y
> > +	help
> > +	  Cavium ThunderX2 dual socket systems may loose interrupts
> > +	  on affinity change to a cpu on other node.
> > +	  This workaround fix avoids inter node affinity change.
>
> This has to be fixed up to match the commit message (and for spelling).
> I have seen some questions offlist about how important this fix is,
> and how it can affect users - so that would be useful to have in the
> description as well.
>
> To clarify, this errata comes into play only when the irq affinity is
> forced from the node given by the device (and ITS) affinity to another
> node. This should not happen in normal, useful configurations.

Define normal. That's all under control of userspace, and the kernel
doesn't really have a say. irqbalance will happily move interrupts
around. Disable all CPUs from a node at runtime (again, from userspace),
and you'll get the exact same thing. I can't see what's so "abnormal"
about any of that.

> Also, we will hold further posting of this errata until we do another
> round of investigation with the hardware team for a better solution.
> If we can handle the pending interrupts for the small window of MOVI/INV
> in first workaround, we will not need this restriction at all.

What do you mean by "If we can handle the pending interrupts for the
small window of MOVI/INV"? Taking the interrupt on the source CPU?
Sure, that would be fine. But that's assuming that the source CPU is in
a position to actually handle this, and is not simply going down.

If there is only a slight possibility that you may lose an interrupt
in the MOVI/INV window (which is not that small, since that's a 4
command sequence), your only other solution is to inject a spurious
interrupt to replace the one you may have lost in that window.

In the meantime, and until I see a patch fixing this (or a decent
explanation of why this isn't a problem), I'll consider it broken.

Thanks,

	M.
-- Jazz is not dead, it just smell funny.
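
Editorial illustration (not part of the original message): the
restriction discussed in this thread, "avoid inter node affinity
change", can be sketched as a small userspace toy model. This is only a
sketch under stated assumptions. It is not the code from the patch (the
irq-gic-v3-its.c hunk is not quoted in this message), and every name in
it (cpu_mask_t, cpu_bit, first_cpu, pick_target_cpu, NO_CPU) is
hypothetical rather than a kernel API. CPUs are modelled as bits in a
64-bit mask.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, hypothetical names: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

#define NO_CPU (-1)

/* Mask with only the given CPU set. */
static cpu_mask_t cpu_bit(int cpu)
{
    assert(cpu >= 0 && cpu < 64);
    return (cpu_mask_t)1 << cpu;
}

/* Lowest-numbered CPU present in mask, or NO_CPU if the mask is empty. */
static int first_cpu(cpu_mask_t mask)
{
    for (int cpu = 0; cpu < 64; cpu++)
        if (mask & cpu_bit(cpu))
            return cpu;
    return NO_CPU;
}

/*
 * Choose a target CPU for an interrupt whose ITS lives on the node
 * covering its_node_cpus.
 *
 * Without the workaround, any requested online CPU is eligible.
 * With the workaround, only requested online CPUs on the ITS's own node
 * are eligible, so a cross-node move is never attempted. If nothing is
 * eligible, the request is refused by returning NO_CPU.
 */
static int pick_target_cpu(cpu_mask_t requested, cpu_mask_t online,
                           cpu_mask_t its_node_cpus, int workaround)
{
    cpu_mask_t candidates = requested & online;

    if (workaround)
        candidates &= its_node_cpus;

    return first_cpu(candidates);
}
```

With node 0 as CPUs 0-3 (mask 0x0F) and node 1 as CPUs 4-7 (mask 0xF0),
a request for CPU 5 on an ITS attached to node 0 is refused once the
workaround is on, while a request for CPUs 2 and 5 resolves to CPU 2.
If userspace takes every node 0 CPU offline (online mask 0xF0), the
model has no eligible target at all, which is the hotplug case the
reply above raises.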