Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932849AbbHXMai (ORCPT ); Mon, 24 Aug 2015 08:30:38 -0400 Received: from mail-io0-f178.google.com ([209.85.223.178]:34788 "EHLO mail-io0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932594AbbHXMag (ORCPT ); Mon, 24 Aug 2015 08:30:36 -0400 MIME-Version: 1.0 In-Reply-To: <55DAEF3C.5090303@arm.com> References: <20150822131050.GK1820@rric.localdomain> <55DAEF3C.5090303@arm.com> Date: Mon, 24 Aug 2015 18:00:35 +0530 Message-ID: Subject: Re: [PATCH] irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum 23144 From: Ganapatrao Kulkarni To: Marc Zyngier Cc: Robert Richter , Will Deacon , Thomas Gleixner , Jason Cooper , "linux-arm-kernel@lists.infradead.org" , Tirumalesh Chalamarla , Robert Richter , "linux-kernel@vger.kernel.org" , Ganapatrao Kulkarni Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 10215 Lines: 294 Hi Marc, thanks for the review comments. On Mon, Aug 24, 2015 at 3:47 PM, Marc Zyngier wrote: > Hi Robert, > > Just came back from the Seattle madness, so picking up patches in > reverse order... ;-) > > On 22/08/15 14:10, Robert Richter wrote: >> The patch below adds a workaround for gicv3 in a numa environment. It >> is on top of my recent gicv3 errata patch submission v4 and Ganapat's >> arm64 numa patches for devicetree v5. >> >> Please comment. >> >> Thanks, >> >> -Robert >> >> >> >> From c432820451a46b8d1e299b8bfbfcdcb3b75340e7 Mon Sep 17 00:00:00 2001 >> From: Ganapatrao Kulkarni >> Date: Wed, 19 Aug 2015 23:40:05 +0530 >> Subject: [PATCH] irqchip, gicv3-its, numa: Workaround for Cavium ThunderX erratum >> 23144 >> >> This implements a workaround for gicv3-its erratum 23144 applicable >> for Cavium's ThunderX multinode systems. >> >> The erratum fixes the hang of ITS SYNC command by avoiding inter node >> io and collections/cpu mapping. This fix is only applicable for >> Cavium's ThunderX dual-socket platforms. > > Can you please elaborate on this? I can't see any reference to the SYNC > command there. Is that a case of an ITS not being able to route LPIs to > cores on another socket? we were seeing mapc command failing when we were mapping its of node0 with collections of node1(vice-versa). we found sync was timing out, which is issued post mapc(also for mapvi and movi). Yes this errata limits the routing of inter-node LPIs. > > I really need more details to understand this patch (please use short > sentences, I'm still in a different time zone). > >> >> Signed-off-by: Ganapatrao Kulkarni >> [ rric: Reworked errata code, added helper functions, updated commit >> message. ] >> >> Signed-off-by: Robert Richter >> --- >> arch/arm64/Kconfig | 14 +++++++++++ >> drivers/irqchip/irq-gic-common.c | 5 ++-- >> drivers/irqchip/irq-gic-v3-its.c | 54 ++++++++++++++++++++++++++++++++++------ >> 3 files changed, 64 insertions(+), 9 deletions(-) >> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >> index 3809187ed653..b92b7b70b29b 100644 >> --- a/arch/arm64/Kconfig >> +++ b/arch/arm64/Kconfig >> @@ -421,6 +421,20 @@ config ARM64_ERRATUM_845719 >> >> If unsure, say Y. >> >> +config CAVIUM_ERRATUM_22375 >> + bool "Cavium erratum 22375, 24313" >> + depends on NUMA >> + default y >> + help >> + Enable workaround for erratum 22375, 24313. >> + >> +config CAVIUM_ERRATUM_23144 >> + bool "Cavium erratum 23144" >> + depends on NUMA >> + default y >> + help >> + Enable workaround for erratum 23144. >> + >> config CAVIUM_ERRATUM_23154 >> bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed" >> depends on ARCH_THUNDER >> diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c >> index ee789b07f2d1..1dfce64dbdac 100644 >> --- a/drivers/irqchip/irq-gic-common.c >> +++ b/drivers/irqchip/irq-gic-common.c >> @@ -24,11 +24,12 @@ >> void gic_check_capabilities(u32 iidr, const struct gic_capabilities *cap, >> void *data) >> { >> - for (; cap->desc; cap++) { >> + for (; cap->init; cap++) { >> if (cap->iidr != (cap->mask & iidr)) >> continue; >> cap->init(data); >> - pr_info("%s\n", cap->desc); >> + if (cap->desc) >> + pr_info("%s\n", cap->desc); > > No. I really want to see what errata are applied when I look at a kernel > log. sorry, did not understood your comment, it is still printed using cap->desc. > >> } >> } >> >> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c >> index 4bb135721e72..666be39f13a9 100644 >> --- a/drivers/irqchip/irq-gic-v3-its.c >> +++ b/drivers/irqchip/irq-gic-v3-its.c >> @@ -43,7 +43,8 @@ >> #include "irqchip.h" >> >> #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1ULL << 0) >> -#define ITS_FLAGS_CAVIUM_THUNDERX (1ULL << 1) >> +#define ITS_WORKAROUND_CAVIUM_22375 (1ULL << 1) >> +#define ITS_WORKAROUND_CAVIUM_23144 (1ULL << 2) >> >> #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING (1 << 0) >> >> @@ -76,6 +77,7 @@ struct its_node { >> struct list_head its_device_list; >> u64 flags; >> u32 ite_size; >> + int numa_node; >> }; >> >> #define ITS_ITT_ALIGN SZ_256 >> @@ -609,11 +611,18 @@ static void its_eoi_irq(struct irq_data *d) >> static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val, >> bool force) >> { >> - unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask); >> + unsigned int cpu; >> struct its_device *its_dev = irq_data_get_irq_chip_data(d); >> struct its_collection *target_col; >> u32 id = its_get_event_id(d); >> >> + if (its_dev->its->flags & ITS_WORKAROUND_CAVIUM_23144) { >> + cpu = cpumask_any_and(mask_val, >> + cpumask_of_node(its_dev->its->numa_node)); > > I suppose cpumask_of_node returns only the *online* cores of a given > node, right? yes. > >> + } else { >> + cpu = cpumask_any_and(mask_val, cpu_online_mask); >> + } >> + >> if (cpu >= nr_cpu_ids) >> return -EINVAL; >> >> @@ -845,7 +854,7 @@ static int its_alloc_tables(struct its_node *its) >> u64 typer; >> u32 ids; >> >> - if (its->flags & ITS_FLAGS_CAVIUM_THUNDERX) { >> + if (its->flags & ITS_WORKAROUND_CAVIUM_22375) { >> /* >> * erratum 22375: only alloc 8MB table size >> * erratum 24313: ignore memory access type >> @@ -1093,6 +1102,11 @@ static void its_cpu_init_lpis(void) >> dsb(sy); >> } >> >> +static inline int numa_node_id_early(void) >> +{ >> + return MPIDR_AFFINITY_LEVEL(read_cpuid_mpidr(), 2); >> +} >> + >> static void its_cpu_init_collection(void) >> { >> struct its_node *its; >> @@ -1104,6 +1118,11 @@ static void its_cpu_init_collection(void) >> list_for_each_entry(its, &its_nodes, entry) { >> u64 target; >> >> + /* avoid cross node core and its mapping */ >> + if ((its->flags & ITS_WORKAROUND_CAVIUM_23144) && >> + its->numa_node != numa_node_id_early()) >> + continue; >> + > > Argh. This is horrible. You really need some topology bindings to > describe your system instead of hardcoding some random level of > affinity. The next time someone is going to come up with a similarly > broken system, they will have to reinvent that wheel. thanks for the suggestion, we will use cpu_topology[cpuid].cluster_id instead of function numa_node_id_early() > >> /* >> * We now have to bind each collection to its target >> * redistributor. >> @@ -1372,9 +1391,15 @@ static void its_irq_domain_activate(struct irq_domain *domain, >> { >> struct its_device *its_dev = irq_data_get_irq_chip_data(d); >> u32 event = its_get_event_id(d); >> + unsigned int cpu; >> + >> + if (its_dev->its->flags & ITS_WORKAROUND_CAVIUM_23144) >> + cpu = cpumask_first(cpumask_of_node(its_dev->its->numa_node)); >> + else >> + cpu = cpumask_first(cpu_online_mask); > > Looks like this can be factored with the code you've added in set_affinity. we cant issue mapvi with cross-node mapping. > >> >> /* Bind the LPI to the first possible CPU */ >> - its_dev->event_map.col_map[event] = cpumask_first(cpu_online_mask); >> + its_dev->event_map.col_map[event] = cpu; >> >> /* Map the GIC IRQ and event to the device */ >> its_send_mapvi(its_dev, d->hwirq, event); >> @@ -1457,16 +1482,31 @@ static int its_force_quiescent(void __iomem *base) >> } >> } >> >> +static inline int its_get_node_thunderx(struct its_node *its) >> +{ >> + return (its->phys_base >> 44) & 0x3; > > Why 3? Is that because you have provision for 4 sockets or what? yes. > >> +} >> + >> static void its_enable_cavium_thunderx(void *data) >> { >> - struct its_node *its = data; >> + struct its_node __maybe_unused *its = data; >> >> - its->flags |= ITS_FLAGS_CAVIUM_THUNDERX; >> +#ifdef CONFIG_CAVIUM_ERRATUM_22375 >> + its->flags |= ITS_WORKAROUND_CAVIUM_22375; >> + pr_info("ITS: Enabling workaround for 22375, 24313\n"); >> +#endif >> + >> +#ifdef CONFIG_CAVIUM_ERRATUM_23144 >> + if (num_possible_nodes() > 1) { >> + its->numa_node = its_get_node_thunderx(its); > > I'd rather see numa_node being always initialized to something useful. > If you're adding numa support, why can't this be initialized via > standard topology bindings? IIUC, topology defines only cpu topology. > > Also, you're mixing things coming from the address map and information > coming from the MPIDR. I must say this doesn't fill me with confidence. > >> + its->flags |= ITS_WORKAROUND_CAVIUM_23144; >> + pr_info("ITS: Enabling workaround for 23144\n"); >> + } >> +#endif > > It would probably make sense to split these in two function, each with > its own erratum entry. sure will do. > >> } >> >> static const struct gic_capabilities its_errata[] = { >> { >> - .desc = "ITS: Cavium errata 22375, 24313", >> .iidr = 0xa100034c, /* ThunderX pass 1.x */ >> .mask = 0xffff0fff, >> .init = its_enable_cavium_thunderx, >> > > Thanks, > > M. > -- > Jazz is not dead. It just smells funny... thanks Ganapat > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/