Date: Mon, 28 Nov 2022 10:34:38 +0000
From: Marc Zyngier <maz@kernel.org>
To: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>,
    Paul Walmsley <paul.walmsley@sifive.com>,
    Thomas Gleixner <tglx@linutronix.de>,
    Daniel Lezcano <daniel.lezcano@linaro.org>,
    Atish Patra <atishp@atishpatra.org>,
    Alistair Francis <Alistair.Francis@wdc.com>,
    Anup Patel <anup@brainfault.org>,
    linux-riscv@lists.infradead.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v12 3/7] genirq: Add mechanism to multiplex a single HW IPI
Message-ID: <86k03fmkox.wl-maz@kernel.org>
In-Reply-To: <20221126173453.306088-4-apatel@ventanamicro.com>
References: <20221126173453.306088-1-apatel@ventanamicro.com>
    <20221126173453.306088-4-apatel@ventanamicro.com>
On Sat, 26 Nov 2022 17:34:49 +0000,
Anup Patel <apatel@ventanamicro.com> wrote:
>
> All RISC-V platforms have a single HW IPI provided by the INTC local
> interrupt controller. The HW method to trigger INTC IPI can be through
> external irqchip (e.g. RISC-V AIA), through platform specific device
> (e.g. SiFive CLINT timer), or through firmware (e.g. SBI IPI call).
>
> To support multiple IPIs on RISC-V, we add a generic IPI multiplexing
> mechanism which help us create multiple virtual IPIs using a single
> HW IPI. This generic IPI multiplexing is inspired from the Apple AIC
> irqchip driver and it is shared by various RISC-V irqchip drivers.
>
> Signed-off-by: Anup Patel <apatel@ventanamicro.com>
> ---
>  include/linux/irq.h  |   4 +
>  kernel/irq/Kconfig   |   5 ++
>  kernel/irq/Makefile  |   1 +
>  kernel/irq/ipi-mux.c | 210 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 220 insertions(+)
>  create mode 100644 kernel/irq/ipi-mux.c
>
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index c3eb89606c2b..6024e1ee1257 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -1266,6 +1266,10 @@ int __ipi_send_mask(struct irq_desc *desc, const struct cpumask *dest);
>  int ipi_send_single(unsigned int virq, unsigned int cpu);
>  int ipi_send_mask(unsigned int virq, const struct cpumask *dest);
>
> +void ipi_mux_process(void);
> +int ipi_mux_create(unsigned int nr_ipi,
> +		   void (*mux_send)(const struct cpumask *));
> +
>  #ifdef CONFIG_GENERIC_IRQ_MULTI_HANDLER
>  /*
>   * Registers a generic IRQ handling function as the top-level IRQ handler in
> diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
> index db3d174c53d4..df17dbc54b02 100644
> --- a/kernel/irq/Kconfig
> +++ b/kernel/irq/Kconfig
> @@ -86,6 +86,11 @@ config GENERIC_IRQ_IPI
>  	depends on SMP
>  	select IRQ_DOMAIN_HIERARCHY
>
> +# Generic IRQ IPI Mux support
> +config GENERIC_IRQ_IPI_MUX
> +	bool
> +	depends on SMP
> +
>  # Generic MSI interrupt support
>  config GENERIC_MSI_IRQ
>  	bool
> diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile
> index b4f53717d143..f19d3080bf11 100644
> --- a/kernel/irq/Makefile
> +++ b/kernel/irq/Makefile
> @@ -15,6 +15,7 @@ obj-$(CONFIG_GENERIC_IRQ_MIGRATION) += cpuhotplug.o
>  obj-$(CONFIG_PM_SLEEP) += pm.o
>  obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o
>  obj-$(CONFIG_GENERIC_IRQ_IPI) += ipi.o
> +obj-$(CONFIG_GENERIC_IRQ_IPI_MUX) += ipi-mux.o
>  obj-$(CONFIG_SMP) += affinity.o
>  obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o
>  obj-$(CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR) += matrix.o
> diff --git a/kernel/irq/ipi-mux.c b/kernel/irq/ipi-mux.c
> new file mode 100644
> index 000000000000..366d8cd5320b
> --- /dev/null
> +++ b/kernel/irq/ipi-mux.c
> @@ -0,0 +1,210 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Multiplex several virtual IPIs over a single HW IPI.
> + *
> + * Copyright The Asahi Linux Contributors
> + * Copyright (c) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "ipi-mux: " fmt
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +struct ipi_mux_cpu {
> +	atomic_t		enable;
> +	atomic_t		bits;
> +	struct cpumask		send_mask;
> +};
> +
> +static struct ipi_mux_cpu __percpu *ipi_mux_pcpu;
> +static struct irq_domain *ipi_mux_domain;
> +static void (*ipi_mux_send)(const struct cpumask *mask);
> +
> +static void ipi_mux_mask(struct irq_data *d)
> +{
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> +
> +	atomic_andnot(BIT(irqd_to_hwirq(d)), &icpu->enable);
> +}
> +
> +static void ipi_mux_unmask(struct irq_data *d)
> +{
> +	u32 ibit = BIT(irqd_to_hwirq(d));
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> +
> +	atomic_or(ibit, &icpu->enable);
> +
> +	/*
> +	 * The atomic_or() above must complete before the atomic_read()
> +	 * below to avoid racing ipi_mux_send_mask().
> +	 */
> +	smp_mb__after_atomic();
> +
> +	/* If a pending IPI was unmasked, raise a parent IPI immediately. */
> +	if (atomic_read(&icpu->bits) & ibit)
> +		ipi_mux_send(cpumask_of(smp_processor_id()));
> +}
> +
> +static void ipi_mux_send_mask(struct irq_data *d, const struct cpumask *mask)
> +{
> +	u32 ibit = BIT(irqd_to_hwirq(d));
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> +	struct cpumask *send_mask = &icpu->send_mask;
> +	unsigned long flags;
> +	int cpu;
> +
> +	/*
> +	 * We use send_mask as a per-CPU variable so disable local
> +	 * interrupts to avoid being preempted.
> +	 */
> +	local_irq_save(flags);

The correct way to avoid preemption is to use preempt_disable(), which
is a lot cheaper than disabling interrupts on most architectures.

> +
> +	cpumask_clear(send_mask);

This thing is likely to be unnecessarily expensive on very large
systems, as it is proportional to the number of CPUs.

> +
> +	for_each_cpu(cpu, mask) {
> +		icpu = per_cpu_ptr(ipi_mux_pcpu, cpu);
> +		atomic_or(ibit, &icpu->bits);

The original code had an atomic_fetch_or_release() to allow eliding
the IPI if the target interrupt was already pending. Why is that code
gone? This is a pretty cheap and efficient optimisation.

> +
> +		/*
> +		 * The atomic_or() above must complete before
> +		 * the atomic_read() below to avoid racing with
> +		 * ipi_mux_unmask().
> +		 */
> +		smp_mb__after_atomic();
> +
> +		if (atomic_read(&icpu->enable) & ibit)
> +			cpumask_set_cpu(cpu, send_mask);
> +	}
> +
> +	/* Trigger the parent IPI */
> +	ipi_mux_send(send_mask);

IPIs are very rarely made pending on more than a single CPU at a time.
The overwhelming majority of them are targeting a single CPU. So
accumulating bits to avoid doing two or more "send" actions only
penalises the generic case.

My conclusion is that this "send_mask" can probably be removed,
together with the preemption fiddling.

> +
> +	local_irq_restore(flags);
> +}
> +
> +static const struct irq_chip ipi_mux_chip = {
> +	.name		= "IPI Mux",
> +	.irq_mask	= ipi_mux_mask,
> +	.irq_unmask	= ipi_mux_unmask,
> +	.ipi_send_mask	= ipi_mux_send_mask,
> +};

OK, you have now dropped the superfluous pre/post handlers. But the
need still exists. Case in point, the aic_handle_ipi() prologue and
epilogue to the interrupt handling.

I have suggested last time that the driver could provide the actual
struct irq_chip in order to provide the callbacks it requires.

Please realise that I will not take this patch if this cannot be made
to work with the single existing in-tree instance of an IPI MUX. 90%
of the code having been lifted from there, I think this is a pretty
fair ask.
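Coming back to ipi_mux_send_mask(): to make the points above concrete,
here is the rough shape I have in mind. This is only a sketch, not even
compile-tested, and it reuses the names already in this patch. The
pending bit is set with atomic_fetch_or_release() so that an
already-pending vIPI elides the parent IPI, each target CPU is
signalled directly instead of being accumulated in send_mask, and with
the per-CPU scratch mask gone there is nothing left that needs
interrupts (or even preemption) to be disabled:

static void ipi_mux_send_mask(struct irq_data *d, const struct cpumask *mask)
{
	u32 ibit = BIT(irqd_to_hwirq(d));
	struct ipi_mux_cpu *icpu;
	u32 pending;
	int cpu;

	for_each_cpu(cpu, mask) {
		icpu = per_cpu_ptr(ipi_mux_pcpu, cpu);

		/*
		 * Release ordering pairs with the atomic_fetch_andnot()
		 * in ipi_mux_process(). If the bit was already pending,
		 * the target CPU already has a parent IPI on its way
		 * and there is nothing more to do.
		 */
		pending = atomic_fetch_or_release(ibit, &icpu->bits);

		/* Order the write to bits against the enable check. */
		smp_mb__after_atomic();

		if (!(pending & ibit) && (atomic_read(&icpu->enable) & ibit))
			ipi_mux_send(cpumask_of(cpu));
	}
}

The common single-target case then does one fetch_or, one read and at
most one mux_send call, and the cpumask_clear() disappears entirely.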
> +
> +static int ipi_mux_domain_alloc(struct irq_domain *d, unsigned int virq,
> +				unsigned int nr_irqs, void *arg)
> +{
> +	int i;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_set_percpu_devid(virq + i);
> +		irq_domain_set_info(d, virq + i, i,
> +				    &ipi_mux_chip, d->host_data,

What does d->host_data represent here?

> +				    handle_percpu_devid_irq, NULL, NULL);
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct irq_domain_ops ipi_mux_domain_ops = {
> +	.alloc		= ipi_mux_domain_alloc,
> +	.free		= irq_domain_free_irqs_top,
> +};
> +
> +/**
> + * ipi_mux_process - Process multiplexed virtual IPIs
> + */
> +void ipi_mux_process(void)
> +{
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(ipi_mux_pcpu);
> +	irq_hw_number_t hwirq;
> +	unsigned long ipis;
> +	unsigned int en;
> +
> +	/*
> +	 * Reading enable mask does not need to be ordered as long as
> +	 * this function called from interrupt handler because only
> +	 * the CPU itself can change it's own enable mask.
> +	 */
> +	en = atomic_read(&icpu->enable);
> +
> +	/*
> +	 * Clear the IPIs we are about to handle. This pairs with the
> +	 * atomic_fetch_or_release() in ipi_mux_send_mask().
> +	 */
> +	ipis = atomic_fetch_andnot(en, &icpu->bits) & en;
> +
> +	for_each_set_bit(hwirq, &ipis, BITS_PER_LONG)

BITS_PER_LONG...

> +		generic_handle_domain_irq(ipi_mux_domain, hwirq);
> +}
> +
> +/**
> + * ipi_mux_create - Create virtual IPIs multiplexed on top of a single
> + * parent IPI.
> + * @nr_ipi:		number of virtual IPIs to create. This should
> + *			be <= BITS_PER_TYPE(int)
> + * @mux_send:		callback to trigger parent IPI
> + *
> + * Returns first virq of the newly created virtual IPIs upon success
> + * or <=0 upon failure
> + */
> +int ipi_mux_create(unsigned int nr_ipi,
> +		   void (*mux_send)(const struct cpumask *))
> +{
> +	struct fwnode_handle *fwnode;
> +	struct irq_domain *domain;
> +	int rc;
> +
> +	if (ipi_mux_domain)
> +		return -EEXIST;
> +
> +	if (BITS_PER_TYPE(int) < nr_ipi || !mux_send)

... vs BITS_PER_TYPE(int) ...

	M.

-- 
Without deviation from the norm, progress is not possible.