Date: Sat, 26 Nov 2022 12:42:01 +0000
Message-ID: <87y1rxubty.wl-maz@kernel.org>
From: Marc Zyngier
To: Anup Patel
Cc: Palmer Dabbelt, Paul Walmsley, Thomas Gleixner, Daniel Lezcano,
    Atish Patra, Alistair Francis, Anup Patel,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v11 3/7] genirq: Add mechanism to multiplex a single HW IPI
In-Reply-To: <20221114093904.1669461-4-apatel@ventanamicro.com>
References: <20221114093904.1669461-1-apatel@ventanamicro.com>
    <20221114093904.1669461-4-apatel@ventanamicro.com>

On Mon, 14 Nov 2022 09:39:00 +0000,
Anup Patel wrote:
>
> All RISC-V platforms have a single HW IPI provided by the INTC local
> interrupt controller. The HW method to trigger INTC IPI can be through
> external irqchip (e.g. RISC-V AIA), through platform specific device
> (e.g. SiFive CLINT timer), or through firmware (e.g. SBI IPI call).
>
> To support multiple IPIs on RISC-V, we add a generic IPI multiplexing
> mechanism which help us create multiple virtual IPIs using a single
> HW IPI. This generic IPI multiplexing is inspired from the Apple AIC
> irqchip driver and it is shared by various RISC-V irqchip drivers.
>
> Signed-off-by: Anup Patel
> ---
>  include/linux/irq.h  |  18 +++
>  kernel/irq/Kconfig   |   5 +
>  kernel/irq/Makefile  |   1 +
>  kernel/irq/ipi-mux.c | 268 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 292 insertions(+)
>  create mode 100644 kernel/irq/ipi-mux.c
>
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index c3eb89606c2b..5ab702cb0a5b 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -1266,6 +1266,24 @@ int __ipi_send_mask(struct irq_desc *desc, const struct cpumask *dest);
>  int ipi_send_single(unsigned int virq, unsigned int cpu);
>  int ipi_send_mask(unsigned int virq, const struct cpumask *dest);
>
> +/**
> + * struct ipi_mux_ops - IPI multiplex operations
> + *
> + * @ipi_mux_pre_handle:	Optional function called before handling parent IPI
> + * @ipi_mux_post_handle:Optional function called after handling parent IPI
> + * @ipi_mux_send:	Trigger parent IPI on target CPUs
> + */
> +struct ipi_mux_ops {
> +	void (*ipi_mux_pre_handle)(unsigned int parent_virq, void *data);
> +	void (*ipi_mux_post_handle)(unsigned int parent_virq, void *data);

I still haven't seen any decent explanation for this other than "we
need it". What is it that cannot be achieved via the irq_ack() and
irq_eoi() callbacks?
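
As a rough illustration of the question above, here is a minimal
userspace sketch (plain C, not kernel code) in which a parent chip's
ack and eoi hooks bracket the demultiplexing loop, which is the same
position the proposed pre_handle and post_handle callbacks occupy. The
names mock_chip, handle_parent, demux and record are hypothetical and
do not exist in the kernel. The real chained_irq_enter() and
chained_irq_exit() helpers pick between mask, ack, eoi and unmask
depending on the chip, which this sketch does not model.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parent chip; NOT the kernel's struct irq_chip. */
struct mock_chip {
	void (*ack)(void);	/* plays the role of irq_ack() */
	void (*eoi)(void);	/* plays the role of irq_eoi() */
};

/* Ordered log of what ran, so the bracketing can be inspected. */
static char trace[64];

static void record(const char *event)
{
	/* Append without overflowing the fixed-size trace buffer. */
	strncat(trace, event, sizeof(trace) - strlen(trace) - 1);
}

static void parent_ack(void) { record("ack "); }
static void parent_eoi(void) { record("eoi "); }

/* Dispatch every pending virtual IPI, lowest bit first. */
static void demux(unsigned int pending)
{
	for (unsigned int bit = 0; bit < 32; bit++) {
		if (pending & (1u << bit)) {
			char buf[16];

			snprintf(buf, sizeof(buf), "ipi%u ", bit);
			record(buf);
		}
	}
}

/* The chip hooks run exactly where pre_handle/post_handle would. */
static void handle_parent(const struct mock_chip *chip, unsigned int pending)
{
	assert(chip != NULL);

	if (chip->ack)
		chip->ack();
	demux(pending);
	if (chip->eoi)
		chip->eoi();
}
```

With both hooks set, calling handle_parent() with pending = 0x5
records ack, ipi0, ipi2, eoi in that order.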

> +	void (*ipi_mux_send)(unsigned int parent_virq, void *data,
> +			     const struct cpumask *mask);

In what context would the 'parent_virq' be useful? You are *sending*
an IPI, not receiving it, and I expect the mechanism by which you send
such an IPI to be independent of the Linux view of the irq.

Also, please swap data and mask in the function signature.

> +};
> +
> +void ipi_mux_process(void);
> +int ipi_mux_create(unsigned int parent_virq, unsigned int nr_ipi,
> +		   const struct ipi_mux_ops *ops, void *data);
> +
>  #ifdef CONFIG_GENERIC_IRQ_MULTI_HANDLER
>  /*
>   * Registers a generic IRQ handling function as the top-level IRQ handler in
> diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
> index db3d174c53d4..df17dbc54b02 100644
> --- a/kernel/irq/Kconfig
> +++ b/kernel/irq/Kconfig
> @@ -86,6 +86,11 @@ config GENERIC_IRQ_IPI
>  	depends on SMP
>  	select IRQ_DOMAIN_HIERARCHY
>
> +# Generic IRQ IPI Mux support
> +config GENERIC_IRQ_IPI_MUX
> +	bool
> +	depends on SMP
> +
>  # Generic MSI interrupt support
>  config GENERIC_MSI_IRQ
>  	bool
> diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile
> index b4f53717d143..f19d3080bf11 100644
> --- a/kernel/irq/Makefile
> +++ b/kernel/irq/Makefile
> @@ -15,6 +15,7 @@ obj-$(CONFIG_GENERIC_IRQ_MIGRATION) += cpuhotplug.o
>  obj-$(CONFIG_PM_SLEEP) += pm.o
>  obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o
>  obj-$(CONFIG_GENERIC_IRQ_IPI) += ipi.o
> +obj-$(CONFIG_GENERIC_IRQ_IPI_MUX) += ipi-mux.o
>  obj-$(CONFIG_SMP) += affinity.o
>  obj-$(CONFIG_GENERIC_IRQ_DEBUGFS) += debugfs.o
>  obj-$(CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR) += matrix.o
> diff --git a/kernel/irq/ipi-mux.c b/kernel/irq/ipi-mux.c
> new file mode 100644
> index 000000000000..259e00366dd7
> --- /dev/null
> +++ b/kernel/irq/ipi-mux.c
> @@ -0,0 +1,268 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Multiplex several virtual IPIs over a single HW IPI.
> + *
> + * Copyright The Asahi Linux Contributors
> + * Copyright (c) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#define pr_fmt(fmt) "ipi-mux: " fmt
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +struct ipi_mux_cpu {
> +	atomic_t enable;
> +	atomic_t bits;
> +	struct cpumask send_mask;
> +};
> +
> +struct ipi_mux_control {
> +	void *data;
> +	unsigned int nr;

Honestly, I think we can get rid of this. The number of IPIs Linux
uses is pretty small, and assuming a huge value (like 32) would be
enough. It would save looking up this value on each IPI handling.

> +	unsigned int parent_virq;
> +	struct irq_domain *domain;
> +	const struct ipi_mux_ops *ops;
> +	struct ipi_mux_cpu __percpu *cpu;
> +};
> +
> +static struct ipi_mux_control *imux;
> +static DEFINE_STATIC_KEY_FALSE(imux_pre_handle);
> +static DEFINE_STATIC_KEY_FALSE(imux_post_handle);
> +
> +static void ipi_mux_mask(struct irq_data *d)
> +{
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(imux->cpu);
> +
> +	atomic_andnot(BIT(irqd_to_hwirq(d)), &icpu->enable);
> +}
> +
> +static void ipi_mux_unmask(struct irq_data *d)
> +{
> +	u32 ibit = BIT(irqd_to_hwirq(d));
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(imux->cpu);
> +
> +	atomic_or(ibit, &icpu->enable);
> +
> +	/*
> +	 * The atomic_or() above must complete before the atomic_read()
> +	 * below to avoid racing ipi_mux_send_mask().
> +	 */
> +	smp_mb__after_atomic();
> +
> +	/* If a pending IPI was unmasked, raise a parent IPI immediately. */
> +	if (atomic_read(&icpu->bits) & ibit)
> +		imux->ops->ipi_mux_send(imux->parent_virq, imux->data,
> +					cpumask_of(smp_processor_id()));
> +}
> +
> +static void ipi_mux_send_mask(struct irq_data *d, const struct cpumask *mask)
> +{
> +	u32 ibit = BIT(irqd_to_hwirq(d));
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(imux->cpu);
> +	struct cpumask *send_mask = &icpu->send_mask;
> +	unsigned long flags;
> +	int cpu;
> +
> +	/*
> +	 * We use send_mask as a per-CPU variable so disable local
> +	 * interrupts to avoid being preempted.
> +	 */
> +	local_irq_save(flags);
> +
> +	cpumask_clear(send_mask);
> +
> +	for_each_cpu(cpu, mask) {
> +		icpu = per_cpu_ptr(imux->cpu, cpu);
> +		atomic_or(ibit, &icpu->bits);
> +
> +		/*
> +		 * The atomic_or() above must complete before
> +		 * the atomic_read() below to avoid racing with
> +		 * ipi_mux_unmask().
> +		 */
> +		smp_mb__after_atomic();
> +
> +		if (atomic_read(&icpu->enable) & ibit)
> +			cpumask_set_cpu(cpu, send_mask);
> +	}
> +
> +	/* Trigger the parent IPI */
> +	imux->ops->ipi_mux_send(imux->parent_virq, imux->data, send_mask);
> +
> +	local_irq_restore(flags);
> +}
> +
> +static const struct irq_chip ipi_mux_chip = {
> +	.name = "IPI Mux",
> +	.irq_mask = ipi_mux_mask,
> +	.irq_unmask = ipi_mux_unmask,
> +	.ipi_send_mask = ipi_mux_send_mask,
> +};

I really think this could either be supplied by the irqchip, or
somehow patched to avoid the pointless imux->ops->ipi_mux_send
indirection. Pointer chasing hurts.

> +
> +static int ipi_mux_domain_alloc(struct irq_domain *d, unsigned int virq,
> +				unsigned int nr_irqs, void *arg)
> +{
> +	int i;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq_set_percpu_devid(virq + i);
> +		irq_domain_set_info(d, virq + i, i,
> +				    &ipi_mux_chip, d->host_data,
> +				    handle_percpu_devid_irq, NULL, NULL);
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct irq_domain_ops ipi_mux_domain_ops = {
> +	.alloc = ipi_mux_domain_alloc,
> +	.free = irq_domain_free_irqs_top,
> +};
> +
> +/**
> + * ipi_mux_process - Process multiplexed virtual IPIs
> + */
> +void ipi_mux_process(void)
> +{
> +	struct ipi_mux_cpu *icpu = this_cpu_ptr(imux->cpu);
> +	irq_hw_number_t hwirq;
> +	unsigned long ipis;
> +	int en;

Please use an unsigned type here (see the rationale in atomic_t.txt).
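
One concrete hazard of keeping a bitmask in a signed int can be shown
with a small userspace sketch. It assumes the mask is later widened to
a 64-bit unsigned value, as happens when an int result is assigned to
an unsigned long on a 64-bit kernel. Whether this is the exact
rationale the comment above refers to is an assumption, and the helper
names widen_signed, widen_unsigned and count_set_bits are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The demonstration only makes sense if the destination is wider. */
static_assert(sizeof(uint64_t) > sizeof(int32_t), "need a wider type");

/* Signed source: a negative value is converted modulo 2^64, so bit 31
 * is replicated into bits 32..63. */
static uint64_t widen_signed(int32_t mask)
{
	return (uint64_t)mask;
}

/* Unsigned source: the upper bits are zero-filled. */
static uint64_t widen_unsigned(uint32_t mask)
{
	return (uint64_t)mask;
}

/* Count how many bits a bit-walking loop would visit. */
static unsigned int count_set_bits(uint64_t value)
{
	unsigned int count = 0;

	while (value) {
		value &= value - 1;	/* clear the lowest set bit */
		count++;
	}
	return count;
}
```

With only bit 31 set, the signed path yields 33 set bits after
widening while the unsigned path yields 1. In the quoted code the
for_each_set_bit() walk is bounded by imux->nr, which limits the
practical impact, but an unsigned type avoids the question entirely.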

> +
> +	if (static_branch_unlikely(&imux_pre_handle))
> +		imux->ops->ipi_mux_pre_handle(imux->parent_virq, imux->data);
> +
> +	/*
> +	 * Reading enable mask does not need to be ordered as long as
> +	 * this function called from interrupt handler because only
> +	 * the CPU itself can change it's own enable mask.
> +	 */
> +	en = atomic_read(&icpu->enable);
> +
> +	/*
> +	 * Clear the IPIs we are about to handle. This pairs with the
> +	 * atomic_fetch_or_release() in ipi_mux_send_mask().
> +	 */
> +	ipis = atomic_fetch_andnot(en, &icpu->bits) & en;
> +
> +	for_each_set_bit(hwirq, &ipis, imux->nr)
> +		generic_handle_domain_irq(imux->domain, hwirq);
> +
> +	if (static_branch_unlikely(&imux_post_handle))
> +		imux->ops->ipi_mux_post_handle(imux->parent_virq, imux->data);

Do you see what I meant about the {pre,post}_handle callback, and how
they are *exactly* like irq_{ack,eoi}?

> +}
> +
> +static void ipi_mux_handler(struct irq_desc *desc)
> +{
> +	struct irq_chip *chip = irq_desc_get_chip(desc);
> +
> +	chained_irq_enter(chip, desc);
> +	ipi_mux_process();
> +	chained_irq_exit(chip, desc);
> +}
> +
> +/**
> + * ipi_mux_create - Create virtual IPIs multiplexed on top of a single
> + * parent IPI.
> + * @parent_virq: virq of the parent per-CPU IRQ
> + * @nr_ipi: number of virtual IPIs to create. This should
> + * be <= BITS_PER_TYPE(int)
> + * @ops: multiplexing operations for the parent IPI
> + * @data: opaque data used by the multiplexing operations

What is the use for data? If anything, that data should be passed via
the mux interrupt. But the whole point of this is to make the mux
invisible. So this whole 'data' business is a mystery to me.

> + *
> + * If the parent IPI > 0 then ipi_mux_process() will be automatically
> + * called via chained handler.
> + *
> + * If the parent IPI <= 0 then it is responsibility of irqchip drivers
> + * to explicitly call ipi_mux_process() for processing muxed IPIs.

0 is a much more idiomatic value for "no parent IRQ".
< 0 normally represents an error.

> + *
> + * Returns first virq of the newly created virtual IPIs upon success
> + * or <=0 upon failure
> + */
> +int ipi_mux_create(unsigned int parent_virq, unsigned int nr_ipi,

Tell me how I can express a negative parent_virq here?

> +		   const struct ipi_mux_ops *ops, void *data)
> +{
> +	struct fwnode_handle *fwnode;
> +	struct irq_domain *domain;
> +	int rc;
> +
> +	if (imux)
> +		return -EEXIST;
> +
> +	if (BITS_PER_TYPE(int) < nr_ipi || !ops || !ops->ipi_mux_send)
> +		return -EINVAL;
> +
> +	if (parent_virq &&
> +	    !irqd_is_per_cpu(irq_desc_get_irq_data(irq_to_desc(parent_virq))))
> +		return -EINVAL;

See how buggy this is if I follow your definition of "no parent IRQ"?

	M.

-- 
Without deviation from the norm, progress is not possible.