Date: Mon, 29 May 2023 09:48:48 +0100
Message-ID: <86cz2jcykv.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: "Liao, Chang" <liaochang1@huawei.com>
Cc: Shanker Donthineni <sdonthineni@nvidia.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Michael Walle <michael@walle.cc>,
	linux-kernel@vger.kernel.org,
	Vikram Sethi <vsethi@nvidia.com>,
	Jason Sequeira <jsequeira@nvidia.com>
Subject: Re: [PATCH v5 1/3] genirq: Use hlist for managing resend handlers
In-Reply-To: <6dc6642a-1e7c-f111-1fa2-be54826ecef6@huawei.com>
References: <20230519134902.1495562-1-sdonthineni@nvidia.com>
	<20230519134902.1495562-2-sdonthineni@nvidia.com>
	<6dc6642a-1e7c-f111-1fa2-be54826ecef6@huawei.com>
On Mon, 29 May 2023 08:57:07 +0100,
"Liao, Chang" wrote:
>
> Hi, Shanker
>
> On 2023/5/19 21:49, Shanker Donthineni wrote:
> > The current implementation utilizes a bitmap for managing IRQ resend
> > handlers, which is allocated based on the SPARSE_IRQ/NR_IRQS macros.
> > However, this method may not efficiently utilize memory during runtime,
> > particularly when IRQ_BITMAP_BITS is large.
> >
> > Address this issue by using an hlist to manage IRQ resend handlers
> > instead of relying on a static bitmap memory allocation. Additionally,
> > a new function, clear_irq_resend(), is introduced and called from
> > irq_shutdown to ensure a graceful teardown of the interrupt.
> >
> > Signed-off-by: Shanker Donthineni
> > ---
> >  include/linux/irqdesc.h |  3 +++
> >  kernel/irq/chip.c       |  1 +
> >  kernel/irq/internals.h  |  2 ++
> >  kernel/irq/irqdesc.c    |  2 ++
> >  kernel/irq/resend.c     | 47 ++++++++++++++++++++++++++---------------
> >  5 files changed, 38 insertions(+), 17 deletions(-)
> >
> > diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
> > index 844a8e30e6de..d9451d456a73 100644
> > --- a/include/linux/irqdesc.h
> > +++ b/include/linux/irqdesc.h
> > @@ -102,6 +102,9 @@ struct irq_desc {
> >  	int			parent_irq;
> >  	struct module		*owner;
> >  	const char		*name;
> > +#ifdef CONFIG_HARDIRQS_SW_RESEND
> > +	struct hlist_node	resend_node;
> > +#endif
> >  } ____cacheline_internodealigned_in_smp;
>
> Although there is no documented rule that limits changes to the KABI
> struct irq_desc, it is still better to keep the irq_desc definition
> stable.

On what grounds? There is no such thing as a stable in-kernel ABI,
especially for things as internal as irq_desc.
> >
> >  #ifdef CONFIG_SPARSE_IRQ
> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> > index 49e7bc871fec..2eac5532c3c8 100644
> > --- a/kernel/irq/chip.c
> > +++ b/kernel/irq/chip.c
> > @@ -306,6 +306,7 @@ static void __irq_disable(struct irq_desc *desc, bool mask);
> >  void irq_shutdown(struct irq_desc *desc)
> >  {
> >  	if (irqd_is_started(&desc->irq_data)) {
> > +		clear_irq_resend(desc);
> >  		desc->depth = 1;
> >  		if (desc->irq_data.chip->irq_shutdown) {
> >  			desc->irq_data.chip->irq_shutdown(&desc->irq_data);
> > diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
> > index 5fdc0b557579..51fc8c497c22 100644
> > --- a/kernel/irq/internals.h
> > +++ b/kernel/irq/internals.h
> > @@ -113,6 +113,8 @@ irqreturn_t handle_irq_event(struct irq_desc *desc);
> >
> >  /* Resending of interrupts :*/
> >  int check_irq_resend(struct irq_desc *desc, bool inject);
> > +void clear_irq_resend(struct irq_desc *desc);
> > +void irq_resend_init(struct irq_desc *desc);
> >  bool irq_wait_for_poll(struct irq_desc *desc);
> >  void __irq_wake_thread(struct irq_desc *desc, struct irqaction *action);
> >
> > diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> > index 240e145e969f..b401b89b226a 100644
> > --- a/kernel/irq/irqdesc.c
> > +++ b/kernel/irq/irqdesc.c
> > @@ -415,6 +415,7 @@ static struct irq_desc *alloc_desc(int irq, int node, unsigned int flags,
> >  	desc_set_defaults(irq, desc, node, affinity, owner);
> >  	irqd_set(&desc->irq_data, flags);
> >  	kobject_init(&desc->kobj, &irq_kobj_type);
> > +	irq_resend_init(desc);
> >
> >  	return desc;
> >
> > @@ -581,6 +582,7 @@ int __init early_irq_init(void)
> >  		mutex_init(&desc[i].request_mutex);
> >  		init_waitqueue_head(&desc[i].wait_for_threads);
> >  		desc_set_defaults(i, &desc[i], node, NULL, NULL);
> > +		irq_resend_init(desc);
> >  	}
> >  	return arch_early_irq_init();
> >  }
> > diff --git a/kernel/irq/resend.c b/kernel/irq/resend.c
> > index 0c46e9fe3a89..edec335c0a7a 100644
> > --- a/kernel/irq/resend.c
> > +++ b/kernel/irq/resend.c
> > @@ -21,8 +21,9 @@
> >
> >  #ifdef CONFIG_HARDIRQS_SW_RESEND
> >
> > -/* Bitmap to handle software resend of interrupts: */
> > -static DECLARE_BITMAP(irqs_resend, IRQ_BITMAP_BITS);
> > +/* hlist_head to handle software resend of interrupts: */
> > +static HLIST_HEAD(irq_resend_list);
> > +static DEFINE_RAW_SPINLOCK(irq_resend_lock);
>
> What is the benefit of using hlist here? If you want to enjoy the
> low latency of querying elements by key, you must define an hlist table
> with a reasonable number of buckets. Otherwise, I don't think the time
> complexity of hlist is better than a regular doubly linked list, right?

You do realise that the list is processed in order, one element after
the other, without ever querying any arbitrary element? Have you read
the code?

> >
> >  /*
> >   * Run software resends of IRQ's
> > @@ -30,18 +31,17 @@ static DECLARE_BITMAP(irqs_resend, IRQ_BITMAP_BITS);
> >  static void resend_irqs(struct tasklet_struct *unused)
> >  {
> >  	struct irq_desc *desc;
> > -	int irq;
> > -
> > -	while (!bitmap_empty(irqs_resend, nr_irqs)) {
> > -		irq = find_first_bit(irqs_resend, nr_irqs);
> > -		clear_bit(irq, irqs_resend);
> > -		desc = irq_to_desc(irq);
> > -		if (!desc)
> > -			continue;
> > -		local_irq_disable();
> > +
> > +	raw_spin_lock_irq(&irq_resend_lock);
> > +	while (!hlist_empty(&irq_resend_list)) {
> > +		desc = hlist_entry(irq_resend_list.first, struct irq_desc,
> > +				   resend_node);
> > +		hlist_del_init(&desc->resend_node);
> > +		raw_spin_unlock(&irq_resend_lock);
> >  		desc->handle_irq(desc);
> > -		local_irq_enable();
> > +		raw_spin_lock(&irq_resend_lock);
> >  	}
> > +	raw_spin_unlock_irq(&irq_resend_lock);
> >  }
> >
> >  /* Tasklet to handle resend: */
> > @@ -49,8 +49,6 @@ static DECLARE_TASKLET(resend_tasklet, resend_irqs);
> >
> >  static int irq_sw_resend(struct irq_desc *desc)
> >  {
> > -	unsigned int irq = irq_desc_get_irq(desc);
> > -
> >  	/*
> >  	 * Validate whether this interrupt can be safely injected from
> >  	 * non interrupt context
> > @@ -70,16 +68,31 @@ static int irq_sw_resend(struct irq_desc *desc)
> >  	 */
> >  	if (!desc->parent_irq)
> >  		return -EINVAL;
> > -	irq = desc->parent_irq;
>
> Why delete this code?

OK, now I know you haven't read this code at all :-(.

>
> >  	}
> >
> > -	/* Set it pending and activate the softirq: */
> > -	set_bit(irq, irqs_resend);
> > +	/* Add to resend_list and activate the softirq: */
> > +	raw_spin_lock(&irq_resend_lock);
> > +	hlist_add_head(&desc->resend_node, &irq_resend_list);
> > +	raw_spin_unlock(&irq_resend_lock);
>
> Do you consider a situation where irq_sw_resend() is running on two
> CPUs concurrently? If so, the same desc could be added into
> irq_resend_list twice by mistake.

Have you looked at the calling site (stress on singular), the locking
requirements, and the role IRQS_REPLAY plays when it comes to queuing
an interrupt for resend?

	M.

-- 
Without deviation from the norm, progress is not possible.