From: Michael Kelley
To: Michael Kelley, Souradeep Chakrabarti, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, longli@microsoft.com, leon@kernel.org,
	cai.huoqing@linux.dev, ssengar@linux.microsoft.com,
	vkuznets@redhat.com, tglx@linutronix.de, linux-hyperv@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org
Cc: schakrabarti@microsoft.com, paulros@microsoft.com
Subject: RE: [PATCH net-next] net: mana: Assigning IRQ affinity on HT cores
Date: Thu, 16 Nov 2023 06:17:27 +0000
References: <1700056125-29358-1-git-send-email-schakrabarti@linux.microsoft.com>

From: Michael Kelley
Sent: Wednesday, November 15, 2023 9:26 PM

> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 6367de0c2c2e..839be819d46e 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -1243,13 +1243,115 @@ void mana_gd_free_res_map(struct
> > gdma_resource *r)
> >  	r->size = 0;
> >  }
> >
> > +static void cpu_mask_set(cpumask_var_t *filter_mask, cpumask_var_t
> > **filter_mask_list)
> > +{
> > +	unsigned int core_count = 0, cpu;
> > +	cpumask_var_t *filter_mask_list_tmp;
> > +
> > +	BUG_ON(!filter_mask || !filter_mask_list);
>
> This check seems superfluous.  The function is invoked from only
> one call site in the code below, and that call site can easily
> ensure that it doesn't pass a NULL value for either parameter.
>
> > +	filter_mask_list_tmp = *filter_mask_list;
> > +	cpumask_copy(*filter_mask, cpu_online_mask);
> > +	/* for each core create a cpumask lookup table,
> > +	 * which stores all the corresponding siblings
> > +	 */
> > +	for_each_cpu(cpu, *filter_mask) {
> > +		BUG_ON(!alloc_cpumask_var(&filter_mask_list_tmp[core_count], GFP_KERNEL));
>
> Don't panic if a memory allocation fails.  Back out, clean up,
> and return an error instead.
>
> > +		cpumask_or(filter_mask_list_tmp[core_count], filter_mask_list_tmp[core_count],
> > +			   topology_sibling_cpumask(cpu));
>
> alloc_cpumask_var() does not zero out the returned cpumask, so the
> cpumask_or() is operating on uninitialized garbage.  Use
> zalloc_cpumask_var() to get a cpumask initialized to zero.
>
> But why is the cpumask_or() being done at all?  Couldn't this
> just be a cpumask_copy()?  In that case, alloc_cpumask_var() is OK.
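
Concretely, the loop body would then look roughly like one of these two
alternatives (untested sketch; the "return -ENOMEM" assumes cpu_mask_set()
is changed to return an error, and any entries already allocated would
still need to be freed on that path):

	/* Option A: zeroed allocation, then OR in the sibling mask */
	if (!zalloc_cpumask_var(&filter_mask_list_tmp[core_count], GFP_KERNEL))
		return -ENOMEM;		/* unwind instead of BUG_ON() */
	cpumask_or(filter_mask_list_tmp[core_count],
		   filter_mask_list_tmp[core_count],
		   topology_sibling_cpumask(cpu));

	/* Option B: plain allocation is fine when the mask is simply copied */
	if (!alloc_cpumask_var(&filter_mask_list_tmp[core_count], GFP_KERNEL))
		return -ENOMEM;
	cpumask_copy(filter_mask_list_tmp[core_count],
		     topology_sibling_cpumask(cpu));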
>
> > +		cpumask_andnot(*filter_mask, *filter_mask, topology_sibling_cpumask(cpu));
> > +		core_count++;
> > +	}
> > +}
> > +
> > +static int irq_setup(int *irqs, int nvec)
> > +{
> > +	cpumask_var_t filter_mask;
> > +	cpumask_var_t *filter_mask_list;
> > +	unsigned int cpu_first, cpu, irq_start, cores = 0;
> > +	int i, core_count = 0, numa_node, cpu_count = 0, err = 0;
> > +
> > +	BUG_ON(!alloc_cpumask_var(&filter_mask, GFP_KERNEL));
>
> Again, don't panic if a memory allocation fails.  Back out and
> return an error instead.
>
> > +	cpus_read_lock();
> > +	cpumask_copy(filter_mask, cpu_online_mask);
>
> For readability, I'd suggest adding a blank line before any
> of the multi-line comments below.
>
> > +	/* count the number of cores
> > +	 */
> > +	for_each_cpu(cpu, filter_mask) {
> > +		cpumask_andnot(filter_mask, filter_mask, topology_sibling_cpumask(cpu));
> > +		cores++;
> > +	}
> > +	filter_mask_list = kcalloc(cores, sizeof(cpumask_var_t), GFP_KERNEL);
> > +	if (!filter_mask_list) {
> > +		err = -ENOMEM;
> > +		goto free_irq;

One additional macro-level comment.  Creating and freeing the
filter_mask_list is a real pain.  But it is needed to identify which
CPUs in a core are available to have an IRQ assigned to them.  So let
me suggest a modified approach to meeting that need.

1) Count the number of cores just as before.

2) Then, instead of allocating filter_mask_list, allocate an array of
integers with one array entry per core.  Call the array core_id_list.
Somewhat like the code in cpu_mask_set(), populate the array with the
CPU number of the first CPU in each core.  The populating only needs
to be done once, so the code can be inline in irq_setup().  It doesn't
need to be in a helper function like cpu_mask_set().

3) Allocate a single cpumask_var_t local variable that is initialized
to a copy of cpu_online_mask.  Call it avail_cpus.  This local variable
indicates which CPUs (across all cores) are available to have an IRQ
assigned.

4) At the beginning of the "for" loop over nvec, the current code has:

	cpu_first = cpumask_first(filter_mask_list[core_count]);

Instead, do:

	cpu_first = cpumask_first_and(avail_cpus,
			topology_sibling_cpumask(core_id_list[core_count]));

The above code effectively creates on the fly the cpumask previously
stored in filter_mask_list, and finds the first CPU as before.

5) When a CPU is assigned an IRQ, clear that CPU in avail_cpus, not
in the filter_mask_list entry.

6) When the NUMA node wraps around and the current code calls
cpu_mask_set(), instead reinitialize avail_cpus to cpu_online_mask,
as in #3 above.

In summary, with this approach filter_mask_list isn't needed at all.
The messiness of allocating and freeing an array of cpumasks goes
away.  I haven't coded it, but I think the result will be simpler and
less error prone.
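
Roughly, the result might look something like the following untested
sketch.  It keeps the overall structure of the patch and, for simplicity,
ignores the extra hardware-channel interrupt and the NUMA-node issues
discussed further down, so those comments still apply to it:

static int irq_setup(int *irqs, int nvec)
{
	cpumask_var_t avail_cpus;
	unsigned int cpu, cpu_first, cores = 0;
	int i, core_count = 0, numa_node = 0, cpu_count = 0, err = 0;
	int *core_id_list = NULL;

	if (!alloc_cpumask_var(&avail_cpus, GFP_KERNEL))
		return -ENOMEM;

	cpus_read_lock();

	/* Pass 1: count the cores among the online CPUs */
	cpumask_copy(avail_cpus, cpu_online_mask);
	for_each_cpu(cpu, avail_cpus) {
		cpumask_andnot(avail_cpus, avail_cpus,
			       topology_sibling_cpumask(cpu));
		cores++;
	}

	core_id_list = kcalloc(cores, sizeof(int), GFP_KERNEL);
	if (!core_id_list) {
		err = -ENOMEM;
		goto out;
	}

	/* Pass 2: record the first CPU of each core; done only once */
	cpumask_copy(avail_cpus, cpu_online_mask);
	for_each_cpu(cpu, avail_cpus) {
		core_id_list[core_count++] = cpu;
		cpumask_andnot(avail_cpus, avail_cpus,
			       topology_sibling_cpumask(cpu));
	}

	/* From here on, avail_cpus tracks which CPUs are still
	 * available to have an IRQ assigned.
	 */
	cpumask_copy(avail_cpus, cpu_online_mask);

	core_count = 0;
	for (i = 0; i < nvec; ) {
		/* Recreate on the fly the sibling mask that the patch
		 * stores in filter_mask_list[core_count].
		 */
		cpu_first = cpumask_first_and(avail_cpus,
				topology_sibling_cpumask(core_id_list[core_count]));
		if (cpu_first < nr_cpu_ids && cpu_to_node(cpu_first) == numa_node) {
			irq_set_affinity_and_hint(irqs[i], cpumask_of(cpu_first));
			cpumask_clear_cpu(cpu_first, avail_cpus);
			cpu_count++;
			i++;
			if (cpu_count == nr_cpus_node(numa_node)) {
				if (++numa_node == num_online_nodes()) {
					/* Wrapped around all nodes: make
					 * every CPU available again.
					 */
					cpumask_copy(avail_cpus, cpu_online_mask);
					numa_node = 0;
				}
				cpu_count = 0;
				core_count = 0;
				continue;
			}
		}
		if (++core_count == cores)
			core_count = 0;
	}

out:
	cpus_read_unlock();
	free_cpumask_var(avail_cpus);
	kfree(core_id_list);
	return err;
}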

Michael

> > +	}
> > +	/* if number of cpus are equal to max_queues per port, then
> > +	 * one extra interrupt for the hardware channel communication.
> > +	 */
>
> Note that higher level code may set nvec based on the # of
> online CPUs, but until the cpus_read_lock is held, the # of online
> CPUs can change.  So nvec might have been determined when the
> # of CPUs was 8, but 2 CPUs could go offline before the cpus_read_lock
> is obtained.  So nvec could be bigger than just 1 more than
> num_online_cpus().  I'm not sure how to handle the check below to
> special-case the hardware communication channel.  But realize
> that the main "for" loop below could end up assigning multiple
> IRQs to the same CPU if nvec > num_online_cpus().
>
> > +	if (nvec - 1 == num_online_cpus()) {
> > +		irq_start = 1;
> > +		cpu_first = cpumask_first(cpu_online_mask);
> > +		irq_set_affinity_and_hint(irqs[0], cpumask_of(cpu_first));
> > +	} else {
> > +		irq_start = 0;
> > +	}
> > +	/* reset the core_count and num_node to 0.
> > +	 */
> > +	core_count = 0;
> > +	numa_node = 0;
> > +	cpu_mask_set(&filter_mask, &filter_mask_list);
>
> FWIW, passing filter_mask as the first argument above was
> confusing to me until I realized that the value of filter_mask
> doesn't matter -- it's passed only so its allocated memory can
> be reused.  Maybe a comment would help.  Ditto for the
> invocation of cpu_mask_set() further down.
>
> > +	/* for each interrupt find the cpu of a particular
> > +	 * sibling set and if it belongs to the specific numa
> > +	 * then assign irq to it and clear the cpu bit from
> > +	 * the corresponding sibling list from filter_mask_list.
> > +	 * Increase the cpu_count for that node.
> > +	 * Once all cpus for a numa node is assigned, then
> > +	 * move to different numa node and continue the same.
> > +	 */
> > +	for (i = irq_start; i < nvec; ) {
> > +		cpu_first = cpumask_first(filter_mask_list[core_count]);
> > +		if (cpu_first < nr_cpu_ids && cpu_to_node(cpu_first) == numa_node) {
>
> Note that it's possible to have a NUMA node with zero online CPUs.
> It could be a NUMA node that was originally a memory-only NUMA
> node and never had any CPUs, or the NUMA node could have had
> CPUs, but they were all later taken offline.  Such a NUMA node is
> still considered to be online because it has memory, but it has
> no online CPUs.
>
> If this code ever sets "numa_node" to such a NUMA node,
> the "for" loop will become an infinite loop because the "if"
> statement above will never find a CPU that is assigned to
> "numa_node".
>
> > +			irq_set_affinity_and_hint(irqs[i], cpumask_of(cpu_first));
> > +			cpumask_clear_cpu(cpu_first, filter_mask_list[core_count]);
> > +			cpu_count = cpu_count + 1;
> > +			i = i + 1;
> > +			/* checking if all the cpus are used from the
> > +			 * particular node.
> > +			 */
> > +			if (cpu_count == nr_cpus_node(numa_node)) {
> > +				numa_node = numa_node + 1;
> > +				if (numa_node == num_online_nodes()) {
> > +					cpu_mask_set(&filter_mask, &filter_mask_list);
>
> The second and subsequent calls to cpu_mask_set() will
> leak any memory previously allocated by alloc_cpumask_var()
> in cpu_mask_set().
>
> I agree with Haiyang's comment about starting with a NUMA
> node other than NUMA node zero.  But if you do so, note that
> testing for wrap-around at num_online_nodes() won't be
> equivalent to testing for getting back to the starting NUMA node.
> You really want to run cpu_mask_set() again only when you get
> back to the starting NUMA node.
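
For illustration (sketch only -- "start_node" is a hypothetical variable
holding whichever NUMA node the loop starts from, and "avail_cpus" is the
mask suggested in my earlier comment), the node-advance step might then
look something like:

	/* Move to the next online node that has at least one online CPU,
	 * wrapping around; only when we get back to the starting node do
	 * we make all CPUs available again.
	 */
	do {
		numa_node = next_node_in(numa_node, node_online_map);
	} while (numa_node != start_node &&
		 !cpumask_intersects(cpu_online_mask, cpumask_of_node(numa_node)));

	if (numa_node == start_node)
		cpumask_copy(avail_cpus, cpu_online_mask);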
>
> > +					numa_node = 0;
> > +				}
> > +				cpu_count = 0;
> > +				core_count = 0;
> > +				continue;
> > +			}
> > +		}
> > +		if ((core_count + 1) % cores == 0)
> > +			core_count = 0;
> > +		else
> > +			core_count++;
>
> The if/else could be more compactly written as:
>
>	if (++core_count == cores)
>		core_count = 0;
>
> > +	}
> > +free_irq:
> > +	cpus_read_unlock();
> > +	if (filter_mask)
>
> This check for non-NULL filter_mask is incorrect and might
> not even compile if CONFIG_CPUMASK_OFFSTACK=n.  For testing
> purposes, you should make sure to build and run with each
> option: with CONFIG_CPUMASK_OFFSTACK=y and
> also CONFIG_CPUMASK_OFFSTACK=n.
>
> It's safe to just call free_cpumask_var() without the check.
>
> > +		free_cpumask_var(filter_mask);
> > +	if (filter_mask_list) {
> > +		for (core_count = 0; core_count < cores; core_count++)
> > +			free_cpumask_var(filter_mask_list[core_count]);
> > +		kfree(filter_mask_list);
> > +	}
> > +	return err;
> > +}
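
In other words, the tail of the function could simply be something like
this (sketch, keeping the patch's variable names; filter_mask is known to
have been allocated by the time free_irq is reached):

free_irq:
	cpus_read_unlock();

	/* No NULL-style check: filter_mask was allocated at function entry,
	 * and with CONFIG_CPUMASK_OFFSTACK=n cpumask_var_t is an array type,
	 * so "if (filter_mask)" doesn't build cleanly anyway.
	 */
	free_cpumask_var(filter_mask);

	if (filter_mask_list) {
		for (core_count = 0; core_count < cores; core_count++)
			free_cpumask_var(filter_mask_list[core_count]);
		kfree(filter_mask_list);
	}
	return err;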