Message-ID: <2fc99d26-f804-ad34-1fd7-90cfb123b426@gmail.com>
Date: Mon, 18 Jul 2022 22:49:21 +0300
Subject: Re: [PATCH net-next V2 2/2] net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity hints
To: Peter Zijlstra, Tariq Toukan
Cc: "David S. Miller", Saeed Mahameed, Jakub Kicinski, Ingo Molnar, Juri Lelli, Eric Dumazet, Paolo Abeni, netdev@vger.kernel.org, Gal Pressman, Vincent Guittot, linux-kernel@vger.kernel.org
References: <20220718124315.16648-1-tariqt@nvidia.com> <20220718124315.16648-3-tariqt@nvidia.com>
From: Tariq Toukan
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/18/2022 4:50 PM, Peter Zijlstra wrote:
> On Mon, Jul 18, 2022 at 03:43:15PM +0300, Tariq Toukan wrote:
>
>> Reviewed-by: Gal Pressman
>> Acked-by: Saeed Mahameed
>> Signed-off-by: Tariq Toukan
>> ---
>>  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 62 +++++++++++++++++++-
>>  1 file changed, 59 insertions(+), 3 deletions(-)
>>
>> v2:
>> Separated the set_cpu operation into two functions, per Saeed's suggestion.
>> Added Saeed's Acked-by signature.
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
>> index 229728c80233..e72bdaaad84f 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
>> @@ -11,6 +11,9 @@
>>  #ifdef CONFIG_RFS_ACCEL
>>  #include
>>  #endif
>> +#ifdef CONFIG_NUMA
>> +#include
>> +#endif
>>  #include "mlx5_core.h"
>>  #include "lib/eq.h"
>>  #include "fpga/core.h"
>> @@ -806,13 +809,67 @@ static void comp_irqs_release(struct mlx5_core_dev *dev)
>>  	kfree(table->comp_irqs);
>>  }
>>
>> +static void set_cpus_by_local_spread(struct mlx5_core_dev *dev, u16 *cpus,
>> +				     int ncomp_eqs)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < ncomp_eqs; i++)
>> +		cpus[i] = cpumask_local_spread(i, dev->priv.numa_node);
>> +}
>> +
>> +static bool set_cpus_by_numa_distance(struct mlx5_core_dev *dev, u16 *cpus,
>> +				      int ncomp_eqs)
>> +{
>> +#ifdef CONFIG_NUMA
>> +	cpumask_var_t cpumask;
>> +	int first;
>> +	int i;
>> +
>> +	if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL)) {
>> +		mlx5_core_err(dev, "zalloc_cpumask_var failed\n");
>> +		return false;
>> +	}
>> +	cpumask_copy(cpumask, cpu_online_mask);
>> +
>> +	first = cpumask_local_spread(0, dev->priv.numa_node);
>
> Arguably you want something like:
>
> 	first = cpumask_any(cpumask_of_node(dev->priv.numa_node));

Any doesn't sound like what I'm looking for, I'm looking for first.
I do care about the order within the node, so it's more like
cpumask_first(cpumask_of_node(dev->priv.numa_node));

Do you think this has any advantage over cpumask_local_spread, if used
only during the setup phase of the driver?
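
To make the "first" vs. "any" distinction concrete, here is a minimal
userspace sketch (not kernel code). It models each node's cpumask as a
uint64_t. The names sketch_mask_first, sketch_first_cpu_of_node and
SKETCH_NR_CPUS are hypothetical stand-ins for cpumask_first() and
cpumask_of_node(); the real helpers operate on struct cpumask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit for this sketch: one bit per CPU in a uint64_t. */
#define SKETCH_NR_CPUS 64

/*
 * Userspace stand-in for cpumask_first(): index of the lowest set bit,
 * or SKETCH_NR_CPUS when the mask is empty.
 */
static int sketch_mask_first(uint64_t mask)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return SKETCH_NR_CPUS;
}

/*
 * Userspace stand-in for cpumask_first(cpumask_of_node(node)).
 * node_masks[] is an assumed per-node table supplied by the caller.
 */
static int sketch_first_cpu_of_node(const uint64_t *node_masks, int nr_nodes,
				    int node)
{
	assert(node >= 0 && node < nr_nodes);
	return sketch_mask_first(node_masks[node]);
}
```

With node masks 0x0F and 0xF0, node 1 resolves to CPU 4 on every call,
which is the deterministic in-node ordering I am after.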
>
>> +
>> +	for (i = 0; i < ncomp_eqs; i++) {
>> +		int cpu;
>> +
>> +		cpu = sched_numa_find_closest(cpumask, first);
>> +		if (cpu >= nr_cpu_ids) {
>> +			mlx5_core_err(dev, "sched_numa_find_closest failed, cpu(%d) >= nr_cpu_ids(%d)\n",
>> +				      cpu, nr_cpu_ids);
>> +
>> +			free_cpumask_var(cpumask);
>> +			return false;
>
> So this will fail when ncomp_eqs > cpumask_weight(online_cpus); is that
> desired?
>

Yes. ncomp_eqs does not exceed the number of online cores.

>> +		}
>> +		cpus[i] = cpu;
>> +		cpumask_clear_cpu(cpu, cpumask);
>
> Since there is no concurrency on this cpumask, you don't need atomic
> ops:
>
> 	__cpumask_clear_cpu(..);
>

Right. I'll fix.

>> +	}
>> +
>> +	free_cpumask_var(cpumask);
>> +	return true;
>> +#else
>> +	return false;
>> +#endif
>> +}
>> +
>> +static void mlx5_set_eqs_cpus(struct mlx5_core_dev *dev, u16 *cpus, int ncomp_eqs)
>> +{
>> +	bool success = set_cpus_by_numa_distance(dev, cpus, ncomp_eqs);
>> +
>> +	if (!success)
>> +		set_cpus_by_local_spread(dev, cpus, ncomp_eqs);
>> +}
>> +
>>  static int comp_irqs_request(struct mlx5_core_dev *dev)
>>  {
>>  	struct mlx5_eq_table *table = dev->priv.eq_table;
>>  	int ncomp_eqs = table->num_comp_eqs;
>>  	u16 *cpus;
>>  	int ret;
>> -	int i;
>>
>>  	ncomp_eqs = table->num_comp_eqs;
>>  	table->comp_irqs = kcalloc(ncomp_eqs, sizeof(*table->comp_irqs), GFP_KERNEL);
>> @@ -830,8 +887,7 @@ static int comp_irqs_request(struct mlx5_core_dev *dev)
>>  		ret = -ENOMEM;
>>  		goto free_irqs;
>>  	}
>> -	for (i = 0; i < ncomp_eqs; i++)
>> -		cpus[i] = cpumask_local_spread(i, dev->priv.numa_node);
>> +	mlx5_set_eqs_cpus(dev, cpus, ncomp_eqs);
>
> So you change this for mlx5, what about the other users of
> cpumask_local_spread() ?

I took a look at the different netdev users.
While some users have a use case similar to ours (affinity hints), many
others use cpumask_local_spread in other flows (XPS setting, ring
allocations, etc.).
Moving them to the newly exposed API needs a deeper dive into their
code, especially because of possible undesired side effects.

I prefer not to include these changes in my series for now, but will
probably contribute them as follow-up work.

Regards,
Tariq
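
For reference, the selection loop discussed in this thread can be
sketched in userspace as below. This is an illustration under stated
assumptions, not the driver code: the 8-CPU, 2-node topology, the
distance table, and every sketch_* name are hypothetical.
sketch_find_closest() only approximates sched_numa_find_closest() by
returning the lowest-numbered remaining CPU on the nearest node. The
working mask is a private copy, so plain (non-atomic) bit clears are
enough, and the loop fails once more EQs are requested than there are
CPUs in the mask.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS  8
#define SKETCH_NR_NODES 2

/* Hypothetical topology: CPUs 0-3 sit on node 0, CPUs 4-7 on node 1. */
static const int sketch_cpu_node[SKETCH_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Hypothetical NUMA distance table (local = 10, remote = 20). */
static const int sketch_distance[SKETCH_NR_NODES][SKETCH_NR_NODES] = {
	{ 10, 20 },
	{ 20, 10 },
};

/*
 * Approximation of sched_numa_find_closest(): among the CPUs still set in
 * mask, return the lowest-numbered one whose node is nearest to the node of
 * 'first'. Returns SKETCH_NR_CPUS when the mask is empty.
 */
static int sketch_find_closest(uint32_t mask, int first)
{
	int best = SKETCH_NR_CPUS;
	int best_dist = INT_MAX;
	int from;

	assert(first >= 0 && first < SKETCH_NR_CPUS);
	from = sketch_cpu_node[first];

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		int dist;

		if (!(mask & (1u << cpu)))
			continue;
		dist = sketch_distance[from][sketch_cpu_node[cpu]];
		if (dist < best_dist) {
			best_dist = dist;
			best = cpu;
		}
	}
	return best;
}

/* Greedy nearest-first assignment; false if the mask runs out of CPUs. */
static bool sketch_set_cpus_by_distance(uint32_t online, int first,
					uint16_t *cpus, int ncomp_eqs)
{
	uint32_t mask = online; /* private copy: non-atomic clears suffice */

	for (int i = 0; i < ncomp_eqs; i++) {
		int cpu = sketch_find_closest(mask, first);

		if (cpu >= SKETCH_NR_CPUS)
			return false; /* more EQs requested than CPUs in mask */
		cpus[i] = (uint16_t)cpu;
		mask &= ~(1u << cpu);
	}
	return true;
}
```

Starting from CPU 4 with all eight CPUs online, six EQs get CPUs
4, 5, 6, 7, 0, 1: the local node is exhausted before the remote node is
touched. A caller would fall back to another policy when the function
returns false, as mlx5_set_eqs_cpus() does in the patch.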