Received: by 2002:a05:6358:e9c4:b0:b2:91dc:71ab with SMTP id hc4csp5750180rwb; Tue, 9 Aug 2022 03:33:48 -0700 (PDT) X-Google-Smtp-Source: AA6agR7wUVPZ8elukKzM9yj1iqfYc5wjZ/6vo0CbcrsP9xqH20/fXn8re1IE89kIlU/PLgGqdddH X-Received: by 2002:a17:907:72cf:b0:730:8bea:f741 with SMTP id du15-20020a17090772cf00b007308beaf741mr16758416ejc.527.1660041228125; Tue, 09 Aug 2022 03:33:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660041228; cv=none; d=google.com; s=arc-20160816; b=oIBcDRgfnGarMXuDNCp/rJ7ZdFaJGZ92ax3Pa5z4xEULmGOJpz/ttF3EYVssfujMCq tPuhJDBTJz1Rx+6E4bMUq9vcTRCqQoDYEJCSjOL6VnUhfr0qK4Lu5cHqyh8EEfCshnBS 4T8ZnH9NdHKLZ4TixPFZ8Xku4tGO1aohoEjuG2WvF3ju6y6keWGxn2rDmnywbrk2iHf2 gdN9bpkidLTUQ+NrWdyImCW+1vn33VabYdkkOH42Pqo6au4bXnyNs+PDsRc5xpBGovra 8eLwtFYVAKbeUuLP2I3HzycF+6BD4t2otdrcVUeUKJn3lmzWsy7LO+5yL5fzHCCVNu+L JABg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=dc5u1NxKa9etfng0qiWTVwtMMVoMS+BgmiyLem1i3ds=; b=fwZ3VHeuCzZXOQtvqpMDN9vW6TP2CEfr01PUJm0iE8XXtAe4a9FI5AltsPGCFDyIH0 DM5OJWU3R/hvjINsjxvulLzoSi6T1b1RzMC5c8DczH5ZdqIJZJpKcZjbO8hsDJYUmySK hqZyQ7RdMO5kfDaA8NXpECjNMpZbp7eBZI1IXkei3KE2Ag486430w9eirolGTp2vUvd5 hpktuoumRHdiMG/DeITdNzxPmMp/HKZG9WvhYn0qcjXQKCjZv5zGNE4dowLEAXQyLAgH eu5J5L62TX4xz23Muho0jrPMjNsjZlEhPBVt3A53oM6Ml+xOczUj8kmOoXsEFeMmgcY6 1txw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=Zbq7BT4l; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gt11-20020a1709072d8b00b0072a7a097e3csi2208892ejc.464.2022.08.09.03.33.20; Tue, 09 Aug 2022 03:33:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=Zbq7BT4l; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242255AbiHIKSl (ORCPT + 99 others); Tue, 9 Aug 2022 06:18:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36222 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242221AbiHIKSi (ORCPT ); Tue, 9 Aug 2022 06:18:38 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C92DB220EB; Tue, 9 Aug 2022 03:18:37 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id j7so13810916wrh.3; Tue, 09 Aug 2022 03:18:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc; bh=dc5u1NxKa9etfng0qiWTVwtMMVoMS+BgmiyLem1i3ds=; b=Zbq7BT4lqJXWkxNd/a4eYq5Ki9iM2vn1Evkr5+e5JFw7E9JvwwT+W28SiVvC2Yy2OC XrtEmO3ZDGLHWKuri2zX4fXDSxu2/OSXwQCtc69mgyZ/maK2vyfQPoNrfhpHqgBZ5n3n szDIuWXorp469KFbhWD/zVkCMQU+uLxqX/xzZaXun4Eh/w7fQFkW376sRWGHZjwj+3oM EKVHCTLIRXkkF6onD+GJIsayGVQoNVOLt+JsORgiVUwA+RNeWVpscMXorGYCvbFg2XFD YXK0XW/9HKDDYNsFtAS2z69PqJAedVIy5xsSJeObwE8uZrNPy2eK6iKIbqF6T35nuMoE Lo9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc; bh=dc5u1NxKa9etfng0qiWTVwtMMVoMS+BgmiyLem1i3ds=; b=qvOX4p8dVkFz0nyPEQHZDYn2Pki7l0m7Vrfa1iN3+ZZMlpXoqvKGwaAwhKKKaEGwb7 JX7RA/n30fnhFbP3bwGmJ3ENP98YGJyZUn81DZnw8VdScQWW7m6w4VmRo86rqgbqfUiM 7futQi5ye9ZQ4GwGT9l0RTXqhRCzsSiZywAo6VNdiGfvoeGdDdBMXBHfFjMTd6H+8s0V a1uhGaCAk55Bd3YIvkoExP8+N5bdD7pTnIalyIVbPWdOMhXPX/EPXJTx/a0ws88dxY+t js7KMtqhLCwIJhqyE8g62EBzaXLNrwpQ+muMOvsJN5e/H+1K/QpNoxg7pPZ3ceLI6T73 uJTw== X-Gm-Message-State: ACgBeo33C8U6gHyoZ3Sdo29Lu5l9DZKifi4iMs9sNc/fPqm95wdjRn02 2GRNAhWNgVFfluTiLSLlkvA= X-Received: by 2002:a5d:56cf:0:b0:21e:ce64:afe7 with SMTP id m15-20020a5d56cf000000b0021ece64afe7mr13580138wrw.281.1660040316191; Tue, 09 Aug 2022 03:18:36 -0700 (PDT) Received: from [192.168.0.104] ([77.126.166.31]) by smtp.gmail.com with ESMTPSA id e12-20020adfa44c000000b0021e5cc26dd0sm13129209wra.62.2022.08.09.03.18.34 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 09 Aug 2022 03:18:35 -0700 (PDT) Message-ID: <12fd25f9-96fb-d0e0-14ec-3f08c01a5a4b@gmail.com> Date: Tue, 9 Aug 2022 13:18:33 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.12.0 Subject: Re: [PATCH net-next V4 1/3] sched/topology: Add NUMA-based CPUs spread API Content-Language: en-US To: Valentin Schneider , Tariq Toukan , "David S. Miller" , Saeed Mahameed , Jakub Kicinski , Ingo Molnar , Peter Zijlstra , Juri Lelli Cc: Eric Dumazet , Paolo Abeni , netdev@vger.kernel.org, Gal Pressman , Vincent Guittot , linux-kernel@vger.kernel.org References: <20220728191203.4055-1-tariqt@nvidia.com> <20220728191203.4055-2-tariqt@nvidia.com> From: Tariq Toukan In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,NICE_REPLY_A, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/9/2022 1:02 PM, Valentin Schneider wrote: > On 08/08/22 17:39, Tariq Toukan wrote: >> On 8/4/2022 8:28 PM, Valentin Schneider wrote: >>> On 28/07/22 22:12, Tariq Toukan wrote: >>>> +static bool sched_cpus_spread_by_distance(int node, u16 *cpus, int ncpus) >>> ^^^^^^^^^ >>> That should be a struct *cpumask. >> >> With cpumask, we'll lose the order. >> > > Right, I didn't get that from the changelog. I'll make it clear when re-spinned. > >>> >>>> +{ >>>> + cpumask_var_t cpumask; >>>> + int first, i; >>>> + >>>> + if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL)) >>>> + return false; >>>> + >>>> + cpumask_copy(cpumask, cpu_online_mask); >>>> + >>>> + first = cpumask_first(cpumask_of_node(node)); >>>> + >>>> + for (i = 0; i < ncpus; i++) { >>>> + int cpu; >>>> + >>>> + cpu = sched_numa_find_closest(cpumask, first); >>>> + if (cpu >= nr_cpu_ids) { >>>> + free_cpumask_var(cpumask); >>>> + return false; >>>> + } >>>> + cpus[i] = cpu; >>>> + __cpumask_clear_cpu(cpu, cpumask); >>>> + } >>>> + >>>> + free_cpumask_var(cpumask); >>>> + return true; >>>> +} >>> >>> This will fail if ncpus > nr_cpu_ids, which shouldn't be a problem. It >>> would make more sense to set *up to* ncpus, the calling code can then >>> decide if getting fewer than requested is OK or not. >>> >>> I also don't get the fallback to cpumask_local_spread(), is that if the >>> NUMA topology hasn't been initialized yet? It feels like most users of this >>> would invoke it late enough (i.e. anything after early initcalls) to have >>> the backing data available. >> >> I don't expect this to fail, as we invoke it late enough. Fallback is >> there just in case, to preserve the old behavior instead of getting >> totally broken. >> > > Then there shouldn't be a fallback method - the main method is expected to > work. > I'll drop the fallback then. >>> >>> Finally, I think iterating only once per NUMA level would make more sense. >> >> Agree, although it's just a setup stage. >> I'll check if it can work for me, based on the reference code below. >> >>> >>> I've scribbled something together from those thoughts, see below. This has >>> just the mlx5 bits touched to show what I mean, but that's just compile >>> tested. >> >> My function returns a *sorted* list of the N closest cpus. >> That is important. In many cases, drivers do not need all N irqs, but >> only a portion of it, so it wants to use the closest subset of cpus. >> > > Are there cases where we can't figure this out in advance? From what I grok > out of the two callsites you patched, all vectors will be used unless some > error happens, so compressing the CPUs in a single cpumask seemed > sufficient. > All vectors will be initialized to support the maximum number of traffic rings. However, the actual number of traffic rings can be controlled and set to a lower number N_actual < N. In this case, we'll be using only N_actual instances and we want them to be the first/closest.