Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp2429356rwb; Fri, 11 Nov 2022 09:09:47 -0800 (PST) X-Google-Smtp-Source: AA0mqf67Jfmm9h//FiCfkwjaBbJgeFuQJcziyHCqBfdUIWhBLYUH0sVrRKpri7MlTgocG6TF9+rt X-Received: by 2002:a17:902:cad5:b0:186:7a1d:b6ee with SMTP id y21-20020a170902cad500b001867a1db6eemr3470598pld.67.1668186586774; Fri, 11 Nov 2022 09:09:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668186586; cv=none; d=google.com; s=arc-20160816; b=JNfZBI9nzC3pBQqCr3SUUsyr6g/Olazido7vKecfOoDpHzTQkMxzTsdXxf/LNuVqha n67qj6Cv18nuLeHnpST+I1y+Y5TMPEV9N4L0bt+KT+g5eOe3nglMZcEYl4+P4pWknZdi zSriIKGPPN4NhSkdU5C8qXg9o7fdbKeb1YdSHg2Vo/n7xbLDjPmMrC+vdfnFsYO1ekwW ImKHq9F5vV3zZ2cBmRcHVDP2AHUqfx+s/KZgYtxOPZGv7GflDHTW9V0fJLU3YbmhDbvK EruI8Z+xKFEvUcXyRavc94/891Z8I39Gp9c9Xil9dM7/jseKsoiT+zk8SeMz+Obtp3Y3 hs3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=AC0O3le8bk/sr0yNI25Hqpom2ujUVEBU3Z1XU45Dy38=; b=FWq0+St3cf1toeAzH/P/QN8Ylb78UwH8ekP9lUG3rCBu8b0/SBHySRLp86SAyyjb2K 2BIXr0I9p9vpHkd62NJlBQ+Srd0r6tpSYAgvzVfDEk+IwmtKCP8LJ3y7qLcpRccwDwSG 17kCNjef/1WQ6GKYBLkmFzMpMFPRZlGi6ewLD0/fKsLgpyeMvfy/e0mD8cG7CZkqw+jh NBPchiSwvL9v88S181N9bFtQtYliBeWN+Iwl4tfeYQW+hgoxTsi0qCZJBYg/qqArkzb/ kfykN55GvJhroSbnyTq2Vm2mdQjHkeO4blSMk1SpKWF5fxGukD/MqMtX4bzzv00cEvVV ztcw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=PPHA7iWY; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o22-20020a170902779600b00186a3bc939esi2827599pll.211.2022.11.11.09.09.19; Fri, 11 Nov 2022 09:09:46 -0800 (PST) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=PPHA7iWY; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234324AbiKKRIF (ORCPT + 99 others); Fri, 11 Nov 2022 12:08:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234245AbiKKRHo (ORCPT ); Fri, 11 Nov 2022 12:07:44 -0500 Received: from mail-oi1-x231.google.com (mail-oi1-x231.google.com [IPv6:2607:f8b0:4864:20::231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10A6B86D7F; Fri, 11 Nov 2022 09:07:17 -0800 (PST) Received: by mail-oi1-x231.google.com with SMTP id m204so5436732oib.6; Fri, 11 Nov 2022 09:07:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=AC0O3le8bk/sr0yNI25Hqpom2ujUVEBU3Z1XU45Dy38=; b=PPHA7iWYBy4k5/xw+dIDuuhBjUxzOJNVpMhUmMio95WxSvS1WviKVTkEaIwe8pGe1n m708NfylWOAXWxW+m5w6D+kz9yTYs+C8CPZhsNwWqSS0gTQF53yAQKp2JhTMsu/FC2Zr 8kCDcR3MmDrSrogmClYcaH9TOrHLDT39/8ldoWBEUlfGXW7/mRJxdvP3EEiJbDQWT8wm BD+BAkKkSmZv3EngdzwauWZZa379am6LgSr4ZYtkoCZS2QDvk3Xq5scs1m57MuzWcq0W 0GFSFo9nLBRQrTDtNzfzCRwXtPCG9J8ya0G7SyBlcyDH+TXkQqlLXqvZg7zPA1CPRYHj Talw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=AC0O3le8bk/sr0yNI25Hqpom2ujUVEBU3Z1XU45Dy38=; b=CjrxNKgAFvEACqZDsPeLOnNtjQbiHj8MSn8H/OYonmy6hi9S3WSkZgkI68NFYF3sg1 XmdqNk0cZXaP8dr5eKN89KNxAIiyj53Unv1gKRSERLAtUrj+o3MnlJgLG02ZgtZYBGuB sAwXRrDw3DNFJ2ej0psd5+PIKAWHBHXNGLqmGEOko7JcWjQv5lBtKHIfYqu2hD9GAwc1 rRUeFCisbvi5Xe/ZRLLwEuNcsrMDrMxQC6rsWpAlcSOF+PmGazInzxyDOGmlXqRUjcjU /ZFsUWdwjmBOIiEaN8WDPCRjbVeYf78XFnbeSUKJmLASpzEuWWxDa0mp1CGFbR0Sg0FX 3dZw== X-Gm-Message-State: ANoB5pmPebQyUxf3ehcDZ8fpotyi+qesOItqJNC4kgXeTLfKC5v43yAF FCTnpgOFz7ttE420eSu35Y8= X-Received: by 2002:a54:400d:0:b0:359:fa2c:78c6 with SMTP id x13-20020a54400d000000b00359fa2c78c6mr1293291oie.3.1668186436179; Fri, 11 Nov 2022 09:07:16 -0800 (PST) Received: from localhost ([12.97.180.36]) by smtp.gmail.com with ESMTPSA id l5-20020a056830154500b00661a33883b8sm1150722otp.71.2022.11.11.09.07.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 11 Nov 2022 09:07:15 -0800 (PST) Date: Fri, 11 Nov 2022 09:07:15 -0800 From: Yury Norov To: Andy Shevchenko Cc: linux-kernel@vger.kernel.org, "David S. Miller" , Barry Song , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Gal Pressman , Greg Kroah-Hartman , Heiko Carstens , Ingo Molnar , Jakub Kicinski , Jason Gunthorpe , Jesse Brandeburg , Jonathan Cameron , Juri Lelli , Leon Romanovsky , Mel Gorman , Peter Zijlstra , Rasmus Villemoes , Saeed Mahameed , Steven Rostedt , Tariq Toukan , Tariq Toukan , Tony Luck , Valentin Schneider , Vincent Guittot , linux-crypto@vger.kernel.org, netdev@vger.kernel.org, linux-rdma@vger.kernel.org Subject: Re: [PATCH 3/4] sched: add sched_numa_find_nth_cpu() Message-ID: References: <20221111040027.621646-1-yury.norov@gmail.com> <20221111040027.621646-4-yury.norov@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org On Fri, Nov 11, 2022 at 01:42:29PM +0200, Andy Shevchenko wrote: > On Thu, Nov 10, 2022 at 08:00:26PM -0800, Yury Norov wrote: > > The function finds Nth set CPU in a given cpumask starting from a given > > node. > > > > Leveraging the fact that each hop in sched_domains_numa_masks includes the > > same or greater number of CPUs than the previous one, we can use binary > > search on hops instead of linear walk, which makes the overall complexity > > of O(log n) in terms of number of cpumask_weight() calls. > > ... > > > +int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node) > > +{ > > + unsigned int first = 0, mid, last = sched_domains_numa_levels; > > + struct cpumask ***masks; > > *** ? > Hmm... Do we really need such deep indirection? It's 2d array of pointers, so - yes. > > + int w, ret = nr_cpu_ids; > > + > > + rcu_read_lock(); > > + masks = rcu_dereference(sched_domains_numa_masks); > > + if (!masks) > > + goto out; > > + > > + while (last >= first) { > > + mid = (last + first) / 2; > > + > > + if (cpumask_weight_and(cpus, masks[mid][node]) <= cpu) { > > + first = mid + 1; > > + continue; > > + } > > + > > + w = (mid == 0) ? 0 : cpumask_weight_and(cpus, masks[mid - 1][node]); > > See below. > > > + if (w <= cpu) > > + break; > > + > > + last = mid - 1; > > + } > > We have lib/bsearch.h. I haven't really looked deeply into the above, but my > gut feelings that that might be useful here. Can you check that? Yes we do. I tried it, and it didn't work because nodes arrays are allocated dynamically, and distance between different pairs of hops for a given node is not a constant, which is a requirement for bsearch. However, distance between hops pointers in 1st level array should be constant, and we can try feeding bsearch with it. I'll experiment with bsearch for more. > > + ret = (mid == 0) ? > > + cpumask_nth_and(cpu - w, cpus, masks[mid][node]) : > > + cpumask_nth_and_andnot(cpu - w, cpus, masks[mid][node], masks[mid - 1][node]); > > You can also shorten this by inversing the conditional: > > ret = mid ? ...not 0... : ...for 0...; Yep, why not. > > +out: > > out_unlock: ? Do you think it's better? > > + rcu_read_unlock(); > > + return ret; > > +} > > -- > With Best Regards, > Andy Shevchenko >