Subject: Re: [PATCH v2] lib: optimize cpumask_local_spread()
From: Shaokun Zhang
To: Andrew Morton, Michal Hocko
Cc: yuqi jin, Mike Rapoport, Paul Burton, Michael Ellerman, Anshuman Khandual
Date: Wed, 6 Nov 2019 10:49:19 +0800
References: <1572863268-28585-1-git-send-email-zhangshaokun@hisilicon.com> <20191105070141.GF22672@dhcp22.suse.cz> <20191105173359.39052327cf221d9c4b26b783@linux-foundation.org>
In-Reply-To: <20191105173359.39052327cf221d9c4b26b783@linux-foundation.org>

Hi Andrew,

On 2019/11/6 9:33, Andrew Morton wrote:
> On Tue, 5 Nov 2019 08:01:41 +0100 Michal Hocko wrote:
>
>> On Mon 04-11-19 18:27:48, Shaokun Zhang wrote:
>>> From: yuqi jin
>>>
>>> In the multi-processor and NUMA system, I/O device may have many numa
>>> nodes belonging to multiple cpus. When we get a local numa, it is
>>> better to find the node closest to the local numa node, instead
>>> of choosing any online cpu immediately.
>>>
>>> For the current code, it only considers the local NUMA node and it
>>> doesn't compute the distances between different NUMA nodes for the
>>> non-local NUMA nodes. Let's optimize it and find the nearest node
>>> through NUMA distance. The performance will be better if it returns
>>> the nearest node than the random node.
>>
>> Numbers please
>
> The changelog had
>
> : When Parameter Server workload is tested using NIC device on Huawei
> : Kunpeng 920 SoC:
> : Without the patch, the performance is 22W QPS;
> : Added this patch, the performance become better and it is 26W QPS.
>
>> [...]
>>> +/**
>>> + * cpumask_local_spread - select the i'th cpu with local numa cpu's first
>>> + * @i: index number
>>> + * @node: local numa_node
>>> + *
>>> + * This function selects an online CPU according to a numa aware policy;
>>> + * local cpus are returned first, followed by the nearest non-local ones,
>>> + * then it wraps around.
>>> + *
>>> + * It's not very efficient, but useful for setup.
>>> + */
>>> +unsigned int cpumask_local_spread(unsigned int i, int node)
>>> +{
>>> +	int node_dist[MAX_NUMNODES] = {0};
>>> +	bool used[MAX_NUMNODES] = {0};
>>
>> Ugh. This might be a lot of stack space. Some distro kernels use a large
>> NODE_SHIFT (e.g. 10), so this would be 4kB of stack space just for
>> node_dist.
>
> Yes, that's big. From a quick peek I suspect we could get by using an
> array of unsigned shorts here but that might be fragile over time even
> if it works now?

Yes, how about we define another macro with a value of 128? (I am not sure
whether that is big enough for every real system.)

--->8
 unsigned int cpumask_local_spread(unsigned int i, int node)
 {
-	int node_dist[MAX_NUMNODES] = {0};
-	bool used[MAX_NUMNODES] = {0};
+	#define NUMA_NODE_NR	128
+	int node_dist[NUMA_NODE_NR] = {0};
+	bool used[NUMA_NODE_NR] = {0};
 	int cpu, j, id;

 	/* Wrap: we always want a cpu. */
@@ -278,7 +279,7 @@ unsigned int cpumask_local_spread(unsigned int i, int node)
 			if (i-- == 0)
 				return cpu;
 	} else {
-		if (nr_node_ids > MAX_NUMNODES)
+		if (nr_node_ids > NUMA_NODE_NR)
 			return __cpumask_local_spread(i, node);

 		calc_node_distance(node_dist, node);

> Perhaps we could make it a statically allocated array and protect the
> entire thing with a spin_lock_irqsave()? It's not a frequently called
> function.

That is another way to solve this issue; I am not sure which one you and
Michal would prefer. ;-)

Thanks,
Shaokun
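
P.S. In case it helps the comparison, below is a rough sketch of what the
statically allocated variant suggested above might look like. The names
spread_lock and node_used, and the explicit memset() resets, are made up
for the illustration, and the distance-ordered node walk from the v2 patch
is assumed unchanged (only the calc_node_distance() call is shown, the
rest is a comment), so this is a sketch of the storage/locking change
rather than a drop-in replacement:

/* Scratch space moved off the stack; callers are serialized by a lock. */
static int node_dist[MAX_NUMNODES];
static bool node_used[MAX_NUMNODES];
static DEFINE_SPINLOCK(spread_lock);

unsigned int cpumask_local_spread(unsigned int i, int node)
{
	unsigned long flags;
	unsigned int cpu = 0;

	/* Wrap: we always want a cpu. */
	i %= num_online_cpus();

	spin_lock_irqsave(&spread_lock, flags);

	/* The scratch arrays are shared between callers now, so clear them. */
	memset(node_dist, 0, sizeof(node_dist));
	memset(node_used, 0, sizeof(node_used));

	calc_node_distance(node_dist, node);
	/*
	 * ... the distance-ordered node walk from the v2 patch goes here,
	 * setting 'cpu' to the i'th online CPU counted from the nearest
	 * node ...
	 */

	spin_unlock_irqrestore(&spread_lock, flags);
	return cpu;
}

This keeps the MAX_NUMNODES-sized arrays out of the stack frame at the
cost of serializing concurrent callers, which should be acceptable given
that the function is not frequently called, as noted above.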