References: <20200616164801.18644-1-peter.puhov@linaro.org> <106350c5-c2b7-a63c-2b06-761f523ee67c@arm.com> <20200702132058.GN3129@suse.de>
In-Reply-To: <20200702132058.GN3129@suse.de>
From: Vincent Guittot
Date: Thu, 2 Jul 2020 15:45:11 +0200
Subject: Re: [PATCH] sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
To: Mel Gorman
Cc: Dietmar Eggemann, Peter Puhov, Valentin Schneider, linux-kernel, Robert Foley, Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt, Ben Segall
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2 Jul 2020 at 15:29, Mel Gorman wrote:
>
> On Thu, Jul 02, 2020 at 11:27:52AM +0200, Dietmar Eggemann wrote:
> > On 17/06/2020 16:52, Peter Puhov wrote:
> > > On Wed, 17 Jun 2020 at 06:50, Valentin Schneider wrote:
> > >>
> > >>
> > >> On 16/06/20 17:48, peter.puhov@linaro.org wrote:
> > >>> From: Peter Puhov
> > >>> We tested this patch with following benchmarks:
> > >>>   perf bench -f simple sched pipe -l 4000000
> > >>>   perf bench -f simple sched messaging -l 30000
> > >>>   perf bench -f simple mem memset -s 3GB -l 15 -f default
> > >>>   perf bench -f simple futex wake -s -t 640 -w 1
> > >>>   sysbench cpu --threads=8 --cpu-max-prime=10000 run
> > >>>   sysbench memory --memory-access-mode=rnd --threads=8 run
> > >>>   sysbench threads --threads=8 run
> > >>>   sysbench mutex --mutex-num=1 --threads=8 run
> > >>>   hackbench --loops 20000
> > >>>   hackbench --pipe --threads --loops 20000
> > >>>   hackbench --pipe --threads --loops 20000 --datasize 4096
> > >>>
> > >>> and found some performance improvements in:
> > >>>   sysbench threads
> > >>>   sysbench mutex
> > >>>   perf bench futex wake
> > >>> and no regressions in others.
> > >>>
> > >>
> > >> One nitpick for the results of those: condensing them in a table form would
> > >> make them more reader-friendly. Perhaps something like:
> > >>
> > >> | Benchmark        | Metric   | Lower is better? | BASELINE | SERIES | DELTA |
> > >> |------------------+----------+------------------+----------+--------+-------|
> > >> | Sysbench threads | # events | No               | 45526    | 56567  | +24%  |
> > >> | Sysbench mutex   | ...      |                  |          |        |       |
> > >>
> > >> If you want to include more stats for each benchmark, you could have one table
> > >> per (e.g. see [1]) - it'd still be a more readable form (or so I believe).
> >
> > Wouldn't Unix Bench's 'execl' and 'spawn' be the ultimate test cases
> > for those kind of changes?
> >
> > I only see minor improvements with tip/sched/core as base on hikey620
> > (Arm64 octa-core).
> >
> >                              base     w/ patch
> > ./Run spawn -c 8 -i 10       633.6    635.1
> >
> > ./Run execl -c 8 -i 10       1187.5   1190.7
> >
> >
> > At the end of find_idlest_group(), when comparing local and idlest, it
> > is explicitly mentioned that the number of idle_cpus is used instead of
> > utilization.
> > The comparison between potential idle groups and the local & idlest group
> > should probably follow the same rules.

Comparing the number of idle CPUs is not enough in the case described by
Peter, because the newly forked thread sleeps immediately, before we select
a CPU for the next one. This shows up in the trace, where the same CPU7 is
selected for all wakeup_new events. That's why looking at utilization, when
the number of idle CPUs is the same, is a good way to see where the previous
task was placed.

Using nr_running doesn't solve the problem either, because the newly forked
task is not running; had it still been running, the CPU would not have been
idle in this case and an idle CPU would have been selected instead.

> >
>
> There is the secondary hazard that what update_sd_pick_busiest returns
> is checked later by find_busiest_group when considering the imbalance
> between NUMA nodes. This particular patch splits basic communicating tasks
> cross-node again at fork time so cross node communication is reintroduced
> (same applies if sum_nr_running is used instead of group_util).

As long as there is an idle CPU in the node, a new thread doesn't cross
nodes, just as before the patch. The only difference happens inside the node.

>
> --
> Mel Gorman
> SUSE Labs
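
---

A minimal, self-contained sketch of the tie-break under discussion, assuming
simplified data structures: this is not the kernel's find_idlest_group() /
update_pick_idlest() code, and the struct, field names, and values below are
made up for illustration only. The idea it models is the one Vincent
describes: prefer the group with more idle CPUs, and when the idle-CPU counts
are equal, fall back to the lowest utilization, because a just-forked task
that has already gone to sleep still leaves its utilization on the CPU it was
placed on even though that CPU looks idle to nr_running and to the idle-CPU
count.

/*
 * Illustrative userspace sketch only (not kernel code): pick between two
 * candidate groups using idle-CPU count first and group utilization as the
 * tie-break, mirroring the behaviour discussed in the thread.
 */
#include <stdio.h>

struct group_stats {
	const char *name;
	unsigned int idle_cpus;   /* CPUs currently idle in the group */
	unsigned long group_util; /* summed utilization of the group  */
};

/* More idle CPUs wins; on a tie, the group with lower utilization wins. */
static const struct group_stats *pick_idlest(const struct group_stats *a,
					     const struct group_stats *b)
{
	if (a->idle_cpus != b->idle_cpus)
		return a->idle_cpus > b->idle_cpus ? a : b;

	/* Same idle-CPU count: the recently used group carries more util. */
	return a->group_util <= b->group_util ? a : b;
}

int main(void)
{
	/* Both groups look equally idle, but group0 just received a fork. */
	struct group_stats g0 = { "group0", 4, 350 };
	struct group_stats g1 = { "group1", 4, 120 };

	/* Prints "picked group1": the lower-utilization group is chosen. */
	printf("picked %s\n", pick_idlest(&g0, &g1)->name);
	return 0;
}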