Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp2610042pxu; Mon, 14 Dec 2020 06:49:15 -0800 (PST) X-Google-Smtp-Source: ABdhPJxgdgiU7tPO1AZ0Xa2ccjhiBEB45NtjivSvA4YmwhEjFFWKsiCuCI8VkPmFS4d9FwQTE94N X-Received: by 2002:a17:906:e247:: with SMTP id gq7mr23727001ejb.27.1607957354792; Mon, 14 Dec 2020 06:49:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1607957354; cv=none; d=google.com; s=arc-20160816; b=B2XtIeVVCAQ1/Qx0xFKXbwtpYdRBO7GUSfhFibEkjDZCbtzwN0PN88hWxBKwhFTbw2 QLYFUol30FQn80xsrMhYkkYvbVQJHny9MXC69rEP6DiyAT30BCdBlRVN2wYH1iIHjBkz TVukZZi/3O7AT79zJmPJ7L1+WEp4BESBrIIpqRSYBPgprbrfaujMrGjpqgcRu0sW4YYi CLzptBSqgWkNg3O/YCTb+gvRKVl2snFTzCjING43eaPFXthVc5GV1ZhU1hscsU/aBo5a KBW+RuAizwCnUxx4MnhyQMo5SeVUWcW6yHVCjLeDy2sJsr5eG6GUI+mwL1lj+EAea7HD Njdg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=8GHbSYUO8iN6Ds2GSgB2Q8PPyRz6xh/DYTYwntlZb9Q=; b=fuMoh4jqznBb9UnUF8JAdInDPf5qtO6dp+lC352+ULdFiIptjk4fRGR2gvYnnJWwyG +6u5T0/p1+56OQMu5/ql8PmAxE3t8NZyXL7Zam/HjsTJVtJHzcwut3Dyh8EAty/rpa2K DrveqhJ6ycvGuUW+8MkraxE9yTkk9ysTs+UFik8iMnsuEtX1uKSbPqXqkdbmcVHSTB0k hYZBFTZxIr6HyRQ0fYGzK1xMTfeKwd1LY6lQ1UJDEq56nNJEn2bC7nYTjBESbW/Tg2ys sBd9PFZgybySQcriz0US9S59w6b7gJ2WZsBEVVNgtSAfb12QYo09XavFp52bFzIktNmA U+hA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id m26si8517285edp.209.2020.12.14.06.48.51; Mon, 14 Dec 2020 06:49:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2439004AbgLNMhp (ORCPT + 99 others); Mon, 14 Dec 2020 07:37:45 -0500 Received: from outbound-smtp02.blacknight.com ([81.17.249.8]:49781 "EHLO outbound-smtp02.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726381AbgLNMha (ORCPT ); Mon, 14 Dec 2020 07:37:30 -0500 Received: from mail.blacknight.com (pemlinmail04.blacknight.ie [81.17.254.17]) by outbound-smtp02.blacknight.com (Postfix) with ESMTPS id 52019BAAA3 for ; Mon, 14 Dec 2020 12:36:34 +0000 (GMT) Received: (qmail 20736 invoked from network); 14 Dec 2020 12:36:33 -0000 Received: from unknown (HELO techsingularity.net) (mgorman@techsingularity.net@[84.203.22.4]) by 81.17.254.9 with ESMTPSA (AES256-SHA encrypted, authenticated); 14 Dec 2020 12:36:33 -0000 Date: Mon, 14 Dec 2020 12:36:32 +0000 From: Mel Gorman To: Peter Zijlstra Cc: Vincent Guittot , "Li, Aubrey" , Ingo Molnar , Juri Lelli , Valentin Schneider , Qais Yousef , Dietmar Eggemann , Steven Rostedt , Ben Segall , Tim Chen , linux-kernel , Mel Gorman , Jiang Biao Subject: Re: [RFC PATCH v7] sched/fair: select idle cpu from idle cpumask for task wakeup Message-ID: <20201214123632.GZ3371@techsingularity.net> References: <3802e27a-56ed-9495-21b9-7c4277065155@linux.intel.com> <20201210113441.GS3371@techsingularity.net> <31308700-aa28-b1f7-398e-ee76772b6b87@linux.intel.com> <20201210125833.GT3371@techsingularity.net> <20201211174442.GU3040@hirez.programming.kicks-ass.net> <20201211204337.GX3371@techsingularity.net> <20201211221905.GV3040@hirez.programming.kicks-ass.net> <20201211225002.GY3371@techsingularity.net> <20201214093122.GX3040@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20201214093122.GX3040@hirez.programming.kicks-ass.net> User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Dec 14, 2020 at 10:31:22AM +0100, Peter Zijlstra wrote: > On Mon, Dec 14, 2020 at 09:11:29AM +0100, Vincent Guittot wrote: > > On Fri, 11 Dec 2020 at 23:50, Mel Gorman wrote: > > > > I originally did something like that on purpose too but Vincent called > > > it out so it is worth mentioning now to avoid surprises. That said, I'm > > > at the point where anything SIS-related simply needs excessive exposure > > > because no matter what it does, someone is unhappy with it. > > > > Yeah, I don't like extending the idle core search loop for something > > that is not related to the idle core search. This adds more work out > > of control of the sis_prop. So I'm still against hiding some idle cpu > > search in idle core search. > > The idea, of course, is to do less. The current code is pretty crap in > that it will do a whole bunch of things multiple times. > ^^^ this While there is some overhead when searching for an idle core to track an idle sibling, it's better than double scanning when test_idle_cores() returns a false positive (or it races with a parallel search that takes the last idle core). > Also, a possible follow up, would be something like the below (and > remove all the sds->has_idle_cores crud), which brings core scanning > under SIS_PROP. > I'm less confident about this this but admit I have no data. test_idle_core becomes critical for hackbench once it saturates the machine as it'll generally return a true positive. Where test_idle_cores causes problems is when domains are over 50% busy and returns false positives due to races and the work to find an idle core is wasted. The flip side is that updating the has_idle_core information is also expensive in this case as it causes lots of dirty cache line bouncing so maybe in practice it might be ok. It definitely should be a separate patch that is tested on top of your first prototype with the p->cpus_ptr check when picking an idle candidate. The other side-effect is that with this patch that the scan cost is *always* accounted for. While this makes intuitive sense as it was never clear to me why it was only accounted with scan failures. When I had tested something like this, it was a mix of wins and losses. At minimum, a patch that always accounts for scan cost and one the removes the test_idle_core should be separate patches for bisection purposes at the very least. This is the current set of results I have for your prototype plus the fixes I suggested on top http://www.skynet.ie/~mel/postings/peterz-20201214/dashboard.html It's not a universal win but appears to win more than it loses The biggest loss is dbench on EPYC 2 http://www.skynet.ie/~mel/postings/peterz-20201214/io-dbench4-async-xfs/romulus/index.html#dbench4 It's not at clear why it was so badly affected but in general, EPYC can be problematic as it has multiple small LLCs. The same machine for specjvm showed large gains. http://www.skynet.ie/~mel/postings/peterz-20201214/jvm-specjbb2005-multi/romulus/index.html#specjbb There are a lot of results to trawl through but mostly it shows that it's a mix of wins and losses and it's both workload and machine dependant which is generally true for anything select_idle_sibling related. As the merge window is open, it's inevitable that this will need to be evaluated against 5.11-rc1 when all the current batch of scheduler code has been merged. Do you mind splitting your prototype into three patches and slap some sort of changlog on them? I'll run them through the grid with p->recent_used_cpu changes on top to use recent_use_cpu as a search hint instead of an idle candidate so that it scans for a core. They'll take a while to run but it's automated and some people are going to be dropping off for holidays relatively soon anyway. I can test on arm too but as it does not have SMT enabled, it'll be less useful. -- Mel Gorman SUSE Labs