Subject: Re: [PATCH 1/6] sched: migration changes for core scheduling
From: "Li, Aubrey"
To: Peter Zijlstra
Cc: "Joel Fernandes (Google)", Nishanth Aravamudan, Julien Desfossez,
    Tim Chen, Vineeth Pillai, Aaron Lu, Aubrey Li, tglx@linutronix.de,
    linux-kernel@vger.kernel.org, mingo@kernel.org,
    torvalds@linux-foundation.org, fweisbec@gmail.com,
    keescook@chromium.org, Phil Auld, Valentin Schneider, Mel Gorman,
    Pawan Gupta, Paolo Bonzini, vineeth@bitbyteword.org, Chen Yu,
    Christian Brauner, Agata Gruza, Antonio Gomez Iglesias,
    graf@amazon.com, konrad.wilk@oracle.com, dfaggioli@suse.com,
    rostedt@goodmis.org, benbjiang@tencent.com, Alexandre Chartre,
    James.Bottomley@hansenpartnership.com, OWeisse@umich.edu,
    Dhaval Giani, chris.hyser@oracle.com, Josh Don, Hao Luo,
    Tom Lendacky, Aubrey Li
Date: Mon, 22 Mar 2021 16:12:17 +0800

On 2021/3/22 15:48, Peter Zijlstra wrote:
> On Sun, Mar 21, 2021 at 09:34:00PM +0800, Li, Aubrey wrote:
>> Hi Peter,
>>
>> On 2021/3/20 23:34, Peter Zijlstra wrote:
>>> On Fri, Mar 19, 2021 at 04:32:48PM -0400, Joel Fernandes (Google) wrote:
>>>> @@ -7530,8 +7543,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>>>       * We do not migrate tasks that are:
>>>>       * 1) throttled_lb_pair, or
>>>>       * 2) cannot be migrated to this CPU due to cpus_ptr, or
>>>> -     * 3) running (obviously), or
>>>> -     * 4) are cache-hot on their current CPU.
>>>> +     * 3) task's cookie does not match with this CPU's core cookie
>>>> +     * 4) running (obviously), or
>>>> +     * 5) are cache-hot on their current CPU.
>>>>       */
>>>>      if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
>>>>          return 0;
>>>> @@ -7566,6 +7580,13 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>>>          return 0;
>>>>      }
>>>>
>>>> +    /*
>>>> +     * Don't migrate task if the task's cookie does not match
>>>> +     * with the destination CPU's core cookie.
>>>> +     */
>>>> +    if (!sched_core_cookie_match(cpu_rq(env->dst_cpu), p))
>>>> +        return 0;
>>>> +
>>>>      /* Record that we found atleast one task that could run on dst_cpu */
>>>>      env->flags &= ~LBF_ALL_PINNED;
>>>>
>>>
>>> This one is too strong.. persistent imbalance should be able to override
>>> it.
>>>
>>
>> IIRC, this change can avoid the following scenario:
>>
>> One sysbench cpu thread(cookieA) and sysbench mysql thread(cookieB) running
>> on the two siblings of core_1, the other sysbench cpu thread(cookieA) and
>> sysbench mysql thread(cookieB) running on the two siblings of core2, which
>> causes 50% force idle.
>>
>> This is not an imbalance case.
>
> But suppose there is an imbalance; then this cookie crud can forever
> stall balance.
>
> Imagine this cpu running a while(1); with a unique cookie on, then it
> will _never_ accept other tasks == BAD.
>

How about putting the following check in sched_core_cookie_match()?

+    /*
+     * Ignore cookie match if there is a big imbalance between the src rq
+     * and dst rq.
+     */
+    if ((src_rq->cfs.h_nr_running - rq->cfs.h_nr_running) > 1)
+        return true;

This change has a significant impact on my sysbench cpu+mysql colocation:

- with this change,
  sysbench cpu tput = 2796 events/s, sysbench mysql = 1315 events/s

- without it,
  sysbench cpu tput = 3513 events/s, sysbench mysql = 646 events/s

Do you have any suggestions before we drop it?

Thanks,
-Aubrey
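
To make the proposed override easier to reason about, here is a minimal
userspace sketch of the check, not kernel code. The names sketch_rq,
sketch_task and sketch_cookie_match() are hypothetical stand-ins, and the
sketch models only the cookie equality test plus the imbalance override;
any other condition in the real sched_core_cookie_match() is left out. It
computes the difference in signed arithmetic, on the assumption that the
run counters are unsigned, so a busier destination gives a negative value
instead of wrapping around.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU runqueue state (not the real struct rq). */
struct sketch_rq {
    unsigned long core_cookie;  /* cookie currently selected on this core */
    unsigned int h_nr_running;  /* runnable tasks queued on this runqueue */
};

/* Hypothetical stand-in for a task carrying a core-scheduling cookie. */
struct sketch_task {
    unsigned long core_cookie;
};

/*
 * Return true if the cookie check should allow migrating p from src to dst.
 * A mismatching cookie normally blocks the migration, but an imbalance of
 * more than one runnable task overrides the cookie test.
 */
static bool sketch_cookie_match(const struct sketch_rq *src,
                                const struct sketch_rq *dst,
                                const struct sketch_task *p)
{
    /* Signed difference: negative when dst is busier than src. */
    long diff = (long)src->h_nr_running - (long)dst->h_nr_running;

    if (diff > 1)
        return true;

    return dst->core_cookie == p->core_cookie;
}

/* Small self-check covering the "while(1) with a unique cookie" case. */
static void sketch_selftest(void)
{
    struct sketch_rq busy = { .core_cookie = 1, .h_nr_running = 4 };
    struct sketch_rq spinner = { .core_cookie = 99, .h_nr_running = 1 };
    struct sketch_task p = { .core_cookie = 1 };

    /* Imbalance of 3: the override lets the task move despite the mismatch. */
    assert(sketch_cookie_match(&busy, &spinner, &p));
    /* Reverse direction: dst is busier and the cookie matches. */
    assert(sketch_cookie_match(&spinner, &busy, &p));
}
```

With a threshold of 1, the balanced two-core scenario quoted above (one
task per sibling) still refuses a mismatched migration, while the spinner
case accepts tasks once the source queue is at least two tasks deeper.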