Subject: Re: [RFC PATCH v4 00/19] Core scheduling v4
From: "Li, Aubrey"
Date: Tue, 3 Mar 2020 22:59:25 +0800
To: Tim Chen, Vineeth Remanan Pillai, Aubrey Li
Cc: Aaron Lu, Julien Desfossez, Nishanth Aravamudan, Peter Zijlstra,
 Ingo Molnar, Thomas Gleixner, Paul Turner, Linus Torvalds,
 Linux List Kernel Mailing, Dario Faggioli, Frédéric Weisbecker,
 Kees Cook, Greg Kerr, Phil Auld, Valentin Schneider, Mel Gorman,
 Pawan Gupta, Paolo Bonzini
References: <5e3cea14-28d1-bf1e-cabe-fb5b48fdeadc@linux.intel.com>
 <3c3c56c1-b8dc-652c-535e-74f6dcf45560@linux.intel.com>
 <20200212230705.GA25315@sinkpad>
 <29d43466-1e18-6b42-d4d0-20ccde20ff07@linux.intel.com>
 <20200225034438.GA617271@ziqianlu-desktop.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/2/29 7:55, Tim Chen wrote:
> On 2/26/20 1:54 PM, Vineeth Remanan Pillai wrote:
>
>> rq->curr being NULL can mean that the sibling is idle or forced idle.
>> In both cases, I think it makes sense to migrate a task so that it can
>> compete with the other sibling for a chance to run. This function,
>> can_migrate_task, actually only says whether this task is eligible, and
>> a later part of the code decides whether it is okay to migrate it
>> based on factors like load, util and capacity. So I think it's
>> fine to declare the task as eligible if the dest core is running
>> idle. Does this thinking make sense?
>>
>> In our testing, it did not show much degradation in performance with
>> this change. I am reworking the fix by removing the check for
>> task_est_util. It doesn't seem to be valid to check for util to migrate
>> the task.
>>
>
> In Aaron's test case, there is a great imbalance in the load on one core
> where all the grp A tasks are vs the other cores where the grp B tasks are
> spread around. Normally, the load balancer will move the tasks for grp A.
>
> Aubrey's can_migrate_task patch prevented the load balancer from migrating
> tasks if the core cookie on the target queue doesn't match. The thought was
> that migrating a task to it would induce force idle and reduce cpu
> utilization. That kept all the grp A tasks from getting migrated and kept
> the imbalance indefinitely in Aaron's test case.
>
> Perhaps we should also look at the load imbalance between the src rq and
> target rq. If the imbalance is big (say two full cpu bound tasks worth
> of load), we should migrate anyway despite the cookie mismatch. We are
> willing to pay a bit for the force idle by balancing the load out more.
> I think Aubrey's patch on can_migrate_task should be more friendly to
> Aaron's test scenario if such logic is incorporated.
>
> In Vineeth's fix, we only look at the currently running task's weight in
> src and dst rq. Perhaps the load on the src and dst rq needs to be
> considered to prevent too great an imbalance between the run queues?

We are trying to migrate a task; can we just use cfs.h_nr_running? This
signal is used to find the busiest run queue as well.

Thanks,
-Aubrey
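
For illustration only, the eligibility rule discussed in this thread could be sketched roughly as below. This is a userspace sketch, not the patch: struct sketch_rq, sketch_can_migrate() and SKETCH_IMBALANCE_THRESHOLD are hypothetical names, the struct is a simplified stand-in for the kernel's run queue, and the threshold of two tasks is an assumed translation of "two full cpu bound tasks worth of load" into an h_nr_running difference.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for a run queue. The real struct rq
 * has many more fields; only the two signals discussed here are modelled.
 */
struct sketch_rq {
	unsigned long core_cookie;  /* cookie the core is currently running */
	unsigned int h_nr_running;  /* runnable CFS tasks (cfs.h_nr_running) */
};

/* Assumed threshold: an imbalance of two runnable tasks. */
#define SKETCH_IMBALANCE_THRESHOLD 2u

/*
 * Decide whether a task with task_cookie is eligible to move from src
 * to dst. As in the thread, this only reports eligibility; a real load
 * balancer would still weigh load, util and capacity afterwards.
 */
static bool sketch_can_migrate(const struct sketch_rq *src,
			       const struct sketch_rq *dst,
			       unsigned long task_cookie)
{
	assert(src && dst);

	/* Matching cookie: migrating does not induce force idle. */
	if (dst->core_cookie == task_cookie)
		return true;

	/* Destination has nothing runnable: let the task compete there. */
	if (dst->h_nr_running == 0)
		return true;

	/* Cookie mismatch: accept the force-idle cost only if src is
	 * much busier than dst. The first comparison guards the
	 * unsigned subtraction against wrap-around. */
	if (src->h_nr_running > dst->h_nr_running &&
	    src->h_nr_running - dst->h_nr_running >= SKETCH_IMBALANCE_THRESHOLD)
		return true;

	return false;
}
```

With these assumptions, a source queue with four runnable tasks and a mismatched destination with one runnable task would be declared eligible, while a destination with three runnable tasks would not. Whether a task count or a weighted load is the better signal is exactly the open question above.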