Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp3087176pxk; Mon, 7 Sep 2020 02:49:28 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxp1MBo16BeFb38cVMTi8ABWe8AghYDKwstImDad+q5NmpyuFO82BqWHiuS1eSlGtgrM6lG X-Received: by 2002:a17:906:a415:: with SMTP id l21mr19031010ejz.431.1599472167986; Mon, 07 Sep 2020 02:49:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599472167; cv=none; d=google.com; s=arc-20160816; b=Qim6dfNA2uphM1Ni0a5I8kpn+fLEnQ6vAX36sHNPzhDjtycQSSn5yvTSd/GDoVjiI5 TrxuJPcqC/KR9DXvzQSim2bNl8nNT4Hyvzly+UrovmbVn5vRbhejnCIcV0CCkbdWjBSi 3Qtwea+bYJF1PO+miV2XM1yAteOcK2zjkuU86p5lPFpkQuch/YF1QWWHfX0Oh7+3RcVp ZymUmfhA0YY2VOA0arVmK6bKMw6D98CGpT8fBfkYkCHi3IO03E7BrBsjJp8jKLhPzGft /XcRcmZaPBu2dhRyjjC0WI72eNTXFOr3Jiekmwc74bQzxbXD5vP/nHBI3HYZu9M3pu32 KLMg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:content-transfer-encoding :content-language:accept-language:in-reply-to:references:message-id :date:thread-index:thread-topic:subject:cc:to:from; bh=w/Jweqmx5McA3xxQHMdbjiRQgXmrkFmdNNZyD+h9wXk=; b=TxKyFbnnVhXKA9GvDCoYXprFI3c6Dk1Km7N5/WYjSzJ+7soTaYovaz8lUT3vzdcSin NL8W5geSG3Yk7rnZokLTxEfgj9itJB6QBhk2vlt4FEA2O8n7gtiWsbDF1+kE/JcL4gy5 RQLU3FP73yCxA76WCbuujtBcCe5/s3nvbD70Z7Gug4KUQre6zojjMXnxFg+4mZKwOa/Y 18ocy6xjpo4LDC5SZoiasxcf4RTSA4GSGGBJoUw0QfW2bUIGuyze6HCnDxDF7c1EcWLk U3KkOl9aNmM67l7UtIdLgT4mpMwyTsN/+iHmw2Hp2tsFQdOnPfnNgYvdViiu0eaWJnXT yZ6w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h4si9259754edv.181.2020.09.07.02.49.05; Mon, 07 Sep 2020 02:49:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728317AbgIGJqL convert rfc822-to-8bit (ORCPT + 99 others); Mon, 7 Sep 2020 05:46:11 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:52642 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726301AbgIGJqL (ORCPT ); Mon, 7 Sep 2020 05:46:11 -0400 Received: from DGGEMM403-HUB.china.huawei.com (unknown [172.30.72.57]) by Forcepoint Email with ESMTP id C932CA238EC1CD0FD2CB; Mon, 7 Sep 2020 17:46:08 +0800 (CST) Received: from dggemi762-chm.china.huawei.com (10.1.198.148) by DGGEMM403-HUB.china.huawei.com (10.3.20.211) with Microsoft SMTP Server (TLS) id 14.3.487.0; Mon, 7 Sep 2020 17:44:04 +0800 Received: from dggemi761-chm.china.huawei.com (10.1.198.147) by dggemi762-chm.china.huawei.com (10.1.198.148) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1913.5; Mon, 7 Sep 2020 17:44:03 +0800 Received: from dggemi761-chm.china.huawei.com ([10.9.49.202]) by dggemi761-chm.china.huawei.com ([10.9.49.202]) with mapi id 15.01.1913.007; Mon, 7 Sep 2020 17:44:04 +0800 From: "Song Bao Hua (Barry Song)" To: Mel Gorman CC: "mingo@redhat.com" , "peterz@infradead.org" , "juri.lelli@redhat.com" , "vincent.guittot@linaro.org" , "dietmar.eggemann@arm.com" , "bsegall@google.com" , "linux-kernel@vger.kernel.org" , Linuxarm , Mel Gorman , Peter Zijlstra , Valentin Schneider , Phil Auld , Hillf Danton , Ingo Molnar Subject: RE: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer Thread-Topic: [PATCH] sched/fair: use dst group while checking imbalance for NUMA balancer Thread-Index: AQHWhOi4NR3jMQOEjUuGVuHcF7fpAalcYjqAgACG3OA= Date: Mon, 7 Sep 2020 09:44:03 +0000 Message-ID: <9c8e148805bc4da1a2bfdd72f1c76f06@hisilicon.com> References: <20200907072708.8664-1-song.bao.hua@hisilicon.com> <20200907092717.GD3117@suse.de> In-Reply-To: <20200907092717.GD3117@suse.de> Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.126.201.74] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > -----Original Message----- > From: Mel Gorman [mailto:mgorman@suse.de] > Sent: Monday, September 7, 2020 9:27 PM > To: Song Bao Hua (Barry Song) > Cc: mingo@redhat.com; peterz@infradead.org; juri.lelli@redhat.com; > vincent.guittot@linaro.org; dietmar.eggemann@arm.com; > bsegall@google.com; linux-kernel@vger.kernel.org; Linuxarm > ; Mel Gorman ; > Peter Zijlstra ; Valentin Schneider > ; Phil Auld ; Hillf Danton > ; Ingo Molnar > Subject: Re: [PATCH] sched/fair: use dst group while checking imbalance for > NUMA balancer > > On Mon, Sep 07, 2020 at 07:27:08PM +1200, Barry Song wrote: > > Something is wrong. In find_busiest_group(), we are checking if src has > > higher load, however, in task_numa_find_cpu(), we are checking if dst > > will have higher load after balancing. It seems it is not sensible to > > check src. > > It maybe cause wrong imbalance value, for example, if > > dst_running = env->dst_stats.nr_running + 1 results in 3 or above, and > > src_running = env->src_stats.nr_running - 1 results in 1; > > The current code is thinking imbalance as 0 since src_running is smaller > > than 2. > > This is inconsistent with load balancer. > > > > It checks the conditions if the move was to happen. Have you evaluated > this for a NUMA balancing load and confirmed it a) balances properly and > b) does not increase the scan rate trying to "fix" the problem? I think the original code was trying to check if the numa migration would lead to new imbalance in load balancer. In case src is A, dst is B, and both of them have nr_running as 2. A moves one task to B, then A will have 1, B will have 3. In load balancer, A will try to pull task from B since B's nr_running is larger than min_imbalance. But the code is saying imbalance=0 by finding A's nr_running is smaller than min_imbalance. Will share more test data if you need. > > -- > Mel Gorman > SUSE Labs Thanks Barry