To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, "open list:SCHEDULER"
From: =?UTF-8?B?546L6LSH?=
Subject: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too small
Message-ID: <44fa1cee-08db-e4ab-e5ab-08d6fbd421d7@linux.alibaba.com>
Date: Tue, 3 Mar 2020 22:17:03 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

During our testing, we found a case where shares no longer work
correctly. The cgroup topology is:

  /sys/fs/cgroup/cpu/A          (shares=102400)
  /sys/fs/cgroup/cpu/A/B        (shares=2)
  /sys/fs/cgroup/cpu/A/B/C      (shares=1024)

  /sys/fs/cgroup/cpu/D          (shares=1024)
  /sys/fs/cgroup/cpu/D/E        (shares=1024)
  /sys/fs/cgroup/cpu/D/E/F      (shares=1024)

The same benchmark runs in groups C and F, no other tasks are running,
and the benchmark is capable of consuming all the CPUs.

We expected group C to win more CPU resources, since it can enjoy all
the shares of group A, but it is F that wins much more.

The reason is that group B has shares set to 2, which makes the
'cfs_rq->load.weight' of group A very small.

In calc_group_shares() we calculate shares as:

  load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
  shares = (tg_shares * load) / tg_weight;

Since 'cfs_rq->load.weight' is too small, the load becomes 0 here, and
although 'tg_shares' is 102400, the shares of the se which stands for
group A on the root cfs_rq become 2.

Meanwhile, the se of D on the root cfs_rq is far bigger than 2, so it
wins the battle.

This patch adds a check for the zero load and sets it to MIN_SHARES to
fix the nonsense shares. After applying it, group C wins as expected.

Signed-off-by: Michael Wang
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 84594f8aeaf8..53d705f75fa4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3182,6 +3182,8 @@ static long calc_group_shares(struct cfs_rq *cfs_rq)
 	tg_shares = READ_ONCE(tg->shares);
 
 	load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
+	if (!load && cfs_rq->load.weight)
+		load = MIN_SHARES;
 
 	tg_weight = atomic_long_read(&tg->load_avg);
 
-- 
2.14.4.44.g2045bb6
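
For readers who want to check the arithmetic, here is a minimal user-space
sketch of the calculation described in the message. It is not the kernel
code. It assumes a fixed-point shift of 10 and MIN_SHARES of 2 (values
believed to match 64-bit kernels of this era, but check your own tree), it
assumes the final result is clamped to [MIN_SHARES, tg_shares], and it
replaces the real tg_weight bookkeeping with a single hypothetical
'other_load' parameter. All names ending in _model, and the 'patched' flag,
are made up for illustration.

```c
/* Assumed constants; verify against the kernel tree you are reading. */
#define FIXEDPOINT_SHIFT_MODEL 10
#define MIN_SHARES_MODEL 2L

/* Model of scale_load_down(): drop the extra fixed-point resolution. */
static long scale_load_down_model(long weight)
{
	return weight >> FIXEDPOINT_SHIFT_MODEL;
}

static long max_long(long a, long b)
{
	return a > b ? a : b;
}

static long clamp_long(long v, long lo, long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Simplified model of calc_group_shares().
 *
 * tg_shares:  the group's shares, already in scaled units
 * weight:     cfs_rq->load.weight of the group's cfs_rq on this CPU
 * load_avg:   cfs_rq->avg.load_avg of that cfs_rq
 * other_load: hypothetical summed load of the group on the other CPUs
 * patched:    non-zero applies the zero-load check added by the patch
 */
static long calc_group_shares_model(long tg_shares, long weight,
				    long load_avg, long other_load,
				    int patched)
{
	long load = max_long(scale_load_down_model(weight), load_avg);
	long tg_weight, shares;

	/* The check from the patch: a non-zero weight must not yield load 0. */
	if (patched && !load && weight)
		load = MIN_SHARES_MODEL;

	/* Simplification: the group-wide weight is other CPUs plus this one. */
	tg_weight = other_load + load;

	shares = tg_shares * load;
	if (tg_weight)
		shares /= tg_weight;

	/* Assumed clamp: this is where a zero result turns into 2. */
	return clamp_long(shares, MIN_SHARES_MODEL, tg_shares);
}
```

With tg_shares = 102400 << 10 and a hypothetical per-CPU weight of 256 (any
non-zero value below 1024 behaves the same), the unpatched model returns 2,
because the weight scales down to 0 and the zero product is clamped up. With
the check enabled and other_load = 0, the same inputs return the full
tg_shares.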