Subject: Re: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too, small
From: 王贇 (Michael Wang) <yun.wang@linux.alibaba.com>
To: Vincent Guittot, Ben Segall
Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Mel Gorman, "open list:SCHEDULER"
References: <44fa1cee-08db-e4ab-e5ab-08d6fbd421d7@linux.alibaba.com> <20200303195245.GF2596@hirez.programming.kicks-ass.net> <1180c6cd-ff61-2c9f-d689-ffe58f8c5a68@linux.alibaba.com>
Message-ID: <49a4dd4a-e7b6-5182-150d-16fff2d101cf@linux.alibaba.com>
Date: Tue, 10 Mar 2020 11:42:26 +0800
List-ID: <linux-kernel.vger.kernel.org>
On 2020/3/9 7:15 PM, Vincent Guittot wrote:
[snip]
>>>> -	load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
>>>> +	load = max(cfs_rq->load.weight, scale_load(cfs_rq->avg.load_avg));
>>>>
>>>> 	tg_weight = atomic_long_read(&tg->load_avg);
>>>
>>> I get the point, but IMHO fixing scale_load_down() sounds better, to
>>> cover all the similar cases; let's first try that way and see if it
>>> works :-)
>>
>> Yeah, that might not be a bad idea as well; it's just that doing this
>> fix would keep you from losing all your precision (and I'd have to
>> think about whether that would cause fairness issues, like all the
>> group ses getting the full tg shares, or something like that).
>
> AFAICT, we already have a fairness problem, because scale_load_down()
> is used in calc_delta_fair(), so all sched groups that have a weight
> lower than 1024 end up with the same increase of their vruntime when
> running.
> The load_avg is then used to balance between rqs, so load_balance
> will ensure at least 1 task per CPU but no more, because the load_avg
> that is then used will stay null.
>
> That being said, having a min of 2 in scale_load_down() will let us
> have tg->load_avg != 0, hence tg_weight != 0, and each sched group
> will not get the full shares. But it will make those groups
> completely fair anyway.
> The best solution would be not to scale down the weight, but that's a
> bigger change.

Does that mean changing all the 'load.weight' related calculations to
preserve the scaled weight? I suppose u64 is capable of holding the
scaled-up load in 'cfs_rq.load'; changing all those places could be
annoying, but still fine.
However, I'm not quite sure about the benefit: how much more precision
would we gain, and does it really matter? It would be better to have
some testing to demonstrate it.

Regards,
Michael Wang