Subject: Re: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too, small
From: 王贇 (Michael Wang) <yun.wang@linux.alibaba.com>
To: Vincent Guittot, Ben Segall
Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Mel Gorman, "open list:SCHEDULER"
References: <44fa1cee-08db-e4ab-e5ab-08d6fbd421d7@linux.alibaba.com> <20200303195245.GF2596@hirez.programming.kicks-ass.net> <1180c6cd-ff61-2c9f-d689-ffe58f8c5a68@linux.alibaba.com>
Message-ID: <49a4dd4a-e7b6-5182-150d-16fff2d101cf@linux.alibaba.com>
Date: Tue, 10 Mar 2020 11:42:26 +0800
List-ID: <linux-kernel.vger.kernel.org>
On 2020/3/9 7:15 PM, Vincent Guittot wrote:
[snip]
>>>> -	load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
>>>> +	load = max(cfs_rq->load.weight, scale_load(cfs_rq->avg.load_avg));
>>>>
>>>> 	tg_weight = atomic_long_read(&tg->load_avg);
>>>
>>> I get the point, but IMHO fixing scale_load_down() sounds better, to
>>> cover all the similar cases; let's first try that way and see if it
>>> works :-)
>>
>> Yeah, that might not be a bad idea as well; it's just that doing this
>> fix would keep you from losing all your precision (and I'd have to
>> think about whether that would cause fairness issues, like all the
>> group ses getting the full tg shares, or something like that).
>
> AFAICT, we already have a fairness problem, because scale_load_down()
> is used in calc_delta_fair(), so all sched groups that have a weight
> lower than 1024 end up with the same increase of their vruntime when
> running.
> The load_avg is then used to balance between rqs, so load_balance
> will ensure at least 1 task per CPU but no more, because the load_avg
> that is then used will stay null.
>
> That being said, having a min of 2 in scale_load_down() will let us
> have tg->load_avg != 0, hence tg_weight != 0, and each sched group
> will not get the full shares. But it will make those groups
> completely fair anyway.
> The best solution would be not to scale down the weight, but that's a
> bigger change.

Does that mean changing all the 'load.weight' related calculations to
preserve the scaled weight? I suppose u64 is capable of holding the
scaled-up load in 'cfs_rq.load'; changing all those places could be
annoying, but still fine.
However, I'm not quite sure about the benefit: how much more precision
would we gain, and does it really matter? It would be better to have
some testing to demonstrate it.

Regards,
Michael Wang