Date: Fri, 25 Feb 2022 18:46:26 +0800
Subject: Re: [RFC PATCH 0/5] introduce sched-idle balancing
To: Vincent Guittot
Cc: Peter Zijlstra, Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman, Steven Rostedt, linux-kernel@vger.kernel.org, Abel Wu
References: <20220217154403.6497-1-wuyun.abel@bytedance.com> <9fe00f72-4e2e-38ff-d64a-4ae41e683316@bytedance.com>
From: Abel Wu
X-Mailing-List: linux-kernel@vger.kernel.org

On 2/25/22 4:29 PM, Vincent Guittot Wrote:
> On Fri, 25 Feb 2022 at 07:46, Abel Wu wrote:
>>
>> Hi Peter,
>>
>> On 2/24/22 11:20 PM, Peter Zijlstra Wrote:
>>> On Thu, Feb 17, 2022 at 11:43:56PM +0800, Abel Wu wrote:
>>>> Current load balancing is mainly based on cpu capacity
>>>> and task util, which makes sense in the POV of overall
>>>> throughput. While there still might be some improvement
>>>> can be done by reducing number of overloaded cfs rqs if
>>>> sched-idle or idle rq exists.
>>>
>>> I'm much confused, there is an explicit new-idle balancer and a periodic
>>> idle balancer already there.
>>
>> The two balancers are triggered on the rqs that have no tasks on them,
>> and load_balance() seems don't show a preference for non-idle tasks so
>
> The load balance will happen at the idle pace if a sched_idle task is
> running on the cpu so you will have an ILB on each cpu that run a
> sched-idle task

I'm afraid I don't quite follow. The sched-idle balancer doesn't touch
the ILB part, so could you elaborate on this? Thanks.

>
>> there might be possibility that only idle tasks are pulled during load
>> balance while overloaded rqs (rq->cfs.h_nr_running > 1) exist. As a
>
> There is a LB_MIN feature (disable by default) that filters task with
> very low load ( < 16) which includes sched-idle task which has a max
> load of 3

This feature might not be that friendly to the situation where only
sched-idle tasks are running in the system. That situation can last
more than half a day in our co-location systems, in which the
training/batch tasks are placed under idle groups or directly assigned
SCHED_IDLE.

>
>> result the normal tasks, mostly latency-critical ones in our case, on
>> that overloaded rq still suffer waiting for each other.
>> I observed this through perf sched.
>>
>> IOW the main difference from the POV of load_balance() between the
>> latency-critical tasks and the idle ones is load.
>>
>> The sched-idle balancer is triggered on the sched-idle rqs periodically
>> and the newly-idle ones. It does a 'fast' pull of non-idle tasks from
>> the overloaded rqs to the sched-idle/idle ones to let the non-idle tasks
>> make full use of cpu resources.
>>
>> The sched-idle balancer only focuses on non-idle tasks' performance, so
>> it can introduce overall load imbalance, and that's why I put it before
>> load_balance().
>
> According to the very low weight of a sched-idle task, I don't expect
> much imbalance because of sched-idle tasks. But this also depends of
> the number of sched-idle task.
>
>
>>
>> Best Regards,
>> Abel
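
To make the 'fast' pull described in the quoted text concrete, here is a
minimal user-space sketch. It is not the code from the RFC series and not
kernel code: struct toy_rq and every toy_* function are hypothetical names
made up for illustration. The counter names only mirror the fields a cfs rq
keeps. One assumption to flag: a source rq qualifies here only when it has
more than one non-idle task, which is a stricter reading of the
"rq->cfs.h_nr_running > 1" condition mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a CFS runqueue; NOT the kernel's struct rq. */
struct toy_rq {
	unsigned int h_nr_running;      /* all runnable CFS tasks */
	unsigned int idle_h_nr_running; /* how many of them are SCHED_IDLE */
};

/* Number of runnable tasks that are not SCHED_IDLE. */
unsigned int toy_nr_non_idle(const struct toy_rq *rq)
{
	return rq->h_nr_running - rq->idle_h_nr_running;
}

/* Destination candidate: empty, or running only SCHED_IDLE tasks. */
bool toy_idle_or_sched_idle(const struct toy_rq *rq)
{
	return toy_nr_non_idle(rq) == 0;
}

/* Source candidate (assumption): more than one non-idle task queued. */
bool toy_overloaded(const struct toy_rq *rq)
{
	return toy_nr_non_idle(rq) > 1;
}

/*
 * Move one non-idle task from the first overloaded rq to rqs[dst].
 * Returns the index of the source rq, or -1 if nothing was pulled.
 */
int toy_sched_idle_balance(struct toy_rq *rqs, size_t nr, size_t dst)
{
	assert(dst < nr);

	/* Only idle or sched-idle rqs pull in this sketch. */
	if (!toy_idle_or_sched_idle(&rqs[dst]))
		return -1;

	for (size_t i = 0; i < nr; i++) {
		if (i == dst || !toy_overloaded(&rqs[i]))
			continue;
		/* The migrated task is non-idle, so only h_nr_running moves. */
		rqs[i].h_nr_running--;
		rqs[dst].h_nr_running++;
		return (int)i;
	}
	return -1;
}
```

The sketch ignores load entirely, which matches the stated intent that the
sched-idle balancer cares only about spreading non-idle tasks and leaves the
overall load picture to load_balance(). Affinity, locking, and the cost of
scanning rqs are all left out.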