Subject: Re: [PATCH] sched/pelt: avoid underestimate of task utilization
From: Lukasz Luba
Date: Thu, 23 Nov 2023 14:27:52 +0000
To: Qais Yousef
Cc: mingo@redhat.com, peterz@infradead.org, Vincent Guittot,
    juri.lelli@redhat.com, dietmar.eggemann@arm.com, rostedt@goodmis.org,
    bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
    vschneid@redhat.com, rafael@kernel.org, viresh.kumar@linaro.org,
    linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Message-ID: <77ec94ee-798d-4c5e-a673-616d25acca4a@arm.com>
In-Reply-To: <20231121230150.eq2kc72bvyn6ltur@airbuntu>
References: <20231122140119.472110-1-vincent.guittot@linaro.org>
            <20231121230150.eq2kc72bvyn6ltur@airbuntu>

On 11/21/23 23:01, Qais Yousef wrote:
> On 11/22/23 15:01, Vincent Guittot wrote:
>> It has been reported that a thread's util_est can significantly decrease as
>> a result of sharing the CPU with other threads. The use case can be easily
>> reproduced with a periodic task TA that runs for 1ms and sleeps for 100us.
>> When the task is alone on the CPU, its max utilization and its util_est are
>> around 888.
>> If another similar task starts to run on the same CPU, TA will
>> have to share the CPU runtime and its maximum utilization will decrease to
>> around half the CPU capacity (512); TA's util_est will then follow this new
>> maximum trend, which is only the result of sharing the CPU with other
>> tasks. Such a situation can be detected with runnable_avg, which is close or
>> equal to util_avg when TA is alone but increases above util_avg when TA
>> shares the CPU with other threads and waits on the runqueue.
>>
>> Signed-off-by: Vincent Guittot
>> ---
>
> So effectively if we have two always-running tasks on the same CPU their
> util_est will converge to 1024 instead of 512 now, right?
>
> I guess this is more accurate, yes. And I think this will hit the same
> limitation we can hit with uclamp_max. If for example there are two tasks that
> are 650 if they run alone, they would appear as 1024 now (instead of 512) if
> they share the CPU, because combined running there will be no idle time on the
> CPU and they appear like always-running tasks, I think.

Well, they might not converge to 1024. It will just prevent them from dropping
the highest util_est seen on them before this contention happened.

>
> If I got it right, I think this should not be a problem in practice because the
> only reason these two tasks will be stuck on the same CPU is because the load
> balancer can't do anything about it to spread them; which indicates the system
> must be busy anyway. Once there's more idle time elsewhere, they should be
> spread and converge to 650 again.

This does apply to a real app. That Chrome thread that I reported (which is
~950 util) drops its util and util_est in some scenarios when there are some
other tasks in the runqueue, because of some short sleeps. This situation then
attracts other tasks to migrate, but the next time the big thread wakes up it
has to share the CPU and loses its util_est (which was the last information
that there was such big util on it).

Those update moments, when we go via the util_est_update() code path, are quite
frequent over a short time and don't fit into the PELT world, IMO. It's like an
asynchronous force-update to the util_est signal, not consistent with how
slowly util is built. I think Peter had a similar impression when he asked me
how often and how fast this update could be triggered such that we lose the
value...

I would even dare to call this patch a fix and a candidate for the stable tree.
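
For reference, a rough sketch of the kind of check the cover text implies in
util_est_update() (not the actual diff; task_runnable() is a hypothetical
helper here, and UTIL_EST_MARGIN stands for whatever margin fair.c already
uses):

/* hypothetical helper: expose the per-task runnable PELT signal */
static inline unsigned long task_runnable(struct task_struct *p)
{
	return READ_ONCE(p->se.avg.runnable_avg);
}

/*
 * In util_est_update(), with 'dequeued' being the task utilization observed
 * at dequeue: if the task was also waiting for the CPU, runnable_avg sits
 * above that utilization, so the low value is a contention artifact and the
 * previous util_est is kept instead of being decayed towards it.
 */
if (dequeued + UTIL_EST_MARGIN < task_runnable(p))
	goto done;

With a check like that, the two always-running tasks above would simply keep
the highest util_est they reached before the contention, rather than being
pulled down towards 512.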