From: Shrikanth Hegde
To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org
Cc: sshegde@linux.ibm.com, yu.c.chen@intel.com, dietmar.eggemann@arm.com,
    linux-kernel@vger.kernel.org, nysal@linux.ibm.com, aboorvad@linux.ibm.com,
    srikar@linux.ibm.com, vschneid@redhat.com, pierre.gondois@arm.com,
    morten.rasmussen@arm.com, qyousef@layalina.io
Subject: [PATCH v4 1/2] sched/fair: Add EAS checks before updating overutilized
Date: Fri, 1 Mar 2024 20:47:24 +0530
Message-Id: <20240301151725.874604-2-sshegde@linux.ibm.com>
In-Reply-To: <20240301151725.874604-1-sshegde@linux.ibm.com>
References: <20240301151725.874604-1-sshegde@linux.ibm.com>

The overutilized field of the root domain is used only by EAS (the energy
aware scheduler) to decide whether to load balance. It is not used if EAS
is not possible.

Currently enqueue_task_fair and task_tick_fair access, and sometimes update,
this field. In update_sd_lb_stats it is updated often. This causes cache
contention due to true sharing and burns a lot of cycles. overload and
overutilized are part of the same cacheline. Updating overutilized often
invalidates the cacheline.
That slows down accesses to overload due to false sharing. Hence add an EAS
check before accessing/updating this field. The EAS check is either resolved
at compile time or is a static branch, so it shouldn't cost much.

With the patch, neither enqueue_task_fair nor newidle_balance shows up as a
hot routine in the perf profile.

6.8-rc4:
7.18%  swapper  [kernel.vmlinux]  [k] enqueue_task_fair
6.78%  s        [kernel.vmlinux]  [k] newidle_balance
+patch:
0.14%  swapper  [kernel.vmlinux]  [k] enqueue_task_fair
0.00%  swapper  [kernel.vmlinux]  [k] newidle_balance

Minor change: trace_sched_overutilized_tp expects its second argument to be
a bool, so do an int to bool conversion for it.

Fixes: 2802bf3cd936 ("sched/fair: Add over-utilization/tipping point indicator")
Reviewed-by: Srikar Dronamraju
Signed-off-by: Shrikanth Hegde
---
 kernel/sched/fair.c | 49 +++++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6a16129f9a5c..a71f8a1506e4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6670,15 +6670,29 @@ static inline bool cpu_overutilized(int cpu)
 	return !util_fits_cpu(cpu_util_cfs(cpu), rq_util_min, rq_util_max, cpu);
 }
 
-static inline void update_overutilized_status(struct rq *rq)
+static inline void set_rd_overutilized_status(struct root_domain *rd,
+					      unsigned int status)
 {
-	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
-		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
-		trace_sched_overutilized_tp(rq->rd, SG_OVERUTILIZED);
-	}
+	WRITE_ONCE(rd->overutilized, status);
+	trace_sched_overutilized_tp(rd, !!status);
+}
+
+static inline void check_update_overutilized_status(struct rq *rq)
+{
+	/*
+	 * overutilized field is used for load balancing decisions only
+	 * if energy aware scheduler is being used
+	 */
+	if (!sched_energy_enabled())
+		return;
+
+	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
+		set_rd_overutilized_status(rq->rd, SG_OVERUTILIZED);
 }
 #else
-static inline void update_overutilized_status(struct rq *rq) { }
+static inline void check_update_overutilized_status(struct rq *rq) { }
+static inline void set_rd_overutilized_status(struct root_domain *rd,
+					      unsigned int status) { }
 #endif
 
 /* Runqueue only has SCHED_IDLE tasks enqueued */
@@ -6779,7 +6793,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	 * and the following generally works well enough in practice.
 	 */
 	if (!task_new)
-		update_overutilized_status(rq);
+		check_update_overutilized_status(rq);
 
 enqueue_throttle:
 	assert_list_leaf_cfs_rq(rq);
@@ -9902,7 +9916,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		if (nr_running > 1)
 			*sg_status |= SG_OVERLOAD;
 
-		if (cpu_overutilized(i))
+		if (sched_energy_enabled() && cpu_overutilized(i))
 			*sg_status |= SG_OVERUTILIZED;
 
 #ifdef CONFIG_NUMA_BALANCING
@@ -10596,19 +10610,16 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	env->fbq_type = fbq_classify_group(&sds->busiest_stat);
 
 	if (!env->sd->parent) {
-		struct root_domain *rd = env->dst_rq->rd;
-
 		/* update overload indicator if we are at root domain */
-		WRITE_ONCE(rd->overload, sg_status & SG_OVERLOAD);
+		WRITE_ONCE(env->dst_rq->rd->overload, sg_status & SG_OVERLOAD);
 
 		/* Update over-utilization (tipping point, U >= 0) indicator */
-		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
-		trace_sched_overutilized_tp(rd, sg_status & SG_OVERUTILIZED);
-	} else if (sg_status & SG_OVERUTILIZED) {
-		struct root_domain *rd = env->dst_rq->rd;
-
-		WRITE_ONCE(rd->overutilized, SG_OVERUTILIZED);
-		trace_sched_overutilized_tp(rd, SG_OVERUTILIZED);
+		if (sched_energy_enabled()) {
+			set_rd_overutilized_status(env->dst_rq->rd,
+						   sg_status & SG_OVERUTILIZED);
+		}
+	} else if (sched_energy_enabled() && (sg_status & SG_OVERUTILIZED)) {
+		set_rd_overutilized_status(env->dst_rq->rd, SG_OVERUTILIZED);
 	}
 
 	update_idle_cpu_scan(env, sum_util);
@@ -12609,7 +12620,7 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
 		task_tick_numa(rq, curr);
 
 	update_misfit_status(curr, rq);
-	update_overutilized_status(task_rq(curr));
+	check_update_overutilized_status(task_rq(curr));
 
 	task_tick_core(rq, curr);
 }
-- 
2.39.3
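
Illustration only, not part of the patch: the gating pattern used in
check_update_overutilized_status() can be sketched in plain userspace C.
Everything named here (struct rd_sketch, eas_enabled, set_overutilized,
check_update_overutilized, sketch_self_check) is a hypothetical stand-in
invented for this sketch. In particular, eas_enabled is an ordinary flag
standing in for sched_energy_enabled(), not a static branch, and the write
counter exists only so the effect of the gate is observable.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's root_domain. Only the two flags
 * discussed in the changelog are modelled; the real layout differs.
 */
struct rd_sketch {
	int overload;				/* shares a cacheline with overutilized */
	int overutilized;			/* only meaningful when EAS is in use */
	unsigned long overutilized_writes;	/* demo bookkeeping, not in the kernel */
};

/* Assumption: a plain flag replaces sched_energy_enabled() for this sketch. */
static bool eas_enabled;

/* Single place that writes the flag, in the spirit of set_rd_overutilized_status(). */
static void set_overutilized(struct rd_sketch *rd, unsigned int status)
{
	rd->overutilized = (int)status;
	rd->overutilized_writes++;
}

/* Gate first, then read, and write only on a 0 -> 1 transition. */
static void check_update_overutilized(struct rd_sketch *rd, bool cpu_overutilized)
{
	if (!eas_enabled)
		return;

	if (!rd->overutilized && cpu_overutilized)
		set_overutilized(rd, 1);
}

/* Self-check: returns how many writes happened across three calls. */
static unsigned long sketch_self_check(void)
{
	struct rd_sketch rd = { 0 };

	eas_enabled = false;
	check_update_overutilized(&rd, true);	/* gated off: field untouched */
	assert(rd.overutilized == 0 && rd.overutilized_writes == 0);

	eas_enabled = true;
	check_update_overutilized(&rd, true);	/* first update writes the flag */
	check_update_overutilized(&rd, true);	/* already set: no second write */
	assert(rd.overutilized == 1);

	return rd.overutilized_writes;
}
```

Calling sketch_self_check() from a small main() exercises both paths: with
the gate off the shared line is never read or written, and with it on the
flag is written once rather than on every call.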