From: Tobias Huschle
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
    rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
    bristot@redhat.com, vschneid@redhat.com,
    sshegde@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com,
    linuxppc-dev@lists.ozlabs.org
Subject: [RFC 0/1] sched/fair: Consider asymmetric scheduler groups in load balancer
Date: Mon, 15 May 2023 13:46:00 +0200
Message-Id: <20230515114601.12737-1-huschle@linux.ibm.com>
X-Mailer: git-send-email 2.34.1

The current load balancer implementation assumes that all scheduler
groups within the same scheduler domain host the same number of CPUs.

This appears to be valid for non-s390 architectures. Nevertheless, s390
can actually have scheduler groups of unequal size.
The current scheduler behavior causes some s390 configs to use SMT
while some cores are still idle, leading to a performance degradation
under certain levels of workload.

Please refer to the patch's commit message for more details and an
example. This patch is a proposal on how to integrate the size of
scheduler groups into the decision process.
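
The effect of unequal group sizes can be illustrated with a small
stand-alone sketch. This is not the code from the patch: struct
toy_group and both helper functions are hypothetical names made up for
this illustration, and the ratio-based comparison is only one assumed
way of taking the group size into account.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a scheduler group; not a kernel type. */
struct toy_group {
	unsigned int weight;    /* number of CPUs in the group */
	unsigned int idle_cpus; /* number of currently idle CPUs */
};

/* Size-blind comparison: a counts as busier than b if it has fewer idle CPUs. */
static bool busier_by_count(const struct toy_group *a,
			    const struct toy_group *b)
{
	return a->idle_cpus < b->idle_cpus;
}

/*
 * Size-aware comparison (assumption for illustration): a counts as busier
 * than b if its idle fraction is lower, i.e.
 * idle_a / weight_a < idle_b / weight_b, cross-multiplied to avoid division.
 */
static bool busier_by_ratio(const struct toy_group *a,
			    const struct toy_group *b)
{
	assert(a->weight > 0 && b->weight > 0);
	return (unsigned long long)a->idle_cpus * b->weight <
	       (unsigned long long)b->idle_cpus * a->weight;
}
```

For a group of 8 CPUs with 3 idle and a group of 4 CPUs with 2 idle,
busier_by_count() picks the smaller group as the busier one, although
only half of its CPUs are in use, while busier_by_ratio() picks the
larger group, where 5 of 8 CPUs are in use.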

This patch is the most basic approach to address this issue and does
not claim to be perfect as-is.

Other ideas that also proved to address the problem, and that are more
complex but potentially more precise:
  1. On scheduler group building, count the number of CPUs within each
     group that are first in their sibling mask. This represents the
     number of CPUs that can be used before running into SMT. This
     should be slightly more accurate than using the full group weight
     if the number of available SMT threads per core varies.
  2. Introduce a new scheduler group classification (smt_busy) between
     fully_busy and has_spare. This classification would indicate that
     a group still has spare capacity, but will run into SMT when using
     that capacity. This would make the load balancer prefer groups
     with fully idle CPUs over ones that are about to run into SMT.

Feedback would be greatly appreciated.

Tobias Huschle (1):
  sched/fair: Consider asymmetric scheduler groups in load balancer

 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.34.1