From: Parth Shah
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, mingo@redhat.com, vincent.guittot@linaro.org, subhra.mazumdar@oracle.com
Subject: [RFC 0/2] Optimize the idle CPU search
Date: Mon, 8 Jul 2019 10:24:30 +0530
Message-Id: <20190708045432.18774-1-parth@linux.ibm.com>

When searching for an idle sibling, the scheduler first searches for
an idle core and then for an idle CPU. By maintaining the idle CPU mask
while iterating through idle cores, we can mark the non-idle CPUs so that
the idle CPU search does not have to iterate over them again. This is
especially true in a moderately loaded system.

Optimize the idle CPU search by marking the non-idle CPUs already found
during the idle core search. This reduces the iteration count when
searching for idle CPUs.

The results show that the time spent in `select_idle_cpu` decreases, with
no regression in the search time of `select_idle_core` and almost no
regression on schbench. With proper tuning, schbench also shows a benefit
when the idle core search fails most of the time.

To do this, rename the locally used cpumask 'select_idle_mask' to
something else so that the existing name can be reused for this
optimization.

Patch set based on tip/core/core

Results
===========
IBM POWER9 system: 2-socket, 44 cores, 176 CPUs

Function latency (with tb tick):
(lower is better)
+--------------+----------+--------+-------------+--------+
| select_idle_ | Baseline | stddev | Patch       | stddev |
+--------------+----------+--------+-------------+--------+
| core         | 2080     | 1307   | 1975(+5.3%) | 1286   |
| cpu          | 834      | 393    | 91(+89%)    | 64     |
| sibling      | 0.96     | 0.003  | 0.89(+7%)   | 0.02   |
+--------------+----------+--------+-------------+--------+

Schbench:
- schbench -m 44 -t 1
(lower is better)
+------+----------+--------+------------+--------+
| %ile | Baseline | stddev | Patch      | stddev |
+------+----------+--------+------------+--------+
| 50   | 9.9      | 2      | 10(-1.01%) | 1.4    |
| 95   | 465      | 3.9    | 465(0%)    | 2      |
| 99   | 561      | 24     | 483(-1.0%) | 14     |
| 99.5 | 631      | 29     | 635(-0.6%) | 32     |
| 99.9 | 801      | 41     | 763(+4.7%) | 125    |
+------+----------+--------+------------+--------+

- 44 threads spread across cores to make select_idle_core return -1
  most times
- schbench -m 44 -t 1
+-------+----------+--------+-----------+--------+
| %ile  | Baseline | stddev | Patch     | stddev |
+-------+----------+--------+-----------+--------+
| 50    | 10       | 9      | 12(-20%)  | 1      |
| 95    | 468      | 3      | 31(+93%)  | 1      |
| 99    | 577      | 16     | 477(+17%) | 38     |
| 99.95 | 647      | 26     | 482(+25%) | 2      |
| 99.99 | 835      | 61     | 492(+41%) | 2      |
+-------+----------+--------+-----------+--------+

Hackbench:
- 44 threads spread across cores to make select_idle_core return -1
  most times
- perf bench sched messaging -g 1 -l 100000
(lower is better)
+----------+--------+--------------+--------+
| Baseline | stddev | Patch        | stddev |
+----------+--------+--------------+--------+
| 16.107   | 0.62   | 16.02(+0.5%) | 0.32   |
+----------+--------+--------------+--------+

Series:
- Patch 01: Rename select_idle_mask to reuse the name in next patch
- Patch 02: Optimize the wakeup fast path

Parth Shah (2):
  sched/fair: Rename select_idle_mask to iterator_mask
  sched/fair: Optimize idle CPU search

 kernel/sched/core.c |  3 +++
 kernel/sched/fair.c | 15 ++++++++++-----
 2 files changed, 13 insertions(+), 5 deletions(-)

-- 
2.17.1
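
For readers who want to see the idea outside the kernel, here is a
minimal user-space sketch of the two-phase search with a shared
candidate mask. It is not the code from this series: the topology
(4 cores x 4 SMT threads), the load pattern, and every name in it
(probe_idle, find_idle_core, find_idle_cpu, phase2_probes) are
hypothetical, and a plain uint32_t stands in for a cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS   16   /* hypothetical topology: 4 cores x 4 SMT threads */
#define SMT_WIDTH 4

static bool cpu_idle[NR_CPUS];  /* stand-in for the kernel's idle check */
static int idle_checks;         /* number of CPUs probed so far */

static bool probe_idle(int cpu)
{
    idle_checks++;
    return cpu_idle[cpu];
}

/*
 * Phase 1: look for a fully idle core. Every CPU found busy is cleared
 * from *candidates so that phase 2 can skip it.
 */
static int find_idle_core(uint32_t *candidates)
{
    for (int core = 0; core < NR_CPUS / SMT_WIDTH; core++) {
        bool core_idle = true;

        for (int t = 0; t < SMT_WIDTH; t++) {
            int cpu = core * SMT_WIDTH + t;

            if (!probe_idle(cpu)) {
                core_idle = false;
                *candidates &= ~(UINT32_C(1) << cpu);
            }
        }
        if (core_idle)
            return core * SMT_WIDTH;
    }
    return -1;
}

/* Phase 2: probe only the CPUs still set in the candidate mask. */
static int find_idle_cpu(uint32_t candidates)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if ((candidates & (UINT32_C(1) << cpu)) && probe_idle(cpu))
            return cpu;
    }
    return -1;
}

/*
 * Run both phases on a fixed, made-up load pattern and return how many
 * probes phase 2 needed. With reuse_mask false, phase 2 gets a fresh
 * full mask, as if phase 1 had recorded nothing.
 */
int phase2_probes(bool reuse_mask, int *found_cpu)
{
    static const int busy[] = { 0, 1, 2, 4, 8, 12 };
    const uint32_t all_cpus = (UINT32_C(1) << NR_CPUS) - 1;
    uint32_t mask = all_cpus;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        cpu_idle[cpu] = true;
    for (size_t i = 0; i < sizeof(busy) / sizeof(busy[0]); i++)
        cpu_idle[busy[i]] = false;

    int core = find_idle_core(&mask);
    assert(core == -1);  /* every core has at least one busy thread */

    if (!reuse_mask)
        mask = all_cpus;

    idle_checks = 0;
    *found_cpu = find_idle_cpu(mask);
    return idle_checks;
}
```

On this pattern, phase2_probes(true, &cpu) needs one probe and
phase2_probes(false, &cpu) needs four; both end up on CPU 3. The
numbers only illustrate the mechanism and say nothing about the
measurements above.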