Date: Wed, 3 Mar 2021 14:46:53 -0800
Message-Id: <20210303224653.2579656-1-joshdon@google.com>
X-Mailer: git-send-email 2.30.1.766.gb4fecdf3b7-goog
Subject: [PATCH v2] sched: Optimize __calc_delta.
From: Josh Don
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Daniel Bristot de Oliveira, Nathan Chancellor, Nick Desaulniers,
 linux-kernel@vger.kernel.org, clang-built-linux@googlegroups.com,
 Clement Courbet, Oleg Rombakh, Josh Don
X-Mailing-List: linux-kernel@vger.kernel.org

From: Clement Courbet

A significant portion of __calc_delta time is spent in the loop
shifting a u64 by 32 bits. Use `fls` instead of iterating. This is ~7x
faster on benchmarks.

The generic `fls` implementation (`generic_fls`) is still ~4x faster
than the loop.
Architectures that have a better implementation will make use of it.
For example, on x86 we get an additional factor of 2 in speed without a
dedicated implementation.

On gcc, the asm versions of `fls` are about the same speed as the
builtin. On clang, the versions that use fls are more than twice as
slow as the builtin. This is because of the way the `fls` function is
written: clang puts the value in memory:
https://godbolt.org/z/EfMbYe. This bug is filed at
https://bugs.llvm.org/show_bug.cgi?id=49406.

```
name                                   cpu/op
BM_Calc<__calc_delta_loop>             9.57ms ±12%
BM_Calc<__calc_delta_generic_fls>      2.36ms ±13%
BM_Calc<__calc_delta_asm_fls>          2.45ms ±13%
BM_Calc<__calc_delta_asm_fls_nomem>    1.66ms ±12%
BM_Calc<__calc_delta_asm_fls64>        2.46ms ±13%
BM_Calc<__calc_delta_asm_fls64_nomem>  1.34ms ±15%
BM_Calc<__calc_delta_builtin>          1.32ms ±11%
```

Signed-off-by: Clement Courbet
Signed-off-by: Josh Don
---
 kernel/sched/fair.c  | 19 +++++++++++--------
 kernel/sched/sched.h |  1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a8bd7b13634..a691371960ae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -229,22 +229,25 @@ static void __update_inv_weight(struct load_weight *lw)
 static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
 {
 	u64 fact = scale_load_down(weight);
+	u32 fact_hi = (u32)(fact >> 32);
 	int shift = WMULT_SHIFT;
+	int fs;
 
 	__update_inv_weight(lw);
 
-	if (unlikely(fact >> 32)) {
-		while (fact >> 32) {
-			fact >>= 1;
-			shift--;
-		}
+	if (unlikely(fact_hi)) {
+		fs = fls(fact_hi);
+		shift -= fs;
+		fact >>= fs;
 	}
 
 	fact = mul_u32_u32(fact, lw->inv_weight);
 
-	while (fact >> 32) {
-		fact >>= 1;
-		shift--;
+	fact_hi = (u32)(fact >> 32);
+	if (fact_hi) {
+		fs = fls(fact_hi);
+		shift -= fs;
+		fact >>= fs;
 	}
 
 	return mul_u64_u32_shr(delta_exec, fact, shift);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10a1522b1e30..714af71cf983 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -36,6 +36,7 @@
 #include
 
 #include
+#include
 #include
 #include
 #include
-- 
2.30.1.766.gb4fecdf3b7-goog
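
For readers who want to convince themselves that the `fls` form computes the same result as the loop it replaces, here is a minimal userspace sketch. It is not kernel code: `fls32`, `shrink_loop`, `shrink_fls` and `check_equivalent` are hypothetical names, `fls32` is a stand-in for the kernel's `fls` that assumes a GCC/Clang-style `__builtin_clz`, and the starting shift of 32 is an arbitrary illustration value rather than a claim about `WMULT_SHIFT`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's fls(): 1-based index
 * of the highest set bit, 0 when x == 0. Assumes __builtin_clz exists. */
static int fls32(uint32_t x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/* Loop form: shift right one bit at a time until fact fits in 32 bits. */
static uint64_t shrink_loop(uint64_t fact, int *shift)
{
	while (fact >> 32) {
		fact >>= 1;
		(*shift)--;
	}
	return fact;
}

/* fls form: derive the whole shift count from the upper 32 bits at once. */
static uint64_t shrink_fls(uint64_t fact, int *shift)
{
	uint32_t fact_hi = (uint32_t)(fact >> 32);

	if (fact_hi) {
		int fs = fls32(fact_hi);

		*shift -= fs;
		fact >>= fs;
	}
	return fact;
}

/* Assert that both forms agree on the reduced value and the shift. */
static void check_equivalent(uint64_t fact)
{
	int s_loop = 32, s_fls = 32;	/* illustrative starting shift */

	assert(shrink_loop(fact, &s_loop) == shrink_fls(fact, &s_fls));
	assert(s_loop == s_fls);
}
```

Calling `check_equivalent` on values such as `0xFFFFFFFF`, `0x100000000` and `UINT64_MAX` exercises the no-shift, one-bit and 32-bit cases. The two forms agree because the loop runs once for every significant bit in the upper word, which is exactly the count `fls` returns.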