Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp403799ybt; Wed, 24 Jun 2020 02:06:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzWcgb9FO+5QHoP+mAngy4W8rMMZi2CXdyJZCOGlGmlitde5BiTyLmAKprhlBtch9pwpdqy X-Received: by 2002:aa7:d6c1:: with SMTP id x1mr25168391edr.154.1592989607388; Wed, 24 Jun 2020 02:06:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1592989607; cv=none; d=google.com; s=arc-20160816; b=Rc7C/Qqo1+V1mVQHAkNWDP+359eSI8Mic85ia/Oh/saRNBRiOUov7cddIXMLgNaPEU pQrXnR7DQ7qXBJIKeOPCU+ey7RTOmS8Fg73tE+ZEPvc8ZByhQocDpNGllt2FbDkJrA4A XircdiswTqxaOzsg6J5ynsPjCQQRgc2+cBxVDt23P82c/escUJrSqxMQo6U+1IC2p7vr 3Fu6LzFdKXVqDTJNkCtnXYCP6uDYRxuw6UOEgyhlvQt9iY7DAAH6Pi/gLBmuhJH2w+j/ SDdIUk/hcPs9VFBEc+XLPbB5qMFcyX9ZttVlpreoIywe5vkLoNXq7PpdX9hNiERohlXs 2TRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=bEq8aoWf7DDgO6yWt+tqmsLUbeY8nM58bRX+rO+CdzU=; b=K6tM35KloY/plUDnkM4XiPVckVvP92BESc56r5ReBvfevDcBoEXSXDKhzb10iSneZk nuYVPIiAlvq41SvRzcK2Dn9arvXcyRxF8C3Tqc5RwZTnLSgcDOyeacd55FfD5CKeLJiS LHeEUxeoTsFRYrxXrZ7KRVwFG6oQlkVQZLVJgEq8BF0GJzY0ZtDFOZ8gsmTZwcbhiTOl XoEL8Kt6/zk10xr1+pmUhUiGezyYXhFYmQPdjlB9pFnUA1H8SxoSqunhHrcH3GAYLycJ od13M+tcesd+iABvZUxHUZwgZVfa9WsLnJs+h3m4Mgy/DG/ETBKmazN/LJArEp/OW5BD Dtsg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="M/VMdU+/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l61si1906308edl.512.2020.06.24.02.06.23; Wed, 24 Jun 2020 02:06:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="M/VMdU+/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388798AbgFXJAU (ORCPT + 99 others); Wed, 24 Jun 2020 05:00:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387491AbgFXJAT (ORCPT ); Wed, 24 Jun 2020 05:00:19 -0400 Received: from mail-lj1-x244.google.com (mail-lj1-x244.google.com [IPv6:2a00:1450:4864:20::244]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 64266C061573 for ; Wed, 24 Jun 2020 02:00:19 -0700 (PDT) Received: by mail-lj1-x244.google.com with SMTP id q19so1737434lji.2 for ; Wed, 24 Jun 2020 02:00:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=bEq8aoWf7DDgO6yWt+tqmsLUbeY8nM58bRX+rO+CdzU=; b=M/VMdU+/Rwv6tpDepjVbdIbBpTq9QQDMbSHjVa1lOlkkglEW6NWrYNDEeOt9UWfop7 /OvWaOLZjvJvTEuEOdIOMQBJcCGNq6BOs8AVCpjl84aeBTiSrNP350e9UTCPa43+C0Z8 Lu7KS0nWmQTb9yawScRbIQl/0yb25FjWGhA+jg4MOkC1JRsq1ksQO69DM9cTnai9yp0U jN8URtdNssz562PIg4VG3ADMqJ59HXXanPFeqbXFd84/QSmzJXDq7BSel15Th0AIHXXz 88y8Z19BbVWw4gG6r454wTH5gyVVy5Svfj+xFO7oZn+LM8IE/DWspWc7b3CrTdvbMKid kFkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bEq8aoWf7DDgO6yWt+tqmsLUbeY8nM58bRX+rO+CdzU=; b=mYMF6lYMo0285xzaaRXAB+Vyyvvq/WNAW69W9oA1MAncRhYU44RWhAbaHYeI+WM9a+ shtzdLc+CwIo9CAE7nt0GUVMHoF2+zLohXKqC3Gxdxfkd5JvJ0QXOV8kVbh934u6W67D QwpyQW0n4H/+8G9hp8NXeE68pPPL2QBfavv9A0gfkKThrxWQ0jA3W3LuuUboNbm2bgse unBPZstYd7HZLAecE3Lw1VJHDbFCF+65KJa5i5ELj7nUkVqwxKP43t/BqQt5IdeuuEXi yh43oCRE32VrZmEd8MesM7huDw/zO/8w2y9cA+J9kLFBCUqaj/DF1QyFjiKv17ixkZGO GeWw== X-Gm-Message-State: AOAM530q794IST8/pzRH4cFCyWM2Hoe/mudliP6iRDll/fgyrztN9krt qcyxk2eZIZ2vWhqN87zNCqz9attQ/NODy53J4TOCog== X-Received: by 2002:a05:651c:512:: with SMTP id o18mr14254641ljp.226.1592989217706; Wed, 24 Jun 2020 02:00:17 -0700 (PDT) MIME-Version: 1.0 References: <20200619172011.5810-1-qais.yousef@arm.com> <6c1a6003-8f51-dadc-53e4-a2fa034dbe36@arm.com> In-Reply-To: <6c1a6003-8f51-dadc-53e4-a2fa034dbe36@arm.com> From: Vincent Guittot Date: Wed, 24 Jun 2020 11:00:06 +0200 Message-ID: Subject: Re: [PATCH v2 0/2] sched: Optionally skip uclamp logic in fast path To: Dietmar Eggemann , Qais Yousef Cc: Ingo Molnar , Peter Zijlstra , Juri Lelli , Steven Rostedt , Ben Segall , Mel Gorman , Patrick Bellasi , Chris Redpath , Lukasz Luba , linux-kernel Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 23 Jun 2020 at 19:40, Dietmar Eggemann wrote: > > On 19/06/2020 19:20, Qais Yousef wrote> This series attempts to address the report that uclamp logic could be expensive > > sometimes and shows a regression in netperf UDP_STREAM under certain > > conditions. > > > > The first patch is a fix for how struct uclamp_rq is initialized which is > > required by the 2nd patch which contains the real 'fix'. > > > > Worth noting that the root cause of the overhead is believed to be system > > specific or related to potential certain code/data layout issues, leading to > > worse I/D $ performance. > > > > Different systems exhibited different behaviors and the regression did > > disappear in certain kernel version while attempting to reporoduce. > > > > More info can be found here: > > > > https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/ > > > > Having the static key seemed the best thing to do to ensure the effect of > > uclamp is minimized for kernels that compile it in but don't have a userspace > > that uses it, which will allow distros to distribute uclamp capable kernels by > > default without having to compromise on performance for some systems that could > > be affected. > > My test data indicates that the static key w/o any uclamp users (3) > brings the performance number for the 'perf bench sched pipe' > workload back (i.e. from w/ !CONFIG_UCLAMP_TASK) (1). > > platform: > > Arm64 Hikey960 (only little CPUs [0-3]), no CPUidle, > performance CPUfreq governor > > workload: > > perf stat -n -r 20 -- perf bench sched pipe -T -l 100000 > > > (A) *** Performance results *** > > (1) tip/sched/core > # CONFIG_UCLAMP_TASK is not set > > *1.39285* +- 0.00191 seconds time elapsed ( +- 0.14% ) > > (2) tip/sched/core > CONFIG_UCLAMP_TASK=y > > *1.42877* +- 0.00181 seconds time elapsed ( +- 0.13% ) > > (3) tip/sched/core + opt_skip_uclamp_v2 > CONFIG_UCLAMP_TASK=y > > *1.38833* +- 0.00291 seconds time elapsed ( +- 0.21% ) > > (4) tip/sched/core + opt_skip_uclamp_v2 > CONFIG_UCLAMP_TASK=y > echo 512 > /proc/sys/kernel/sched_util_clamp_min (enable uclamp) > > *1.42062* +- 0.00238 seconds time elapsed ( +- 0.17% ) > > > (B) *** Profiling on CPU0 and CPU1 *** > > (further hp'ing out CPU2 and CPU3 to get consistent hit numbers) > > (1) > > CPU0: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1997346 2207642 us *1.105* us 0.033 us > activate_task 1997391 1840057 us *0.921* us 0.054 us > > CPU1: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1997455 2225960 us 1.114 us 0.034 us > activate_task 1997410 1842603 us 0.922 us 0.052 us > > (2) > > CPU0: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1998538 2419719 us *1.210* us 0.061 us > activate_task 1997119 1960401 us *0.981* us 0.034 us > > CPU1: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1996597 2400760 us 1.202 us 0.059 us > activate_task 1998016 1985013 us 0.993 us 0.028 us > > (3) > > CPU0: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1997525 2155416 us *1.079* us 0.020 us > activate_task 1997874 1899002 us *0.950* us 0.044 us > > CPU1: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1997935 2118648 us 1.060 us 0.017 us > activate_task 1997586 1895162 us 0.948 us 0.044 us > > (4) > > CPU0: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1998246 2428121 us *1.215* us 0.062 us > activate_task 1998252 2132141 us *1.067* us 0.020 us > > CPU1: Function Hit Time Avg s^2 > -------- --- ---- --- --- > deactivate_task 1996154 2414194 us 1.209 us 0.060 us > activate_task 1996148 2140667 us 1.072 us 0.021 us I have rerun the tests that I ran previously on my octo core arm64 (hikey): 20 iteration of perf bench sched pipe -T -l 50000 tip stands for tip/sched/core uclamp enabled means both uclamp task and uclamp cgroup the stdev is around 0.25% for all tests root level 1 level 2 level 3 tip uclamp disabled 50653 47188 44568 41925 tip uclamp enabled 48800(-3.66%) 45600(-3.37%) 42822(-3.92%) 40257(-3.98%) /w patch uclamp disabled 50615(-0.08%) 47198(+0.02%) 44609(+0.09%) 41735(-0.45%) /w patch uclamp enabled 49661(-1.96%) 46611(-1.22%) 43803(-1.72%) 41243(-1.63%) Results are better with your patch