From: Kalesh Singh
Date: Mon, 25 Oct 2021 13:08:38 -0700
Subject: [PATCH v4 6/8] tracing/histogram: Optimize division by a power of 2
Message-Id: <20211025200852.3002369-7-kaleshsingh@google.com>
In-Reply-To: <20211025200852.3002369-1-kaleshsingh@google.com>
Cc: surenb@google.com, hridya@google.com, namhyung@kernel.org,
    kernel-team@android.com, Kalesh Singh, Steven Rostedt, Jonathan Corbet,
    Ingo Molnar, Shuah Khan, Masami Hiramatsu, Tom Zanussi,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org

The division is a slow operation. If the divisor is a power of 2, use a
shift instead.

Results were obtained using Android's version of perf (simpleperf[1]) as
described below:

1. hist_field_div() is modified to call 2 test functions:
   test_hist_field_div_[not]_optimized(); passing them the same args.
   Use noinline and volatile to ensure these are not optimized out by
   the compiler.
2. Create a hist event trigger that uses division:
      events/kmem/rss_stat$ echo 'hist:keys=common_pid:x=size/' >> trigger
      events/kmem/rss_stat$ echo 'hist:keys=common_pid:vals=$x' >> trigger
3. Run Android's lmkd_test[2] to generate rss_stat events, and record CPU
   samples with Android's simpleperf:
      simpleperf record -a --exclude-perf --post-unwind=yes -m 16384 -g -f 2000 -o perf.data

== Results ==

Divisor is a power of 2 (divisor == 32):

   test_hist_field_div_not_optimized  | 8,717,091 cpu-cycles
   test_hist_field_div_optimized      | 1,643,137 cpu-cycles

If the divisor is a power of 2, the optimized version is ~5.3x faster.

Divisor is not a power of 2 (divisor == 33):

   test_hist_field_div_not_optimized  | 4,444,324 cpu-cycles
   test_hist_field_div_optimized      | 5,497,958 cpu-cycles

If the divisor is not a power of 2, as expected, the optimized version is
slightly slower (~24% slower).

[1] https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md
[2] https://cs.android.com/android/platform/superproject/+/master:system/memory/lmkd/tests/lmkd_test.cpp

Signed-off-by: Kalesh Singh
Suggested-by: Steven Rostedt
---
 kernel/trace/trace_events_hist.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index db28bcf976f4..364cb3091789 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -304,6 +304,10 @@ static u64 hist_field_div(struct hist_field *hist_field,
 	if (!val2)
 		return -1;
 
+	/* Use shift if the divisor is a power of 2 */
+	if (!(val2 & (val2 - 1)))
+		return val1 >> __ffs64(val2);
+
 	return div64_u64(val1, val2);
 }
 
-- 
2.33.0.1079.g6e70778dc9-goog
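
For readers who want to try the power-of-2 check outside the kernel, here is
a minimal userspace sketch of the same logic. It is not part of the patch.
The names ffs64_sketch(), div_sketch() and check_div_sketch() are
hypothetical, ffs64_sketch() is only a simple stand-in for the kernel's
__ffs64(), and plain '/' stands in for div64_u64().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's __ffs64(): returns the index of
 * the lowest set bit. Like the kernel helper, it must not be called with 0.
 */
static unsigned int ffs64_sketch(uint64_t word)
{
	unsigned int bit = 0;

	while (!(word & 1)) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/* Mirrors the patched hist_field_div() logic on plain integers. */
static uint64_t div_sketch(uint64_t val1, uint64_t val2)
{
	/* Division by zero is undefined; the patched function returns -1. */
	if (!val2)
		return (uint64_t)-1;

	/* A power of 2 has exactly one bit set, so val2 & (val2 - 1) is 0. */
	if (!(val2 & (val2 - 1)))
		return val1 >> ffs64_sketch(val2);

	/* Assumption: ordinary division in place of div64_u64(). */
	return val1 / val2;
}

/* Checks that both paths agree with ordinary unsigned division. */
static void check_div_sketch(void)
{
	assert(div_sketch(1000, 32) == 1000 / 32);	/* shift path */
	assert(div_sketch(1000, 33) == 1000 / 33);	/* division path */
	assert(div_sketch(1000, 1) == 1000);		/* 1 == 2^0, shift by 0 */
	assert(div_sketch(1000, 0) == UINT64_MAX);	/* undefined case */
}
```

Calling check_div_sketch() from a test harness exercises both the shift path
(divisor 32) and the fallback path (divisor 33), matching the two divisors
measured above. For unsigned operands, a right shift by log2(divisor) gives
the same truncated quotient as division, which is why the shortcut is safe
here.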