From: Kalesh Singh
Date: Mon, 25 Oct 2021 12:23:17 -0700
Subject: [PATCH v3 6/8] tracing/histogram: Optimize division by a power of 2
Cc: surenb@google.com, hridya@google.com, namhyung@kernel.org,
    kernel-team@android.com, Kalesh Singh, Steven Rostedt, Jonathan Corbet,
    Ingo Molnar, Shuah Khan, Masami Hiramatsu, Tom Zanussi,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org
Message-Id: <20211025192330.2992076-7-kaleshsingh@google.com>
In-Reply-To: <20211025192330.2992076-1-kaleshsingh@google.com>
References: <20211025192330.2992076-1-kaleshsingh@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Division is a slow operation. If the divisor is a power of 2, use a
shift instead.

Results were obtained using Android's version of perf (simpleperf[1]) as
described below:

1. hist_field_div() is modified to call 2 test functions:
   test_hist_field_div_[not]_optimized(), passing them the same args.
   Use noinline and volatile to ensure these are not optimized out by
   the compiler.

2. Create a hist event trigger that uses division:

      events/kmem/rss_stat$ echo 'hist:keys=common_pid:x=size/' >> trigger
      events/kmem/rss_stat$ echo 'hist:keys=common_pid:vals=$x' >> trigger

3. Run Android's lmkd_test[2] to generate rss_stat events, and record
   CPU samples with Android's simpleperf:

      simpleperf record -a --exclude-perf --post-unwind=yes -m 16384 -g -f 2000 -o perf.data

== Results ==

Divisor is a power of 2 (divisor == 32):

   test_hist_field_div_not_optimized  | 8,717,091 cpu-cycles
   test_hist_field_div_optimized      | 1,643,137 cpu-cycles

If the divisor is a power of 2, the optimized version is ~5.3x faster.

Divisor is not a power of 2 (divisor == 33):

   test_hist_field_div_not_optimized  | 4,444,324 cpu-cycles
   test_hist_field_div_optimized      | 5,497,958 cpu-cycles

If the divisor is not a power of 2, as expected, the optimized version is
slightly slower (~24% slower).

[1] https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md
[2] https://cs.android.com/android/platform/superproject/+/master:system/memory/lmkd/tests/lmkd_test.cpp

Signed-off-by: Kalesh Singh
Suggested-by: Steven Rostedt
---
 kernel/trace/trace_events_hist.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index db28bcf976f4..364cb3091789 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -304,6 +304,10 @@ static u64 hist_field_div(struct hist_field *hist_field,
 	if (!val2)
 		return -1;
 
+	/* Use shift if the divisor is a power of 2 */
+	if (!(val2 & (val2 - 1)))
+		return val1 >> __ffs64(val2);
+
 	return div64_u64(val1, val2);
 }
 
-- 
2.33.0.1079.g6e70778dc9-goog
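
For anyone who wants to check the shift/divide equivalence outside the kernel, below is a minimal userspace sketch of the same control flow. It is not the kernel code: sketch_ffs64(), sketch_div() and sketch_selfcheck() are hypothetical names used only for illustration, __builtin_ctzll() (a GCC/Clang builtin) stands in for __ffs64(), and a plain 64-bit `/` stands in for div64_u64().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's __ffs64(): index of the lowest
 * set bit. Like the kernel helper, the result is undefined for v == 0,
 * so callers must rule that case out first.
 */
static inline unsigned int sketch_ffs64(uint64_t v)
{
	return (unsigned int)__builtin_ctzll(v);
}

/* Same order of checks as the patched hist_field_div(). */
static uint64_t sketch_div(uint64_t val1, uint64_t val2)
{
	if (!val2)
		return (uint64_t)-1;	/* -1 converted to an unsigned 64-bit value */

	/*
	 * A power of 2 has exactly one bit set, so clearing its lowest
	 * set bit with val2 & (val2 - 1) leaves 0.
	 */
	if (!(val2 & (val2 - 1)))
		return val1 >> sketch_ffs64(val2);

	return val1 / val2;	/* assumption: stands in for div64_u64() */
}

/* Compare the shift path against plain division for every 2^n divisor. */
static void sketch_selfcheck(void)
{
	const uint64_t samples[] = { 0, 1, 31, 32, 33, 4096, UINT64_MAX };
	const unsigned int nr = sizeof(samples) / sizeof(samples[0]);
	unsigned int shift, i;

	for (shift = 0; shift < 64; shift++) {
		uint64_t divisor = (uint64_t)1 << shift;

		for (i = 0; i < nr; i++)
			assert(sketch_div(samples[i], divisor) ==
			       samples[i] / divisor);
	}

	/* A divisor that is not a power of 2 takes the fallback path. */
	for (i = 0; i < nr; i++)
		assert(sketch_div(samples[i], 33) == samples[i] / 33);
}
```

Calling sketch_selfcheck() from a small main() exercises both paths. The sketch only checks that the results match; it says nothing about cycle counts, which depend on the CPU and on how the compiler lowers the division.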