Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp254488pxb; Wed, 27 Oct 2021 02:22:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwjWNMrsW7VSki/qjtGTLSqtBh1pKipqAnH0v4OUhwpktJRyPajfPdSTnLmOg9u/DD3ptZ2 X-Received: by 2002:a63:3c1b:: with SMTP id j27mr16919027pga.462.1635326559712; Wed, 27 Oct 2021 02:22:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635326559; cv=none; d=google.com; s=arc-20160816; b=f+MjBW/vZASEcr+Vv+27MeBAlAa7qBi8/uBk7OPvTvvwcH3cIk8pjGznVMuxqKEDqE hw5DRyjPjkZWb8NgmOjpYhuIAkTu7RV3h6yvUn8VW/WB1eGbroBH8P18vrzkVKT6x53y a+D7sU6m9wJsLLeNZAVxTjAIs7y+HM3pW7p/6aRgUloFczsYs9I66WdX/WuNsK4vblwd NrwIKh9+Z/t9yZ1jLc5c+3WyfvUceOpwdeINOfP/wdjGZADqtzg105PvlWdkZMcR0jSE oR5ZLB8PawNTvyoFOzQP390vv2jlMpvQT2szSlvPRS63VPXnYYBgxCJIGPW5d/MIbdGd g3Lg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=vWySp3v3mAQhTse6jCzZjgR6iwELh+O8MawEw3wcnqo=; b=Tyg9Zm/Hoy/DmAT0Ts0hT87DvoIaIAObA386a47iy7w4DHaiPG59wByD+O24Ao9gtH fvbuAyw8AygQz66czA8LOWODg7oFWuFC+GrG6jRDpFjXNnoLBa0sKezZHtED0LlwcjoU sKXG0Fm0kAi4NuiKUM7srGfaeLx+uInPWeRO7a9Y3J1oiuvfAeM1WAkSANZA36dwtBNd 7ZPo5Mp/GbJzDQMRjAaH448dlBmijEL3KiNMI4pEHNMIcd1qKctj/H9z1edP5saL0jxU D2iK4mpfArA2lL8mU6WTCJ0xaAyJXL6W+gxjS5KFg77Ptx/3/tuOaJoGZ2SQk8gTd6n7 dR9w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p11si35493179pgh.463.2021.10.27.02.21.35; Wed, 27 Oct 2021 02:22:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236624AbhJZTRT (ORCPT + 99 others); Tue, 26 Oct 2021 15:17:19 -0400 Received: from mail.kernel.org ([198.145.29.99]:47746 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230184AbhJZTRS (ORCPT ); Tue, 26 Oct 2021 15:17:18 -0400 Received: from gandalf.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 897D16108D; Tue, 26 Oct 2021 19:14:53 +0000 (UTC) Date: Tue, 26 Oct 2021 15:14:51 -0400 From: Steven Rostedt To: Kalesh Singh Cc: surenb@google.com, hridya@google.com, namhyung@kernel.org, kernel-team@android.com, Jonathan Corbet , Ingo Molnar , Shuah Khan , Masami Hiramatsu , Tom Zanussi , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: Re: [PATCH v4 6/8] tracing/histogram: Optimize division by a power of 2 Message-ID: <20211026151451.7f3e09a4@gandalf.local.home> In-Reply-To: <20211025200852.3002369-7-kaleshsingh@google.com> References: <20211025200852.3002369-1-kaleshsingh@google.com> <20211025200852.3002369-7-kaleshsingh@google.com> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 25 Oct 2021 13:08:38 -0700 Kalesh Singh wrote: > == Results == > > Divisor is a power of 2 (divisor == 32): > > test_hist_field_div_not_optimized | 8,717,091 cpu-cycles > test_hist_field_div_optimized | 1,643,137 cpu-cycles > > If the divisor is a power of 2, the optimized version is ~5.3x faster. > > Divisor is not a power of 2 (divisor == 33): > > test_hist_field_div_not_optimized | 4,444,324 cpu-cycles > test_hist_field_div_optimized | 5,497,958 cpu-cycles To optimize this even more, if the divisor is constant, we could make a separate function to not do the branch, and just shift or divide. And even if it is not a power of 2, for constants, we could implement a multiplication and shift, and guarantee an accuracy up to a defined max. If div is a constant, then we can calculate the mult and shift, and max dividend. Let's use 20 for shift. // This works best for small divisors if (div > max_div) { // only do a real division return; } shift = 20; mult = ((1 << shift) + div - 1) / div; delta = mult * div - (1 << shift); if (!delta) { /* div is a power of 2 */ max = -1; return; } max = (1 << shift) / delta; We would of course need to use 64 bit operations (maybe only do this for 64 bit machines). And perhaps even use bigger shift values to get a bigger max. Then we could do: if (val1 < max) return (val1 * mult) >> shift; -- Steve