Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp492230pxb; Wed, 27 Oct 2021 06:54:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyVTzW1Rqc+T6UHqdkL5E9Uv+PFZ/OopywIAuOdNByoobRglkGTNIDtobFIE6O9Lw4NplCq X-Received: by 2002:a17:90a:d3d6:: with SMTP id d22mr5972751pjw.242.1635342897662; Wed, 27 Oct 2021 06:54:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635342897; cv=none; d=google.com; s=arc-20160816; b=PMDHSzz/2yyMJyFpuQBo4MldpXKaTyEOEKjpTkj2P1XUz24XzXE0G3xpZqzTGrL+PZ HuN4lc5lDpTrPYBiSpDZOHTrUJJ4wP2NPcBgCtA26mvZnx1vjlcl6z8+SsmLXQRcn46g DbKdHizB1dLLQu/ZL2OHCNk6VRuV1nkh97OmqQD0eXXn+VcydsWIWfn9BaVdasi477q0 ULuV30a7VMQs4RV5lVwTTLFjOEZtinYvIZodp1tfQ+6kK7VE9IYh/sMDlbc9ZRKS4kQd BchW8EPPdX4J/YTcL6UYbSqLxdgRltPeIjc3a+paL01MWK3dEU2jmKUTqJo8ZOEklDxD q5AA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=gIL2iBkDrtIwY2yLAt2y6K16SOUVortCY20PtkCZU/E=; b=WFhRjVPAOLkK9zJ0qWWaKPCavnPdrX0y0GQcvIEbzK/T8Kl4CYZnW9HTy9+poOUHUy pfNJo6WBGZT8qfNDnDzfTB1NgcFWmPElxW1Uk0iFA4tMmCRozEiLW39gSscAx4T9Jprl 7Hj6yER1TL04+t0cRs2jzFosLS6em5RO9TPkR7ekvQiVBBBkY1olbMdptCdTvySeT/Na o2ID7pR5Fnkv+KAxGv8G7DPz03fuhHmMs5RqxAufubby8dtCvTTaTfOFuI+aTGlImosz kMDmT7X/lBh60x7cA3r87u2dy2uvBupkIDJLjPyF+ol68QOwOftBWcm2hyF/9j/NT+nE lqHA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p63si7089pga.219.2021.10.27.06.54.37; Wed, 27 Oct 2021 06:54:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232051AbhJ0AVQ (ORCPT + 99 others); Tue, 26 Oct 2021 20:21:16 -0400 Received: from mail.kernel.org ([198.145.29.99]:54564 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231166AbhJ0AVO (ORCPT ); Tue, 26 Oct 2021 20:21:14 -0400 Received: from rorschach.local.home (cpe-66-24-58-225.stny.res.rr.com [66.24.58.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 437C86103B; Wed, 27 Oct 2021 00:18:49 +0000 (UTC) Date: Tue, 26 Oct 2021 20:18:46 -0400 From: Steven Rostedt To: Kalesh Singh Cc: surenb@google.com, hridya@google.com, namhyung@kernel.org, kernel-team@android.com, Jonathan Corbet , Ingo Molnar , Shuah Khan , Masami Hiramatsu , Tom Zanussi , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: Re: [PATCH v4 6/8] tracing/histogram: Optimize division by a power of 2 Message-ID: <20211026201846.08990d1d@rorschach.local.home> In-Reply-To: References: <20211025200852.3002369-1-kaleshsingh@google.com> <20211025200852.3002369-7-kaleshsingh@google.com> <20211026151451.7f3e09a4@gandalf.local.home> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 26 Oct 2021 16:39:13 -0700 Kalesh Singh wrote: > > // This works best for small divisors > > if (div > max_div) { > > // only do a real division > > return; > > } > > shift = 20; > > mult = ((1 << shift) + div - 1) / div; > > delta = mult * div - (1 << shift); > > if (!delta) { > > /* div is a power of 2 */ > > max = -1; > > return; > > } > > max = (1 << shift) / delta; > > I'm still trying to digest the above algorithm. mult = (2^20 + div - 1) / div; The "div - 1" is to round up. Basically, it's doing: X / div = X * (2^20 / div) / 2^20 If div is constant, the 2^20 / div is constant, and the "2^20" is the same as a shift. So multiplier is 2^20 / div, and the shift is 20. But because there's rounding errors it is only accurate up to the difference of: delta = mult * div / 2^20 That is if mult is a power of two, then there would be no rounding errors, and the delta is zero, making the max infinite: max = 2^20 / delta as delta goes to zero. > But doesn't this add 2 extra divisions? What am I missing here? The above is only done at parsing not during the trace, where we care about. > > > > > > We would of course need to use 64 bit operations (maybe only do this for 64 > > bit machines). And perhaps even use bigger shift values to get a bigger max. > > > > Then we could do: > > > > if (val1 < max) > > return (val1 * mult) >> shift; This is done at the time of recording. Actually, it would be: if (val1 < max) return (val1 * mult) >> shift; else return val1 / div; -- Steve