Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758104AbcDHKpu (ORCPT ); Fri, 8 Apr 2016 06:45:50 -0400 Received: from casper.infradead.org ([85.118.1.10]:60781 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753459AbcDHKpt (ORCPT ); Fri, 8 Apr 2016 06:45:49 -0400 Date: Fri, 8 Apr 2016 12:45:42 +0200 From: Peter Zijlstra To: Yuyang Du Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, bsegall@google.com, pjt@google.com, morten.rasmussen@arm.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com Subject: Re: [PATCH] sched/fair: Optimize sum computation with a lookup table Message-ID: <20160408104542.GJ24771@twins.programming.kicks-ass.net> References: <1460081240-8074-1-git-send-email-yuyang.du@intel.com> <20160408104432.GW3430@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160408104432.GW3430@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 900 Lines: 23 On Fri, Apr 08, 2016 at 12:44:32PM +0200, Peter Zijlstra wrote: > On Fri, Apr 08, 2016 at 10:07:20AM +0800, Yuyang Du wrote: > > __compute_runnable_contrib() uses a loop to compute sum, whereas a > > table loopup can do it faster in a constant time. > > > - /* Compute \Sum k^n combining precomputed values for k^i, \Sum k^j */ > > - do { > > - contrib /= 2; /* y^LOAD_AVG_PERIOD = 1/2 */ > > - contrib += runnable_avg_yN_sum[LOAD_AVG_PERIOD]; > > - > > - n -= LOAD_AVG_PERIOD; > > - } while (n > LOAD_AVG_PERIOD); > > - > > + /* Since n < LOAD_AVG_MAX_N, n/LOAD_AVG_PERIOD < 11 */ > > + contrib = __accumulated_sum_N32[n/LOAD_AVG_PERIOD]; > > + n %= LOAD_AVG_PERIOD; > > contrib = decay_load(contrib, n); > > return contrib + runnable_avg_yN_sum[n]; > > You replace a simple loop with a DIV instruction and a potential extra > cachemiss. Duh, LOAD_AVG_PERIOD is 32, that's a simple shift.