Date: Fri, 8 Apr 2016 12:44:32 +0200
From: Peter Zijlstra
To: Yuyang Du
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, bsegall@google.com,
    pjt@google.com, morten.rasmussen@arm.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com
Subject: Re: [PATCH] sched/fair: Optimize sum computation with a lookup table

On Fri, Apr 08, 2016 at 10:07:20AM +0800, Yuyang Du wrote:
> __compute_runnable_contrib() uses a loop to compute the sum, whereas a
> table lookup can do it faster in constant time.
>
> -	/* Compute \Sum k^n combining precomputed values for k^i, \Sum k^j */
> -	do {
> -		contrib /= 2; /* y^LOAD_AVG_PERIOD = 1/2 */
> -		contrib += runnable_avg_yN_sum[LOAD_AVG_PERIOD];
> -
> -		n -= LOAD_AVG_PERIOD;
> -	} while (n > LOAD_AVG_PERIOD);
> -
> +	/* Since n < LOAD_AVG_MAX_N, n/LOAD_AVG_PERIOD < 11 */
> +	contrib = __accumulated_sum_N32[n/LOAD_AVG_PERIOD];
> +	n %= LOAD_AVG_PERIOD;
> 	contrib = decay_load(contrib, n);
> 	return contrib + runnable_avg_yN_sum[n];

You replace a simple loop with a DIV instruction and a potential extra
cache miss. Is that really faster? What is the median 'n' for which we
run that loop? IOW, how many iterations do we normally do?

And remember that while recent Intel chips are really good at
divisions, not everybody is (and even then they're still slow).
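
For anyone who wants to poke at the numbers, below is a minimal userspace
sketch of the two variants being compared. The constants and the
runnable_avg_yN_inv / runnable_avg_yN_sum tables are copied from
kernel/sched/fair.c, and __accumulated_sum_N32 from the patch; decay_load()
is reduced to plain 64-bit C (the kernel's mul_u64_u32_shr() and the
likely()/unlikely() annotations are dropped), and contrib_loop() /
contrib_table() are illustrative stand-ins for the old and new bodies of
__compute_runnable_contrib(). Treat it as a sketch of the arithmetic, not
the kernel code itself:

  #include <stdint.h>
  #include <stdio.h>

  #define LOAD_AVG_PERIOD  32     /* after 32 periods the weight halves: y^32 = 1/2 */
  #define LOAD_AVG_MAX     47742  /* limit of the geometric series, in fixed point  */
  #define LOAD_AVG_MAX_N   345    /* full periods needed to reach that maximum      */

  /* y^n scaled by 2^32, n = 0..31 (from kernel/sched/fair.c) */
  static const uint32_t runnable_avg_yN_inv[] = {
          0xffffffff, 0xfa83b2da, 0xf5257d14, 0xefe4b99a, 0xeac0c6e6, 0xe5b906e6,
          0xe0ccdeeb, 0xdbfbb796, 0xd744fcc9, 0xd2a81d91, 0xce248c14, 0xc9b9bd85,
          0xc5672a10, 0xc12c4cc9, 0xbd08a39e, 0xb8fbaf46, 0xb504f333, 0xb123f581,
          0xad583ee9, 0xa9a15ab4, 0xa5fed6a9, 0xa2704302, 0x9ef5325f, 0x9b8d39b9,
          0x9837f050, 0x94f4efa8, 0x91c3d373, 0x8ea4398a, 0x8b95c1e3, 0x88980e80,
          0x85aac367, 0x82cd8698,
  };

  /* \Sum 1024*y^k for k = 1..n, n = 0..32 (from kernel/sched/fair.c) */
  static const uint32_t runnable_avg_yN_sum[] = {
              0, 1002, 1982, 2941, 3880, 4798, 5697, 6576, 7437, 8279, 9103,
           9909,10698,11470,12226,12966,13690,14398,15091,15769,16433,17082,
          17718,18340,18949,19545,20128,20698,21256,21802,22336,22859,23371,
  };

  /* the patch's table: the loop's result for n = 0, 32, 64, ..., 352 */
  static const uint32_t __accumulated_sum_N32[] = {
              0, 23371, 35056, 40899, 43820, 45281,
          46011, 46376, 46559, 46650, 46696, 46719,
  };

  /* decay_load() boiled down: val * y^n; val stays below 2^16 here,
   * so the plain 64-bit multiply cannot overflow */
  static uint64_t decay_load(uint64_t val, uint64_t n)
  {
          if (!n)
                  return val;
          else if (n > LOAD_AVG_PERIOD * 63)
                  return 0;

          if (n >= LOAD_AVG_PERIOD) {
                  val >>= n / LOAD_AVG_PERIOD;    /* every 32 periods halve val */
                  n %= LOAD_AVG_PERIOD;
          }
          return (val * (uint64_t)runnable_avg_yN_inv[n]) >> 32;
  }

  /* current mainline: halve and re-add one period's sum per iteration */
  static uint32_t contrib_loop(uint64_t n)
  {
          uint32_t contrib = 0;

          if (n <= LOAD_AVG_PERIOD)
                  return runnable_avg_yN_sum[n];
          else if (n >= LOAD_AVG_MAX_N)
                  return LOAD_AVG_MAX;

          do {
                  contrib /= 2;   /* y^LOAD_AVG_PERIOD = 1/2 */
                  contrib += runnable_avg_yN_sum[LOAD_AVG_PERIOD];
                  n -= LOAD_AVG_PERIOD;
          } while (n > LOAD_AVG_PERIOD);

          contrib = decay_load(contrib, n);
          return contrib + runnable_avg_yN_sum[n];
  }

  /* proposed: one table lookup plus a div/mod instead of the loop */
  static uint32_t contrib_table(uint64_t n)
  {
          uint32_t contrib;

          if (n <= LOAD_AVG_PERIOD)
                  return runnable_avg_yN_sum[n];
          else if (n >= LOAD_AVG_MAX_N)
                  return LOAD_AVG_MAX;

          /* since n < LOAD_AVG_MAX_N, n/LOAD_AVG_PERIOD < 11 */
          contrib = __accumulated_sum_N32[n / LOAD_AVG_PERIOD];
          n %= LOAD_AVG_PERIOD;
          contrib = decay_load(contrib, n);
          return contrib + runnable_avg_yN_sum[n];
  }

  int main(void)
  {
          /* print every n where the two variants disagree */
          for (uint64_t n = 0; n < LOAD_AVG_MAX_N; n++) {
                  uint32_t a = contrib_loop(n), b = contrib_table(n);
                  if (a != b)
                          printf("n=%3llu: loop=%u table=%u\n",
                                 (unsigned long long)n, a, b);
          }
          return 0;
  }

Running this sketch, the two variants agree everywhere except at exact
multiples of LOAD_AVG_PERIOD above 32, where the !n early return in
decay_load() leaves the table variant one unit higher. The functional
difference is in the noise either way, so the question above is purely
about cycles: a div/mod and a possibly cold extra cache line versus a
short loop over data that is already hot.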