Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966221AbbBCPmK (ORCPT ); Tue, 3 Feb 2015 10:42:10 -0500 Received: from mail-ie0-f173.google.com ([209.85.223.173]:39149 "EHLO mail-ie0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755993AbbBCPmG (ORCPT ); Tue, 3 Feb 2015 10:42:06 -0500 MIME-Version: 1.0 In-Reply-To: <1422938843.2293.4.camel@stgolabs.net> References: <1422897162-111998-1-git-send-email-aksgarg1989@gmail.com> <1422938843.2293.4.camel@stgolabs.net> Date: Tue, 3 Feb 2015 21:12:04 +0530 Message-ID: Subject: Re: [PATCH] lib/int_sqrt.c: Optimize square root function From: Anshul Garg To: Davidlohr Bueso Cc: Linus Torvalds , Linux Kernel Mailing List , "anshul.g@samsung.com" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3294 Lines: 88 On Tue, Feb 3, 2015 at 10:17 AM, Davidlohr Bueso wrote: > On Mon, 2015-02-02 at 11:00 -0800, Linus Torvalds wrote: >> Hmm. I don't disagree, but would like some more feedback. >> >> Davidlohr - you were the person to touch this function last (commit >> 30493cc9dddb: "lib/int_sqrt.c: optimize square root algorithm"), and >> you did so for performance reasons. And in fact, when you did that, >> you removed that initial loop: >> >> - one = 1UL << (BITS_PER_LONG - 2); >> - while (one > op) >> - one >>= 2; >> >> but I'm not sure that was actually all that conscious, I think the >> real optimization was the changes inside the loop to make the final >> real loop faster and simpler. > > I missed that. And yes, the real optimization should be in the loop. > >> >> Also, you had performance numbers, so presumably a test harness for it >> all. It probably depends a lot on the actual distribution of argument >> values, of course, but it would be good to accompany the patch with >> actual real numbers like lasty time. > > Aha. In my case I recall I ran a usersapce program using each function > from 1 to a million, and throwing perf at it for 10 times. > I have done profiling of int_sqrt function using perf tool for 10 times. For this purpose i have created a userspace program which uses sqrt function from 1 to a million. int_sqrt_old -> current algorithm version int_sqrt_new -> with proposed change these results are for BITS_PER_LONG=64. Performance counter stats for './int_sqrt_old' (10 runs): 460.944061 task-clock (msec) # 0.969 CPUs utilized ( +- 1.72% ) 64 context-switches # 0.139 K/sec ( +- 2.27% ) 0 cpu-migrations # 0.000 K/sec 132 page-faults # 0.286 K/sec cycles stalled-cycles-frontend stalled-cycles-backend instructions branches branch-misses 0.475795341 seconds time elapsed( +- 3.20% ) Performance counter stats for './int_sqrt_new' (10 runs): 401.782119 task-clock (msec) # 0.974 CPUs utilized( +- 1.55% ) 57 context-switches # 0.141 K/sec( +- 1.92% ) 0 cpu-migrations # 0.000 K/sec 132 page-faults # 0.329 K/sec cycles stalled-cycles-frontend stalled-cycles-backend instructions branches branch-misses 0.412593296 seconds time elapsed( +- 2.03% ) As per profiling definitely there is improvement in algorithm timing. >> (I'm also not entirely sure what uses int_sqrt() that ends up being so >> performance-critical, so it would be good to document that too, since >> that probably also matters for the "what's the normal argument range" >> question..) > > It's not a big deal afaik. > > Thanks, > Davidlohr > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/