Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752273AbbBESni (ORCPT ); Thu, 5 Feb 2015 13:43:38 -0500
Received: from mail-ig0-f170.google.com ([209.85.213.170]:59735 "EHLO mail-ig0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750991AbbBESnh (ORCPT ); Thu, 5 Feb 2015 13:43:37 -0500
MIME-Version: 1.0
In-Reply-To:
References: <1422897162-111998-1-git-send-email-aksgarg1989@gmail.com> <1422938843.2293.4.camel@stgolabs.net>
Date: Thu, 5 Feb 2015 10:43:33 -0800
Message-ID:
Subject: Re: [PATCH] lib/int_sqrt.c: Optimize square root function
From: Anshul Garg
To: Linus Torvalds
Cc: Davidlohr Bueso, Linux Kernel Mailing List, "anshul.g@samsung.com"
Content-Type: multipart/mixed; boundary=089e013d06e88bb3e8050e5bae40
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3918
Lines: 90

--089e013d06e88bb3e8050e5bae40
Content-Type: text/plain; charset=UTF-8

On Thu, Feb 5, 2015 at 10:33 AM, Linus Torvalds wrote:
> On Thu, Feb 5, 2015 at 10:20 AM, Linus Torvalds
> wrote:
>>
>> Hmm. I did that too [..]
>
> Side note: one difference in our results (apart from possibly just CPU
> uarch details) is that my loop goes to 100M to make it easier to just
> time it. Which means that my load essentially had about three more
> iterations over nonzero data.
>
> Linus

I have also run the same test on 100 million numbers.
The source code is attached.

The results are below:

int_sqrt_old - current
int_sqrt_new - with the proposed change

anshul@ubuntu:~/kernel_latest/sqrt$ time ./int_sqrt_old

real 0m41.895s
user 0m36.490s
sys 0m0.365s

anshul@ubuntu:~/kernel_latest/sqrt$ time ./int_sqrt_new

real 0m39.491s
user 0m36.495s
sys 0m0.338s

I ran this test on an Intel(R) Core(TM) i3-4000M CPU @ 2.40GHz in a VMware machine.

Please check whether I am doing anything wrong.

NOTE: I did not use gcc optimizations when compiling.
With -O2 optimization, the proposed solution takes more time.

--089e013d06e88bb3e8050e5bae40
Content-Type: text/x-csrc; charset=US-ASCII; name="int_sqrt_old.c"
Content-Disposition: attachment; filename="int_sqrt_old.c"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_i5shj5cq1

LyoKICogQ29weXJpZ2h0IChDKSAyMDEzIERhdmlkbG9ociBCdWVzbyA8ZGF2aWRsb2hyLmJ1ZXNv
QGhwLmNvbT4KICoKICogIEJhc2VkIG9uIHRoZSBzaGlmdC1hbmQtc3VidHJhY3QgYWxnb3JpdGht
IGZvciBjb21wdXRpbmcgaW50ZWdlcgogKiAgc3F1YXJlIHJvb3QgZnJvbSBHdXkgTC4gU3RlZWxl
LgogKi8KCiNpbmNsdWRlIDxzdGRpby5oPgovKioKICogaW50X3NxcnQgLSByb3VnaCBhcHByb3hp
bWF0aW9uIHRvIHNxcnQKICogQHg6IGludGVnZXIgb2Ygd2hpY2ggdG8gY2FsY3VsYXRlIHRoZSBz
cXJ0CiAqCiAqIEEgdmVyeSByb3VnaCBhcHByb3hpbWF0aW9uIHRvIHRoZSBzcXJ0KCkgZnVuY3Rp
b24uCiAqLwojZGVmaW5lIEJJVFNfUEVSX0xPTkcgKDgqc2l6ZW9mKGxvbmcpKQoKdW5zaWduZWQg
bG9uZyBpbnRfc3FydCh1bnNpZ25lZCBsb25nIHgpCnsKCXVuc2lnbmVkIGxvbmcgYiwgbSwgeSA9
IDA7CgoJaWYgKHggPD0gMSkKCQlyZXR1cm4geDsKCgltID0gMVVMIDw8IChCSVRTX1BFUl9MT05H
IC0gMik7Cgl3aGlsZSAobSAhPSAwKSB7CgkJYiA9IHkgKyBtOwoJCXkgPj49IDE7CgoJCWlmICh4
ID49IGIpIHsKCQkJeCAtPSBiOwoJCQl5ICs9IG07CgkJfQoJCW0gPj49IDI7Cgl9CgoJcmV0dXJu
IHk7Cn0KCmludCBtYWluKCkKewoJdW5zaWduZWQgbG9uZyBuID0gMTsKCWZvcig7bjw9MTAwMDAw
MDAwOysrbikKCQlpbnRfc3FydChuKTsKCXJldHVybiAwOwp9Cg==
--089e013d06e88bb3e8050e5bae40
Content-Type: text/x-csrc; charset=US-ASCII; name="int_sqrt_new.c"
Content-Disposition: attachment; filename="int_sqrt_new.c"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_i5shnyjq1

LyoKICogQ29weXJpZ2h0IChDKSAyMDEzIERhdmlkbG9ociBCdWVzbyA8ZGF2aWRsb2hyLmJ1ZXNv
QGhwLmNvbT4KICoKICogIEJhc2VkIG9uIHRoZSBzaGlmdC1hbmQtc3VidHJhY3QgYWxnb3JpdGht
IGZvciBjb21wdXRpbmcgaW50ZWdlcgogKiAgc3F1YXJlIHJvb3QgZnJvbSBHdXkgTC4gU3RlZWxl
LgogKi8KCiNpbmNsdWRlIDxzdGRpby5oPgoKI2RlZmluZSBCSVRTX1BFUl9MT05HICg4KnNpemVv
Zihsb25nKSkKCi8qKgogKiBpbnRfc3FydCAtIHJvdWdoIGFwcHJveGltYXRpb24gdG8gc3FydAog
KiBAeDogaW50ZWdlciBvZiB3aGljaCB0byBjYWxjdWxhdGUgdGhlIHNxcnQKICoKICogQSB2ZXJ5
IHJvdWdoIGFwcHJveGltYXRpb24gdG8gdGhlIHNxcnQoKSBmdW5jdGlvbi4KICovCnVuc2lnbmVk
IGxvbmcgaW50X3NxcnQodW5zaWduZWQgbG9uZyB4KQp7Cgl1bnNpZ25lZCBsb25nIGIsIG0sIHkg
PSAwOwoKCWlmICh4IDw9IDEpCgkJcmV0dXJuIHg7CgoJbSA9IDFVTCA8PCAoQklUU19QRVJfTE9O
RyAtIDIpOwoKCXdoaWxlKG0gPiB4ICkKCQltID4+PSAyOwoKCXdoaWxlIChtICE9IDApIHsKCQli
ID0geSArIG07CgkJeSA+Pj0gMTsKCgkJaWYgKHggPj0gYikgewoJCQl4IC09IGI7CgkJCXkgKz0g
bTsKCQl9CgkJbSA+Pj0gMjsKCX0KCglyZXR1cm4geTsKfQoKaW50IG1haW4oKQp7Cgl1bnNpZ25l
ZCBsb25nIG4gPSAxOwoJZm9yKDtuPD0xMDAwMDAwMDA7KytuKQoJCWludF9zcXJ0KG4pOwoJcmV0
dXJuIDA7Cn0K
--089e013d06e88bb3e8050e5bae40--
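
A minimal sketch of how the two attached variants could be compared in a single program, assuming the decoded int_sqrt_old.c and int_sqrt_new.c differ only by the extra "while (m > x)" skip loop. The names int_sqrt_old, int_sqrt_new, count_mismatches and sum_roots are made up for this illustration and do not appear in either attachment. Returning a sum from the timed loop gives the calls an observable result, which may matter once optimization is enabled, since the attached main() discards the return value of int_sqrt().

```c
#include <limits.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(long))

/* Shift-and-subtract integer square root, following int_sqrt_old.c. */
unsigned long int_sqrt_old(unsigned long x)
{
	unsigned long b, m, y = 0;

	if (x <= 1)
		return x;

	/* Start at the highest power of four that fits in a long. */
	m = 1UL << (BITS_PER_LONG - 2);
	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

	return y;
}

/* Same loop, but first skip the powers of four that are larger than x. */
unsigned long int_sqrt_new(unsigned long x)
{
	unsigned long b, m, y = 0;

	if (x <= 1)
		return x;

	m = 1UL << (BITS_PER_LONG - 2);
	while (m > x)
		m >>= 2;

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

	return y;
}

/* Hypothetical helper: count inputs in [1, limit] where the variants differ. */
unsigned long count_mismatches(unsigned long limit)
{
	unsigned long n, bad = 0;

	for (n = 1; n <= limit; ++n)
		if (int_sqrt_old(n) != int_sqrt_new(n))
			++bad;

	return bad;
}

/* Hypothetical helper: sum the roots over [1, limit] so each result is used. */
unsigned long sum_roots(unsigned long (*fn)(unsigned long), unsigned long limit)
{
	unsigned long n, sum = 0;

	for (n = 1; n <= limit; ++n)
		sum += fn(n);

	return sum;
}
```

Calling sum_roots(int_sqrt_old, 100000000UL) and sum_roots(int_sqrt_new, 100000000UL) from main() and printing both values would repeat the 100M loop while keeping the results live. count_mismatches() should return 0 if the two variants agree over the tested range.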