From: David Laight <David.Laight@ACULAB.COM>
To: "'Crt Mori'" <cmo@melexis.com>
CC: Peter Zijlstra <peterz@infradead.org>,
        Jonathan Cameron <jic23@kernel.org>, Ingo Molnar <mingo@kernel.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Kees Cook <keescook@chromium.org>,
        Rusty Russell <rusty@rustcorp.com.au>, Ian Abbott <abbotti@mev.co.uk>,
        Larry Finger <Larry.Finger@lwfinger.net>,
        Niklas Soderlund <niklas.soderlund+renesas@ragnatech.se>,
        Thomas Gleixner <tglx@linutronix.de>,
        Krzysztof Kozlowski <krzk@kernel.org>,
        Masahiro Yamada <yamada.masahiro@socionext.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "linux-iio@vger.kernel.org" <linux-iio@vger.kernel.org>,
        Joe Perches <joe@perches.com>
Subject: RE: [PATCH v10 1/3] lib: Add strongly typed 64bit int_sqrt
Thread-Topic: [PATCH v10 1/3] lib: Add strongly typed 64bit int_sqrt
Thread-Index: AQHTeZ24vj4uJGdJVEmg6EfyY0tMJ6NMSiFAgAAfCS+AAAbEMIAADUYAgAEf+RA=
Date: Thu, 21 Dec 2017 10:59:19 +0000
Message-ID: <d595eac5b5744a3a96cda44d96b08b53@AcuMS.aculab.com>
References: <20171220142001.18161-1-cmo@melexis.com>
 <1c1d0ffa8ee140bf9adbc78f1559b1e8@AcuMS.aculab.com>
 <20171220160001.manjff26gfbjccsw@hirez.programming.kicks-ass.net>
 <CAKv63uuL9+xzF7KruhYwSY68-M0=aJSvJOr5Y0vVBiX8ebqfeg@mail.gmail.com>
 <c3462afd27d14c8684ee33ef6623a31a@AcuMS.aculab.com>
 <CAKv63utKNW2--ZnkoJ0z++68764cX_S9xS9Gst_vj7eDsFAZrg@mail.gmail.com>
In-Reply-To: <CAKv63utKNW2--ZnkoJ0z++68764cX_S9xS9Gst_vj7eDsFAZrg@mail.gmail.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit

From: Crt Mori
> Sent: 20 December 2017 17:30
> >> OK, is there any more easy optimizations you see?
> >
> > I think this version works.
> > It doesn't have the optimisation for small values.
> >
> > unsigned int sqrt64(unsigned long long x)
> > {
> >         unsigned int x_hi = x >> 32;
> >
> >         unsigned int b = 0;
> >         unsigned int y = 0;
> >         unsigned int i;
> >
> >         for (i = 0; i < 32; i++) {
> >                 b <<= 2;
> >                 b |= x_hi >> 30;
> >                 x_hi <<= 2;
> >                 if (i == 15)
> >                         x_hi = x;
> >                 y <<= 1;
> >                 if (b > y)
> >                         b -= ++y;
> >         }
> >         return y;
> > }
..
> 
> I did a quick run through unit tests for the sensor and the results
> are way off. On the sensor I had to convert double calculations to
> integer calculations and target was to get end result within 0.02 degC
> (with previous approximate sqrt implementation) in sensor working
> range. This now gets into 3 degC delta at least and some are way off.
> It might be off because of some scaling on the other hand during the
> equation (not exactly comparing sqrt implementations here).

I didn't get it quite right...
The last few lines need to be:
		if (b > y) {
			b -= ++y;
			y++;
		}
	}
	return y >> 1;
}

Although that then fails for large inputs: b can overflow 32 bits for
some inputs above 2^60, and y for inputs of 2^62 and up.
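
For reference, a minimal sketch of the whole loop with the fix applied,
assuming b and y are simply widened to 64 bits so the large-input case
cannot overflow. The name sqrt64_wide and the use of a single 64-bit x
(instead of the x_hi/x split) are illustrative choices, not what was posted,
and the wider arithmetic would presumably cost more on 32-bit targets.

```c
#include <stdint.h>

/*
 * Bit-by-bit integer square root, consuming two input bits per iteration.
 * b is the running remainder; y holds twice the root found so far.
 * Assumption: b and y are 64-bit here (unlike the posted 32-bit version)
 * so that neither can overflow, whatever the input.
 */
uint32_t sqrt64_wide(uint64_t x)
{
	uint64_t b = 0;
	uint64_t y = 0;
	unsigned int i;

	for (i = 0; i < 32; i++) {
		/* Bring the next two bits of x, taken from the top, into b. */
		b = (b << 2) | (x >> 62);
		x <<= 2;
		y <<= 1;		/* y == 4 * (root so far) */
		if (b > y) {
			b -= ++y;	/* subtract 4 * root + 1 */
			y++;		/* y == 2 * (new root) */
		}
	}
	return (uint32_t)(y >> 1);
}
```

With 64-bit b and y the largest intermediate values stay below 2^35, so
the return value should be floor(sqrt(x)) over the full 64-bit range.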

	David