From: James Bottomley <jbottomley@odin.com>
To: "vkuznets@redhat.com" <vkuznets@redhat.com>
CC: "ulf.hansson@linaro.org" <ulf.hansson@linaro.org>,
        "linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk>,
        "andriy.shevchenko@linux.intel.com" 
	<andriy.shevchenko@linux.intel.com>,
        "keescook@chromium.org" <keescook@chromium.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for
 string_get_size() interface
Thread-Topic: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for
 string_get_size() interface
Thread-Index: AQHRFYdGTWMdTgEBxkW4jHz39JrfFJ6KLd6A
Date: Tue, 3 Nov 2015 03:40:43 +0000
Message-ID: <1446522039.2189.49.camel@Odin.com>
References: <1446136250-11507-1-git-send-email-vkuznets@redhat.com>
	 <1446136250-11507-2-git-send-email-vkuznets@redhat.com>
	 <1446157677.25009.2.camel@Odin.com>	<87d1vwskfu.fsf@vitty.brq.redhat.com>
	 <1446250866.25009.54.camel@Odin.com> <87y4egpf57.fsf@vitty.brq.redhat.com>
In-Reply-To: <87y4egpf57.fsf@vitty.brq.redhat.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
Content-Type: text/plain; charset="utf-8"
Content-ID: <2449CB64294C1541807E3CE01AAE1407@sw.swsoft.com>
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
Content-Transfer-Encoding: 8bit
Content-Length: 6352
Lines: 177

On Mon, 2015-11-02 at 16:58 +0100, Vitaly Kuznetsov wrote:
> James Bottomley <jbottomley@odin.com> writes:
> 
> > On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
> >> James Bottomley <jbottomley@odin.com> writes:
> >> 
> >> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
> >> >> string_get_size() can't really handle huge block sizes, especially
> >> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
> >> >> Change blk_size from u64 to u32 to reflect the reality.
> >> >
> >> > What is the actual evidence for this?  The calculation is designed to be
> >> > a symmetric 128 bit multiply.  When I wrote and tested it, it worked
> >> > fine for huge block sizes.
> >> 
> >> We have 'u32 remainder' and then we do:
> >> 
> >> exp = divisor[units] / (u32)blk_size;
> >> ...
> >> remainder = do_div(size, divisor[units]);
> >> remainder *= blk_size;
> >> 
> >> I'm pretty sure it will overflow for some inputs.
> >
> > It shouldn't; the full code snippet does this:
> >
> >         	while (blk_size >= divisor[units]) {
> >         		remainder = do_div(blk_size, divisor[units]);
> >         		i++;
> >         	}
> >
> >         	exp = divisor[units] / (u32)blk_size;
> >
> > So by the time it reaches the statement you complain about, blk_size is
> > already less than or equal to the divisor (which is 1000 or 1024) so
> > truncating to 32 bits is always correct.
> >
> 
> I overlooked, sorry!
> 
> > I'm sort of getting the impression you don't quite understand the
> > mathematics:  i is the logarithm to the base divisor[units].  We reduce
> > both operands to exponents of the logarithm base (adding the two bases
> > together in i), which means they are by definition in a range between
> > zero and the base and then multiply the remaining exponents correcting
> > the result for a base overflow (so the result is always a correct
> > exponent and i is the logarithm to the base).  It's actually simply
> > Napier's algorithm.
> >
> > The reason we're getting the up to 2.5% rounding errors you complain
> > about is because at each logarithm until the last one, we throw away the
> > remainder (it's legitimate because it's always 1000x smaller than the
> > exponent), but in the case of a large remainder it provides a small
> > correction to the final operation which we don't account for.  If you
> > want to make a true correction, you save the penultimate residue in each
> > case, multiply each by the *other* exponent add them together, divide by
> > the base and increment the final result by the remainder.
> 
> My assumption was that we don't really need to support blk_sizes >
> U32_MAX and we can simplify string_get_size() instead of adding
> additional complexity. Apparently, the assumption was wrong.
> 
> >
> > However, for 2.5% the physicist in me says the above is way overkill.
> >
> 
> It is getting was over 2.5% if blk_size is not a power of 2. While it is
> probably never the case for block subsystem the function is in lib and
> pretends to be general-enough. I'll try to make proper correction and
> let's see if it's worth the effort. 

OK, this is the full calculation.  It also includes an arithmetic
rounding to the final figure print.  I suppose it's not that much more
complexity than the original, and it does make the algorithm easier to
understand.

We could do with running the comments by some other non-mathematician,
now I've explained it in detail to you two, to see if they actually give
an understanding of the algorithm.

James

---

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 5939f63..1ec7e77a 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -44,7 +44,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 		[STRING_UNITS_2] = 1024,
 	};
 	int i, j;
-	u32 remainder = 0, sf_cap, exp;
+	u32 remainder = 0, sf_cap, r1 = 0, r2 = 0, round;
 	char tmp[8];
 	const char *unit;
 
@@ -53,27 +53,46 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	if (!size)
 		goto out;
 
+	/* This is napier's algorithm.  Reduce the original block size to
+	 *
+	 * co * divisor[units]^i
+	 *
+	 * where co = blk_size + r1/divisor[units];
+	 *
+	 * and the same for size.  We simply add to the exponent i, because
+	 * the final calculation we're looking for is
+	 *
+	 * (co1 * co2) * divisor[units]^i
+	 */
+
+
 	while (blk_size >= divisor[units]) {
-		remainder = do_div(blk_size, divisor[units]);
+		r1 = do_div(blk_size, divisor[units]);
 		i++;
 	}
 
-	exp = divisor[units] / (u32)blk_size;
-	/*
-	 * size must be strictly greater than exp here to ensure that remainder
-	 * is greater than divisor[units] coming out of the if below.
-	 */
-	if (size > exp) {
-		remainder = do_div(size, divisor[units]);
-		remainder *= blk_size;
+	while (size >= divisor[units]) {
+		r2 = do_div(size, divisor[units]);
 		i++;
-	} else {
-		remainder *= size;
 	}
 
-	size *= blk_size;
-	size += remainder / divisor[units];
-	remainder %= divisor[units];
+	/* here's the magic.  co1 * co2 may be > divisor[i], so correct for
+	 * that in the exponent and make sure that the additional corrections
+	 * from the remainders is added in.
+	 *
+	 * co1 *co2 = (blk_size + r1/divisor[units])*(size + r2/divisor[units])
+	 *
+	 * therefore
+	 *
+	 * co1*co2*divisor[units] = blk_size*size*divisor[units] +
+	 *          r1*size + r2*size + r1*r2/divisor[units]
+	 *
+	 * drop the last term because it's too small and perform the
+	 * calculation cleverly by decremeting i to be automatically dealing
+	 * with everything multiplied by divisor[units] */
+
+	--i;
+	size = size * blk_size * divisor[units] + r1 * size + r2 * blk_size;
 
 	while (size >= divisor[units]) {
 		remainder = do_div(size, divisor[units]);
@@ -81,8 +100,15 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
 	}
 
 	sf_cap = size;
-	for (j = 0; sf_cap*10 < 1000; j++)
+	round = 500;
+	for (j = 0; sf_cap*10 < 1000; j++) {
 		sf_cap *= 10;
+		round /= 10;
+	}
+
+	/* add a 5 to the digit below what will be printed to ensure
+	 * an arithmetical round up */
+	remainder += round;
 
 	if (j) {
 		remainder *= 1000;

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?