Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755034AbbKCRCn (ORCPT ); Tue, 3 Nov 2015 12:02:43 -0500 Received: from mx2.parallels.com ([199.115.105.18]:33726 "EHLO mx2.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751593AbbKCRCm (ORCPT ); Tue, 3 Nov 2015 12:02:42 -0500 From: James Bottomley To: "vkuznets@redhat.com" CC: "ulf.hansson@linaro.org" , "linux@rasmusvillemoes.dk" , "andriy.shevchenko@linux.intel.com" , "keescook@chromium.org" , "linux-kernel@vger.kernel.org" , "akpm@linux-foundation.org" Subject: Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Thread-Topic: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface Thread-Index: AQHRFjlhTWMdTgEBxkW4jHz39JrfFJ6LDIAA Date: Tue, 3 Nov 2015 17:02:29 +0000 Message-ID: <1446570148.6440.11.camel@Odin.com> References: <1446136250-11507-1-git-send-email-vkuznets@redhat.com> <1446136250-11507-2-git-send-email-vkuznets@redhat.com> <1446157677.25009.2.camel@Odin.com> <87d1vwskfu.fsf@vitty.brq.redhat.com> <1446250866.25009.54.camel@Odin.com> <87y4egpf57.fsf@vitty.brq.redhat.com> <1446522039.2189.49.camel@Odin.com> <87egg7fcpn.fsf@vitty.brq.redhat.com> In-Reply-To: <87egg7fcpn.fsf@vitty.brq.redhat.com> Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-mailer: Evolution 3.12.11 x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [184.11.141.41] Content-Type: text/plain; charset="utf-8" Content-ID: <2DBB7B7DBF2ACC46B860B0870A7E563A@sw.swsoft.com> MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id tA3H2mGJ008418 Content-Length: 8162 Lines: 212 On Tue, 2015-11-03 at 14:13 +0100, Vitaly Kuznetsov wrote: > James Bottomley writes: > > > On Mon, 2015-11-02 at 16:58 +0100, Vitaly Kuznetsov wrote: > >> James Bottomley writes: > >> > >> > On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote: > >> >> James Bottomley writes: > >> >> > >> >> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote: > >> >> >> string_get_size() can't really handle huge block sizes, especially > >> >> >> blk_size > U32_MAX but string_get_size() interface states the opposite. > >> >> >> Change blk_size from u64 to u32 to reflect the reality. > >> >> > > >> >> > What is the actual evidence for this? The calculation is designed to be > >> >> > a symmetric 128 bit multiply. When I wrote and tested it, it worked > >> >> > fine for huge block sizes. > >> >> > >> >> We have 'u32 remainder' and then we do: > >> >> > >> >> exp = divisor[units] / (u32)blk_size; > >> >> ... > >> >> remainder = do_div(size, divisor[units]); > >> >> remainder *= blk_size; > >> >> > >> >> I'm pretty sure it will overflow for some inputs. > >> > > >> > It shouldn't; the full code snippet does this: > >> > > >> > while (blk_size >= divisor[units]) { > >> > remainder = do_div(blk_size, divisor[units]); > >> > i++; > >> > } > >> > > >> > exp = divisor[units] / (u32)blk_size; > >> > > >> > So by the time it reaches the statement you complain about, blk_size is > >> > already less than or equal to the divisor (which is 1000 or 1024) so > >> > truncating to 32 bits is always correct. > >> > > >> > >> I overlooked, sorry! > >> > >> > I'm sort of getting the impression you don't quite understand the > >> > mathematics: i is the logarithm to the base divisor[units]. We reduce > >> > both operands to exponents of the logarithm base (adding the two bases > >> > together in i), which means they are by definition in a range between > >> > zero and the base and then multiply the remaining exponents correcting > >> > the result for a base overflow (so the result is always a correct > >> > exponent and i is the logarithm to the base). It's actually simply > >> > Napier's algorithm. > >> > > >> > The reason we're getting the up to 2.5% rounding errors you complain > >> > about is because at each logarithm until the last one, we throw away the > >> > remainder (it's legitimate because it's always 1000x smaller than the > >> > exponent), but in the case of a large remainder it provides a small > >> > correction to the final operation which we don't account for. If you > >> > want to make a true correction, you save the penultimate residue in each > >> > case, multiply each by the *other* exponent add them together, divide by > >> > the base and increment the final result by the remainder. > >> > >> My assumption was that we don't really need to support blk_sizes > > >> U32_MAX and we can simplify string_get_size() instead of adding > >> additional complexity. Apparently, the assumption was wrong. > >> > >> > > >> > However, for 2.5% the physicist in me says the above is way overkill. > >> > > >> > >> It is getting was over 2.5% if blk_size is not a power of 2. While it is > >> probably never the case for block subsystem the function is in lib and > >> pretends to be general-enough. I'll try to make proper correction and > >> let's see if it's worth the effort. > > > > OK, this is the full calculation. It also includes an arithmetic > > rounding to the final figure print. I suppose it's not that much more > > complexity than the original, and it does make the algorithm easier to > > understand. > > > > We could do with running the comments by some other non-mathematician, > > now I've explained it in detail to you two, to see if they actually give > > an understanding of the algorithm. > > Thanks, to me they look great! One nitpick below ... > > > > > James > > > > --- > > > > diff --git a/lib/string_helpers.c b/lib/string_helpers.c > > index 5939f63..1ec7e77a 100644 > > --- a/lib/string_helpers.c > > +++ b/lib/string_helpers.c > > @@ -44,7 +44,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, > > [STRING_UNITS_2] = 1024, > > }; > > int i, j; > > - u32 remainder = 0, sf_cap, exp; > > + u32 remainder = 0, sf_cap, r1 = 0, r2 = 0, round; > > char tmp[8]; > > const char *unit; > > > > @@ -53,27 +53,46 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, > > if (!size) > > goto out; > > > > + /* This is napier's algorithm. Reduce the original block size to > > + * > > + * co * divisor[units]^i > > + * > > + * where co = blk_size + r1/divisor[units]; > > + * > > + * and the same for size. We simply add to the exponent i, because > > + * the final calculation we're looking for is > > + * > > + * (co1 * co2) * divisor[units]^i > > + */ > > + > > + > > while (blk_size >= divisor[units]) { > > - remainder = do_div(blk_size, divisor[units]); > > + r1 = do_div(blk_size, divisor[units]); > > i++; > > } > > > > - exp = divisor[units] / (u32)blk_size; > > - /* > > - * size must be strictly greater than exp here to ensure that remainder > > - * is greater than divisor[units] coming out of the if below. > > - */ > > - if (size > exp) { > > - remainder = do_div(size, divisor[units]); > > - remainder *= blk_size; > > + while (size >= divisor[units]) { > > + r2 = do_div(size, divisor[units]); > > i++; > > - } else { > > - remainder *= size; > > } > > > > - size *= blk_size; > > - size += remainder / divisor[units]; > > - remainder %= divisor[units]; > > + /* here's the magic. co1 * co2 may be > divisor[i], so correct for > > + * that in the exponent and make sure that the additional corrections > > + * from the remainders is added in. > > + * > > + * co1 *co2 = (blk_size + r1/divisor[units])*(size + r2/divisor[units]) > > + * > > + * therefore > > + * > > + * co1*co2*divisor[units] = blk_size*size*divisor[units] + > > + * r1*size + r2*size + r1*r2/divisor[units] > > + * > > + * drop the last term because it's too small and perform the > > + * calculation cleverly by decremeting i to be automatically dealing > > + * with everything multiplied by divisor[units] */ > > + > > + --i; > > + size = size * blk_size * divisor[units] + r1 * size + r2 * > > blk_size; > > The last term is actually not that small. Here is an example: It's always < 1 in the final equation because it's divided by the square of divisor[units]. However, that can make a contribution to the round up, I suppose leading to the truncation you see > size = 8192 blk_size = 1024 > > 'As is' the algorithm gives us '8.38 MB', and if we add "+ r1 * r1 / > divisor[units]" we get '8.39 MB' (the correct answer is 8192 * 1024 = > 8388608 which is 8.39). > > Both r1 and r2 are < divisor[units] here so r1 * r2 won't overflow u32, > I suggest we add this term. > > > > > while (size >= divisor[units]) { > > remainder = do_div(size, divisor[units]); > > @@ -81,8 +100,15 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, > > } > > > > sf_cap = size; > > - for (j = 0; sf_cap*10 < 1000; j++) > > + round = 500; > > + for (j = 0; sf_cap*10 < 1000; j++) { > > sf_cap *= 10; > > + round /= 10; > > + } > > + > > + /* add a 5 to the digit below what will be printed to ensure > > + * an arithmetical round up */ > > + remainder += round; > > > > if (j) { > > remainder *= 1000; > > Can I post this solution with your Suggested-by or do you plan to do it > yourself? It was a suggestion when I explained what the missing sources of precision were, I don't think it's really a suggestion when it comes with an exemplary patch. However, the whole thing needs polishing because all the divisions need to be eliminated, since they're a huge source of problems for most 32 bit CPUs, so I can fix it all the way and repost. James ????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?