Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757187Ab0HIQkm (ORCPT ); Mon, 9 Aug 2010 12:40:42 -0400 Received: from nspiron-1.llnl.gov ([128.115.41.81]:13687 "EHLO nspiron-1.llnl.gov" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757174Ab0HIQkj (ORCPT ); Mon, 9 Aug 2010 12:40:39 -0400 X-Greylist: delayed 580 seconds by postgrey-1.27 at vger.kernel.org; Mon, 09 Aug 2010 12:40:39 EDT X-Attachments: 0001-Fix-div64_u64-for-32bit-platforms.patch, div64_u64_test.c, div64_s64_test.c, README From: Brian Behlendorf To: Andrew Morton Subject: [PATCH] Make div64_u64() precise on 32bit platforms Date: Mon, 9 Aug 2010 09:30:52 -0700 User-Agent: KMail/1.9.4 Cc: Oleg Nesterov , Ben Woodard , Jeremy Fitzhardinge , Mark Grondona , "linux-kernel@vger.kernel.org" References: <20100802160951.GA13385@redhat.com> <20100803152812.210439dc.akpm@linux-foundation.org> In-Reply-To: <20100803152812.210439dc.akpm@linux-foundation.org> MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_A1CYMh5a60KJySb" Message-Id: <201008090930.56671.behlendorf1@llnl.gov> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 12139 Lines: 428 --Boundary-00=_A1CYMh5a60KJySb Content-Type: multipart/signed; boundary="nextPart3079728.g0KSIjXtV0"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit --nextPart3079728.g0KSIjXtV0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline > On Mon, 2 Aug 2010 18:09:51 +0200 > > Oleg Nesterov wrote: > > We have a bugreport which blames div64_u64() on 32bit platforms. > > > > However, the code obviously doesn't even try to pretend it can do > > the 64bit division precisely. If there is something in the high > > word of divisor, div64_u64() just shifts both arguments and throws > > out the low bits. > > Well that was a bit lazy of us - I wonder how hard it is to fix. > > At present people will test their code on 64-bit only to find out later > that it doesn't work correctly on 32-bit. Bad. Perhaps we should > similarly break the 64-bit version :) Here's an even crazier idea, let's just fix the 32-bit version. :) The attached patch fully implements div64_u64() such that it will return=20 precisely the right quotient even when the divisor exceeds 32-bits. The=20 patch also adds a div64_s64() function to fully support signed 64-bit=20 division. Because this fix is non-obvious I have also included a unsigned and signed= =20 regression test to verify the correctness of the patch. Using a vanilla=20 2.6.35 kernel the unsigned regression tests fails on 32-bit platforms. Wit= h=20 the proposed patch applied both the unsigned and signed tests pass. =2D-=20 Thanks, Brian --nextPart3079728.g0KSIjXtV0 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQBMYC1ACrnpkcavZYsRAlxqAJ9mgyCP/C7zAKNckUMkJ8JFLvmlswCfSGk5 4LikuHrRL7An6rkrF3EgBBM= =j+nk -----END PGP SIGNATURE----- --nextPart3079728.g0KSIjXtV0-- --Boundary-00=_A1CYMh5a60KJySb Content-Type: text/plain; charset="iso-8859-1"; name="0001-Fix-div64_u64-for-32bit-platforms.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0001-Fix-div64_u64-for-32bit-platforms.patch" >From 3b30f1cf78f88b40360dd65816941cf2be9dd60d Mon Sep 17 00:00:00 2001 From: Brian Behlendorf Date: Thu, 5 Aug 2010 14:59:11 -0700 Subject: [PATCH] Fix div64_u64 for 32bit platforms The current implementation of div64_u64 for 32bit systems returns an approximately correct result when the divisor exceeds 32bits. Since doing 64bit division using 32bit hardware is a long since solved problem we just use one of the existing proven methods. Additionally, add a div64_s64 function to correctly handle doing signed 64bit division. --- include/linux/kernel.h | 5 +++ include/linux/math64.h | 12 +++++++++ lib/div64.c | 64 +++++++++++++++++++++++++++++++++++++++-------- 3 files changed, 70 insertions(+), 11 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 5de838b..7a00dff 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -162,6 +162,11 @@ extern int _cond_resched(void); (__x < 0) ? -__x : __x; \ }) +#define abs64(x) ({ \ + s64 __x = (x); \ + (__x < 0) ? -__x : __x; \ + }) + #ifdef CONFIG_PROVE_LOCKING void might_fault(void); #else diff --git a/include/linux/math64.h b/include/linux/math64.h index c87f152..23fcdfc 100644 --- a/include/linux/math64.h +++ b/include/linux/math64.h @@ -35,6 +35,14 @@ static inline u64 div64_u64(u64 dividend, u64 divisor) return dividend / divisor; } +/** + * div64_s64 - signed 64bit divide with 64bit divisor + */ +static inline s64 div64_s64(s64 dividend, s64 divisor) +{ + return dividend / divisor; +} + #elif BITS_PER_LONG == 32 #ifndef div_u64_rem @@ -53,6 +61,10 @@ extern s64 div_s64_rem(s64 dividend, s32 divisor, s32 *remainder); extern u64 div64_u64(u64 dividend, u64 divisor); #endif +#ifndef div64_s64 +extern s64 div64_s64(s64 dividend, s64 divisor); +#endif + #endif /* BITS_PER_LONG */ /** diff --git a/lib/div64.c b/lib/div64.c index a111eb8..e4e7fc6 100644 --- a/lib/div64.c +++ b/lib/div64.c @@ -77,26 +77,68 @@ s64 div_s64_rem(s64 dividend, s32 divisor, s32 *remainder) EXPORT_SYMBOL(div_s64_rem); #endif -/* 64bit divisor, dividend and result. dynamic precision */ +/** + * div64_u64 - unsigned 64bit divide with 64bit divisor + * @dividend: 64bit dividend + * @divisor: 64bit divisor + * + * This implementation is a modified version of the algorithm proposed + * by the book 'Hacker's Delight'. The original source and full proof + * can be found here and is available for use without restriction. + * + * 'http://www.hackersdelight.org/HDcode/newCode/divDouble.c' + */ #ifndef div64_u64 u64 div64_u64(u64 dividend, u64 divisor) { - u32 high, d; - - high = divisor >> 32; - if (high) { - unsigned int shift = fls(high); + u64 u0, quot0, quot1; + u32 rem; + int n; + + if (divisor >> 32 == 0) { + if (dividend >> 32 < divisor) { + return div_u64_rem(dividend, divisor, &rem); + } else { + u0 = dividend & 0xFFFFFFFF; + quot1 = div_u64_rem(dividend >> 32, divisor, &rem); + u0 += ((u64)rem << 32); + quot0 = div_u64_rem(u0, divisor, &rem); + return (quot1 << 32) + quot0; + } + } else { + n = __builtin_clzll(divisor); + quot1 = div_u64_rem(dividend >> 1, (divisor << n) >> 32, &rem); + quot0 = (quot1 << n) >> 31; - d = divisor >> shift; - dividend >>= shift; - } else - d = divisor; + if (quot0 != 0) + quot0 = quot0 - 1; + if ((dividend - quot0 * divisor) >= divisor) + quot0 = quot0 + 1; - return div_u64(dividend, d); + return quot0; + } } EXPORT_SYMBOL(div64_u64); #endif +/** + * div64_s64 - signed 64bit divide with 64bit divisor + * @dividend: 64bit dividend + * @divisor: 64bit divisor + */ +#ifndef div64_s64 +s64 div64_s64(s64 dividend, s64 divisor) +{ + s64 quot, t; + + quot = div64_u64(abs64(dividend), abs64(divisor)); + t = (dividend ^ divisor) >> 63; + + return (quot ^ t) - t; +} +EXPORT_SYMBOL(div64_s64); +#endif + #endif /* BITS_PER_LONG == 32 */ /* -- 1.5.4.5 --Boundary-00=_A1CYMh5a60KJySb Content-Type: text/x-csrc; charset="iso-8859-1"; name="div64_u64_test.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="div64_u64_test.c" #include /* * Verification test for div64_u64. */ #ifndef abs64 #define abs64(x) ({ \ s64 __x = (x); \ (__x < 0) ? -__x : __x; \ }) #endif int div64_u64_test(void) { u64 uu, vu, qu, ru; int n, i, j, errors = 0; const u64 tabu[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1000, 2003, 32765, 32766, 32767, 32768, 32769, 32760, 65533, 65534, 65535, 65536, 65537, 65538, 0x7ffffffeULL, 0x7fffffffULL, 0x80000000ULL, 0x80000001ULL, 0x7000000000000000ULL, 0x7000000080000000ULL, 0x7000000080000001ULL, 0x7fffffffffffffffULL, 0x7fffffff8fffffffULL, 0x7fffffff8ffffff1ULL, 0x7fffffff00000000ULL, 0x7fffffff80000000ULL, 0x7fffffff00000001ULL, 0x8000000000000000ULL, 0x8000000080000000ULL, 0x8000000080000001ULL, 0xc000000000000000ULL, 0xc000000080000000ULL, 0xc000000080000001ULL, 0xfffffffffffffffdULL, 0xfffffffffffffffeULL, 0xffffffffffffffffULL, }; printk("%s", "Testing unsigned 64-bit division.\n"); n = sizeof(tabu) / sizeof(tabu[0]); for (i = 0; i < n; i++) { for (j = 1; j < n; j++) { uu = tabu[i]; vu = tabu[j]; qu = div64_u64(uu, vu); ru = uu - qu * vu; if (qu > uu || ru >= vu) { printk("%016llx/%016llx != %016llx " "rem %016llx\n", uu, vu, qu, ru); errors++; } } } if (errors) { printk("Failed %d/%d tests\n", errors, n * (n - 1)); } else { printk("Passed all %d tests\n", n * (n - 1)); } return 0; } void div64_u64_exit(void) { } module_init(div64_u64_test); module_exit(div64_u64_exit); --Boundary-00=_A1CYMh5a60KJySb Content-Type: text/x-csrc; charset="iso-8859-1"; name="div64_s64_test.c" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="div64_s64_test.c" #include /* * Verification test for div64_s64. */ #ifndef abs64 #define abs64(x) ({ \ s64 __x = (x); \ (__x < 0) ? -__x : __x; \ }) #endif int div64_s64_test(void) { s64 u, v, q, r; int n, i, j, k, errors = 0; const s64 tabs[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1000, 2003, 32765, 32766, 32767, 32768, 32769, 32760, 65533, 65534, 65535, 65536, 65537, 65538, 0x7ffffffeLL, 0x7fffffffLL, 0x80000000LL, 0x80000001LL, 0x7000000000000000LL, 0x7000000080000000LL, 0x7000000080000001LL, 0x7fffffffffffffffLL, 0x7fffffff8fffffffLL, 0x7fffffff8ffffff1LL, 0x7fffffff00000000LL, 0x7fffffff80000000LL, 0x7fffffff00000001LL, 0x0123456789abcdefLL, 0x00000000abcdef01LL, 0x0000000012345678LL, }; printk("%s", "Testing signed 64-bit division.\n"); n = sizeof(tabs) / sizeof(tabs[0]); for (i = 0; i < n; i++) { for (j = 1; j < n; j++) { for (k = 0; k <= 3; k++) { u = (k & 1) ? -tabs[i] : tabs[i]; v = (k >= 2) ? -tabs[j] : tabs[j]; q = div64_s64(u, v); r = u - q * v; if (abs64(q) > abs64(u) || abs64(r) >= abs64(v) || (r != 0 && (r ^ u) < 0)) { printk("%016llx/%016llx != %016llx " "rem %016llx\n", u, v, q, r); errors++; } } } } if (errors) { printk("Failed %d/%d tests\n", errors, n * (n - 1)); } else { printk("Passed all %d tests\n", n * (n - 1)); } return 0; } void div64_s64_exit(void) { } module_init(div64_s64_test); module_exit(div64_s64_exit); --Boundary-00=_A1CYMh5a60KJySb Content-Type: text/plain; charset="iso-8859-1"; name="README" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="README" linux-2.6.35 Testing Summary | x86_64 | x86 | --------------------+-----------------+ linux-2.6.35 | PASS | FAIL | linux-2.6.35+patch | PASS | PASS | Testing Details (x86_64) -------------------------------------------------------------------------- * PASS - linux-2.6.35 Testing unsigned 64-bit division. Passed all 2756 tests * PASS - linux-2.6.35+patch Testing unsigned 64-bit division. Passed all 2756 tests Testing signed 64-bit division. Passed all 2162 tests Testing Details (x86) -------------------------------------------------------------------------- * FAIL - linux-2.6.35 Testing unsigned 64-bit division. 7000000080000000/7000000080000001 != 0000000000000001 rem ffffffffffffffff 7fffffff8fffffff/7fffffffffffffff != 0000000000000001 rem ffffffff90000000 7fffffff8ffffff1/7fffffffffffffff != 0000000000000001 rem ffffffff8ffffff2 7fffffff8ffffff1/7fffffff8fffffff != 0000000000000001 rem fffffffffffffff2 7fffffff00000000/7fffffff00000001 != 0000000000000001 rem ffffffffffffffff 7fffffff80000000/7fffffffffffffff != 0000000000000001 rem ffffffff80000001 7fffffff80000000/7fffffff8fffffff != 0000000000000001 rem fffffffff0000001 7fffffff80000000/7fffffff8ffffff1 != 0000000000000001 rem fffffffff000000f 8000000000000000/8000000080000000 != 0000000000000001 rem ffffffff80000000 8000000000000000/8000000080000001 != 0000000000000001 rem ffffffff7fffffff 8000000080000000/8000000080000001 != 0000000000000001 rem ffffffffffffffff c000000000000000/c000000080000000 != 0000000000000001 rem ffffffff80000000 c000000000000000/c000000080000001 != 0000000000000001 rem ffffffff7fffffff c000000080000000/c000000080000001 != 0000000000000001 rem ffffffffffffffff fffffffffffffffd/7fffffffffffffff != 0000000000000002 rem ffffffffffffffff fffffffffffffffd/fffffffffffffffe != 0000000000000001 rem ffffffffffffffff fffffffffffffffe/ffffffffffffffff != 0000000000000001 rem ffffffffffffffff Failed 17/2756 tests * PASS - linux-2.6.35+patch Testing unsigned 64-bit division. Passed all 2756 tests Testing signed 64-bit division. Passed all 2162 tests --Boundary-00=_A1CYMh5a60KJySb-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/