The current do_div() implementation has a bogus pointer compare to
generate build warnings on mismatch on 32-bit, unfortunately this not
only triggers for size mismatch, but also _any_ type mismatch, even on
reasonable 64-bit values:
In file included from kernel/sched.c:869:
kernel/sched_debug.c: In function 'nsec_high':
kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast
kernel/sched_debug.c:41: warning: comparison of distinct pointer types lacks a cast
kernel/sched_debug.c: In function 'nsec_low':
kernel/sched_debug.c:51: warning: comparison of distinct pointer types lacks a cast
...
The options are to either 'fix' all callers of do_div() to make sure
they're using a uint64_t explicitly, or to update do_div() to make sure
that the value is 64-bits, regardless of specific type. Currently
everything that uses the generic do_div() causes a warning when using one
of 'u64', 'long long', etc. instead of 'uint64_t'.
Half-assed empirical testing indicates that the number of false positives
far outweighs any benefits of this type of checking:
$ git grep uint64_t | wc -l
947
$ git grep u64 | wc -l
13942
In short, screw uint64_t and its fan club.
Signed-off-by: Paul Mundt <[email protected]>
---
include/asm-generic/div64.h | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
index a4a4937..63e7768 100644
--- a/include/asm-generic/div64.h
+++ b/include/asm-generic/div64.h
@@ -19,6 +19,7 @@
#include <linux/types.h>
#include <linux/compiler.h>
+#include <linux/kernel.h>
#if BITS_PER_LONG == 64
@@ -39,13 +40,11 @@ static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor)
extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor);
-/* The unnecessary pointer compare is there
- * to check for type safety (n must be 64bit)
- */
+/* The BUILD_BUG_ON() is there to check for type safety (n must be 64bit) */
# define do_div(n,base) ({ \
uint32_t __base = (base); \
uint32_t __rem; \
- (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
+ BUILD_BUG_ON(sizeof(n) != sizeof(uint64_t)); \
if (likely(((n) >> 32) == 0)) { \
__rem = (uint32_t)(n) % __base; \
(n) = (uint32_t)(n) / __base; \
On Mon, 17 Dec 2007 10:48:05 +0900 Paul Mundt <[email protected]> wrote:
> The current do_div() implementation has a bogus pointer compare to
> generate build warnings on mismatch on 32-bit, unfortunately this not
> only triggers for size mismatch, but also _any_ type mismatch, even on
> reasonable 64-bit values:
>
> In file included from kernel/sched.c:869:
> kernel/sched_debug.c: In function 'nsec_high':
> kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast
> kernel/sched_debug.c:41: warning: comparison of distinct pointer types lacks a cast
> kernel/sched_debug.c: In function 'nsec_low':
> kernel/sched_debug.c:51: warning: comparison of distinct pointer types lacks a cast
> ...
>
> The options are to either 'fix' all callers of do_div() to make sure
> they're using a uint64_t explicitly, or to update do_div() to make sure
> that the value is 64-bits, regardless of specific type. Currently
> everything that uses the generic do_div() causes a warning when using one
> of 'u64', 'long long', etc. instead of 'uint64_t'.
u64 and uint64_t should be identical?
> Half-assed empirical testing indicates that the number of false positives
> far outweighs any benefits of this type of checking:
>
> $ git grep uint64_t | wc -l
> 947
> $ git grep u64 | wc -l
> 13942
>
> In short, screw uint64_t and its fan club.
I don't get it. Are u64 and uint64_t different on any arch?
> diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
> index a4a4937..63e7768 100644
> --- a/include/asm-generic/div64.h
> +++ b/include/asm-generic/div64.h
> @@ -19,6 +19,7 @@
>
> #include <linux/types.h>
> #include <linux/compiler.h>
> +#include <linux/kernel.h>
>
> #if BITS_PER_LONG == 64
>
> @@ -39,13 +40,11 @@ static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor)
>
> extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor);
>
> -/* The unnecessary pointer compare is there
> - * to check for type safety (n must be 64bit)
> - */
> +/* The BUILD_BUG_ON() is there to check for type safety (n must be 64bit) */
> # define do_div(n,base) ({ \
> uint32_t __base = (base); \
> uint32_t __rem; \
> - (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
> + BUILD_BUG_ON(sizeof(n) != sizeof(uint64_t)); \
> if (likely(((n) >> 32) == 0)) { \
> __rem = (uint32_t)(n) % __base; \
> (n) = (uint32_t)(n) / __base; \
The mismatch which I've seen triggering a lot is doing do_div() on an s64
when it expects a u64.
And I think that _is_ a bug, isn't it? do_div(-10, 10) should return -1,
but as the implementation will convert -10 to <monstrously large +ve
number>, the return value will be wildly wrong?
I'm thinking that the problem here is that x86's do_div(s64, ...) doesn't
warn. So people write wrong code and then the problems only crop up on
architectures which use asm-generic/div64.h, which does warn?
(Adding Ingo to CC regarding kernel/lockdep_proc.c..)
On Sun, Dec 16, 2007 at 07:04:18PM -0800, Andrew Morton wrote:
> On Mon, 17 Dec 2007 10:48:05 +0900 Paul Mundt <[email protected]> wrote:
> > The options are to either 'fix' all callers of do_div() to make sure
> > they're using a uint64_t explicitly, or to update do_div() to make sure
> > that the value is 64-bits, regardless of specific type. Currently
> > everything that uses the generic do_div() causes a warning when using one
> > of 'u64', 'long long', etc. instead of 'uint64_t'.
>
> u64 and uint64_t should be identical?
>
Er, yes, that was supposed to be an 's64'. It only applies to sign
mismatch.
> > -/* The unnecessary pointer compare is there
> > - * to check for type safety (n must be 64bit)
> > - */
> > +/* The BUILD_BUG_ON() is there to check for type safety (n must be 64bit) */
> > # define do_div(n,base) ({ \
> > uint32_t __base = (base); \
> > uint32_t __rem; \
> > - (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
> > + BUILD_BUG_ON(sizeof(n) != sizeof(uint64_t)); \
> > if (likely(((n) >> 32) == 0)) { \
> > __rem = (uint32_t)(n) % __base; \
> > (n) = (uint32_t)(n) / __base; \
>
> The mismatch which I've seen triggering a lot is doing do_div() on an s64
> when it expects a u64.
>
> And I think that _is_ a bug, isn't it? do_div(-10, 10) should return -1,
> but as the implementation will convert -10 to <monstrously large +ve
> number>, the return value will be wildly wrong?
>
If it's supposed to be u64 only, then yes, the existing check should be
ok. There are a lot of places (time keeping code, lockdep, etc.) that
operate on signed values though, and from the comments in some places
this seems to be intentional (ie, kernel/lockdep_proc.c has this gem):
static void snprint_time(char *buf, size_t bufsiz, s64 nr)
{
unsigned long rem;
rem = do_div(nr, 1000); /* XXX: do_div_signed */
snprintf(buf, bufsiz, "%lld.%02d", (long long)nr, ((int)rem+5)/10);
}
> I'm thinking that the problem here is that x86's do_div(s64, ...) doesn't
> warn. So people write wrong code and then the problems only crop up on
> architectures which use asm-generic/div64.h, which does warn?
That seems to be an accurate asessment, yes. If do_div(s64, ...) is buggy
behaviour, then the current check is fine, and the callsites should be
corrected. Though if there's code in-tree that relies on s64 do_div, that seems
to be a more problematic issue.
On Mon, Dec 17, 2007 at 12:20:19PM +0900, Paul Mundt wrote:
> (Adding Ingo to CC regarding kernel/lockdep_proc.c..)
> That seems to be an accurate asessment, yes. If do_div(s64, ...) is buggy
> behaviour, then the current check is fine, and the callsites should be
> corrected. Though if there's code in-tree that relies on s64 do_div, that seems
> to be a more problematic issue.
It is a bug and the only existing callers that manage to work are those that
make sure that signed value is positive. Still asking for trouble...