LinuxLists.cc - [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

2023-08-04 11:31:48

Subject: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

The min() (etc) functions in minmax.h require that the arguments have
exactly the same types. This was probably added after an 'accident'
where a negative value got converted to a large unsigned value.

However when the type check fails, rather than look at the types and
fix the type of a variable/constant, everyone seems to jump on min_t().
In reality min_t() ought to be rare - when something unusual is being
done, not normality.

A quick grep shows 5734 min() and 4597 min_t().
Having the casts on almost half of the calls shows that something
is clearly wrong.

If the wrong type is picked (and it is far too easy to pick the type
of the result instead of the larger input) then significant bits can
get discarded.
Pretty much the worst example is in the derfved clamp_val(), consider:
unsigned char x = 200u;
y = clamp_val(x, 10u, 300u);

I also suspect that many of the min_t(u16, ...) are actually wrong.
For example copy_data() in printk_ringbuffer.c contains:
data_size = min_t(u16, buf_size, len);
Here buf_size is 'unsigned int' and len 'u16', pass a 64k buffer
(can you prove that doesn't happen?) and no data is returned.

The only reason that most of the min_t() are 'fine' is that pretty
much all the values in the kernel are between 0 and INT_MAX.

Patch 1 adds min_unsigned(), this uses integer promotions to convert
both arguments to 'unsigned long long'. It can be used to compare a
signed type that is known to contain a non-negative value with an
unsigned type. The compiler typically optimises it all away.
Added first so that it can be referred to in patch 2.

Patch 2 replaces the 'same type' check with a 'same signedness' one.
This makes min(unsigned_int_var, sizeof()) be ok.
The error message is also improved and will contain the expanded
form of both arguments (useful for seeing how constants are defined).

Patch 3 just fixes some whitespace.

Patch 4 allows comparisons of 'unsigned char' and 'unsigned short'
to signed types. The integer promotion rules convert them both
to 'signed int' prior to the comparison so they can never cause
a negative value be converted to a large positive one.

Patch 5 is slightly more contentious (Linus may not like it!)
effectively adds an (int) cast to all constants between 0 and MAX_INT.
This makes min(signed_int_var, sizeof()) be ok.

With all the patches applied pretty much all the min_t() could be
replaced by min(), and most of the rest by min_unsigned().
However they all need careful inspection due to code like:
sz = min_t(unsigned char, sz - 1, LIM - 1) + 1;
which converts 0 to LIM.

v3 Fix more issues found by the build robot
v2 Fixes some issues found by the kernel build robot.
No functional changes.

David Laight (5):
minmax: Add min_unsigned(a, b) and max_unsigned(a, b)
minmax: Allow min()/max()/clamp() if the arguments have the same
signedness.
Fix indentation of __cmp_once() and __clamp_once()
Allow comparisons of 'int' against 'unsigned char/short'.
Relax check to allow comparison between int and small unsigned
constants.

include/linux/minmax.h | 103 +++++++++++++++++++++++++++--------------
1 file changed, 68 insertions(+), 35 deletions(-)

--
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-04 11:34:37

by David Laight

[permalink] [raw]

Subject: [PATCH v3 2/5] minmax: Allow min()/max()/clamp() if the arguments have the same signedness.

The type-check in min()/max() is there to stop unexpected results if a
negative value gets converted to a large unsigned value.
However it also rejects 'unsigned int' v 'unsigned long' compares
which are common and never problematc.

Replace the 'same type' check with a 'same signedness' check.

The new test isn't itself a compile time error, so use static_assert()
to report the error and give a meaningful error message.

Due to the way builtin_choose_expr() works detecting the error in the
'non-constant' side (where static_assert() can be used) also detects
errors when the arguments are constant.

Signed-off-by: David Laight <[email protected]>
---
Changes for v3:
- Wrap is_signed_type() in __builtin_choose_expr() and __is_constexpr().
Generates a compile-time 0 for pointer types avoiding a bug in clang < 16.

Changes for v2:
- #include <linux/build_bug.h> to fix 'um' build.
This matches a separate commit from Andy.
- Do not delete __typecheck() - used outside this file.
- Use __builtin_choose_expr() in __clamp_once() to fix clang builds.

include/linux/minmax.h | 65 +++++++++++++++++++++++-------------------
1 file changed, 35 insertions(+), 30 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 531860e9cc55..6f79e44aad86 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -2,6 +2,7 @@
#ifndef _LINUX_MINMAX_H
#define _LINUX_MINMAX_H

+#include <linux/build_bug.h>
#include <linux/const.h>

/*
@@ -9,9 +10,8 @@
*
* - avoid multiple evaluations of the arguments (so side-effects like
* "x++" happen only once) when non-constant.
- * - perform strict type-checking (to generate warnings instead of
- * nasty runtime surprises). See the "unnecessary" pointer comparison
- * in __typecheck().
+ * - perform signed v unsigned type-checking (to generate compile
+ * errors instead of nasty runtime surprises).
* - retain result as a constant expressions when called with only
* constant expressions (to avoid tripping VLA warnings in stack
* allocation usage).
@@ -19,23 +19,30 @@
#define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))

-#define __no_side_effects(x, y) \
- (__is_constexpr(x) && __is_constexpr(y))
+/* is_signed_type() isn't a constexpr for pointer types */
+#define __is_signed(x) \
+ __builtin_choose_expr(__is_constexpr(is_signed_type(typeof(x))), \
+ is_signed_type(typeof(x)), 0)

-#define __safe_cmp(x, y) \
- (__typecheck(x, y) && __no_side_effects(x, y))
+#define __types_ok(x, y) \
+ (__is_signed(x) == __is_signed(y))

-#define __cmp(x, y, op) ((x) op (y) ? (x) : (y))
+#define __cmp_op_min <
+#define __cmp_op_max >

-#define __cmp_once(x, y, unique_x, unique_y, op) ({ \
+#define __cmp(op, x, y) ((x) __cmp_op_##op (y) ? (x) : (y))
+
+#define __cmp_once(op, x, y, unique_x, unique_y) ({ \
typeof(x) unique_x = (x); \
typeof(y) unique_y = (y); \
- __cmp(unique_x, unique_y, op); })
+ static_assert(__types_ok(x, y), \
+ #op "(" #x ", " #y ") signedness error, fix types or consider " #op "_unsigned() before " #op "_t()"); \
+ __cmp(op, unique_x, unique_y); })

-#define __careful_cmp(x, y, op) \
- __builtin_choose_expr(__safe_cmp(x, y), \
- __cmp(x, y, op), \
- __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op))
+#define __careful_cmp(op, x, y) \
+ __builtin_choose_expr(__is_constexpr((x) - (y)), \
+ __cmp(op, x, y), \
+ __cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))

#define __clamp(val, lo, hi) \
((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
@@ -44,17 +51,15 @@
typeof(val) unique_val = (val); \
typeof(lo) unique_lo = (lo); \
typeof(hi) unique_hi = (hi); \
+ static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)), \
+ (lo) <= (hi), true), \
+ "clamp() low limit " #lo " greater than high limit " #hi); \
+ static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error"); \
+ static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error"); \
__clamp(unique_val, unique_lo, unique_hi); })

-#define __clamp_input_check(lo, hi) \
- (BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
- __is_constexpr((lo) > (hi)), (lo) > (hi), false)))
-
#define __careful_clamp(val, lo, hi) ({ \
- __clamp_input_check(lo, hi) + \
- __builtin_choose_expr(__typecheck(val, lo) && __typecheck(val, hi) && \
- __typecheck(hi, lo) && __is_constexpr(val) && \
- __is_constexpr(lo) && __is_constexpr(hi), \
+ __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)), \
__clamp(val, lo, hi), \
__clamp_once(val, lo, hi, __UNIQUE_ID(__val), \
__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
@@ -64,14 +69,14 @@
* @x: first value
* @y: second value
*/
-#define min(x, y) __careful_cmp(x, y, <)
+#define min(x, y) __careful_cmp(min, x, y)

/**
* max - return maximum of two values of the same or compatible types
* @x: first value
* @y: second value
*/
-#define max(x, y) __careful_cmp(x, y, >)
+#define max(x, y) __careful_cmp(max, x, y)

/**
* min_unsigned - return minimum of two non-negative values
@@ -79,16 +84,16 @@
* @x: first value
* @y: second value
*/
-#define min_unsigned(x, y) \
- __careful_cmp((x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull, <)
+#define min_unsigned(x, y) \
+ __careful_cmp(min, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)

/**
* max_unsigned - return maximum of two non-negative values
* @x: first value
* @y: second value
*/
-#define max_unsigned(x, y) \
- __careful_cmp((x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull, >)
+#define max_unsigned(x, y) \
+ __careful_cmp(max, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)

/**
* min3 - return minimum of three values
@@ -140,7 +145,7 @@
* @x: first value
* @y: second value
*/
-#define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
+#define min_t(type, x, y) __careful_cmp(min, (type)(x), (type)(y))

/**
* max_t - return maximum of two values, using the specified type
@@ -148,7 +153,7 @@
* @x: first value
* @y: second value
*/
-#define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >)
+#define max_t(type, x, y) __careful_cmp(max, (type)(x), (type)(y))

/**
* clamp_t - return a value clamped to a given range using a given type
--
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-04 11:47:21

by David Laight

[permalink] [raw]

Subject: [PATCH v3 4/5] minmax: Allow comparisons of 'int' against 'unsigned char/short'.

Since 'unsigned char/short' get promoted to 'signed int' it is
safe to compare them against an 'int' value.

Signed-off-by: David Laight <[email protected]>
---
v3; No change
v2: No change
include/linux/minmax.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index ccfbe003a643..f56611ab486a 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -24,8 +24,9 @@
__builtin_choose_expr(__is_constexpr(is_signed_type(typeof(x))), \
is_signed_type(typeof(x)), 0)

-#define __types_ok(x, y) \
- (__is_signed(x) == __is_signed(y))
+#define __types_ok(x, y) \
+ (__is_signed(x) == __is_signed(y) || \
+ __is_signed((x) + 0) == __is_signed((y) + 0))

#define __cmp_op_min <
#define __cmp_op_max >
--
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-04 11:56:39

by David Laight

[permalink] [raw]

Subject: [PATCH v3 1/5] minmax: Add min_unsigned(a, b) and max_unsigned(a, b)

These can be used when min()/max() errors a signed v unsigned
compare when the signed value is known to be non-negative.

Unlike min_t(some_unsigned_type, a, b) min_unsigned() will never
mask off high bits if an inappropriate type is selected.

The '+ 0u + 0ul + 0ull' may look strange.
The '+ 0u' is needed for 'signed int' on 64bit systems.
The '+ 0ul' is needed for 'signed long' on 32bit systems.
The '+ 0ull' is needed for 'signed long long'.

Signed-off-by: David Laight <[email protected]>
---
v3: No change.
v2: Updated commit message.
include/linux/minmax.h | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 396df1121bff..531860e9cc55 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -73,6 +73,23 @@
*/
#define max(x, y) __careful_cmp(x, y, >)

+/**
+ * min_unsigned - return minimum of two non-negative values
+ * Signed types are zero extended to match a larger unsigned type.
+ * @x: first value
+ * @y: second value
+ */
+#define min_unsigned(x, y) \
+ __careful_cmp((x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull, <)
+
+/**
+ * max_unsigned - return maximum of two non-negative values
+ * @x: first value
+ * @y: second value
+ */
+#define max_unsigned(x, y) \
+ __careful_cmp((x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull, >)
+
/**
* min3 - return minimum of three values
* @x: first value
--
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-15 15:54:05

by David Laight

[permalink] [raw]

Subject: RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

From: Kees Cook
> Sent: 14 August 2023 22:21
>
> On Fri, Aug 04, 2023 at 10:50:59AM +0000, David Laight wrote:
> > [...]
> > I also suspect that many of the min_t(u16, ...) are actually wrong.
> > For example copy_data() in printk_ringbuffer.c contains:
> > data_size = min_t(u16, buf_size, len);
> > Here buf_size is 'unsigned int' and len 'u16', pass a 64k buffer
> > (can you prove that doesn't happen?) and no data is returned.
>
> Stars aligning... this exact bug (as you saw in the other thread[1]) got
> hit. And in the analysis, I came to the same conclusion: min_t() is a
> serious foot-gun, and we should be able to make min() Just Work in the
> most common situations.

It is all a question of what 'work' means.
To my mind (but Linus disagrees!) the only problematic case
is where a negative signed value gets converted to a large
unsigned value.
This snippet from do_tcp_getsockopt() shows what I mean:

copy_from_user(&len,...)
len = min_t(unsigned int, len, sizeof(int));

if (len < 0)
return -EINVAL;

That can clearly never return -EINVAL.
That has actually been broken since the test was added in 2.4.4.
That predates min_t() in 2.4.10 (renamed from min() in 2.4.9
when the 'strict typecheck' version on min() was added).
So min_t() actually predates min()!

> It seems like the existing type_max/type_min macros could be used to
> figure out that the args are safe to appropriately automatically cast,
> etc. e.g. type_max(u16) <= type_max(unsigned int) && type_min(u16) >=
> type_min(unsigned int) ...

That doesn't really help; min(a,b) is ok if any of:
1) is_signed(a) == is_signed(b).
2) is_signed(a + 0) == is_signed(b + 0) // Converts char/short to int.
3) a or b is a constant between 0 and MAXINT and is cast to int.

The one you found passes (1) - both types are unsigned.
min(len, sizeof (int)) passes (3) and is converted to
min(len, (int)sizeof (int)) and can still return the expected negatives.

(Clearly that getsockopt code only works for len != 4 on LE.)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-23 10:06:25

by David Laight

[permalink] [raw]

Subject: RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

From: Kees Cook <[email protected]>
> Sent: Monday, August 21, 2023 7:24 PM
....
> > > It seems like the existing type_max/type_min macros could be used to
> > > figure out that the args are safe to appropriately automatically cast,
> > > etc. e.g. type_max(u16) <= type_max(unsigned int) && type_min(u16) >=
> > > type_min(unsigned int) ...
> >
> > That doesn't really help; min(a,b) is ok if any of:
> > 1) is_signed(a) == is_signed(b).
> > 2) is_signed(a + 0) == is_signed(b + 0) // Converts char/short to int.
> > 3) a or b is a constant between 0 and MAXINT and is cast to int.
> >
> > The one you found passes (1) - both types are unsigned.
> > min(len, sizeof (int)) passes (3) and is converted to
> > min(len, (int)sizeof (int)) and can still return the expected negatives.
>
> It seems like the foot-gun problems are when a value gets clamped by the
> imposed type. Can't we just warn about those cases?

Also when -1 gets converted to 0xffffffff.

...
> But this is also unsafe:
>
> unsigned int a = ...;
> u16 b = ...;
> unsigned int c = min_t(u16, a, b);
>
> Both are unsigned, but "a > U16_MAX" still goes sideways.

Right, this is one reason why min_t() is broken.
If min() allowed that - no reason why it shouldn't - then it wouldn't
get written in the first place.

> I worry that weakening the min/max() type checking gets into silent errors:
>
> unsigned int a = ...;
> u16 b = ...;
> u16 c = max(a, b);
>
> when "a > U16_MAX".

Nothing can be done on the RHS to detect invalid narrowing in assignments.
And you don't want the compiler to complain because that will just cause
explicit casts be added making the code harder to read and (probably)
adding more bugs.

> Looking at warning about clamped types on min_t(), though I see tons of
> int vs unsigned int issue. (e.g. dealing with PAGE_SIZE vs an int).

Linus doesn't like me silently converting unsigned constants
to signed.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-23 20:19:07

by David Laight

[permalink] [raw]

Subject: RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

From: Linus Torvalds
> Sent: Tuesday, August 22, 2023 6:36 PM
>
> On Mon, 21 Aug 2023 at 11:24, Kees Cook <[email protected]> wrote:
> >
> > It seems like the foot-gun problems are when a value gets clamped by the
> > imposed type. Can't we just warn about those cases?
>
> I think that the problem with min_t() is that it is used as a random
> "when min() warns about different types", and that it basically just
> adds a cast without any semantic meaning.
>
> End result: we currently have 4500+ of those cases (and another 1300
> uses of 'max_t') and I bet that several of them are the narrowing
> kind. And some are probably valid.
>
> And if we tighten up "min_t()" type rules, we'll just hit the *next*
> problem in the series, namely "what are people going to do now?"
>
> We don't want to just keep pushing the problem down.
>
> So I actually mostly liked all the *other* patches in David's series:
> using 'min_unsigned()' and friends adds a nice *semantic* layer, not
> just a cast. And relaxing the checking of min/max doesn't cause the
> same the "push problems down" issue, as long as the relaxing is
> reasonable.
>
> (Side note: I'm not convinced 'min_unsigned()' is the right syntax.
> While I like the concept, I think 'min()' is often used as a part of
> other expressions, and 'min_unsigned()' ends up making for a very
> illegible long and complex thing. I think we might as well use
> 'umin()/umax()', matching our type system).

Picking a name is like painting the bike-shed :-)

> It's just that I very much don't think it's reasonable to relax "20u"
> (or - more commonly - sizeof) to be any random constant signed
> integer, and it should *not* compare well with signed integers and not
> silently become a signed compare.
>
> (But I do think that it's very ok to go the other way: compare a
> _unsigned_ value with a "signed" constant integer like 20. The two
> cases are not symmetrical: '20' can be a perfectly fine unsigned
> value, but '20u' cannot be treated signed).

I'll rebase the patch after the next -rc1 is out with that removed.

> And while I don't like David's patch to silently turn unsigned
> constant signed, I do acknowledge that very often the *source* of the
> unsignedness is a 'sizeof()' expression, and then you have an integer
> that gets compared to a size, and you end up using 'min_t()'. But I do
> *NOT* want to fix those cases by ignoring the signedness.
>
> Just a quick grep of
>
> git grep 'min_t(size_t' | wc
>
> shows that quite a lot of the 'min_t()' cases are this exact issue.
> But I absolutely do *not* think the solution is to relax 'min()'.

size_t is actually a problem with unsigned constants.
It is 'unsigned int' on 32bit and 'unsigned long' on 64bit.
Making it hard to use a literal.

> I suspect the fix to those cases is to much more eagerly use
> 'clamp()'. Almost every single time you do a "compare to a size", it
> really is "make sure the integer is within the size range". So while
>
> int val
> ...
> x = min(val,sizeof(xyz));
>
> is horrendously wrong and *should* warn,

The difficulty is getting people to correct it by changing
the type on 'val' to be unsigned.
Sometimes it is just a local variable.
Often it just can't ever be negative, or there is an immediately
preceding check.

Indeed if it is negative that is probably a bug.
(eg a missing error check from an earlier call.)
No amount of compile-time checking is going to help.
About the only thing that would help would be a run-time
check and a 'goto label' if negative.

There are plenty of places where a (short) buffer length is
passed to a function as 'int' - even though negative values
are completely invalid.
In reality the type change needs 'chasing back' through to
all the callers.

> I think doing
>
> x = clamp(val, 0, sizeof(xyz));
>
> is a *much* nicer model, and should not warn even if "val" and the
> upper bound do not agree. In the above kind of situation, suddenly it
> *is* ok to treat the 'sizeof()' as a signed integer, but only because
> we have that explicit lower bound too.

That would require major rework on clamp() :-)

It would also be nice to get clamp(unsigned_var, 0u, 20u) to
compile without the annoying warning from the compiler.
I think you have to use:
builtin_choose_expr(x, var, 0) >= builtin_choose_expr(x, low, 0)

> In other words: we should not "try to fix the types". That was what
> caused the problem in the first place, with "min_t()" just trying to
> fix the type mismatch with a cast. Casts are wrong, and we should have
> known that. The end result is horrendous, and I do agree with David on
> that too.

Some of the worst casts are the ones that are only there for sparse.
The compiler sees '(force __be32)var' as '(uint)var' rather than 'var'.
That really ought to always have been 'force(__be32, var)' so that
it can be made transparent to the compiler.

> I think we should strive to fix it with "semantic" fixes instead. Like
> the above "use clamp() instead of min(), and the fundamental
> signedness problem simply goes away because it has enough semantic
> meaning to be well-defined".

Unfortunately it adds an extra compare and branch to get mis-predicted.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-08-24 10:45:05

by David Laight

[permalink] [raw]

Subject: RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

From: Linus Torvalds
> Sent: Wednesday, August 23, 2023 4:32 PM
...
> That might get rid of a number of the more annoying cases.

The one it leaves is code like:

int length = get_length(...);
if (length <= 0)
return error:
do {
frag_len = some_min_function(length, PAGE_SIZE);
...
} while ((length -= frag_len) != 0);

As written it is ok for all reasonable some_min_function().

But if the (length <= 0) test is missing it really doesn't
matter what some_min_function() returns because the code
isn't going to do anything sensible - and may just loop.

About the only thing you could do is add a run-time check
and then BUG() if negative.
But that is horrid...

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)