2024-02-15 18:59:15

by Yueh-Shun Li

[permalink] [raw]
Subject: [PATCH v2] minmax: substitute local variables using __UNIQUE_ID()

Substitute identifier names of local variables used in macro
definitions inside minmax.h with those generated by __UNIQUE_ID(prefix)
to eliminate passible naming collisions.

Identifier names like __x, __y and __tmp are everywhere inside the
kernel source. This patch ensures that macros provided by minmax.h
will work even when identifiers of these names appear in the expanded
input arguments.

Signed-off-by: Yueh-Shun Li <[email protected]>
---
include/linux/minmax.h | 41 ++++++++++++++++++++++++++++-------------
1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 2ec559284a9f..df7e45106c3a 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -129,10 +129,14 @@
* @x: value1
* @y: value2
*/
-#define min_not_zero(x, y) ({ \
- typeof(x) __x = (x); \
- typeof(y) __y = (y); \
- __x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
+#define min_not_zero(x, y) \
+ __min_not_zero_impl(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y))
+#define __min_not_zero_impl(x, y, __x, __y) \
+ ({ \
+ typeof(x) __x = (x); \
+ typeof(y) __y = (y); \
+ __x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); \
+ })

/**
* clamp - return a value clamped to a given range with strict typechecking
@@ -185,13 +189,19 @@
* typeof() keeps the const qualifier. Use __unqual_scalar_typeof() in order
* to discard the const qualifier for the __element variable.
*/
-#define __minmax_array(op, array, len) ({ \
- typeof(&(array)[0]) __array = (array); \
- typeof(len) __len = (len); \
- __unqual_scalar_typeof(__array[0]) __element = __array[--__len];\
- while (__len--) \
- __element = op(__element, __array[__len]); \
- __element; })
+#define __minmax_array(op, array, len) \
+ __minmax_array_impl(op, array, len, __UNIQUE_ID(__array), \
+ __UNIQUE_ID(__len), __UNIQUE_ID(__element))
+#define __minmax_array_impl(op, array, len, __array, __len, __element) \
+ ({ \
+ typeof(&(array)[0]) __array = (array); \
+ typeof(len) __len = (len); \
+ __unqual_scalar_typeof(__array[0]) \
+ __element = __array[--__len]; \
+ while (__len--) \
+ __element = op(__element, __array[__len]); \
+ __element; \
+ })

/**
* min_array - return minimum of values present in an array
@@ -267,7 +277,12 @@ static inline bool in_range32(u32 val, u32 start, u32 len)
* @a: first value
* @b: second value
*/
-#define swap(a, b) \
- do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
+#define swap(a, b) __swap_impl(a, b, __UNIQUE_ID(__tmp))
+#define __swap_impl(a, b, __tmp) \
+ do { \
+ typeof(a) __tmp = (a); \
+ (a) = (b); \
+ (b) = __tmp; \
+ } while (0)

#endif /* _LINUX_MINMAX_H */
--
2.42.0



2024-02-15 22:07:14

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2] minmax: substitute local variables using __UNIQUE_ID()

On Thu, 15 Feb 2024 18:58:15 +0000 Yueh-Shun Li <[email protected]> wrote:

> Substitute identifier names of local variables used in macro
> definitions inside minmax.h with those generated by __UNIQUE_ID(prefix)
> to eliminate passible naming collisions.
>
> Identifier names like __x, __y and __tmp are everywhere inside the
> kernel source. This patch ensures that macros provided by minmax.h
> will work even when identifiers of these names appear in the expanded
> input arguments.

Makes sense I guess. However I do wonder how far this goes:

# grep typeof include/linux/*.h | wc -l
313

Many of these are locals being defined within macros. Do they all need
changing? If so, do we really want to implement this fix for what has
always been, to my knowledge, a non-problem?


2024-02-17 20:02:52

by Yueh-Shun Li

[permalink] [raw]
Subject: Re: [PATCH v2] minmax: substitute local variables using __UNIQUE_ID()

On 2024-02-16 06:07, Andrew Morton wrote:
> On Thu, 15 Feb 2024 18:58:15 +0000 Yueh-Shun Li
> <[email protected]> wrote:
>
>> Substitute identifier names of local variables used in macro
>> definitions inside minmax.h with those generated by
>> __UNIQUE_ID(prefix)
>> to eliminate passible naming collisions.
>>
>> Identifier names like __x, __y and __tmp are everywhere inside the
>> kernel source. This patch ensures that macros provided by minmax.h
>> will work even when identifiers of these names appear in the expanded
>> input arguments.
>
> Makes sense I guess. However I do wonder how far this goes:
>
> # grep typeof include/linux/*.h | wc -l
> 313
>
> Many of these are locals being defined within macros. Do they all need
> changing?

Regarding the extent of changes needed, you raise a valid point.
Searching in include/linux with regular expression
`typeof\([A-Za-z0-9]+\) __`, there are about 20 header files contain
macros with temporary variables prior to this patch, and 10 of them
(including minmax.h) uses generic variable names such as __a, __x, __v,
__val, __p or __ptr.

> If so, do we really want to implement this fix for what has
> always been, to my knowledge, a non-problem?

While the occurrence of collision is minimal in the current Linux kernel
source, the presence of variable names prefixed with long underscores
hints the potential issue. Although Linux kernel coding style recommends
suffixing temporary variable names per-macro to avoid collisions[1],
prefix more underscore (_) before the colliding local variable name
seems to be a workaround that receives wider adoption. Examples include
____ptr in three include/linux/*.h files and ______r, ______f in
compiler.h. commit 24ba53017e18 ("rcu: Replace ________p1 and
_________p1 with __UNIQUE_ID(rcu)") shows how __UNIQUE_ID() could be a
more systematic solution.

A probable cause of __UNIQUE_ID() not being widely adopted as
long-underscore variables is that the __COUNTER__ support was not
available until to GCC 4 and Clang 2. It was until recently that
__UNIQUE_ID() provided by compiler.h was unified to use __COUNTER__ by
commit a8306f2d4dce ("compiler.h: unify __UNIQUE_ID").

This patch aims toward minmax.h because it provides general
functionality, such as minimum, maximum and variable swapping, that are
used across subsystems. As for other headers, those with long-underscore
names and those with wide application could be changed first.

Thank you for your valuable feedback.

Best regards,

Shamrock

[1]:
https://www.kernel.org/doc/html/latest/process/coding-style.html#macros-enums-and-rtl