2020-10-19 09:43:06

by Syed Nayyar Waris

Subject: [PATCH v12 0/4] Introduce the for_each_set_clump macro

This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size, but the new generic version works with clumps of any size from
1 to BITS_PER_LONG bits inclusive. A size less than 1 or greater than
BITS_PER_LONG causes undefined behaviour. The patchset uses the new
macro in some GPIO drivers.

The earlier 8-bit for_each_set_clump8 provided a for-loop syntax that
iterates over a memory region one 8-bit group at a time, visiting only
the groups that contain set bits.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example: 10111110 00000000 11111111 00110011
First loop: 10111110 00000000 11111111 XXXXXXXX
Second loop: 10111110 00000000 XXXXXXXX 00110011
Third loop: XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
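
The 8-bit walk above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel implementation: the helper name sketch_nonzero_clumps8() is hypothetical, and it operates on a single 32-bit word rather than on a bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): record the bit offset and value
 * of every 8-bit group in 'word' that has at least one set bit, lowest
 * group first. Returns the number of groups recorded (at most 4).
 */
static size_t sketch_nonzero_clumps8(uint32_t word, unsigned int offsets[4],
				     uint8_t clumps[4])
{
	size_t n = 0;

	assert(offsets && clumps);

	for (unsigned int off = 0; off < 32; off += 8) {
		uint8_t c = (word >> off) & 0xff;

		if (!c)
			continue;	/* skip groups with no set bit */
		offsets[n] = off;
		clumps[n] = c;
		n++;
	}
	return n;
}
```

For the example word above (0xBE00FF33), the sketch visits the groups at offsets 0, 8 and 24 and skips the all-zero group at offset 16.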

With the new for_each_set_clump, the clump size need not be 8 bits.
Moreover, a clump can be split across a word boundary when the word
size is not a multiple of the clump size. The following examples show how
the new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

/* bitmap memory region */
0x00aa0000ff000000; /* Most significant bits */
0xaaaaaa0000ff0000;
0x000000aa000000aa;
0xbbbbabcdeffedcba; /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first: offset: 0 clump: 0xfedcba
Iteration second: offset: 24 clump: 0xabcdef
Iteration third: offset: 48 clump: 0xaabbbb
Iteration fourth: offset: 96 clump: 0xaa
Iteration fifth: offset: 144 clump: 0xff
Iteration sixth: offset: 168 clump: 0xaaaaaa
Iteration seventh: offset: 216 clump: 0xff
The loop ends here because the remaining bits (0x00aa) are fewer than
the clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
clumps of 24 zero bits in between were skipped (preserving the previous
for_each_set_clump8 behaviour).
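
Example 1 can be reproduced with a small userspace sketch. It assumes 64-bit words (uint64_t stands in for unsigned long), and the names sketch_get_value() and sketch_next_clump() are hypothetical. Unlike the patch, which uses find_next_bit() plus rounddown(), this sketch simply steps through clump-aligned offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read an nbits-wide value at bit offset 'start'; it may span two words. */
static uint64_t sketch_get_value(const uint64_t *map, size_t start,
				 unsigned int nbits)
{
	const size_t index = start / 64;
	const unsigned int offset = start % 64;
	const unsigned int space = 64 - offset;	/* bits left in this word */
	const uint64_t mask = nbits == 64 ? UINT64_MAX
					  : (UINT64_C(1) << nbits) - 1;
	uint64_t value;

	assert(nbits >= 1 && nbits <= 64);

	value = map[index] >> offset;
	if (space < nbits)		/* remainder lives in the next word */
		value |= map[index + 1] << space;
	return value & mask;
}

/*
 * Return the offset of the first non-zero clump at or after 'offset'
 * (which must be clump-aligned), or 'size' if there is none.
 */
static size_t sketch_next_clump(uint64_t *clump, const uint64_t *map,
				size_t size, size_t offset,
				unsigned int clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		uint64_t v = sketch_get_value(map, offset, clump_size);

		if (v) {
			*clump = v;
			return offset;
		}
	}
	return size;
}
```

With the Example 1 bitmap stored least significant word first and a size of 240 bits (10 clumps of 24 bits), sketch_get_value(map, 48, 24) combines 0xbbbb from the first word with 0xaa from the second to form the split clump 0xaabbbb.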

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

/* bitmap memory region */
0x00aa0000ff000000; /* Most significant bits */
0xaaaaaa0000ff0000;
0x0f00000000000000;
0x0000000000000ac0; /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first: offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v12:
- [Patch 1/4]: Format and modify comments.
- [Patch 1/4]: Optimize code using '<<' operator with GENMASK.
- [Patch 4/4]: Remove extra empty newline.

Changes in v11:
- [Patch 1/4]: Document range of values 'nbits' can take.
- [Patch 4/4]: Change variable name 'flag' to 'flags'.

Changes in v10:
- Patchset based on v5.9-rc1.

Changes in v9:
- [Patch 4/4]: Remove looping of 'for_each_set_clump' and instead process two
halves of a 64-bit bitmap separately or individually. Use normal spin_lock
call for second inner lock. And take the spin_lock_init call outside the 'if'
condition in the probe function of driver.

Changes in v8:
- [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
in 'clump_test_data' array.

Changes in v7:
- [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
definition and test data.

Changes in v6:
- [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
succinct.

Changes in v5:
- [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
- [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
- [Patch 3/4]: Minor change: Inline value for better code readability.
- [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
- [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
in function thunderx_gpio_set_multiple.

Changes in v2:
- [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data as
function parameters.
- [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
bitops: Introduce the for_each_set_clump macro
lib/test_bitmap.c: Add for_each_set_clump test cases
gpio: thunderx: Utilize for_each_set_clump macro
gpio: xilinx: Utilize generic bitmap_get_value and _set_value

drivers/gpio/gpio-thunderx.c | 11 ++-
drivers/gpio/gpio-xilinx.c | 65 +++++++-------
include/asm-generic/bitops/find.h | 19 ++++
include/linux/bitmap.h | 61 +++++++++++++
include/linux/bitops.h | 13 +++
lib/find_bit.c | 14 +++
lib/test_bitmap.c | 144 ++++++++++++++++++++++++++++++
7 files changed, 290 insertions(+), 37 deletions(-)


base-commit: 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5
--
2.26.2


2020-10-19 09:43:58

by Syed Nayyar Waris

Subject: [PATCH v12 1/4] bitops: Introduce the for_each_set_clump macro

This macro iterates for each group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n-bits in a bitmap memory region.
The n-bit value can have any size from 1 to BITS_PER_LONG. A size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
extend past a word boundary, it is split so that one portion is stored
in that word and the remaining portion is stored in the next higher
word. The same splitting applies when retrieving a value from the bitmap.
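
The split on the set side can be illustrated with a userspace sketch. It assumes 64-bit words, and sketch_set_value() is a hypothetical name; it mirrors the idea described above rather than the exact kernel code in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low 'nbits' of 'value' at bit offset 'start' in 'map'. */
static void sketch_set_value(uint64_t *map, uint64_t value, size_t start,
			     unsigned int nbits)
{
	const size_t index = start / 64;
	const unsigned int offset = start % 64;
	const unsigned int space = 64 - offset;	/* bits left in this word */
	const uint64_t mask = nbits == 64 ? UINT64_MAX
					  : (UINT64_C(1) << nbits) - 1;

	assert(nbits >= 1 && nbits <= 64);

	value &= mask;

	/* Low portion: clear the target bits, then OR in the new ones. */
	map[index] &= ~(mask << offset);
	map[index] |= value << offset;

	/* High portion: only when the value crosses the word boundary. */
	if (space < nbits) {
		map[index + 1] &= ~(mask >> space);
		map[index + 1] |= value >> space;
	}
}
```

For instance, storing the 24-bit value 0xabcdef at offset 56 of a zeroed two-word map places 0xef in the top byte of the first word and 0xabcd in the low 16 bits of the second.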

Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
Changes in v12:
- Format and modify comments.
- Optimize code using '<<' operator with GENMASK.

Changes in v11:
- Document valid range of values that 'nbits' can take.

Changes in v10:
- No change.

Changes in v9:
- No change.

Changes in v8:
- No change.

Changes in v7:
- No change.

Changes in v6:
- No change.

Changes in v5:
- No change.

Changes in v4:
- No change.

Changes in v3:
- No change.

Changes in v2:
- No change.

include/asm-generic/bitops/find.h | 19 ++++++++++
include/linux/bitmap.h | 61 +++++++++++++++++++++++++++++++
include/linux/bitops.h | 13 +++++++
lib/find_bit.c | 14 +++++++
4 files changed, 107 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
#define find_first_clump8(clump, bits, size) \
find_next_clump8((clump), (bits), (size), 0)

+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+ find_next_clump((clump), (bits), (size), 0, (clump_size))
+
#endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..2ee934484532 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
* bitmap_from_arr32(dst, buf, nbits) Copy nbits from u32[] buf to dst
* bitmap_to_arr32(buf, src, nbits) Copy nbits from buf to u32[] dst
* bitmap_get_value8(map, start) Get 8bit value from map at start
+ * bitmap_get_value(map, start, nbits) Get bit value of size
+ * 'nbits' from map at start
* bitmap_set_value8(map, value, start) Set 8bit value to map at start
+ * bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
+ * of map at start
*
* Note, bitmap_zero() and bitmap_fill() operate over the region of
* unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,34 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
return (map[index] >> offset) & 0xFF;
}

+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned long offset = start % BITS_PER_LONG;
+ const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+ const unsigned long space = ceiling - start;
+ unsigned long value_low, value_high;
+
+ if (space >= nbits)
+ return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+ else {
+ value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+ value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+ return (value_low >> offset) | (value_high << space);
+ }
+}
+
/**
* bitmap_set_value8 - set an 8-bit value within a memory region
* @map: address to the bitmap memory region
@@ -579,6 +611,35 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
map[index] |= value << offset;
}

+/**
+ * bitmap_set_value - set n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value of nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ */
+static inline void bitmap_set_value(unsigned long *map,
+ unsigned long value,
+ unsigned long start, unsigned long nbits)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned long offset = start % BITS_PER_LONG;
+ const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+ const unsigned long space = ceiling - start;
+
+ value &= GENMASK(nbits - 1, 0);
+
+ if (space >= nbits) {
+ map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
+ map[index] |= value << offset;
+ } else {
+ map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+ map[index + 0] |= value << offset;
+ map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+ map[index + 1] |= value >> space;
+ }
+}
+
#endif /* __ASSEMBLY__ */

#endif /* __LINUX_BITMAP_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 99f2ac30b1d9..36a445e4a7cc 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -62,6 +62,19 @@ extern unsigned long __sw_hweight64(__u64 w);
(start) < (size); \
(start) = find_next_clump8(&(clump), (bits), (size), (start) + 8))

+/**
+ * for_each_set_clump - iterate over bitmap for each clump with set bits
+ * @start: bit offset to start search and to store the current iteration offset
+ * @clump: location to store copy of current clump
+ * @bits: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ * @clump_size: clump size in bits
+ */
+#define for_each_set_clump(start, clump, bits, size, clump_size) \
+ for ((start) = find_first_clump(&(clump), (bits), (size), (clump_size)); \
+ (start) < (size); \
+ (start) = find_next_clump(&(clump), (bits), (size), (start) + (clump_size), (clump_size)))
+
static inline int get_bitmask_order(unsigned int count)
{
int order;
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 49f875f1baf7..1341bd39b32a 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -190,3 +190,17 @@ unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
return offset;
}
EXPORT_SYMBOL(find_next_clump8);
+
+unsigned long find_next_clump(unsigned long *clump, const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size)
+{
+ offset = find_next_bit(addr, size, offset);
+ if (offset == size)
+ return size;
+
+ offset = rounddown(offset, clump_size);
+ *clump = bitmap_get_value(addr, offset, clump_size);
+ return offset;
+}
+EXPORT_SYMBOL(find_next_clump);
--
2.26.2

2020-10-19 09:46:05

by Syed Nayyar Waris

Subject: [PATCH v12 2/4] lib/test_bitmap.c: Add for_each_set_clump test cases

The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
clump sizes of 8 bits, 24 bits, 30 bits and 6 bits. The cases cover
clumps that are split at a word boundary as well as zeroes at the
start and in the middle of the bitmap.

Signed-off-by: Syed Nayyar Waris <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
Changes in v12:
- No change.

Changes in v11:
- No change.

Changes in v10:
- No change.

Changes in v9:
- No change.

Changes in v8:
- [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
in 'clump_test_data' array.

Changes in v7:
- Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
definition and test data.

Changes in v6:
- Make 'for loop' inside 'test_for_each_set_clump' more succinct.

Changes in v5:
- No change.

Changes in v4:
- Use 'for' loop in test function of 'for_each_set_clump'.

Changes in v3:
- No change.

Changes in v2:
- Unify different tests for 'for_each_set_clump'. Pass test data as
function parameters.
- Remove unnecessary bitmap_zero calls.

lib/test_bitmap.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 144 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index df903c53952b..cb2cf3858f93 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -155,6 +155,37 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
return true;
}

+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+ const unsigned int offset,
+ const unsigned int size,
+ const unsigned long *const clump_exp,
+ const unsigned long *const clump,
+ const unsigned long clump_size)
+{
+ unsigned long exp;
+
+ if (offset >= size) {
+ pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+ srcfile, line, size, offset);
+ return false;
+ }
+
+ exp = clump_exp[offset / clump_size];
+ if (!exp) {
+ pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+ srcfile, line, offset);
+ return false;
+ }
+
+ if (*clump != exp) {
+ pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+ srcfile, line, exp, *clump);
+ return false;
+ }
+
+ return true;
+}
+
#define __expect_eq(suffix, ...) \
({ \
int result = 0; \
@@ -172,6 +203,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
#define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
#define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__)
#define expect_eq_clump8(...) __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...) __expect_eq(clump, ##__VA_ARGS__)

static void __init test_zero_clear(void)
{
@@ -577,6 +609,28 @@ static void noinline __init test_mem_optimisations(void)
}
}

+static const unsigned long clump_bitmap_data[] __initconst = {
+ 0x38000201,
+ 0x05ff0f38,
+ 0xeffedcba,
+ 0xbbbbabcd,
+ 0x000000aa,
+ 0x000000aa,
+ 0x00ff0000,
+ 0xaaaaaa00,
+ 0xff000000,
+ 0x00aa0000,
+ 0x00000000,
+ 0x00000000,
+ 0x00000000,
+ 0x0f000000,
+ 0x00ff0000,
+ 0xaaaaaa00,
+ 0xff000000,
+ 0x00aa0000,
+ 0x00000ac0,
+};
+
static const unsigned char clump_exp[] __initconst = {
0x01, /* 1 bit set */
0x02, /* non-edge 1 bit set */
@@ -588,6 +642,95 @@ static const unsigned char clump_exp[] __initconst = {
0x05, /* non-adjacent 2 bits set */
};

+static const unsigned long clump_exp1[] __initconst = {
+ 0x01, /* 1 bit set */
+ 0x02, /* non-edge 1 bit set */
+ 0x00, /* zero bits set */
+ 0x38, /* 3 bits set across 4-bit boundary */
+ 0x38, /* Repeated clump */
+ 0x0F, /* 4 bits set */
+ 0xFF, /* all bits set */
+ 0x05, /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+ 0xfedcba, /* 24 bits */
+ 0xabcdef,
+ 0xaabbbb, /* Clump split between 2 words */
+ 0x000000, /* zeroes in between */
+ 0x0000aa,
+ 0x000000,
+ 0x0000ff,
+ 0xaaaaaa,
+ 0x000000,
+ 0x0000ff,
+};
+
+static const unsigned long clump_exp3[] __initconst = {
+ 0x00000000, /* starting with 0s*/
+ 0x00000000, /* All 0s */
+ 0x00000000,
+ 0x00000000,
+ 0x3f00000f, /* Non zero set */
+ 0x2aa80003,
+ 0x00000aaa,
+ 0x00003fc0,
+};
+
+static const unsigned long clump_exp4[] __initconst = {
+ 0x00,
+ 0x2b,
+};
+
+struct clump_test_data_params {
+ DECLARE_BITMAP(data, 256);
+ unsigned long count;
+ unsigned long offset;
+ unsigned long limit;
+ unsigned long clump_size;
+ unsigned long const *exp;
+};
+
+static struct clump_test_data_params clump_test_data[] __initdata =
+ { {{0}, 2, 0, 64, 8, clump_exp1},
+ {{0}, 8, 2, 240, 24, clump_exp2},
+ {{0}, 8, 10, 240, 30, clump_exp3},
+ {{0}, 1, 18, 18, 6, clump_exp4} };
+
+static void __init prepare_test_data(unsigned int index)
+{
+ int i;
+ unsigned long width = 0;
+
+ for(i = 0; i < clump_test_data[index].count; i++)
+ {
+ bitmap_set_value(clump_test_data[index].data,
+ clump_bitmap_data[(clump_test_data[index].offset)++], width, 32);
+ width += 32;
+ }
+}
+
+static void __init execute_for_each_set_clump_test(unsigned int index)
+{
+ unsigned long start, clump;
+
+ for_each_set_clump(start, clump, clump_test_data[index].data,
+ clump_test_data[index].limit,
+ clump_test_data[index].clump_size)
+ expect_eq_clump(start, clump_test_data[index].limit, clump_test_data[index].exp,
+ &clump, clump_test_data[index].clump_size);
+}
+
+static void __init test_for_each_set_clump(void)
+{
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(clump_test_data); i++) {
+ prepare_test_data(i);
+ execute_for_each_set_clump_test(i);
+ }
+}
+
static void __init test_for_each_set_clump8(void)
{
#define CLUMP_EXP_NUMBITS 64
@@ -680,6 +823,7 @@ static void __init selftest(void)
test_bitmap_parselist_user();
test_mem_optimisations();
test_for_each_set_clump8();
+ test_for_each_set_clump();
test_bitmap_cut();
}

--
2.26.2

2020-10-19 09:48:29

by Syed Nayyar Waris

Subject: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now handle one channel at a time
and save cycles.

Cc: Bartosz Golaszewski <[email protected]>
Cc: Michal Simek <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
Changes in v12:
- Remove extra empty newline.

Changes in v11:
- Change variable name 'flag' to 'flags'.

Changes in v10:
- No change.

Changes in v9:
- Remove looping of 'for_each_set_clump' and instead process two
halves of a 64-bit bitmap separately or individually. Use normal spin_lock
call for second inner lock. And take the spin_lock_init call outside the 'if'
condition in the 'probe' function of driver.

Changes in v8:
- No change.

Changes in v7:
- No change.

Changes in v6:
- No change.

Changes in v5:
- Minor change: Inline values '32' and '64' in code for better
code readability.

Changes in v4:
- Minor change: Inline values '32' and '64' in code for better
code readability.

Changes in v3:
- No change.

Changes in v2:
- No change.

drivers/gpio/gpio-xilinx.c | 65 +++++++++++++++++++-------------------
1 file changed, 32 insertions(+), 33 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..3ba1a993c85e 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -138,37 +138,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
{
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
- int index = xgpio_index(chip, 0);
- int offset, i;
-
- spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
- /* Write to GPIO signals */
- for (i = 0; i < gc->ngpio; i++) {
- if (*mask == 0)
- break;
- /* Once finished with an index write it out to the register */
- if (index != xgpio_index(chip, i)) {
- xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
- index * XGPIO_CHANNEL_OFFSET,
- chip->gpio_state[index]);
- spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
- index = xgpio_index(chip, i);
- spin_lock_irqsave(&chip->gpio_lock[index], flags);
- }
- if (__test_and_clear_bit(i, mask)) {
- offset = xgpio_offset(chip, i);
- if (test_bit(i, bits))
- chip->gpio_state[index] |= BIT(offset);
- else
- chip->gpio_state[index] &= ~BIT(offset);
- }
- }
-
- xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
- index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
- spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+ u32 *const state = chip->gpio_state;
+ unsigned int *const width = chip->gpio_width;
+
+ DECLARE_BITMAP(old, 64);
+ DECLARE_BITMAP(new, 64);
+ DECLARE_BITMAP(changed, 64);
+
+ spin_lock_irqsave(&chip->gpio_lock[0], flags);
+ spin_lock(&chip->gpio_lock[1]);
+
+ bitmap_set_value(old, state[0], 0, width[0]);
+ bitmap_set_value(old, state[1], width[0], width[1]);
+ bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+ bitmap_set_value(old, state[0], 0, 32);
+ bitmap_set_value(old, state[1], 32, 32);
+ state[0] = bitmap_get_value(new, 0, width[0]);
+ state[1] = bitmap_get_value(new, width[0], width[1]);
+ bitmap_set_value(new, state[0], 0, 32);
+ bitmap_set_value(new, state[1], 32, 32);
+ bitmap_xor(changed, old, new, 64);
+
+ if (((u32 *)changed)[0])
+ xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+ state[0]);
+ if (((u32 *)changed)[1])
+ xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+ XGPIO_CHANNEL_OFFSET, state[1]);
+
+ spin_unlock(&chip->gpio_lock[1]);
+ spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
}

/**
@@ -292,6 +292,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;

spin_lock_init(&chip->gpio_lock[0]);
+ spin_lock_init(&chip->gpio_lock[1]);

if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -313,8 +314,6 @@ static int xgpio_probe(struct platform_device *pdev)
if (of_property_read_u32(np, "xlnx,gpio2-width",
&chip->gpio_width[1]))
chip->gpio_width[1] = 32;
-
- spin_lock_init(&chip->gpio_lock[1]);
}

chip->gc.base = -1;
--
2.26.2

2020-10-19 19:54:57

by Syed Nayyar Waris

Subject: [PATCH v12 3/4] gpio: thunderx: Utilize for_each_set_clump macro

This patch reimplements the thunderx_gpio_set_multiple() function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple(),
we can now skip banks whose mask has no set bits and save cycles.

Cc: Robert Richter <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
Changes in v12:
- No change.

Changes in v11:
- No change.

Changes in v10:
- No change.

Changes in v9:
- No change.

Changes in v8:
- No change.

Changes in v7:
- No change.

Changes in v6:
- No change.

Changes in v5:
- No change.

Changes in v4:
- Minor change: Inline value '64' in code for better code readability.

Changes in v3:
- Change datatype of some variables from u64 to unsigned long
in function thunderx_gpio_set_multiple.

Changes in v2:
- No change.

drivers/gpio/gpio-thunderx.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..58c9bb25a377 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *bits)
{
int bank;
- u64 set_bits, clear_bits;
+ unsigned long set_bits, clear_bits, gpio_mask;
+ unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);

- for (bank = 0; bank <= chip->ngpio / 64; bank++) {
- set_bits = bits[bank] & mask[bank];
- clear_bits = ~bits[bank] & mask[bank];
+ for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+ bank = offset / 64;
+ set_bits = bits[bank] & gpio_mask;
+ clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
--
2.26.2

2020-10-29 22:49:15

by Arnd Bergmann

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
>
> This patch reimplements the xgpio_set_multiple() function in
> drivers/gpio/gpio-xilinx.c to use the new generic functions:
> bitmap_get_value() and bitmap_set_value(). The code is now simpler
> to read and understand. Moreover, instead of looping for each bit
> in xgpio_set_multiple() function, now we can check each channel at
> a time and save cycles.

This now causes -Wtype-limits warnings in linux-next with gcc-10:

> + u32 *const state = chip->gpio_state;
> + unsigned int *const width = chip->gpio_width;
> +
> + DECLARE_BITMAP(old, 64);
> + DECLARE_BITMAP(new, 64);
> + DECLARE_BITMAP(changed, 64);
> +
> + spin_lock_irqsave(&chip->gpio_lock[0], flags);
> + spin_lock(&chip->gpio_lock[1]);
> +
> + bitmap_set_value(old, state[0], 0, width[0]);
> + bitmap_set_value(old, state[1], width[0], width[1]);

In file included from ../include/linux/cpumask.h:12,
from ../arch/x86/include/asm/cpumask.h:5,
from ../arch/x86/include/asm/msr.h:11,
from ../arch/x86/include/asm/processor.h:22,
from ../arch/x86/include/asm/timex.h:5,
from ../include/linux/timex.h:65,
from ../include/linux/time32.h:13,
from ../include/linux/time.h:73,
from ../include/linux/stat.h:19,
from ../include/linux/module.h:13,
from ../drivers/gpio/gpio-xilinx.c:11:
../include/linux/bitmap.h:639:18: warning: array subscript [1,
67108864] is outside array bounds of 'long unsigned int[1]'
[-Warray-bounds]
639 | map[index + 1] |= value >> space;
| ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from ../include/linux/kasan-checks.h:5,
from ../include/asm-generic/rwonce.h:26,
from ./arch/x86/include/generated/asm/rwonce.h:1,
from ../include/linux/compiler.h:246,
from ../include/linux/build_bug.h:5,
from ../include/linux/bits.h:22,
from ../include/linux/bitops.h:6,
from ../drivers/gpio/gpio-xilinx.c:8:
../drivers/gpio/gpio-xilinx.c:144:17: note: while referencing 'old'
144 | DECLARE_BITMAP(old, 64);
| ^~~
../include/linux/types.h:11:16: note: in definition of macro 'DECLARE_BITMAP'
11 | unsigned long name[BITS_TO_LONGS(bits)]
| ^~~~
In file included from ../include/linux/cpumask.h:12,
from ../arch/x86/include/asm/cpumask.h:5,
from ../arch/x86/include/asm/msr.h:11,
from ../arch/x86/include/asm/processor.h:22,
from ../arch/x86/include/asm/timex.h:5,
from ../include/linux/timex.h:65,
from ../include/linux/time32.h:13,
from ../include/linux/time.h:73,
from ../include/linux/stat.h:19,
from ../include/linux/module.h:13,
from ../drivers/gpio/gpio-xilinx.c:11:

The compiler clearly tries to do range-checking here and notices
that the index into the fixed-length array on the stack is not correctly
bounded. It seems this would happen whenever width[0] + width[1]
is larger than 64.

I have just submitted patches for all other -Wtype-limits warnings
and would like to enable this option by default. Can you try to find
a way to make this code safer? I would expect that you need a
variant of bitmap_set_value() that takes an explicit ceiling here,
and checks the start and nbits values against that.

Arnd

2020-11-01 15:03:34

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> >
> > This patch reimplements the xgpio_set_multiple() function in
> > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > to read and understand. Moreover, instead of looping for each bit
> > in xgpio_set_multiple() function, now we can check each channel at
> > a time and save cycles.
>
> This now causes -Wtype-limits warnings in linux-next with gcc-10:

Hi Arnd,

What version of gcc-10 are you running? I'm having trouble generating
these warnings so I suspect I'm using a different version than you.

Regardless I can see your concern about the code, and I think I have a
solution.

>
> > + u32 *const state = chip->gpio_state;
> > + unsigned int *const width = chip->gpio_width;
> > +
> > + DECLARE_BITMAP(old, 64);
> > + DECLARE_BITMAP(new, 64);
> > + DECLARE_BITMAP(changed, 64);
> > +
> > + spin_lock_irqsave(&chip->gpio_lock[0], flags);
> > + spin_lock(&chip->gpio_lock[1]);
> > +
> > + bitmap_set_value(old, state[0], 0, width[0]);
> > + bitmap_set_value(old, state[1], width[0], width[1]);
>
> In file included from ../include/linux/cpumask.h:12,
> from ../arch/x86/include/asm/cpumask.h:5,
> from ../arch/x86/include/asm/msr.h:11,
> from ../arch/x86/include/asm/processor.h:22,
> from ../arch/x86/include/asm/timex.h:5,
> from ../include/linux/timex.h:65,
> from ../include/linux/time32.h:13,
> from ../include/linux/time.h:73,
> from ../include/linux/stat.h:19,
> from ../include/linux/module.h:13,
> from ../drivers/gpio/gpio-xilinx.c:11:
> ../include/linux/bitmap.h:639:18: warning: array subscript [1,
> 67108864] is outside array bounds of 'long unsigned int[1]'
> [-Warray-bounds]
> 639 | map[index + 1] |= value >> space;
> | ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
> In file included from ../include/linux/kasan-checks.h:5,
> from ../include/asm-generic/rwonce.h:26,
> from ./arch/x86/include/generated/asm/rwonce.h:1,
> from ../include/linux/compiler.h:246,
> from ../include/linux/build_bug.h:5,
> from ../include/linux/bits.h:22,
> from ../include/linux/bitops.h:6,
> from ../drivers/gpio/gpio-xilinx.c:8:
> ../drivers/gpio/gpio-xilinx.c:144:17: note: while referencing 'old'
> 144 | DECLARE_BITMAP(old, 64);
> | ^~~
> ../include/linux/types.h:11:16: note: in definition of macro 'DECLARE_BITMAP'
> 11 | unsigned long name[BITS_TO_LONGS(bits)]
> | ^~~~
> In file included from ../include/linux/cpumask.h:12,
> from ../arch/x86/include/asm/cpumask.h:5,
> from ../arch/x86/include/asm/msr.h:11,
> from ../arch/x86/include/asm/processor.h:22,
> from ../arch/x86/include/asm/timex.h:5,
> from ../include/linux/timex.h:65,
> from ../include/linux/time32.h:13,
> from ../include/linux/time.h:73,
> from ../include/linux/stat.h:19,
> from ../include/linux/module.h:13,
> from ../drivers/gpio/gpio-xilinx.c:11:
>
> The compiler clearly tries to do range-checking here and notices
> that the index into the fixed-length array on the stack is not correctly
> bounded. It seems this would happen whenever width[0] + width[1]
> is larger than 64.
>
> I have just submitted patches for all other -Wtype-limits warnings
> and would like to enable this option by default. Can you try to find
> a way to make this code safer? I would expect that you need a
> variant of bitmap_set_value() that takes an explicit ceiling here,
> and checks the start and nbits values against that.
>
> Arnd

Let me first verify that I understand the problem correctly. The issue
is the possibility of a stack smash in bitmap_set_value() when the value
of start + nbits is larger than the length of the map bitmap memory
region. This is because index (or index + 1) could be outside the range
of the bitmap memory region passed in as map. Is my understanding
correct here?

In xgpio_set_multiple(), the variables width[0] and width[1] serve as
possible start and nbits values for the bitmap_set_value() calls.
Because width[0] and width[1] are unsigned int variables, GCC considers
the possibility that the value of width[0]/width[1] might exceed the
length of the bitmap memory region named old and thus result in a stack
smash.

I don't know if invalid width values are actually possible for the
Xilinx gpio device, but let's err on the side of safety and assume this
is actually a possibility. We should verify that the combined value of
gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
check for this in xgpio_probe() when we grab the gpio_width values.
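
A minimal sketch of such a probe-time check, written as standalone C so it can be compiled outside the kernel. The names xgpio_widths_valid(), xgpio_total_width() and XGPIO_MAX_BITS are hypothetical and not part of the driver; in xgpio_probe() a failing check would presumably turn into an error return such as -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit: 'old' is declared with DECLARE_BITMAP(old, 64). */
#define XGPIO_MAX_BITS 64u

/*
 * Hypothetical helper: true when both channel widths fit together in
 * the 64-bit state bitmap. Written so that w0 + w1 is never computed
 * before it is known not to wrap around.
 */
static bool xgpio_widths_valid(unsigned int w0, unsigned int w1)
{
	if (w0 > XGPIO_MAX_BITS)
		return false;
	return w1 <= XGPIO_MAX_BITS - w0;
}

/* Total number of GPIO lines; callers are expected to validate first. */
static unsigned int xgpio_total_width(unsigned int w0, unsigned int w1)
{
	assert(xgpio_widths_valid(w0, w1));
	return w0 + w1;
}
```

A check like this runs once at probe time, so it adds nothing to the cost of xgpio_set_multiple() itself.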

However, we're still left with the GCC warnings because GCC is not smart
enough to know that we've already checked the boundary and width[0] and
width[1] are valid values. I suspect we can avoid this warning if we
refactor bitmap_set_value() to increment map separately and then set it:

static inline void bitmap_set_value(unsigned long *map,
				    unsigned long value,
				    unsigned long start, unsigned long nbits)
{
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
	const unsigned long space = ceiling - start;

	map += BIT_WORD(start);
	value &= GENMASK(nbits - 1, 0);

	if (space >= nbits) {
		*map &= ~(GENMASK(nbits - 1, 0) << offset);
		*map |= value << offset;
	} else {
		*map &= ~BITMAP_FIRST_WORD_MASK(start);
		*map |= value << offset;
		map++;
		*map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
		*map |= value >> space;
	}
}

This avoids adding a costly conditional check inside bitmap_set_value()
when almost all bitmap_set_value() calls will have static arguments with
well-defined and obvious boundaries.
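
The word-straddling case that triggers the warning can be made concrete with a userspace sketch of the same logic on 64-bit words. The names set_value(), genmask() and WORD_BITS are stand-ins invented for this illustration, not kernel symbols, and the asserts exist only to document the preconditions.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u

/* Bits [h:l] set; userspace stand-in for the kernel's GENMASK(). */
static uint64_t genmask(unsigned int h, unsigned int l)
{
	assert(l <= h && h < WORD_BITS);
	return (~UINT64_C(0) >> (WORD_BITS - 1 - h)) & (~UINT64_C(0) << l);
}

static void set_value(uint64_t *map, uint64_t value,
		      unsigned int start, unsigned int nbits)
{
	const unsigned int offset = start % WORD_BITS;
	/* Bits available in the first word, i.e. ceiling - start. */
	const unsigned int space = WORD_BITS - offset;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	map += start / WORD_BITS;
	value &= genmask(nbits - 1, 0);

	if (space >= nbits) {
		/* The value fits entirely in one word. */
		*map &= ~(genmask(nbits - 1, 0) << offset);
		*map |= value << offset;
	} else {
		/* Low 'space' bits go to the top of the first word... */
		*map &= ~(~UINT64_C(0) << offset);
		*map |= value << offset;
		map++;
		/* ...and the remaining bits to the bottom of the next. */
		*map &= ~genmask(nbits - space - 1, 0);
		*map |= value >> space;
	}
}
```

With start = 60 and nbits = 8, space is 4, so the else branch runs and the second word is written. That second write is the access GCC cannot prove to be in bounds for a one-word bitmap.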

Do you think this would be an acceptable solution to resolve your GCC
warnings?

Sincerely,

William Breathitt Gray



2020-11-01 20:11:18

by Arnd Bergmann

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
<[email protected]> wrote:
>
> On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> > >
> > > This patch reimplements the xgpio_set_multiple() function in
> > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > to read and understand. Moreover, instead of looping for each bit
> > > in xgpio_set_multiple() function, now we can check each channel at
> > > a time and save cycles.
> >
> > This now causes -Wtype-limits warnings in linux-next with gcc-10:
>
> Hi Arnd,
>
> What version of gcc-10 are you running? I'm having trouble generating
> these warnings so I suspect I'm using a different version than you.

I originally saw it with the binaries from
https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
also been able to reproduce it with a minimal test case on the
binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n

> Let me first verify that I understand the problem correctly. The issue
> is the possibility of a stack smash in bitmap_set_value() when the value
> of start + nbits is larger than the length of the map bitmap memory
> region. This is because index (or index + 1) could be outside the range
> of the bitmap memory region passed in as map. Is my understanding
> correct here?

Yes, that seems to be the case here.

> In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> possible start and nbits values for the bitmap_set_value() calls.
> Because width[0] and width[1] are unsigned int variables, GCC considers
> the possibility that the value of width[0]/width[1] might exceed the
> length of the bitmap memory region named old and thus result in a stack
> smash.
>
> I don't know if invalid width values are actually possible for the
> Xilinx gpio device, but let's err on the side of safety and assume this
> is actually a possibility. We should verify that the combined value of
> gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> check for this in xgpio_probe() when we grab the gpio_width values.
>
> However, we're still left with the GCC warnings because GCC is not smart
> enough to know that we've already checked the boundary and width[0] and
> width[1] are valid values. I suspect we can avoid this warning if we
> refactor bitmap_set_value() to increment map separately and then set it:

As I understand it, part of the problem is that gcc sees the possible
range as being constrained by the operations on 'start' and 'nbits',
in particular the shift in BIT_WORD() that puts an upper bound on
the index, but then it sees that the upper bound is higher than the
upper bound of the array, i.e. element zero.

I added a check

if (start >= 64 || start + size >= 64) return;

in the godbolt.org testcase, which does help limit the start
index appropriately, but it is not sufficient to let the compiler
see that the 'if (space >= nbits) ' condition is guaranteed to
be true for all values here.

> static inline void bitmap_set_value(unsigned long *map,
> unsigned long value,
> unsigned long start, unsigned long nbits)
> {
> const unsigned long offset = start % BITS_PER_LONG;
> const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> const unsigned long space = ceiling - start;
>
> map += BIT_WORD(start);
> value &= GENMASK(nbits - 1, 0);
>
> if (space >= nbits) {
> *map &= ~(GENMASK(nbits - 1, 0) << offset);
> *map |= value << offset;
> } else {
> *map &= ~BITMAP_FIRST_WORD_MASK(start);
> *map |= value << offset;
> map++;
> *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> *map |= value >> space;
> }
> }
>
> This avoids adding a costly conditional check inside bitmap_set_value()
> when almost all bitmap_set_value() calls will have static arguments with
> well-defined and obvious boundaries.
>
> Do you think this would be an acceptable solution to resolve your GCC
> warnings?

Unfortunately, it does not seem to make a difference, as gcc still
knows that this compiles to the same result, and it produces the same
warning as before (see https://godbolt.org/z/rjx34r)

Arnd

2020-11-05 09:12:59

by Linus Walleij

Subject: Re: [PATCH v12 3/4] gpio: thunderx: Utilize for_each_set_clump macro

On Sun, Oct 18, 2020 at 11:41 PM Syed Nayyar Waris <[email protected]> wrote:

> This patch reimplements the thunderx_gpio_set_multiple function in
> drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
> Instead of looping for each bank in thunderx_gpio_set_multiple
> function, now we can skip bank which is not set and save cycles.
>
> Cc: Robert Richter <[email protected]>
> Cc: Bartosz Golaszewski <[email protected]>
> Signed-off-by: Syed Nayyar Waris <[email protected]>
> Signed-off-by: William Breathitt Gray <[email protected]>
> ---
> Changes in v12:
> - No change.

Acked-by: Linus Walleij <[email protected]>

If Andrew merges this through his tree.

Yours,
Linus Walleij

2020-11-09 12:38:34

by Syed Nayyar Waris

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> <[email protected]> wrote:
> >
> > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> > > >
> > > > This patch reimplements the xgpio_set_multiple() function in
> > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > to read and understand. Moreover, instead of looping for each bit
> > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > a time and save cycles.
> > >
> > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> >
> > Hi Arnd,
> >
> > What version of gcc-10 are you running? I'm having trouble generating
> > these warnings so I suspect I'm using a different version than you.
>
> I originally saw it with the binaries from
> https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> also been able to reproduce it with a minimal test case on the
> binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
>
> > Let me first verify that I understand the problem correctly. The issue
> > is the possibility of a stack smash in bitmap_set_value() when the value
> > of start + nbits is larger than the length of the map bitmap memory
> > region. This is because index (or index + 1) could be outside the range
> > of the bitmap memory region passed in as map. Is my understanding
> > correct here?
>
> Yes, that seems to be the case here.
>
> > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > possible start and nbits values for the bitmap_set_value() calls.
> > Because width[0] and width[1] are unsigned int variables, GCC considers
> > the possibility that the value of width[0]/width[1] might exceed the
> > length of the bitmap memory region named old and thus result in a stack
> > smash.
> >
> > I don't know if invalid width values are actually possible for the
> > Xilinx gpio device, but let's err on the side of safety and assume this
> > is actually a possibility. We should verify that the combined value of
> > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > check for this in xgpio_probe() when we grab the gpio_width values.
> >
> > However, we're still left with the GCC warnings because GCC is not smart
> > enough to know that we've already checked the boundary and width[0] and
> > width[1] are valid values. I suspect we can avoid this warning if we
> > refactor bitmap_set_value() to increment map separately and then set it:
>
> As I understand it, part of the problem is that gcc sees the possible
> range as being constrained by the operations on 'start' and 'nbits',
> in particular the shift in BIT_WORD() that puts an upper bound on
> the index, but then it sees that the upper bound is higher than the
> upper bound of the array, i.e. element zero.
>
> I added a check
>
> if (start >= 64 || start + size >= 64) return;
>
> in the godbolt.org testcase, which does help limit the start
> index appropriately, but it is not sufficient to let the compiler
> see that the 'if (space >= nbits) ' condition is guaranteed to
> be true for all values here.
>
> > static inline void bitmap_set_value(unsigned long *map,
> > unsigned long value,
> > unsigned long start, unsigned long nbits)
> > {
> > const unsigned long offset = start % BITS_PER_LONG;
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> >
> > map += BIT_WORD(start);
> > value &= GENMASK(nbits - 1, 0);
> >
> > if (space >= nbits) {
> > *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > *map |= value << offset;
> > } else {
> > *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > *map |= value << offset;
> > map++;
> > *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > *map |= value >> space;
> > }
> > }
> >
> > This avoids adding a costly conditional check inside bitmap_set_value()
> > when almost all bitmap_set_value() calls will have static arguments with
> > well-defined and obvious boundaries.
> >
> > Do you think this would be an acceptable solution to resolve your GCC
> > warnings?
>
> Unfortunately, it does not seem to make a difference, as gcc still
> knows that this compiles to the same result, and it produces the same
> warning as before (see https://godbolt.org/z/rjx34r)
>
> Arnd

Hi Arnd,

Sharing a different version of the bitmap_set_value() function. See below.

Let me know if the below solution looks good to you and if it resolves
the above compiler warning.


@@ -1,5 +1,5 @@
 static inline void bitmap_set_value(unsigned long *map,
-				    unsigned long value,
+				    unsigned long value, const size_t length,
 				    unsigned long start, unsigned long nbits)
 {
 	const size_t index = BIT_WORD(start);
@@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
 	const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
 	const unsigned long space = ceiling - start;

+	if (index >= length)
+		return;
+
 	value &= GENMASK(nbits - 1, 0);

 	if (space >= nbits) {
@@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
 	} else {
 		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
 		map[index + 0] |= value << offset;
+
+		if (index + 1 >= length)
+			return;
+
 		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
 		map[index + 1] |= value >> space;
 	}





2020-11-09 13:30:03

by Arnd Bergmann

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 9, 2020 at 1:34 PM Syed Nayyar Waris <[email protected]> wrote:
> On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > when almost all bitmap_set_value() calls will have static arguments with
> > > well-defined and obvious boundaries.
> > >
> > > Do you think this would be an acceptable solution to resolve your GCC
> > > warnings?
> >
> > Unfortunately, it does not seem to make a difference, as gcc still
> > knows that this compiles to the same result, and it produces the same
> > warning as before (see https://godbolt.org/z/rjx34r)
> >
> > Arnd
>
> Hi Arnd,
>
> Sharing a different version of the bitmap_set_value() function. See below.
>
> Let me know if the below solution looks good to you and if it resolves
> the above compiler warning.

Thanks for the follow-up!

> @@ -1,5 +1,5 @@
> static inline void bitmap_set_value(unsigned long *map,
> - unsigned long value,
> + unsigned long value, const size_t length,
> unsigned long start, unsigned long nbits)
> {
> const size_t index = BIT_WORD(start);
> @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> const unsigned long space = ceiling - start;
>
> + if (index >= length)
> + return;
> +
> value &= GENMASK(nbits - 1, 0);
>
> if (space >= nbits) {
> @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> } else {
> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> map[index + 0] |= value << offset;
> +
> + if (index + 1 >= length)
> + return;
> +
> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> map[index + 1] |= value >> space;
> }

Yes, this does address the warning: https://godbolt.org/z/3nsGzq

Not sure what the best calling conventions would be though, as the function
now has five arguments, and the one called 'nbits' appears to be what
all other helpers in include/linux/bitmap.h use for the length of the bitmap,
while this one uses it for the length of the value.

I'd prefer passing the number of bits in the bitmap rather than the number
of 'unsigned long' words in it, and calling that 'nbits', while renaming
the current 'nbits' to something else, e.g.:

static inline void bitmap_set_value(unsigned long *map,
		unsigned long value, unsigned long start,
		unsigned long clump_size, unsigned long nbits);

Though I'm still unsure about the argument order. Having 'nbits'
right next to 'map' would be the most logical to me as they logically
belong together, but most other linux/bitops.h helpers seem to have
'nbits' as the last argument.
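
One way the prototype above could be filled in, as a userspace sketch on 64-bit words. The function name set_value_bounded() and the constant WORD_BITS are made up for the example, and the bounds policy (silently returning when the clump does not fit in the bitmap) is an assumption; nothing in this thread settles on it.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u

/*
 * 'nbits' is the length of the bitmap in bits, 'clump_size' the width
 * of the value being written, following the argument order suggested
 * above.
 */
static void set_value_bounded(uint64_t *map, uint64_t value,
			      unsigned long start, unsigned long clump_size,
			      unsigned long nbits)
{
	const unsigned long index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset;
	uint64_t mask;

	assert(clump_size >= 1 && clump_size <= WORD_BITS);

	/* Assumed policy: refuse writes that would run past the bitmap. */
	if (start >= nbits || clump_size > nbits - start)
		return;

	mask = ~UINT64_C(0) >> (WORD_BITS - clump_size);
	value &= mask;

	/* Bits shifted past the top of the first word are dropped here. */
	map[index] &= ~(mask << offset);
	map[index] |= value << offset;

	/* Only touch the next word when the clump really straddles. */
	if (clump_size > space) {
		map[index + 1] &= ~(mask >> space);
		map[index + 1] |= value >> space;
	}
}
```

Because the check is expressed in bits, a single comparison covers both words: once start + clump_size is known to be within nbits, the second word is in bounds whenever it is reached.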

Arnd

2020-11-09 13:43:56

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > <[email protected]> wrote:
> > >
> > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> > > > >
> > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > a time and save cycles.
> > > >
> > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > >
> > > Hi Arnd,
> > >
> > > What version of gcc-10 are you running? I'm having trouble generating
> > > these warnings so I suspect I'm using a different version than you.
> >
> > I originally saw it with the binaries from
> > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > also been able to reproduce it with a minimal test case on the
> > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> >
> > > Let me first verify that I understand the problem correctly. The issue
> > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > of start + nbits is larger than the length of the map bitmap memory
> > > region. This is because index (or index + 1) could be outside the range
> > > of the bitmap memory region passed in as map. Is my understanding
> > > correct here?
> >
> > Yes, that seems to be the case here.
> >
> > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > possible start and nbits values for the bitmap_set_value() calls.
> > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > the possibility that the value of width[0]/width[1] might exceed the
> > > length of the bitmap memory region named old and thus result in a stack
> > > smash.
> > >
> > > I don't know if invalid width values are actually possible for the
> > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > is actually a possibility. We should verify that the combined value of
> > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > check for this in xgpio_probe() when we grab the gpio_width values.
> > >
> > > However, we're still left with the GCC warnings because GCC is not smart
> > > enough to know that we've already checked the boundary and width[0] and
> > > width[1] are valid values. I suspect we can avoid this warning if we
> > > refactor bitmap_set_value() to increment map separately and then set it:
> >
> > As I understand it, part of the problem is that gcc sees the possible
> > range as being constrained by the operations on 'start' and 'nbits',
> > in particular the shift in BIT_WORD() that puts an upper bound on
> > the index, but then it sees that the upper bound is higher than the
> > upper bound of the array, i.e. element zero.
> >
> > I added a check
> >
> > if (start >= 64 || start + size >= 64) return;
> >
> > in the godbolt.org testcase, which does help limit the start
> > index appropriately, but it is not sufficient to let the compiler
> > see that the 'if (space >= nbits) ' condition is guaranteed to
> > be true for all values here.
> >
> > > static inline void bitmap_set_value(unsigned long *map,
> > > unsigned long value,
> > > unsigned long start, unsigned long nbits)
> > > {
> > > const unsigned long offset = start % BITS_PER_LONG;
> > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > const unsigned long space = ceiling - start;
> > >
> > > map += BIT_WORD(start);
> > > value &= GENMASK(nbits - 1, 0);
> > >
> > > if (space >= nbits) {
> > > *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > *map |= value << offset;
> > > } else {
> > > *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > *map |= value << offset;
> > > map++;
> > > *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > *map |= value >> space;
> > > }
> > > }
> > >
> > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > when almost all bitmap_set_value() calls will have static arguments with
> > > well-defined and obvious boundaries.
> > >
> > > Do you think this would be an acceptable solution to resolve your GCC
> > > warnings?
> >
> > Unfortunately, it does not seem to make a difference, as gcc still
> > knows that this compiles to the same result, and it produces the same
> > warning as before (see https://godbolt.org/z/rjx34r)
> >
> > Arnd
>
> Hi Arnd,
>
> Sharing a different version of the bitmap_set_value() function. See below.
>
> Let me know if the below solution looks good to you and if it resolves
> the above compiler warning.
>
>
> @@ -1,5 +1,5 @@
> static inline void bitmap_set_value(unsigned long *map,
> - unsigned long value,
> + unsigned long value, const size_t length,
> unsigned long start, unsigned long nbits)
> {
> const size_t index = BIT_WORD(start);
> @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> const unsigned long space = ceiling - start;
>
> + if (index >= length)
> + return;
> +
> value &= GENMASK(nbits - 1, 0);
>
> if (space >= nbits) {
> @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> } else {
> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> map[index + 0] |= value << offset;
> +
> + if (index + 1 >= length)
> + return;
> +
> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> map[index + 1] |= value >> space;
> }

One of my concerns is that we're incurring the latency of two additional
conditional checks just to suppress a compiler warning about a case that
wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
there's a way for us to suppress these warnings without adding onto the
latency of this function; given that bitmap_set_value() is intended to
be used in loops, conditionals here could significantly increase latency
in drivers.

I wonder if array_index_nospec() might have the side effect of
suppressing these warnings for us. For example, would this work:

static inline void bitmap_set_value(unsigned long *map,
				    unsigned long value,
				    unsigned long start, unsigned long nbits)
{
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
	const unsigned long space = ceiling - start;
	size_t index = BIT_WORD(start);

	value &= GENMASK(nbits - 1, 0);

	if (space >= nbits) {
		index = array_index_nospec(index, index + 1);

		map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
		map[index] |= value << offset;
	} else {
		index = array_index_nospec(index, index + 2);

		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
		map[index + 0] |= value << offset;
		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
		map[index + 1] |= value >> space;
	}
}

Or is this going to produce the same warning because we're not using an
explicit check against the map array size?
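
For context, here is a userspace sketch of the mask computation that, as far as I recall, the generic fallback behind array_index_nospec() in include/linux/nospec.h performs. It omits the optimizer barrier, architectures may override the helper, and the names nospec_mask(), clamp_index() and LONG_BITS are invented for the example. It relies on arithmetic right shift of negative values, which GCC and Clang provide.

```c
#include <assert.h>
#include <limits.h>

#define LONG_BITS (sizeof(long) * CHAR_BIT)

/* All ones when index < size, zero otherwise, without a branch. */
static unsigned long nospec_mask(unsigned long index, unsigned long size)
{
	assert(size <= (unsigned long)LONG_MAX);
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (LONG_BITS - 1));
}

/* Clamp an out-of-range index to 0; in-range indices pass through. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
	return index & nospec_mask(index, size);
}
```

When the size argument is derived from the index itself, as in index + 1 or index + 2, the mask is all ones for every representable index, so the clamp tells the compiler nothing about the real length of map.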

William Breathitt Gray



2020-11-09 14:41:00

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > > <[email protected]> wrote:
> > > >
> > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> > > > > >
> > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > a time and save cycles.
> > > > >
> > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > >
> > > > Hi Arnd,
> > > >
> > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > these warnings so I suspect I'm using a different version than you.
> > >
> > > I originally saw it with the binaries from
> > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > also been able to reproduce it with a minimal test case on the
> > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > >
> > > > Let me first verify that I understand the problem correctly. The issue
> > > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > > of start + nbits is larger than the length of the map bitmap memory
> > > > region. This is because index (or index + 1) could be outside the range
> > > > of the bitmap memory region passed in as map. Is my understanding
> > > > correct here?
> > >
> > > Yes, that seems to be the case here.
> > >
> > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > > the possibility that the value of width[0]/width[1] might exceed the
> > > > length of the bitmap memory region named old and thus result in a stack
> > > > smash.
> > > >
> > > > I don't know if invalid width values are actually possible for the
> > > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > > is actually a possibility. We should verify that the combined value of
> > > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > > check for this in xgpio_probe() when we grab the gpio_width values.
> > > >
> > > > However, we're still left with the GCC warnings because GCC is not smart
> > > > enough to know that we've already checked the boundary and width[0] and
> > > > width[1] are valid values. I suspect we can avoid this warning if we
> > > > refactor bitmap_set_value() to increment map separately and then set it:
> > >
> > > As I understand it, part of the problem is that gcc sees the possible
> > > range as being constrained by the operations on 'start' and 'nbits',
> > > in particular the shift in BIT_WORD() that puts an upper bound on
> > > the index, but then it sees that the upper bound is higher than the
> > > upper bound of the array, i.e. element zero.
> > >
> > > I added a check
> > >
> > > if (start >= 64 || start + size >= 64) return;
> > >
> > > in the godbolt.org testcase, which does help limit the start
> > > index appropriately, but it is not sufficient to let the compiler
> > > see that the 'if (space >= nbits) ' condition is guaranteed to
> > > be true for all values here.
> > >
> > > > static inline void bitmap_set_value(unsigned long *map,
> > > > unsigned long value,
> > > > unsigned long start, unsigned long nbits)
> > > > {
> > > > const unsigned long offset = start % BITS_PER_LONG;
> > > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > const unsigned long space = ceiling - start;
> > > >
> > > > map += BIT_WORD(start);
> > > > value &= GENMASK(nbits - 1, 0);
> > > >
> > > > if (space >= nbits) {
> > > > *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > > *map |= value << offset;
> > > > } else {
> > > > *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > *map |= value << offset;
> > > > map++;
> > > > *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > *map |= value >> space;
> > > > }
> > > > }
> > > >
> > > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > > when almost all bitmap_set_value() calls will have static arguments with
> > > > well-defined and obvious boundaries.
> > > >
> > > > Do you think this would be an acceptable solution to resolve your GCC
> > > > warnings?
> > >
> > > Unfortunately, it does not seem to make a difference, as gcc still
> > > knows that this compiles to the same result, and it produces the same
> > > warning as before (see https://godbolt.org/z/rjx34r)
> > >
> > > Arnd
> >
> > Hi Arnd,
> >
> > Sharing a different version of the bitmap_set_value() function. See below.
> >
> > Let me know if the below solution looks good to you and if it resolves
> > the above compiler warning.
> >
> >
> > @@ -1,5 +1,5 @@
> > static inline void bitmap_set_value(unsigned long *map,
> > - unsigned long value,
> > + unsigned long value, const size_t length,
> > unsigned long start, unsigned long nbits)
> > {
> > const size_t index = BIT_WORD(start);
> > @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> >
> > + if (index >= length)
> > + return;
> > +
> > value &= GENMASK(nbits - 1, 0);
> >
> > if (space >= nbits) {
> > @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > } else {
> > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > map[index + 0] |= value << offset;
> > +
> > + if (index + 1 >= length)
> > + return;
> > +
> > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > map[index + 1] |= value >> space;
> > }
>
> One of my concerns is that we're incurring the latency of two additional
> conditional checks just to suppress a compiler warning about a case that
> wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> there's a way for us to suppress these warnings without adding onto the
> latency of this function; given that bitmap_set_value() is intended to
> be used in loops, conditionals here could significantly increase latency
> in drivers.
>
> I wonder if array_index_nospec() might have the side effect of
> suppressing these warnings for us. For example, would this work:
>
> static inline void bitmap_set_value(unsigned long *map,
> unsigned long value,
> unsigned long start, unsigned long nbits)
> {
> const unsigned long offset = start % BITS_PER_LONG;
> const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> const unsigned long space = ceiling - start;
> size_t index = BIT_WORD(start);
>
> value &= GENMASK(nbits - 1, 0);
>
> if (space >= nbits) {
> index = array_index_nospec(index, index + 1);
>
> map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> map[index] |= value << offset;
> } else {
> index = array_index_nospec(index, index + 2);
>
> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> map[index + 0] |= value << offset;
> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> map[index + 1] |= value >> space;
> }
> }
>
> Or is this going to produce the same warning because we're not using an
> explicit check against the map array size?
>
> William Breathitt Gray

After testing my suggestion, it looks like the warnings are still
present. :-(

Something else I've also considered is perhaps using the GCC built-in
function __builtin_unreachable() instead of returning. So in Syed's code
we would have the following instead:

        if (index + 1 >= length)
                __builtin_unreachable();

This might allow GCC to optimize better and avoid the conditional check
altogether, thus avoiding latency while also hinting enough context to
the compiler to suppress the warnings.
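
As a rough userspace sketch of that idea (not the kernel implementation):
sketch_set_value(), low_mask() and the 'length' parameter (map size in
longs) are names made up for illustration, and low_mask() stands in for
GENMASK(nbits - 1, 0). Note that __builtin_unreachable() is a promise to
the compiler, so actually reaching it is undefined behaviour.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Stand-in for GENMASK(nbits - 1, 0); valid for 1 <= nbits <= BITS_PER_LONG. */
static unsigned long low_mask(unsigned long nbits)
{
	return ~0UL >> (BITS_PER_LONG - nbits);
}

/* Hypothetical variant: 'length' is the size of 'map' in longs. */
static void sketch_set_value(unsigned long *map, size_t length,
			     unsigned long value,
			     unsigned long start, unsigned long nbits)
{
	const size_t index = start / BITS_PER_LONG;
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long space = BITS_PER_LONG - offset;

	assert(nbits >= 1 && nbits <= BITS_PER_LONG);
	value &= low_mask(nbits);

	if (space >= nbits) {
		/* Value fits entirely in one word. */
		map[index] &= ~(low_mask(nbits) << offset);
		map[index] |= value << offset;
	} else {
		/* Low part goes into the top 'space' bits of the first word. */
		map[index] &= ~(~0UL << offset);
		map[index] |= value << offset;

		/* Hint only: the caller guarantees the second word exists. */
		if (index + 1 >= length)
			__builtin_unreachable();

		/* Remaining high part goes into the bottom of the next word. */
		map[index + 1] &= ~low_mask(nbits - space);
		map[index + 1] |= value >> space;
	}
}
```

Whether this actually silences the warning has to be checked against the
real kernel code and compiler version; the sketch only shows where the
hint would sit.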

William Breathitt Gray



2020-11-09 14:46:28

by Arnd Bergmann

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 9, 2020 at 2:41 PM William Breathitt Gray
<[email protected]> wrote:
> On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
>
> One of my concerns is that we're incurring the latency of two additional
> conditional checks just to suppress a compiler warning about a case that
> wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> there's a way for us to suppress these warnings without adding onto the
> latency of this function; given that bitmap_set_value() is intended to
> be used in loops, conditionals here could significantly increase latency
> in drivers.

At least for this caller, the size check would be a compile-time
constant that can be eliminated.

> I wonder if array_index_nospec() might have the side effect of
> suppressing these warnings for us. For example, would this work:
>
> static inline void bitmap_set_value(unsigned long *map,
> unsigned long value,
> unsigned long start, unsigned long nbits)
> {
> const unsigned long offset = start % BITS_PER_LONG;
> const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> const unsigned long space = ceiling - start;
> size_t index = BIT_WORD(start);
>
> value &= GENMASK(nbits - 1, 0);
>
> if (space >= nbits) {
> index = array_index_nospec(index, index + 1);
>
> map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> map[index] |= value << offset;
> } else {
> index = array_index_nospec(index, index + 2);
>
> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> map[index + 0] |= value << offset;
> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> map[index + 1] |= value >> space;
> }
> }
>
> Or is this going to produce the same warning because we're not using an
> explicit check against the map array size?

https://godbolt.org/z/fxnsG9

It still warns about the 'map[index + 1]' access: from all I can tell,
gcc mainly complains because it cannot rule out that 'space < nbits',
and then it knows the size of 'DECLARE_BITMAP(old, 64)' and finds
that if 'index + 0' is correct, then 'index + 1' overflows that array.
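
The arithmetic behind that can be sketched in plain C. This is only an
illustration with made-up helper names (last_word_touched(),
write_in_bounds()); 'map_longs' is the bitmap size in longs, so a
DECLARE_BITMAP(old, 64) on a 64-bit build corresponds to map_longs == 1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Highest word index a bitmap_set_value()-style write would touch. */
static size_t last_word_touched(unsigned long start, unsigned long nbits)
{
	const size_t index = start / BITS_PER_LONG;
	/* Same as round_up(start + 1, BITS_PER_LONG) - start. */
	const unsigned long space = BITS_PER_LONG - start % BITS_PER_LONG;

	assert(nbits >= 1 && nbits <= BITS_PER_LONG);
	return space >= nbits ? index : index + 1;
}

/* Nonzero when the write stays inside a bitmap of 'map_longs' words. */
static int write_in_bounds(size_t map_longs,
			   unsigned long start, unsigned long nbits)
{
	return last_word_touched(start, nbits) < map_longs;
}
```

With a one-word bitmap, any call where 'space < nbits' lands on word 1,
which is the access the compiler cannot rule out.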

Arnd

2020-11-09 14:52:46

by Syed Nayyar Waris

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray
<[email protected]> wrote:
>
> On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> > > > > > >
> > > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > > a time and save cycles.
> > > > > >
> > > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > > >
> > > > > Hi Arnd,
> > > > >
> > > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > > these warnings so I suspect I'm using a different version than you.
> > > >
> > > > I originally saw it with the binaries from
> > > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > > also been able to reproduce it with a minimal test case on the
> > > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > > >
> > > > > Let me first verify that I understand the problem correctly. The issue
> > > > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > > > of start + nbits is larger than the length of the map bitmap memory
> > > > > region. This is because index (or index + 1) could be outside the range
> > > > > of the bitmap memory region passed in as map. Is my understanding
> > > > > correct here?
> > > >
> > > > Yes, that seems to be the case here.
> > > >
> > > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > > > the possibility that the value of width[0]/width[1] might exceed the
> > > > > length of the bitmap memory region named old and thus result in a stack
> > > > > smash.
> > > > >
> > > > > I don't know if invalid width values are actually possible for the
> > > > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > > > is actually a possibility. We should verify that the combined value of
> > > > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > > > check for this in xgpio_probe() when we grab the gpio_width values.
> > > > >
> > > > > However, we're still left with the GCC warnings because GCC is not smart
> > > > > enough to know that we've already checked the boundary and width[0] and
> > > > > > width[1] are valid values. I suspect we can avoid this warning if we
> > > > > > refactor bitmap_set_value() to increment map separately and then set it:
> > > >
> > > > As I understand it, part of the problem is that gcc sees the possible
> > > > range as being constrained by the operations on 'start' and 'nbits',
> > > > in particular the shift in BIT_WORD() that put an upper bound on
> > > > the index, but then it sees that the upper bound is higher than the
> > > > upper bound of the array, i.e. element zero.
> > > >
> > > > I added a check
> > > >
> > > > if (start >= 64 || start + size >= 64) return;
> > > >
> > > > in the godbolt.org testcase, which does help limit the start
> > > > index appropriately, but it is not sufficient to let the compiler
> > > > see that the 'if (space >= nbits) ' condition is guaranteed to
> > > > be true for all values here.
> > > >
> > > > > static inline void bitmap_set_value(unsigned long *map,
> > > > > unsigned long value,
> > > > > unsigned long start, unsigned long nbits)
> > > > > {
> > > > > const unsigned long offset = start % BITS_PER_LONG;
> > > > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > > const unsigned long space = ceiling - start;
> > > > >
> > > > > map += BIT_WORD(start);
> > > > > value &= GENMASK(nbits - 1, 0);
> > > > >
> > > > > if (space >= nbits) {
> > > > > *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > > > *map |= value << offset;
> > > > > } else {
> > > > > *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > *map |= value << offset;
> > > > > map++;
> > > > > *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > *map |= value >> space;
> > > > > }
> > > > > }
> > > > >
> > > > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > > > when almost all bitmap_set_value() calls will have static arguments with
> > > > > well-defined and obvious boundaries.
> > > > >
> > > > > Do you think this would be an acceptable solution to resolve your GCC
> > > > > warnings?
> > > >
> > > > Unfortunately, it does not seem to make a difference, as gcc still
> > > > knows that this compiles to the same result, and it produces the same
> > > > warning as before (see https://godbolt.org/z/rjx34r)
> > > >
> > > > Arnd
> > >
> > > Hi Arnd,
> > >
> > > Sharing a different version of the bitmap_set_value() function. See below.
> > >
> > > Let me know if the below solution looks good to you and if it resolves
> > > the above compiler warning.
> > >
> > >
> > > @@ -1,5 +1,5 @@
> > > static inline void bitmap_set_value(unsigned long *map,
> > > - unsigned long value,
> > > + unsigned long value, const size_t length,
> > > unsigned long start, unsigned long nbits)
> > > {
> > > const size_t index = BIT_WORD(start);
> > > @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > const unsigned long space = ceiling - start;
> > >
> > > + if (index >= length)
> > > + return;
> > > +
> > > value &= GENMASK(nbits - 1, 0);
> > >
> > > if (space >= nbits) {
> > > @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > } else {
> > > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > map[index + 0] |= value << offset;
> > > +
> > > + if (index + 1 >= length)
> > > + return;
> > > +
> > > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > map[index + 1] |= value >> space;
> > > }
> >
> > One of my concerns is that we're incurring the latency of two additional
> > conditional checks just to suppress a compiler warning about a case that
> > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > there's a way for us to suppress these warnings without adding onto the
> > latency of this function; given that bitmap_set_value() is intended to
> > be used in loops, conditionals here could significantly increase latency
> > in drivers.
> >
> > I wonder if array_index_nospec() might have the side effect of
> > suppressing these warnings for us. For example, would this work:
> >
> > static inline void bitmap_set_value(unsigned long *map,
> > unsigned long value,
> > unsigned long start, unsigned long nbits)
> > {
> > const unsigned long offset = start % BITS_PER_LONG;
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> > size_t index = BIT_WORD(start);
> >
> > value &= GENMASK(nbits - 1, 0);
> >
> > if (space >= nbits) {
> > index = array_index_nospec(index, index + 1);
> >
> > map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > map[index] |= value << offset;
> > } else {
> > index = array_index_nospec(index, index + 2);
> >
> > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > map[index + 0] |= value << offset;
> > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > map[index + 1] |= value >> space;
> > }
> > }
> >
> > Or is this going to produce the same warning because we're not using an
> > explicit check against the map array size?
> >
> > William Breathitt Gray
>
> After testing my suggestion, it looks like the warnings are still
> present. :-(
>
> Something else I've also considered is perhaps using the GCC built-in
> function __builtin_unreachable() instead of returning. So in Syed's code
> we would have the following instead:
>
> if (index + 1 >= length)
> __builtin_unreachable();
>
> This might allow GCC to optimize better and avoid the conditional check
> altogether, thus avoiding latency while also hinting enough context to
> the compiler to suppress the warnings.
>
> William Breathitt Gray

I also thought of another optimization. Arnd, William, let me know
what you think about it.

Since exceeding the array limit is a rather rare event, we can wrap the
boundary checks in unlikely() (the kernel macro built on GCC's
__builtin_expect()). We can use it at the two places where 'index' and
'index + 1' are checked against the boundary limit.

It might help optimize the code, wouldn't it?
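
A minimal userspace sketch of that kind of check, assuming a hypothetical
helper sketch_or_word() and a 'length' given in longs; the unlikely()
definition below mirrors the usual __builtin_expect() wrapper. The hint
only influences branch layout, so the comparison itself is still
executed.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the usual unlikely() wrapper. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper: OR 'bits' into word 'index' of 'map'.
 * Returns 0 on success, -1 if 'index' lies outside the map.
 */
static int sketch_or_word(unsigned long *map, size_t length,
			  size_t index, unsigned long bits)
{
	assert(map != NULL);

	/* Rare path: tell the compiler this branch is not expected. */
	if (unlikely(index >= length))
		return -1;

	map[index] |= bits;
	return 0;
}
```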

Syed Nayyar Waris

2020-11-09 15:20:44

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 08:18:51PM +0530, Syed Nayyar Waris wrote:
> On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray
> <[email protected]> wrote:
> >
> > On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> > > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > > > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris <[email protected]> wrote:
> > > > > > > >
> > > > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > > > a time and save cycles.
> > > > > > >
> > > > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > > > >
> > > > > > Hi Arnd,
> > > > > >
> > > > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > > > these warnings so I suspect I'm using a different version than you.
> > > > >
> > > > > I originally saw it with the binaries from
> > > > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > > > also been able to reproduce it with a minimal test case on the
> > > > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > > > >
> > > > > > Let me first verify that I understand the problem correctly. The issue
> > > > > > is the possibility of a stack smash in bitmap_set_value() when the value
> > > > > > of start + nbits is larger than the length of the map bitmap memory
> > > > > > region. This is because index (or index + 1) could be outside the range
> > > > > > of the bitmap memory region passed in as map. Is my understanding
> > > > > > correct here?
> > > > >
> > > > > Yes, that seems to be the case here.
> > > > >
> > > > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > > > Because width[0] and width[1] are unsigned int variables, GCC considers
> > > > > > the possibility that the value of width[0]/width[1] might exceed the
> > > > > > length of the bitmap memory region named old and thus result in a stack
> > > > > > smash.
> > > > > >
> > > > > > I don't know if invalid width values are actually possible for the
> > > > > > Xilinx gpio device, but let's err on the side of safety and assume this
> > > > > > is actually a possibility. We should verify that the combined value of
> > > > > > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > > > > > check for this in xgpio_probe() when we grab the gpio_width values.
> > > > > >
> > > > > > However, we're still left with the GCC warnings because GCC is not smart
> > > > > > enough to know that we've already checked the boundary and width[0] and
> > > > > > > width[1] are valid values. I suspect we can avoid this warning if we
> > > > > > > refactor bitmap_set_value() to increment map separately and then set it:
> > > > >
> > > > > As I understand it, part of the problem is that gcc sees the possible
> > > > > range as being constrained by the operations on 'start' and 'nbits',
> > > > > in particular the shift in BIT_WORD() that put an upper bound on
> > > > > the index, but then it sees that the upper bound is higher than the
> > > > > upper bound of the array, i.e. element zero.
> > > > >
> > > > > I added a check
> > > > >
> > > > > if (start >= 64 || start + size >= 64) return;
> > > > >
> > > > > in the godbolt.org testcase, which does help limit the start
> > > > > index appropriately, but it is not sufficient to let the compiler
> > > > > see that the 'if (space >= nbits) ' condition is guaranteed to
> > > > > be true for all values here.
> > > > >
> > > > > > static inline void bitmap_set_value(unsigned long *map,
> > > > > > unsigned long value,
> > > > > > unsigned long start, unsigned long nbits)
> > > > > > {
> > > > > > const unsigned long offset = start % BITS_PER_LONG;
> > > > > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > > > const unsigned long space = ceiling - start;
> > > > > >
> > > > > > map += BIT_WORD(start);
> > > > > > value &= GENMASK(nbits - 1, 0);
> > > > > >
> > > > > > if (space >= nbits) {
> > > > > > *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > > > > > *map |= value << offset;
> > > > > > } else {
> > > > > > *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > > *map |= value << offset;
> > > > > > map++;
> > > > > > *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > > *map |= value >> space;
> > > > > > }
> > > > > > }
> > > > > >
> > > > > > This avoids adding a costly conditional check inside bitmap_set_value()
> > > > > > when almost all bitmap_set_value() calls will have static arguments with
> > > > > > well-defined and obvious boundaries.
> > > > > >
> > > > > > Do you think this would be an acceptable solution to resolve your GCC
> > > > > > warnings?
> > > > >
> > > > > Unfortunately, it does not seem to make a difference, as gcc still
> > > > > knows that this compiles to the same result, and it produces the same
> > > > > warning as before (see https://godbolt.org/z/rjx34r)
> > > > >
> > > > > Arnd
> > > >
> > > > Hi Arnd,
> > > >
> > > > Sharing a different version of the bitmap_set_value() function. See below.
> > > >
> > > > Let me know if the below solution looks good to you and if it resolves
> > > > the above compiler warning.
> > > >
> > > >
> > > > @@ -1,5 +1,5 @@
> > > > static inline void bitmap_set_value(unsigned long *map,
> > > > - unsigned long value,
> > > > + unsigned long value, const size_t length,
> > > > unsigned long start, unsigned long nbits)
> > > > {
> > > > const size_t index = BIT_WORD(start);
> > > > @@ -7,6 +7,9 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > > const unsigned long space = ceiling - start;
> > > >
> > > > + if (index >= length)
> > > > + return;
> > > > +
> > > > value &= GENMASK(nbits - 1, 0);
> > > >
> > > > if (space >= nbits) {
> > > > @@ -15,6 +18,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > } else {
> > > > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > map[index + 0] |= value << offset;
> > > > +
> > > > + if (index + 1 >= length)
> > > > + return;
> > > > +
> > > > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > map[index + 1] |= value >> space;
> > > > }
> > >
> > > One of my concerns is that we're incurring the latency of two additional
> > > conditional checks just to suppress a compiler warning about a case that
> > > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > > there's a way for us to suppress these warnings without adding onto the
> > > latency of this function; given that bitmap_set_value() is intended to
> > > be used in loops, conditionals here could significantly increase latency
> > > in drivers.
> > >
> > > I wonder if array_index_nospec() might have the side effect of
> > > suppressing these warnings for us. For example, would this work:
> > >
> > > static inline void bitmap_set_value(unsigned long *map,
> > > unsigned long value,
> > > unsigned long start, unsigned long nbits)
> > > {
> > > const unsigned long offset = start % BITS_PER_LONG;
> > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > const unsigned long space = ceiling - start;
> > > size_t index = BIT_WORD(start);
> > >
> > > value &= GENMASK(nbits - 1, 0);
> > >
> > > if (space >= nbits) {
> > > index = array_index_nospec(index, index + 1);
> > >
> > > map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > > map[index] |= value << offset;
> > > } else {
> > > index = array_index_nospec(index, index + 2);
> > >
> > > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > map[index + 0] |= value << offset;
> > > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > map[index + 1] |= value >> space;
> > > }
> > > }
> > >
> > > Or is this going to produce the same warning because we're not using an
> > > explicit check against the map array size?
> > >
> > > William Breathitt Gray
> >
> > After testing my suggestion, it looks like the warnings are still
> > present. :-(
> >
> > Something else I've also considered is perhaps using the GCC built-in
> > function __builtin_unreachable() instead of returning. So in Syed's code
> > we would have the following instead:
> >
> > if (index + 1 >= length)
> > __builtin_unreachable();
> >
> > This might allow GCC to optimize better and avoid the conditional check
> > altogether, thus avoiding latency while also hinting enough context to
> > the compiler to suppress the warnings.
> >
> > William Breathitt Gray
>
> I also thought of another optimization. Arnd, William, let me know
> what you think about it.
>
> Since exceeding the array limit is a rather rare event, we can wrap the
> boundary checks in unlikely() (the kernel macro built on GCC's
> __builtin_expect()). We can use it at the two places where 'index' and
> 'index + 1' are checked against the boundary limit.
>
> It might help optimize the code, wouldn't it?
>
> Syed Nayyar Waris

We probably don't need unlikely() because __builtin_unreachable() should
suffice to inform GCC that this condition will never occur -- in other
words, GCC will compile optimized code to avoid the conditional
entirely.

By the way, I think we only need the (index + 1 >= length) check; the
first index conditional check is not needed and does not affect the
warnings at all, so we might as well get rid of it.

William Breathitt Gray



2020-11-09 16:48:42

by Syed Nayyar Waris

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> On Mon, Nov 9, 2020 at 2:41 PM William Breathitt Gray
> <[email protected]> wrote:
> > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> >
> > One of my concerns is that we're incurring the latency of two additional
> > conditional checks just to suppress a compiler warning about a case that
> > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > there's a way for us to suppress these warnings without adding onto the
> > latency of this function; given that bitmap_set_value() is intended to
> > be used in loops, conditionals here could significantly increase latency
> > in drivers.
>
> At least for this caller, the size check would be a compile-time
> constant that can be eliminated.
>
> > I wonder if array_index_nospec() might have the side effect of
> > suppressing these warnings for us. For example, would this work:
> >
> > static inline void bitmap_set_value(unsigned long *map,
> > unsigned long value,
> > unsigned long start, unsigned long nbits)
> > {
> > const unsigned long offset = start % BITS_PER_LONG;
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> > size_t index = BIT_WORD(start);
> >
> > value &= GENMASK(nbits - 1, 0);
> >
> > if (space >= nbits) {
> > index = array_index_nospec(index, index + 1);
> >
> > map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > map[index] |= value << offset;
> > } else {
> > index = array_index_nospec(index, index + 2);
> >
> > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > map[index + 0] |= value << offset;
> > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > map[index + 1] |= value >> space;
> > }
> > }
> >
> > Or is this going to produce the same warning because we're not using an
> > explicit check against the map array size?
>
> https://godbolt.org/z/fxnsG9
>
> It still warns about the 'map[index + 1]' access: from all I can tell,
> gcc mainly complains because it cannot rule out that 'space < nbits',
> and then it knows the size of 'DECLARE_BITMAP(old, 64)' and finds
> that if 'index + 0' is correct, then 'index + 1' overflows that array.
>
> Arnd

Hi Arnd,

As suggested by William, sharing another solution to suppress the
compiler warning. Please let me know your views on the below fix. Thanks.

If it's all right, I shall submit a (new) v13 patchset soon. Let me know.

@@ -1,5 +1,5 @@
 static inline void bitmap_set_value(unsigned long *map,
-                                    unsigned long value,
+                                    unsigned long value, const size_t length,
                                     unsigned long start, unsigned long nbits)
 {
         const size_t index = BIT_WORD(start);
@@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
         } else {
                 map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
                 map[index + 0] |= value << offset;
+
+                if (index + 1 >= length)
+                        __builtin_unreachable();
+
                 map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
                 map[index + 1] |= value >> space;
         }


2020-11-09 17:13:38

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > On Mon, Nov 9, 2020 at 2:41 PM William Breathitt Gray
> > <[email protected]> wrote:
> > > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > >
> > > One of my concerns is that we're incurring the latency of two additional
> > > conditional checks just to suppress a compiler warning about a case that
> > > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > > there's a way for us to suppress these warnings without adding onto the
> > > latency of this function; given that bitmap_set_value() is intended to
> > > be used in loops, conditionals here could significantly increase latency
> > > in drivers.
> >
> > At least for this caller, the size check would be a compile-time
> > constant that can be eliminated.
> >
> > > I wonder if array_index_nospec() might have the side effect of
> > > suppressing these warnings for us. For example, would this work:
> > >
> > > static inline void bitmap_set_value(unsigned long *map,
> > > unsigned long value,
> > > unsigned long start, unsigned long nbits)
> > > {
> > > const unsigned long offset = start % BITS_PER_LONG;
> > > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > > const unsigned long space = ceiling - start;
> > > size_t index = BIT_WORD(start);
> > >
> > > value &= GENMASK(nbits - 1, 0);
> > >
> > > if (space >= nbits) {
> > > index = array_index_nospec(index, index + 1);
> > >
> > > map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > > map[index] |= value << offset;
> > > } else {
> > > index = array_index_nospec(index, index + 2);
> > >
> > > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > map[index + 0] |= value << offset;
> > > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > map[index + 1] |= value >> space;
> > > }
> > > }
> > >
> > > Or is this going to produce the same warning because we're not using an
> > > explicit check against the map array size?
> >
> > https://godbolt.org/z/fxnsG9
> >
> > It still warns about the 'map[index + 1]' access: from all I can tell,
> > gcc mainly complains because it cannot rule out that 'space < nbits',
> > and then it knows the size of 'DECLARE_BITMAP(old, 64)' and finds
> > that if 'index + 0' is correct, then 'index + 1' overflows that array.
> >
> > Arnd
>
> Hi Arnd,
>
> As suggested by William, sharing another solution to suppress the
> compiler warning. Please let me know your views on the below fix. Thanks.
>
> If it's all right, I shall submit a (new) v13 patchset soon. Let me know.
>
> @@ -1,5 +1,5 @@
> static inline void bitmap_set_value(unsigned long *map,
> - unsigned long value,
> + unsigned long value, const size_t length,
> unsigned long start, unsigned long nbits)
> {
> const size_t index = BIT_WORD(start);
> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> } else {
> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> map[index + 0] |= value << offset;
> +
> + if (index + 1 >= length)
> + __builtin_unreachable();
> +
> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> map[index + 1] |= value >> space;
> }

Hi Syed,

Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
to value_width.

William Breathitt Gray



2020-11-09 17:23:42

by Andy Shevchenko

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:

...

> > static inline void bitmap_set_value(unsigned long *map,
> > - unsigned long value,
> > + unsigned long value, const size_t length,
> > unsigned long start, unsigned long nbits)
> > {
> > const size_t index = BIT_WORD(start);
> > @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > } else {
> > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > map[index + 0] |= value << offset;
> > +
> > + if (index + 1 >= length)
> > + __builtin_unreachable();
> > +
> > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > map[index + 1] |= value >> space;
> > }
>
> Hi Syed,
>
> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> to value_width.

length here is in longs. I guess this is the point of the entire patch.

But to me it sounds like it would be better to simply have bitmap_set_value64() /
bitmap_set_value32() with proper optimization done and forget about variadic
ones for now.

--
With Best Regards,
Andy Shevchenko


2020-11-09 17:33:43

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
>
> ...
>
> > > static inline void bitmap_set_value(unsigned long *map,
> > > - unsigned long value,
> > > + unsigned long value, const size_t length,
> > > unsigned long start, unsigned long nbits)
> > > {
> > > const size_t index = BIT_WORD(start);
> > > @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > } else {
> > > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > map[index + 0] |= value << offset;
> > > +
> > > + if (index + 1 >= length)
> > > + __builtin_unreachable();
> > > +
> > > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > map[index + 1] |= value >> space;
> > > }
> >
> > Hi Syed,
> >
> > Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > to value_width.
>
> length here is in longs. I guess this is the point of the entire patch.

Ah yes, this should become 'const unsigned long nbits' and represent the
length of the bitmap in bits and not longs.
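
For illustration only, here is how a check written against a length in
bits could convert to words. BITS_TO_LONGS() below is a userspace
stand-in for the kernel macro of the same name, and
second_word_in_bitmap() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))
/* Number of longs needed to hold 'nr' bits (rounds up). */
#define BITS_TO_LONGS(nr) (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * 'nbits' is the bitmap length in bits. Returns nonzero when the word
 * following the one that holds bit 'start' is still inside the bitmap.
 */
static int second_word_in_bitmap(unsigned long nbits, unsigned long start)
{
	const size_t index = start / BITS_PER_LONG;

	assert(nbits > 0);
	return index + 1 < BITS_TO_LONGS(nbits);
}
```

The unreachable hint would then be taken when this returns zero, i.e.
when 'index + 1 >= BITS_TO_LONGS(nbits)'.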

> But to me it sounds like it would be better to simply have bitmap_set_value64() /
> bitmap_set_value32() with proper optimization done and forget about variadic
> ones for now.

The gpio-xilinx driver can have arbitrary sizes for width[0] and
width[1], so unfortunately that means we don't know the start position
nor the width of the value beforehand.

William Breathitt Gray



2020-11-10 10:05:45

by Michal Simek

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value



On 09. 11. 20 18:31, William Breathitt Gray wrote:
> On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
>> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
>>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
>>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
>>
>> ...
>>
>>>> static inline void bitmap_set_value(unsigned long *map,
>>>> - unsigned long value,
>>>> + unsigned long value, const size_t length,
>>>> unsigned long start, unsigned long nbits)
>>>> {
>>>> const size_t index = BIT_WORD(start);
>>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
>>>> } else {
>>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
>>>> map[index + 0] |= value << offset;
>>>> +
>>>> + if (index + 1 >= length)
>>>> + __builtin_unreachable();
>>>> +
>>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
>>>> map[index + 1] |= value >> space;
>>>> }
>>>
>>> Hi Syed,
>>>
>>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
>>> to value_width.
>>
>> length here is in longs. I guess this is the point of entire patch.
>
> Ah yes, this should become 'const unsigned long nbits' and represent the
> length of the bitmap in bits and not longs.
>
>> But to me sounds like it would be better to have simply bitmap_set_value64() /
>> bitmap_set_value32() with proper optimization done and forget about variadic
>> ones for now.
>
> The gpio-xilinx driver can have arbitrary sizes for width[0] and
> width[1], so unfortunately that means we don't know the start position
> nor the width of the value beforehand.

The start position should always be zero. You can't configure this IP
to start from bit 2. The width can vary, but the start is IMHO always
bit 0.

Thanks,
Michal

2020-11-10 12:39:38

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
>
>
> On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> >>
> >> ...
> >>
> >>>> static inline void bitmap_set_value(unsigned long *map,
> >>>> - unsigned long value,
> >>>> + unsigned long value, const size_t length,
> >>>> unsigned long start, unsigned long nbits)
> >>>> {
> >>>> const size_t index = BIT_WORD(start);
> >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> >>>> } else {
> >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> >>>> map[index + 0] |= value << offset;
> >>>> +
> >>>> + if (index + 1 >= length)
> >>>> + __builtin_unreachable();
> >>>> +
> >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> >>>> map[index + 1] |= value >> space;
> >>>> }
> >>>
> >>> Hi Syed,
> >>>
> >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> >>> to value_width.
> >>
> >> length here is in longs. I guess this is the point of entire patch.
> >
> > Ah yes, this should become 'const unsigned long nbits' and represent the
> > length of the bitmap in bits and not longs.
> >
> >> But to me sounds like it would be better to have simply bitmap_set_value64() /
> >> bitmap_set_value32() with proper optimization done and forget about variadic
> >> ones for now.
> >
> > The gpio-xilinx driver can have arbitrary sizes for width[0] and
> > width[1], so unfortunately that means we don't know the start position
> > nor the width of the value beforehand.
>
> Start position should be all the time zero. You can't configure this IP
> to start from bit 2. Width can vary but start is IMHO all the time from
> 0 bit.
>
> Thanks,
> Michal

Hi Michal,

I'm referring to the mask creation, not the data bus transfer; see the
implementation of the xgpio_set_multiple() function in linux-next for
reference:
<https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpio/gpio-xilinx.c?h=akpm>.

To generate the old mask we call the following:

bitmap_set_value(old, state[0], 0, width[0]);
bitmap_set_value(old, state[1], width[0], width[1]);

Here, width[0] and width[1] can vary, which makes the exact values of
the start and nbits parameters unknown beforehand (although we do know
they are within the bitmap boundary).

Regardless, this is not an issue because we know bitmap_set_value()
is supposed to be called with valid values. We just need a way to hint
to GCC that this is the case without increasing the latency of the
function, which I think is possible if we use __builtin_unreachable()
for the conditional path checking the index against the length of the
bitmap.
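
For readers following along, here is a minimal userspace sketch of the
two calls above. It is a model under stated assumptions, not the kernel
code: the macros are local stand-ins fixed to 64-bit words, the body
follows the v12 bitmap_set_value() quoted in this thread, and
build_old_mask() plus the widths used in the note are hypothetical
names and values chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel macros, fixed to 64-bit words (assumption). */
#define BITS_PER_LONG 64
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) \
	(((~UINT64_C(0)) - (UINT64_C(1) << (l)) + 1) & \
	 (~UINT64_C(0) >> (BITS_PER_LONG - 1 - (h))))
#define BITMAP_FIRST_WORD_MASK(start) \
	(~UINT64_C(0) << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) \
	(~UINT64_C(0) >> (-(nbits) & (BITS_PER_LONG - 1)))
#define round_up(x, y) ((((x) - 1) | ((y) - 1)) + 1)

/* Model of the v12 bitmap_set_value(): store nbits of value at bit start. */
static void bitmap_set_value(uint64_t *map, uint64_t value,
			     uint64_t start, uint64_t nbits)
{
	const size_t index = BIT_WORD(start);
	const uint64_t offset = start % BITS_PER_LONG;
	const uint64_t ceiling = round_up(start + 1, BITS_PER_LONG);
	const uint64_t space = ceiling - start; /* bits left in this word */

	value &= GENMASK(nbits - 1, 0);

	if (space >= nbits) {
		/* The value fits entirely in map[index]. */
		map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
		map[index] |= value << offset;
	} else {
		/* The value straddles map[index] and map[index + 1]. */
		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
		map[index + 0] |= value << offset;
		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
		map[index + 1] |= value >> space;
	}
}

/* Hypothetical helper: channel 0 starts at bit 0, channel 1 follows it. */
static void build_old_mask(uint64_t *old, const uint64_t state[2],
			   const unsigned int width[2])
{
	bitmap_set_value(old, state[0], 0, width[0]);
	bitmap_set_value(old, state[1], width[0], width[1]);
}
```

With both widths set to 40 in this model, the second call starts at
bit 40 and needs 40 bits, so it takes the else branch and writes
map[index + 1]. That is the access the compiler cannot prove to be in
bounds when the widths are only known at run time.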

William Breathitt Gray



2020-11-10 17:27:09

by Syed Nayyar Waris

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray wrote:
>
> On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> >
> >
> > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > >>
> > >> ...
> > >>
> > >>>> static inline void bitmap_set_value(unsigned long *map,
> > >>>> - unsigned long value,
> > >>>> + unsigned long value, const size_t length,
> > >>>> unsigned long start, unsigned long nbits)
> > >>>> {
> > >>>> const size_t index = BIT_WORD(start);
> > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > >>>> } else {
> > >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > >>>> map[index + 0] |= value << offset;
> > >>>> +
> > >>>> + if (index + 1 >= length)
> > >>>> + __builtin_unreachable();
> > >>>> +
> > >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > >>>> map[index + 1] |= value >> space;
> > >>>> }
> > >>>
> > >>> Hi Syed,
> > >>>
> > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > >>> to value_width.
> > >>
> > >> length here is in longs. I guess this is the point of entire patch.
> > >
> > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > length of the bitmap in bits and not longs.

Hi William, Andy and All,

Thank You for reviewing. I was looking into the review comments and I
have a question on the above.

Actually, in bitmap_set_value(), the intended comparison is between
'index + 1' and 'length' (which is now renamed to 'nbits').
That is, the comparison would look as follows:
if (index + 1 >= nbits)

'index' is populated with BIT_WORD(start), so it is the actual index
into the bitmap array, while the previous mail suggests using 'nbits',
which represents the length of the bitmap in bits and not longs.

Isn't that comparing two different things? An array index (not a
length in bits) on the left-hand side and nbits (a length in bits) on
the right-hand side?

Have I misunderstood something? If so, please clarify.

Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
it with 'index + 1'? Something like this?

Regards
Syed Nayyar Waris

> > >
> > >> But to me sounds like it would be better to have simply bitmap_set_value64() /
> > >> bitmap_set_value32() with proper optimization done and forget about variadic
> > >> ones for now.
> > >
> > > The gpio-xilinx driver can have arbitrary sizes for width[0] and
> > > width[1], so unfortunately that means we don't know the start position
> > > nor the width of the value beforehand.
> >
> > Start position should be all the time zero. You can't configure this IP
> > to start from bit 2. Width can vary but start is IMHO all the time from
> > 0 bit.
> >
> > Thanks,
> > Michal
>
> Hi Michal,
>
> I'm referring to the mask creation, not the data bus transfer; see the
> implementation of the xgpio_set_multiple() function in linux-next for
> reference:
> <https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpio/gpio-xilinx.c?h=akpm>.
>
> To generate the old mask we call the following:
>
> bitmap_set_value(old, state[0], 0, width[0]);
> bitmap_set_value(old, state[1], width[0], width[1]);
>
> Here, width[0] and width[1] can vary, which makes the exact values of
> the start and nbits parameters unknown beforehand (although we do know
> they are within the bitmap boundary).
>
> Regardless, this is not an issue because we know the bitmap_set_value()
> is supposed to be called with valid values. We just need a way to hint
> to GCC that this is the case, without increasing the latency of the
> function -- which I think is possible if we use __builtin_unreachable()
> for the conditional path checking the index against the length of the
> bitmap.
>
> William Breathitt Gray

2020-11-10 17:47:19

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray wrote:
> >
> > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > >
> > >
> > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > >>
> > > >> ...
> > > >>
> > > >>>> static inline void bitmap_set_value(unsigned long *map,
> > > >>>> - unsigned long value,
> > > >>>> + unsigned long value, const size_t length,
> > > >>>> unsigned long start, unsigned long nbits)
> > > >>>> {
> > > >>>> const size_t index = BIT_WORD(start);
> > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > >>>> } else {
> > > >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > >>>> map[index + 0] |= value << offset;
> > > >>>> +
> > > >>>> + if (index + 1 >= length)
> > > >>>> + __builtin_unreachable();
> > > >>>> +
> > > >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > >>>> map[index + 1] |= value >> space;
> > > >>>> }
> > > >>>
> > > >>> Hi Syed,
> > > >>>
> > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > > >>> to value_width.
> > > >>
> > > >> length here is in longs. I guess this is the point of entire patch.
> > > >
> > > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > > length of the bitmap in bits and not longs.
>
> Hi William, Andy and All,
>
> Thank You for reviewing. I was looking into the review comments and I
> have a question on the above.
>
> Actually, in bitmap_set_value(), the intended comparison is to be made
> between 'index + 1' and 'length' (which is now renamed as 'nbits').
> That is, the comparison would look-like as follows:
> if (index + 1 >= nbits)
>
> The 'index' is getting populated with BIT_WORD(start).
> The 'index' variable in above is the actual index of the bitmap array,
> while in previous mail it is suggested to use 'nbits' which represent
> the length of the bitmap in bits and not longs.
>
> Isn't it comparing two different things? index of array (not the
> bit-wise-length) on left hand side and nbits (bit-wise-length) on
> right hand side?
>
> Have I misunderstood something? If yes, request to clarify.
>
> Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> it with 'index + 1'? Something like this?
>
> Regards
> Syed Nayyar Waris

The array elements of the bitmap memory region are abstracted away for
the convenience of the users of the bitmap_* functions; the driver
authors are able to treat their bitmaps as just a set of contiguous bits
and not worry about where the division between array elements happens.

So to match the interface of the other bitmap_* functions, you should
take in nbits and figure out the actual array length by dividing by
BITS_PER_LONG inside bitmap_set_value(). Then you can use your
conditional check (index + 1 >= length) like you have been doing so far.
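
A rough sketch of that arithmetic, for illustration only. The helper
names bitmap_words(), spans_two_words() and second_word_out_of_bounds()
are hypothetical, the macros are local stand-ins fixed to 64-bit words,
and rounding the division up is an assumption of this sketch (the mail
above only says "dividing by BITS_PER_LONG").

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the kernel macros, fixed to 64-bit words (assumption). */
#define BITS_PER_LONG 64
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of array elements backing a bitmap of nbits bits (rounds up). */
static inline size_t bitmap_words(unsigned long nbits)
{
	return DIV_ROUND_UP(nbits, BITS_PER_LONG);
}

/* True when a value_width-bit value stored at bit start needs map[index + 1]. */
static inline bool spans_two_words(unsigned long start,
				   unsigned long value_width)
{
	return (start % BITS_PER_LONG) + value_width > BITS_PER_LONG;
}

/* The check discussed in the thread: would map[index + 1] be out of bounds? */
static inline bool second_word_out_of_bounds(unsigned long start,
					     unsigned long nbits)
{
	return BIT_WORD(start) + 1 >= bitmap_words(nbits);
}
```

The rounding direction matters here. For an 80-bit bitmap the round-up
gives two words, so index 0 plus 1 is in bounds. A plain floor division
of 80 by 64 gives one word, and for any bitmap shorter than one word it
gives zero, in which case 'index + 1 >= length' would hold for every
call.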

William Breathitt Gray



2020-11-10 22:03:22

by Syed Nayyar Waris

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Tue, Nov 10, 2020 at 12:43:16PM -0500, William Breathitt Gray wrote:
> On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> > On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray wrote:
> > >
> > > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > > >
> > > >
> > > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > > >>
> > > > >> ...
> > > > >>
> > > > >>>> static inline void bitmap_set_value(unsigned long *map,
> > > > >>>> - unsigned long value,
> > > > >>>> + unsigned long value, const size_t length,
> > > > >>>> unsigned long start, unsigned long nbits)
> > > > >>>> {
> > > > >>>> const size_t index = BIT_WORD(start);
> > > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > >>>> } else {
> > > > >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > >>>> map[index + 0] |= value << offset;
> > > > >>>> +
> > > > >>>> + if (index + 1 >= length)
> > > > >>>> + __builtin_unreachable();
> > > > >>>> +
> > > > >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > >>>> map[index + 1] |= value >> space;
> > > > >>>> }
> > > > >>>
> > > > >>> Hi Syed,
> > > > >>>
> > > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > > > >>> to value_width.
> > > > >>
> > > > >> length here is in longs. I guess this is the point of entire patch.
> > > > >
> > > > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > > > length of the bitmap in bits and not longs.
> >
> > Hi William, Andy and All,
> >
> > Thank You for reviewing. I was looking into the review comments and I
> > have a question on the above.
> >
> > Actually, in bitmap_set_value(), the intended comparison is to be made
> > between 'index + 1' and 'length' (which is now renamed as 'nbits').
> > That is, the comparison would look-like as follows:
> > if (index + 1 >= nbits)
> >
> > The 'index' is getting populated with BIT_WORD(start).
> > The 'index' variable in above is the actual index of the bitmap array,
> > while in previous mail it is suggested to use 'nbits' which represent
> > the length of the bitmap in bits and not longs.
> >
> > Isn't it comparing two different things? index of array (not the
> > bit-wise-length) on left hand side and nbits (bit-wise-length) on
> > right hand side?
> >
> > Have I misunderstood something? If yes, request to clarify.
> >
> > Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> > it with 'index + 1'? Something like this?
> >
> > Regards
> > Syed Nayyar Waris
>
> The array elements of the bitmap memory region are abstracted away for
> the convenience of the users of the bitmap_* functions; the driver
> authors are able to treat their bitmaps as just a set of contiguous bits
> and not worry about where the division between array elements happens.
>
> So to match the interface of the other bitmap_* functions, you should
> take in nbits and figure out the actual array length by dividing by
> BITS_PER_LONG inside bitmap_set_value(). Then you can use your
> conditional check (index + 1 >= length) like you have been doing so far.
>
> William Breathitt Gray

Hi Arnd,

Sharing a new version of bitmap_set_value(). Let me know if it looks
good and whether it suppresses the compiler warning.

The patch below is against the v12 version of bitmap_set_value().

-static inline void bitmap_set_value(unsigned long *map,
-                                    unsigned long value,
-                                    unsigned long start, unsigned long nbits)
+static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
+                                    unsigned long value, unsigned long value_width,
+                                    unsigned long start)
 {
-        const size_t index = BIT_WORD(start);
+        const unsigned long index = BIT_WORD(start);
+        const unsigned long length = BIT_WORD(nbits);
         const unsigned long offset = start % BITS_PER_LONG;
         const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
         const unsigned long space = ceiling - start;

-        value &= GENMASK(nbits - 1, 0);
+        value &= GENMASK(value_width - 1, 0);

-        if (space >= nbits) {
-                map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
+        if (space >= value_width) {
+                map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
                 map[index] |= value << offset;
         } else {
                 map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
                 map[index + 0] |= value << offset;
-                map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+
+                if (index + 1 >= length)
+                        __builtin_unreachable();
+
+                map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
                 map[index + 1] |= value >> space;
         }
 }


2020-11-13 16:54:30

by Syed Nayyar Waris

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Wed, Nov 11, 2020 at 3:30 AM Syed Nayyar Waris wrote:
>
> On Tue, Nov 10, 2020 at 12:43:16PM -0500, William Breathitt Gray wrote:
> > On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> > > On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray wrote:
> > > >
> > > > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > > > >
> > > > >
> > > > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > > > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > > > >>
> > > > > >> ...
> > > > > >>
> > > > > >>>> static inline void bitmap_set_value(unsigned long *map,
> > > > > >>>> - unsigned long value,
> > > > > >>>> + unsigned long value, const size_t length,
> > > > > >>>> unsigned long start, unsigned long nbits)
> > > > > >>>> {
> > > > > >>>> const size_t index = BIT_WORD(start);
> > > > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > > >>>> } else {
> > > > > >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > >>>> map[index + 0] |= value << offset;
> > > > > >>>> +
> > > > > >>>> + if (index + 1 >= length)
> > > > > >>>> + __builtin_unreachable();
> > > > > >>>> +
> > > > > >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > >>>> map[index + 1] |= value >> space;
> > > > > >>>> }
> > > > > >>>
> > > > > >>> Hi Syed,
> > > > > >>>
> > > > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > > > > >>> to value_width.
> > > > > >>
> > > > > >> length here is in longs. I guess this is the point of entire patch.
> > > > > >
> > > > > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > > > > length of the bitmap in bits and not longs.
> > >
> > > Hi William, Andy and All,
> > >
> > > Thank You for reviewing. I was looking into the review comments and I
> > > have a question on the above.
> > >
> > > Actually, in bitmap_set_value(), the intended comparison is to be made
> > > between 'index + 1' and 'length' (which is now renamed as 'nbits').
> > > That is, the comparison would look-like as follows:
> > > if (index + 1 >= nbits)
> > >
> > > The 'index' is getting populated with BIT_WORD(start).
> > > The 'index' variable in above is the actual index of the bitmap array,
> > > while in previous mail it is suggested to use 'nbits' which represent
> > > the length of the bitmap in bits and not longs.
> > >
> > > Isn't it comparing two different things? index of array (not the
> > > bit-wise-length) on left hand side and nbits (bit-wise-length) on
> > > right hand side?
> > >
> > > Have I misunderstood something? If yes, request to clarify.
> > >
> > > Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> > > it with 'index + 1'? Something like this?
> > >
> > > Regards
> > > Syed Nayyar Waris
> >
> > The array elements of the bitmap memory region are abstracted away for
> > the convenience of the users of the bitmap_* functions; the driver
> > authors are able to treat their bitmaps as just a set of contiguous bits
> > and not worry about where the division between array elements happens.
> >
> > So to match the interface of the other bitmap_* functions, you should
> > take in nbits and figure out the actual array length by dividing by
> > BITS_PER_LONG inside bitmap_set_value(). Then you can use your
> > conditional check (index + 1 >= length) like you have been doing so far.
> >
> > William Breathitt Gray
>
> Hi Arnd,
>
> Sharing a new version of bitmap_set_value(). Let me know if it looks
> good and whether it suppresses the compiler warning.
>
> The below patch is created against the v12 version of bitmap_set_value().
>
> -static inline void bitmap_set_value(unsigned long *map,
> - unsigned long value,
> - unsigned long start, unsigned long nbits)
> +static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
> + unsigned long value, unsigned long value_width,
> + unsigned long start)
> {
> - const size_t index = BIT_WORD(start);
> + const unsigned long index = BIT_WORD(start);
> + const unsigned long length = BIT_WORD(nbits);
> const unsigned long offset = start % BITS_PER_LONG;
> const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> const unsigned long space = ceiling - start;
>
> - value &= GENMASK(nbits - 1, 0);
> + value &= GENMASK(value_width - 1, 0);
>
> - if (space >= nbits) {
> - map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> + if (space >= value_width) {
> + map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
> map[index] |= value << offset;
> } else {
> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> map[index + 0] |= value << offset;
> - map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> +
> + if (index + 1 >= length)
> + __builtin_unreachable();
> +
> + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
> map[index + 1] |= value >> space;
> }
> }
>
>

Hi Arnd,

What do you think of the above solution (the new version of
bitmap_set_value())? Does it look good?

Regards
Syed Nayyar Waris

2020-11-20 13:29:45

by Arnd Bergmann

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Fri, Nov 13, 2020 at 5:52 PM Syed Nayyar Waris wrote:
> On Wed, Nov 11, 2020 at 3:30 AM Syed Nayyar Waris wrote:
> > On Tue, Nov 10, 2020 at 12:43:16PM -0500, William Breathitt Gray wrote:
> > > On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> > > > On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray wrote:
> > > > >
> > > > > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > > > > >
> > > > > >
> > > > > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > > > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > > > > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > > > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > > > > >>
> > > > > > >> ...
> > > > > > >>
> > > > > > >>>> static inline void bitmap_set_value(unsigned long *map,
> > > > > > >>>> - unsigned long value,
> > > > > > >>>> + unsigned long value, const size_t length,
> > > > > > >>>> unsigned long start, unsigned long nbits)
> > > > > > >>>> {
> > > > > > >>>> const size_t index = BIT_WORD(start);
> > > > > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > > > >>>> } else {
> > > > > > >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > > >>>> map[index + 0] |= value << offset;
> > > > > > >>>> +
> > > > > > >>>> + if (index + 1 >= length)
> > > > > > >>>> + __builtin_unreachable();
> > > > > > >>>> +
> > > > > > >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > > >>>> map[index + 1] |= value >> space;
> > > > > > >>>> }
> > > > > > >>>
> > > > > > >>> Hi Syed,
> > > > > > >>>
> > > > > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > > > > > >>> to value_width.
> > > > > > >>
> > > > > > >> length here is in longs. I guess this is the point of entire patch.
> > > > > > >
> > > > > > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > > > > > length of the bitmap in bits and not longs.
> > > >
> > > > Hi William, Andy and All,
> > > >
> > > > Thank You for reviewing. I was looking into the review comments and I
> > > > have a question on the above.
> > > >
> > > > Actually, in bitmap_set_value(), the intended comparison is to be made
> > > > between 'index + 1' and 'length' (which is now renamed as 'nbits').
> > > > That is, the comparison would look-like as follows:
> > > > if (index + 1 >= nbits)
> > > >
> > > > The 'index' is getting populated with BIT_WORD(start).
> > > > The 'index' variable in above is the actual index of the bitmap array,
> > > > while in previous mail it is suggested to use 'nbits' which represent
> > > > the length of the bitmap in bits and not longs.
> > > >
> > > > Isn't it comparing two different things? index of array (not the
> > > > bit-wise-length) on left hand side and nbits (bit-wise-length) on
> > > > right hand side?
> > > >
> > > > Have I misunderstood something? If yes, request to clarify.
> > > >
> > > > Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> > > > it with 'index + 1'? Something like this?
> > > >
> > > > Regards
> > > > Syed Nayyar Waris
> > >
> > > The array elements of the bitmap memory region are abstracted away for
> > > the convenience of the users of the bitmap_* functions; the driver
> > > authors are able to treat their bitmaps as just a set of contiguous bits
> > > and not worry about where the division between array elements happens.
> > >
> > > So to match the interface of the other bitmap_* functions, you should
> > > take in nbits and figure out the actual array length by dividing by
> > > BITS_PER_LONG inside bitmap_set_value(). Then you can use your
> > > conditional check (index + 1 >= length) like you have been doing so far.
> > >
> > > William Breathitt Gray
> >
> > Hi Arnd,
> >
> > Sharing a new version of bitmap_set_value(). Let me know if it looks
> > good and whether it suppresses the compiler warning.
> >
> > The below patch is created against the v12 version of bitmap_set_value().
> >
> > -static inline void bitmap_set_value(unsigned long *map,
> > - unsigned long value,
> > - unsigned long start, unsigned long nbits)
> > +static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
> > + unsigned long value, unsigned long value_width,
> > + unsigned long start)
> > {
> > - const size_t index = BIT_WORD(start);
> > + const unsigned long index = BIT_WORD(start);
> > + const unsigned long length = BIT_WORD(nbits);
> > const unsigned long offset = start % BITS_PER_LONG;
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> >
> > - value &= GENMASK(nbits - 1, 0);
> > + value &= GENMASK(value_width - 1, 0);
> >
> > - if (space >= nbits) {
> > - map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > + if (space >= value_width) {
> > + map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
> > map[index] |= value << offset;
> > } else {
> > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > map[index + 0] |= value << offset;
> > - map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > +
> > + if (index + 1 >= length)
> > + __builtin_unreachable();
> > +
> > + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
> > map[index + 1] |= value >> space;
> > }
> > }
> >
> >
>
> Hi Arnd,
>
> What do you think of the above solution ( new version of
> bitmap_set_value() )? Does it look good?

Sorry for the late reply and thanks for continuing to look at solutions.

I don't really like the idea of having the __builtin_unreachable() in
there, since that would lead to even worse undefined behavior
(jumping to a random instruction) than the previous one (writing
to a random location) when invalid data gets passed.

Isn't passing the length of the bitmap sufficient to suppress the
warning (sorry I did not try myself)? If not, maybe this could
be a "BUG_ON(index + 1 >= length)" instead of the
__builtin_unreachable(). That way it would at least crash
in a well-defined way.
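
To make the contrast concrete, here is a small userspace sketch of the
two variants. It is an illustration under assumptions, not kernel code:
MODEL_BUG_ON() is a hypothetical stand-in that reports and calls
abort(), and the two helper functions are invented names that only
model the write to map[index + 1].

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for BUG_ON(): report, then stop. */
#define MODEL_BUG_ON(cond)                                   \
	do {                                                 \
		if (cond) {                                  \
			fprintf(stderr, "BUG: %s\n", #cond); \
			abort();                             \
		}                                            \
	} while (0)

/*
 * Variant 1: optimizer hint only (GCC/Clang builtin). If the condition
 * is ever true at run time, the behaviour is undefined.
 */
static inline void or_next_word_hint(uint64_t *map, size_t length,
				     size_t index, uint64_t bits)
{
	if (index + 1 >= length)
		__builtin_unreachable();
	map[index + 1] |= bits;
}

/*
 * Variant 2: explicit check. An out-of-range index stops the program
 * at a known point before the write happens.
 */
static inline void or_next_word_checked(uint64_t *map, size_t length,
					size_t index, uint64_t bits)
{
	MODEL_BUG_ON(index + 1 >= length);
	map[index + 1] |= bits;
}
```

For valid arguments both variants perform the same write. They differ
only when the caller passes an index at or past the last word: the
checked variant stops with a message, while the hinted variant gives
the compiler permission to assume that case never happens.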

Arnd

2020-11-20 13:49:56

by William Breathitt Gray

Subject: Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Fri, Nov 20, 2020 at 02:26:35PM +0100, Arnd Bergmann wrote:
> On Fri, Nov 13, 2020 at 5:52 PM Syed Nayyar Waris wrote:
> > On Wed, Nov 11, 2020 at 3:30 AM Syed Nayyar Waris wrote:
> > > On Tue, Nov 10, 2020 at 12:43:16PM -0500, William Breathitt Gray wrote:
> > > > On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> > > > > On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray wrote:
> > > > > >
> > > > > > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > > > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > > > > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > > > > > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > > > > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > > > > > >>
> > > > > > > >> ...
> > > > > > > >>
> > > > > > > >>>> static inline void bitmap_set_value(unsigned long *map,
> > > > > > > >>>> - unsigned long value,
> > > > > > > >>>> + unsigned long value, const size_t length,
> > > > > > > >>>> unsigned long start, unsigned long nbits)
> > > > > > > >>>> {
> > > > > > > >>>> const size_t index = BIT_WORD(start);
> > > > > > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
> > > > > > > >>>> } else {
> > > > > > > >>>> map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > > > >>>> map[index + 0] |= value << offset;
> > > > > > > >>>> +
> > > > > > > >>>> + if (index + 1 >= length)
> > > > > > > >>>> + __builtin_unreachable();
> > > > > > > >>>> +
> > > > > > > >>>> map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > > > > > >>>> map[index + 1] |= value >> space;
> > > > > > > >>>> }
> > > > > > > >>>
> > > > > > > >>> Hi Syed,
> > > > > > > >>>
> > > > > > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > > > > > > >>> to value_width.
> > > > > > > >>
> > > > > > > >> length here is in longs. I guess this is the point of entire patch.
> > > > > > > >
> > > > > > > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > > > > > > length of the bitmap in bits and not longs.
> > > > >
> > > > > Hi William, Andy and All,
> > > > >
> > > > > Thank You for reviewing. I was looking into the review comments and I
> > > > > have a question on the above.
> > > > >
> > > > > Actually, in bitmap_set_value(), the intended comparison is to be made
> > > > > between 'index + 1' and 'length' (which is now renamed as 'nbits').
> > > > > > That is, the comparison would look as follows:
> > > > > if (index + 1 >= nbits)
> > > > >
> > > > > The 'index' is getting populated with BIT_WORD(start).
> > > > > The 'index' variable in above is the actual index of the bitmap array,
> > > > > while in previous mail it is suggested to use 'nbits' which represent
> > > > > the length of the bitmap in bits and not longs.
> > > > >
> > > > > Isn't it comparing two different things? index of array (not the
> > > > > bit-wise-length) on left hand side and nbits (bit-wise-length) on
> > > > > right hand side?
> > > > >
> > > > > Have I misunderstood something? If yes, request to clarify.
> > > > >
> > > > > Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> > > > > it with 'index + 1'? Something like this?
> > > > >
> > > > > Regards
> > > > > Syed Nayyar Waris
> > > >
> > > > The array elements of the bitmap memory region are abstracted away for
> > > > the convenience of the users of the bitmap_* functions; the driver
> > > > authors are able to treat their bitmaps as just a set of contiguous bits
> > > > and not worry about where the division between array elements happens.
> > > >
> > > > So to match the interface of the other bitmap_* functions, you should
> > > > take in nbits and figure out the actual array length by dividing by
> > > > BITS_PER_LONG inside bitmap_set_value(). Then you can use your
> > > > conditional check (index + 1 >= length) like you have been doing so far.
> > > >
> > > > William Breathitt Gray
> > >
> > > Hi Arnd,
> > >
> > > Sharing a new version of bitmap_set_value(). Let me know if it looks
> > > good and whether it suppresses the compiler warning.
> > >
> > > The below patch is created against the v12 version of bitmap_set_value().
> > >
> > > -static inline void bitmap_set_value(unsigned long *map,
> > > -                                    unsigned long value,
> > > -                                    unsigned long start, unsigned long nbits)
> > > +static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
> > > +                                    unsigned long value, unsigned long value_width,
> > > +                                    unsigned long start)
> > >  {
> > > -        const size_t index = BIT_WORD(start);
> > > +        const unsigned long index = BIT_WORD(start);
> > > +        const unsigned long length = BIT_WORD(nbits);
> > >          const unsigned long offset = start % BITS_PER_LONG;
> > >          const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > >          const unsigned long space = ceiling - start;
> > >
> > > -        value &= GENMASK(nbits - 1, 0);
> > > +        value &= GENMASK(value_width - 1, 0);
> > >
> > > -        if (space >= nbits) {
> > > -                map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > > +        if (space >= value_width) {
> > > +                map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
> > >                  map[index] |= value << offset;
> > >          } else {
> > >                  map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > >                  map[index + 0] |= value << offset;
> > > -                map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > > +
> > > +                if (index + 1 >= length)
> > > +                        __builtin_unreachable();
> > > +
> > > +                map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
> > >                  map[index + 1] |= value >> space;
> > >          }
> > >  }
> > >
> > >
> >
> > Hi Arnd,
> >
> > What do you think of the above solution ( new version of
> > bitmap_set_value() )? Does it look good?
>
> Sorry for the late reply and thanks for continuing to look at solutions.
>
> I don't really like the idea of having the __builtin_unreachable() in
> there, since that would lead to even worse undefined behavior
> (jumping to a random instruction) than the previous one (writing
> to a random location) when invalid data gets passed.
>
> Isn't passing the length of the bitmap sufficient to suppress the
> warning (sorry I did not try myself)? If not, maybe this could
> be a "BUG_ON(index + 1 >= length)" instead of the
> __builtin_unreachable(). That way it would at least crash
> in a well-defined way.
>
> Arnd

Hi Arnd,

I don't think we need to worry about incorrect values being passed into
bitmap_set_value(). This condition should never be possible in the code
because the boundaries are required to be correct before the function is
called.

This is the same reason other bitmap_* functions such as bitmap_fill()
don't check the boundaries either: they are expected to be correct
before the function is called; the responsibility is on the caller for
ensuring the boundaries are correct.

Our motivation here is simply to silence the GCC warning messages
because GCC is not aware that the boundaries have already been checked.
As such, we're better off using __builtin_unreachable() here because we
can avoid the latency of the conditional check entirely, whereas
BUG_ON() would still have some amount -- albeit small given the
unlikely() within.
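
To make the trade-off concrete, here is a minimal userspace sketch of the
two options. It is not the kernel code: sketch_set_value(), low_mask(),
SKETCH_TRAP and SKETCH_IMPOSSIBLE() are hypothetical names, abort() only
loosely stands in for BUG_ON(), and the round-up used to compute the array
length is an assumption of the sketch rather than something taken from the
patch.

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask with the low n bits set; valid for 1 <= n <= BITS_PER_LONG. */
static unsigned long low_mask(unsigned long n)
{
	return ~0UL >> (BITS_PER_LONG - n);
}

/*
 * Hypothetical compile-time switch, not from the patch: define SKETCH_TRAP
 * to get a checked crash (loosely analogous to BUG_ON()); leave it
 * undefined to get the pure optimizer hint.
 */
#ifdef SKETCH_TRAP
#define SKETCH_IMPOSSIBLE() abort()
#else
#define SKETCH_IMPOSSIBLE() __builtin_unreachable()
#endif

static void sketch_set_value(unsigned long *map, unsigned long nbits,
			     unsigned long value, unsigned long value_width,
			     unsigned long start)
{
	const unsigned long index = start / BITS_PER_LONG;
	/* Assumption of this sketch: round up so a partial last word counts. */
	const unsigned long length = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long space = BITS_PER_LONG - offset;

	value &= low_mask(value_width);

	if (space >= value_width) {
		/* The value fits entirely in one word. */
		map[index] &= ~(low_mask(value_width) << offset);
		map[index] |= value << offset;
	} else {
		/* The value straddles two words; offset > 0 on this path. */
		map[index] &= ~(~0UL << offset);
		map[index] |= value << offset;

		/* Caller contract: the second word lies inside the bitmap. */
		if (index + 1 >= length)
			SKETCH_IMPOSSIBLE();

		map[index + 1] &= ~low_mask(value_width - space);
		map[index + 1] |= value >> space;
	}
}
```

With SKETCH_TRAP undefined, the compiler may assume the branch is never
taken and drop the comparison; with it defined, the comparison stays and a
contract violation aborts instead of writing past the array. For valid
input, such as a 4-bit value written at bit BITS_PER_LONG - 2 of a two-word
bitmap, both builds store the same bits.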

William Breathitt Gray

