Hello Linus,
Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?
(Note: Patchset resent with the new macro and relevant
functions shifted to a new header clump_bits.h [Linus Torvalds])
Michal,
What do you think of [PATCH 5/5]? Is the conditional check needed? And
also does returning -EINVAL look good?
This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size; the new generic version works with clumps of any size from 1 up
to BITS_PER_LONG bits. The patchset utilizes the new macro in several
GPIO drivers.
The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax that
iterates over a memory region one whole group of set bits at a time.
For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:
Example: 10111110 00000000 11111111 00110011
First loop: 10111110 00000000 11111111 XXXXXXXX
Second loop: 10111110 00000000 XXXXXXXX 00110011
Third loop: XXXXXXXX 00000000 11111111 00110011
Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
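The 8-bit walk described above can be sketched in plain userspace C. This
is only an illustration of the skipping behaviour, not the kernel macro;
the helper name sketch_collect_clump8() is hypothetical and exists only
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace illustration only, not the kernel macro: record every 8-bit
 * group of 'word' that has at least one set bit, lowest group first.
 * Returns the number of groups recorded (at most 4).
 */
static size_t sketch_collect_clump8(uint32_t word, unsigned int offsets[4],
				    uint8_t clumps[4])
{
	size_t found = 0;

	assert(offsets != NULL && clumps != NULL);

	for (unsigned int offset = 0; offset < 32; offset += 8) {
		uint8_t clump = (word >> offset) & 0xff;

		if (clump == 0)
			continue;	/* no set bit in this group: skip it */

		offsets[found] = offset;
		clumps[found] = clump;
		found++;
	}

	return found;
}
```

For the 32-bit example value above (0xBE00FF33), the helper records three
groups, at bit offsets 0, 8 and 24; the all-zero group at offset 16 is
skipped.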
With the new for_each_set_clump, the clump size is no longer fixed at 8 bits.
Moreover, a clump can be split across a word boundary when the word size is
not a multiple of the clump size. The following examples show how the new
macro works for clump sizes of 24 bits and 6 bits.
Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.
/* bitmap memory region */
0x00aa0000ff000000; /* Most significant bits */
0xaaaaaa0000ff0000;
0x000000aa000000aa;
0xbbbbabcdeffedcba; /* Least significant bits */
Iterations of for_each_set_clump, where 'offset' is the bit position and
'clump' is the 24-bit clump read from the above bitmap:
Iteration first: offset: 0 clump: 0xfedcba
Iteration second: offset: 24 clump: 0xabcdef
Iteration third: offset: 48 clump: 0xaabbbb
Iteration fourth: offset: 96 clump: 0xaa
Iteration fifth: offset: 144 clump: 0xff
Iteration sixth: offset: 168 clump: 0xaaaaaa
Iteration seventh: offset: 216 clump: 0xff
The loop ends because the remaining bits (0x00aa) amount to fewer than the
24-bit clump size.
In the above example, the 24-bit clump retrieved in the third iteration was
split between bitmap[0] and bitmap[1]. The example also shows that 24-bit
clumps consisting only of zeroes are skipped (preserving the previous
for_each_set_clump8 behaviour).
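Example 1 can be reproduced with a small userspace sketch. It assumes
64-bit words held in a uint64_t array and simply tests each clump in turn,
which is not necessarily how the kernel's find_next_clump() is implemented;
all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WORD_BITS 64u

/* Read 'nbits' (1..64) bits at bit offset 'start'; may straddle two words. */
static uint64_t sketch_get_value(const uint64_t *map, size_t start,
				 unsigned int nbits)
{
	const size_t index = start / SKETCH_WORD_BITS;
	const unsigned int offset = start % SKETCH_WORD_BITS;
	const unsigned int space = SKETCH_WORD_BITS - offset;
	uint64_t mask, value;

	assert(nbits >= 1 && nbits <= SKETCH_WORD_BITS);
	mask = nbits == SKETCH_WORD_BITS ? UINT64_MAX
					 : (UINT64_C(1) << nbits) - 1;

	value = map[index] >> offset;
	if (space < nbits)	/* upper part lives in the next word */
		value |= map[index + 1] << space;

	return value & mask;
}

/*
 * Return the offset of the first non-zero clump at or after 'offset'
 * (assumed to be a multiple of clump_size), or 'size' if there is none.
 */
static size_t sketch_find_next_clump(uint64_t *clump, const uint64_t *map,
				     size_t size, size_t offset,
				     unsigned int clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		*clump = sketch_get_value(map, offset, clump_size);
		if (*clump != 0)
			return offset;
	}
	return size;
}

/* Walk the map the way the macro would, recording up to 'max' clumps. */
static size_t sketch_collect_clumps(const uint64_t *map, size_t size,
				    unsigned int clump_size, size_t *offsets,
				    uint64_t *clumps, size_t max)
{
	size_t found = 0, offset;
	uint64_t clump;

	for (offset = sketch_find_next_clump(&clump, map, size, 0, clump_size);
	     offset < size && found < max;
	     offset = sketch_find_next_clump(&clump, map, size,
					     offset + clump_size, clump_size)) {
		offsets[found] = offset;
		clumps[found] = clump;
		found++;
	}
	return found;
}
```

Called with the four words of Example 1 (least significant word first), a
size of 240 bits and a clump size of 24, the sketch records the same seven
offset/clump pairs listed above.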
Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.
/* bitmap memory region */
0x00aa0000ff000000; /* Most significant bits */
0xaaaaaa0000ff0000;
0x0f00000000000000;
0x0000000000000ac0; /* Least significant bits */
Iterations of for_each_set_clump, where 'offset' is the bit position and
'clump' is the 6-bit clump read from the above bitmap:
Iteration first: offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.
GCC gives a warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
Add an explicit check that the value being written does not fall outside
the bitmap.
Falling outside the bitmap can never happen in the code, because the
boundaries are required to be correct before the function is called; the
caller is responsible for ensuring that.
The code change simply silences the GCC warning, because GCC is not aware
that the boundaries have already been checked. As such, we're better off
using __builtin_unreachable() here because we can avoid the latency of the
conditional check entirely.
Syed Nayyar Waris (5):
clump_bits: Introduce the for_each_set_clump macro
lib/test_bitmap.c: Add for_each_set_clump test cases
gpio: thunderx: Utilize for_each_set_clump macro
gpio: xilinx: Utilize generic bitmap_get_value and _set_value
gpio: xilinx: Add extra check if sum of widths exceed 64
drivers/gpio/clump_bits.h | 101 ++++++++++++++++++++++++
drivers/gpio/gpio-thunderx.c | 12 ++-
drivers/gpio/gpio-xilinx.c | 72 ++++++++++--------
lib/test_bitmap.c | 144 +++++++++++++++++++++++++++++++++++
4 files changed, 292 insertions(+), 37 deletions(-)
create mode 100644 drivers/gpio/clump_bits.h
base-commit: bbe2ba04c5a92a49db8a42c850a5a2f6481e47eb
--
2.29.0
This macro iterates over each group of bits (clump) that has set bits
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n bits in a bitmap memory region.
The value can have any size from 1 to BITS_PER_LONG bits; a size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
extend past a word boundary, it is split so that the lower portion is
stored in that word and the remaining portion is stored in the next
higher word. The same splitting applies when retrieving a value from the
bitmap.
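The split store can be illustrated with a userspace sketch. It assumes
64-bit words in a uint64_t array and trusts the caller to stay inside the
map, as the description requires; sketch_set_value() is a hypothetical
stand-in, not the function added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WORD_BITS 64u

/*
 * Store the low 'width' bits (1..64) of 'value' at bit offset 'start'.
 * The caller must guarantee that the target range lies inside 'map'.
 */
static void sketch_set_value(uint64_t *map, uint64_t value,
			     unsigned int width, size_t start)
{
	const size_t index = start / SKETCH_WORD_BITS;
	const unsigned int offset = start % SKETCH_WORD_BITS;
	const unsigned int space = SKETCH_WORD_BITS - offset;
	uint64_t mask;

	assert(width >= 1 && width <= SKETCH_WORD_BITS);
	mask = width == SKETCH_WORD_BITS ? UINT64_MAX
					 : (UINT64_C(1) << width) - 1;
	value &= mask;

	/* Lower portion: clear the target bits, then OR in the value. */
	map[index] &= ~(mask << offset);
	map[index] |= value << offset;

	if (space < width) {
		/* Remaining portion spills into the next higher word. */
		map[index + 1] &= ~(mask >> space);
		map[index + 1] |= value >> space;
	}
}
```

Writing the 24-bit value 0xabcdef at bit offset 48 of a zeroed two-word map
leaves 0xcdef in the top 16 bits of word 0 and 0xab in the low 8 bits of
word 1.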
GCC gives a warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
Add an explicit check that the value being written does not fall outside
the bitmap.
Falling outside the bitmap can never happen in the code, because the
boundaries are required to be correct before the function is called; the
caller is responsible for ensuring that.
The code change simply silences the GCC warning, because GCC is not aware
that the boundaries have already been checked. As such, we're better off
using __builtin_unreachable() here because we can avoid the latency of the
conditional check entirely.
Cc: Linus Walleij <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: William Breathitt Gray <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
drivers/gpio/clump_bits.h | 101 ++++++++++++++++++++++++++++++++++++++
1 file changed, 101 insertions(+)
create mode 100644 drivers/gpio/clump_bits.h
diff --git a/drivers/gpio/clump_bits.h b/drivers/gpio/clump_bits.h
new file mode 100644
index 000000000000..72ef772b83c8
--- /dev/null
+++ b/drivers/gpio/clump_bits.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __CLUMP_BITS_H
+#define __CLUMP_BITS_H
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+ find_next_clump((clump), (bits), (size), 0, (clump_size))
+
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned long offset = start % BITS_PER_LONG;
+ const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+ const unsigned long space = ceiling - start;
+ unsigned long value_low, value_high;
+
+ if (space >= nbits)
+ return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+ else {
+ value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+ value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+ return (value_low >> offset) | (value_high << space);
+ }
+}
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG inclusive)
+ * @start: bit offset of the value
+ */
+static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
+ unsigned long value, unsigned long value_width,
+ unsigned long start)
+{
+ const unsigned long index = BIT_WORD(start);
+ const unsigned long length = BIT_WORD(nbits);
+ const unsigned long offset = start % BITS_PER_LONG;
+ const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+ const unsigned long space = ceiling - start;
+
+ value &= GENMASK(value_width - 1, 0);
+
+ if (space >= value_width) {
+ map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
+ map[index] |= value << offset;
+ } else {
+ map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+ map[index + 0] |= value << offset;
+
+ if (index + 1 >= length)
+ __builtin_unreachable();
+
+ map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
+ map[index + 1] |= value >> space;
+ }
+}
+
+/**
+ * for_each_set_clump - iterate over bitmap for each clump with set bits
+ * @start: bit offset to start search and to store the current iteration offset
+ * @clump: location to store copy of current clump
+ * @bits: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ * @clump_size: clump size in bits
+ */
+#define for_each_set_clump(start, clump, bits, size, clump_size) \
+ for ((start) = find_first_clump(&(clump), (bits), (size), (clump_size)); \
+ (start) < (size); \
+ (start) = find_next_clump(&(clump), (bits), (size), (start) + (clump_size), (clump_size)))
+
+#endif /* __CLUMP_BITS_H */
--
2.29.0
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
clump sizes of 8 bits, 24 bits, 30 bits and 6 bits.
The cases cover clumps that are split at a word boundary as well as
zeroes at the start and in the middle of the bitmap.
Cc: Andy Shevchenko <[email protected]>
Cc: William Breathitt Gray <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
lib/test_bitmap.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 144 insertions(+)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 4425a1dd4ef1..c5b5fb98c9dd 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -13,6 +13,7 @@
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <../drivers/gpio/clump_bits.h>
#include "../tools/testing/selftests/kselftest_module.h"
@@ -155,6 +156,37 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
return true;
}
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+ const unsigned int offset,
+ const unsigned int size,
+ const unsigned long *const clump_exp,
+ const unsigned long *const clump,
+ const unsigned long clump_size)
+{
+ unsigned long exp;
+
+ if (offset >= size) {
+ pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+ srcfile, line, size, offset);
+ return false;
+ }
+
+ exp = clump_exp[offset / clump_size];
+ if (!exp) {
+ pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+ srcfile, line, offset);
+ return false;
+ }
+
+ if (*clump != exp) {
+ pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+ srcfile, line, exp, *clump);
+ return false;
+ }
+
+ return true;
+}
+
#define __expect_eq(suffix, ...) \
({ \
int result = 0; \
@@ -172,6 +204,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
#define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
#define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__)
#define expect_eq_clump8(...) __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...) __expect_eq(clump, ##__VA_ARGS__)
static void __init test_zero_clear(void)
{
@@ -530,6 +563,28 @@ static void noinline __init test_mem_optimisations(void)
}
}
+static const unsigned long clump_bitmap_data[] __initconst = {
+ 0x38000201,
+ 0x05ff0f38,
+ 0xeffedcba,
+ 0xbbbbabcd,
+ 0x000000aa,
+ 0x000000aa,
+ 0x00ff0000,
+ 0xaaaaaa00,
+ 0xff000000,
+ 0x00aa0000,
+ 0x00000000,
+ 0x00000000,
+ 0x00000000,
+ 0x0f000000,
+ 0x00ff0000,
+ 0xaaaaaa00,
+ 0xff000000,
+ 0x00aa0000,
+ 0x00000ac0,
+};
+
static const unsigned char clump_exp[] __initconst = {
0x01, /* 1 bit set */
0x02, /* non-edge 1 bit set */
@@ -541,6 +596,94 @@ static const unsigned char clump_exp[] __initconst = {
0x05, /* non-adjacent 2 bits set */
};
+static const unsigned long clump_exp1[] __initconst = {
+ 0x01, /* 1 bit set */
+ 0x02, /* non-edge 1 bit set */
+ 0x00, /* zero bits set */
+ 0x38, /* 3 bits set across 4-bit boundary */
+ 0x38, /* Repeated clump */
+ 0x0F, /* 4 bits set */
+ 0xFF, /* all bits set */
+ 0x05, /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+ 0xfedcba, /* 24 bits */
+ 0xabcdef,
+ 0xaabbbb, /* Clump split between 2 words */
+ 0x000000, /* zeroes in between */
+ 0x0000aa,
+ 0x000000,
+ 0x0000ff,
+ 0xaaaaaa,
+ 0x000000,
+ 0x0000ff,
+};
+
+static const unsigned long clump_exp3[] __initconst = {
+ 0x00000000, /* starting with 0s*/
+ 0x00000000, /* All 0s */
+ 0x00000000,
+ 0x00000000,
+ 0x3f00000f, /* Non zero set */
+ 0x2aa80003,
+ 0x00000aaa,
+ 0x00003fc0,
+};
+
+static const unsigned long clump_exp4[] __initconst = {
+ 0x00,
+ 0x2b,
+};
+
+struct clump_test_data_params {
+ DECLARE_BITMAP(data, 256);
+ unsigned long count;
+ unsigned long offset;
+ unsigned long limit;
+ unsigned long clump_size;
+ unsigned long const *exp;
+};
+
+static struct clump_test_data_params clump_test_data[] __initdata = {
+ {{0}, 2, 0, 64, 8, clump_exp1},
+ {{0}, 8, 2, 240, 24, clump_exp2},
+ {{0}, 8, 10, 240, 30, clump_exp3},
+ {{0}, 1, 18, 18, 6, clump_exp4} };
+
+static void __init prepare_test_data(unsigned int index)
+{
+ int i;
+ unsigned long width = 0;
+
+ for (i = 0; i < clump_test_data[index].count; i++) {
+ bitmap_set_value(clump_test_data[index].data, 256,
+ clump_bitmap_data[(clump_test_data[index].offset)++], 32, width);
+ width += 32;
+ }
+}
+
+static void __init execute_for_each_set_clump_test(unsigned int index)
+{
+ unsigned long start, clump;
+
+ for_each_set_clump(start, clump, clump_test_data[index].data,
+ clump_test_data[index].limit,
+ clump_test_data[index].clump_size)
+ expect_eq_clump(start, clump_test_data[index].limit, clump_test_data[index].exp,
+ &clump, clump_test_data[index].clump_size);
+}
+
+static void __init test_for_each_set_clump(void)
+{
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(clump_test_data); i++) {
+ prepare_test_data(i);
+ execute_for_each_set_clump_test(i);
+ }
+}
+
static void __init test_for_each_set_clump8(void)
{
#define CLUMP_EXP_NUMBITS 64
@@ -631,6 +774,7 @@ static void __init selftest(void)
test_bitmap_parselist();
test_mem_optimisations();
test_for_each_set_clump8();
+ test_for_each_set_clump();
test_bitmap_cut();
}
--
2.29.0
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple, we
can now skip banks whose mask has no set bits and save cycles.
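The bank-skipping idea can be sketched in userspace as follows. It assumes
one 64-bit word per bank and records the values that would be written
rather than touching any register; the struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One planned register update for a bank (illustration only). */
struct sketch_bank_write {
	size_t bank;
	uint64_t set_bits;
	uint64_t clear_bits;
};

/*
 * Plan the set/clear writes for each bank whose mask has a set bit.
 * Banks with an all-zero mask are skipped, so no write is planned for them.
 * 'out' needs room for 'nbanks' entries. Returns the number of writes.
 */
static size_t sketch_plan_bank_writes(const uint64_t *mask,
				      const uint64_t *bits, size_t nbanks,
				      struct sketch_bank_write *out)
{
	size_t planned = 0;

	assert(mask != NULL && bits != NULL && out != NULL);

	for (size_t bank = 0; bank < nbanks; bank++) {
		if (mask[bank] == 0)
			continue;	/* nothing to change in this bank */

		out[planned].bank = bank;
		out[planned].set_bits = bits[bank] & mask[bank];
		out[planned].clear_bits = ~bits[bank] & mask[bank];
		planned++;
	}

	return planned;
}
```

With three banks of which only the first and third have mask bits set, two
writes are planned and the middle bank is left untouched.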
Cc: William Breathitt Gray <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
drivers/gpio/gpio-thunderx.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..716b75ba7df6 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,6 +16,7 @@
#include <linux/pci.h>
#include <linux/spinlock.h>
#include <asm-generic/msi.h>
+#include <../drivers/gpio/clump_bits.h>
#define GPIO_RX_DAT 0x0
@@ -275,12 +276,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *bits)
{
int bank;
- u64 set_bits, clear_bits;
+ unsigned long set_bits, clear_bits, gpio_mask;
+ unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
- for (bank = 0; bank <= chip->ngpio / 64; bank++) {
- set_bits = bits[bank] & mask[bank];
- clear_bits = ~bits[bank] & mask[bank];
+ for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+ bank = offset / 64;
+ set_bits = bits[bank] & gpio_mask;
+ clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
--
2.29.0
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions:
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now handle one channel at a time
and save cycles.
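The per-channel approach can be sketched in userspace with a single 64-bit
image in place of the kernel bitmaps. It assumes two channels of at most
32 lines each, with channel 1 packed directly above channel 0; the function
names are hypothetical and no register is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the low 'width' bits (width may be 0..64). */
static uint64_t sketch_low_mask(unsigned int width)
{
	return width >= 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
}

/*
 * Apply 'bits' under 'mask' to the two channel states and report which
 * channel changed, so that only those would need a register write.
 */
static void sketch_set_multiple(uint32_t state[2], const unsigned int width[2],
				uint64_t mask, uint64_t bits, bool changed[2])
{
	uint64_t old, next;
	uint32_t next_state[2];

	assert(width[0] <= 32 && width[1] <= 32);

	/* Pack both channels into one image, channel 1 above channel 0. */
	old = (state[0] & sketch_low_mask(width[0])) |
	      ((state[1] & sketch_low_mask(width[1])) << width[0]);

	/* Take masked positions from 'bits', the rest from the old image. */
	next = (old & ~mask) | (bits & mask);

	/* Unpack the image back into per-channel states. */
	next_state[0] = (uint32_t)(next & sketch_low_mask(width[0]));
	next_state[1] = (uint32_t)((next >> width[0]) &
				   sketch_low_mask(width[1]));

	for (int i = 0; i < 2; i++) {
		changed[i] = next_state[i] != state[i];
		state[i] = next_state[i];
	}
}
```

A caller would then write the data register only for channels whose
changed[] flag is set.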
Cc: William Breathitt Gray <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Michal Simek <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
drivers/gpio/gpio-xilinx.c | 66 +++++++++++++++++++-------------------
1 file changed, 33 insertions(+), 33 deletions(-)
diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..d565fbf128b7 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -14,6 +14,7 @@
#include <linux/io.h>
#include <linux/gpio/driver.h>
#include <linux/slab.h>
+#include <../drivers/gpio/clump_bits.h>
/* Register Offset Definitions */
#define XGPIO_DATA_OFFSET (0x0) /* Data register */
@@ -138,37 +139,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
{
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
- int index = xgpio_index(chip, 0);
- int offset, i;
-
- spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
- /* Write to GPIO signals */
- for (i = 0; i < gc->ngpio; i++) {
- if (*mask == 0)
- break;
- /* Once finished with an index write it out to the register */
- if (index != xgpio_index(chip, i)) {
- xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
- index * XGPIO_CHANNEL_OFFSET,
- chip->gpio_state[index]);
- spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
- index = xgpio_index(chip, i);
- spin_lock_irqsave(&chip->gpio_lock[index], flags);
- }
- if (__test_and_clear_bit(i, mask)) {
- offset = xgpio_offset(chip, i);
- if (test_bit(i, bits))
- chip->gpio_state[index] |= BIT(offset);
- else
- chip->gpio_state[index] &= ~BIT(offset);
- }
- }
-
- xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
- index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
- spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+ u32 *const state = chip->gpio_state;
+ unsigned int *const width = chip->gpio_width;
+
+ DECLARE_BITMAP(old, 64);
+ DECLARE_BITMAP(new, 64);
+ DECLARE_BITMAP(changed, 64);
+
+ spin_lock_irqsave(&chip->gpio_lock[0], flags);
+ spin_lock(&chip->gpio_lock[1]);
+
+ bitmap_set_value(old, 64, state[0], width[0], 0);
+ bitmap_set_value(old, 64, state[1], width[1], width[0]);
+ bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+ bitmap_set_value(old, 64, state[0], 32, 0);
+ bitmap_set_value(old, 64, state[1], 32, 32);
+ state[0] = bitmap_get_value(new, 0, width[0]);
+ state[1] = bitmap_get_value(new, width[0], width[1]);
+ bitmap_set_value(new, 64, state[0], 32, 0);
+ bitmap_set_value(new, 64, state[1], 32, 32);
+ bitmap_xor(changed, old, new, 64);
+
+ if (((u32 *)changed)[0])
+ xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+ state[0]);
+ if (((u32 *)changed)[1])
+ xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+ XGPIO_CHANNEL_OFFSET, state[1]);
+
+ spin_unlock(&chip->gpio_lock[1]);
+ spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
}
/**
@@ -292,6 +293,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
spin_lock_init(&chip->gpio_lock[0]);
+ spin_lock_init(&chip->gpio_lock[1]);
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -313,8 +315,6 @@ static int xgpio_probe(struct platform_device *pdev)
if (of_property_read_u32(np, "xlnx,gpio2-width",
&chip->gpio_width[1]))
chip->gpio_width[1] = 32;
-
- spin_lock_init(&chip->gpio_lock[1]);
}
chip->gc.base = -1;
--
2.29.0
Add an extra check that the sum of the widths does not exceed 64. If it
does, return -EINVAL along with an appropriate error message.
Cc: Michal Simek <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
drivers/gpio/gpio-xilinx.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index d565fbf128b7..c9d740ac711b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -319,6 +319,12 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gc.base = -1;
chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
+
+ if (chip->gc.ngpio > 64) {
+ dev_err(&pdev->dev, "invalid configuration: number of GPIO is greater than 64");
+ return -EINVAL;
+ }
+
chip->gc.parent = &pdev->dev;
chip->gc.direction_input = xgpio_dir_in;
chip->gc.direction_output = xgpio_dir_out;
--
2.29.0
On Sat, Dec 26, 2020 at 7:41 AM Syed Nayyar Waris <[email protected]> wrote:
> Since this patchset primarily affects GPIO drivers, would you like
> to pick it up through your GPIO tree?
Actually Bartosz is handling the GPIO patches for v5.12.
I tried to merge the patch series before but failed for
various reasons.
Yours,
Linus Walleij
On Sat, Dec 26, 2020 at 7:44 AM Syed Nayyar Waris <[email protected]> wrote:
> This patch reimplements the xgpio_set_multiple() function in
> drivers/gpio/gpio-xilinx.c to use the new generic functions:
> bitmap_get_value() and bitmap_set_value(). The code is now simpler
> to read and understand. Moreover, instead of looping for each bit
> in xgpio_set_multiple() function, now we can check each channel at
> a time and save cycles.
>
> Cc: William Breathitt Gray <[email protected]>
> Cc: Bartosz Golaszewski <[email protected]>
> Cc: Michal Simek <[email protected]>
> Signed-off-by: Syed Nayyar Waris <[email protected]>
(...)
> +#include <../drivers/gpio/clump_bits.h>
What is this?
Isn't a simple
#include "clump_bits.h"
enough?
We need an ACK from the Xilinx people that they think this
actually improves the readability and maintainability of their
driver.
Yours,
Linus Walleij
On Sat, Dec 26, 2020 at 7:42 AM Syed Nayyar Waris <[email protected]> wrote:
>
> This macro iterates for each group of bits (clump) with set bits,
> within a bitmap memory region. For each iteration, "start" is set to
> the bit offset of the found clump, while the respective clump value is
> stored to the location pointed by "clump". Additionally, the
> bitmap_get_value() and bitmap_set_value() functions are introduced to
> respectively get and set a value of n-bits in a bitmap memory region.
> The n-bits can have any size from 1 to BITS_PER_LONG. size less
> than 1 or more than BITS_PER_LONG causes undefined behaviour.
> Moreover, during setting value of n-bit in bitmap, if a situation arise
> that the width of next n-bit is exceeding the word boundary, then it
> will divide itself such that some portion of it is stored in that word,
> while the remaining portion is stored in the next higher word. Similar
> situation occurs while retrieving the value from bitmap.
>
> GCC gives warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
> Add explicit check to see if the value being written into the bitmap
> does not fall outside the bitmap.
> The situation that it is falling outside would never be possible in the
> code because the boundaries are required to be correct before the
> function is called. The responsibility is on the caller for ensuring the
> boundaries are correct.
> The code change is simply to silence the GCC warning messages
> because GCC is not aware that the boundaries have already been checked.
> As such, we're better off using __builtin_unreachable() here because we
> can avoid the latency of the conditional check entirely.
Didn't the __builtin_unreachable() end up leading to an objtool
warning about incorrect stack frames for the code path that leads
into the undefined behavior? I thought I saw a message from the 0day
build bot about that and didn't expect to see it again after that.
Can you actually measure any performance difference compared
to BUG_ON() that avoids the undefined behavior? Practically
all CPUs from the past 20 years have branch predictors that should
completely hide measurable overhead from this.
Arnd
On Sat, Dec 26, 2020 at 12:15:20PM +0530, Syed Nayyar Waris wrote:
> Add extra check to see if sum of widths does not exceed 64. If it
> exceeds then return -EINVAL alongwith appropriate error message.
>
> Cc: Michal Simek <[email protected]>
> Signed-off-by: Syed Nayyar Waris <[email protected]>
Hello Syed,
This change is independent from the rest of this patchset so I recommend
dropping this patch and instead resubmitting it separately as an
independent patch submission.
William Breathitt Gray
> ---
> drivers/gpio/gpio-xilinx.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> index d565fbf128b7..c9d740ac711b 100644
> --- a/drivers/gpio/gpio-xilinx.c
> +++ b/drivers/gpio/gpio-xilinx.c
> @@ -319,6 +319,12 @@ static int xgpio_probe(struct platform_device *pdev)
>
> chip->gc.base = -1;
> chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
> +
> + if (chip->gc.ngpio > 64) {
> + dev_err(&pdev->dev, "invalid configuration: number of GPIO is greater than 64");
> + return -EINVAL;
> + }
> +
> chip->gc.parent = &pdev->dev;
> chip->gc.direction_input = xgpio_dir_in;
> chip->gc.direction_output = xgpio_dir_out;
> --
> 2.29.0
>
On Sun, Dec 27, 2020 at 11:03:06PM +0100, Arnd Bergmann wrote:
> On Sat, Dec 26, 2020 at 7:42 AM Syed Nayyar Waris <[email protected]> wrote:
> >
> > This macro iterates for each group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to
> > the bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value() and bitmap_set_value() functions are introduced to
> > respectively get and set a value of n-bits in a bitmap memory region.
> > The n-bits can have any size from 1 to BITS_PER_LONG. size less
> > than 1 or more than BITS_PER_LONG causes undefined behaviour.
> > Moreover, during setting value of n-bit in bitmap, if a situation arise
> > that the width of next n-bit is exceeding the word boundary, then it
> > will divide itself such that some portion of it is stored in that word,
> > while the remaining portion is stored in the next higher word. Similar
> > situation occurs while retrieving the value from bitmap.
> >
> > GCC gives warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
> > Add explicit check to see if the value being written into the bitmap
> > does not fall outside the bitmap.
> > The situation that it is falling outside would never be possible in the
> > code because the boundaries are required to be correct before the
> > function is called. The responsibility is on the caller for ensuring the
> > boundaries are correct.
> > The code change is simply to silence the GCC warning messages
> > because GCC is not aware that the boundaries have already been checked.
> > As such, we're better off using __builtin_unreachable() here because we
> > can avoid the latency of the conditional check entirely.
>
> Didn't the __builtin_unreachable() end up leading to an objtool
> warning about incorrect stack frames for the code path that leads
> into the undefined behavior? I thought I saw a message from the 0day
> build bot about that and didn't expect to see it again after that.
>
> Can you actually measure any performance difference compared
> to BUG_ON() that avoids the undefined behavior? Practically
> all CPUs from the past 20 years have branch predictors that should
> completely hide measurable overhead from this.
>
> Arnd
When I initially recommended using __builtin_unreachable(), I was
anticipating the use of bitmap_set_value() in kernel at large -- so the
possible performance hit from a conditional check was a concern for me.
However, now that we're restricting the scope of bitmap_set_value() to
only the GPIO subsystem, such optimization is no longer a major concern
I feel: gpio-xilinx is the only driver utilizing bitmap_set_value() --
and we know it won't be called in a loop -- so whatever hypothetical
performance hit there might be is inconsequential in the end.
Instead, we should focus on code clarity now. I believe it makes sense
given the new scope of this function to revert back to the earlier
suggestion of passing in and checking the boundary explicitly, and to
remove the __builtin_unreachable() call for now. If bitmap_set_value()
becomes available to the rest of the kernel in the future, we can
reconsider whether or not to use __builtin_unreachable().
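A userspace sketch of the explicit-boundary alternative is shown below. It
assumes 64-bit words and reports -EINVAL instead of invoking undefined
behaviour; sketch_set_value_checked() is hypothetical and writes bit by
bit purely to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WORD_BITS 64u

/*
 * Store the low 'width' bits of 'value' at bit offset 'start' of a map
 * that is 'map_bits' bits long. Returns 0 on success, or -EINVAL when the
 * width is out of range or the value would not fit inside the map; in the
 * error case the map is left unmodified.
 */
static int sketch_set_value_checked(uint64_t *map, size_t map_bits,
				    uint64_t value, unsigned int width,
				    size_t start)
{
	assert(map != NULL);

	if (width < 1 || width > SKETCH_WORD_BITS)
		return -EINVAL;
	if (start >= map_bits || width > map_bits - start)
		return -EINVAL;

	for (unsigned int i = 0; i < width; i++) {
		const size_t bit = start + i;
		const uint64_t bit_mask =
			UINT64_C(1) << (bit % SKETCH_WORD_BITS);

		if ((value >> i) & 1)
			map[bit / SKETCH_WORD_BITS] |= bit_mask;
		else
			map[bit / SKETCH_WORD_BITS] &= ~bit_mask;
	}

	return 0;
}
```

Because the range is validated before any word is touched, a write that
would spill past the end of the map is rejected rather than reaching the
second-word store.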
William Breathitt Gray
On 26. 12. 20 7:45, Syed Nayyar Waris wrote:
> Add extra check to see if sum of widths does not exceed 64. If it
> exceeds then return -EINVAL alongwith appropriate error message.
>
> Cc: Michal Simek <[email protected]>
> Signed-off-by: Syed Nayyar Waris <[email protected]>
> ---
> drivers/gpio/gpio-xilinx.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> index d565fbf128b7..c9d740ac711b 100644
> --- a/drivers/gpio/gpio-xilinx.c
> +++ b/drivers/gpio/gpio-xilinx.c
> @@ -319,6 +319,12 @@ static int xgpio_probe(struct platform_device *pdev)
>
> chip->gc.base = -1;
> chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
> +
> + if (chip->gc.ngpio > 64) {
> + dev_err(&pdev->dev, "invalid configuration: number of GPIO is greater than 64");
> + return -EINVAL;
> + }
> +
> chip->gc.parent = &pdev->dev;
> chip->gc.direction_input = xgpio_dir_in;
> chip->gc.direction_output = xgpio_dir_out;
>
Acked-by: Michal Simek <[email protected]>
Thanks,
Michal
Hi, +Srinivas,
On 27. 12. 20 22:29, Linus Walleij wrote:
> On Sat, Dec 26, 2020 at 7:44 AM Syed Nayyar Waris <[email protected]> wrote:
>
>> This patch reimplements the xgpio_set_multiple() function in
>> drivers/gpio/gpio-xilinx.c to use the new generic functions:
>> bitmap_get_value() and bitmap_set_value(). The code is now simpler
>> to read and understand. Moreover, instead of looping for each bit
>> in xgpio_set_multiple() function, now we can check each channel at
>> a time and save cycles.
>>
>> Cc: William Breathitt Gray <[email protected]>
>> Cc: Bartosz Golaszewski <[email protected]>
>> Cc: Michal Simek <[email protected]>
>> Signed-off-by: Syed Nayyar Waris <[email protected]>
>
> (...)
>
>> +#include <../drivers/gpio/clump_bits.h>
>
> What is this?
>
> Isn't a simple
>
> #include "clump_bits.h"
>
> enough?
>
> We need an ACK from the Xilinx people that they think this
> actually improves the readability and maintainability of their
> driver.
Srinivas is going to send some patches against this driver, so please
check that both sets of changes fit together.
Thanks,
Michal
Hi,
On 26. 12. 20 7:41, Syed Nayyar Waris wrote:
> Hello Linus,
>
> Since this patchset primarily affects GPIO drivers, would you like
> to pick it up through your GPIO tree?
>
> (Note: Patchset resent with the new macro and relevant
> functions shifted to a new header clump_bits.h [Linus Torvalds])
>
> Michal,
> What do you think of [PATCH 5/5]? Is the conditional check needed? And
> also does returning -EINVAL look good?
As was said, it would be better to handle this outside of this series.
I expect nobody really describes FPGA designs by hand; people use a DT
generator for that. Still, I can't see any issue with checking that we
are not exceeding a certain limit.
Just keep in mind that every bank has at most 32 lines.
So if you specify bank0 as 40 and bank1 as 10, which is 50 in total, it
will pass your condition in 5/5 even though bank0 is over the limit.
Checking every bank separately is therefore probably the better approach.
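A rough user-space sketch of such a per-bank check, only to illustrate
the idea. The helper name check_bank_widths() and the macro
XGPIO_MAX_LINES_PER_BANK are made up for this example and are not part
of the driver; in xgpio_probe() the widths would come from
chip->gpio_width[].

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical name; the value 32 is the per-bank maximum mentioned above. */
#define XGPIO_MAX_LINES_PER_BANK 32u

/*
 * Return 0 if every bank is within the per-bank limit, -EINVAL otherwise.
 * Unlike a check on the sum, this rejects e.g. bank0 = 40, bank1 = 10.
 */
static int check_bank_widths(const unsigned int *widths, size_t nbanks)
{
	size_t i;

	assert(widths != NULL);

	for (i = 0; i < nbanks; i++) {
		if (widths[i] > XGPIO_MAX_LINES_PER_BANK)
			return -EINVAL;
	}

	return 0;
}
```

With two banks of 32 lines each, the total can never exceed 64, so a
per-bank check would also cover the condition tested in 5/5.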
Thanks,
Michal
On Sun, Dec 27, 2020 at 10:27 PM Linus Walleij <[email protected]> wrote:
>
> On Sat, Dec 26, 2020 at 7:41 AM Syed Nayyar Waris <[email protected]> wrote:
>
> > Since this patchset primarily affects GPIO drivers, would you like
> > to pick it up through your GPIO tree?
>
> Actually Bartosz is handling the GPIO patches for v5.12.
> I tried to merge the patch series before but failed for
> various reasons.
>
> Yours,
> Linus Walleij
My info on this is a bit outdated - didn't Linus Torvalds reject these
patches from Andrew Morton's PR? Or am I confusing this series with
something else?
Bart
On Tue, Jan 05, 2021 at 03:19:13PM +0100, Bartosz Golaszewski wrote:
> On Sun, Dec 27, 2020 at 10:27 PM Linus Walleij <[email protected]> wrote:
> >
> > On Sat, Dec 26, 2020 at 7:41 AM Syed Nayyar Waris <[email protected]> wrote:
> >
> > > Since this patchset primarily affects GPIO drivers, would you like
> > > to pick it up through your GPIO tree?
> >
> > Actually Bartosz is handling the GPIO patches for v5.12.
> > I tried to merge the patch series before but failed for
> > various reasons.
> My info on this is a bit outdated - didn't Linus Torvalds reject these
> patches from Andrew Morton's PR? Or am I confusing this series with
> something else?
Linus T. said that it can be done inside the GPIO realm. This version
tries (badly in my opinion) to achieve that.
--
With Best Regards,
Andy Shevchenko
On Tue, Jan 5, 2021 at 3:38 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Tue, Jan 05, 2021 at 03:19:13PM +0100, Bartosz Golaszewski wrote:
> > On Sun, Dec 27, 2020 at 10:27 PM Linus Walleij <[email protected]> wrote:
> > >
> > > On Sat, Dec 26, 2020 at 7:41 AM Syed Nayyar Waris <[email protected]> wrote:
> > >
> > > > Since this patchset primarily affects GPIO drivers, would you like
> > > > to pick it up through your GPIO tree?
> > >
> > > Actually Bartosz is handling the GPIO patches for v5.12.
> > > I tried to merge the patch series before but failed for
> > > various reasons.
>
> > My info on this is a bit outdated - didn't Linus Torvalds reject these
> > patches from Andrew Morton's PR? Or am I confusing this series with
> > something else?
>
> Linus T. told that it can be done inside GPIO realm. This version tries
> (badly in my opinion) to achieve that.
>
I see that William and Arnd have some unaddressed issues with patch 1
(the use of __builtin_unreachable()).
Admittedly, I didn't follow the previous iterations closely, so I may
be missing some history behind it. Why do the first two patches go into
lib if this is supposed to be gpiolib-only?
Bartosz
On Wed, Jan 06, 2021 at 08:27:43AM +0100, Bartosz Golaszewski wrote:
> On Tue, Jan 5, 2021 at 3:38 PM Andy Shevchenko
> <[email protected]> wrote:
> >
> > On Tue, Jan 05, 2021 at 03:19:13PM +0100, Bartosz Golaszewski wrote:
> > > On Sun, Dec 27, 2020 at 10:27 PM Linus Walleij <[email protected]> wrote:
> > > >
> > > > On Sat, Dec 26, 2020 at 7:41 AM Syed Nayyar Waris <[email protected]> wrote:
> > > >
> > > > > Since this patchset primarily affects GPIO drivers, would you like
> > > > > to pick it up through your GPIO tree?
> > > >
> > > > Actually Bartosz is handling the GPIO patches for v5.12.
> > > > I tried to merge the patch series before but failed for
> > > > various reasons.
> >
> > > My info on this is a bit outdated - didn't Linus Torvalds reject these
> > > patches from Andrew Morton's PR? Or am I confusing this series with
> > > something else?
> >
> > Linus T. told that it can be done inside GPIO realm. This version tries
> > (badly in my opinion) to achieve that.
> >
>
> I'm seeing William and Arnd have some unaddressed issues with patch 1
> (with using __builtin_unreachable()).
>
> Admittedly I didn't follow the previous iterations too much so I may
> miss some history behind it. Why do the first two patches go into lib
> if this is supposed to be gpiolib-only?
>
> Bartosz
This patchset originally started out as a replacement for
bitmap_get_value8/bitmap_set_value8/for_each_set_clump8, which are used
outside of the GPIO subsystem. Over the course of the revisions, the
scope of this patchset was reduced, and it now affects only GPIO
drivers.
You're right that this shouldn't be going into lib anymore because it's
gpiolib-only now. I expect the next revision of this patchset that Syed
submits will address that.
William Breathitt Gray
On Sat, Dec 26, 2020 at 8:15 PM Andy Shevchenko
<[email protected]> wrote:
>
>
>
> On Saturday, December 26, 2020, Syed Nayyar Waris <[email protected]> wrote:
>>
>> The introduction of the generic for_each_set_clump macro need test
>> cases to verify the implementation. This patch adds test cases for
>> scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
>> The cases contain situations where clump is getting split at the word
>> boundary and also when zeroes are present in the start and middle of
>> bitmap.
>
>
> You have to split it to a separate test under drivers/gpio, because now it has no sense to be like this.
Hi Andy,
How do I split it into a separate test under drivers/gpio? I have
thought of creating a test_clump_bits.c file in drivers/gpio.
But how do I integrate this test file so that the tests are executed at
runtime, similar to the tests in lib/test_bitmap.c?
I believe I need to make changes to the config files so that the tests
in test_clump_bits.c (in drivers/gpio) are executed at runtime. Could
you please provide some steps on how to do that? Thank you!
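For reference, the word-boundary case that such a test has to cover can
be sketched in user space as below. This is only an illustration:
clump_get() is a hypothetical stand-in written for this example, not
the bitmap_get_value() from the patchset, and it uses fixed 64-bit
words instead of unsigned long. The bitmap words and the first clump
value are taken from the 24-bit example in the cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/*
 * Hypothetical helper: read an nbits-wide clump starting at bit 'start'.
 * The clump may straddle two 64-bit words. The caller must make sure
 * that map[index + 1] exists whenever the clump crosses a word boundary.
 */
static uint64_t clump_get(const uint64_t *map, size_t start, unsigned int nbits)
{
	const size_t index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset; /* bits left in this word */
	uint64_t mask, value;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	mask = (nbits == WORD_BITS) ? UINT64_MAX : ((UINT64_C(1) << nbits) - 1);

	value = map[index] >> offset;
	if (space < nbits) /* clump is split: take the rest from the next word */
		value |= map[index + 1] << space;

	return value & mask;
}
```

The clump at offset 48 is the interesting one: its low 16 bits come
from the first word and its high 8 bits from the second.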
Regards
Syed Nayyar Waris
On Thu, Feb 4, 2021 at 2:25 PM Syed Nayyar Waris <[email protected]> wrote:
>
> On Sat, Dec 26, 2020 at 8:15 PM Andy Shevchenko
> <[email protected]> wrote:
> >
> >
> >
> > On Saturday, December 26, 2020, Syed Nayyar Waris <[email protected]> wrote:
> >>
> >> The introduction of the generic for_each_set_clump macro need test
> >> cases to verify the implementation. This patch adds test cases for
> >> scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
> >> The cases contain situations where clump is getting split at the word
> >> boundary and also when zeroes are present in the start and middle of
> >> bitmap.
> >
> >
> > You have to split it to a separate test under drivers/gpio, because now it has no sense to be like this.
>
> Hi Andy,
>
> How do I split it into separate test under drivers/gpio ? I have
> thought of making a test_clump_bits.c file in drivers/gpio.
> But how do I integrate this test file so that tests are executed at
> runtime? Similar to tests in lib/test_bitmap.c ?
>
> I believe I need to make changes in config files so that tests in
> test_clump_bits.c ( in drivers/gpio ) are executed at runtime. Could
> you please provide some steps on how to do that. Thank You !
>
> Regards
> Syed Nayyar Waris
Hi Andy, could you please help me with the above? Thanks!
Regards
Syed Nayyar Waris