2019-03-03 07:45:20

by William Breathitt Gray

Subject: [PATCH v9 0/9] Introduce the for_each_set_clump8 macro

Changes in v9:
- Return unsigned long for bitmap_get_value8 for consistency

While adding GPIO get_multiple/set_multiple callback support for various
drivers, I noticed a recurring looping pattern that would be useful to
standardize as a macro.

This patchset introduces the for_each_set_clump8 macro and utilizes it
in several GPIO drivers. The for_each_set_clump8 macro facilitates a
for-loop syntax that iterates over a memory region one 8-bit group of
bits (clump) at a time, visiting only the groups that have at least one
set bit.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example: 10111110 00000000 11111111 00110011
First loop: 10111110 00000000 11111111 XXXXXXXX
Second loop: 10111110 00000000 XXXXXXXX 00110011
Third loop: XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

The for_each_set_clump8 macro has four parameters:

* start: set to the bit offset of the current clump
* clump: set to the current clump value
* bits: bitmap to search within
* size: bitmap size in number of bits
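
As a rough userspace sketch (not part of this patchset), the loop shape
described above can be mimicked on a single 32-bit word. The names
next_clump8_u32, for_each_set_clump8_u32 and collect_clumps are
hypothetical stand-ins chosen for illustration; the real macro operates
on unsigned long bitmaps of arbitrary size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for find_next_clump8(), limited to a
 * single 32-bit word. Returns the bit offset of the next nonzero 8-bit
 * group at or after offset, or size if there is none.
 */
static unsigned int next_clump8_u32(unsigned long *clump, uint32_t word,
				    unsigned int offset, unsigned int size)
{
	assert(size <= 32 && size % 8 == 0);

	for (; offset < size; offset += 8) {
		*clump = (word >> offset) & 0xFF;
		if (*clump)
			return offset;
	}

	return size;
}

/* Hypothetical analogue of for_each_set_clump8() for the stand-in above */
#define for_each_set_clump8_u32(start, clump, word, size) \
	for ((start) = next_clump8_u32(&(clump), (word), 0, (size)); \
	     (start) < (size); \
	     (start) = next_clump8_u32(&(clump), (word), (start) + 8, (size)))

/* Record up to max visited clumps; returns how many were recorded */
static size_t collect_clumps(uint32_t word, unsigned int *starts,
			     unsigned long *clumps, size_t max)
{
	unsigned int start;
	unsigned long clump;
	size_t n = 0;

	for_each_set_clump8_u32(start, clump, word, 32) {
		if (n == max)
			break;
		starts[n] = start;
		clumps[n] = clump;
		n++;
	}

	return n;
}
```

Calling collect_clumps() on 0xBE00FF33 (the example word above) should
visit the groups at bit offsets 0, 8 and 24 and skip the all-zero group
at offset 16.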

In this version of the patchset, the for_each_set_clump macro has been
reimplemented and simplified based on the suggestions provided by Rasmus
Villemoes and Andy Shevchenko in the version 4 submission.

In particular, the function of the for_each_set_clump macro has been
restricted to handle only 8-bit clumps; the drivers that use the
for_each_set_clump macro only handle 8-bit ports so a generic
for_each_set_clump implementation is not necessary. Thus, a solution for
large clumps (i.e. those larger than the width of a bitmap word) can be
postponed until a driver appears that actually requires such a generic
for_each_set_clump implementation.

For what it's worth, a semi-generic for_each_set_clump (i.e. for clumps
smaller than the width of a bitmap word) can be implemented by simply
replacing the hardcoded '8' and '0xFF' instances with respective
variables. I have not yet had a need for such an implementation, and
since it falls short of a true generic for_each_set_clump function, I
have decided to forgo such an implementation for now.
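
For illustration only, one possible shape of such a semi-generic search
is sketched below in userspace C. The function name find_next_clump and
its width parameter are assumptions, not part of this patchset, and the
sketch additionally assumes that width divides the word size evenly so
that a clump never straddles two words.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical semi-generic variant: the hardcoded 8 becomes width and
 * the hardcoded 0xFF becomes a mask derived from width. Assumes width
 * is smaller than a bitmap word and divides BITS_PER_LONG evenly.
 */
static unsigned int find_next_clump(unsigned long *clump,
				    const unsigned long *addr,
				    unsigned int offset, unsigned int size,
				    unsigned int width)
{
	unsigned long mask;

	assert(width > 0 && width < BITS_PER_LONG);
	assert(BITS_PER_LONG % width == 0);

	mask = (1UL << width) - 1;

	for (; offset < size; offset += width) {
		*clump = (addr[offset / BITS_PER_LONG] >>
			  (offset % BITS_PER_LONG)) & mask;
		if (*clump)
			return offset;
	}

	return size;
}
```

Clumps that may cross a word boundary would need the low/high split
that bitmap_get_value8 performs in patch 1, which this sketch
deliberately leaves out.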

In addition, the bitmap_get_value8 and bitmap_set_value8 functions are
introduced to get and set 8-bit values respectively. Their use is based
on the behavior suggested in the patchset version 4 review.
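
A simplified userspace sketch of the get/set idea follows. It is not
the implementation from patch 1: the names get_value8 and set_value8
are hypothetical, start is assumed to be a multiple of 8 (so a value
never crosses a word boundary), and value is masked to 8 bits, whereas
the bitmap_set_value8 in this series documents that wider values may
clobber the bitmap.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read the 8-bit value at bit offset start (assumed 8-bit aligned) */
static unsigned long get_value8(const unsigned long *bitmap,
				unsigned int start)
{
	assert(start % 8 == 0);

	return (bitmap[start / BITS_PER_LONG] >>
		(start % BITS_PER_LONG)) & 0xFF;
}

/* Overwrite the 8-bit value at bit offset start (assumed 8-bit aligned) */
static void set_value8(unsigned long *bitmap, unsigned long value,
		       unsigned int start)
{
	unsigned long *const word = &bitmap[start / BITS_PER_LONG];
	const unsigned int offset = start % BITS_PER_LONG;

	assert(start % 8 == 0);

	/* clear the old 8 bits, then place the new ones */
	*word &= ~(0xFFUL << offset);
	*word |= (value & 0xFF) << offset;
}
```

Because every offset produced by for_each_set_clump8 on a bitmap
scanned from bit 0 is a multiple of 8, the aligned case is the one the
converted drivers exercise.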

William Breathitt Gray (9):
bitops: Introduce the for_each_set_clump8 macro
lib/test_bitmap.c: Add for_each_set_clump8 test cases
gpio: 104-dio-48e: Utilize for_each_set_clump8 macro
gpio: 104-idi-48: Utilize for_each_set_clump8 macro
gpio: gpio-mm: Utilize for_each_set_clump8 macro
gpio: ws16c48: Utilize for_each_set_clump8 macro
gpio: pci-idio-16: Utilize for_each_set_clump8 macro
gpio: pcie-idio-24: Utilize for_each_set_clump8 macro
gpio: uniphier: Utilize for_each_set_clump8 macro

drivers/gpio/gpio-104-dio-48e.c | 73 ++++++--------------
drivers/gpio/gpio-104-idi-48.c | 37 +++-------
drivers/gpio/gpio-gpio-mm.c | 73 ++++++--------------
drivers/gpio/gpio-pci-idio-16.c | 75 ++++++++------------
drivers/gpio/gpio-pcie-idio-24.c | 111 +++++++++++-------------------
drivers/gpio/gpio-uniphier.c | 16 ++---
drivers/gpio/gpio-ws16c48.c | 72 ++++++-------------
include/asm-generic/bitops/find.h | 14 ++++
include/linux/bitops.h | 5 ++
lib/find_bit.c | 81 ++++++++++++++++++++++
lib/test_bitmap.c | 65 +++++++++++++++++
11 files changed, 313 insertions(+), 309 deletions(-)

--
2.21.0



2019-03-03 07:49:01

by William Breathitt Gray

Subject: [PATCH v9 2/9] lib/test_bitmap.c: Add for_each_set_clump8 test cases

The introduction of the for_each_set_clump8 macro warrants test cases to
verify the implementation. This patch adds test case checks for whether
an out-of-bounds clump index is returned, a zero clump is returned, or
the returned clump value differs from the expected clump value.

Cc: Andrew Morton <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
lib/test_bitmap.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 65 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6cd7d0740005..66ddb3fb98cb 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -88,6 +88,36 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
return true;
}

+static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
+ const unsigned int offset,
+ const unsigned int size,
+ const unsigned char *const clump_exp,
+ const unsigned long *const clump)
+{
+ unsigned long exp;
+
+ if (offset >= size) {
+ pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+ srcfile, line, size, offset);
+ return false;
+ }
+
+ exp = clump_exp[offset / 8];
+ if (!exp) {
+ pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0\n",
+ srcfile, line, offset);
+ return false;
+ }
+
+ if (*clump != exp) {
+ pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX\n",
+ srcfile, line, exp, *clump);
+ return false;
+ }
+
+ return true;
+}
+
#define __expect_eq(suffix, ...) \
({ \
int result = 0; \
@@ -104,6 +134,7 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
#define expect_eq_bitmap(...) __expect_eq(bitmap, ##__VA_ARGS__)
#define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
#define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__)
+#define expect_eq_clump8(...) __expect_eq(clump8, ##__VA_ARGS__)

static void __init test_zero_clear(void)
{
@@ -361,6 +392,39 @@ static void noinline __init test_mem_optimisations(void)
}
}

+static const unsigned char clump_exp[] __initconst = {
+ 0x01, /* 1 bit set */
+ 0x02, /* non-edge 1 bit set */
+ 0x00, /* zero bits set */
+ 0x38, /* 3 bits set across 4-bit boundary */
+ 0x38, /* Repeated clump */
+ 0x0F, /* 4 bits set */
+ 0xFF, /* all bits set */
+ 0x05, /* non-adjacent 2 bits set */
+};
+
+static void __init test_for_each_set_clump8(void)
+{
+#define CLUMP_EXP_NUMBITS 64
+ DECLARE_BITMAP(bits, CLUMP_EXP_NUMBITS);
+ unsigned int start;
+ unsigned long clump;
+
+ /* set bitmap to test case */
+ bitmap_zero(bits, CLUMP_EXP_NUMBITS);
+ bitmap_set(bits, 0, 1); /* 0x01 */
+ bitmap_set(bits, 9, 1); /* 0x02 */
+ bitmap_set(bits, 27, 3); /* 0x38 */
+ bitmap_set(bits, 35, 3); /* 0x38 */
+ bitmap_set(bits, 40, 4); /* 0x0F */
+ bitmap_set(bits, 48, 8); /* 0xFF */
+ bitmap_set(bits, 56, 1); /* 0x05 - part 1 */
+ bitmap_set(bits, 58, 1); /* 0x05 - part 2 */
+
+ for_each_set_clump8(start, clump, bits, CLUMP_EXP_NUMBITS)
+ expect_eq_clump8(start, CLUMP_EXP_NUMBITS, clump_exp, &clump);
+}
+
static int __init test_bitmap_init(void)
{
test_zero_clear();
@@ -369,6 +433,7 @@ static int __init test_bitmap_init(void)
test_bitmap_arr32();
test_bitmap_parselist();
test_mem_optimisations();
+ test_for_each_set_clump8();

if (failed_tests == 0)
pr_info("all %u tests passed\n", total_tests);
--
2.21.0


2019-03-03 07:49:23

by William Breathitt Gray

Subject: [PATCH v9 3/9] gpio: 104-dio-48e: Utilize for_each_set_clump8 macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.
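
The converted get_multiple logic can be approximated in userspace as
below. This is only a sketch under stated assumptions: fake_inb and
get_multiple_sketch are hypothetical names, the port registers are
simulated by a byte array, and a single 64-bit word stands in for the
mask and bits bitmaps. Only the ports[] offsets are taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets of the six 8-bit ports, as in the ports[] table */
static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };

/* Hypothetical stand-in for inb(): reads a simulated register file */
static uint8_t fake_inb(const uint8_t *regs, size_t addr)
{
	return regs[addr];
}

/*
 * Sketch of the get_multiple pattern: visit only the 8-bit groups of
 * mask that have a set bit, read the matching port, and store the
 * masked state at the same bit offset in the result.
 */
static uint64_t get_multiple_sketch(const uint8_t *regs, uint64_t mask)
{
	const unsigned int ngpio = sizeof(ports) / sizeof(ports[0]) * 8;
	uint64_t bits = 0;
	unsigned int offset;

	assert(regs != NULL);

	for (offset = 0; offset < ngpio; offset += 8) {
		const uint64_t gpio_mask = (mask >> offset) & 0xFF;

		/* port not requested, so no register read is issued */
		if (!gpio_mask)
			continue;

		bits |= (fake_inb(regs, ports[offset / 8]) & gpio_mask)
			<< offset;
	}

	return bits;
}
```

Note how offset / 8 indexes ports[] so that the gap at register offset
3 is skipped without any special-casing in the loop body.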

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-104-dio-48e.c | 73 ++++++++++-----------------------
1 file changed, 22 insertions(+), 51 deletions(-)

diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c
index 92c8f944bf64..23413d90e944 100644
--- a/drivers/gpio/gpio-104-dio-48e.c
+++ b/drivers/gpio/gpio-104-dio-48e.c
@@ -183,46 +183,26 @@ static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset)
return !!(port_state & mask);
}

+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
static int dio48e_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
{
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
- size_t i;
- static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ unsigned int offset;
+ unsigned long gpio_mask;
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ unsigned int port_addr;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
- port_state = inb(dio48egpio->base + ports[i]);
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ port_addr = dio48egpio->base + ports[offset / 8];
+ port_state = inb(port_addr) & gpio_mask;

- /* store acquired bits at respective bits array offset */
- bits[word_index] |= (port_state << word_offset) & word_mask;
+ bitmap_set_value8(bits, ngpio, port_state, offset);
}

return 0;
@@ -252,37 +232,28 @@ static void dio48e_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
- unsigned int i;
- const unsigned int gpio_reg_size = 8;
- unsigned int port;
- unsigned int out_port;
+ unsigned int offset;
+ unsigned long gpio_mask;
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ size_t index;
+ unsigned int port_addr;
unsigned int bitmask;
unsigned long flags;

- /* set bits are evaluated a gpio register size at a time */
- for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
- /* no more set bits in this mask word; skip to the next word */
- if (!mask[BIT_WORD(i)]) {
- i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
- continue;
- }
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ index = offset / 8;
+ port_addr = dio48egpio->base + ports[index];

- port = i / gpio_reg_size;
- out_port = (port > 2) ? port + 1 : port;
- bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+ bitmask = bitmap_get_value8(bits, ngpio, offset) & gpio_mask;

raw_spin_lock_irqsave(&dio48egpio->lock, flags);

/* update output state data and set device gpio register */
- dio48egpio->out_state[port] &= ~mask[BIT_WORD(i)];
- dio48egpio->out_state[port] |= bitmask;
- outb(dio48egpio->out_state[port], dio48egpio->base + out_port);
+ dio48egpio->out_state[index] &= ~gpio_mask;
+ dio48egpio->out_state[index] |= bitmask;
+ outb(dio48egpio->out_state[index], port_addr);

raw_spin_unlock_irqrestore(&dio48egpio->lock, flags);
-
- /* prepare for next gpio register set */
- mask[BIT_WORD(i)] >>= gpio_reg_size;
- bits[BIT_WORD(i)] >>= gpio_reg_size;
}
}

--
2.21.0


2019-03-03 07:49:47

by William Breathitt Gray

Subject: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

This macro iterates over each 8-bit group of bits (clump) that has at
least one set bit within a bitmap memory region. For each iteration,
"start" is set to the bit offset of the found clump, while the
respective clump value is stored to the location pointed to by "clump".
Additionally, the bitmap_get_value8 and bitmap_set_value8 functions are
introduced to respectively get and set an 8-bit value in a bitmap
memory region.

Suggested-by: Andy Shevchenko <[email protected]>
Suggested-by: Rasmus Villemoes <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Andrew Morton <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
include/asm-generic/bitops/find.h | 14 ++++++
include/linux/bitops.h | 5 ++
lib/find_bit.c | 81 +++++++++++++++++++++++++++++++
3 files changed, 100 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 8a1ee10014de..9a76adff59c6 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -80,4 +80,18 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,

#endif /* CONFIG_GENERIC_FIND_FIRST_BIT */

+unsigned long bitmap_get_value8(const unsigned long *const bitmap,
+ const unsigned int size,
+ const unsigned int start);
+
+void bitmap_set_value8(unsigned long *const bitmap, const unsigned int size,
+ const unsigned long value, const unsigned int start);
+
+unsigned int find_next_clump8(unsigned long *const clump,
+ const unsigned long *const addr,
+ unsigned int offset, const unsigned int size);
+
+#define find_first_clump8(clump, bits, size) \
+ find_next_clump8((clump), (bits), 0, (size))
+
#endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 705f7c442691..61c10f20079e 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -40,6 +40,11 @@ extern unsigned long __sw_hweight64(__u64 w);
(bit) < (size); \
(bit) = find_next_zero_bit((addr), (size), (bit) + 1))

+#define for_each_set_clump8(start, clump, bits, size) \
+ for ((start) = find_first_clump8(&(clump), (bits), (size)); \
+ (start) < (size); \
+ (start) = find_next_clump8(&(clump), (bits), (start) + 8, (size)))
+
static inline int get_bitmask_order(unsigned int count)
{
int order;
diff --git a/lib/find_bit.c b/lib/find_bit.c
index ee3df93ba69a..c2af1f013ea2 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -218,3 +218,84 @@ EXPORT_SYMBOL(find_next_bit_le);
#endif

#endif /* __BIG_ENDIAN */
+
+/**
+ * bitmap_get_value8 - get an 8-bit value within a memory region
+ * @bitmap: address to the bitmap memory region
+ * @size: bitmap size in number of bits
+ * @start: bit offset of the 8-bit value
+ *
+ * Returns the 8-bit value located at the @start bit offset within the @bitmap
+ * memory region.
+ */
+unsigned long bitmap_get_value8(const unsigned long *const bitmap,
+ const unsigned int size,
+ const unsigned int start)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned int offset = start % BITS_PER_LONG;
+ const unsigned int low_width = (offset + 8 > BITS_PER_LONG) ?
+ BITS_PER_LONG - offset : 8;
+ const unsigned long low = bitmap[index] >> offset;
+ const unsigned long high = (low_width < 8 && start + 8 <= size) ?
+ bitmap[index + 1] << low_width : 0;
+
+ return (low | high) & 0xFF;
+}
+EXPORT_SYMBOL(bitmap_get_value8);
+
+/**
+ * bitmap_set_value8 - set an 8-bit value within a memory region
+ * @bitmap: address to the bitmap memory region
+ * @size: bitmap size in number of bits
+ * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
+ * @start: bit offset of the 8-bit value
+ */
+void bitmap_set_value8(unsigned long *const bitmap, const unsigned int size,
+ const unsigned long value, const unsigned int start)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned int offset = start % BITS_PER_LONG;
+ const unsigned int low_width = (offset + 8 > BITS_PER_LONG) ?
+ BITS_PER_LONG - offset : 8;
+ const unsigned long low_mask = GENMASK(offset + low_width - 1, offset);
+ const unsigned int high_width = 8 - low_width;
+ const unsigned long high_mask = GENMASK(high_width - 1, 0);
+
+ /* set lower portion */
+ bitmap[index] &= ~low_mask;
+ bitmap[index] |= value << offset;
+
+ /* set higher portion if space available in bitmap */
+ if (high_width && start + 8 <= size) {
+ bitmap[index + 1] &= ~high_mask;
+ bitmap[index + 1] |= value >> low_width;
+ }
+}
+EXPORT_SYMBOL(bitmap_set_value8);
+
+/**
+ * find_next_clump8 - find next 8-bit clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @offset: bit offset at which to start searching
+ * @size: bitmap size in number of bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed to by @clump. If no bits are set, returns @size.
+ */
+unsigned int find_next_clump8(unsigned long *const clump,
+ const unsigned long *const addr,
+ unsigned int offset, const unsigned int size)
+{
+ for (; offset < size; offset += 8) {
+ *clump = bitmap_get_value8(addr, size, offset);
+ if (!*clump)
+ continue;
+
+ return offset;
+ }
+
+ return size;
+}
+EXPORT_SYMBOL(find_next_clump8);
--
2.21.0


2019-03-03 07:50:56

by William Breathitt Gray

Subject: [PATCH v9 7/9] gpio: pci-idio-16: Utilize for_each_set_clump8 macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-pci-idio-16.c | 75 ++++++++++++---------------------
1 file changed, 28 insertions(+), 47 deletions(-)

diff --git a/drivers/gpio/gpio-pci-idio-16.c b/drivers/gpio/gpio-pci-idio-16.c
index 6b7349783223..b0ed6bb68296 100644
--- a/drivers/gpio/gpio-pci-idio-16.c
+++ b/drivers/gpio/gpio-pci-idio-16.c
@@ -108,45 +108,24 @@ static int idio_16_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
- size_t i;
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
- unsigned long port_state;
+ unsigned int offset;
+ unsigned long gpio_mask;
void __iomem *ports[] = {
&idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15,
&idio16gpio->reg->in0_7, &idio16gpio->reg->in8_15,
};
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ void __iomem *port_addr;
+ unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ port_addr = ports[offset / 8];
+ port_state = ioread8(port_addr) & gpio_mask;

- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
- port_state = ioread8(ports[i]);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= (port_state << word_offset) & word_mask;
+ bitmap_set_value8(bits, ngpio, port_state, offset);
}

return 0;
@@ -186,30 +165,32 @@ static void idio_16_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
+ unsigned int offset;
+ unsigned long gpio_mask;
+ void __iomem *ports[] = {
+ &idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15,
+ };
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ size_t index;
+ void __iomem *port_addr;
+ unsigned int bitmask;
unsigned long flags;
unsigned int out_state;

- raw_spin_lock_irqsave(&idio16gpio->lock, flags);
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ index = offset / 8;
+ port_addr = ports[index];

- /* process output lines 0-7 */
- if (*mask & 0xFF) {
- out_state = ioread8(&idio16gpio->reg->out0_7) & ~*mask;
- out_state |= *mask & *bits;
- iowrite8(out_state, &idio16gpio->reg->out0_7);
- }
+ bitmask = bitmap_get_value8(bits, ngpio, offset) & gpio_mask;
+
+ raw_spin_lock_irqsave(&idio16gpio->lock, flags);

- /* shift to next output line word */
- *mask >>= 8;
+ out_state = ioread8(port_addr) & ~gpio_mask;
+ out_state |= bitmask;
+ iowrite8(out_state, port_addr);

- /* process output lines 8-15 */
- if (*mask & 0xFF) {
- *bits >>= 8;
- out_state = ioread8(&idio16gpio->reg->out8_15) & ~*mask;
- out_state |= *mask & *bits;
- iowrite8(out_state, &idio16gpio->reg->out8_15);
+ raw_spin_unlock_irqrestore(&idio16gpio->lock, flags);
}
-
- raw_spin_unlock_irqrestore(&idio16gpio->lock, flags);
}

static void idio_16_irq_ack(struct irq_data *data)
--
2.21.0


2019-03-03 07:51:02

by William Breathitt Gray

Subject: [PATCH v9 4/9] gpio: 104-idi-48: Utilize for_each_set_clump8 macro

Replace verbose implementation in get_multiple callback with
for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-104-idi-48.c | 37 ++++++++--------------------------
1 file changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/gpio/gpio-104-idi-48.c b/drivers/gpio/gpio-104-idi-48.c
index 88dc6f2449f6..59c571aecf9a 100644
--- a/drivers/gpio/gpio-104-idi-48.c
+++ b/drivers/gpio/gpio-104-idi-48.c
@@ -93,42 +93,21 @@ static int idi_48_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
{
struct idi_48_gpio *const idi48gpio = gpiochip_get_data(chip);
- size_t i;
+ unsigned int offset;
+ unsigned long gpio_mask;
static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ unsigned int port_addr;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ port_addr = idi48gpio->base + ports[offset / 8];
+ port_state = inb(port_addr) & gpio_mask;

- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
- port_state = inb(idi48gpio->base + ports[i]);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= (port_state << word_offset) & word_mask;
+ bitmap_set_value8(bits, ngpio, port_state, offset);
}

return 0;
--
2.21.0


2019-03-03 07:51:22

by William Breathitt Gray

Subject: [PATCH v9 5/9] gpio: gpio-mm: Utilize for_each_set_clump8 macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-gpio-mm.c | 73 +++++++++++--------------------------
1 file changed, 22 insertions(+), 51 deletions(-)

diff --git a/drivers/gpio/gpio-gpio-mm.c b/drivers/gpio/gpio-gpio-mm.c
index 8c150fd68d9d..4c1037a005ab 100644
--- a/drivers/gpio/gpio-gpio-mm.c
+++ b/drivers/gpio/gpio-gpio-mm.c
@@ -172,46 +172,26 @@ static int gpiomm_gpio_get(struct gpio_chip *chip, unsigned int offset)
return !!(port_state & mask);
}

+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
static int gpiomm_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
{
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
- size_t i;
- static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ unsigned int offset;
+ unsigned long gpio_mask;
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ unsigned int port_addr;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
- port_state = inb(gpiommgpio->base + ports[i]);
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ port_addr = gpiommgpio->base + ports[offset / 8];
+ port_state = inb(port_addr) & gpio_mask;

- /* store acquired bits at respective bits array offset */
- bits[word_index] |= (port_state << word_offset) & word_mask;
+ bitmap_set_value8(bits, ngpio, port_state, offset);
}

return 0;
@@ -242,37 +222,28 @@ static void gpiomm_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
- unsigned int i;
- const unsigned int gpio_reg_size = 8;
- unsigned int port;
- unsigned int out_port;
+ unsigned int offset;
+ unsigned long gpio_mask;
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ size_t index;
+ unsigned int port_addr;
unsigned int bitmask;
unsigned long flags;

- /* set bits are evaluated a gpio register size at a time */
- for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
- /* no more set bits in this mask word; skip to the next word */
- if (!mask[BIT_WORD(i)]) {
- i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
- continue;
- }
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ index = offset / 8;
+ port_addr = gpiommgpio->base + ports[index];

- port = i / gpio_reg_size;
- out_port = (port > 2) ? port + 1 : port;
- bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+ bitmask = bitmap_get_value8(bits, ngpio, offset) & gpio_mask;

spin_lock_irqsave(&gpiommgpio->lock, flags);

/* update output state data and set device gpio register */
- gpiommgpio->out_state[port] &= ~mask[BIT_WORD(i)];
- gpiommgpio->out_state[port] |= bitmask;
- outb(gpiommgpio->out_state[port], gpiommgpio->base + out_port);
+ gpiommgpio->out_state[index] &= ~gpio_mask;
+ gpiommgpio->out_state[index] |= bitmask;
+ outb(gpiommgpio->out_state[index], port_addr);

spin_unlock_irqrestore(&gpiommgpio->lock, flags);
-
- /* prepare for next gpio register set */
- mask[BIT_WORD(i)] >>= gpio_reg_size;
- bits[BIT_WORD(i)] >>= gpio_reg_size;
}
}

--
2.21.0


2019-03-03 07:51:34

by William Breathitt Gray

Subject: [PATCH v9 9/9] gpio: uniphier: Utilize for_each_set_clump8 macro

Replace verbose implementation in set_multiple callback with
for_each_set_clump8 macro to simplify code and improve clarity. An
improvement in this case is that banks that are not masked will now be
skipped.

Cc: Masahiro Yamada <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-uniphier.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/gpio/gpio-uniphier.c b/drivers/gpio/gpio-uniphier.c
index 0f662b297a95..df640cb29b9c 100644
--- a/drivers/gpio/gpio-uniphier.c
+++ b/drivers/gpio/gpio-uniphier.c
@@ -15,9 +15,6 @@
#include <linux/spinlock.h>
#include <dt-bindings/gpio/uniphier-gpio.h>

-#define UNIPHIER_GPIO_BANK_MASK \
- GENMASK((UNIPHIER_GPIO_LINES_PER_BANK) - 1, 0)
-
#define UNIPHIER_GPIO_IRQ_MAX_NUM 24

#define UNIPHIER_GPIO_PORT_DATA 0x0 /* data */
@@ -147,15 +144,14 @@ static void uniphier_gpio_set(struct gpio_chip *chip,
static void uniphier_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
- unsigned int bank, shift, bank_mask, bank_bits;
- int i;
+ unsigned int i;
+ unsigned long bank_mask;
+ unsigned int bank;
+ unsigned int bank_bits;

- for (i = 0; i < chip->ngpio; i += UNIPHIER_GPIO_LINES_PER_BANK) {
+ for_each_set_clump8(i, bank_mask, mask, chip->ngpio) {
bank = i / UNIPHIER_GPIO_LINES_PER_BANK;
- shift = i % BITS_PER_LONG;
- bank_mask = (mask[BIT_WORD(i)] >> shift) &
- UNIPHIER_GPIO_BANK_MASK;
- bank_bits = bits[BIT_WORD(i)] >> shift;
+ bank_bits = bitmap_get_value8(bits, chip->ngpio, i);

uniphier_gpio_bank_write(chip, bank, UNIPHIER_GPIO_PORT_DATA,
bank_mask, bank_bits);
--
2.21.0


2019-03-03 07:51:43

by William Breathitt Gray

Subject: [PATCH v9 6/9] gpio: ws16c48: Utilize for_each_set_clump8 macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-ws16c48.c | 72 +++++++++++--------------------------
1 file changed, 20 insertions(+), 52 deletions(-)

diff --git a/drivers/gpio/gpio-ws16c48.c b/drivers/gpio/gpio-ws16c48.c
index 5cf3697bfb15..1d071a3d3e81 100644
--- a/drivers/gpio/gpio-ws16c48.c
+++ b/drivers/gpio/gpio-ws16c48.c
@@ -134,42 +134,19 @@ static int ws16c48_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
- const unsigned int gpio_reg_size = 8;
- size_t i;
- const size_t num_ports = chip->ngpio / gpio_reg_size;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ unsigned int offset;
+ unsigned long gpio_mask;
+ unsigned int port_addr;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < num_ports; i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
+ for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) {
+ port_addr = ws16c48gpio->base + offset / 8;
+ port_state = inb(port_addr) & gpio_mask;

- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
- port_state = inb(ws16c48gpio->base + i);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= (port_state << word_offset) & word_mask;
+ bitmap_set_value8(bits, chip->ngpio, port_state, offset);
}

return 0;
@@ -203,39 +180,30 @@ static void ws16c48_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
- unsigned int i;
- const unsigned int gpio_reg_size = 8;
- unsigned int port;
- unsigned int iomask;
+ unsigned int offset;
+ unsigned long gpio_mask;
+ size_t index;
+ unsigned int port_addr;
unsigned int bitmask;
unsigned long flags;

- /* set bits are evaluated a gpio register size at a time */
- for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
- /* no more set bits in this mask word; skip to the next word */
- if (!mask[BIT_WORD(i)]) {
- i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
- continue;
- }
-
- port = i / gpio_reg_size;
+ for_each_set_clump8(offset, gpio_mask, mask, chip->ngpio) {
+ index = offset / 8;
+ port_addr = ws16c48gpio->base + index;

/* mask out GPIO configured for input */
- iomask = mask[BIT_WORD(i)] & ~ws16c48gpio->io_state[port];
- bitmask = iomask & bits[BIT_WORD(i)];
+ gpio_mask &= ~ws16c48gpio->io_state[index];
+ bitmask = bitmap_get_value8(bits, chip->ngpio, offset) &
+ gpio_mask;

raw_spin_lock_irqsave(&ws16c48gpio->lock, flags);

/* update output state data and set device gpio register */
- ws16c48gpio->out_state[port] &= ~iomask;
- ws16c48gpio->out_state[port] |= bitmask;
- outb(ws16c48gpio->out_state[port], ws16c48gpio->base + port);
+ ws16c48gpio->out_state[index] &= ~gpio_mask;
+ ws16c48gpio->out_state[index] |= bitmask;
+ outb(ws16c48gpio->out_state[index], port_addr);

raw_spin_unlock_irqrestore(&ws16c48gpio->lock, flags);
-
- /* prepare for next gpio register set */
- mask[BIT_WORD(i)] >>= gpio_reg_size;
- bits[BIT_WORD(i)] >>= gpio_reg_size;
}
}

--
2.21.0


2019-03-03 07:51:45

by William Breathitt Gray

Subject: [PATCH v9 8/9] gpio: pcie-idio-24: Utilize for_each_set_clump8 macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump8 macro to simplify code and improve clarity.

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-pcie-idio-24.c | 111 ++++++++++++-------------------
1 file changed, 42 insertions(+), 69 deletions(-)

diff --git a/drivers/gpio/gpio-pcie-idio-24.c b/drivers/gpio/gpio-pcie-idio-24.c
index 52f1647a46fd..2ceff1f5d8fd 100644
--- a/drivers/gpio/gpio-pcie-idio-24.c
+++ b/drivers/gpio/gpio-pcie-idio-24.c
@@ -198,52 +198,35 @@ static int idio_24_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
- size_t i;
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
- unsigned long port_state;
+ unsigned int offset;
+ unsigned long gpio_mask;
void __iomem *ports[] = {
&idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15,
&idio24gpio->reg->out16_23, &idio24gpio->reg->in0_7,
&idio24gpio->reg->in8_15, &idio24gpio->reg->in16_23,
};
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ size_t index;
+ unsigned long port_state;
const unsigned long out_mode_mask = BIT(1);

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports) + 1; i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ index = offset / 8;

/* read bits from current gpio port (port 6 is TTL GPIO) */
- if (i < 6)
- port_state = ioread8(ports[i]);
+ if (index < 6)
+ port_state = ioread8(ports[index]);
else if (ioread8(&idio24gpio->reg->ctl) & out_mode_mask)
port_state = ioread8(&idio24gpio->reg->ttl_out0_7);
else
port_state = ioread8(&idio24gpio->reg->ttl_in0_7);

- /* store acquired bits at respective bits array offset */
- bits[word_index] |= (port_state << word_offset) & word_mask;
+ port_state &= gpio_mask;
+
+ bitmap_set_value8(bits, ngpio, port_state, offset);
}

return 0;
@@ -294,59 +277,49 @@ static void idio_24_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
- size_t i;
- unsigned long bits_offset;
+ unsigned int offset;
unsigned long gpio_mask;
- const unsigned int gpio_reg_size = 8;
- const unsigned long port_mask = GENMASK(gpio_reg_size, 0);
- unsigned long flags;
- unsigned int out_state;
void __iomem *ports[] = {
&idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15,
&idio24gpio->reg->out16_23
};
+ const unsigned int ngpio = ARRAY_SIZE(ports) * 8;
+ size_t index;
+ unsigned int bitmask;
+ unsigned long flags;
+ unsigned int out_state;
const unsigned long out_mode_mask = BIT(1);
- const unsigned int ttl_offset = 48;
- const size_t ttl_i = BIT_WORD(ttl_offset);
- const unsigned int word_offset = ttl_offset % BITS_PER_LONG;
- const unsigned long ttl_mask = (mask[ttl_i] >> word_offset) & port_mask;
- const unsigned long ttl_bits = (bits[ttl_i] >> word_offset) & ttl_mask;
-
- /* set bits are processed a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* check if any set bits for current port */
- gpio_mask = (*mask >> bits_offset) & port_mask;
- if (!gpio_mask) {
- /* no set bits for this port so move on to next port */
- continue;
- }

- raw_spin_lock_irqsave(&idio24gpio->lock, flags);
+ for_each_set_clump8(offset, gpio_mask, mask, ngpio) {
+ index = offset / 8;

- /* process output lines */
- out_state = ioread8(ports[i]) & ~gpio_mask;
- out_state |= (*bits >> bits_offset) & gpio_mask;
- iowrite8(out_state, ports[i]);
+ bitmask = bitmap_get_value8(bits, ngpio, offset) & gpio_mask;

- raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
- }
+ raw_spin_lock_irqsave(&idio24gpio->lock, flags);

- /* check if setting TTL lines and if they are in output mode */
- if (!ttl_mask || !(ioread8(&idio24gpio->reg->ctl) & out_mode_mask))
- return;
+ /* read bits from current gpio port (port 6 is TTL GPIO) */
+ if (index < 6) {
+ out_state = ioread8(ports[index]);
+ } else if (ioread8(&idio24gpio->reg->ctl) & out_mode_mask) {
+ out_state = ioread8(&idio24gpio->reg->ttl_out0_7);
+ } else {
+ /* skip TTL GPIO if set for input */
+ raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
+ continue;
+ }

- /* handle TTL output */
- raw_spin_lock_irqsave(&idio24gpio->lock, flags);
+ /* set requested bit states */
+ out_state &= ~gpio_mask;
+ out_state |= bitmask;

- /* process output lines */
- out_state = ioread8(&idio24gpio->reg->ttl_out0_7) & ~ttl_mask;
- out_state |= ttl_bits;
- iowrite8(out_state, &idio24gpio->reg->ttl_out0_7);
+ /* write bits for current gpio port (port 6 is TTL GPIO) */
+ if (index < 6)
+ iowrite8(out_state, ports[index]);
+ else
+ iowrite8(out_state, &idio24gpio->reg->ttl_out0_7);

- raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
+ raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
+ }
}

static void idio_24_irq_ack(struct irq_data *data)
--
2.21.0
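
The get_multiple conversions in this series share one shape: walk the mask a byte at a time, skip ports with no requested lines, read the port, keep only the requested bits, and store them at the matching offset. The following is a minimal userspace sketch of that shape, not the driver code. The sk_ names are hypothetical, the ports[] array stands in for inb()/ioread8(), and a plain for loop replaces for_each_set_clump8 so the block stays self-contained. It assumes ngpio is a multiple of 8 and that the caller has zeroed bits, as the drivers do with bitmap_zero().

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SK_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Read the 8-bit group at a byte-aligned bit offset. */
static unsigned long sk_get8(const unsigned long *bits, unsigned int start)
{
	assert(start % 8 == 0);
	return (bits[start / SK_BITS_PER_LONG] >> (start % SK_BITS_PER_LONG)) & 0xFF;
}

/* Store an 8-bit value at a byte-aligned bit offset. */
static void sk_set8(unsigned long *bits, unsigned long value, unsigned int start)
{
	unsigned int shift = start % SK_BITS_PER_LONG;
	unsigned long *word = &bits[start / SK_BITS_PER_LONG];

	assert(start % 8 == 0 && value <= 0xFF);
	*word &= ~(0xFFUL << shift);
	*word |= value << shift;
}

/* Hypothetical get_multiple: ports[i] plays the role of port register i. */
void sk_get_multiple(const uint8_t *ports, unsigned int ngpio,
		     const unsigned long *mask, unsigned long *bits)
{
	unsigned int offset;

	for (offset = 0; offset < ngpio; offset += 8) {
		unsigned long gpio_mask = sk_get8(mask, offset);

		if (!gpio_mask)
			continue;	/* no requested lines: skip the port read */

		sk_set8(bits, ports[offset / 8] & gpio_mask, offset);
	}
}
```

With six simulated ports and a mask that selects lines only on ports 0 and 2, ports 1 and 3 to 5 are never read, which is the saving the conversions are after.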


2019-03-08 08:32:03

by Linus Walleij

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
<[email protected]> wrote:

> This macro iterates for each 8-bit group of bits (clump) with set bits,
> within a bitmap memory region. For each iteration, "start" is set to the
> bit offset of the found clump, while the respective clump value is
> stored to the location pointed by "clump". Additionally, the
> bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> respectively get and set an 8-bit value in a bitmap memory region.
>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Rasmus Villemoes <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Reviewed-by: Andy Shevchenko <[email protected]>
> Reviewed-by: Linus Walleij <[email protected]>
> Signed-off-by: William Breathitt Gray <[email protected]>

Andrew: would you be OK with this being merged in v5.1?

If we need to move the code to drivers/gpio that's OK (though
I think it's generally useful) but I need to know to proceed with
William's nice optimization of these drivers.

Yours,
Linus Walleij

2019-03-08 08:57:29

by William Breathitt Gray

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Fri, Mar 08, 2019 at 09:31:00AM +0100, Linus Walleij wrote:
> On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
> <[email protected]> wrote:
>
> > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to the
> > bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > respectively get and set an 8-bit value in a bitmap memory region.
> >
> > Suggested-by: Andy Shevchenko <[email protected]>
> > Suggested-by: Rasmus Villemoes <[email protected]>
> > Cc: Arnd Bergmann <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Reviewed-by: Andy Shevchenko <[email protected]>
> > Reviewed-by: Linus Walleij <[email protected]>
> > Signed-off-by: William Breathitt Gray <[email protected]>
>
> Andrew: would you be OK with this being merged in v5.1?
>
> If we need to move the code to drivers/gpio that's OK (though
> I think it's generally useful) but I need to know to proceed with
> William's nice optimization of these drivers.
>
> Yours,
> Linus Walleij

I was waiting on Andy to suggest some examples outside the GPIO realm,
but he may be under a heavy workload right now, so I decided to do a
quick search for one.

In drivers/of/unittest.c, there is a loop over a bitmap in the
of_unittest_destroy_tracked_overlays function:

for (id = MAX_UNITTEST_OVERLAYS - 1; id >= 0; id--) {
if (!(overlay_id_bits[BIT_WORD(id)] & BIT_MASK(id)))
continue;

This section of code is checking each bit individually, and skipping if
that bit is not set. This looping can be optimized by using the
for_each_set_clump8 macro to skip clumps of nonset bits (not to mention
make the logic of the code much simpler and easier to follow by reducing
the code to a single line):

for_each_set_clump8(id, clump, overlay_id_bits, MAX_UNITTEST_OVERLAYS-1)

The for_each_set_clump8 macro is not specific to the GPIO subsystem; I
just happen to use it in these GPIO drivers because I am most familiar
with this section of the kernel (it's where most of my contributions
occur, after all).

Consider this: if I am able to find a use for this macro outside of the
GPIO subsystem within a matter of minutes, then there must be some benefit
in allowing the rest of the kernel to use the for_each_set_clump8 macro.
So let's put it in bitops.h rather than restrict it to just the GPIO
subsystem.

William Breathitt Gray

2019-03-08 09:20:04

by Andy Shevchenko

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Fri, Mar 8, 2019 at 10:56 AM William Breathitt Gray
<[email protected]> wrote:
> On Fri, Mar 08, 2019 at 09:31:00AM +0100, Linus Walleij wrote:
> > On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
> > <[email protected]> wrote:
> >
> > > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > > within a bitmap memory region. For each iteration, "start" is set to the
> > > bit offset of the found clump, while the respective clump value is
> > > stored to the location pointed by "clump". Additionally, the
> > > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > > respectively get and set an 8-bit value in a bitmap memory region.

> > Andrew: would you be OK with this being merged in v5.1?
> >
> > If we need to move the code to drivers/gpio that's OK (though
> > I think it's generally useful) but I need to know to proceed with
> > William's nice optimization of these drivers.
> >
> > Yours,
> > Linus Walleij
>
> I was waiting on Andy to suggest some examples out of the GPIO realm,
> but he may be under a heavy workload right

Yeah, sorry for that. I will certainly use your helpers in the future
in the suitable parts of the code inside and outside of GPIO; it is
just not the highest priority for me right now.

> so I decided to do a quick
> Consider this: if I am able to find a use for this macro outside of the
> GPIO subsystem within a matter of minutes, then there must be some benefit
> in allowing the rest of the kernel to use the for_each_set_clump8 macro.
> So let's put it in bitops.h rather than restrict it to just the GPIO
> subsystem.

As I mentioned earlier, I'm pretty sure I have also found an opportunity
to use this new API outside of the GPIO realm. I just want to be sure
(by means of testing on real HW).

--
With Best Regards,
Andy Shevchenko

2019-03-11 07:58:09

by Chen, Rong A

Subject: [LKP] [lib/test_bitmap.c] ecdc93614a: kernel_selftests.lib.bitmap.sh.fail

FYI, we noticed the following commit (built with gcc-8):

commit: ecdc93614ac2e83d11b08e8b603ebd14e90c39c2 ("[PATCH v9 2/9] lib/test_bitmap.c: Add for_each_set_clump8 test cases")
url: https://github.com/0day-ci/linux/commits/William-Breathitt-Gray/Introduce-the-for_each_set_clump8-macro/20190305-073809
base: https://git.kernel.org/cgit/linux/kernel/git/linusw/linux-gpio.git for-next

in testcase: kernel_selftests
with following parameters:

group: kselftests-01

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):




KERNEL SELFTESTS: linux_headers_dir is /usr/src/linux-headers-x86_64-rhel-7.6-ecdc93614ac2e83d11b08e8b603ebd14e90c39c2
2019-03-08 11:24:52 ln -sf /usr/bin/clang-7 /usr/bin/clang
2019-03-08 11:24:52 ln -sf /usr/bin/llc-7 /usr/bin/llc

2019-03-08 11:28:25 make run_tests -C lib
make: Entering directory '/usr/src/perf_selftests-x86_64-rhel-7.6-ecdc93614ac2e83d11b08e8b603ebd14e90c39c2/tools/testing/selftests/lib'
TAP version 13
selftests: lib: printf.sh
========================================
printf: ok
ok 1..1 selftests: lib: printf.sh [PASS]
selftests: lib: bitmap.sh
========================================
bitmap: [FAIL]
not ok 1..2 selftests: lib: bitmap.sh [FAIL]
selftests: lib: prime_numbers.sh
========================================
prime_numbers: ok
ok 1..3 selftests: lib: prime_numbers.sh [PASS]
make: Leaving directory '/usr/src/perf_selftests-x86_64-rhel-7.6-ecdc93614ac2e83d11b08e8b603ebd14e90c39c2/tools/testing/selftests/lib'
locking test: not in Makefile



To reproduce:

# build kernel
cd linux
cp config-5.0.0-rc6-00113-gecdc936 .config
make HOSTCC=gcc-8 CC=gcc-8 ARCH=x86_64 olddefconfig
make HOSTCC=gcc-8 CC=gcc-8 ARCH=x86_64 prepare
make HOSTCC=gcc-8 CC=gcc-8 ARCH=x86_64 modules_prepare
make HOSTCC=gcc-8 CC=gcc-8 ARCH=x86_64 SHELL=/bin/bash
make HOSTCC=gcc-8 CC=gcc-8 ARCH=x86_64 bzImage


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Rong Chen


Attachments:
(No filename) (2.43 kB)
config-5.0.0-rc6-00113-gecdc936 (195.07 kB)
job-script (6.37 kB)
dmesg.xz (38.05 kB)
kernel_selftests (64.48 kB)

2019-03-12 01:01:51

by Andrew Morton

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Fri, 8 Mar 2019 09:31:00 +0100 Linus Walleij <[email protected]> wrote:

> On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
> <[email protected]> wrote:
>
> > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to the
> > bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > respectively get and set an 8-bit value in a bitmap memory region.
> >
> > Suggested-by: Andy Shevchenko <[email protected]>
> > Suggested-by: Rasmus Villemoes <[email protected]>
> > Cc: Arnd Bergmann <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Reviewed-by: Andy Shevchenko <[email protected]>
> > Reviewed-by: Linus Walleij <[email protected]>
> > Signed-off-by: William Breathitt Gray <[email protected]>
>
> Andrew: would you be OK with this being merged in v5.1?

Yup. We have quite a few users there. I assume this will go via the
gpio tree?

Feel free to add Acked-by: Andrew Morton <[email protected]>,
although it probably isn't worth churning the git tree to do so at this
late stage - your call.


2019-03-12 03:55:41

by Masahiro Yamada

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Sun, Mar 3, 2019 at 4:48 PM William Breathitt Gray
<[email protected]> wrote:
>
> This macro iterates for each 8-bit group of bits (clump) with set bits,
> within a bitmap memory region. For each iteration, "start" is set to the
> bit offset of the found clump, while the respective clump value is
> stored to the location pointed by "clump". Additionally, the
> bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> respectively get and set an 8-bit value in a bitmap memory region.
>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Rasmus Villemoes <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Reviewed-by: Andy Shevchenko <[email protected]>
> Reviewed-by: Linus Walleij <[email protected]>
> Signed-off-by: William Breathitt Gray <[email protected]>
> ---
> include/asm-generic/bitops/find.h | 14 ++++++
> include/linux/bitops.h | 5 ++
> lib/find_bit.c | 81 +++++++++++++++++++++++++++++++
> 3 files changed, 100 insertions(+)
>
> diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
> index 8a1ee10014de..9a76adff59c6 100644
> --- a/include/asm-generic/bitops/find.h
> +++ b/include/asm-generic/bitops/find.h
> @@ -80,4 +80,18 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
>
> #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
>
> +unsigned long bitmap_get_value8(const unsigned long *const bitmap,
> + const unsigned int size,
> + const unsigned int start);
> +
> +void bitmap_set_value8(unsigned long *const bitmap, const unsigned int size,
> + const unsigned long value, const unsigned int start);
> +
> +unsigned int find_next_clump8(unsigned long *const clump,
> + const unsigned long *const addr,
> + unsigned int offset, const unsigned int size);
> +
> +#define find_first_clump8(clump, bits, size) \
> + find_next_clump8((clump), (bits), 0, (size))
> +
> #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 705f7c442691..61c10f20079e 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -40,6 +40,11 @@ extern unsigned long __sw_hweight64(__u64 w);
> (bit) < (size); \
> (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
>
> +#define for_each_set_clump8(start, clump, bits, size) \
> + for ((start) = find_first_clump8(&(clump), (bits), (size)); \
> + (start) < (size); \
> + (start) = find_next_clump8(&(clump), (bits), (start) + 8, (size)))
> +
> static inline int get_bitmask_order(unsigned int count)
> {
> int order;
> diff --git a/lib/find_bit.c b/lib/find_bit.c
> index ee3df93ba69a..c2af1f013ea2 100644
> --- a/lib/find_bit.c
> +++ b/lib/find_bit.c
> @@ -218,3 +218,84 @@ EXPORT_SYMBOL(find_next_bit_le);
> #endif
>
> #endif /* __BIG_ENDIAN */
> +
> +/**
> + * bitmap_get_value8 - get an 8-bit value within a memory region
> + * @bitmap: address to the bitmap memory region
> + * @size: bitmap size in number of bits
> + * @start: bit offset of the 8-bit value
> + *
> + * Returns the 8-bit value located at the @start bit offset within the @bitmap
> + * memory region.
> + */
> +unsigned long bitmap_get_value8(const unsigned long *const bitmap,
> + const unsigned int size,
> + const unsigned int start)


A bunch of 'const' qualifiers is an eyesore.

The first 'const' of bitmap is the only useful one.


unsigned long bitmap_get_value8(const unsigned long *bitmap, unsigned int size,
unsigned int start)

is enough.





> +{
> + const size_t index = BIT_WORD(start);
> + const unsigned int offset = start % BITS_PER_LONG;
> + const unsigned int low_width = (offset + 8 > BITS_PER_LONG) ?
> + BITS_PER_LONG - offset : 8;
> + const unsigned long low = bitmap[index] >> offset;
> + const unsigned long high = (low_width < 8 && start + 8 <= size) ?
> + bitmap[index + 1] << low_width : 0;


Meh.



> +
> + return (low | high) & 0xFF;
> +}
> +EXPORT_SYMBOL(bitmap_get_value8);
> +
> +/**
> + * bitmap_set_value8 - set an 8-bit value within a memory region
> + * @bitmap: address to the bitmap memory region
> + * @size: bitmap size in number of bits
> + * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
> + * @start: bit offset of the 8-bit value
> + */
> +void bitmap_set_value8(unsigned long *const bitmap, const unsigned int size,
> + const unsigned long value, const unsigned int start)
> +{
> + const size_t index = BIT_WORD(start);
> + const unsigned int offset = start % BITS_PER_LONG;
> + const unsigned int low_width = (offset + 8 > BITS_PER_LONG) ?
> + BITS_PER_LONG - offset : 8;
> + const unsigned long low_mask = GENMASK(offset + low_width - 1, offset);
> + const unsigned int high_width = 8 - low_width;
> + const unsigned long high_mask = GENMASK(high_width - 1, 0);
> +
> + /* set lower portion */
> + bitmap[index] &= ~low_mask;
> + bitmap[index] |= value << offset;
> +
> + /* set higher portion if space available in bitmap */
> + if (high_width && start + 8 <= size) {
> + bitmap[index + 1] &= ~high_mask;
> + bitmap[index + 1] |= value >> low_width;
> + }
> +}
> +EXPORT_SYMBOL(bitmap_set_value8);
> +
> +/**
> + * find_next_clump8 - find next 8-bit clump with set bits in a memory region
> + * @clump: location to store copy of found clump
> + * @addr: address to base the search on
> + * @offset: bit offset at which to start searching
> + * @size: bitmap size in number of bits
> + *
> + * Returns the bit offset for the next set clump; the found clump value is
> + * copied to the location pointed by @clump. If no bits are set, returns @size.
> + */
> +unsigned int find_next_clump8(unsigned long *const clump,
> + const unsigned long *const addr,
> + unsigned int offset, const unsigned int size)
> +{
> + for (; offset < size; offset += 8) {
> + *clump = bitmap_get_value8(addr, size, offset);
> + if (!*clump)
> + continue;
> +
> + return offset;
> + }
> +
> + return size;
> +}
> +EXPORT_SYMBOL(find_next_clump8);
> --
> 2.21.0
>


--
Best Regards
Masahiro Yamada
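
The remark about the 'const' qualifiers rests on a C rule: top-level qualifiers on by-value parameters are not part of the function type, so a prototype without them is compatible with a definition that keeps them. The following is a small userspace sketch of that rule, assuming a hypothetical helper named sk_get_value8 that handles only byte-aligned offsets; it is not the code from the patch.

```c
#include <assert.h>
#include <limits.h>

#define SK_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Prototype: only the pointee is const, as suggested in the review. */
unsigned long sk_get_value8(const unsigned long *bitmap, unsigned int size,
			    unsigned int start);

/*
 * Definition: the extra top-level 'const' qualifiers only stop the
 * function body from modifying its own local copies. Callers see the
 * same function type as the prototype above, so both compile together.
 */
unsigned long sk_get_value8(const unsigned long *const bitmap,
			    const unsigned int size,
			    const unsigned int start)
{
	/* sketch restriction: byte-aligned offsets inside the bitmap */
	assert(start % 8 == 0 && start + 8 <= size);

	return (bitmap[start / SK_BITS_PER_LONG] >>
		(start % SK_BITS_PER_LONG)) & 0xFF;
}
```

Whether the definition keeps the extra qualifiers is therefore a style choice; it does not change what callers can pass.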

2019-03-12 04:38:44

by Masahiro Yamada

Subject: Re: [PATCH v9 9/9] gpio: uniphier: Utilize for_each_set_clump8 macro

On Sun, Mar 3, 2019 at 4:51 PM William Breathitt Gray
<[email protected]> wrote:
>
> Replace verbose implementation in set_multiple callback with
> for_each_set_clump8 macro to simplify code and improve clarity. An
> improvement in this case is that banks that are not masked will now be
> skipped.
>
> Cc: Masahiro Yamada <[email protected]>
> Signed-off-by: William Breathitt Gray <[email protected]>
> ---
> drivers/gpio/gpio-uniphier.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpio/gpio-uniphier.c b/drivers/gpio/gpio-uniphier.c
> index 0f662b297a95..df640cb29b9c 100644
> --- a/drivers/gpio/gpio-uniphier.c
> +++ b/drivers/gpio/gpio-uniphier.c
> @@ -15,9 +15,6 @@
> #include <linux/spinlock.h>
> #include <dt-bindings/gpio/uniphier-gpio.h>
>
> -#define UNIPHIER_GPIO_BANK_MASK \
> - GENMASK((UNIPHIER_GPIO_LINES_PER_BANK) - 1, 0)
> -
> #define UNIPHIER_GPIO_IRQ_MAX_NUM 24
>
> #define UNIPHIER_GPIO_PORT_DATA 0x0 /* data */
> @@ -147,15 +144,14 @@ static void uniphier_gpio_set(struct gpio_chip *chip,
> static void uniphier_gpio_set_multiple(struct gpio_chip *chip,
> unsigned long *mask, unsigned long *bits)
> {
> - unsigned int bank, shift, bank_mask, bank_bits;
> - int i;
> + unsigned int i;
> + unsigned long bank_mask;
> + unsigned int bank;
> + unsigned int bank_bits;
>
> - for (i = 0; i < chip->ngpio; i += UNIPHIER_GPIO_LINES_PER_BANK) {
> + for_each_set_clump8(i, bank_mask, mask, chip->ngpio) {
> bank = i / UNIPHIER_GPIO_LINES_PER_BANK;
> - shift = i % BITS_PER_LONG;
> - bank_mask = (mask[BIT_WORD(i)] >> shift) &
> - UNIPHIER_GPIO_BANK_MASK;
> - bank_bits = bits[BIT_WORD(i)] >> shift;
> + bank_bits = bitmap_get_value8(bits, chip->ngpio, i);
>
> uniphier_gpio_bank_write(chip, bank, UNIPHIER_GPIO_PORT_DATA,
> bank_mask, bank_bits);


Please do not do this.

Nothing in this driver says the GPIO width is 8-bit.

You are hard-coding '8-bit'.







> --
> 2.21.0
>


--
Best Regards
Masahiro Yamada
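
The objection here is that the 8-bit clump width becomes an unstated assumption about the bank width. One way such an assumption could be made explicit is a compile-time check; this is only an illustration and was not proposed in the thread. SK_LINES_PER_BANK is a hypothetical stand-in for UNIPHIER_GPIO_LINES_PER_BANK, and in kernel code the equivalent check would more likely be written with BUILD_BUG_ON().

```c
#include <assert.h>

/* Hypothetical stand-in for UNIPHIER_GPIO_LINES_PER_BANK. */
#define SK_LINES_PER_BANK 8

/* Break the build if the bank width ever stops matching the clump width. */
_Static_assert(SK_LINES_PER_BANK == 8,
	       "8-bit clump helpers assume 8 GPIO lines per bank");

/* Map a clump offset (always a multiple of 8) to its bank number. */
unsigned int sk_bank_from_offset(unsigned int offset)
{
	assert(offset % SK_LINES_PER_BANK == 0);
	return offset / SK_LINES_PER_BANK;
}
```

If the bank width were changed to 16, each bank would span two clumps and the offset-to-bank mapping would no longer line up with the clump values; the static assertion turns that mismatch into a build failure instead of silent misbehaviour.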

2019-03-12 05:05:22

by Masahiro Yamada

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Sun, Mar 3, 2019 at 4:48 PM William Breathitt Gray
<[email protected]> wrote:
>
> This macro iterates for each 8-bit group of bits (clump) with set bits,
> within a bitmap memory region. For each iteration, "start" is set to the
> bit offset of the found clump, while the respective clump value is
> stored to the location pointed by "clump". Additionally, the
> bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> respectively get and set an 8-bit value in a bitmap memory region.
>
> Suggested-by: Andy Shevchenko <[email protected]>
> Suggested-by: Rasmus Villemoes <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Reviewed-by: Andy Shevchenko <[email protected]>
> Reviewed-by: Linus Walleij <[email protected]>
> Signed-off-by: William Breathitt Gray <[email protected]>
> ---
> include/asm-generic/bitops/find.h | 14 ++++++
> include/linux/bitops.h | 5 ++
> lib/find_bit.c | 81 +++++++++++++++++++++++++++++++
> 3 files changed, 100 insertions(+)
>
> diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
> index 8a1ee10014de..9a76adff59c6 100644
> --- a/include/asm-generic/bitops/find.h
> +++ b/include/asm-generic/bitops/find.h
> @@ -80,4 +80,18 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
>
> #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
>
> +unsigned long bitmap_get_value8(const unsigned long *const bitmap,
> + const unsigned int size,
> + const unsigned int start);
> +
> +void bitmap_set_value8(unsigned long *const bitmap, const unsigned int size,
> + const unsigned long value, const unsigned int start);
> +
> +unsigned int find_next_clump8(unsigned long *const clump,
> + const unsigned long *const addr,
> + unsigned int offset, const unsigned int size);
> +
> +#define find_first_clump8(clump, bits, size) \
> + find_next_clump8((clump), (bits), 0, (size))
> +
> #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index 705f7c442691..61c10f20079e 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -40,6 +40,11 @@ extern unsigned long __sw_hweight64(__u64 w);
> (bit) < (size); \
> (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
>
> +#define for_each_set_clump8(start, clump, bits, size) \
> + for ((start) = find_first_clump8(&(clump), (bits), (size)); \
> + (start) < (size); \
> + (start) = find_next_clump8(&(clump), (bits), (start) + 8, (size)))
> +
> static inline int get_bitmask_order(unsigned int count)
> {
> int order;
> diff --git a/lib/find_bit.c b/lib/find_bit.c
> index ee3df93ba69a..c2af1f013ea2 100644
> --- a/lib/find_bit.c
> +++ b/lib/find_bit.c
> @@ -218,3 +218,84 @@ EXPORT_SYMBOL(find_next_bit_le);
> #endif
>
> #endif /* __BIG_ENDIAN */
> +
> +/**
> + * bitmap_get_value8 - get an 8-bit value within a memory region
> + * @bitmap: address to the bitmap memory region
> + * @size: bitmap size in number of bits
> + * @start: bit offset of the 8-bit value
> + *
> + * Returns the 8-bit value located at the @start bit offset within the @bitmap
> + * memory region.
> + */
> +unsigned long bitmap_get_value8(const unsigned long *const bitmap,
> + const unsigned int size,
> + const unsigned int start)


The comment says this function returns '8-bit value'.

The return type should be 'u8' instead of 'unsigned long', then.

Same for other helpers.



> +{
> + const size_t index = BIT_WORD(start);
> + const unsigned int offset = start % BITS_PER_LONG;
> + const unsigned int low_width = (offset + 8 > BITS_PER_LONG) ?
> + BITS_PER_LONG - offset : 8;
> + const unsigned long low = bitmap[index] >> offset;
> + const unsigned long high = (low_width < 8 && start + 8 <= size) ?
> + bitmap[index + 1] << low_width : 0;


I do not know if we have a use case
where 'start' is not a multiple of 8, though.



--
Best Regards
Masahiro Yamada
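
Both review points can be tried out in userspace. The sketch below follows the logic of the quoted bitmap_get_value8, with the return type narrowed to uint8_t as suggested (v9 itself returns unsigned long), and shows what the low/high split does when 'start' is not a multiple of 8 and the value straddles a word boundary. The sk_ names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SK_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

uint8_t sk_get_value8(const unsigned long *bitmap, unsigned int size,
		      unsigned int start)
{
	size_t index = start / SK_BITS_PER_LONG;
	unsigned int offset = start % SK_BITS_PER_LONG;
	/* number of bits of the value that live in the current word */
	unsigned int low_width = (offset + 8 > SK_BITS_PER_LONG) ?
				 SK_BITS_PER_LONG - offset : 8;
	unsigned long low = bitmap[index] >> offset;
	/* remaining bits come from the next word, if the bitmap has one */
	unsigned long high = (low_width < 8 && start + 8 <= size) ?
			     bitmap[index + 1] << low_width : 0;

	assert(start < size);
	return (uint8_t)((low | high) & 0xFF);
}
```

For a two-word bitmap whose first word ends in the nibble 0xA and whose second word starts with the nibble 0x5, reading at start = SK_BITS_PER_LONG - 4 yields 0x5A. When every caller passes a multiple of 8, low_width is always 8 and the high branch is never taken, which is the case the review comment is asking about.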

2019-03-12 05:38:18

by Masahiro Yamada

Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Fri, Mar 8, 2019 at 5:57 PM William Breathitt Gray
<[email protected]> wrote:
>
> On Fri, Mar 08, 2019 at 09:31:00AM +0100, Linus Walleij wrote:
> > On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
> > <[email protected]> wrote:
> >
> > > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > > within a bitmap memory region. For each iteration, "start" is set to the
> > > bit offset of the found clump, while the respective clump value is
> > > stored to the location pointed by "clump". Additionally, the
> > > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > > respectively get and set an 8-bit value in a bitmap memory region.
> > >
> > > Suggested-by: Andy Shevchenko <[email protected]>
> > > Suggested-by: Rasmus Villemoes <[email protected]>
> > > Cc: Arnd Bergmann <[email protected]>
> > > Cc: Andrew Morton <[email protected]>
> > > Reviewed-by: Andy Shevchenko <[email protected]>
> > > Reviewed-by: Linus Walleij <[email protected]>
> > > Signed-off-by: William Breathitt Gray <[email protected]>
> >
> > Andrew: would you be OK with this being merged in v5.1?
> >
> > If we need to move the code to drivers/gpio that's OK (though
> > I think it's generally useful) but I need to know to proceed with
> > William's nice optimization of these drivers.
> >
> > Yours,
> > Linus Walleij
>
> I was waiting on Andy to suggest some examples outside the GPIO realm,
> but he may be under a heavy workload right now, so I decided to do a
> quick search for one.
>
> In drivers/of/unittest.c, there is a loop over a bitmap in the
> of_unittest_destroy_tracked_overlays function:
>
> for (id = MAX_UNITTEST_OVERLAYS - 1; id >= 0; id--) {
> if (!(overlay_id_bits[BIT_WORD(id)] & BIT_MASK(id)))
> continue;
>
> This section of code is checking each bit individually, and skipping if
> that bit is not set. This looping can be optimized by using the
> for_each_set_clump8 macro


Probably not.


I see this comment before the loop.
/* remove in reverse order */


Also, the unittest code operates per bit,
whereas your helper operates per byte.





> to skip clumps of nonset bits (not to mention
> make the logic of the code much simpler and easier to follow by reducing
> the code to a single line):
>
> for_each_set_clump8(id, clump, overlay_id_bits, MAX_UNITTEST_OVERLAYS-1)
>
> The for_each_set_clump8 macro is not specific to the GPIO subsystem; I
> just happen to use it in these GPIO drivers simply because I am most
> familiar with this section of the kernel (it's where most of my
> contributions occur, after all).
>
> Consider this: if I am able to find a use for this macro outside of the
> GPIO subsystem within a matter of minutes, then there must be some benefit
> in allowing the rest of the kernel to use the for_each_set_clump8 macro.
> So let's put it in bitops.h rather than restrict it to just the GPIO
> subsystem.


If we do not find useful cases in other subsystems,
this patch set looks over-engineered to me.






> William Breathitt Gray


--
Best Regards
Masahiro Yamada

2019-03-12 07:15:14

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 7:04 AM Masahiro Yamada
<[email protected]> wrote:
> On Sun, Mar 3, 2019 at 4:48 PM William Breathitt Gray
> <[email protected]> wrote:
> >
> > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to the
> > bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > respectively get and set an 8-bit value in a bitmap memory region.

> > +/**
> > + * bitmap_get_value8 - get an 8-bit value within a memory region
> > + * @bitmap: address to the bitmap memory region
> > + * @size: bitmap size in number of bits
> > + * @start: bit offset of the 8-bit value
> > + *
> > + * Returns the 8-bit value located at the @start bit offset within the @bitmap
> > + * memory region.
> > + */
> > +unsigned long bitmap_get_value8(const unsigned long *const bitmap,
> > + const unsigned int size,
> > + const unsigned int start)
>
>
> The comment says this function returns '8-bit value'.
>
> The return type should be 'u8' instead of 'unsigned long', then.
>
> Same for other helpers.

This is done in a way to be consistent with the rest of the bitmap API.
None of them returns a boolean for a single bit, for example.

--
With Best Regards,
Andy Shevchenko
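
A minimal userspace sketch of the return-type point, assuming byte-aligned offsets: keeping the value as unsigned long lets it be shifted and OR-ed back into bitmap words without casts. The names get_value8, set_value8 and WORD_BITS are hypothetical stand-ins, not the functions from the patch, and the size argument is omitted for brevity.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in one bitmap word (hypothetical stand-in for BITS_PER_LONG). */
#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Read the byte-aligned 8-bit value at bit offset start. */
static unsigned long get_value8(const unsigned long *bitmap, unsigned int start)
{
	assert(start % 8 == 0);
	return (bitmap[start / WORD_BITS] >> (start % WORD_BITS)) & 0xFFUL;
}

/* Overwrite the byte-aligned 8-bit value at bit offset start. */
static void set_value8(unsigned long *bitmap, unsigned long value,
		       unsigned int start)
{
	const unsigned int shift = start % WORD_BITS;
	unsigned long *word = &bitmap[start / WORD_BITS];

	assert(start % 8 == 0);
	*word &= ~(0xFFUL << shift);		/* clear the old byte */
	*word |= (value & 0xFFUL) << shift;	/* store the new byte */
}
```

Only the low 8 bits of the returned word are ever set, so a caller that wants a u8 can still narrow the result explicitly.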

2019-03-12 07:19:00

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v9 9/9] gpio: uniphier: Utilize for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 6:40 AM Masahiro Yamada
<[email protected]> wrote:
> On Sun, Mar 3, 2019 at 4:51 PM William Breathitt Gray
> <[email protected]> wrote:
> >
> > Replace verbose implementation in set_multiple callback with
> > for_each_set_clump8 macro to simplify code and improve clarity. An
> > improvement in this case is that banks that are not masked will now be
> > skipped.

> Please do not do this.
>
> Nothing in this driver says the GPIO width is 8-bit.

Huh?

https://elixir.bootlin.com/linux/latest/source/include/dt-bindings/gpio/uniphier-gpio.h#L9

Isn't that hardcoding?

--
With Best Regards,
Andy Shevchenko

2019-03-12 07:23:36

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 02:36:21PM +0900, Masahiro Yamada wrote:
> On Fri, Mar 8, 2019 at 5:57 PM William Breathitt Gray
> <[email protected]> wrote:
> >
> > On Fri, Mar 08, 2019 at 09:31:00AM +0100, Linus Walleij wrote:
> > > On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
> > > <[email protected]> wrote:
> > >
> > > > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > > > within a bitmap memory region. For each iteration, "start" is set to the
> > > > bit offset of the found clump, while the respective clump value is
> > > > stored to the location pointed by "clump". Additionally, the
> > > > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > > > respectively get and set an 8-bit value in a bitmap memory region.
> > > >
> > > > Suggested-by: Andy Shevchenko <[email protected]>
> > > > Suggested-by: Rasmus Villemoes <[email protected]>
> > > > Cc: Arnd Bergmann <[email protected]>
> > > > Cc: Andrew Morton <[email protected]>
> > > > Reviewed-by: Andy Shevchenko <[email protected]>
> > > > Reviewed-by: Linus Walleij <[email protected]>
> > > > Signed-off-by: William Breathitt Gray <[email protected]>
> > >
> > > Andrew: would you be OK with this being merged in v5.1?
> > >
> > > If we need to move the code to drivers/gpio that's OK (though
> > > I think it's generally useful) but I need to know to proceed with
> > > William's nice optimization of these drivers.
> > >
> > > Yours,
> > > Linus Walleij
> >
> > I was waiting on Andy to suggest some examples out of the GPIO realm,
> > but he may be under a heavy workload right now so I decided to do a quick
> > search for one.
> >
> > In drivers/of/unittest.c, there is a loop across a bitmap in the
> > of_unittest_destroy_tracked_overlays function:
> >
> > 	for (id = MAX_UNITTEST_OVERLAYS - 1; id >= 0; id--) {
> > 		if (!(overlay_id_bits[BIT_WORD(id)] & BIT_MASK(id)))
> > 			continue;
> >
> > This section of code is checking each bit individually, and skipping if
> > that bit is not set. This looping can be optimized by using the
> > for_each_set_clump8 macro
>
>
> Probably not.
>
>
> I see this comment before the loop.
> /* remove in reverse order */

You're right, for_each_set_clump8 wouldn't work in this case since it
does not loop in reverse order. I shouldn't have rushed to find a case
and ignored the context of the code like that.
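
The ordering issue can be seen in a minimal userspace sketch that walks a bitmap one 8-bit clump at a time. The names next_clump8, visit_order and WORD_BITS are hypothetical, and the linear scan is a simplification rather than the implementation from the patch; it only shows that offsets come back in ascending order.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in one bitmap word (hypothetical stand-in for BITS_PER_LONG). */
#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Return the bit offset of the first non-zero 8-bit clump at or after
 * offset and store its value in *clump; return size if none remains.
 */
static unsigned int next_clump8(unsigned long *clump, const unsigned long *bits,
				unsigned int size, unsigned int offset)
{
	for (offset -= offset % 8; offset < size; offset += 8) {
		unsigned long v = (bits[offset / WORD_BITS] >>
				   (offset % WORD_BITS)) & 0xFFUL;

		if (v) {
			*clump = v;
			return offset;
		}
	}
	return size;
}

/* Record the offsets of the set clumps in the order they are visited. */
static unsigned int visit_order(const unsigned long *bits, unsigned int size,
				unsigned int *offsets, unsigned int max)
{
	unsigned long clump;
	unsigned int start, n = 0;

	for (start = next_clump8(&clump, bits, size, 0);
	     start < size && n < max;
	     start = next_clump8(&clump, bits, size, start + 8))
		offsets[n++] = start;

	return n;
}
```

For the bitmap 0xBE00FF33 from the cover letter, the walk visits offsets 0, 8 and 24, lowest first. A loop that must remove entries in reverse order would need a separate descending helper, and it would still have to split each clump into individual bits.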

Since Andy appears to have hardware outside of the GPIO subsystem he's
testing, let's wait for that and see how it turns out.

William Breathitt Gray

>
>
> Also, the unittest code operates per bit,
> whereas your helper operates per byte.
>
>
>
>
>
> > to skip clumps of nonset bits (not to mention
> > make the logic of the code much simpler and easier to follow by reducing
> > the code to a single line):
> >
> > for_each_set_clump8(id, clump, overlay_id_bits, MAX_UNITTEST_OVERLAYS-1)
> >
> > The for_each_set_clump8 macro is not specific to the GPIO subsystem; I
> > just happen to use it in these GPIO drivers simply because I am most
> > familiar with this section of the kernel (it's where most of my
> > contributions occur, after all).
> >
> > Consider this: if I am able to find a use for this macro outside of the
> > GPIO subsystem within a matter of minutes, then there must be some benefit
> > in allowing the rest of the kernel to use the for_each_set_clump8 macro.
> > So let's put it in bitops.h rather than restrict it to just the GPIO
> > subsystem.
>
>
> If we do not find useful cases in other subsystems,
> this patch set looks over-engineered to me.
>
>
>
>
>
>
> > William Breathitt Gray
>
>
> --
> Best Regards
> Masahiro Yamada

2019-03-12 07:29:27

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [PATCH v9 9/9] gpio: uniphier: Utilize for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 01:36:57PM +0900, Masahiro Yamada wrote:
> On Sun, Mar 3, 2019 at 4:51 PM William Breathitt Gray
> <[email protected]> wrote:
> >
> > Replace verbose implementation in set_multiple callback with
> > for_each_set_clump8 macro to simplify code and improve clarity. An
> > improvement in this case is that banks that are not masked will now be
> > skipped.
> >
> > Cc: Masahiro Yamada <[email protected]>
> > Signed-off-by: William Breathitt Gray <[email protected]>
> > ---
> > drivers/gpio/gpio-uniphier.c | 16 ++++++----------
> > 1 file changed, 6 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpio/gpio-uniphier.c b/drivers/gpio/gpio-uniphier.c
> > index 0f662b297a95..df640cb29b9c 100644
> > --- a/drivers/gpio/gpio-uniphier.c
> > +++ b/drivers/gpio/gpio-uniphier.c
> > @@ -15,9 +15,6 @@
> >  #include <linux/spinlock.h>
> >  #include <dt-bindings/gpio/uniphier-gpio.h>
> >
> > -#define UNIPHIER_GPIO_BANK_MASK		\
> > -	GENMASK((UNIPHIER_GPIO_LINES_PER_BANK) - 1, 0)
> > -
> >  #define UNIPHIER_GPIO_IRQ_MAX_NUM	24
> >
> >  #define UNIPHIER_GPIO_PORT_DATA		0x0	/* data */
> > @@ -147,15 +144,14 @@ static void uniphier_gpio_set(struct gpio_chip *chip,
> >  static void uniphier_gpio_set_multiple(struct gpio_chip *chip,
> >  				       unsigned long *mask, unsigned long *bits)
> >  {
> > -	unsigned int bank, shift, bank_mask, bank_bits;
> > -	int i;
> > +	unsigned int i;
> > +	unsigned long bank_mask;
> > +	unsigned int bank;
> > +	unsigned int bank_bits;
> >
> > -	for (i = 0; i < chip->ngpio; i += UNIPHIER_GPIO_LINES_PER_BANK) {
> > +	for_each_set_clump8(i, bank_mask, mask, chip->ngpio) {
> >  		bank = i / UNIPHIER_GPIO_LINES_PER_BANK;
> > -		shift = i % BITS_PER_LONG;
> > -		bank_mask = (mask[BIT_WORD(i)] >> shift) &
> > -			UNIPHIER_GPIO_BANK_MASK;
> > -		bank_bits = bits[BIT_WORD(i)] >> shift;
> > +		bank_bits = bitmap_get_value8(bits, chip->ngpio, i);
> >
> >  		uniphier_gpio_bank_write(chip, bank, UNIPHIER_GPIO_PORT_DATA,
> >  					 bank_mask, bank_bits);
>
>
> Please do not do this.
>
> Nothing in this driver says the GPIO width is 8-bit.
>
> You are hard-coding '8-bit'.

The for_each_set_clump8 macro is hardcoded to 8-bit clumps because the
current drivers utilizing this functionality are only updating their
GPIO ports 8 bits at a time.

However, if this driver updates more or fewer GPIOs at a time, we can
easily update the macro by replacing the hardcoded '8' value with a
variable, thus giving us the generic for_each_set_clump macro.
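
A rough userspace sketch of what such a variable-width lookup could look like, limited to widths that evenly divide the word size so that a clump never straddles two words. The names next_clump, clump_mask and WORD_BITS are hypothetical; this is not code from the patchset, and wider or unaligned clumps would need extra handling.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in one bitmap word (hypothetical stand-in for BITS_PER_LONG). */
#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Mask covering the low width bits of a word. */
static unsigned long clump_mask(unsigned int width)
{
	return width >= WORD_BITS ? ~0UL : (1UL << width) - 1;
}

/*
 * Return the bit offset of the first non-zero clump of the given width at
 * or after offset and store its value in *clump; return size if none remains.
 */
static unsigned int next_clump(unsigned long *clump, const unsigned long *bits,
			       unsigned int size, unsigned int offset,
			       unsigned int width)
{
	/* Only widths that divide the word size are supported in this sketch. */
	assert(width > 0 && WORD_BITS % width == 0);

	for (offset -= offset % width; offset < size; offset += width) {
		unsigned long v = (bits[offset / WORD_BITS] >>
				   (offset % WORD_BITS)) & clump_mask(width);

		if (v) {
			*clump = v;
			return offset;
		}
	}
	return size;
}
```

With width fixed at 8 this reduces to the 8-bit behaviour described in the cover letter; the open question in the thread is whether any driver actually needs another width.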

How many GPIOs can be updated at a time for this device?

William Breathitt Gray

>
>
>
>
>
>
>
> > --
> > 2.21.0
> >
>
>
> --
> Best Regards
> Masahiro Yamada

2019-03-12 08:58:50

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH v9 9/9] gpio: uniphier: Utilize for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 4:19 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Tue, Mar 12, 2019 at 6:40 AM Masahiro Yamada
> <[email protected]> wrote:
> > On Sun, Mar 3, 2019 at 4:51 PM William Breathitt Gray
> > <[email protected]> wrote:
> > >
> > > Replace verbose implementation in set_multiple callback with
> > > for_each_set_clump8 macro to simplify code and improve clarity. An
> > > improvement in this case is that banks that are not masked will now be
> > > skipped.
>
> > Please do not do this.
> >
> > Nothing in this driver says the GPIO width is 8-bit.
>
> Huh?
>
> https://elixir.bootlin.com/linux/latest/source/include/dt-bindings/gpio/uniphier-gpio.h#L9
>
> Isn't that hardcoding?


Semi-hardcoding.

I needed to factor out some magic numbers
shared between DT and drivers.

Then, dt-bindings are outside the realm of the operating system.

If I am wrong, I take back my comments, though.



--
Best Regards
Masahiro Yamada

2019-03-12 09:11:28

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v9 9/9] gpio: uniphier: Utilize for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 10:58 AM Masahiro Yamada
<[email protected]> wrote:
> On Tue, Mar 12, 2019 at 4:19 PM Andy Shevchenko
> <[email protected]> wrote:
> > On Tue, Mar 12, 2019 at 6:40 AM Masahiro Yamada
> > <[email protected]> wrote:
> > > On Sun, Mar 3, 2019 at 4:51 PM William Breathitt Gray
> > > <[email protected]> wrote:
> > > >
> > > > Replace verbose implementation in set_multiple callback with
> > > > for_each_set_clump8 macro to simplify code and improve clarity. An
> > > > improvement in this case is that banks that are not masked will now be
> > > > skipped.
> >
> > > Please do not do this.
> > >
> > > Nothing in this driver says the GPIO width is 8-bit.
> >
> > Huh?
> >
> > https://elixir.bootlin.com/linux/latest/source/include/dt-bindings/gpio/uniphier-gpio.h#L9
> >
> > Isn't that hardcoding?
>
>
> Semi-hardcoding.
>
> I needed to factor out some magic numbers
> shared between DT and drivers.

That effectively means you introduced an ABI, which we are not supposed to
change, where the number is carved in stone for all hardware covered
by this driver + DT pair.
If you ever need another one, it would require extending the existing
bindings without dropping them.

> Then, dt-bindings are outside the realm of the operating system.

Exactly!

> If I am wrong, I take back my comments, though.

--
With Best Regards,
Andy Shevchenko
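
One way a driver could make that fixed width explicit is a compile-time check next to the code that relies on it. The sketch below assumes the binding header defines UNIPHIER_GPIO_LINES_PER_BANK as 8 (redefined locally here so the example stands alone), and bank_of is a hypothetical helper mirroring the "bank = i / UNIPHIER_GPIO_LINES_PER_BANK" line in the patch.

```c
#include <assert.h>

/* Assumed to mirror the value in include/dt-bindings/gpio/uniphier-gpio.h. */
#define UNIPHIER_GPIO_LINES_PER_BANK 8

/* Fail the build if the bank width ever stops matching the 8-bit helper. */
_Static_assert(UNIPHIER_GPIO_LINES_PER_BANK == 8,
	       "for_each_set_clump8 assumes 8 lines per bank");

/* Map the bit offset of a clump to its bank index (hypothetical helper). */
static unsigned int bank_of(unsigned int start)
{
	assert(start % UNIPHIER_GPIO_LINES_PER_BANK == 0);
	return start / UNIPHIER_GPIO_LINES_PER_BANK;
}
```

If the bindings were ever extended with a different bank width, the assertion would flag every place that still depends on 8-bit clumps.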

2019-03-12 10:43:20

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Mon, Mar 11, 2019 at 06:01:13PM -0700, Andrew Morton wrote:
> On Fri, 8 Mar 2019 09:31:00 +0100 Linus Walleij <[email protected]> wrote:
>
> > On Sun, Mar 3, 2019 at 8:47 AM William Breathitt Gray
> > <[email protected]> wrote:
> >
> > > This macro iterates for each 8-bit group of bits (clump) with set bits,
> > > within a bitmap memory region. For each iteration, "start" is set to the
> > > bit offset of the found clump, while the respective clump value is
> > > stored to the location pointed by "clump". Additionally, the
> > > bitmap_get_value8 and bitmap_set_value8 functions are introduced to
> > > respectively get and set an 8-bit value in a bitmap memory region.
> > >
> > > Suggested-by: Andy Shevchenko <[email protected]>
> > > Suggested-by: Rasmus Villemoes <[email protected]>
> > > Cc: Arnd Bergmann <[email protected]>
> > > Cc: Andrew Morton <[email protected]>
> > > Reviewed-by: Andy Shevchenko <[email protected]>
> > > Reviewed-by: Linus Walleij <[email protected]>
> > > Signed-off-by: William Breathitt Gray <[email protected]>
> >
> > Andrew: would you be OK with this being merged in v5.1?
>
> Yup. We have quite a few users there. I assume this will go via the
> gpio tree?
>
> Feel free to add Acked-by: Andrew Morton <[email protected]>,
> although it probably isn't worth churning the git tree to do so at this
> late stage - your call.

Linus,

I discovered a bug in this version of the patchset. I'll release a
version 10 once I've resolved the issue.

William Breathitt Gray

2019-03-12 14:54:55

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v9 1/9] bitops: Introduce the for_each_set_clump8 macro

On Tue, Mar 12, 2019 at 04:22:22PM +0900, William Breathitt Gray wrote:

> Since Andy appears to have hardware outside of the GPIO subsystem he's
> testing, let's wait for that and see how it turns out.

Since I still don't have much time, here is the driver I'm talking about:
drivers/thermal/intel/intel_soc_dts_iosf.c

If you have a chance to look at it (add_dts_thermal_zone(), for example) and
prepare a patch, I will be able to test it on real hardware.

--
With Best Regards,
Andy Shevchenko