2018-10-02 01:14:06

by William Breathitt Gray

Subject: [RESEND PATCH v4 0/8] Introduce the for_each_set_clump macro

This is a resend of v4 in hopes of getting some more ACKs and a few more
eyes on this patchset.

Changes in v4:
- Fix bitmap_set arguments (last parameter is nbits not endbit)

While adding GPIO get_multiple/set_multiple callback support for various
drivers, I noticed a recurring loop pattern that would be useful to
standardize as a macro.

This patchset introduces the for_each_set_clump macro and utilizes it in
several GPIO drivers. The for_each_set_clump macro provides a for-loop
syntax that iterates over a bitmap one group of bits (clump) at a time,
visiting only the groups that contain at least one set bit.

For example, suppose you would like to iterate over a 16-bit integer 4
bits at a time, skipping over 4-bit groups with no set bit, where XXXX
represents the current 4-bit group:

Example: 1011 1110 0000 1111
First loop: 1011 1110 0000 XXXX
Second loop: 1011 XXXX 0000 1111
Third loop: XXXX 1110 0000 1111

Each iteration of the loop returns the next 4-bit group that has at
least one set bit.

The for_each_set_clump macro has six parameters:

* clump: set to current clump index for the iteration
* index: set to current bitmap word index for the iteration
* offset: bits offset of the found clump in the bitmap word
* bits: bitmap to search within
* size: bitmap size in number of clumps
* clump_size: clump size in number of bits

The clump_size argument can be an arbitrary number of bits and is not
required to be a multiple of 2.

This patchset was rebased on top of the following three commits:

* commit aaf96e51de11 ("gpio: pci-idio-16: Fix port memory offset for get_multiple callback")
* commit 304440aa96c6 ("gpio: pcie-idio-24: Fix port memory offset for get_multiple/set_multiple callbacks")
* commit e026646c178d ("gpio: pcie-idio-24: Fix off-by-one error in get_multiple loop")

When I implemented the test_for_each_set_clump function, I used
bitmap_set to build the expected bitmap for the test. Setting the bits
one segment at a time this way was rather tedious and error-prone; is
there a better way to initialize a bitmap declared with DECLARE_BITMAP?

William Breathitt Gray

William Breathitt Gray (8):
bitops: Introduce the for_each_set_clump macro
lib/test_bitmap.c: Add for_each_set_clump test cases
gpio: 104-dio-48e: Utilize for_each_set_clump macro
gpio: 104-idi-48: Utilize for_each_set_clump macro
gpio: gpio-mm: Utilize for_each_set_clump macro
gpio: ws16c48: Utilize for_each_set_clump macro
gpio: pci-idio-16: Utilize for_each_set_clump macro
gpio: pcie-idio-24: Utilize for_each_set_clump macro

drivers/gpio/gpio-104-dio-48e.c | 67 +++++---------------
drivers/gpio/gpio-104-idi-48.c | 32 ++--------
drivers/gpio/gpio-gpio-mm.c | 67 +++++---------------
drivers/gpio/gpio-pci-idio-16.c | 67 ++++++--------------
drivers/gpio/gpio-pcie-idio-24.c | 102 +++++++++++-------------------
drivers/gpio/gpio-ws16c48.c | 66 +++++--------------
include/asm-generic/bitops/find.h | 9 +++
include/linux/bitops.h | 7 ++
lib/find_bit.c | 40 ++++++++++++
lib/test_bitmap.c | 71 +++++++++++++++++++++
10 files changed, 236 insertions(+), 292 deletions(-)

--
2.19.0



2018-10-02 01:14:23

by William Breathitt Gray

Subject: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

This macro iterates over each group of bits (clump) that has at least one
set bit within a bitmap memory region. For each iteration, "clump" is set
to the found clump index, "index" is set to the word index of the bitmap
containing the found clump, and "offset" is set to the bit offset of the
found clump within the respective bitmap word.

Suggested-by: Andy Shevchenko <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
include/asm-generic/bitops/find.h | 9 +++++++
include/linux/bitops.h | 7 ++++++
lib/find_bit.c | 40 +++++++++++++++++++++++++++++++
3 files changed, 56 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 8a1ee10014de..3d3b2fc34908 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -2,6 +2,8 @@
#ifndef _ASM_GENERIC_BITOPS_FIND_H_
#define _ASM_GENERIC_BITOPS_FIND_H_

+#include <linux/types.h>
+
#ifndef find_next_bit
/**
* find_next_bit - find the next set bit in a memory region
@@ -80,4 +82,11 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,

#endif /* CONFIG_GENERIC_FIND_FIRST_BIT */

+size_t find_next_clump(size_t *const index, unsigned int *const offset,
+ const unsigned long *const bits, const size_t size,
+ const size_t clump_index, const unsigned int clump_size);
+
+#define find_first_clump(index, offset, bits, size, clump_size) \
+ find_next_clump((index), (offset), (bits), (size), 0, (clump_size))
+
#endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 7ddb1349394d..089381017f74 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -40,6 +40,13 @@ extern unsigned long __sw_hweight64(__u64 w);
(bit) < (size); \
(bit) = find_next_zero_bit((addr), (size), (bit) + 1))

+#define for_each_set_clump(clump, index, offset, bits, size, clump_size) \
+ for ((clump) = find_first_clump(&(index), &(offset), (bits), (size), \
+ (clump_size)); \
+ (clump) < (size); \
+ (clump) = find_next_clump(&(index), &(offset), (bits), (size), \
+ (clump) + 1, (clump_size)))
+
static inline int get_bitmask_order(unsigned int count)
{
int order;
diff --git a/lib/find_bit.c b/lib/find_bit.c
index ee3df93ba69a..1d547fe9304f 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -218,3 +218,43 @@ EXPORT_SYMBOL(find_next_bit_le);
#endif

#endif /* __BIG_ENDIAN */
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @index: location to store bitmap word index of found clump
+ * @offset: bits offset of the found clump within the respective bitmap word
+ * @bits: address to base the search on
+ * @size: bitmap size in number of clumps
+ * @clump_index: clump index at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the clump index for the next clump with set bits; the respective
+ * bitmap word index is stored at the location pointed by @index, and the bits
+ * offset of the found clump within the respective bitmap word is stored at the
+ * location pointed by @offset. If no bits are set, returns @size.
+ */
+size_t find_next_clump(size_t *const index, unsigned int *const offset,
+ const unsigned long *const bits, const size_t size,
+ const size_t clump_index, const unsigned int clump_size)
+{
+ size_t i;
+ unsigned int bits_offset;
+ unsigned long word_mask;
+ const unsigned long clump_mask = GENMASK(clump_size - 1, 0);
+
+ for (i = clump_index; i < size; i++) {
+ bits_offset = i * clump_size;
+
+ *index = BIT_WORD(bits_offset);
+ *offset = bits_offset % BITS_PER_LONG;
+
+ word_mask = bits[*index] & (clump_mask << *offset);
+ if (!word_mask)
+ continue;
+
+ return i;
+ }
+
+ return size;
+}
+EXPORT_SYMBOL(find_next_clump);
--
2.19.0


2018-10-02 01:14:56

by William Breathitt Gray

Subject: [RESEND PATCH v4 2/8] lib/test_bitmap.c: Add for_each_set_clump test cases

The introduction of the for_each_set_clump macro warrants test cases to
verify the implementation. This patch adds test case checks for whether
an out-of-bounds clump index is returned, a zero clump is returned, or
the returned clump value differs from the expected clump value. A 4-bit
clump size is chosen in order to verify non-8-bit iteration.

Cc: Andy Shevchenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
---
lib/test_bitmap.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 71 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6cd7d0740005..0a63313873c0 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -88,6 +88,39 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
return true;
}

+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+ const size_t clump_index, const size_t size,
+ const unsigned char *const clump_exp,
+ const unsigned long *const bits,
+ const size_t index,
+ const unsigned int offset)
+{
+ unsigned long clump;
+ unsigned long exp;
+
+ if (clump_index >= size) {
+ pr_warn("[%s:%u] clump index out-of-bounds: expected less than %zu, got %zu\n",
+ srcfile, line, size, clump_index);
+ return false;
+ }
+
+ exp = clump_exp[clump_index];
+ if (!exp) {
+ pr_warn("[%s:%u] clump index for zero clump: expected nonzero clump, got clump index %zu with clump value 0\n",
+ srcfile, line, clump_index);
+ return false;
+ }
+
+ clump = (bits[index] >> offset) & 0xF;
+ if (clump != exp) {
+ pr_warn("[%s:%u] expected 0x%lX, got 0x%lX\n",
+ srcfile, line, exp, clump);
+ return false;
+ }
+
+ return true;
+}
+
#define __expect_eq(suffix, ...) \
({ \
int result = 0; \
@@ -104,6 +137,7 @@ __check_eq_u32_array(const char *srcfile, unsigned int line,
#define expect_eq_bitmap(...) __expect_eq(bitmap, ##__VA_ARGS__)
#define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
#define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__)
+#define expect_eq_clump(...) __expect_eq(clump, ##__VA_ARGS__)

static void __init test_zero_clear(void)
{
@@ -361,6 +395,42 @@ static void noinline __init test_mem_optimisations(void)
}
}

+static const unsigned char clump_exp[] __initconst = {
+ 0x1, /* 1 bit set */
+ 0x2, /* non-edge 1 bit set */
+ 0x0, /* zero bits set */
+ 0xE, /* 3 bits set */
+ 0xE, /* Repeated clump */
+ 0xF, /* 4 bits set */
+ 0x3, /* 2 bits set */
+ 0x5, /* non-adjacent 2 bits set */
+};
+
+static void __init test_for_each_set_clump(void)
+{
+ size_t clump;
+ size_t index;
+ unsigned int offset;
+#define CLUMP_BITMAP_NUMBITS 32
+ DECLARE_BITMAP(bits, CLUMP_BITMAP_NUMBITS);
+#define CLUMP_SIZE 4
+ const size_t size = DIV_ROUND_UP(CLUMP_BITMAP_NUMBITS, CLUMP_SIZE);
+
+ /* set bitmap to test case */
+ bitmap_zero(bits, CLUMP_BITMAP_NUMBITS);
+ bitmap_set(bits, 0, 1); /* 0x1 */
+ bitmap_set(bits, 5, 1); /* 0x2 */
+ bitmap_set(bits, 13, 3); /* 0xE */
+ bitmap_set(bits, 17, 3); /* 0xE */
+ bitmap_set(bits, 20, 4); /* 0xF */
+ bitmap_set(bits, 24, 2); /* 0x3 */
+ bitmap_set(bits, 28, 1); /* 0x5 - part 1 */
+ bitmap_set(bits, 30, 1); /* 0x5 - part 2 */
+
+ for_each_set_clump(clump, index, offset, bits, size, CLUMP_SIZE)
+ expect_eq_clump(clump, size, clump_exp, bits, index, offset);
+}
+
static int __init test_bitmap_init(void)
{
test_zero_clear();
@@ -369,6 +439,7 @@ static int __init test_bitmap_init(void)
test_bitmap_arr32();
test_bitmap_parselist();
test_mem_optimisations();
+ test_for_each_set_clump();

if (failed_tests == 0)
pr_info("all %u tests passed\n", total_tests);
--
2.19.0


2018-10-02 01:15:09

by William Breathitt Gray

Subject: [RESEND PATCH v4 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-104-dio-48e.c | 67 ++++++++-------------------------
1 file changed, 16 insertions(+), 51 deletions(-)

diff --git a/drivers/gpio/gpio-104-dio-48e.c b/drivers/gpio/gpio-104-dio-48e.c
index 9c4e07fcb74b..77eeaa36094c 100644
--- a/drivers/gpio/gpio-104-dio-48e.c
+++ b/drivers/gpio/gpio-104-dio-48e.c
@@ -183,46 +183,23 @@ static int dio48e_gpio_get(struct gpio_chip *chip, unsigned offset)
return !!(port_state & mask);
}

+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
static int dio48e_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
{
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
size_t i;
- static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ size_t word;
+ unsigned int offset;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
port_state = inb(dio48egpio->base + ports[i]);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= port_state << word_offset;
+ bits[word] |= port_state << offset;
}

return 0;
@@ -252,37 +229,25 @@ static void dio48e_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct dio48e_gpio *const dio48egpio = gpiochip_get_data(chip);
- unsigned int i;
- const unsigned int gpio_reg_size = 8;
- unsigned int port;
- unsigned int out_port;
+ size_t i;
+ size_t word;
+ unsigned int offset;
+ unsigned int iomask;
unsigned int bitmask;
unsigned long flags;

- /* set bits are evaluated a gpio register size at a time */
- for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
- /* no more set bits in this mask word; skip to the next word */
- if (!mask[BIT_WORD(i)]) {
- i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
- continue;
- }
-
- port = i / gpio_reg_size;
- out_port = (port > 2) ? port + 1 : port;
- bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
+ iomask = mask[word] >> offset;
+ bitmask = iomask & (bits[word] >> offset);

raw_spin_lock_irqsave(&dio48egpio->lock, flags);

/* update output state data and set device gpio register */
- dio48egpio->out_state[port] &= ~mask[BIT_WORD(i)];
- dio48egpio->out_state[port] |= bitmask;
- outb(dio48egpio->out_state[port], dio48egpio->base + out_port);
+ dio48egpio->out_state[i] &= ~iomask;
+ dio48egpio->out_state[i] |= bitmask;
+ outb(dio48egpio->out_state[i], dio48egpio->base + ports[i]);

raw_spin_unlock_irqrestore(&dio48egpio->lock, flags);
-
- /* prepare for next gpio register set */
- mask[BIT_WORD(i)] >>= gpio_reg_size;
- bits[BIT_WORD(i)] >>= gpio_reg_size;
}
}

--
2.19.0


2018-10-02 01:16:16

by William Breathitt Gray

Subject: [RESEND PATCH v4 7/8] gpio: pci-idio-16: Utilize for_each_set_clump macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-pci-idio-16.c | 67 +++++++++++----------------------
1 file changed, 21 insertions(+), 46 deletions(-)

diff --git a/drivers/gpio/gpio-pci-idio-16.c b/drivers/gpio/gpio-pci-idio-16.c
index 25d16b2af1c3..6d748c6e59cb 100644
--- a/drivers/gpio/gpio-pci-idio-16.c
+++ b/drivers/gpio/gpio-pci-idio-16.c
@@ -109,44 +109,20 @@ static int idio_16_gpio_get_multiple(struct gpio_chip *chip,
{
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
size_t i;
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
- unsigned long port_state;
+ size_t word;
+ unsigned int offset;
void __iomem *ports[] = {
&idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15,
&idio16gpio->reg->in0_7, &idio16gpio->reg->in8_15,
};
+ unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
port_state = ioread8(ports[i]);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= port_state << word_offset;
+ bits[word] |= port_state << offset;
}

return 0;
@@ -186,30 +162,29 @@ static void idio_16_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct idio_16_gpio *const idio16gpio = gpiochip_get_data(chip);
+ size_t i;
+ size_t word;
+ unsigned int offset;
+ void __iomem *ports[] = {
+ &idio16gpio->reg->out0_7, &idio16gpio->reg->out8_15,
+ };
+ unsigned int iomask;
+ unsigned int bitmask;
unsigned long flags;
unsigned int out_state;

- raw_spin_lock_irqsave(&idio16gpio->lock, flags);
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
+ iomask = mask[word] >> offset;
+ bitmask = iomask & (bits[word] >> offset);

- /* process output lines 0-7 */
- if (*mask & 0xFF) {
- out_state = ioread8(&idio16gpio->reg->out0_7) & ~*mask;
- out_state |= *mask & *bits;
- iowrite8(out_state, &idio16gpio->reg->out0_7);
- }
+ raw_spin_lock_irqsave(&idio16gpio->lock, flags);

- /* shift to next output line word */
- *mask >>= 8;
+ out_state = ioread8(ports[i]) & ~iomask;
+ out_state |= bitmask;
+ iowrite8(out_state, ports[i]);

- /* process output lines 8-15 */
- if (*mask & 0xFF) {
- *bits >>= 8;
- out_state = ioread8(&idio16gpio->reg->out8_15) & ~*mask;
- out_state |= *mask & *bits;
- iowrite8(out_state, &idio16gpio->reg->out8_15);
+ raw_spin_unlock_irqrestore(&idio16gpio->lock, flags);
}
-
- raw_spin_unlock_irqrestore(&idio16gpio->lock, flags);
}

static void idio_16_irq_ack(struct irq_data *data)
--
2.19.0


2018-10-02 01:16:33

by William Breathitt Gray

Subject: [RESEND PATCH v4 4/8] gpio: 104-idi-48: Utilize for_each_set_clump macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-104-idi-48.c | 32 ++++----------------------------
1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/drivers/gpio/gpio-104-idi-48.c b/drivers/gpio/gpio-104-idi-48.c
index 2c9738adb3a6..f8de5560174f 100644
--- a/drivers/gpio/gpio-104-idi-48.c
+++ b/drivers/gpio/gpio-104-idi-48.c
@@ -94,41 +94,17 @@ static int idi_48_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
{
struct idi_48_gpio *const idi48gpio = gpiochip_get_data(chip);
size_t i;
+ size_t word;
+ unsigned int offset;
static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
port_state = inb(idi48gpio->base + ports[i]);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= port_state << word_offset;
+ bits[word] |= port_state << offset;
}

return 0;
--
2.19.0


2018-10-02 01:16:44

by William Breathitt Gray

Subject: [RESEND PATCH v4 8/8] gpio: pcie-idio-24: Utilize for_each_set_clump macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-pcie-idio-24.c | 102 +++++++++++--------------------
1 file changed, 36 insertions(+), 66 deletions(-)

diff --git a/drivers/gpio/gpio-pcie-idio-24.c b/drivers/gpio/gpio-pcie-idio-24.c
index f953541e7890..b4d300338a05 100644
--- a/drivers/gpio/gpio-pcie-idio-24.c
+++ b/drivers/gpio/gpio-pcie-idio-24.c
@@ -199,41 +199,21 @@ static int idio_24_gpio_get_multiple(struct gpio_chip *chip,
{
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
size_t i;
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
- unsigned long port_state;
+ size_t word;
+ unsigned int offset;
void __iomem *ports[] = {
&idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15,
&idio24gpio->reg->out16_23, &idio24gpio->reg->in0_7,
&idio24gpio->reg->in8_15, &idio24gpio->reg->in16_23,
};
+ const size_t num_ports = ARRAY_SIZE(ports) + 1;
+ unsigned long port_state;
const unsigned long out_mode_mask = BIT(1);

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports) + 1; i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
+ for_each_set_clump(i, word, offset, mask, num_ports, 8) {
/* read bits from current gpio port (port 6 is TTL GPIO) */
if (i < 6)
port_state = ioread8(ports[i]);
@@ -243,7 +223,7 @@ static int idio_24_gpio_get_multiple(struct gpio_chip *chip,
port_state = ioread8(&idio24gpio->reg->ttl_in0_7);

/* store acquired bits at respective bits array offset */
- bits[word_index] |= port_state << word_offset;
+ bits[word] |= port_state << offset;
}

return 0;
@@ -295,58 +275,48 @@ static void idio_24_gpio_set_multiple(struct gpio_chip *chip,
{
struct idio_24_gpio *const idio24gpio = gpiochip_get_data(chip);
size_t i;
- unsigned long bits_offset;
- unsigned long gpio_mask;
- const unsigned int gpio_reg_size = 8;
- const unsigned long port_mask = GENMASK(gpio_reg_size, 0);
- unsigned long flags;
- unsigned int out_state;
+ size_t word;
+ unsigned int offset;
void __iomem *ports[] = {
&idio24gpio->reg->out0_7, &idio24gpio->reg->out8_15,
&idio24gpio->reg->out16_23
};
+ const size_t num_ports = ARRAY_SIZE(ports) + 1;
+ unsigned int iomask;
+ unsigned int bitmask;
+ unsigned long flags;
const unsigned long out_mode_mask = BIT(1);
- const unsigned int ttl_offset = 48;
- const size_t ttl_i = BIT_WORD(ttl_offset);
- const unsigned int word_offset = ttl_offset % BITS_PER_LONG;
- const unsigned long ttl_mask = (mask[ttl_i] >> word_offset) & port_mask;
- const unsigned long ttl_bits = (bits[ttl_i] >> word_offset) & ttl_mask;
-
- /* set bits are processed a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* check if any set bits for current port */
- gpio_mask = (*mask >> bits_offset) & port_mask;
- if (!gpio_mask) {
- /* no set bits for this port so move on to next port */
+ unsigned int out_state;
+
+ for_each_set_clump(i, word, offset, mask, num_ports, 8) {
+ iomask = mask[word] >> offset;
+ bitmask = iomask & (bits[word] >> offset);
+
+ raw_spin_lock_irqsave(&idio24gpio->lock, flags);
+
+ /* read bits from current gpio port (port 6 is TTL GPIO) */
+ if (i < 6) {
+ out_state = ioread8(ports[i]) & ~iomask;
+ } else if (ioread8(&idio24gpio->reg->ctl) & out_mode_mask) {
+ out_state = ioread8(&idio24gpio->reg->ttl_out0_7);
+ } else {
+ /* skip TTL GPIO if set for input */
+ raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
continue;
}

- raw_spin_lock_irqsave(&idio24gpio->lock, flags);
+ /* set requested bit states */
+ out_state &= ~iomask;
+ out_state |= bitmask;

- /* process output lines */
- out_state = ioread8(ports[i]) & ~gpio_mask;
- out_state |= (*bits >> bits_offset) & gpio_mask;
- iowrite8(out_state, ports[i]);
+ /* write bits for current gpio port (port 6 is TTL GPIO) */
+ if (i < 6)
+ iowrite8(out_state, ports[i]);
+ else
+ iowrite8(out_state, &idio24gpio->reg->ttl_out0_7);

raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
}
-
- /* check if setting TTL lines and if they are in output mode */
- if (!ttl_mask || !(ioread8(&idio24gpio->reg->ctl) & out_mode_mask))
- return;
-
- /* handle TTL output */
- raw_spin_lock_irqsave(&idio24gpio->lock, flags);
-
- /* process output lines */
- out_state = ioread8(&idio24gpio->reg->ttl_out0_7) & ~ttl_mask;
- out_state |= ttl_bits;
- iowrite8(out_state, &idio24gpio->reg->ttl_out0_7);
-
- raw_spin_unlock_irqrestore(&idio24gpio->lock, flags);
}

static void idio_24_irq_ack(struct irq_data *data)
--
2.19.0


2018-10-02 01:16:58

by William Breathitt Gray

Subject: [RESEND PATCH v4 5/8] gpio: gpio-mm: Utilize for_each_set_clump macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-gpio-mm.c | 67 +++++++++----------------------------
1 file changed, 16 insertions(+), 51 deletions(-)

diff --git a/drivers/gpio/gpio-gpio-mm.c b/drivers/gpio/gpio-gpio-mm.c
index b56ff2efbf36..72668da8bf8d 100644
--- a/drivers/gpio/gpio-gpio-mm.c
+++ b/drivers/gpio/gpio-gpio-mm.c
@@ -172,46 +172,23 @@ static int gpiomm_gpio_get(struct gpio_chip *chip, unsigned int offset)
return !!(port_state & mask);
}

+static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
+
static int gpiomm_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask,
unsigned long *bits)
{
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
size_t i;
- static const size_t ports[] = { 0, 1, 2, 4, 5, 6 };
- const unsigned int gpio_reg_size = 8;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ size_t word;
+ unsigned int offset;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < ARRAY_SIZE(ports); i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
port_state = inb(gpiommgpio->base + ports[i]);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= port_state << word_offset;
+ bits[word] |= port_state << offset;
}

return 0;
@@ -242,37 +219,25 @@ static void gpiomm_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct gpiomm_gpio *const gpiommgpio = gpiochip_get_data(chip);
- unsigned int i;
- const unsigned int gpio_reg_size = 8;
- unsigned int port;
- unsigned int out_port;
+ size_t i;
+ size_t word;
+ unsigned int offset;
+ unsigned int iomask;
unsigned int bitmask;
unsigned long flags;

- /* set bits are evaluated a gpio register size at a time */
- for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
- /* no more set bits in this mask word; skip to the next word */
- if (!mask[BIT_WORD(i)]) {
- i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
- continue;
- }
-
- port = i / gpio_reg_size;
- out_port = (port > 2) ? port + 1 : port;
- bitmask = mask[BIT_WORD(i)] & bits[BIT_WORD(i)];
+ for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
+ iomask = mask[word] >> offset;
+ bitmask = iomask & (bits[word] >> offset);

spin_lock_irqsave(&gpiommgpio->lock, flags);

/* update output state data and set device gpio register */
- gpiommgpio->out_state[port] &= ~mask[BIT_WORD(i)];
- gpiommgpio->out_state[port] |= bitmask;
- outb(gpiommgpio->out_state[port], gpiommgpio->base + out_port);
+ gpiommgpio->out_state[i] &= ~iomask;
+ gpiommgpio->out_state[i] |= bitmask;
+ outb(gpiommgpio->out_state[i], gpiommgpio->base + ports[i]);

spin_unlock_irqrestore(&gpiommgpio->lock, flags);
-
- /* prepare for next gpio register set */
- mask[BIT_WORD(i)] >>= gpio_reg_size;
- bits[BIT_WORD(i)] >>= gpio_reg_size;
}
}

--
2.19.0


2018-10-02 01:17:11

by William Breathitt Gray

Subject: [RESEND PATCH v4 6/8] gpio: ws16c48: Utilize for_each_set_clump macro

Replace verbose implementation in get_multiple/set_multiple callbacks
with for_each_set_clump macro to simplify code and improve clarity.

Signed-off-by: William Breathitt Gray <[email protected]>
---
drivers/gpio/gpio-ws16c48.c | 66 +++++++++----------------------------
1 file changed, 16 insertions(+), 50 deletions(-)

diff --git a/drivers/gpio/gpio-ws16c48.c b/drivers/gpio/gpio-ws16c48.c
index c7028eb0b8e1..625336376b5d 100644
--- a/drivers/gpio/gpio-ws16c48.c
+++ b/drivers/gpio/gpio-ws16c48.c
@@ -134,42 +134,19 @@ static int ws16c48_gpio_get_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
- const unsigned int gpio_reg_size = 8;
- size_t i;
- const size_t num_ports = chip->ngpio / gpio_reg_size;
- unsigned int bits_offset;
- size_t word_index;
- unsigned int word_offset;
- unsigned long word_mask;
- const unsigned long port_mask = GENMASK(gpio_reg_size - 1, 0);
+ size_t port;
+ size_t word;
+ unsigned int offset;
+ const unsigned int port_size = 8;
+ const size_t num_ports = chip->ngpio / port_size;
unsigned long port_state;

/* clear bits array to a clean slate */
bitmap_zero(bits, chip->ngpio);

- /* get bits are evaluated a gpio port register at a time */
- for (i = 0; i < num_ports; i++) {
- /* gpio offset in bits array */
- bits_offset = i * gpio_reg_size;
-
- /* word index for bits array */
- word_index = BIT_WORD(bits_offset);
-
- /* gpio offset within current word of bits array */
- word_offset = bits_offset % BITS_PER_LONG;
-
- /* mask of get bits for current gpio within current word */
- word_mask = mask[word_index] & (port_mask << word_offset);
- if (!word_mask) {
- /* no get bits in this port so skip to next one */
- continue;
- }
-
- /* read bits from current gpio port */
- port_state = inb(ws16c48gpio->base + i);
-
- /* store acquired bits at respective bits array offset */
- bits[word_index] |= port_state << word_offset;
+ for_each_set_clump(port, word, offset, mask, num_ports, port_size) {
+ port_state = inb(ws16c48gpio->base + port);
+ bits[word] |= port_state << offset;
}

return 0;
@@ -203,26 +180,19 @@ static void ws16c48_gpio_set_multiple(struct gpio_chip *chip,
unsigned long *mask, unsigned long *bits)
{
struct ws16c48_gpio *const ws16c48gpio = gpiochip_get_data(chip);
- unsigned int i;
- const unsigned int gpio_reg_size = 8;
- unsigned int port;
+ size_t port;
+ size_t word;
+ unsigned int offset;
+ const unsigned int port_size = 8;
+ const size_t num_ports = chip->ngpio / port_size;
unsigned int iomask;
unsigned int bitmask;
unsigned long flags;

- /* set bits are evaluated a gpio register size at a time */
- for (i = 0; i < chip->ngpio; i += gpio_reg_size) {
- /* no more set bits in this mask word; skip to the next word */
- if (!mask[BIT_WORD(i)]) {
- i = (BIT_WORD(i) + 1) * BITS_PER_LONG - gpio_reg_size;
- continue;
- }
-
- port = i / gpio_reg_size;
-
+ for_each_set_clump(port, word, offset, mask, num_ports, port_size) {
/* mask out GPIO configured for input */
- iomask = mask[BIT_WORD(i)] & ~ws16c48gpio->io_state[port];
- bitmask = iomask & bits[BIT_WORD(i)];
+ iomask = (mask[word] >> offset) & ~ws16c48gpio->io_state[port];
+ bitmask = iomask & (bits[word] >> offset);

raw_spin_lock_irqsave(&ws16c48gpio->lock, flags);

@@ -232,10 +202,6 @@ static void ws16c48_gpio_set_multiple(struct gpio_chip *chip,
outb(ws16c48gpio->out_state[port], ws16c48gpio->base + port);

raw_spin_unlock_irqrestore(&ws16c48gpio->lock, flags);
-
- /* prepare for next gpio register set */
- mask[BIT_WORD(i)] >>= gpio_reg_size;
- bits[BIT_WORD(i)] >>= gpio_reg_size;
}
}

--
2.19.0


2018-10-02 07:01:19

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump macro

On 2018-10-02 03:14, William Breathitt Gray wrote:
> /* clear bits array to a clean slate */
> bitmap_zero(bits, chip->ngpio);
>
> - /* get bits are evaluated a gpio port register at a time */
> - for (i = 0; i < ARRAY_SIZE(ports); i++) {
> - /* gpio offset in bits array */
> - bits_offset = i * gpio_reg_size;
> -
> - /* word index for bits array */
> - word_index = BIT_WORD(bits_offset);
> -
> - /* gpio offset within current word of bits array */
> - word_offset = bits_offset % BITS_PER_LONG;
> -
> - /* mask of get bits for current gpio within current word */
> - word_mask = mask[word_index] & (port_mask << word_offset);
> - if (!word_mask) {
> - /* no get bits in this port so skip to next one */
> - continue;
> - }
> -
> - /* read bits from current gpio port */
> + for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
> port_state = inb(dio48egpio->base + ports[i]);
> -
> - /* store acquired bits at respective bits array offset */
> - bits[word_index] |= port_state << word_offset;
> + bits[word] |= port_state << offset;

Somewhat unrelated to this series, but is the existing code correct? I'd
expect the RHS to be masked by word_mask; otherwise we might set bits in
bits[] that were not requested? And if one does that, the !word_mask
test is merely an optimization to avoid reading the gpios when the
result would be ignored anyway. Perhaps no caller cares.
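
A minimal userspace sketch of the masking suggested above, not the driver code: read_port() is a hypothetical stand-in for inb(), and get_multiple_masked() is an illustrative name. The only difference from the quoted loop is the final AND with word_mask.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel helpers of the same names. */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)

/* Hypothetical stand-in for inb(): every line of every port reads high. */
static unsigned long read_port(size_t port)
{
	(void)port;
	return 0xFFUL;
}

/*
 * Userspace model of a get_multiple callback with 8-bit ports.
 * Only bits requested in mask[] may end up set in bits[].
 */
static void get_multiple_masked(const unsigned long *mask, unsigned long *bits,
				size_t num_ports)
{
	const unsigned int port_size = 8;
	const size_t nwords =
		(num_ports * port_size + BITS_PER_LONG - 1) / BITS_PER_LONG;
	size_t word;
	size_t port;

	assert(mask && bits);

	/* clear bits array to a clean slate */
	for (word = 0; word < nwords; word++)
		bits[word] = 0;

	for (port = 0; port < num_ports; port++) {
		const size_t start = port * port_size;
		const unsigned int offset =
			(unsigned int)(start % BITS_PER_LONG);
		unsigned long word_mask;

		word = BIT_WORD(start);
		word_mask = mask[word] & (0xFFUL << offset);
		if (!word_mask)
			continue; /* nothing requested on this port */

		/* the extra AND drops lines that were read but not requested */
		bits[word] |= (read_port(port) << offset) & word_mask;
	}
}
```

With this change the !word_mask test only saves a port read; correctness no longer depends on it.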

Rasmus

2018-10-02 07:44:34

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On 2018-10-02 03:13, William Breathitt Gray wrote:
> This macro iterates for each group of bits (clump) with set bits, within
> a bitmap memory region. For each iteration, "clump" is set to the found
> clump index, "index" is set to the word index of the bitmap containing
> the found clump, and "offset" is set to the bit offset of the found
> clump within the respective bitmap word.

I can't say I'm a fan. It seems rather clumsy and ad-hoc - though I do
see how it matches the code you replace in drivers/gpio/. When I
initially read the cover letter, I assumed that one would get a sequence
of 4-bit values, but one has to dig the actual value out of the bitmap
afterwards using all of index, offset and a mask computed from clump_size.

> +
> +/**
> + * find_next_clump - find next clump with set bits in a memory region
> + * @index: location to store bitmap word index of found clump
> + * @offset: bits offset of the found clump within the respective bitmap word
> + * @bits: address to base the search on
> + * @size: bitmap size in number of clumps

That's a rather inconvenient unit, no? And rather easy to get wrong, I
can easily see people passing nbits instead.

I think you could reduce the number of arguments to this helper and the
macro, while getting rid of some confusion: Drop index and offset, let
clump_index be start_index and measured in bit positions (like
find_next_bit et al), and let the return value also be a bit position.
And instead of index and offset, have another unsigned long* output
parameter that gives the actual value at [return value:return
value+clump_size]. IOW, I think the prototype should be close to
find_next_bit, except that in case of "clumps", there's an extra piece
of information to return.

> + * @clump_index: clump index at which to start searching
> + * @clump_size: clump size in bits
> + *
> + * Returns the clump index for the next clump with set bits; the respective
> + * bitmap word index is stored at the location pointed by @index, and the bits
> + * offset of the found clump within the respective bitmap word is stored at the
> + * location pointed by @offset. If no bits are set, returns @size.
> + */
> +size_t find_next_clump(size_t *const index, unsigned int *const offset,
> + const unsigned long *const bits, const size_t size,
> + const size_t clump_index, const unsigned int clump_size)
> +{
> + size_t i;
> + unsigned int bits_offset;
> + unsigned long word_mask;
> + const unsigned long clump_mask = GENMASK(clump_size - 1, 0);
> +
> + for (i = clump_index; i < size; i++) {
> + bits_offset = i * clump_size;
> +
> + *index = BIT_WORD(bits_offset);
> + *offset = bits_offset % BITS_PER_LONG;
> +
> + word_mask = bits[*index] & (clump_mask << *offset);
> + if (!word_mask)
> + continue;

The cover letter says

The clump_size argument can be an arbitrary number of bits and is not
required to be a multiple of 2.

by which I assume you mean "power of 2", but either way, the above code
does not seem to take into account the case where bits_offset +
clump_size straddles a word boundary, so it wouldn't work for a
clump_size that does not divide BITS_PER_LONG.
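
The word-boundary problem can be reproduced in userspace with a small sketch. The two function names below are illustrative only; the first mirrors the single-word test from the quoted loop, the second checks each bit of the clump individually as a reference.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel helpers of the same names. */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/* The test from the quoted loop: it only looks at a single bitmap word. */
static int clump_seen_by_patch(const unsigned long *bits, size_t clump,
			       unsigned int clump_size)
{
	const size_t bits_offset = clump * clump_size;
	const size_t index = BIT_WORD(bits_offset);
	const unsigned int offset = (unsigned int)(bits_offset % BITS_PER_LONG);
	unsigned long clump_mask;

	assert(clump_size >= 1 && clump_size <= BITS_PER_LONG);
	clump_mask = GENMASK(clump_size - 1, 0);

	/* bits shifted past the top of the word are silently lost here */
	return (bits[index] & (clump_mask << offset)) != 0;
}

/* Reference check: test every bit of the clump individually. */
static int clump_has_set_bit(const unsigned long *bits, size_t clump,
			     unsigned int clump_size)
{
	size_t nr = clump * clump_size;
	unsigned int i;

	for (i = 0; i < clump_size; i++, nr++)
		if (bits[BIT_WORD(nr)] & (1UL << (nr % BITS_PER_LONG)))
			return 1;
	return 0;
}
```

With clump_size = 3, clump number BITS_PER_LONG / 3 starts in one word and ends in the next; if only the bits in the second word are set, the single-word test misses the clump.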

May I suggest another approach:

unsigned long bitmap_get_value(const unsigned long *bitmap, unsigned
start, unsigned width): Get the value of bitmap[start:start+width] for
1<=width<=BITS_PER_LONG (it's up to the caller to ensure this is within
the defined region). That can almost be an inline

unsigned long bitmap_get_value(const unsigned long *bitmap, unsigned start,
			       unsigned width)
{
	unsigned idx = BIT_WORD(start);
	unsigned offset = start % BITS_PER_LONG;
	unsigned long lower = bitmap[idx] >> offset;
	unsigned long upper = offset <= BITS_PER_LONG - width ? 0 :
		bitmap[idx+1] << (BITS_PER_LONG - offset);
	return (lower | upper) & GENMASK(width-1, 0);
}

Then you can implement the for_each_set_clump by a (IMO) more readable

for (i = 0, start = 0; i < num_ports; i++, start += gpio_reg_size) {
word_mask = bitmap_get_value(mask, start, gpio_reg_size);
if (!word_mask)
continue;
...
}

Or, if you do want find_next_clump/for_each_set_clump, that can be
implemented in terms of this.
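
As a rough userspace sketch of that last point, a find_next_bit-style helper can be layered on top of a value getter. The name find_next_clump_value() and its signature are assumptions made for illustration, not anything from the patchset; positions are measured in bits and the clump value is returned through a pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel helpers of the same names. */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/* Value of bitmap[start:start+width], 1 <= width <= BITS_PER_LONG. */
static unsigned long bitmap_get_value(const unsigned long *bitmap,
				      unsigned int start, unsigned int width)
{
	const unsigned int idx = BIT_WORD(start);
	const unsigned int offset = start % BITS_PER_LONG;
	unsigned long lower;
	unsigned long upper;

	assert(width >= 1 && width <= BITS_PER_LONG);

	lower = bitmap[idx] >> offset;
	/* only touch the next word when the range really straddles it */
	upper = offset <= BITS_PER_LONG - width ? 0 :
		bitmap[idx + 1] << (BITS_PER_LONG - offset);
	return (lower | upper) & GENMASK(width - 1, 0);
}

/*
 * Hypothetical helper: return the bit position of the next clump with a
 * set bit at or after start, storing its value in *clump; return nbits
 * if there is none.
 */
static unsigned int find_next_clump_value(unsigned long *clump,
					  const unsigned long *bitmap,
					  unsigned int nbits,
					  unsigned int start,
					  unsigned int width)
{
	for (; start + width <= nbits; start += width) {
		*clump = bitmap_get_value(bitmap, start, width);
		if (*clump)
			return start;
	}
	return nbits;
}
```

A for_each-style macro would then just call the helper repeatedly with start advanced by width after each hit.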

Rasmus

2018-10-02 08:23:59

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On Tue, Oct 02, 2018 at 09:42:48AM +0200, Rasmus Villemoes wrote:
> On 2018-10-02 03:13, William Breathitt Gray wrote:

> The cover letter says
>
> The clump_size argument can be an arbitrary number of bits and is not
> required to be a multiple of 2.
>
> by which I assume you mean "power of 2", but either way, the above code
> does not seem to take into account the case where bits_offset +
> clump_size straddles a word boundary, so it wouldn't work for a
> clump_size that does not divide BITS_PER_LONG.

E.g. 3 bits in a clump? Hmm...

Why would we need that? I mean some real use case?

> May I suggest another approach:

You may, of course, but see above and my comments below.

> unsigned long bitmap_get_value(const unsigned long *bitmap, unsigned
> start, unsigned width): Get the value of bitmap[start:start+width] for
> 1<=width<=BITS_PER_LONG (it's up to the caller to ensure this is within
> the defined region). That can almost be an inline
>
> unsigned long bitmap_get_value(const unsigned long *bitmap, unsigned start,
> 			       unsigned width)
> {
> 	unsigned idx = BIT_WORD(start);
> 	unsigned offset = start % BITS_PER_LONG;
> 	unsigned long lower = bitmap[idx] >> offset;
> 	unsigned long upper = offset <= BITS_PER_LONG - width ? 0 :
> 		bitmap[idx+1] << (BITS_PER_LONG - offset);
> 	return (lower | upper) & GENMASK(width-1, 0);
> }
>
> Then you can implement the for_each_set_clump by a (IMO) more readable
>
> for (i = 0, start = 0; i < num_ports; i++, start += gpio_reg_size) {
> word_mask = bitmap_get_value(mask, start, gpio_reg_size);
> if (!word_mask)
> continue;
> ...
> }

I would rather go with two prototypes to get()/set() a clump in the bitmap
in a way when it's aligned and BITS_PER_LONG % clump_size == 0.

unsigned long bitmap_get_clump(const unsigned long *src, unsigned int start,
			       unsigned int clump_size)
{
	unsigned int index = BIT_WORD(start);
	unsigned int offset = start % BITS_PER_LONG;

	/* These are just for spelling out the restrictions */
	WARN_ON(BITS_PER_LONG % clump_size);
	WARN_ON(offset % clump_size);

	/* TODO: take care of clump_size == 64 */
	return (src[index] >> offset) & GENMASK(clump_size - 1, 0);
}

Something similar for set, with an additional parameter unsigned long value
whose bits above [clump_size - 1 : 0] are cleared.
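
A possible shape for such a setter, as a userspace sketch only: the name bitmap_set_clump() and its argument order are assumptions, and assert() stands in for the kernel's WARN_ON().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel helpers of the same names. */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/*
 * Hypothetical setter for an aligned clump: overwrite
 * dst[start:start+clump_size] with value.
 */
static void bitmap_set_clump(unsigned long *dst, unsigned int start,
			     unsigned int clump_size, unsigned long value)
{
	const size_t index = BIT_WORD(start);
	const unsigned int offset = start % BITS_PER_LONG;
	unsigned long clump_mask;

	/* same restrictions as the getter: aligned, divides the word size */
	assert(clump_size >= 1 && clump_size <= BITS_PER_LONG);
	assert(BITS_PER_LONG % clump_size == 0);
	assert(offset % clump_size == 0);

	clump_mask = GENMASK(clump_size - 1, 0);

	/* value must not carry bits above the clump */
	assert((value & ~clump_mask) == 0);

	dst[index] &= ~(clump_mask << offset);
	dst[index] |= value << offset;
}
```

Because the clump is aligned and its size divides BITS_PER_LONG, it never crosses a word boundary, so a single word is read and written.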

--
With Best Regards,
Andy Shevchenko



2018-10-03 11:49:06

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On Tue, Oct 02, 2018 at 11:21:42AM +0300, Andy Shevchenko wrote:

> I would rather go with two prototypes to get()/set() a clump in the bitmap
> in a way when it's aligned and BITS_PER_LONG % clump_size == 0.

To make things much easier, restrict clump_size to one of the
following set:

1, 2, 4, 8, 16, 32 even on 64-bit platforms.

If it would be a simpler solution to add 64 here (implying 32-bit platform),
I would vote for that.

For the generic case we might need something like:

unsigned long bitmap_get_bits(unsigned long *src, unsigned int start, unsigned int nbits)
{
assert(nbits <= BITS_PER_LONG);

/* Something like Rasmus proposed earlier */
}

And similarly for the setter.


--
With Best Regards,
Andy Shevchenko



2018-10-04 10:05:44

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On Tue, Oct 02, 2018 at 09:42:48AM +0200, Rasmus Villemoes wrote:
> On 2018-10-02 03:13, William Breathitt Gray wrote:
> > This macro iterates for each group of bits (clump) with set bits, within
> > a bitmap memory region. For each iteration, "clump" is set to the found
> > clump index, "index" is set to the word index of the bitmap containing
> > the found clump, and "offset" is set to the bit offset of the found
> > clump within the respective bitmap word.
>
> I can't say I'm a fan. It seems rather clumsy and ad-hoc - though I do
> see how it matches the code you replace in drivers/gpio/. When I
> initially read the cover letter, I assumed that one would get a sequence
> of 4-bit values, but one has to dig the actual value out of the bitmap
> afterwards using all of index, offset and a mask computed from clump_size.

Yes, that is because this macro is as you noted primarily a replacement
for the repetitive code used in the GPIO drivers; the GPIO drivers
require the index and offset in order to modify and store the requested
bit values and perform port I/O.

I put this macro up in the bitops code, but perhaps I should have left
it local to the GPIO subsystem since it's so specific. This is one aspect
I want to determine: whether to keep this macro here or move it.

> > +
> > +/**
> > + * find_next_clump - find next clump with set bits in a memory region
> > + * @index: location to store bitmap word index of found clump
> > + * @offset: bits offset of the found clump within the respective bitmap word
> > + * @bits: address to base the search on
> > + * @size: bitmap size in number of clumps
>
> That's a rather inconvenient unit, no? And rather easy to get wrong, I
> can easily see people passing nbits instead.
>
> I think you could reduce the number of arguments to this helper and the
> macro, while getting rid of some confusion: Drop index and offset, let
> clump_index be start_index and measured in bit positions (like
> find_next_bit et al), and let the return value also be a bit position.
> And instead of index and offset, have another unsigned long* output
> parameter that gives the actual value at [return value:return
> value+clump_size]. IOW, I think the prototype should be close to
> find_next_bit, except that in case of "clumps", there's an extra piece
> of information to return.

There may be benefit to develop a different macro more aligned with the
rest of the bitops code -- one where we do in fact return the direct
4-bit value for example. Essentially all the GPIO drivers need are the
index for the hardware I/O port and the index for the bitmap to store
the bits.

So we may be able to reimplement the for_each_set_clump to utilize a
simpler macro that returns the clump value, and then determine index
and offset up in the for_each_set_clump macro; that way we can keep the
generic clump value return code isolated from the code needed by the
GPIO drivers.

> > + * @clump_index: clump index at which to start searching
> > + * @clump_size: clump size in bits
> > + *
> > + * Returns the clump index for the next clump with set bits; the respective
> > + * bitmap word index is stored at the location pointed by @index, and the bits
> > + * offset of the found clump within the respective bitmap word is stored at the
> > + * location pointed by @offset. If no bits are set, returns @size.
> > + */
> > +size_t find_next_clump(size_t *const index, unsigned int *const offset,
> > + const unsigned long *const bits, const size_t size,
> > + const size_t clump_index, const unsigned int clump_size)
> > +{
> > + size_t i;
> > + unsigned int bits_offset;
> > + unsigned long word_mask;
> > + const unsigned long clump_mask = GENMASK(clump_size - 1, 0);
> > +
> > + for (i = clump_index; i < size; i++) {
> > + bits_offset = i * clump_size;
> > +
> > + *index = BIT_WORD(bits_offset);
> > + *offset = bits_offset % BITS_PER_LONG;
> > +
> > + word_mask = bits[*index] & (clump_mask << *offset);
> > + if (!word_mask)
> > + continue;
>
> The cover letter says
>
> The clump_size argument can be an arbitrary number of bits and is not
> required to be a multiple of 2.
>
> by which I assume you mean "power of 2", but either way, the above code
> does not seem to take into account the case where bits_offset +
> clump_size straddles a word boundary, so it wouldn't work for a
> clump_size that does not divide BITS_PER_LONG.

Ah, you are correct, if clump_size does not divide evenly into
BITS_PER_LONG then the macro skips the portion of bits that reside
across the boundary. This is an unintentional behavior that will need to
be fixed. I didn't notice this since the GPIO drivers utilizing the
macro so far have all used a clump_size that divides cleanly.

>
> May I suggest another approach:
>
> unsigned long bitmap_get_value(const unsigned long *bitmap, unsigned
> start, unsigned width): Get the value of bitmap[start:start+width] for
> 1<=width<=BITS_PER_LONG (it's up to the caller to ensure this is within
> the defined region). That can almost be an inline
>
> unsigned long bitmap_get_value(const unsigned long *bitmap, unsigned start,
> 			       unsigned width)
> {
> 	unsigned idx = BIT_WORD(start);
> 	unsigned offset = start % BITS_PER_LONG;
> 	unsigned long lower = bitmap[idx] >> offset;
> 	unsigned long upper = offset <= BITS_PER_LONG - width ? 0 :
> 		bitmap[idx+1] << (BITS_PER_LONG - offset);
> 	return (lower | upper) & GENMASK(width-1, 0);
> }
>
> Then you can implement the for_each_set_clump by a (IMO) more readable
>
> for (i = 0, start = 0; i < num_ports; i++, start += gpio_reg_size) {
> word_mask = bitmap_get_value(mask, start, gpio_reg_size);
> if (!word_mask)
> continue;
> ...
> }
>
> Or, if you do want find_next_clump/for_each_set_clump, that can be
> implemented in terms of this.
>
> Rasmus

This might work. All that would need to change to support the GPIO
drivers is to return BIT_WORD(start) as index and offset as (start %
BITS_PER_LONG). These sets can be performed outside of bitmap_get_value,
thus allowing it to have a simpler interface for code that does not
require index/offset.
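
A userspace sketch of that split, assuming aligned clumps only: get_clump() and next_set_port() are hypothetical names. The getter knows nothing about index/offset, and the GPIO-oriented wrapper derives them from the start bit position.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel helpers of the same names. */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) \
	(((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/* Generic part: value of an aligned clump, no index/offset exposed. */
static unsigned long get_clump(const unsigned long *bitmap, size_t start,
			       unsigned int width)
{
	const unsigned int offset = (unsigned int)(start % BITS_PER_LONG);

	assert(width >= 1 && width <= BITS_PER_LONG);
	assert(BITS_PER_LONG % width == 0 && offset % width == 0);

	return (bitmap[BIT_WORD(start)] >> offset) & GENMASK(width - 1, 0);
}

/*
 * GPIO-specific part: find the next port at or after 'port' with a
 * requested line, and report where that port lives in the bitmap.
 * Returns num_ports when no further port has a set bit.
 */
static size_t next_set_port(const unsigned long *mask, size_t num_ports,
			    size_t port, unsigned int port_size,
			    size_t *index, unsigned int *offset)
{
	for (; port < num_ports; port++) {
		const size_t start = port * port_size;

		if (!get_clump(mask, start, port_size))
			continue;

		/* index/offset are computed outside the generic getter */
		*index = BIT_WORD(start);
		*offset = (unsigned int)(start % BITS_PER_LONG);
		return port;
	}
	return num_ports;
}
```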

William Breathitt Gray

2018-10-04 10:32:32

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On Tue, Oct 02, 2018 at 11:21:42AM +0300, Andy Shevchenko wrote:
> On Tue, Oct 02, 2018 at 09:42:48AM +0200, Rasmus Villemoes wrote:
> > On 2018-10-02 03:13, William Breathitt Gray wrote:
>
> > The cover letter says
> >
> > The clump_size argument can be an arbitrary number of bits and is not
> > required to be a multiple of 2.
> >
> > by which I assume you mean "power of 2", but either way, the above code
> > does not seem to take into account the case where bits_offset +
> > clump_size straddles a word boundary, so it wouldn't work for a
> > clump_size that does not divide BITS_PER_LONG.
>
> E.g. 3 bits in a clump? Hmm...
>
> Why would we need that? I mean some real use case?

GPIOs in hardware may be routed to devices logically in groups of I/O
lines, yet must still be accessed via the word-sized registers on the
operating machine.

For example, suppose a GPIO card is used to control a set of shower
devices. The card supports 4 shower devices, each device controlled by 3
lines of I/O: enable, hot-cold selection, high-low pressure selection.
In this case, an operating machine would still have to access the GPIO
lines via the I/O registers (e.g. 8-bit port I/O); but with a macro
handling a clump size of 3-bits, we can loop logically by each shower
device which is much simpler from a driver perspective.
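
To make the shower example concrete, here is a small userspace sketch. The bit assignment within each 3-bit group and the function name shower_lines() are assumptions made purely for illustration; the 12 lines are taken to sit on two 8-bit ports, so the third device spans both ports.

```c
#include <assert.h>

/* Assumed layout of one device's 3 lines (illustrative only). */
enum {
	SHOWER_ENABLE        = 1 << 0,
	SHOWER_HOT           = 1 << 1,
	SHOWER_HIGH_PRESSURE = 1 << 2,
};

/*
 * Return the 3 lines of one shower device, given the raw state of the
 * two 8-bit ports that carry the 12 lines of all 4 devices.
 */
static unsigned int shower_lines(const unsigned char ports[2],
				 unsigned int device)
{
	/* join the port bytes so that a group may cross the port boundary */
	const unsigned long lines =
		(unsigned long)ports[0] | ((unsigned long)ports[1] << 8);

	assert(device < 4);

	return (unsigned int)((lines >> (device * 3)) & 0x7);
}
```

Device 2 occupies lines 6 to 8, so its state comes from the top two bits of the first port and the lowest bit of the second; a 3-bit clump iterator would hide that detail from the driver.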

William Breathitt Gray

2018-10-04 10:36:47

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On Wed, Oct 03, 2018 at 02:48:04PM +0300, Andy Shevchenko wrote:
> On Tue, Oct 02, 2018 at 11:21:42AM +0300, Andy Shevchenko wrote:
>
> > I would rather go with two prototypes to get()/set() a clump in the bitmap
> > in a way when it's aligned and BITS_PER_LONG % clump_size == 0.
>
> To make things much easier, restrict clump_size to one of the
> following set:
>
> 1, 2, 4, 8, 16, 32 even on 64-bit platforms.
>
> If it would be a simpler solution to add 64 here (implying 32-bit platform),
> I would vote for that.
>
> For the generic case we might need something like:
>
> unsigned long bitmap_get_bits(unsigned long *src, unsigned int start, unsigned int nbits)
> {
> assert(nbits <= BITS_PER_LONG);
>
> /* Something like Rasmus proposed earlier */
> }
>
> And similarly for the setter.
>
>
> --
> With Best Regards,
> Andy Shevchenko

I have no objections to having a simpler macro for these common clump
sizes -- after all, I suspect most drivers will likely use clump sizes
that are powers of 2 anyway. It would be nice to have a more versatile
macro though for those drivers that would benefit from odd clump sizes,
but we can perhaps postpone that until the need arises (the GPIO drivers
in this patchset all use a power of 2).

William Breathitt Gray

2018-10-04 12:11:35

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 1/8] bitops: Introduce the for_each_set_clump macro

On Thu, Oct 04, 2018 at 07:36:20PM +0900, William Breathitt Gray wrote:
> On Wed, Oct 03, 2018 at 02:48:04PM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 02, 2018 at 11:21:42AM +0300, Andy Shevchenko wrote:
> >
> > > I would rather go with two prototypes to get()/set() a clump in the bitmap
> > > in a way when it's aligned and BITS_PER_LONG % clump_size == 0.
> >
> > To make things much easier, restrict clump_size to one of the
> > following set:
> >
> > 1, 2, 4, 8, 16, 32 even on 64-bit platforms.
> >
> > If it would be a simpler solution to add 64 here (implying 32-bit platform),
> > I would vote for that.
> >
> > For the generic case we might need something like:
> >
> > unsigned long bitmap_get_bits(unsigned long *src, unsigned int start, unsigned int nbits)
> > {
> > assert(nbits <= BITS_PER_LONG);
> >
> > /* Something like Rasmus proposed earlier */
> > }
> >
> > And similarly for the setter.
> >
> >
> > --
> > With Best Regards,
> > Andy Shevchenko
>
> I have no objections to having a simpler macro for these common clump
> sizes -- after all, I suspect most drivers will likely use clump sizes
> that are powers of 2 anyway. It would be nice to have a more versatile
> macro though for those drivers that would benefit from odd clump sizes,
> but we can perhaps postpone that until the need arises (the GPIO drivers
> in this patchset all use a power of 2).

Yes, this is my point of view: don't add complexity for something
which has no users (yet).

When we really have groups with an odd number of bits, we may reconsider.

--
With Best Regards,
Andy Shevchenko



2018-10-14 04:20:38

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump macro

On Tue, Oct 02, 2018 at 09:00:45AM +0200, Rasmus Villemoes wrote:
> On 2018-10-02 03:14, William Breathitt Gray wrote:
> > /* clear bits array to a clean slate */
> > bitmap_zero(bits, chip->ngpio);
> >
> > - /* get bits are evaluated a gpio port register at a time */
> > - for (i = 0; i < ARRAY_SIZE(ports); i++) {
> > - /* gpio offset in bits array */
> > - bits_offset = i * gpio_reg_size;
> > -
> > - /* word index for bits array */
> > - word_index = BIT_WORD(bits_offset);
> > -
> > - /* gpio offset within current word of bits array */
> > - word_offset = bits_offset % BITS_PER_LONG;
> > -
> > - /* mask of get bits for current gpio within current word */
> > - word_mask = mask[word_index] & (port_mask << word_offset);
> > - if (!word_mask) {
> > - /* no get bits in this port so skip to next one */
> > - continue;
> > - }
> > -
> > - /* read bits from current gpio port */
> > + for_each_set_clump(i, word, offset, mask, ARRAY_SIZE(ports), 8) {
> > port_state = inb(dio48egpio->base + ports[i]);
> > -
> > - /* store acquired bits at respective bits array offset */
> > - bits[word_index] |= port_state << word_offset;
> > + bits[word] |= port_state << offset;
>
> Somewhat unrelated to this series, but is the existing code correct? I'd
> expect the RHS to be masked by word_mask; otherwise we might set bits in
> bits[] that were not requested? And if one does that, the !word_mask
> test is merely an optimization to avoid reading the gpios when the
> result would be ignored anyway. Perhaps no caller cares.
>
> Rasmus

I don't think the caller cares in this case. Take a look at the
gpiod_get_array_value_complex function: the desired inputs are collected
before gpio_chip_get_multiple is called and then looped through after --
unrequested bits are simply ignored.

This caller behavior also makes sense because a bit value of 0 in the
bits array does not necessarily mean the input was not requested, but
may instead mean that the value at the input is 0; therefore, the caller
must keep track of the requested inputs rather than try to deduce them
from the values in the bits array.

William Breathitt Gray

2018-10-15 12:00:24

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump macro

On 2018-10-14 06:19, William Breathitt Gray wrote:

> a bit value of 0 in the
> bits array does not necessarily mean the input was not requested, but
> may instead mean that the value at the input is 0;

sure enough, but...

> therefore, the caller
> must keep track of the requested inputs rather than try to deduce them
> from the values in the bits array.

...I don't agree that this logically follows. A caller might reasonably
expect not to find any bits set in positions other than those in mask. A
simple example would be a caller that just tried to ask "are any of
_these_ inputs set"; it would be reasonable to implement that using
bitmap_empty() on the returned bitset, without first having to mask by
the mask he passed in.
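
The difference for such a caller can be sketched in userspace as follows. words_empty() is a simplified stand-in for bitmap_empty() that works on whole words, and any_requested_set() is a hypothetical name for the defensive variant a caller needs when unrequested bits may leak into the result.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for bitmap_empty(), operating on whole words. */
static int words_empty(const unsigned long *bits, size_t nwords)
{
	size_t i;

	assert(bits);

	for (i = 0; i < nwords; i++)
		if (bits[i])
			return 0;
	return 1;
}

/* Defensive variant: ignore any bit that was not requested in mask[]. */
static int any_requested_set(const unsigned long *mask,
			     const unsigned long *bits, size_t nwords)
{
	size_t i;

	assert(mask && bits);

	for (i = 0; i < nwords; i++)
		if (bits[i] & mask[i])
			return 1;
	return 0;
}
```

If the callback only ever sets requested bits, the simple emptiness test is enough; otherwise the caller has to carry the mask around and use the second form.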

Rasmus

2018-10-17 01:54:43

by William Breathitt Gray

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 3/8] gpio: 104-dio-48e: Utilize for_each_set_clump macro

On Mon, Oct 15, 2018 at 01:59:33PM +0200, Rasmus Villemoes wrote:
> On 2018-10-14 06:19, William Breathitt Gray wrote:
>
> > a bit value of 0 in the
> > bits array does not necessarily mean the input was not requested, but
> > may instead mean that the value at the input is 0;
>
> sure enough, but...
>
> > therefore, the caller
> > must keep track of the requested inputs rather than try to deduce them
> > from the values in the bits array.
>
> ...I don't agree that this logically follows. A caller might reasonably
> expect not to find any bits set in positions other than those in mask. A
> simple example would be a caller that just tried to ask "are any of
> _these_ inputs set"; it would be reasonable to implement that using
> bitmap_empty() on the returned bitset, without first having to mask by
> the mask he passed in.
>
> Rasmus

I see your point. It would be good to keep the behavior consistent with
what would be expected by the user -- and adding an additional AND
operation at the end to mask away the unrequested bits should not really
affect the performance to a discernible degree -- so I'll submit a
patchset implementing the mask for these drivers some time this weekend.

William Breathitt Gray