2021-02-12 13:23:47

by Syed Nayyar Waris

Subject: [PATCH v2 0/3] Introduce the for_each_set_clump macro

Hello Bartosz,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size; the new generic version works with clumps of any size up to and
including BITS_PER_LONG. The patchset uses the new macro in GPIO
drivers.

The earlier 8-bit for_each_set_clump8 provided a for-loop syntax that
iterates over a memory region one whole group of bits at a time,
visiting only the groups that contain at least one set bit.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example: 10111110 00000000 11111111 00110011
First loop: 10111110 00000000 11111111 XXXXXXXX
Second loop: 10111110 00000000 XXXXXXXX 00110011
Third loop: XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
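
The 8-bit behaviour described above can be sketched in plain userspace C.
This is only an illustration of the iteration order, not the kernel
implementation: next_set_clump8_u32() and collect_set_clumps8() are
hypothetical names, and the sketch assumes a single 32-bit word and a start
position that is a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch only (hypothetical helper, not a kernel function).
 * Find the next 8-bit group with at least one set bit at or after bit
 * position 'start' (assumed to be a multiple of 8). Returns the bit
 * offset of that group and stores its value in *clump, or returns 32
 * when no such group remains.
 */
static unsigned int next_set_clump8_u32(uint32_t word, unsigned int start,
					uint8_t *clump)
{
	unsigned int offset;

	for (offset = start; offset < 32; offset += 8) {
		uint8_t group = (word >> offset) & 0xff;

		if (group) {
			*clump = group;
			return offset;
		}
	}
	return 32;
}

/* Collect every non-zero group; returns how many were found (at most 4). */
static size_t collect_set_clumps8(uint32_t word, unsigned int offsets[4],
				  uint8_t clumps[4])
{
	size_t n = 0;
	unsigned int offset = 0;
	uint8_t clump;

	while ((offset = next_set_clump8_u32(word, offset, &clump)) < 32) {
		offsets[n] = offset;
		clumps[n] = clump;
		n++;
		offset += 8;	/* continue after the group just returned */
	}
	return n;
}
```

For the example word 10111110 00000000 11111111 00110011 (0xBE00FF33), the
sketch visits the groups at bit offsets 0, 8 and 24 and skips the all-zero
group at offset 16.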

With the new for_each_set_clump, the clump size can differ from 8 bits.
Moreover, a clump can be split across a word boundary when the word size
is not a multiple of the clump size. The following examples show how the
new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, number of clumps (or ports): 10
The bitmap stores the bits from which successive clumps are retrieved.

/* bitmap memory region */
0x00aa0000ff000000; /* Most significant bits */
0xaaaaaa0000ff0000;
0x000000aa000000aa;
0xbbbbabcdeffedcba; /* Least significant bits */

Iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24-bit clump read from
the above bitmap.
Iteration first: offset: 0 clump: 0xfedcba
Iteration second: offset: 24 clump: 0xabcdef
Iteration third: offset: 48 clump: 0xaabbbb
Iteration fourth: offset: 96 clump: 0xaa
Iteration fifth: offset: 144 clump: 0xff
Iteration sixth: offset: 168 clump: 0xaaaaaa
Iteration seventh: offset: 216 clump: 0xff
The loop ends because the remaining bits (0x00aa) are fewer than the
clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
is split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps consisting entirely of zeroes are skipped (preserving the
previous for_each_set_clump8 behaviour).
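
The traversal in Example 1 can be reproduced with the following userspace
sketch. It is not the code from patch 1/3: get_clump() and
find_next_clump() are hypothetical helpers written for illustration, and
the sketch assumes 64-bit words stored least significant word first, as in
the bitmap listing above.

```c
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/*
 * Hypothetical helper: read 'nbits' bits (1..64) starting at bit position
 * 'start' from an array of 64-bit words, least significant word first.
 * The value may straddle two words. The caller guarantees that
 * start + nbits does not exceed the size of the bitmap.
 */
static uint64_t get_clump(const uint64_t *map, size_t start,
			  unsigned int nbits)
{
	size_t index = start / WORD_BITS;
	unsigned int shift = start % WORD_BITS;
	uint64_t mask = (nbits == WORD_BITS) ? ~UINT64_C(0)
					     : (UINT64_C(1) << nbits) - 1;
	uint64_t value = map[index] >> shift;

	/* Pull in the upper part from the next word if the clump is split. */
	if (shift + nbits > WORD_BITS)
		value |= map[index + 1] << (WORD_BITS - shift);

	return value & mask;
}

/*
 * Hypothetical helper: return the offset of the next non-zero clump at or
 * after 'start' and store its value in *clump, or return 'size' when none
 * remains. 'size' is the number of bits to traverse (clump size times
 * number of clumps).
 */
static size_t find_next_clump(const uint64_t *map, size_t size, size_t start,
			      unsigned int clump_size, uint64_t *clump)
{
	size_t offset;

	for (offset = start; offset + clump_size <= size;
	     offset += clump_size) {
		uint64_t value = get_clump(map, offset, clump_size);

		if (value) {
			*clump = value;
			return offset;
		}
	}
	return size;
}
```

Calling find_next_clump() repeatedly with size 240 and clump size 24,
advancing 'start' by 24 after each hit, yields the seven offsets listed in
Example 1, including the split clump 0xaabbbb at offset 48.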

Example 2:
clump size: 6 bits, number of clumps (or ports): 3

/* bitmap memory region */
0x00aa0000ff000000; /* Most significant bits */
0xaaaaaa0000ff0000;
0x0f00000000000000;
0x0000000000000ac0; /* Least significant bits */

Iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6-bit clump read from
the above bitmap.
Iteration first: offset: 6 clump: 0x2b
The loop ends because 6 * 3 = 18 bits (clump size * number of clumps)
of the bitmap have been traversed.

Changes in v2:
- [Patch 1/3]: Move the macros and related functions to gpiolib inside
gpio/. Reduce the visibility of 'for_each_set_clump' to gpio.
- [Patch 1/3]: Remove __builtin_unreachable and simply use a return
statement.
- Remove tests from lib/test_bitmap.c as 'for_each_set_clump' is
now localised inside gpio/ only.

Syed Nayyar Waris (3):
gpiolib: Introduce the for_each_set_clump macro
gpio: thunderx: Utilize for_each_set_clump macro
gpio: xilinx: Utilize generic bitmap_get_value and _set_value

drivers/gpio/gpio-thunderx.c | 13 ++++--
drivers/gpio/gpio-xilinx.c | 63 ++++++++++++-------------
drivers/gpio/gpiolib.c | 90 ++++++++++++++++++++++++++++++++++++
drivers/gpio/gpiolib.h | 28 +++++++++++
4 files changed, 158 insertions(+), 36 deletions(-)


base-commit: e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62
--
2.29.0


2021-02-12 13:25:01

by Syed Nayyar Waris

Subject: [PATCH v2 2/3] gpio: thunderx: Utilize for_each_set_clump macro

This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple, we
now skip banks whose mask has no set bits, which saves cycles.

Cc: William Breathitt Gray <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
drivers/gpio/gpio-thunderx.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..0398b2d2af4b 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,7 +16,7 @@
 #include <linux/pci.h>
 #include <linux/spinlock.h>
 #include <asm-generic/msi.h>
-
+#include "gpiolib.h"

 #define GPIO_RX_DAT 0x0
 #define GPIO_TX_SET 0x8
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip *chip,
 				       unsigned long *bits)
 {
 	int bank;
-	u64 set_bits, clear_bits;
+	unsigned long set_bits, clear_bits, gpio_mask;
+	unsigned long offset;
+
 	struct thunderx_gpio *txgpio = gpiochip_get_data(chip);

-	for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-		set_bits = bits[bank] & mask[bank];
-		clear_bits = ~bits[bank] & mask[bank];
+	for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+		bank = offset / 64;
+		set_bits = bits[bank] & gpio_mask;
+		clear_bits = ~bits[bank] & gpio_mask;
 		writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
 		writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
 	}
--
2.29.0

2021-02-12 13:27:25

by Syed Nayyar Waris

Subject: [PATCH v2 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we now handle one channel at a time, which
saves cycles.

Cc: William Breathitt Gray <[email protected]>
Cc: Bartosz Golaszewski <[email protected]>
Cc: Michal Simek <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
---
drivers/gpio/gpio-xilinx.c | 63 +++++++++++++++++++-------------------
1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index be539381fd82..8445e69cf37b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -15,6 +15,7 @@
 #include <linux/of_device.h>
 #include <linux/of_platform.h>
 #include <linux/slab.h>
+#include "gpiolib.h"

 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET (0x0) /* Data register */
@@ -141,37 +142,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 {
 	unsigned long flags;
 	struct xgpio_instance *chip = gpiochip_get_data(gc);
-	int index = xgpio_index(chip, 0);
-	int offset, i;
-
-	spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-	/* Write to GPIO signals */
-	for (i = 0; i < gc->ngpio; i++) {
-		if (*mask == 0)
-			break;
-		/* Once finished with an index write it out to the register */
-		if (index != xgpio_index(chip, i)) {
-			xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-				       index * XGPIO_CHANNEL_OFFSET,
-				       chip->gpio_state[index]);
-			spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-			index = xgpio_index(chip, i);
-			spin_lock_irqsave(&chip->gpio_lock[index], flags);
-		}
-		if (__test_and_clear_bit(i, mask)) {
-			offset = xgpio_offset(chip, i);
-			if (test_bit(i, bits))
-				chip->gpio_state[index] |= BIT(offset);
-			else
-				chip->gpio_state[index] &= ~BIT(offset);
-		}
-	}
-
-	xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-		       index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-	spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+	u32 *const state = chip->gpio_state;
+	unsigned int *const width = chip->gpio_width;
+
+	DECLARE_BITMAP(old, 64);
+	DECLARE_BITMAP(new, 64);
+	DECLARE_BITMAP(changed, 64);
+
+	spin_lock_irqsave(&chip->gpio_lock[0], flags);
+	spin_lock(&chip->gpio_lock[1]);
+
+	bitmap_set_value(old, 64, state[0], width[0], 0);
+	bitmap_set_value(old, 64, state[1], width[1], width[0]);
+	bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+	bitmap_set_value(old, 64, state[0], 32, 0);
+	bitmap_set_value(old, 64, state[1], 32, 32);
+	state[0] = bitmap_get_value(new, 0, width[0]);
+	state[1] = bitmap_get_value(new, width[0], width[1]);
+	bitmap_set_value(new, 64, state[0], 32, 0);
+	bitmap_set_value(new, 64, state[1], 32, 32);
+	bitmap_xor(changed, old, new, 64);
+
+	if (((u32 *)changed)[0])
+		xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+			       state[0]);
+	if (((u32 *)changed)[1])
+		xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+			       XGPIO_CHANNEL_OFFSET, state[1]);
+
+	spin_unlock(&chip->gpio_lock[1]);
+	spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }

 /**
--
2.29.0

2021-02-14 06:49:23

by William Breathitt Gray

Subject: Re: [PATCH v2 2/3] gpio: thunderx: Utilize for_each_set_clump macro

On Fri, Feb 12, 2021 at 06:51:04PM +0530, Syed Nayyar Waris wrote:
> This patch reimplements the thunderx_gpio_set_multiple function in
> drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
> Instead of looping for each bank in thunderx_gpio_set_multiple
> function, now we can skip bank which is not set and save cycles.
>
> Cc: William Breathitt Gray <[email protected]>
> Cc: Robert Richter <[email protected]>
> Cc: Bartosz Golaszewski <[email protected]>
> Signed-off-by: Syed Nayyar Waris <[email protected]>

Acked-by: William Breathitt Gray <[email protected]>

> ---
> drivers/gpio/gpio-thunderx.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
> index 9f66deab46ea..0398b2d2af4b 100644
> --- a/drivers/gpio/gpio-thunderx.c
> +++ b/drivers/gpio/gpio-thunderx.c
> @@ -16,7 +16,7 @@
> #include <linux/pci.h>
> #include <linux/spinlock.h>
> #include <asm-generic/msi.h>
> -
> +#include "gpiolib.h"
>
> #define GPIO_RX_DAT 0x0
> #define GPIO_TX_SET 0x8
> @@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip *chip,
> unsigned long *bits)
> {
> int bank;
> - u64 set_bits, clear_bits;
> + unsigned long set_bits, clear_bits, gpio_mask;
> + unsigned long offset;
> +
> struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
>
> - for (bank = 0; bank <= chip->ngpio / 64; bank++) {
> - set_bits = bits[bank] & mask[bank];
> - clear_bits = ~bits[bank] & mask[bank];
> + for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
> + bank = offset / 64;
> + set_bits = bits[bank] & gpio_mask;
> + clear_bits = ~bits[bank] & gpio_mask;
> writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
> writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
> }
> --
> 2.29.0
>



2021-02-14 06:53:26

by William Breathitt Gray

Subject: Re: [PATCH v2 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

On Fri, Feb 12, 2021 at 06:52:00PM +0530, Syed Nayyar Waris wrote:
> This patch reimplements the xgpio_set_multiple() function in
> drivers/gpio/gpio-xilinx.c to use the new generic functions:
> bitmap_get_value() and bitmap_set_value(). The code is now simpler
> to read and understand. Moreover, instead of looping for each bit
> in xgpio_set_multiple() function, now we can check each channel at
> a time and save cycles.
>
> Cc: William Breathitt Gray <[email protected]>
> Cc: Bartosz Golaszewski <[email protected]>
> Cc: Michal Simek <[email protected]>
> Signed-off-by: Syed Nayyar Waris <[email protected]>

Acked-by: William Breathitt Gray <[email protected]>

> ---
> drivers/gpio/gpio-xilinx.c | 63 +++++++++++++++++++-------------------
> 1 file changed, 32 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> index be539381fd82..8445e69cf37b 100644
> --- a/drivers/gpio/gpio-xilinx.c
> +++ b/drivers/gpio/gpio-xilinx.c
> @@ -15,6 +15,7 @@
> #include <linux/of_device.h>
> #include <linux/of_platform.h>
> #include <linux/slab.h>
> +#include "gpiolib.h"
>
> /* Register Offset Definitions */
> #define XGPIO_DATA_OFFSET (0x0) /* Data register */
> @@ -141,37 +142,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
> {
> unsigned long flags;
> struct xgpio_instance *chip = gpiochip_get_data(gc);
> - int index = xgpio_index(chip, 0);
> - int offset, i;
> -
> - spin_lock_irqsave(&chip->gpio_lock[index], flags);
> -
> - /* Write to GPIO signals */
> - for (i = 0; i < gc->ngpio; i++) {
> - if (*mask == 0)
> - break;
> - /* Once finished with an index write it out to the register */
> - if (index != xgpio_index(chip, i)) {
> - xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> - index * XGPIO_CHANNEL_OFFSET,
> - chip->gpio_state[index]);
> - spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
> - index = xgpio_index(chip, i);
> - spin_lock_irqsave(&chip->gpio_lock[index], flags);
> - }
> - if (__test_and_clear_bit(i, mask)) {
> - offset = xgpio_offset(chip, i);
> - if (test_bit(i, bits))
> - chip->gpio_state[index] |= BIT(offset);
> - else
> - chip->gpio_state[index] &= ~BIT(offset);
> - }
> - }
> -
> - xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> - index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
> -
> - spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
> + u32 *const state = chip->gpio_state;
> + unsigned int *const width = chip->gpio_width;
> +
> + DECLARE_BITMAP(old, 64);
> + DECLARE_BITMAP(new, 64);
> + DECLARE_BITMAP(changed, 64);
> +
> + spin_lock_irqsave(&chip->gpio_lock[0], flags);
> + spin_lock(&chip->gpio_lock[1]);
> +
> + bitmap_set_value(old, 64, state[0], width[0], 0);
> + bitmap_set_value(old, 64, state[1], width[1], width[0]);
> + bitmap_replace(new, old, bits, mask, gc->ngpio);
> +
> + bitmap_set_value(old, 64, state[0], 32, 0);
> + bitmap_set_value(old, 64, state[1], 32, 32);
> + state[0] = bitmap_get_value(new, 0, width[0]);
> + state[1] = bitmap_get_value(new, width[0], width[1]);
> + bitmap_set_value(new, 64, state[0], 32, 0);
> + bitmap_set_value(new, 64, state[1], 32, 32);
> + bitmap_xor(changed, old, new, 64);
> +
> + if (((u32 *)changed)[0])
> + xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
> + state[0]);
> + if (((u32 *)changed)[1])
> + xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> + XGPIO_CHANNEL_OFFSET, state[1]);
> +
> + spin_unlock(&chip->gpio_lock[1]);
> + spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
> }
>
> /**
> --
> 2.29.0
>



2021-03-04 11:18:38

by Bartosz Golaszewski

Subject: Re: [PATCH v2 0/3] Introduce the for_each_set_clump macro

On Fri, Feb 12, 2021 at 2:19 PM Syed Nayyar Waris <[email protected]> wrote:
>
> Hello Bartosz,
>
> Since this patchset primarily affects GPIO drivers, would you like
> to pick it up through your GPIO tree?
>

Sure, as soon as you figure out what's wrong with the xilinx patch.
Could you also follow William's suggestion and rename the functions?

Bart

2021-03-06 13:44:16

by Syed Nayyar Waris

Subject: Re: [PATCH v2 0/3] Introduce the for_each_set_clump macro

On Wed, Mar 3, 2021 at 8:13 PM Bartosz Golaszewski
<[email protected]> wrote:
>
> On Fri, Feb 12, 2021 at 2:19 PM Syed Nayyar Waris <[email protected]> wrote:
> >
> > Hello Bartosz,
> >
> > Since this patchset primarily affects GPIO drivers, would you like
> > to pick it up through your GPIO tree?
> >
>
> Sure, as soon as you figure out what's wrong with the xilinx patch.
> Could you also follow William's suggestion and rename the functions?
>
> Bart

I have incorporated William's suggestions and have also fixed the
build error in the xilinx patch.

I am sharing the v3 patchset. Thanks!

Regards


Syed Nayyar Waris