2023-09-26 06:26:32

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 0/5] bitmap: get rid of bitmap_remap() and bitmap_biremap() uses

As Rasmus suggested [1], the bit remapping APIs should be local to NUMA.
However, they are being in use outside of that for a while. To make
above happen, introduces simplified APIs that can be used otherwise.

It seems we might have yet another user of the above mentioned APIs.

The last patch has not been tested anyhow (except compilation, so
all testing and comments are welcome).

The idea is to get an immutable tag (via my tree) that can be pulled
by bitmap and GPIO trees on the need (while usually I send PR to
the GPIO subsystem).

Link: https://lore.kernel.org/all/[email protected]/ [1]

Andy Shevchenko (5):
lib/test_bitmap: Excape space symbols when printing input string
lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers
gpio: xilinx: Switch to use new bitmap_scatter() helper
gpio: xilinx: Replace bitmap_bitremap() calls
gpiolib: cdev: Utilize more bitmap APIs

drivers/gpio/gpio-xilinx.c | 13 ++----
drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++--------------------
include/linux/bitmap.h | 9 +++++
lib/bitmap.c | 70 ++++++++++++++++++++++++++++++++
lib/test_bitmap.c | 25 +++++++++++-
5 files changed, 143 insertions(+), 53 deletions(-)

--
2.40.0.1.gaa8946217a0b


2023-09-26 07:30:38

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 4/5] gpio: xilinx: Replace bitmap_bitremap() calls

We have sparse and dence masks of the line mappings based on
the view point (Linux numbering or hardware numbering). Since
the Linux side uses sequential bits for the mask, we can simply
convert a Linux number to the hardware one and vise versa by
counting set bits in the respective mask. Hence replace
bitmap_bitremap() calls by simpler equivalents.

With this done the dence mask is not needed and thus dropped.

Signed-off-by: Andy Shevchenko <[email protected]>
---
drivers/gpio/gpio-xilinx.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index f103c30cc74f..14ca3097563a 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -46,7 +46,6 @@
* @gc: GPIO chip
* @regs: register block
* @hw_map: GPIO pin mapping on hardware side
- * @sw_map: GPIO pin mapping on software side
* @state: GPIO write state shadow register
* @last_irq_read: GPIO read state register from last interrupt
* @dir: GPIO direction shadow register
@@ -62,7 +61,6 @@ struct xgpio_instance {
struct gpio_chip gc;
void __iomem *regs;
DECLARE_BITMAP(hw_map, 64);
- DECLARE_BITMAP(sw_map, 64);
DECLARE_BITMAP(state, 64);
DECLARE_BITMAP(last_irq_read, 64);
DECLARE_BITMAP(dir, 64);
@@ -76,12 +74,12 @@ struct xgpio_instance {

static inline int xgpio_from_bit(struct xgpio_instance *chip, int bit)
{
- return bitmap_bitremap(bit, chip->hw_map, chip->sw_map, 64);
+ return bitmap_weight(chip->hw_map, bit + 1);
}

static inline int xgpio_to_bit(struct xgpio_instance *chip, int gpio)
{
- return bitmap_bitremap(gpio, chip->sw_map, chip->hw_map, 64);
+ return find_nth_bit(chip->hw_map, 64, gpio);
}

static inline u32 xgpio_get_value32(const unsigned long *map, int bit)
@@ -619,9 +617,6 @@ static int xgpio_probe(struct platform_device *pdev)
if (width[1] > 32)
return -EINVAL;

- /* Setup software pin mapping */
- bitmap_set(chip->sw_map, 0, width[0] + width[1]);
-
/* Setup hardware pin mapping */
bitmap_set(chip->hw_map, 0, width[0]);
bitmap_set(chip->hw_map, 32, width[1]);
--
2.40.0.1.gaa8946217a0b

2023-09-26 08:35:25

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 3/5] gpio: xilinx: Switch to use new bitmap_scatter() helper

bitmaps_scatter() produces a sparse bitmap based on the dense one
and a given mask. In this driver the given mask is hw_map, while
sw_map is of sequential bits. We may use the bitmap_scatter() helper
instead of bitmap_remap(), because it's optimized for our case.

Signed-off-by: Andy Shevchenko <[email protected]>
---
drivers/gpio/gpio-xilinx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index a16945e8319e..f103c30cc74f 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -208,8 +208,8 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);

- bitmap_remap(hw_mask, mask, chip->sw_map, chip->hw_map, 64);
- bitmap_remap(hw_bits, bits, chip->sw_map, chip->hw_map, 64);
+ bitmap_scatter(hw_mask, mask, chip->hw_map, 64);
+ bitmap_scatter(hw_bits, bits, chip->hw_map, 64);

spin_lock_irqsave(&chip->gpio_lock, flags);

--
2.40.0.1.gaa8946217a0b

2023-09-26 08:51:34

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

These helpers are the optimized versions of the bitmap_remap()
where one of the bitmaps (source or destination) is of sequential bits.

See more in the kernel documentation of the helpers.

Signed-off-by: Andy Shevchenko <[email protected]>
---
include/linux/bitmap.h | 9 ++++++
lib/bitmap.c | 70 ++++++++++++++++++++++++++++++++++++++++++
lib/test_bitmap.c | 23 ++++++++++++++
3 files changed, 102 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 1516ff979315..87013b9a7dd8 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -60,6 +60,8 @@ struct device;
* bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
* bitmap_cut(dst, src, first, n, nbits) Cut n bits from first, copy rest
* bitmap_replace(dst, old, new, mask, nbits) *dst = (*old & ~(*mask)) | (*new & *mask)
+ * bitmap_scatter(dst, src, mask, nbits) *dst = map(dense, sparse)(src)
+ * bitmap_gather(dst, src, mask, nbits) *dst = map(sparse, dense)(src)
* bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src)
* bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit)
* bitmap_onto(dst, orig, relmap, nbits) *dst = orig relative to relmap
@@ -208,6 +210,12 @@ int bitmap_parselist(const char *buf, unsigned long *maskp,
int nmaskbits);
int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
unsigned long *dst, int nbits);
+
+unsigned int bitmap_scatter(unsigned long *dst, const unsigned long *src,
+ const unsigned long *mask, unsigned int nbits);
+unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
+ const unsigned long *mask, unsigned int nbits);
+
void bitmap_remap(unsigned long *dst, const unsigned long *src,
const unsigned long *old, const unsigned long *new, unsigned int nbits);
int bitmap_bitremap(int oldbit,
@@ -216,6 +224,7 @@ void bitmap_onto(unsigned long *dst, const unsigned long *orig,
const unsigned long *relmap, unsigned int bits);
void bitmap_fold(unsigned long *dst, const unsigned long *orig,
unsigned int sz, unsigned int nbits);
+
int bitmap_find_free_region(unsigned long *bitmap, unsigned int bits, int order);
void bitmap_release_region(unsigned long *bitmap, unsigned int pos, int order);
int bitmap_allocate_region(unsigned long *bitmap, unsigned int pos, int order);
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 935e0f96e785..31cfc7846aae 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -942,6 +942,76 @@ int bitmap_parse(const char *start, unsigned int buflen,
}
EXPORT_SYMBOL(bitmap_parse);

+/**
+ * bitmap_scatter - Scatter a bitmap according to the given mask
+ * @dst: scattered bitmap
+ * @src: gathered bitmap
+ * @mask: bits to assign to in the scattered bitmap
+ * @nbits: number of bits in each of these bitmaps
+ *
+ * Scatters bitmap with sequential bits according to the given @mask.
+ *
+ * Example:
+ * If @src bitmap = 0x005a, with @mask = 0x1313, @dst will be 0x0302.
+ *
+ * Or in binary form
+ * @src @mask @dst
+ * 0000000001011010 0001001100010011 0000001100000010
+ *
+ * (Bits 0, 1, 2, 3, 4, 5 are copied to the bits 0, 1, 4, 8, 9, 12)
+ *
+ * Returns: the weight of the @mask.
+ */
+unsigned int bitmap_scatter(unsigned long *dst, const unsigned long *src,
+ const unsigned long *mask, unsigned int nbits)
+{
+ unsigned int bit;
+ int n = 0;
+
+ bitmap_zero(dst, nbits);
+
+ for_each_set_bit(bit, mask, nbits)
+ __assign_bit(bit, dst, test_bit(n++, src));
+
+ return n;
+}
+EXPORT_SYMBOL(bitmap_scatter);
+
+/**
+ * bitmap_gather - Gather a bitmap according to given mask
+ * @dst: gathered bitmap
+ * @src: scattered bitmap
+ * @mask: bits to extract from in the scattered bitmap
+ * @nbits: number of bits in each of these bitmaps
+ *
+ * Gathers bitmap with sparse bits according to the given @mask.
+ *
+ * Example:
+ * If @src bitmap = 0x0302, with @mask = 0x1313, @dst will be 0x001a.
+ *
+ * Or in binary form
+ * @src @mask @dst
+ * 0000001100000010 0001001100010011 0000000000011010
+ *
+ * (Bits 0, 1, 4, 8, 9, 12 are copied to the bits 0, 1, 2, 3, 4, 5)
+ *
+ * Returns: the weight of the @mask.
+ */
+unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
+ const unsigned long *mask, unsigned int nbits)
+{
+ unsigned int bit;
+ int n = 0;
+
+ bitmap_zero(dst, nbits);
+
+ for_each_set_bit(bit, mask, nbits)
+ __assign_bit(n++, dst, test_bit(bit, src));
+
+ return n;
+}
+EXPORT_SYMBOL(bitmap_gather);
+
/**
* bitmap_pos_to_ord - find ordinal of set bit at given position in bitmap
* @buf: pointer to a bitmap
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 1f2dc7fef17f..f43a07679998 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -50,6 +50,9 @@ static const unsigned long exp2[] __initconst = {
static const unsigned long exp2_to_exp3_mask[] __initconst = {
BITMAP_FROM_U64(0x008000020020212eULL),
};
+static const unsigned long exp2_to_exp3_maskg[] __initconst = {
+ BITMAP_FROM_U64(0x00000000000001ffULL),
+};
/* exp3_0_1 = (exp2[0] & ~exp2_to_exp3_mask) | (exp2[1] & exp2_to_exp3_mask) */
static const unsigned long exp3_0_1[] __initconst = {
BITMAP_FROM_U64(0x33b3333311313137ULL),
@@ -357,6 +360,25 @@ static void __init test_replace(void)
expect_eq_bitmap(bmap, exp3_1_0, nbits);
}

+static void __init test_bitmap_sg(void)
+{
+ unsigned int nbits = 64;
+ DECLARE_BITMAP(bmap, 1024);
+ unsigned int w;
+
+ bitmap_zero(bmap, 1024);
+ w = bitmap_gather(bmap, exp2_to_exp3_mask, exp2_to_exp3_mask, nbits);
+ expect_eq_uint(bitmap_weight(exp2_to_exp3_mask, nbits), w);
+ expect_eq_uint(bitmap_weight(bmap, 1024), w);
+ expect_eq_bitmap(bmap, exp2_to_exp3_maskg, nbits);
+
+ bitmap_zero(bmap, 1024);
+ w = bitmap_scatter(bmap, exp2_to_exp3_maskg, exp2_to_exp3_mask, nbits);
+ expect_eq_uint(bitmap_weight(exp2_to_exp3_maskg, nbits), w);
+ expect_eq_uint(bitmap_weight(bmap, 1024), w);
+ expect_eq_bitmap(bmap, exp2_to_exp3_mask, nbits);
+}
+
#define PARSE_TIME 0x1
#define NO_LEN 0x2

@@ -1228,6 +1250,7 @@ static void __init selftest(void)
test_fill_set();
test_copy();
test_replace();
+ test_bitmap_sg();
test_bitmap_arr32();
test_bitmap_arr64();
test_bitmap_parse();
--
2.40.0.1.gaa8946217a0b

2023-09-26 11:22:44

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

Currently we have a few bitmap calls that are open coded in the library
module. Let's convert them to use generic bitmap APIs instead.

Signed-off-by: Andy Shevchenko <[email protected]>
---
drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++--------------------
1 file changed, 36 insertions(+), 43 deletions(-)

diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
index e39d344feb28..a5bbbd44531f 100644
--- a/drivers/gpio/gpiolib-cdev.c
+++ b/drivers/gpio/gpiolib-cdev.c
@@ -1263,35 +1263,32 @@ static long linereq_get_values(struct linereq *lr, void __user *ip)
{
struct gpio_v2_line_values lv;
DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
+ DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX);
+ DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX);
struct gpio_desc **descs;
unsigned int i, didx, num_get;
- bool val;
int ret;

/* NOTE: It's ok to read values of output lines. */
if (copy_from_user(&lv, ip, sizeof(lv)))
return -EFAULT;

- for (num_get = 0, i = 0; i < lr->num_lines; i++) {
- if (lv.mask & BIT_ULL(i)) {
- num_get++;
- descs = &lr->lines[i].desc;
- }
- }
+ bitmap_from_arr64(mask, &lv.mask, GPIO_V2_LINES_MAX);

+ num_get = bitmap_weight(mask, lr->num_lines);
if (num_get == 0)
return -EINVAL;

- if (num_get != 1) {
+ if (num_get == 1) {
+ descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc;
+ } else {
descs = kmalloc_array(num_get, sizeof(*descs), GFP_KERNEL);
if (!descs)
return -ENOMEM;
- for (didx = 0, i = 0; i < lr->num_lines; i++) {
- if (lv.mask & BIT_ULL(i)) {
- descs[didx] = lr->lines[i].desc;
- didx++;
- }
- }
+
+ didx = 0;
+ for_each_set_bit(i, mask, lr->num_lines)
+ descs[didx++] = lr->lines[i].desc;
}
ret = gpiod_get_array_value_complex(false, true, num_get,
descs, NULL, vals);
@@ -1301,19 +1298,15 @@ static long linereq_get_values(struct linereq *lr, void __user *ip)
if (ret)
return ret;

- lv.bits = 0;
- for (didx = 0, i = 0; i < lr->num_lines; i++) {
- if (lv.mask & BIT_ULL(i)) {
- if (lr->lines[i].sw_debounced)
- val = debounced_value(&lr->lines[i]);
- else
- val = test_bit(didx, vals);
- if (val)
- lv.bits |= BIT_ULL(i);
- didx++;
- }
+ bitmap_scatter(bits, vals, mask, lr->num_lines);
+
+ for_each_set_bit(i, mask, lr->num_lines) {
+ if (lr->lines[i].sw_debounced)
+ __assign_bit(i, bits, debounced_value(&lr->lines[i]));
}

+ bitmap_to_arr64(&lv.bits, bits, GPIO_V2_LINES_MAX);
+
if (copy_to_user(ip, &lv, sizeof(lv)))
return -EFAULT;

@@ -1324,35 +1317,35 @@ static long linereq_set_values_unlocked(struct linereq *lr,
struct gpio_v2_line_values *lv)
{
DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
+ DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX);
+ DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX);
struct gpio_desc **descs;
unsigned int i, didx, num_set;
int ret;

- bitmap_zero(vals, GPIO_V2_LINES_MAX);
- for (num_set = 0, i = 0; i < lr->num_lines; i++) {
- if (lv->mask & BIT_ULL(i)) {
- if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags))
- return -EPERM;
- if (lv->bits & BIT_ULL(i))
- __set_bit(num_set, vals);
- num_set++;
- descs = &lr->lines[i].desc;
- }
- }
+ bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX);
+ bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX);
+
+ num_set = bitmap_gather(vals, bits, mask, lr->num_lines);
if (num_set == 0)
return -EINVAL;

- if (num_set != 1) {
+ for_each_set_bit(i, mask, lr->num_lines) {
+ if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags))
+ return -EPERM;
+ }
+
+ if (num_set == 1) {
+ descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc;
+ } else {
/* build compacted desc array and values */
descs = kmalloc_array(num_set, sizeof(*descs), GFP_KERNEL);
if (!descs)
return -ENOMEM;
- for (didx = 0, i = 0; i < lr->num_lines; i++) {
- if (lv->mask & BIT_ULL(i)) {
- descs[didx] = lr->lines[i].desc;
- didx++;
- }
- }
+
+ didx = 0;
+ for_each_set_bit(i, mask, lr->num_lines)
+ descs[didx++] = lr->lines[i].desc;
}
ret = gpiod_set_array_value_complex(false, true, num_set,
descs, NULL, vals);
--
2.40.0.1.gaa8946217a0b

2023-09-26 11:40:52

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 1/5] lib/test_bitmap: Excape space symbols when printing input string

test_bitmap_printlist() prints the input string which contains
a new line character. Instead of stripping it, escape that kind
of characters, so developer will see the actual input string
that has been used. Without this change the new line splits
the string to two, and the first one is not guaranteed to be
followed by the first part immediatelly.

Signed-off-by: Andy Shevchenko <[email protected]>
---
lib/test_bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index f2ea9f30c7c5..1f2dc7fef17f 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -523,7 +523,7 @@ static void __init test_bitmap_printlist(void)
goto out;
}

- pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
+ pr_err("bitmap_print_to_pagebuf: input is '%*pEs', Time: %llu\n", ret, buf, time);
out:
kfree(buf);
kfree(bmap);
--
2.40.0.1.gaa8946217a0b

2023-09-26 13:22:55

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] gpio: xilinx: Replace bitmap_bitremap() calls

On Tue, Sep 26, 2023 at 06:41:00PM +0800, Kent Gibson wrote:
> On Tue, Sep 26, 2023 at 08:20:06AM +0300, Andy Shevchenko wrote:
> > We have sparse and dence masks of the line mappings based on
>
> dense
>
> > the view point (Linux numbering or hardware numbering). Since
> > the Linux side uses sequential bits for the mask, we can simply
> > convert a Linux number to the hardware one and vise versa by
>
> vice
>
> > counting set bits in the respective mask. Hence replace
> > bitmap_bitremap() calls by simpler equivalents.
> >
> > With this done the dence mask is not needed and thus dropped.
>
> And dense again.

Thank you, Kent, I really appreciate your help with my poor English,
nevertheless it would be nice if you can look at the last patch and
maybe even test it, so we have a bit of confidence that it works
as expected.

(The spelling will be fixed in the next version.)

--
With Best Regards,
Andy Shevchenko


2023-09-26 13:25:18

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 1/5] lib/test_bitmap: Excape space symbols when printing input string

On Tue, Sep 26, 2023 at 06:35:40PM +0800, Kent Gibson wrote:
> On Tue, Sep 26, 2023 at 08:20:03AM +0300, Andy Shevchenko wrote:
> > test_bitmap_printlist() prints the input string which contains
> > a new line character. Instead of stripping it, escape that kind
> > of characters, so developer will see the actual input string
>

And "Excape" -> "Escape" - didn't notice until my reply was sending.

Cheers,
Kent.

> Grammar nit:
>
> "that kind of characters" -> "those kinds of characters" or "that kind
> of character" or "such characters" or ...
>
> > that has been used. Without this change the new line splits
> > the string to two, and the first one is not guaranteed to be
> > followed by the first part immediatelly.
>
> immediately
>
> And the second "first" should be "second"??
>
> "the second part is not guaranteed to immediately follow the first" is
> clearer (and hopefully what you mean), IMHO.
>
> Cheers,
> Kent.
>
> >
> > Signed-off-by: Andy Shevchenko <[email protected]>
> > ---
> > lib/test_bitmap.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> > index f2ea9f30c7c5..1f2dc7fef17f 100644
> > --- a/lib/test_bitmap.c
> > +++ b/lib/test_bitmap.c
> > @@ -523,7 +523,7 @@ static void __init test_bitmap_printlist(void)
> > goto out;
> > }
> >
> > - pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
> > + pr_err("bitmap_print_to_pagebuf: input is '%*pEs', Time: %llu\n", ret, buf, time);
> > out:
> > kfree(buf);
> > kfree(bmap);
> > --
> > 2.40.0.1.gaa8946217a0b
> >

2023-09-26 15:00:30

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] gpio: xilinx: Replace bitmap_bitremap() calls

On Tue, Sep 26, 2023 at 08:20:06AM +0300, Andy Shevchenko wrote:
> We have sparse and dence masks of the line mappings based on

dense

> the view point (Linux numbering or hardware numbering). Since
> the Linux side uses sequential bits for the mask, we can simply
> convert a Linux number to the hardware one and vise versa by

vice

> counting set bits in the respective mask. Hence replace
> bitmap_bitremap() calls by simpler equivalents.
>
> With this done the dence mask is not needed and thus dropped.
>

And dense again.

Cheers,
Kent.

> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> drivers/gpio/gpio-xilinx.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> index f103c30cc74f..14ca3097563a 100644
> --- a/drivers/gpio/gpio-xilinx.c
> +++ b/drivers/gpio/gpio-xilinx.c
> @@ -46,7 +46,6 @@
> * @gc: GPIO chip
> * @regs: register block
> * @hw_map: GPIO pin mapping on hardware side
> - * @sw_map: GPIO pin mapping on software side
> * @state: GPIO write state shadow register
> * @last_irq_read: GPIO read state register from last interrupt
> * @dir: GPIO direction shadow register
> @@ -62,7 +61,6 @@ struct xgpio_instance {
> struct gpio_chip gc;
> void __iomem *regs;
> DECLARE_BITMAP(hw_map, 64);
> - DECLARE_BITMAP(sw_map, 64);
> DECLARE_BITMAP(state, 64);
> DECLARE_BITMAP(last_irq_read, 64);
> DECLARE_BITMAP(dir, 64);
> @@ -76,12 +74,12 @@ struct xgpio_instance {
>
> static inline int xgpio_from_bit(struct xgpio_instance *chip, int bit)
> {
> - return bitmap_bitremap(bit, chip->hw_map, chip->sw_map, 64);
> + return bitmap_weight(chip->hw_map, bit + 1);
> }
>
> static inline int xgpio_to_bit(struct xgpio_instance *chip, int gpio)
> {
> - return bitmap_bitremap(gpio, chip->sw_map, chip->hw_map, 64);
> + return find_nth_bit(chip->hw_map, 64, gpio);
> }
>
> static inline u32 xgpio_get_value32(const unsigned long *map, int bit)
> @@ -619,9 +617,6 @@ static int xgpio_probe(struct platform_device *pdev)
> if (width[1] > 32)
> return -EINVAL;
>
> - /* Setup software pin mapping */
> - bitmap_set(chip->sw_map, 0, width[0] + width[1]);
> -
> /* Setup hardware pin mapping */
> bitmap_set(chip->hw_map, 0, width[0]);
> bitmap_set(chip->hw_map, 32, width[1]);
> --
> 2.40.0.1.gaa8946217a0b
>

2023-09-26 15:15:39

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 1/5] lib/test_bitmap: Excape space symbols when printing input string

On Tue, Sep 26, 2023 at 08:20:03AM +0300, Andy Shevchenko wrote:
> test_bitmap_printlist() prints the input string which contains
> a new line character. Instead of stripping it, escape that kind
> of characters, so developer will see the actual input string

Grammar nit:

"that kind of characters" -> "those kinds of characters" or "that kind
of character" or "such characters" or ...

> that has been used. Without this change the new line splits
> the string to two, and the first one is not guaranteed to be
> followed by the first part immediatelly.

immediately

And the second "first" should be "second"??

"the second part is not guaranteed to immediately follow the first" is
clearer (and hopefully what you mean), IMHO.

Cheers,
Kent.

>
> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> lib/test_bitmap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index f2ea9f30c7c5..1f2dc7fef17f 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -523,7 +523,7 @@ static void __init test_bitmap_printlist(void)
> goto out;
> }
>
> - pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
> + pr_err("bitmap_print_to_pagebuf: input is '%*pEs', Time: %llu\n", ret, buf, time);
> out:
> kfree(buf);
> kfree(bmap);
> --
> 2.40.0.1.gaa8946217a0b
>

2023-09-26 16:45:19

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] bitmap: get rid of bitmap_remap() and bitmap_biremap() uses

On Tue, Sep 26, 2023 at 7:20 AM Andy Shevchenko
<[email protected]> wrote:

> As Rasmus suggested [1], the bit remapping APIs should be local to NUMA.
> However, they are being in use outside of that for a while. To make
> above happen, introduces simplified APIs that can be used otherwise.
>
> It seems we might have yet another user of the above mentioned APIs.
>
> The last patch has not been tested anyhow (except compilation, so
> all testing and comments are welcome).
>
> The idea is to get an immutable tag (via my tree) that can be pulled
> by bitmap and GPIO trees on the need (while usually I send PR to
> the GPIO subsystem).
>
> Link: https://lore.kernel.org/all/[email protected]/ [1]

I don't understand the bitmap changes very well, but the resulting
changes to cdev look very nice clearly making that code more readable
and maintainable, so FWIW:
Acked-by: Linus Walleij <[email protected]>

Yours,
Linus Walleij

2023-09-26 16:56:46

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 4/5] gpio: xilinx: Replace bitmap_bitremap() calls

On Tue, Sep 26, 2023 at 02:11:14PM +0300, Andy Shevchenko wrote:
> On Tue, Sep 26, 2023 at 06:41:00PM +0800, Kent Gibson wrote:
> > On Tue, Sep 26, 2023 at 08:20:06AM +0300, Andy Shevchenko wrote:
> > > We have sparse and dence masks of the line mappings based on
> >
> > dense
> >
> > > the view point (Linux numbering or hardware numbering). Since
> > > the Linux side uses sequential bits for the mask, we can simply
> > > convert a Linux number to the hardware one and vise versa by
> >
> > vice
> >
> > > counting set bits in the respective mask. Hence replace
> > > bitmap_bitremap() calls by simpler equivalents.
> > >
> > > With this done the dence mask is not needed and thus dropped.
> >
> > And dense again.
>
> Thank you, Kent, I really appreciate your help with my poor English,
> nevertheless it would be nice if you can look at the last patch and
> maybe even test it, so we have a bit of confidence that it works
> as expected.
>

Well that is the plan, but I haven't been in the GPIO space for a while so
I need to pull my test setup out of mothballs first - so don't hold your
breath.

> (The spelling will be fixed in the next version.)
>

Those I can spot without needing to compile anything ;-).

Cheers,
Kent.

2023-09-27 02:01:24

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 0/5] bitmap: get rid of bitmap_remap() and bitmap_biremap() uses

On Tue, Sep 26, 2023 at 10:52:05AM +0200, Linus Walleij wrote:
> On Tue, Sep 26, 2023 at 7:20 AM Andy Shevchenko
> <[email protected]> wrote:
>
> > As Rasmus suggested [1], the bit remapping APIs should be local to NUMA.
> > However, they are being in use outside of that for a while. To make
> > above happen, introduces simplified APIs that can be used otherwise.
> >
> > It seems we might have yet another user of the above mentioned APIs.
> >
> > The last patch has not been tested anyhow (except compilation, so
> > all testing and comments are welcome).
> >
> > The idea is to get an immutable tag (via my tree) that can be pulled
> > by bitmap and GPIO trees on the need (while usually I send PR to
> > the GPIO subsystem).
> >
> > Link: https://lore.kernel.org/all/[email protected]/ [1]
>
> I don't understand the bitmap changes very well,

Me neither... Oops (it was a joke :-)

> but the resulting
> changes to cdev look very nice clearly making that code more readable
> and maintainable,

While the above is a joke it took quite a time to get into the logic.
Hence this patch. TBH I'm in doubt, that's why asking for careful testing
and reviews.

> so FWIW:
> Acked-by: Linus Walleij <[email protected]>

Thank you!

--
With Best Regards,
Andy Shevchenko


2023-09-27 07:00:16

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

On Tue, Sep 26, 2023 at 08:20:04AM +0300, Andy Shevchenko wrote:
> These helpers are the optimized versions of the bitmap_remap()
> where one of the bitmaps (source or destination) is of sequential bits.

If so, can you add a test that makes sure that new API is consistent
with the old bitmap_remap? And also provide numbers how well are they
optimized, comparing to bitmap_remap.

> See more in the kernel documentation of the helpers.

I grepped the whole kernel, not only Documentation directory, and found
nothing...

> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> include/linux/bitmap.h | 9 ++++++
> lib/bitmap.c | 70 ++++++++++++++++++++++++++++++++++++++++++
> lib/test_bitmap.c | 23 ++++++++++++++
> 3 files changed, 102 insertions(+)
>
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 1516ff979315..87013b9a7dd8 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -60,6 +60,8 @@ struct device;
> * bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
> * bitmap_cut(dst, src, first, n, nbits) Cut n bits from first, copy rest
> * bitmap_replace(dst, old, new, mask, nbits) *dst = (*old & ~(*mask)) | (*new & *mask)
> + * bitmap_scatter(dst, src, mask, nbits) *dst = map(dense, sparse)(src)
> + * bitmap_gather(dst, src, mask, nbits) *dst = map(sparse, dense)(src)
> * bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src)
> * bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit)
> * bitmap_onto(dst, orig, relmap, nbits) *dst = orig relative to relmap
> @@ -208,6 +210,12 @@ int bitmap_parselist(const char *buf, unsigned long *maskp,
> int nmaskbits);
> int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
> unsigned long *dst, int nbits);
> +
> +unsigned int bitmap_scatter(unsigned long *dst, const unsigned long *src,
> + const unsigned long *mask, unsigned int nbits);
> +unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
> + const unsigned long *mask, unsigned int nbits);
> +
> void bitmap_remap(unsigned long *dst, const unsigned long *src,
> const unsigned long *old, const unsigned long *new, unsigned int nbits);
> int bitmap_bitremap(int oldbit,
> @@ -216,6 +224,7 @@ void bitmap_onto(unsigned long *dst, const unsigned long *orig,
> const unsigned long *relmap, unsigned int bits);
> void bitmap_fold(unsigned long *dst, const unsigned long *orig,
> unsigned int sz, unsigned int nbits);
> +
> int bitmap_find_free_region(unsigned long *bitmap, unsigned int bits, int order);
> void bitmap_release_region(unsigned long *bitmap, unsigned int pos, int order);
> int bitmap_allocate_region(unsigned long *bitmap, unsigned int pos, int order);
> diff --git a/lib/bitmap.c b/lib/bitmap.c
> index 935e0f96e785..31cfc7846aae 100644
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -942,6 +942,76 @@ int bitmap_parse(const char *start, unsigned int buflen,
> }
> EXPORT_SYMBOL(bitmap_parse);
>
> +/**
> + * bitmap_scatter - Scatter a bitmap according to the given mask
> + * @dst: scattered bitmap
> + * @src: gathered bitmap
> + * @mask: bits to assign to in the scattered bitmap
> + * @nbits: number of bits in each of these bitmaps
> + *
> + * Scatters bitmap with sequential bits according to the given @mask.
> + *
> + * Example:
> + * If @src bitmap = 0x005a, with @mask = 0x1313, @dst will be 0x0302.
> + *
> + * Or in binary form
> + * @src @mask @dst
> + * 0000000001011010 0001001100010011 0000001100000010
> + *
> + * (Bits 0, 1, 2, 3, 4, 5 are copied to the bits 0, 1, 4, 8, 9, 12)
> + *
> + * Returns: the weight of the @mask.

Returning a weight of the mask is somewhat non-trivial... To me it
would be logical to return a weight of destination, for example...

But I see that in the following patch you're using the returned value.
Maybe add a few words to advocate that?

> + */
> +unsigned int bitmap_scatter(unsigned long *dst, const unsigned long *src,
> + const unsigned long *mask, unsigned int nbits)
> +{
> + unsigned int bit;
> + int n = 0;

Is n signed for purpose? I think it should be consistent with
return value.

> +
> + bitmap_zero(dst, nbits);
> +
> + for_each_set_bit(bit, mask, nbits)
> + __assign_bit(bit, dst, test_bit(n++, src));
> +
> + return n;
> +}
> +EXPORT_SYMBOL(bitmap_scatter);
> +
> +/**
> + * bitmap_gather - Gather a bitmap according to given mask
> + * @dst: gathered bitmap
> + * @src: scattered bitmap
> + * @mask: bits to extract from in the scattered bitmap
> + * @nbits: number of bits in each of these bitmaps
> + *
> + * Gathers bitmap with sparse bits according to the given @mask.
> + *
> + * Example:
> + * If @src bitmap = 0x0302, with @mask = 0x1313, @dst will be 0x001a.

Not sure about others, but to me hex representation is quite useless,
moreover it's followed by binary one.

> + * Or in binary form
> + * @src @mask @dst
> + * 0000001100000010 0001001100010011 0000000000011010
> + *
> + * (Bits 0, 1, 4, 8, 9, 12 are copied to the bits 0, 1, 2, 3, 4, 5)
> + *
> + * Returns: the weight of the @mask.
> + */

It looks like those are designed complement to each other. Is that
true? If so, can you make your example showing that
scatter -> gather -> scatter
would restore the original bitmap?

If I'm wrong, can you please underline that they are not complement,
and why?

> +unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
> + const unsigned long *mask, unsigned int nbits)
> +{
> + unsigned int bit;
> + int n = 0;
> +
> + bitmap_zero(dst, nbits);
> +
> + for_each_set_bit(bit, mask, nbits)
> + __assign_bit(n++, dst, test_bit(bit, src));
> +
> + return n;
> +}
> +EXPORT_SYMBOL(bitmap_gather);

I feel like they should reside in header, because they are quite a small
functions indeed, and they would benefit from compile-time optimizations
without bloating the kernel.

Moreover, you are using them in patch #3 on 64-bit bitmaps, which
would benefit from small_const_nbits() optimization.

> +
> /**
> * bitmap_pos_to_ord - find ordinal of set bit at given position in bitmap
> * @buf: pointer to a bitmap
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index 1f2dc7fef17f..f43a07679998 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -50,6 +50,9 @@ static const unsigned long exp2[] __initconst = {
> static const unsigned long exp2_to_exp3_mask[] __initconst = {
> BITMAP_FROM_U64(0x008000020020212eULL),
> };
> +static const unsigned long exp2_to_exp3_maskg[] __initconst = {
> + BITMAP_FROM_U64(0x00000000000001ffULL),
> +};
> /* exp3_0_1 = (exp2[0] & ~exp2_to_exp3_mask) | (exp2[1] & exp2_to_exp3_mask) */
> static const unsigned long exp3_0_1[] __initconst = {
> BITMAP_FROM_U64(0x33b3333311313137ULL),
> @@ -357,6 +360,25 @@ static void __init test_replace(void)
> expect_eq_bitmap(bmap, exp3_1_0, nbits);
> }
>
> +static void __init test_bitmap_sg(void)
> +{
> + unsigned int nbits = 64;
> + DECLARE_BITMAP(bmap, 1024);

Can you make it 1000? That way we'll test non-aligned case.

> + unsigned int w;
> +
> + bitmap_zero(bmap, 1024);
> + w = bitmap_gather(bmap, exp2_to_exp3_mask, exp2_to_exp3_mask, nbits);
> + expect_eq_uint(bitmap_weight(exp2_to_exp3_mask, nbits), w);
> + expect_eq_uint(bitmap_weight(bmap, 1024), w);
> + expect_eq_bitmap(bmap, exp2_to_exp3_maskg, nbits);
> +
> + bitmap_zero(bmap, 1024);
> + w = bitmap_scatter(bmap, exp2_to_exp3_maskg, exp2_to_exp3_mask, nbits);
> + expect_eq_uint(bitmap_weight(exp2_to_exp3_maskg, nbits), w);
> + expect_eq_uint(bitmap_weight(bmap, 1024), w);
> + expect_eq_bitmap(bmap, exp2_to_exp3_mask, nbits);

Would be interesting to compare bitmap scatter/gather performance
against bitmap_remap.

> +}
> +
> #define PARSE_TIME 0x1
> #define NO_LEN 0x2
>
> @@ -1228,6 +1250,7 @@ static void __init selftest(void)
> test_fill_set();
> test_copy();
> test_replace();
> + test_bitmap_sg();
> test_bitmap_arr32();
> test_bitmap_arr64();
> test_bitmap_parse();
> --
> 2.40.0.1.gaa8946217a0b

2023-09-27 09:45:50

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> Currently we have a few bitmap calls that are open coded in the library
> module. Let's convert them to use generic bitmap APIs instead.
>
> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++--------------------
> 1 file changed, 36 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
> index e39d344feb28..a5bbbd44531f 100644
> --- a/drivers/gpio/gpiolib-cdev.c
> +++ b/drivers/gpio/gpiolib-cdev.c
> @@ -1263,35 +1263,32 @@ static long linereq_get_values(struct linereq *lr, void __user *ip)
> {
> struct gpio_v2_line_values lv;
> DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX);
> struct gpio_desc **descs;
> unsigned int i, didx, num_get;
> - bool val;
> int ret;
>
> /* NOTE: It's ok to read values of output lines. */
> if (copy_from_user(&lv, ip, sizeof(lv)))
> return -EFAULT;
>
> - for (num_get = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv.mask & BIT_ULL(i)) {
> - num_get++;
> - descs = &lr->lines[i].desc;
> - }
> - }
> + bitmap_from_arr64(mask, &lv.mask, GPIO_V2_LINES_MAX);
>
> + num_get = bitmap_weight(mask, lr->num_lines);
> if (num_get == 0)
> return -EINVAL;
>
> - if (num_get != 1) {
> + if (num_get == 1) {
> + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc;
> + } else {
> descs = kmalloc_array(num_get, sizeof(*descs), GFP_KERNEL);
> if (!descs)
> return -ENOMEM;
> - for (didx = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv.mask & BIT_ULL(i)) {
> - descs[didx] = lr->lines[i].desc;
> - didx++;
> - }
> - }
> +
> + didx = 0;
> + for_each_set_bit(i, mask, lr->num_lines)
> + descs[didx++] = lr->lines[i].desc;
> }
> ret = gpiod_get_array_value_complex(false, true, num_get,
> descs, NULL, vals);
> @@ -1301,19 +1298,15 @@ static long linereq_get_values(struct linereq *lr, void __user *ip)
> if (ret)
> return ret;
>
> - lv.bits = 0;
> - for (didx = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv.mask & BIT_ULL(i)) {
> - if (lr->lines[i].sw_debounced)
> - val = debounced_value(&lr->lines[i]);
> - else
> - val = test_bit(didx, vals);
> - if (val)
> - lv.bits |= BIT_ULL(i);
> - didx++;
> - }
> + bitmap_scatter(bits, vals, mask, lr->num_lines);
> +
> + for_each_set_bit(i, mask, lr->num_lines) {
> + if (lr->lines[i].sw_debounced)
> + __assign_bit(i, bits, debounced_value(&lr->lines[i]));
> }
>
> + bitmap_to_arr64(&lv.bits, bits, GPIO_V2_LINES_MAX);
> +
> if (copy_to_user(ip, &lv, sizeof(lv)))
> return -EFAULT;
>
> @@ -1324,35 +1317,35 @@ static long linereq_set_values_unlocked(struct linereq *lr,
> struct gpio_v2_line_values *lv)
> {
> DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX);
> struct gpio_desc **descs;
> unsigned int i, didx, num_set;
> int ret;
>
> - bitmap_zero(vals, GPIO_V2_LINES_MAX);
> - for (num_set = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv->mask & BIT_ULL(i)) {
> - if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags))
> - return -EPERM;
> - if (lv->bits & BIT_ULL(i))
> - __set_bit(num_set, vals);
> - num_set++;
> - descs = &lr->lines[i].desc;
> - }
> - }
> + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX);
> + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX);
> +
> + num_set = bitmap_gather(vals, bits, mask, lr->num_lines);

It looks like GPIO_V2_LINES_MAX is always 64, and so I wonder: is
my understanding correct that all bits in ->mask and ->bits beyond
lr->num_lines are clear?

If so, you can seemingly pass the GPIO_V2_LINES_MAX instead of
lr->num_lines, and that way it will be small_cons_nbits()-optimized.

> if (num_set == 0)
> return -EINVAL;
>
> - if (num_set != 1) {
> + for_each_set_bit(i, mask, lr->num_lines) {
> + if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags))
> + return -EPERM;
> + }
> +
> + if (num_set == 1) {
> + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc;
> + } else {
> /* build compacted desc array and values */
> descs = kmalloc_array(num_set, sizeof(*descs), GFP_KERNEL);
> if (!descs)
> return -ENOMEM;
> - for (didx = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv->mask & BIT_ULL(i)) {
> - descs[didx] = lr->lines[i].desc;
> - didx++;
> - }
> - }
> +
> + didx = 0;
> + for_each_set_bit(i, mask, lr->num_lines)
> + descs[didx++] = lr->lines[i].desc;
> }
> ret = gpiod_set_array_value_complex(false, true, num_set,
> descs, NULL, vals);
> --
> 2.40.0.1.gaa8946217a0b

2023-09-27 10:24:54

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

> > +unsigned int bitmap_gather(unsigned long *dst, const unsigned long *src,
> > + const unsigned long *mask, unsigned int nbits)
> > +{
> > + unsigned int bit;
> > + int n = 0;
> > +
> > + bitmap_zero(dst, nbits);
> > +
> > + for_each_set_bit(bit, mask, nbits)
> > + __assign_bit(n++, dst, test_bit(bit, src));
> > +
> > + return n;
> > +}
> > +EXPORT_SYMBOL(bitmap_gather);

So, if mask is 0b01, and src is 0b10, the output will be 0b00.
To me it sounds like you've gathered nothing, while the intention
was to gather all source bits to bit #0. This is my understanding
of the word 'gather', and this is how bitmap_remap() works.

bitmap_remap() handles it by wrapping around 0:
set_bit(find_nth_bit(new, nbits, n % w), dst);

In your case, it may look like:
n = off = 0;
while (1) {
off += n;
n = 0;
for_each_set_bit(bit, mask, nbits) {
if (bit + off >= nbits)
return;
__assign_bit(n++, dst, test_bit(bit + off, src));
}
}

(Not tested, except that on piece of paper.)

If you claim you're replacing bitmap_remap(), you should correctly handle
the above case; when src == dst; when mask is empty, and probably more...

Thanks,
Yury

2023-09-27 14:33:55

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote:
> > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:
> > > > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> > > > > Currently we have a few bitmap calls that are open coded in the library
> > > > > module. Let's convert them to use generic bitmap APIs instead.
> > > >
> > > > Firstly, I didn't consider using the bitmap module here as, in my mind at
> > > > least, that is intended for bitmaps wider than 64 bits, or with variable
> > > > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more
> > > > appropriate.
> > > >
> > > > And I would argue that they aren't "open coded" - they are parallelized
> > > > to reduce the number of passes over the bitmap.
> > > > This change serialises them, e.g. the get used to require 2 passes over
> > > > the bitmap, it now requires 3 or 4. The set used to require 1 and now
> > > > requires 2.
> > > > And there are additional copies that the original doesn't require.
> > > > So your change looks less efficient to me - unless there is direct
> > > > hardware support for bitmap ops??
> > > >
> > > > Wrt the argument that the serialized form is clearer and more
> > > > maintainable, optimised code is frequently more cryptic - as noted in
> > > > bitmap.c itself, and this code has remained unchanged since it was merged
> > > > 3 years ago, so the only maintenance it has required is to be more
> > > > maintainable?? Ok then.
> > > >
> > > > Your patch is functionally equivalent and pass my uAPI tests, so
> > > >
> > > > Tested-by: Kent Gibson <[email protected]>
> > >
> > > Thanks for testing!
> >
> > Not a problem - that is what test suites are for.
> >
> > > > but my preference is to leave it as is.
> > >
> > > As Yury mentioned we need to look at bitmap APIs and make them possible to have
> > > a compile-time optimizations. With that in mind, I would prefer bitmap APIs
> > > over open-coded stuff which is hardly to be understood (yes, I still point
> > > out that it takes a few hours to me, maybe because I'm stupid enough, to
> > > get what's the heck is going one there, esp. for the == 1 case).
> >
> > Really? With all the bits out in the open it seems pretty clear to me.
> > Clearer than scatter/gather in fact.
>
> Yes, you are biased. :-) Ask some stranger about this code and I am pretty sure
> there will be double-figures percentage of people who can tell that the current
> code is a bit voodoo.
>

It is the same as yours - just inside out. i.e. it performs the ops per
selected line, not each op on the whole bitmap of lines.

> > Sure, if there is suitable hardware support then bitmaps COULD be faster
> > than bitops. But without that, and that is the general case, it will be
> > slower. Do you have ANY cases where your implementation is currently
> > faster? Then you would have a stronger case.
>
> Why do we care here about performance? But if we do, I would check this on
> the 32-bit platform where 64-bit operations somewhat problematic / slow.
>

Yet you argue that bitmaps could be more performant?? Pick a side!

> If Yury gives an idea about performance tests I can consider to add this
> piece to compare with and we might see the difference.
>
> > And if you find the existing implementation unclear then the appropriate
> > solution is to better document it, as bitmaps itself does, not replace it
> > with something simpler and slower.
>
> Documentation will be needed either way. In general statistics it will be 50/50
> who (mis)understands this or new code. Pity that the original author of the code
> hadn't though about documenting this...
>

And who was the original author? I forget.

What you mean to say is it is a pity the reviewers at the time were
satisfied with the code as it stands, right?
Cos there is a process here.
As I recall reviewers were more often than not complaining about
pointless comments, not the lack of comments, so the natural bias as the
author is towards under-documenting...

> > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> > > allows to work with 512 GPIOs at a time. With your code it will be much harder
> > > to achieve and see what you wrote about maintenance (in that case).
> >
> > v3 ABI?? libgpiod v2 is barely out the door!
> > Do you have any cases where 64 lines per request is limiting?
>
> IIRC it was SO question where the OP asks exactly about breaking the 64 lines
> limitation in the current ABI.
>
> > If that sort of speculation isn't premature optimisation then I don't know
> > what is.
>
> No, based on the real question / discussion, just have no link at hand.
> But it's quite a niche, I can agree.
>

Let me know if you find a ref to that discussion - I'm curious.

Cheers,
Kent.

2023-09-27 14:55:34

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:
> > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> > > Currently we have a few bitmap calls that are open coded in the library
> > > module. Let's convert them to use generic bitmap APIs instead.
> >
> > Firstly, I didn't consider using the bitmap module here as, in my mind at
> > least, that is intended for bitmaps wider than 64 bits, or with variable
> > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more
> > appropriate.
> >
> > And I would argue that they aren't "open coded" - they are parallelized
> > to reduce the number of passes over the bitmap.
> > This change serialises them, e.g. the get used to require 2 passes over
> > the bitmap, it now requires 3 or 4. The set used to require 1 and now
> > requires 2.
> > And there are additional copies that the original doesn't require.
> > So your change looks less efficient to me - unless there is direct
> > hardware support for bitmap ops??
> >
> > Wrt the argument that the serialized form is clearer and more
> > maintainable, optimised code is frequently more cryptic - as noted in
> > bitmap.c itself, and this code has remained unchanged since it was merged
> > 3 years ago, so the only maintenance it has required is to be more
> > maintainable?? Ok then.
> >
> > Your patch is functionally equivalent and pass my uAPI tests, so
> >
> > Tested-by: Kent Gibson <[email protected]>
>
> Thanks for testing!
>

Not a problem - that is what test suites are for.

> > but my preference is to leave it as is.
>
> As Yury mentioned we need to look at bitmap APIs and make them possible to have
> a compile-time optimizations. With that in mind, I would prefer bitmap APIs
> over open-coded stuff which is hardly to be understood (yes, I still point
> out that it takes a few hours to me, maybe because I'm stupid enough, to
> get what's the heck is going one there, esp. for the == 1 case).
>

Really? With all the bits out in the open it seems pretty clear to me.
Clearer than scatter/gather in fact.

Sure, if there is suitable hardware support then bitmaps COULD be faster
than bitops. But without that, and that is the general case, it will be
slower. Do you have ANY cases where your implementation is currently
faster? Then you would have a stronger case.

And if you find the existing implementation unclear then the appropriate
solution is to better document it, as bitmaps itself does, not replace it
with something simpler and slower.

> Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> allows to work with 512 GPIOs at a time. With your code it will be much harder
> to achieve and see what you wrote about maintenance (in that case).
>

v3 ABI?? libgpiod v2 is barely out the door!
Do you have any cases where 64 lines per request is limiting?
If that sort of speculation isn't premature optimisation then I don't know
what is.

Cheers,
Kent.

2023-09-27 16:23:51

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

On Tue, Sep 26, 2023 at 07:10:33PM -0700, Yury Norov wrote:

...

> So, if mask is 0b01, and src is 0b10, the output will be 0b00.

Correct. This how it must work.

> To me it sounds like you've gathered nothing, while the intention
> was to gather all source bits to bit #0.

No, the idea is to gather bits positions of which are provided by a mask
from source to a destination, where the positions are sequential.

> This is my understanding
> of the word 'gather',

You should get the mask meaning. It's not the bit provider, it's a bit
positions provider.

> and this is how bitmap_remap() works.

It's NOT a replacement of bitmap_remap(). It's specifically written in the
commit message that domain of these APIs is when @old or @new (in case of
bitmap_remap() API) equals 2^n - 1, where n is amount of bits we consider.

...

> If you claim you're replacing bitmap_remap(),
> you should correctly handle
> the above case; when src == dst; when mask is empty, and probably more...

I don't care about corner cases of bitmap_remap(), and we can solve the issue
when it comes. Currently there is no issue with the all users that need these
API as they use different addresses.

--
With Best Regards,
Andy Shevchenko


2023-09-27 17:58:42

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:
> On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> > Currently we have a few bitmap calls that are open coded in the library
> > module. Let's convert them to use generic bitmap APIs instead.
>
> Firstly, I didn't consider using the bitmap module here as, in my mind at
> least, that is intended for bitmaps wider than 64 bits, or with variable
> width. In this case the bitmap is fixed at 64 bits, so bitops seemed more
> appropriate.
>
> And I would argue that they aren't "open coded" - they are parallelized
> to reduce the number of passes over the bitmap.
> This change serialises them, e.g. the get used to require 2 passes over
> the bitmap, it now requires 3 or 4. The set used to require 1 and now
> requires 2.
> And there are additional copies that the original doesn't require.
> So your change looks less efficient to me - unless there is direct
> hardware support for bitmap ops??
>
> Wrt the argument that the serialized form is clearer and more
> maintainable, optimised code is frequently more cryptic - as noted in
> bitmap.c itself, and this code has remained unchanged since it was merged
> 3 years ago, so the only maintenance it has required is to be more
> maintainable?? Ok then.
>
> Your patch is functionally equivalent and pass my uAPI tests, so
>
> Tested-by: Kent Gibson <[email protected]>

Thanks for testing!

> but my preference is to leave it as is.

As Yury mentioned we need to look at bitmap APIs and make them possible to have
a compile-time optimizations. With that in mind, I would prefer bitmap APIs
over open-coded stuff which is hardly to be understood (yes, I still point
out that it takes a few hours to me, maybe because I'm stupid enough, to
get what's the heck is going one there, esp. for the == 1 case).

Yet, it opens a way to scale this in case we might have v3 ABI that let's say
allows to work with 512 GPIOs at a time. With your code it will be much harder
to achieve and see what you wrote about maintenance (in that case).

--
With Best Regards,
Andy Shevchenko


2023-09-27 18:25:27

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote:
> On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:
> > > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> > > > Currently we have a few bitmap calls that are open coded in the library
> > > > module. Let's convert them to use generic bitmap APIs instead.
> > >
> > > Firstly, I didn't consider using the bitmap module here as, in my mind at
> > > least, that is intended for bitmaps wider than 64 bits, or with variable
> > > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more
> > > appropriate.
> > >
> > > And I would argue that they aren't "open coded" - they are parallelized
> > > to reduce the number of passes over the bitmap.
> > > This change serialises them, e.g. the get used to require 2 passes over
> > > the bitmap, it now requires 3 or 4. The set used to require 1 and now
> > > requires 2.
> > > And there are additional copies that the original doesn't require.
> > > So your change looks less efficient to me - unless there is direct
> > > hardware support for bitmap ops??
> > >
> > > Wrt the argument that the serialized form is clearer and more
> > > maintainable, optimised code is frequently more cryptic - as noted in
> > > bitmap.c itself, and this code has remained unchanged since it was merged
> > > 3 years ago, so the only maintenance it has required is to be more
> > > maintainable?? Ok then.
> > >
> > > Your patch is functionally equivalent and pass my uAPI tests, so
> > >
> > > Tested-by: Kent Gibson <[email protected]>
> >
> > Thanks for testing!
>
> Not a problem - that is what test suites are for.
>
> > > but my preference is to leave it as is.
> >
> > As Yury mentioned we need to look at bitmap APIs and make them possible to have
> > a compile-time optimizations. With that in mind, I would prefer bitmap APIs
> > over open-coded stuff which is hardly to be understood (yes, I still point
> > out that it takes a few hours to me, maybe because I'm stupid enough, to
> > get what's the heck is going one there, esp. for the == 1 case).
>
> Really? With all the bits out in the open it seems pretty clear to me.
> Clearer than scatter/gather in fact.

Yes, you are biased. :-) Ask some stranger about this code and I am pretty sure
there will be double-figures percentage of people who can tell that the current
code is a bit voodoo.

> Sure, if there is suitable hardware support then bitmaps COULD be faster
> than bitops. But without that, and that is the general case, it will be
> slower. Do you have ANY cases where your implementation is currently
> faster? Then you would have a stronger case.

Why do we care here about performance? But if we do, I would check this on
the 32-bit platform where 64-bit operations somewhat problematic / slow.

If Yury gives an idea about performance tests I can consider to add this
piece to compare with and we might see the difference.

> And if you find the existing implementation unclear then the appropriate
> solution is to better document it, as bitmaps itself does, not replace it
> with something simpler and slower.

Documentation will be needed either way. In general statistics it will be 50/50
who (mis)understands this or new code. Pity that the original author of the code
hadn't though about documenting this...

> > Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> > allows to work with 512 GPIOs at a time. With your code it will be much harder
> > to achieve and see what you wrote about maintenance (in that case).
>
> v3 ABI?? libgpiod v2 is barely out the door!
> Do you have any cases where 64 lines per request is limiting?

IIRC it was SO question where the OP asks exactly about breaking the 64 lines
limitation in the current ABI.

> If that sort of speculation isn't premature optimisation then I don't know
> what is.

No, based on the real question / discussion, just have no link at hand.
But it's quite a niche, I can agree.

--
With Best Regards,
Andy Shevchenko


2023-09-27 18:55:00

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

On Tue, Sep 26, 2023 at 05:25:13PM -0700, Yury Norov wrote:
> On Tue, Sep 26, 2023 at 08:20:04AM +0300, Andy Shevchenko wrote:
> > These helpers are the optimized versions of the bitmap_remap()
> > where one of the bitmaps (source or destination) is of sequential bits.
>
> If so, can you add a test that makes sure that new API is consistent
> with the old bitmap_remap? And also provide numbers how well are they
> optimized, comparing to bitmap_remap.

It's impossible. bitmap_remap() is universal, these APIs only for the specific
domain.

> > See more in the kernel documentation of the helpers.
>
> I grepped the whole kernel, not only Documentation directory, and found
> nothing...

It's added in this patch in the format of kernel doc.

...

> > + * Returns: the weight of the @mask.
>
> Returning a weight of the mask is somewhat non-trivial... To me it
> would be logical to return a weight of destination, for example...

> But I see that in the following patch you're using the returned value.
> Maybe add a few words to advocate that?


I'll look into it again, maybe dst would work as well, I don't remember why
I have chosen mask. Maybe because it's invariant here, dunno.

...

> > + int n = 0;
>
> Is n signed for purpose? I think it should be consistent with
> return value.

OK. No purpose there. Perhaps it's a leftover from the first experiments
on the implementation of these APIs.

...

> > + * Example:
> > + * If @src bitmap = 0x0302, with @mask = 0x1313, @dst will be 0x001a.
>
> Not sure about others, but to me hex representation is quite useless,
> moreover it's followed by binary one.

Somebody is better at hex, somebody at binary one, I would leave both.

> > + * Or in binary form
> > + * @src @mask @dst
> > + * 0000001100000010 0001001100010011 0000000000011010
> > + *
> > + * (Bits 0, 1, 4, 8, 9, 12 are copied to the bits 0, 1, 2, 3, 4, 5)
> > + *
> > + * Returns: the weight of the @mask.
> > + */
>
> It looks like those are designed complement to each other. Is that
> true? If so, can you make your example showing that
> scatter -> gather -> scatter
> would restore the original bitmap?

It looks like you stopped reading documentation somewhere on the middle.
The two APIs are documented with the same example which makes it clear
that they are data-loss transformations.

Do you need something like this to be added (in both documentations):

The bitmap_scatter(), when executed over the @dst bitmap, will
restore the @src one if the @mask is kept the same, see the example
in the function description.

?

> If I'm wrong, can you please underline that they are not complement,
> and why?

No, you are not.

...

> I feel like they should reside in header, because they are quite a small
> functions indeed, and they would benefit from compile-time optimizations
> without bloating the kernel.
>
> Moreover, you are using them in patch #3 on 64-bit bitmaps, which
> would benefit from small_const_nbits() optimization.

I can move them into header.

...

> > + DECLARE_BITMAP(bmap, 1024);

> Can you make it 1000? That way we'll test non-aligned case.

Sure. But it's not related to the patch. It will test bitmap_weight() and not
the new APIs, so, whatever is bigger 64 will suffice the purpose and won't
anyhow affect the newly added APIs.

...

> Would be interesting to compare bitmap scatter/gather performance
> against bitmap_remap.

Do you have a code in mind? I can incorporate it.

Again, you should understand that it's only applicable to the certain cases,
otherwise it makes no sense (it's like comparing performance of ffs() on
a single word against find_first_bit() on the arbitrary amount of words).

--
With Best Regards,
Andy Shevchenko


2023-09-27 20:10:01

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Tue, Sep 26, 2023 at 05:46:07PM -0700, Yury Norov wrote:
> On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> > Currently we have a few bitmap calls that are open coded in the library
> > module. Let's convert them to use generic bitmap APIs instead.
> >
> > Signed-off-by: Andy Shevchenko <[email protected]>

> > + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX);
> > + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX);
> > +
> > + num_set = bitmap_gather(vals, bits, mask, lr->num_lines);
>
> It looks like GPIO_V2_LINES_MAX is always 64, and so I wonder: is
> my understanding correct that all bits in ->mask and ->bits beyond
> lr->num_lines are clear?
>

The lv fields come from userspace and so cannot be guaranteed to be
zeroed beyond lr->num_lines. Any set bits beyond that must be ignored,
one way or another.

> If so, you can seemingly pass the GPIO_V2_LINES_MAX instead of
> lr->num_lines, and that way it will be small_cons_nbits()-optimized.
>

But that would be decidedly non-optimal for the most common case where
lr->num_lines == 1.

Cheers,
Kent.

2023-09-28 00:08:50

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote:
> Currently we have a few bitmap calls that are open coded in the library
> module. Let's convert them to use generic bitmap APIs instead.
>

Firstly, I didn't consider using the bitmap module here as, in my mind at
least, that is intended for bitmaps wider than 64 bits, or with variable
width. In this case the bitmap is fixed at 64 bits, so bitops seemed more
appropriate.

And I would argue that they aren't "open coded" - they are parallelized
to reduce the number of passes over the bitmap.
This change serialises them, e.g. the get used to require 2 passes over
the bitmap, it now requires 3 or 4. The set used to require 1 and now
requires 2.
And there are additional copies that the original doesn't require.
So your change looks less efficient to me - unless there is direct
hardware support for bitmap ops??

Wrt the argument that the serialized form is clearer and more
maintainable, optimised code is frequently more cryptic - as noted in
bitmap.c itself, and this code has remained unchanged since it was merged
3 years ago, so the only maintenance it has required is to be more
maintainable?? Ok then.

Your patch is functionally equivalent and pass my uAPI tests, so

Tested-by: Kent Gibson <[email protected]>

but my preference is to leave it as is.

Cheers,
Kent.

> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++--------------------
> 1 file changed, 36 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
> index e39d344feb28..a5bbbd44531f 100644
> --- a/drivers/gpio/gpiolib-cdev.c
> +++ b/drivers/gpio/gpiolib-cdev.c
> @@ -1263,35 +1263,32 @@ static long linereq_get_values(struct linereq *lr, void __user *ip)
> {
> struct gpio_v2_line_values lv;
> DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX);
> struct gpio_desc **descs;
> unsigned int i, didx, num_get;
> - bool val;
> int ret;
>
> /* NOTE: It's ok to read values of output lines. */
> if (copy_from_user(&lv, ip, sizeof(lv)))
> return -EFAULT;
>
> - for (num_get = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv.mask & BIT_ULL(i)) {
> - num_get++;
> - descs = &lr->lines[i].desc;
> - }
> - }
> + bitmap_from_arr64(mask, &lv.mask, GPIO_V2_LINES_MAX);
>
> + num_get = bitmap_weight(mask, lr->num_lines);
> if (num_get == 0)
> return -EINVAL;
>
> - if (num_get != 1) {
> + if (num_get == 1) {
> + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc;
> + } else {
> descs = kmalloc_array(num_get, sizeof(*descs), GFP_KERNEL);
> if (!descs)
> return -ENOMEM;
> - for (didx = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv.mask & BIT_ULL(i)) {
> - descs[didx] = lr->lines[i].desc;
> - didx++;
> - }
> - }
> +
> + didx = 0;
> + for_each_set_bit(i, mask, lr->num_lines)
> + descs[didx++] = lr->lines[i].desc;
> }
> ret = gpiod_get_array_value_complex(false, true, num_get,
> descs, NULL, vals);
> @@ -1301,19 +1298,15 @@ static long linereq_get_values(struct linereq *lr, void __user *ip)
> if (ret)
> return ret;
>
> - lv.bits = 0;
> - for (didx = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv.mask & BIT_ULL(i)) {
> - if (lr->lines[i].sw_debounced)
> - val = debounced_value(&lr->lines[i]);
> - else
> - val = test_bit(didx, vals);
> - if (val)
> - lv.bits |= BIT_ULL(i);
> - didx++;
> - }
> + bitmap_scatter(bits, vals, mask, lr->num_lines);
> +
> + for_each_set_bit(i, mask, lr->num_lines) {
> + if (lr->lines[i].sw_debounced)
> + __assign_bit(i, bits, debounced_value(&lr->lines[i]));
> }
>
> + bitmap_to_arr64(&lv.bits, bits, GPIO_V2_LINES_MAX);
> +
> if (copy_to_user(ip, &lv, sizeof(lv)))
> return -EFAULT;
>
> @@ -1324,35 +1317,35 @@ static long linereq_set_values_unlocked(struct linereq *lr,
> struct gpio_v2_line_values *lv)
> {
> DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX);
> + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX);
> struct gpio_desc **descs;
> unsigned int i, didx, num_set;
> int ret;
>
> - bitmap_zero(vals, GPIO_V2_LINES_MAX);
> - for (num_set = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv->mask & BIT_ULL(i)) {
> - if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags))
> - return -EPERM;
> - if (lv->bits & BIT_ULL(i))
> - __set_bit(num_set, vals);
> - num_set++;
> - descs = &lr->lines[i].desc;
> - }
> - }
> + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX);
> + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX);
> +
> + num_set = bitmap_gather(vals, bits, mask, lr->num_lines);
> if (num_set == 0)
> return -EINVAL;
>
> - if (num_set != 1) {
> + for_each_set_bit(i, mask, lr->num_lines) {
> + if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags))
> + return -EPERM;
> + }
> +
> + if (num_set == 1) {
> + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc;
> + } else {
> /* build compacted desc array and values */
> descs = kmalloc_array(num_set, sizeof(*descs), GFP_KERNEL);
> if (!descs)
> return -ENOMEM;
> - for (didx = 0, i = 0; i < lr->num_lines; i++) {
> - if (lv->mask & BIT_ULL(i)) {
> - descs[didx] = lr->lines[i].desc;
> - didx++;
> - }
> - }
> +
> + didx = 0;
> + for_each_set_bit(i, mask, lr->num_lines)
> + descs[didx++] = lr->lines[i].desc;
> }
> ret = gpiod_set_array_value_complex(false, true, num_set,
> descs, NULL, vals);
> --
> 2.40.0.1.gaa8946217a0b
>

2023-10-02 06:30:05

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

On Wed, Sep 27, 2023 at 03:02:34PM +0300, Andy Shevchenko wrote:

[...]

> > It looks like those are designed complement to each other. Is that
> > true? If so, can you make your example showing that
> > scatter -> gather -> scatter
> > would restore the original bitmap?
>
> It looks like you stopped reading documentation somewhere on the middle.

What a wonderful week of strong statements... Whatever...

> The two APIs are documented with the same example which makes it clear
> that they are data-loss transformations.
>
> Do you need something like this to be added (in both documentations):
>
> The bitmap_scatter(), when executed over the @dst bitmap, will
> restore the @src one if the @mask is kept the same, see the example
> in the function description.
>
> ?
>
> > If I'm wrong, can you please underline that they are not complement,
> > and why?
>
> No, you are not.

I should be confused even more. You're saying that I'm not wrong here, and few
lines above you're saying it's a data loss...

I don't mind this new 3-liners, but I'd like you to have a well better
wording and testing around them because those bitmap_scatter/gather are
all about performance, readability and usability.

To begin with, the whole name of the series: "get rid of bitmap_remap() and
bitmap_biremap() uses" is wrong because the functions are still here, and are
still used.

Even worse, instead of getting rid of some useless code, you're
bloating the kernel with something that duplicates existing
functionality.

This is an example of a series that 'gets rid of' something for true:

https://yhbt.net/lore/all/[email protected]/T/

(And unfortunately it's still unreviewed.)

But I understand your motivation, and as I already said, I like this
series in general. So let's please figure out a better wording before
moving forward?

Below are some my of thought.

1. Stop saying that you're getting rid of something when you're not.
I'd suggest something like: "add simple and verbose alternatives to
bitmap_remap(), and use them where appropriate".

2. Don't blame people considering a parameter named 'mask' as a mask.
I mean this sentence:

> You should get the mask meaning. It's not the bit provider, it's a bit
> positions provider.

If it's a 'bit position provider', please give it a proper name,
for example 'pos'. I'd suggest something like:
unsigned long bitmap_scatter(unsigned long *dst, unsigned long *pos,
unsigned long *val)

3. If you're saying that new API is a simplified replacement for
something, I'd like to see the full contract, i.e. explicit list of all
simplifications and limitations implied:
- val == dst is not handled;
- when 'pos' is empty, val is not copied to dst;
- new API doesn't wrap around 0, like bitmap_remap() does;
- set bits in 'val' are not copied to 'dst' when not in 'pos' (?)'
- anything else else?

4. Similarly to previous item, I'd like to have explicit understanding
and examples where and how bitmap_remap may be replaced. You're
only saying that it is possible when either 'new' or 'old' are
dense. Is that the only case? Can you add a test that explicitly
checks that bitmap_remap and bitmap_scatter/gather produce the same
output. Something like this:
bitmap_remap(dst1, val, dense_map, pos, nbits);
bitmap_scatter(dst2, val, pos, nbits);
check_eq_bitmap(dst1, dst2, nbits);

5. Can you add a picture like this to explain the algorithm even more:

mask: ...v..vv..v...vv
bits: 0000001100000010
1. ^ ^^ ^ 0
2. | || | 10
3. | || +> 010
4. | |+--> 1010
5. | +--> 11010
6. +----> 011010
gather: ..........011010

5. Regarding my confusion, I had to draw the picture above to realise
how it's possible that scatter/gather are inverse and data-loss
(i.e. not inverse) at the same time. Can you explain it with a
wording like this: "For bits selected by 'pos' bitmap, gathering a
'val' bitmap with the following scattering restores the original map.
All other bits values are lost and replaced with zeros." Similarly
for gathering. And please add a test case.

6. Regarding performance. I think it's wrong to say that that your
code is better optimized then some other, and then ask your
reviewers to figure out how to measure the difference. If you make
such statement, you should already have some test or real-life
measurement.

However, if you ask me, I can suggest you to pull this patch:
https://lore.kernel.org/lkml/[email protected]/

and modify/extend it in a way that both bitmap_remap and
bitmap_scatter/gather take the same 'val' and 'pos' bitmaps,
produce the same output, and then see which code is faster.

Worth to mention that since all current users of your API are working
on 64-bit maps, performance is doubty an issue. So you can drop the
'optimization' part of your wording, and don't add performance test.

Thanks,
Yury

2023-10-02 10:40:19

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Mon, Oct 02, 2023 at 05:25:05PM +0800, Kent Gibson wrote:
> On Mon, Oct 02, 2023 at 12:05:11PM +0300, Andy Shevchenko wrote:
> > On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote:
> > > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote:
> > > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote:
> > > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> > > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:

...

> > > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> > > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder
> > > > > > to achieve and see what you wrote about maintenance (in that case).
> > > > >
> > > > > v3 ABI?? libgpiod v2 is barely out the door!
> > > > > Do you have any cases where 64 lines per request is limiting?
> > > >
> > > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines
> > > > limitation in the current ABI.
> > > >
> > > > > If that sort of speculation isn't premature optimisation then I don't know
> > > > > what is.
> > > >
> > > > No, based on the real question / discussion, just have no link at hand.
> > > > But it's quite a niche, I can agree.
> > >
> > > Let me know if you find a ref to that discussion - I'm curious.
> >
> > Here it is (read comments as well):
> > https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h
> >
>
> That question looks to me to be confusing how many GPIOs can be
> requested per request (64) and in total (effectively unlimited) - thinking
> they are the same.
> That could be due to their desire to use the gpiod_chip_get_all_lines()
> convenience function with a chip with more than 64 lines, rather than
> because they have an actual need for the lines to be managed in a single
> request.
>
> So that doesn't look like a genuine use case to me - just a "what if I
> want to do X" question. Certainly not something that would warrant a v3
> ABI.

Sure, and I'm not talking about v3 ABI to go for, see the word "might" in my
reply in the first paragraph of this message.

--
With Best Regards,
Andy Shevchenko


2023-10-02 12:19:14

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Mon, Oct 02, 2023 at 12:32:22PM +0300, Andy Shevchenko wrote:
> On Mon, Oct 02, 2023 at 05:25:05PM +0800, Kent Gibson wrote:
> > On Mon, Oct 02, 2023 at 12:05:11PM +0300, Andy Shevchenko wrote:
> > > On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote:
> > > > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote:
> > > > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote:
> > > > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> > > > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:
>
> ...
>
> > > > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> > > > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder
> > > > > > > to achieve and see what you wrote about maintenance (in that case).
> > > > > >
> > > > > > v3 ABI?? libgpiod v2 is barely out the door!
> > > > > > Do you have any cases where 64 lines per request is limiting?
> > > > >
> > > > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines
> > > > > limitation in the current ABI.
> > > > >
> > > > > > If that sort of speculation isn't premature optimisation then I don't know
> > > > > > what is.
> > > > >
> > > > > No, based on the real question / discussion, just have no link at hand.
> > > > > But it's quite a niche, I can agree.
> > > >
> > > > Let me know if you find a ref to that discussion - I'm curious.
> > >
> > > Here it is (read comments as well):
> > > https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h
> > >
> >
> > That question looks to me to be confusing how many GPIOs can be
> > requested per request (64) and in total (effectively unlimited) - thinking
> > they are the same.
> > That could be due to their desire to use the gpiod_chip_get_all_lines()
> > convenience function with a chip with more than 64 lines, rather than
> > because they have an actual need for the lines to be managed in a single
> > request.
> >
> > So that doesn't look like a genuine use case to me - just a "what if I
> > want to do X" question. Certainly not something that would warrant a v3
> > ABI.
>
> Sure, and I'm not talking about v3 ABI to go for, see the word "might" in my
> reply in the first paragraph of this message.
>

Ok, so your original point was pure speculation.

Cheers,
Kent.

2023-10-02 15:58:09

by Kent Gibson

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Mon, Oct 02, 2023 at 12:05:11PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote:
> > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote:
> > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote:
> > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:
>
> ...
>
> > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder
> > > > > to achieve and see what you wrote about maintenance (in that case).
> > > >
> > > > v3 ABI?? libgpiod v2 is barely out the door!
> > > > Do you have any cases where 64 lines per request is limiting?
> > >
> > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines
> > > limitation in the current ABI.
> > >
> > > > If that sort of speculation isn't premature optimisation then I don't know
> > > > what is.
> > >
> > > No, based on the real question / discussion, just have no link at hand.
> > > But it's quite a niche, I can agree.
> >
> > Let me know if you find a ref to that discussion - I'm curious.
>
> Here it is (read comments as well):
> https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h
>

That question looks to me to be confusing how many GPIOs can be
requested per request (64) and in total (effectively unlimited) - thinking
they are the same.
That could be due to their desire to use the gpiod_chip_get_all_lines()
convenience function with a chip with more than 64 lines, rather than
because they have an actual need for the lines to be managed in a single
request.

So that doesn't look like a genuine use case to me - just a "what if I
want to do X" question. Certainly not something that would warrant a v3
ABI.

Cheers,
Kent.

2023-10-02 16:09:00

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/5] lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers

On Sun, Oct 01, 2023 at 09:06:24PM -0700, Yury Norov wrote:
> On Wed, Sep 27, 2023 at 03:02:34PM +0300, Andy Shevchenko wrote:

[...]

> > > It looks like those are designed complement to each other. Is that
> > > true? If so, can you make your example showing that
> > > scatter -> gather -> scatter
> > > would restore the original bitmap?
> >
> > It looks like you stopped reading documentation somewhere on the middle.
>
> What a wonderful week of strong statements... Whatever...
>
> > The two APIs are documented with the same example which makes it clear
> > that they are data-loss transformations.

Oh, should be read as data-lossleess.

> > Do you need something like this to be added (in both documentations):
> >
> > The bitmap_scatter(), when executed over the @dst bitmap, will
> > restore the @src one if the @mask is kept the same, see the example
> > in the function description.
> >
> > ?
> >
> > > If I'm wrong, can you please underline that they are not complement,
> > > and why?
> >
> > No, you are not.
>
> I should be confused even more. You're saying that I'm not wrong here, and few
> lines above you're saying it's a data loss...

data-lossless, sorry, I was missing that.

> I don't mind this new 3-liners, but I'd like you to have a well better
> wording and testing around them because those bitmap_scatter/gather are
> all about performance, readability and usability.
>
> To begin with, the whole name of the series: "get rid of bitmap_remap() and
> bitmap_biremap() uses" is wrong because the functions are still here, and are
> still used.
>
> Even worse, instead of getting rid of some useless code, you're
> bloating the kernel with something that duplicates existing
> functionality.

Wasn't Rasmus clear about the intention:
1) get rid of active users of bitmap_remap() outside of NUMA;
2) move that APIs to be private for NUMA users exclusively?

Towards that direction is my series.

> But I understand your motivation, and as I already said, I like this
> series in general. So let's please figure out a better wording before
> moving forward?
>
> Below are some my of thought.
>
> 1. Stop saying that you're getting rid of something when you're not.
> I'd suggest something like: "add simple and verbose alternatives to
> bitmap_remap(), and use them where appropriate".

I'm getting rid of the users of the bitmap_remap() outside of NUMA.
Why should I stop telling that? I think I have to elaborate what
that means.

> 2. Don't blame people considering a parameter named 'mask' as a mask.
> I mean this sentence:
>
> > You should get the mask meaning. It's not the bit provider, it's a bit
> > positions provider.
>
> If it's a 'bit position provider', please give it a proper name,
> for example 'pos'. I'd suggest something like:
> unsigned long bitmap_scatter(unsigned long *dst, unsigned long *pos,

It's not pos in the sense of what pos means. pos means a position of a single
bit, mask is about many bits. And mask meaning is still the same despite how
we call it, it's bit positions provider, i.e. bits that are set in the mask
will be considered to be worked on. It's mask, why should I rename it?

> 3. If you're saying that new API is a simplified replacement for
> something, I'd like to see the full contract, i.e. explicit list of all
> simplifications and limitations implied:
> - val == dst is not handled;
> - when 'pos' is empty, val is not copied to dst;
> - new API doesn't wrap around 0, like bitmap_remap() does;
> - set bits in 'val' are not copied to 'dst' when not in 'pos' (?)'
> - anything else else?

I see, I should have not used word "simplification" at all.
I will reword to not mention bitmap_remap() _at all_.

> 4. Similarly to previous item, I'd like to have explicit understanding
> and examples where and how bitmap_remap may be replaced. You're
> only saying that it is possible when either 'new' or 'old' are
> dense. Is that the only case? Can you add a test that explicitly
> checks that bitmap_remap and bitmap_scatter/gather produce the same
> output. Something like this:
> bitmap_remap(dst1, val, dense_map, pos, nbits);
> bitmap_scatter(dst2, val, pos, nbits);
> check_eq_bitmap(dst1, dst2, nbits);

No, I won't do that. This is out of the scope as see comment on 3. above.

> 5. Can you add a picture like this to explain the algorithm even more:
>
> mask: ...v..vv..v...vv
> bits: 0000001100000010
> 1. ^ ^^ ^ 0
> 2. | || | 10
> 3. | || +> 010
> 4. | |+--> 1010
> 5. | +--> 11010
> 6. +----> 011010
> gather: ..........011010

Why? Is it so complicated?

> 5. Regarding my confusion, I had to draw the picture above to realise
> how it's possible that scatter/gather are inverse and data-loss
> (i.e. not inverse) at the same time. Can you explain it with a
> wording like this: "For bits selected by 'pos' bitmap, gathering a
> 'val' bitmap with the following scattering restores the original map.
> All other bits values are lost and replaced with zeros." Similarly
> for gathering. And please add a test case.

Let's not make a big deal out of a small typo. I admit that it confused people,
but that was neither in the documentation, nor in the code, just in our
discussion.

> 6. Regarding performance. I think it's wrong to say that that your
> code is better optimized then some other, and then ask your
> reviewers to figure out how to measure the difference. If you make
> such statement, you should already have some test or real-life
> measurement.

Same as for 4. above. I drop the comparison and mentioning of bitmap_remap().
It was a big mistake by me to even mentioned that.

> However, if you ask me, I can suggest you to pull this patch:
> https://lore.kernel.org/lkml/[email protected]/
>
> and modify/extend it in a way that both bitmap_remap and
> bitmap_scatter/gather take the same 'val' and 'pos' bitmaps,
> produce the same output, and then see which code is faster.
>
> Worth to mention that since all current users of your API are working
> on 64-bit maps, performance is doubty an issue. So you can drop the
> 'optimization' part of your wording, and don't add performance test.

--
With Best Regards,
Andy Shevchenko


2023-10-02 23:00:41

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs

On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote:
> On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote:
> > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote:
> > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote:
> > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote:

...

> > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say
> > > > allows to work with 512 GPIOs at a time. With your code it will be much harder
> > > > to achieve and see what you wrote about maintenance (in that case).
> > >
> > > v3 ABI?? libgpiod v2 is barely out the door!
> > > Do you have any cases where 64 lines per request is limiting?
> >
> > IIRC it was SO question where the OP asks exactly about breaking the 64 lines
> > limitation in the current ABI.
> >
> > > If that sort of speculation isn't premature optimisation then I don't know
> > > what is.
> >
> > No, based on the real question / discussion, just have no link at hand.
> > But it's quite a niche, I can agree.
>
> Let me know if you find a ref to that discussion - I'm curious.

Here it is (read comments as well):
https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h

--
With Best Regards,
Andy Shevchenko