2021-07-19 02:19:28

by Yury Norov

Subject: [PATCH v2 0/4] bitmap: unify for_each_bit() macros

Replace the bitmap_for_each_{set,clear}_region() macros with
for_each_{set,clear}_bitrange(), and use the new macro to make
bitmap_list_string() about 30 times faster.

On top of:
https://lore.kernel.org/lkml/YNp3extAkTY8Aocd@yury-ThinkPad/T/ and
https://lore.kernel.org/lkml/YNirnaYw1GSxg1jK@yury-ThinkPad/T/

The full series is here:
https://github.com/norov/linux/commits/bitmap-20210716

v1: https://lore.kernel.org/patchwork/patch/1455255/
v2: - replace bitmap_for_each_{set,clear}_region();
    - address comments for bitmap_list_string() rework.

Yury Norov (4):
  mm/percpu: micro-optimize pcpu_is_populated()
  bitmap: unify find_bit operations
  lib: bitmap: add performance test for bitmap_print_to_pagebuf
  vsprintf: rework bitmap_list_string

 drivers/mmc/host/renesas_sdhi_core.c |  2 +-
 include/linux/bitmap.h               | 41 --------------------
 include/linux/find.h                 | 56 ++++++++++++++++++++++++++++
 lib/test_bitmap.c                    | 37 ++++++++++++++++++
 lib/vsprintf.c                       | 24 ++++--------
 mm/percpu.c                          | 35 ++++++++---------
 6 files changed, 117 insertions(+), 78 deletions(-)

--
2.30.2


2021-07-19 02:20:27

by Yury Norov

Subject: [PATCH 3/4] lib: bitmap: add performance test for bitmap_print_to_pagebuf

Functional tests for bitmap_print_to_pagebuf() are provided
in lib/test_printf.c. This patch adds a performance test for
the case of a fully set bitmap.

Signed-off-by: Yury Norov <[email protected]>
---
lib/test_bitmap.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 4ea73f5aed41..452d525007da 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -430,6 +430,42 @@ static void __init test_bitmap_parselist(void)
}
}

+static void __init test_bitmap_printlist(void)
+{
+	unsigned long *bmap = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	char expected[256];
+	int ret, slen;
+	ktime_t time;
+
+	if (!buf || !bmap)
+		goto out;
+
+	memset(bmap, -1, PAGE_SIZE);
+	slen = snprintf(expected, 256, "0-%ld", PAGE_SIZE * 8 - 1);
+	if (slen < 0)
+		goto out;
+
+	time = ktime_get();
+	ret = bitmap_print_to_pagebuf(true, buf, bmap, PAGE_SIZE * 8);
+	time = ktime_get() - time;
+
+	if (ret != slen + 1) {
+		pr_err("bitmap_print_to_pagebuf: result is %d, expected %d\n", ret, slen);
+		goto out;
+	}
+
+	if (strncmp(buf, expected, slen)) {
+		pr_err("bitmap_print_to_pagebuf: result is %s, expected %s\n", buf, expected);
+		goto out;
+	}
+
+	pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
+out:
+	kfree(buf);
+	kfree(bmap);
+}
+
static const unsigned long parse_test[] __initconst = {
BITMAP_FROM_U64(0),
BITMAP_FROM_U64(1),
@@ -669,6 +705,7 @@ static void __init selftest(void)
test_bitmap_arr32();
test_bitmap_parse();
test_bitmap_parselist();
+ test_bitmap_printlist();
test_mem_optimisations();
test_for_each_set_clump8();
test_bitmap_cut();
--
2.30.2
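
The `ret != slen + 1` check in the test above is easier to follow with the extra byte spelled out. The user-space sketch below assumes, as that check implies, that bitmap_print_to_pagebuf() appends one trailing newline to the list string. model_expected_full_list() is a hypothetical helper made up for illustration; it is not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the list string expected for a bitmap with all nbits bits set,
 * e.g. "0-32767", and return the length a caller would compare the print
 * function's return value against: the string length plus one byte for
 * the assumed trailing newline. Returns -1 if the string does not fit.
 */
static int model_expected_full_list(char *out, size_t len, unsigned long nbits)
{
	int slen;

	/* A single set bit prints as "0", not as a range, so require two bits. */
	assert(nbits >= 2);

	slen = snprintf(out, len, "0-%lu", nbits - 1);
	if (slen < 0 || (size_t)slen >= len)
		return -1;

	return slen + 1;	/* + 1 for the assumed '\n' */
}
```

With a 4096-byte page the bitmap holds 32768 bits, so the string is "0-32767" (7 characters) and the expected return value is 8.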

2021-07-19 02:21:24

by Yury Norov

Subject: [PATCH 1/4] mm/percpu: micro-optimize pcpu_is_populated()

bitmap_next_clear_region() calls find_next_zero_bit() and find_next_bit()
sequentially to find a range of clear bits. pcpu_is_populated() can
return early, skipping the find_next_bit() call, if all bits in the
requested range are set.

Signed-off-by: Yury Norov <[email protected]>
---
mm/percpu.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index 7f2e0151c4e2..25461571dcc5 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1071,17 +1071,18 @@ static void pcpu_block_update_hint_free(struct pcpu_chunk *chunk, int bit_off,
 static bool pcpu_is_populated(struct pcpu_chunk *chunk, int bit_off, int bits,
 			      int *next_off)
 {
-	unsigned int page_start, page_end, rs, re;
+	unsigned int start, end;

-	page_start = PFN_DOWN(bit_off * PCPU_MIN_ALLOC_SIZE);
-	page_end = PFN_UP((bit_off + bits) * PCPU_MIN_ALLOC_SIZE);
+	start = PFN_DOWN(bit_off * PCPU_MIN_ALLOC_SIZE);
+	end = PFN_UP((bit_off + bits) * PCPU_MIN_ALLOC_SIZE);

-	rs = page_start;
-	bitmap_next_clear_region(chunk->populated, &rs, &re, page_end);
-	if (rs >= page_end)
+	start = find_next_zero_bit(chunk->populated, end, start);
+	if (start >= end)
 		return true;

-	*next_off = re * PAGE_SIZE / PCPU_MIN_ALLOC_SIZE;
+	end = find_next_bit(chunk->populated, end, start + 1);
+
+	*next_off = end * PAGE_SIZE / PCPU_MIN_ALLOC_SIZE;
 	return false;
 }

--
2.30.2
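
The early return in this patch can be modelled in user space. The sketch below is not kernel code: the model_*() functions are naive, hypothetical stand-ins for find_next_bit() and find_next_zero_bit() that scan one bit at a time, and model_range_is_populated() reports a plain bit index rather than the scaled offset that pcpu_is_populated() stores in *next_off.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: test a single bit of the bitmap. */
static bool model_test_bit(const unsigned long *map, unsigned long nr)
{
	return (map[nr / MODEL_BITS_PER_LONG] >> (nr % MODEL_BITS_PER_LONG)) & 1UL;
}

/* Stand-in for find_next_bit(): first set bit in [offset, size), else size. */
static unsigned long model_find_next_bit(const unsigned long *map,
					 unsigned long size, unsigned long offset)
{
	for (; offset < size; offset++)
		if (model_test_bit(map, offset))
			return offset;
	return size;
}

/* Stand-in for find_next_zero_bit(): first clear bit in [offset, size), else size. */
static unsigned long model_find_next_zero_bit(const unsigned long *map,
					      unsigned long size, unsigned long offset)
{
	for (; offset < size; offset++)
		if (!model_test_bit(map, offset))
			return offset;
	return size;
}

/*
 * Returns true if every bit in [start, end) is set. Otherwise returns false
 * and stores in *next the index of the first set bit after the hole, or end
 * if the hole runs to the end of the range.
 */
static bool model_range_is_populated(const unsigned long *map,
				     unsigned long start, unsigned long end,
				     unsigned long *next)
{
	assert(start <= end);

	start = model_find_next_zero_bit(map, end, start);
	if (start >= end)
		return true;	/* no hole: the second search is skipped */

	*next = model_find_next_bit(map, end, start + 1);
	return false;
}
```

In the fully populated case only one search runs, which is the saving the commit message describes.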

2021-07-19 02:22:11

by Yury Norov

Subject: [PATCH 2/4] bitmap: unify find_bit operations

bitmap_for_each_{set,clear}_region() are similar to the for_each_bit()
macros in include/linux/find.h, but their interface and implementation
differ.

This patch adds for_each_bitrange() macros and drops the unused
bitmap_*_region() API for the sake of unification.

Signed-off-by: Yury Norov <[email protected]>
---
drivers/mmc/host/renesas_sdhi_core.c | 2 +-
include/linux/bitmap.h | 41 --------------------
include/linux/find.h | 56 ++++++++++++++++++++++++++++
mm/percpu.c | 20 ++++------
4 files changed, 65 insertions(+), 54 deletions(-)

diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index e49ca0f7fe9a..efd33b1fc467 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -647,7 +647,7 @@ static int renesas_sdhi_select_tuning(struct tmio_mmc_host *host)
* is at least SH_MOBILE_SDHI_MIN_TAP_ROW probes long then use the
* center index as the tap, otherwise bail out.
*/
- bitmap_for_each_set_region(bitmap, rs, re, 0, taps_size) {
+ for_each_set_bitrange(rs, re, bitmap, taps_size) {
if (re - rs > tap_cnt) {
tap_end = re;
tap_start = rs;
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 3f7c6731b203..96670abf49bd 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -55,12 +55,6 @@ struct device;
* bitmap_clear(dst, pos, nbits) Clear specified bit area
* bitmap_find_next_zero_area(buf, len, pos, n, mask) Find bit free area
* bitmap_find_next_zero_area_off(buf, len, pos, n, mask, mask_off) as above
- * bitmap_next_clear_region(map, &start, &end, nbits) Find next clear region
- * bitmap_next_set_region(map, &start, &end, nbits) Find next set region
- * bitmap_for_each_clear_region(map, rs, re, start, end)
- * Iterate over all clear regions
- * bitmap_for_each_set_region(map, rs, re, start, end)
- * Iterate over all set regions
* bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n
* bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
* bitmap_cut(dst, src, first, n, nbits) Cut n bits from first, copy rest
@@ -459,41 +453,6 @@ static inline void bitmap_replace(unsigned long *dst,
__bitmap_replace(dst, old, new, mask, nbits);
}

-static inline void bitmap_next_clear_region(unsigned long *bitmap,
- unsigned int *rs, unsigned int *re,
- unsigned int end)
-{
- *rs = find_next_zero_bit(bitmap, end, *rs);
- *re = find_next_bit(bitmap, end, *rs + 1);
-}
-
-static inline void bitmap_next_set_region(unsigned long *bitmap,
- unsigned int *rs, unsigned int *re,
- unsigned int end)
-{
- *rs = find_next_bit(bitmap, end, *rs);
- *re = find_next_zero_bit(bitmap, end, *rs + 1);
-}
-
-/*
- * Bitmap region iterators. Iterates over the bitmap between [@start, @end).
- * @rs and @re should be integer variables and will be set to start and end
- * index of the current clear or set region.
- */
-#define bitmap_for_each_clear_region(bitmap, rs, re, start, end) \
- for ((rs) = (start), \
- bitmap_next_clear_region((bitmap), &(rs), &(re), (end)); \
- (rs) < (re); \
- (rs) = (re) + 1, \
- bitmap_next_clear_region((bitmap), &(rs), &(re), (end)))
-
-#define bitmap_for_each_set_region(bitmap, rs, re, start, end) \
- for ((rs) = (start), \
- bitmap_next_set_region((bitmap), &(rs), &(re), (end)); \
- (rs) < (re); \
- (rs) = (re) + 1, \
- bitmap_next_set_region((bitmap), &(rs), &(re), (end)))
-
/**
* BITMAP_FROM_U64() - Represent u64 value in the format suitable for bitmap.
* @n: u64 value
diff --git a/include/linux/find.h b/include/linux/find.h
index ae9ed52b52b8..5bb6db213bcb 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -301,6 +301,62 @@ unsigned long find_next_bit_le(const void *addr, unsigned
(bit) < (size); \
(bit) = find_next_zero_bit((addr), (size), (bit) + 1))

+/**
+ * for_each_set_bitrange - iterate over all set bit ranges [b; e)
+ * @b: bit offset of start of current bitrange (first set bit)
+ * @e: bit offset of end of current bitrange (first unset bit)
+ * @addr: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ */
+#define for_each_set_bitrange(b, e, addr, size) \
+	for ((b) = find_next_bit((addr), (size), 0), \
+	     (e) = find_next_zero_bit((addr), (size), (b) + 1); \
+	     (b) < (size); \
+	     (b) = find_next_bit((addr), (size), (e) + 1), \
+	     (e) = find_next_zero_bit((addr), (size), (b) + 1))
+
+/**
+ * for_each_set_bitrange_from - iterate over set bit ranges [b; e)
+ * @b: bit offset of start of current bitrange (first set bit); must be initialized
+ * @e: bit offset of end of current bitrange (first unset bit)
+ * @addr: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ */
+#define for_each_set_bitrange_from(b, e, addr, size) \
+	for ((b) = find_next_bit((addr), (size), (b)), \
+	     (e) = find_next_zero_bit((addr), (size), (b) + 1); \
+	     (b) < (size); \
+	     (b) = find_next_bit((addr), (size), (e) + 1), \
+	     (e) = find_next_zero_bit((addr), (size), (b) + 1))
+
+/**
+ * for_each_clear_bitrange - iterate over all unset bit ranges [b; e)
+ * @b: bit offset of start of current bitrange (first unset bit)
+ * @e: bit offset of end of current bitrange (first set bit)
+ * @addr: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ */
+#define for_each_clear_bitrange(b, e, addr, size) \
+	for ((b) = find_next_zero_bit((addr), (size), 0), \
+	     (e) = find_next_bit((addr), (size), (b) + 1); \
+	     (b) < (size); \
+	     (b) = find_next_zero_bit((addr), (size), (e) + 1), \
+	     (e) = find_next_bit((addr), (size), (b) + 1))
+
+/**
+ * for_each_clear_bitrange_from - iterate over unset bit ranges [b; e)
+ * @b: bit offset of start of current bitrange (first unset bit); must be initialized
+ * @e: bit offset of end of current bitrange (first set bit)
+ * @addr: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ */
+#define for_each_clear_bitrange_from(b, e, addr, size) \
+	for ((b) = find_next_zero_bit((addr), (size), (b)), \
+	     (e) = find_next_bit((addr), (size), (b) + 1); \
+	     (b) < (size); \
+	     (b) = find_next_zero_bit((addr), (size), (e) + 1), \
+	     (e) = find_next_bit((addr), (size), (b) + 1))
+
+
/**
* for_each_set_clump8 - iterate over bitmap for each 8-bit clump with set bits
* @start: bit offset to start search and to store the current iteration offset
diff --git a/mm/percpu.c b/mm/percpu.c
index 25461571dcc5..6d518e822983 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -780,7 +780,7 @@ static void pcpu_block_refresh_hint(struct pcpu_chunk *chunk, int index)
{
struct pcpu_block_md *block = chunk->md_blocks + index;
unsigned long *alloc_map = pcpu_index_alloc_map(chunk, index);
- unsigned int rs, re, start; /* region start, region end */
+ unsigned int start, end; /* region start, region end */

/* promote scan_hint to contig_hint */
if (block->scan_hint) {
@@ -796,9 +796,8 @@ static void pcpu_block_refresh_hint(struct pcpu_chunk *chunk, int index)
block->right_free = 0;

/* iterate over free areas and update the contig hints */
- bitmap_for_each_clear_region(alloc_map, rs, re, start,
- PCPU_BITMAP_BLOCK_BITS)
- pcpu_block_update(block, rs, re);
+ for_each_clear_bitrange_from(start, end, alloc_map, PCPU_BITMAP_BLOCK_BITS)
+ pcpu_block_update(block, start, end);
}

/**
@@ -1856,13 +1855,12 @@ static void __percpu *pcpu_alloc(size_t size, size_t align, bool reserved,

/* populate if not all pages are already there */
if (!is_atomic) {
- unsigned int page_start, page_end, rs, re;
+ unsigned int page_end, rs, re;

- page_start = PFN_DOWN(off);
+ rs = PFN_DOWN(off);
page_end = PFN_UP(off + size);

- bitmap_for_each_clear_region(chunk->populated, rs, re,
- page_start, page_end) {
+ for_each_clear_bitrange_from(rs, re, chunk->populated, page_end) {
WARN_ON(chunk->immutable);

ret = pcpu_populate_chunk(chunk, rs, re, pcpu_gfp);
@@ -2018,8 +2016,7 @@ static void pcpu_balance_free(bool empty_only)
list_for_each_entry_safe(chunk, next, &to_free, list) {
unsigned int rs, re;

- bitmap_for_each_set_region(chunk->populated, rs, re, 0,
- chunk->nr_pages) {
+ for_each_set_bitrange(rs, re, chunk->populated, chunk->nr_pages) {
pcpu_depopulate_chunk(chunk, rs, re);
spin_lock_irq(&pcpu_lock);
pcpu_chunk_depopulated(chunk, rs, re);
@@ -2089,8 +2086,7 @@ static void pcpu_balance_populated(void)
continue;

/* @chunk can't go away while pcpu_alloc_mutex is held */
- bitmap_for_each_clear_region(chunk->populated, rs, re, 0,
- chunk->nr_pages) {
+ for_each_clear_bitrange(rs, re, chunk->populated, chunk->nr_pages) {
int nr = min_t(int, re - rs, nr_to_pop);

spin_unlock_irq(&pcpu_lock);
--
2.30.2
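
To see how the new iterator walks a bitmap, here is a user-space model of for_each_set_bitrange(). It is a sketch under stated assumptions: the model_*() names are hypothetical, the find helpers are naive bit-by-bit stand-ins for the kernel functions, and they are assumed to return size whenever offset is at or past size, which the macro relies on when it searches from (e) + 1 after the last range.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static bool model_test_bit(const unsigned long *map, unsigned long nr)
{
	return (map[nr / MODEL_BITS_PER_LONG] >> (nr % MODEL_BITS_PER_LONG)) & 1UL;
}

/* First set bit in [offset, size), else size (also when offset >= size). */
static unsigned long model_find_next_bit(const unsigned long *map,
					 unsigned long size, unsigned long offset)
{
	for (; offset < size; offset++)
		if (model_test_bit(map, offset))
			return offset;
	return size;
}

/* First clear bit in [offset, size), else size (also when offset >= size). */
static unsigned long model_find_next_zero_bit(const unsigned long *map,
					      unsigned long size, unsigned long offset)
{
	for (; offset < size; offset++)
		if (!model_test_bit(map, offset))
			return offset;
	return size;
}

/* Same shape as the macro added by the patch, built on the stand-ins above. */
#define model_for_each_set_bitrange(b, e, addr, size) \
	for ((b) = model_find_next_bit((addr), (size), 0), \
	     (e) = model_find_next_zero_bit((addr), (size), (b) + 1); \
	     (b) < (size); \
	     (b) = model_find_next_bit((addr), (size), (e) + 1), \
	     (e) = model_find_next_zero_bit((addr), (size), (b) + 1))

struct model_bitrange {
	unsigned long start;	/* first set bit of the range */
	unsigned long end;	/* one past the last set bit */
};

/* Collect up to max half-open ranges [start, end) of set bits; returns the count. */
static size_t model_collect_set_ranges(const unsigned long *map, unsigned long size,
				       struct model_bitrange *out, size_t max)
{
	unsigned long b, e;
	size_t n = 0;

	assert(out != NULL || max == 0);

	model_for_each_set_bitrange(b, e, map, size) {
		if (n == max)
			break;
		out[n].start = b;
		out[n].end = e;
		n++;
	}
	return n;
}
```

For the bitmap 0xF3 with size 8, the loop yields [0, 2) and [4, 8). Unlike the removed bitmap_for_each_*_region() macros, the plain variants always start at bit 0; a caller that needs a different starting point initializes b and uses the _from variant, as pcpu_alloc() does in this patch.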

2021-07-19 02:22:26

by Yury Norov

Subject: [PATCH 4/4] vsprintf: rework bitmap_list_string

bitmap_list_string() is very inefficient when printing bitmaps with long
ranges of set bits because it calls find_next_bit() for each bit in the
bitmap. We can do better by detecting ranges of set bits.

In my environment, the time before/after this change is 943008/31008 ns.

Signed-off-by: Yury Norov <[email protected]>
---
lib/vsprintf.c | 24 +++++++-----------------
1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 3b8b3f20051a..361799075706 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1241,20 +1241,13 @@ char *bitmap_list_string(char *buf, char *end, unsigned long *bitmap,
 			 struct printf_spec spec, const char *fmt)
 {
 	int nr_bits = max_t(int, spec.field_width, 0);
-	/* current bit is 'cur', most recently seen range is [rbot, rtop] */
-	int cur, rbot, rtop;
 	bool first = true;
+	int rbot, rtop;

 	if (check_pointer(&buf, end, bitmap, spec))
 		return buf;

-	rbot = cur = find_first_bit(bitmap, nr_bits);
-	while (cur < nr_bits) {
-		rtop = cur;
-		cur = find_next_bit(bitmap, nr_bits, cur + 1);
-		if (cur < nr_bits && cur <= rtop + 1)
-			continue;
-
+	for_each_set_bitrange(rbot, rtop, bitmap, nr_bits) {
 		if (!first) {
 			if (buf < end)
 				*buf = ',';
@@ -1263,15 +1256,12 @@ char *bitmap_list_string(char *buf, char *end, unsigned long *bitmap,
 		first = false;

 		buf = number(buf, end, rbot, default_dec_spec);
-		if (rbot < rtop) {
-			if (buf < end)
-				*buf = '-';
-			buf++;
-
-			buf = number(buf, end, rtop, default_dec_spec);
-		}
+		if (rtop == rbot + 1)
+			continue;

-		rbot = cur;
+		if (buf < end)
+			*buf = '-';
+		buf = number(++buf, end, rtop - 1, default_dec_spec);
 	}
 	return buf;
 }
--
2.30.2
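
The range-based printing can be tried outside the kernel with a small model. This is only a sketch: model_bitmap_list() and the other model_*() names are hypothetical, the find helpers are naive stand-ins, and snprintf() replaces the kernel's number() helper. Unlike bitmap_list_string(), which keeps advancing buf past end, the model simply fails with -1 when the output does not fit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static bool model_test_bit(const unsigned long *map, unsigned long nr)
{
	return (map[nr / MODEL_BITS_PER_LONG] >> (nr % MODEL_BITS_PER_LONG)) & 1UL;
}

/* First bit equal to want in [offset, size), else size. */
static unsigned long model_find_next(const unsigned long *map, unsigned long size,
				     unsigned long offset, bool want)
{
	for (; offset < size; offset++)
		if (model_test_bit(map, offset) == want)
			return offset;
	return size;
}

#define model_for_each_set_bitrange(b, e, addr, size) \
	for ((b) = model_find_next((addr), (size), 0, true), \
	     (e) = model_find_next((addr), (size), (b) + 1, false); \
	     (b) < (size); \
	     (b) = model_find_next((addr), (size), (e) + 1, true), \
	     (e) = model_find_next((addr), (size), (b) + 1, false))

/*
 * Format the set bits of map as a list such as "0-1,4-7,9".
 * Returns the string length, or -1 if buf (len bytes) is too small.
 */
static int model_bitmap_list(char *buf, size_t len, const unsigned long *map,
			     unsigned long nbits)
{
	char piece[64];		/* room for ",<start>-<end>" and the NUL */
	unsigned long rbot, rtop;
	size_t pos = 0;
	bool first = true;

	assert(len > 0);
	buf[0] = '\0';

	model_for_each_set_bitrange(rbot, rtop, map, nbits) {
		int n;

		if (rtop == rbot + 1)	/* single bit: no "-<end>" part */
			n = snprintf(piece, sizeof(piece), "%s%lu",
				     first ? "" : ",", rbot);
		else			/* rtop is exclusive, so print rtop - 1 */
			n = snprintf(piece, sizeof(piece), "%s%lu-%lu",
				     first ? "" : ",", rbot, rtop - 1);

		if (n < 0 || pos + (size_t)n >= len)
			return -1;

		memcpy(buf + pos, piece, (size_t)n + 1);
		pos += (size_t)n;
		first = false;
	}
	return (int)pos;
}
```

The loop body runs once per range rather than once per set bit, so a fully set bitmap needs a single iteration regardless of its size.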

2021-07-21 21:09:28

by Dennis Zhou

Subject: Re: [PATCH 1/4] mm/percpu: micro-optimize pcpu_is_populated()

Hello,

On Sun, Jul 18, 2021 at 07:17:52PM -0700, Yury Norov wrote:
> bitmap_next_clear_region() calls find_next_zero_bit() and find_next_bit()
> sequentially to find a range of clear bits. In case of pcpu_is_populated()
> there's a chance to return earlier if bitmap has all bits set.
>
> Signed-off-by: Yury Norov <[email protected]>

Sorry for the delay.

Acked-by: Dennis Zhou <[email protected]>

Thanks,
Dennis

2021-07-21 21:14:33

by Dennis Zhou

Subject: Re: [PATCH 2/4] bitmap: unify find_bit operations

On Sun, Jul 18, 2021 at 07:17:53PM -0700, Yury Norov wrote:
> bitmap_for_each_{set,clear}_region() are similar to for_each_bit()
> macros in include/linux/find.h, but interface and implementation
> of them are different.
>
> This patch adds for_each_bitrange() macros and drops unused
> bitmap_*_region() API in sake of unification.
>
> Signed-off-by: Yury Norov <[email protected]>

Acked-by: Dennis Zhou <[email protected]>

Thanks,
Dennis

2021-08-04 12:54:09

by Ulf Hansson

Subject: Re: [PATCH 2/4] bitmap: unify find_bit operations

On Mon, 19 Jul 2021 at 04:18, Yury Norov <[email protected]> wrote:
>
> bitmap_for_each_{set,clear}_region() are similar to for_each_bit()
> macros in include/linux/find.h, but interface and implementation
> of them are different.
>
> This patch adds for_each_bitrange() macros and drops unused
> bitmap_*_region() API in sake of unification.
>
> Signed-off-by: Yury Norov <[email protected]>

Acked-by: Ulf Hansson <[email protected]> # For MMC

Kind regards
Uffe

> @@ -2018,8 +2016,7 @@ static void pcpu_balance_free(bool empty_only)
> list_for_each_entry_safe(chunk, next, &to_free, list) {
> unsigned int rs, re;
>
> - bitmap_for_each_set_region(chunk->populated, rs, re, 0,
> - chunk->nr_pages) {
> + for_each_set_bitrange(rs, re, chunk->populated, chunk->nr_pages) {
> pcpu_depopulate_chunk(chunk, rs, re);
> spin_lock_irq(&pcpu_lock);
> pcpu_chunk_depopulated(chunk, rs, re);
> @@ -2089,8 +2086,7 @@ static void pcpu_balance_populated(void)
> continue;
>
> /* @chunk can't go away while pcpu_alloc_mutex is held */
> - bitmap_for_each_clear_region(chunk->populated, rs, re, 0,
> - chunk->nr_pages) {
> + for_each_clear_bitrange(rs, re, chunk->populated, chunk->nr_pages) {
> int nr = min_t(int, re - rs, nr_to_pop);
>
> spin_unlock_irq(&pcpu_lock);
> --
> 2.30.2
>

2021-08-11 07:41:41

by Wolfram Sang

Subject: Re: [PATCH 2/4] bitmap: unify find_bit operations

On Sun, Jul 18, 2021 at 07:17:53PM -0700, Yury Norov wrote:
> bitmap_for_each_{set,clear}_region() are similar to for_each_bit()
> macros in include/linux/find.h, but interface and implementation
> of them are different.
>
> This patch adds for_each_bitrange() macros and drops the unused
> bitmap_*_region() API for the sake of unification.
>
> Signed-off-by: Yury Norov <[email protected]>

I fetched your bitmap-20210716 branch and tested it on a Renesas
Salvator-XS board with an R-Car M3-N SoC with some debug output added.
Still works and values make sense, so:

Tested-by: Wolfram Sang <[email protected]>


2021-08-12 01:25:19

by Yury Norov

Subject: Re: [PATCH 2/4] bitmap: unify find_bit operations

On Wed, Aug 11, 2021 at 12:38 AM Wolfram Sang
<[email protected]> wrote:
>
> On Sun, Jul 18, 2021 at 07:17:53PM -0700, Yury Norov wrote:
> > bitmap_for_each_{set,clear}_region() are similar to for_each_bit()
> > macros in include/linux/find.h, but interface and implementation
> > of them are different.
> >
> > This patch adds for_each_bitrange() macros and drops the unused
> > bitmap_*_region() API for the sake of unification.
> >
> > Signed-off-by: Yury Norov <[email protected]>
>
> I fetched your bitmap-20210716 branch and tested it on a Renesas
> Salvator-XS board with an R-Car M3-N SoC with some debug output added.
> Still works and values make sense, so:
>
> Tested-by: Wolfram Sang <[email protected]>

Thank you, Wolfram, for looking into this. I'll resend the whole series
this weekend.
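
Editorial illustration (not part of the original thread): the iteration
semantics of for_each_set_bitrange() from the quoted patch can be sketched
in userspace C. The macro body is copied from the diff above; everything
else is an assumption made for illustration. In particular,
next_bit_with_value() and format_set_ranges() are hypothetical helpers, and
find_next_bit()/find_next_zero_bit() here are naive bit-by-bit stand-ins
that only mimic the kernel convention of returning 'size' when nothing is
found. They are not the kernel implementations.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: first bit >= start whose value is 'want', else size. */
static unsigned long next_bit_with_value(const unsigned long *addr,
					 unsigned long size,
					 unsigned long start, int want)
{
	unsigned long i;

	for (i = start; i < size; i++) {
		int bit = (addr[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1UL;

		if (bit == want)
			return i;
	}
	return size;	/* also covers start >= size */
}

/* Naive stand-ins named after the kernel helpers the macro expects. */
static unsigned long find_next_bit(const unsigned long *addr,
				   unsigned long size, unsigned long start)
{
	return next_bit_with_value(addr, size, start, 1);
}

static unsigned long find_next_zero_bit(const unsigned long *addr,
					unsigned long size, unsigned long start)
{
	return next_bit_with_value(addr, size, start, 0);
}

/* Macro body as it appears in the quoted patch: yields ranges [b; e). */
#define for_each_set_bitrange(b, e, addr, size)			\
	for ((b) = find_next_bit((addr), (size), 0),		\
	     (e) = find_next_zero_bit((addr), (size), (b) + 1);	\
	     (b) < (size);					\
	     (b) = find_next_bit((addr), (size), (e) + 1),	\
	     (e) = find_next_zero_bit((addr), (size), (b) + 1))

/*
 * Hypothetical helper: write the set ranges as a list such as "0-3,8".
 * Returns the string length, or -1 if the buffer is too small.
 */
static int format_set_ranges(const unsigned long *addr, unsigned long size,
			     char *buf, size_t len)
{
	unsigned long b, e;
	size_t pos = 0;

	assert(buf && len > 0);
	buf[0] = '\0';

	for_each_set_bitrange(b, e, addr, size) {
		int n;

		/* e is exclusive, so the last set bit of the range is e - 1 */
		if (e - b == 1)
			n = snprintf(buf + pos, len - pos, "%s%lu",
				     pos ? "," : "", b);
		else
			n = snprintf(buf + pos, len - pos, "%s%lu-%lu",
				     pos ? "," : "", b, e - 1);
		if (n < 0 || (size_t)n >= len - pos)
			return -1;
		pos += (size_t)n;
	}
	return (int)pos;
}
```

With a 16-bit map holding bits 0-3, 8 and 10-11, the loop visits the
half-open ranges [0; 4), [8; 9) and [10; 12). Restarting each search at
(e) + 1 is safe because bit e is already known to be clear, which is the
same reasoning the macro relies on when it starts the zero-bit search at
(b) + 1.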