2022-10-30 05:29:33

by Dawei Li

Subject: [PATCH v3] block: simplify blksize_bits() implementation

Convert current looping-based implementation into bit operation,
which can bring improvement for:

1) bitops is more efficient for its arch-level optimization.

2) Given that blksize_bits() is inline, _if_ @size is compile-time
constant, it's possible that order_base_2() _may_ make output
compile-time evaluated, depending on code context and compiler behavior.
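To illustrate point 2, here is a minimal userspace sketch of how the expression can fold to a constant when @size is a literal. The macros CONST_ILOG2, CONST_ORDER_BASE_2 and BLKSIZE_BITS_CONST are hypothetical names made up for this example; they are not the kernel's ilog2()/order_base_2() from include/linux/log2.h, and CONST_ILOG2 only covers arguments below 256, which is enough for block sizes up to 64 KiB.

```c
/* Userspace sketch only; these macros are illustrative, not kernel code. */
#define SECTOR_SHIFT 9

/* floor(log2(n)) as an integer constant expression, valid for 1 <= n < 256. */
#define CONST_ILOG2(n)			\
	((n) < 2 ? 0 :			\
	 (n) < 4 ? 1 :			\
	 (n) < 8 ? 2 :			\
	 (n) < 16 ? 3 :			\
	 (n) < 32 ? 4 :			\
	 (n) < 64 ? 5 :			\
	 (n) < 128 ? 6 : 7)

/* ceil(log2(n)), with 0 and 1 both mapping to 0. */
#define CONST_ORDER_BASE_2(n) \
	((n) <= 1 ? 0 : CONST_ILOG2((n) - 1) + 1)

/* Same shape as the expression used in the patch. */
#define BLKSIZE_BITS_CONST(size) \
	(CONST_ORDER_BASE_2((size) >> SECTOR_SHIFT) + SECTOR_SHIFT)

/* With literal sizes, the compiler evaluates these at build time. */
_Static_assert(BLKSIZE_BITS_CONST(512) == 9, "512-byte blocks need 9 bits");
_Static_assert(BLKSIZE_BITS_CONST(4096) == 12, "4096-byte blocks need 12 bits");
_Static_assert(BLKSIZE_BITS_CONST(65536) == 16, "64 KiB blocks need 16 bits");
```

Whether the real inline function folds the same way still depends on the caller and the compiler, as noted above.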

v1: https://lore.kernel.org/all/TYCP286MB2323169D81A806A7C1F7FDF1CA309@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM

v2: Remove the ternary operator, based on Bart's suggestion.
But this may break corner cases such as:
BUILD_BUG_ON(blksize_bits(1025) != 11);
So make a minor modification by adding (SECTOR_SIZE - 1) before
shifting.

v3: Remove the rounding stuff.

base-commit: 30209debe98b6f66b13591e59e5272cb65b3945e

Signed-off-by: Dawei Li <[email protected]>
---
include/linux/blkdev.h | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 57ed49f20d2e..32137d85c9ad 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1349,12 +1349,7 @@ static inline int blk_rq_aligned(struct request_queue *q, unsigned long addr,
/* assumes size > 256 */
static inline unsigned int blksize_bits(unsigned int size)
{
-	unsigned int bits = 8;
-	do {
-		bits++;
-		size >>= 1;
-	} while (size > 256);
-	return bits;
+	return order_base_2(size >> SECTOR_SHIFT) + SECTOR_SHIFT;
}

static inline unsigned int block_size(struct block_device *bdev)
--
2.25.1
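
For readers who want to check the conversion outside the kernel, the following is a rough userspace sketch that compares the removed loop with the new expression for every power-of-two size from 512 upward. The helpers ilog2_u32(), order_base_2_u32() and count_pow2_mismatches() are stand-ins written for this example and only assumed to mirror the semantics of the kernel's ilog2() and order_base_2(); they are not the kernel implementations.

```c
/* Userspace sketch only; helper names are illustrative, not kernel APIs. */
#define SECTOR_SHIFT 9

/* floor(log2(n)) for n >= 1, computed with a plain loop. */
static unsigned int ilog2_u32(unsigned int n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* ceil(log2(n)), with 0 and 1 both mapping to 0. */
static unsigned int order_base_2_u32(unsigned int n)
{
	return n > 1 ? ilog2_u32(n - 1) + 1 : 0;
}

/* The loop-based version removed by the patch. */
static unsigned int blksize_bits_loop(unsigned int size)
{
	unsigned int bits = 8;

	do {
		bits++;
		size >>= 1;
	} while (size > 256);
	return bits;
}

/* The bitops-based version added by the patch. */
static unsigned int blksize_bits_bitops(unsigned int size)
{
	return order_base_2_u32(size >> SECTOR_SHIFT) + SECTOR_SHIFT;
}

/* Count power-of-two sizes in [512, 2^31] where the two versions disagree. */
static unsigned int count_pow2_mismatches(void)
{
	unsigned int mismatches = 0;

	for (unsigned int shift = SECTOR_SHIFT; shift < 32; shift++) {
		unsigned int size = 1u << shift;

		if (blksize_bits_loop(size) != blksize_bits_bitops(size))
			mismatches++;
	}
	return mismatches;
}
```

In this sketch the two versions agree for all power-of-two sizes tested. They can differ for sizes that are not a power of two (for example 1535 gives 11 with the loop and 10 with the shift-based form), which should only matter if a caller passes such a size.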



2022-10-30 08:07:13

by Christoph Hellwig

Subject: Re: [PATCH v3] block: simplify blksize_bits() implementation

I'm not sure if it matters, but the change does look fine to me:

Reviewed-by: Christoph Hellwig <[email protected]>

2022-10-30 15:22:32

by Bart Van Assche

Subject: Re: [PATCH v3] block: simplify blksize_bits() implementation

On 10/29/22 22:20, Dawei Li wrote:
> @@ -1349,12 +1349,7 @@ static inline int blk_rq_aligned(struct request_queue *q, unsigned long addr,
> /* assumes size > 256 */
> static inline unsigned int blksize_bits(unsigned int size)
> {
> -	unsigned int bits = 8;
> -	do {
> -		bits++;
> -		size >>= 1;
> -	} while (size > 256);
> -	return bits;
> +	return order_base_2(size >> SECTOR_SHIFT) + SECTOR_SHIFT;
> }

Reviewed-by: Bart Van Assche <[email protected]>

2022-10-31 02:27:15

by Chaitanya Kulkarni

Subject: Re: [PATCH v3] block: simplify blksize_bits() implementation

On 10/29/2022 10:20 PM, Dawei Li wrote:
> Convert current looping-based implementation into bit operation,
> which can bring improvement for:
>
> 1) bitops is more efficient for its arch-level optimization.
>

Do you have quantitative data to prove that?
Also, which arch benefits the most? Is it true for all?

> 2) Given that blksize_bits() is inline, _if_ @size is compile-time
> constant, it's possible that order_base_2() _may_ make output
> compile-time evaluated, depending on code context and compiler behavior.
>

Patches like this need to be supported by quantitative
data; otherwise I've seen reviewers raise objections ...

Either way:

Reviewed-by: Chaitanya Kulkarni <[email protected]>

-ck


2022-10-31 14:11:18

by Jens Axboe

Subject: Re: [PATCH v3] block: simplify blksize_bits() implementation

On Sun, 30 Oct 2022 13:20:08 +0800, Dawei Li wrote:
> Convert current looping-based implementation into bit operation,
> which can bring improvement for:
>
> 1) bitops is more efficient for its arch-level optimization.
>
> 2) Given that blksize_bits() is inline, _if_ @size is compile-time
> constant, it's possible that order_base_2() _may_ make output
> compile-time evaluated, depending on code context and compiler behavior.
>
> [...]

Applied, thanks!

[1/1] block: simplify blksize_bits() implementation
commit: adff215830fcf3ef74f2f0d4dd5a47a6927d450b

Best regards,
--
Jens Axboe