2019-11-30 20:28:36

by Pavel Begunkov

[permalink] [raw]
Subject: [PATCH v2] block: optimise bvec_iter_advance()

bvec_iter_advance() is quite popular, but compilers fail to do proper
alias analysis and optimise it good enough. The assembly is checked
for gcc 9.2, x86-64.

- remove @iter->bi_size from min(...), as it's always less than @bytes.
Modify at the beginning and forget about it.

- the compiler isn't able to collapse memory dependencies and remove
writes in the loop. Help it by explicitely using local vars.

Signed-off-by: Pavel Begunkov <[email protected]>
---

v2: simplify code (Arvind Sankar)

include/linux/bvec.h | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index a032f01e928c..679a42253170 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -87,26 +87,24 @@ struct bvec_iter_all {
static inline bool bvec_iter_advance(const struct bio_vec *bv,
struct bvec_iter *iter, unsigned bytes)
{
+ unsigned int idx = iter->bi_idx;
+
if (WARN_ONCE(bytes > iter->bi_size,
"Attempted to advance past end of bvec iter\n")) {
iter->bi_size = 0;
return false;
}

- while (bytes) {
- const struct bio_vec *cur = bv + iter->bi_idx;
- unsigned len = min3(bytes, iter->bi_size,
- cur->bv_len - iter->bi_bvec_done);
-
- bytes -= len;
- iter->bi_size -= len;
- iter->bi_bvec_done += len;
+ iter->bi_size -= bytes;
+ bytes += iter->bi_bvec_done;

- if (iter->bi_bvec_done == cur->bv_len) {
- iter->bi_bvec_done = 0;
- iter->bi_idx++;
- }
+ while (bytes && bytes >= bv[idx].bv_len) {
+ bytes -= bv[idx].bv_len;
+ idx++;
}
+
+ iter->bi_idx = idx;
+ iter->bi_bvec_done = bytes;
return true;
}

--
2.24.0


2019-11-30 22:06:33

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v2] block: optimise bvec_iter_advance()

On Sat, Nov 30, 2019 at 11:23:52PM +0300, Pavel Begunkov wrote:
> bvec_iter_advance() is quite popular, but compilers fail to do proper
> alias analysis and optimise it good enough. The assembly is checked
> for gcc 9.2, x86-64.
>
> - remove @iter->bi_size from min(...), as it's always less than @bytes.
> Modify at the beginning and forget about it.
>
> - the compiler isn't able to collapse memory dependencies and remove
> writes in the loop. Help it by explicitely using local vars.
>
> Signed-off-by: Pavel Begunkov <[email protected]>
> ---
>
> v2: simplify code (Arvind Sankar)
>

Thanks :)

Signed-off-by: Arvind Sankar <[email protected]>

Btw, I discovered that gcc 9.2 doesn't optimize away the second
comparison in something like

m = min(a,b);
return m>a;

So the WARN_ONCE bit doesn't get optimized away even in cases like
bio_for_each_bvec where it's guaranteed at compile-time to not trigger.

2019-12-01 00:11:14

by Pavel Begunkov

[permalink] [raw]
Subject: Re: [PATCH v2] block: optimise bvec_iter_advance()

On 01/12/2019 01:05, Arvind Sankar wrote:
> On Sat, Nov 30, 2019 at 11:23:52PM +0300, Pavel Begunkov wrote:
>> bvec_iter_advance() is quite popular, but compilers fail to do proper
>> alias analysis and optimise it good enough. The assembly is checked
>> for gcc 9.2, x86-64.
>>
>> - remove @iter->bi_size from min(...), as it's always less than @bytes.
>> Modify at the beginning and forget about it.
>>
>> - the compiler isn't able to collapse memory dependencies and remove
>> writes in the loop. Help it by explicitly using local vars.
>>
>> Signed-off-by: Pavel Begunkov <[email protected]>
>> ---
>>
>> v2: simplify code (Arvind Sankar)
>>
>
> Thanks :)
>
> Signed-off-by: Arvind Sankar <[email protected]>
>
> Btw, I discovered that gcc 9.2 doesn't optimize away the second
> comparison in something like

That's unfortunate. IDK, whether it could be easily done in gcc,
but maybe relay it to compiler guys?

>
> m = min(a,b);
> return m>a;
>
> So the WARN_ONCE bit doesn't get optimized away even in cases like
> bio_for_each_bvec where it's guaranteed at compile-time to not trigger.
>

That's more like a starting point. The idea is to revise and
tune/rewrite iteration helpers including *for_each_bvec(). I assume,
all those are really poorly optimisable (especially with
-fno-strict-aliasing). So, noted to consider

--
Pavel Begunkov


Attachments:
signature.asc (849.00 B)
OpenPGP digital signature

2019-12-03 14:39:37

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH v2] block: optimise bvec_iter_advance()

On 11/30/19 1:23 PM, Pavel Begunkov wrote:
> bvec_iter_advance() is quite popular, but compilers fail to do proper
> alias analysis and optimise it good enough. The assembly is checked
> for gcc 9.2, x86-64.
>
> - remove @iter->bi_size from min(...), as it's always less than @bytes.
> Modify at the beginning and forget about it.
>
> - the compiler isn't able to collapse memory dependencies and remove
> writes in the loop. Help it by explicitely using local vars.

Applied, thanks.

--
Jens Axboe