2020-10-23 06:14:45

by Davidlohr Bueso

Subject: [PATCH] fs/dcache: optimize start_dir_add()

Considering both end_dir_add() and d_alloc_parallel(), the
dir->i_dir_seq wants acquire/release semantics, therefore
micro-optimize for ll/sc archs and use finer grained barriers
to provide (load)-ACQUIRE ordering (L->S + L->L). This comes
at no additional cost for most of x86, as sane tso models will
have a nop for smp_rmb/smp_acquire__after_ctrl_dep.

Signed-off-by: Davidlohr Bueso <[email protected]>
---
Alternatively I guess we could just use cmpxchg_acquire().

fs/dcache.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index ea0485861d93..22738daccb9c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2502,13 +2502,18 @@ EXPORT_SYMBOL(d_rehash);

 static inline unsigned start_dir_add(struct inode *dir)
 {
+	unsigned n;

 	for (;;) {
-		unsigned n = dir->i_dir_seq;
-		if (!(n & 1) && cmpxchg(&dir->i_dir_seq, n, n + 1) == n)
-			return n;
+		n = READ_ONCE(dir->i_dir_seq);
+		if (!(n & 1) && cmpxchg_relaxed(&dir->i_dir_seq, n, n + 1) == n)
+			break;
 		cpu_relax();
 	}
+
+	/* create (load)-ACQUIRE ordering */
+	smp_acquire__after_ctrl_dep();
+	return n;
 }

 static inline void end_dir_add(struct inode *dir, unsigned n)
--
2.26.2
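
For readers unfamiliar with the pattern in this version of the patch, here is a minimal userspace sketch of "relaxed CAS, then a separate acquire barrier", written with C11 atomics. It is not the kernel code: `start_dir_add_fenced()` and `end_dir_add_release()` are hypothetical names, a bare `atomic_uint` stands in for `dir->i_dir_seq`, and `atomic_thread_fence(memory_order_acquire)` is only the nearest portable stand-in for `smp_acquire__after_ctrl_dep()`, which C11 cannot express exactly.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Assumption: *seq plays the role of dir->i_dir_seq. An even value means
 * no add is in progress; an odd value means a writer holds the directory.
 */
static unsigned start_dir_add_fenced(atomic_uint *seq)
{
	unsigned n;

	for (;;) {
		n = atomic_load_explicit(seq, memory_order_relaxed);
		/* Relaxed CAS: atomic, but imposes no ordering by itself. */
		if (!(n & 1) &&
		    atomic_compare_exchange_strong_explicit(seq, &n, n + 1,
				memory_order_relaxed, memory_order_relaxed))
			break;
		/* The kernel calls cpu_relax() here; omitted in this sketch. */
	}

	/* Upgrade the successful relaxed CAS to acquire ordering. */
	atomic_thread_fence(memory_order_acquire);
	return n;
}

/* Release store that pairs with the acquire above. */
static void end_dir_add_release(atomic_uint *seq, unsigned n)
{
	assert(!(n & 1));	/* n is the even value returned at entry */
	atomic_store_explicit(seq, n + 2, memory_order_release);
}
```

The fence sits outside the loop, so failed attempts pay for no ordering at all. The cost is that the ordering is no longer visibly attached to the access of the sequence counter.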


2020-10-27 00:06:59

by Dave Chinner

Subject: Re: [PATCH] fs/dcache: optimize start_dir_add()

On Thu, Oct 22, 2020 at 02:16:50PM -0700, Davidlohr Bueso wrote:
> Considering both end_dir_add() and d_alloc_parallel(), the
> dir->i_dir_seq wants acquire/release semantics, therefore
> micro-optimize for ll/sc archs and use finer grained barriers
> to provide (load)-ACQUIRE ordering (L->S + L->L). This comes
> at no additional cost for most of x86, as sane tso models will
> have a nop for smp_rmb/smp_acquire__after_ctrl_dep.
>
> Signed-off-by: Davidlohr Bueso <[email protected]>
> ---
> Alternatively I guess we could just use cmpxchg_acquire().

Please use cmpxchg_acquire() so that people who have no clue what the
hell smp_acquire__after_ctrl_dep() means or does have some hope of
understanding what objects the ordering semantics in the function
actually apply to....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2020-10-29 19:34:15

by Davidlohr Bueso

Subject: [PATCH v2] fs/dcache: optimize start_dir_add()

Considering both end_dir_add() and d_alloc_parallel(), the
dir->i_dir_seq wants acquire/release semantics, therefore
micro-optimize for ll/sc archs. Also add READ_ONCE around
the variable, mostly for documentation purposes (either
the successful cmpxchg or the pause will avoid the tearing).

Signed-off-by: Davidlohr Bueso <[email protected]>
---
fs/dcache.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index ea0485861d93..9177f0d08a5a 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2504,8 +2504,8 @@ static inline unsigned start_dir_add(struct inode *dir)
 {

 	for (;;) {
-		unsigned n = dir->i_dir_seq;
-		if (!(n & 1) && cmpxchg(&dir->i_dir_seq, n, n + 1) == n)
+		unsigned n = READ_ONCE(dir->i_dir_seq);
+		if (!(n & 1) && cmpxchg_acquire(&dir->i_dir_seq, n, n + 1) == n)
 			return n;
 		cpu_relax();
 	}
--
2.26.2
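
The acquire/release pairing that v2 relies on can be illustrated in userspace with C11 atomics. This is a sketch under stated assumptions, not the kernel implementation: `dir_seq`, `entries_added`, `start_dir_add_acquire()`, `end_dir_add_store()` and `add_one_entry()` are all hypothetical names, and the C11 memory orders only approximate `cmpxchg_acquire()` and a release store.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for dir->i_dir_seq and the data it guards. */
static atomic_uint dir_seq;
static unsigned entries_added;

/* Writer entry: wait for an even value, then claim it by making it odd. */
static unsigned start_dir_add_acquire(atomic_uint *seq)
{
	for (;;) {
		unsigned n = atomic_load_explicit(seq, memory_order_relaxed);

		/* ACQUIRE on success only; a failed CAS needs no ordering. */
		if (!(n & 1) &&
		    atomic_compare_exchange_strong_explicit(seq, &n, n + 1,
				memory_order_acquire, memory_order_relaxed))
			return n;
	}
}

/* Writer exit: publish the update and make the counter even again. */
static void end_dir_add_store(atomic_uint *seq, unsigned n)
{
	assert(!(n & 1));	/* n is the even value seen at entry */
	atomic_store_explicit(seq, n + 2, memory_order_release);
}

/* Hypothetical caller: one guarded update between entry and exit. */
static void add_one_entry(void)
{
	unsigned n = start_dir_add_acquire(&dir_seq);

	entries_added++;
	end_dir_add_store(&dir_seq, n);
}
```

Here the ordering is named on the same call that touches the counter, which is the readability point raised in review: it is clear which object the acquire applies to.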