2017-11-09 17:27:32

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH v4] lib/dlock-list: Scale dlock_lists_empty()

On Wed, 08 Nov 2017, Boqun Feng wrote:
>Or worse:
>
> * CPU0 CPU1
> * dlock_list_add() dlock_lists_empty()
> * smp_mb__before_atomic();
> * [L] atomic_read(used_lists)
> * [S] atomic_inc(used_lists);
> * smp_mb__after_atomic();
>
>, in which case dlock_lists_empty() misses a increment of used_lists.
>
>That said, this may be fine for the usage of dlock_list. But I think the
>comments are confusing or wrong. The reason is you can not use barriers
>to order two memory ops on different CPUs, and usually we add comments
>like:

So I think that case is OK as CPU1 legitimately reads a 0, the problem
would be if we miss the inc it because we are doing spin_lock().

>
> [S] ... [S] ...
> <barrier> <barrier>
> [L] ... [L] ...

That is true.

>
>Davidlohr, could you try to improve the comments by finding both memory
>ops preceding and following the barriers? Maybe you will find those
>barriers are not necessary in the end.

Ok so for the dlock_list_add() the first barrier is so that the atomic_inc()
is not done inside the critical region, after the head->lock is taken.
The other barriers that follow I put such that the respective atomic op
is done before the list_add(), however thinking about it I don't think
they are really needed.

Regarding dlock_list_empty(), the smp_mb__before_atomic() is mainly for
pairing reasons; but at this point I don't see a respective counterpart
for the above diagram so I'm unclear.

Thanks,
Davidlohr


Attachments:
(No filename) (1.51 kB)
signature.asc (836.00 B)
Digital signature
Download all attachments