Hi,
What do you think about the following patch for sem_lock()?
Other options:
1) I don't like
#define smp_mb__after_unlock_wait() smp_rmb()
I think it is too specific: the last block in sem_lock uses
if (sma->complex_count == 0) {
smp_rmb();
return;
}
2) What about
#define smp_aquire__after_control_barrier() smp_rmb()
Best regards,
Manfred
xxxxx
sem_lock() does not properly pair memory barriers.
Theoretially an acquire barrier would the right operation.
But since the existing control boundary is a write memory barrier,
it is cheaper use an smp_rmb().
Signed-off-by: Manfred Spraul <[email protected]>
---
ipc/sem.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/ipc/sem.c b/ipc/sem.c
index 9284211..d43011d 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -267,6 +267,10 @@ static void sem_wait_array(struct sem_array *sma)
if (sma->complex_count) {
/* The thread that increased sma->complex_count waited on
* all sem->lock locks. Thus we don't need to wait again.
+ *
+ * The is no need for memory barriers: with
+ * complex_count>0, all threads acquire/release
+ * sem_perm.lock, thus spin_lock/unlock is the barrier.
*/
return;
}
@@ -275,6 +279,20 @@ static void sem_wait_array(struct sem_array *sma)
sem = sma->sem_base + i;
spin_unlock_wait(&sem->lock);
}
+ /*
+ * We own sem_perm.lock, all owners of sma->sem_base[i].lock have
+ * dropped their locks. But we still need a memory barrier:
+ * - The lock dropping thread did a spin_unlock(), which is the
+ * release memory barrier.
+ * - But the spin_unlock(&sma->sem_base[i].lock) might have
+ * happened after this thread did spin_lock(&sma->sem_perm.lock),
+ * thus the acquire memory barrier in this thread is missing.
+ * - spin_unlock_wait() is internally a loop, thus we have a control
+ * boundary. As writes are not speculated, we have already a barrier
+ * for writes. Reads can be performed speculatively, therefore a
+ * smp_rmb() is necessary.
+ */
+ smp_rmb();
}
/*
@@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
* Thus: if is now 0, then it will stay 0.
*/
if (sma->complex_count == 0) {
- /* fast path successful! */
+ /*
+ * Fast path successful!
+ * We only need a final memory barrier.
+ * (see sem_wait_array() for details).
+ */
+ smp_rmb();
+
return sops->sem_num;
}
}
--
2.1.0
Sorry Manfred, I initiated this discussion and then disappeared. Currently
I am buried in the ancient 2.16.18 bugs ;)
On 02/25, Manfred Spraul wrote:
> Hi,
>
> What do you think about the following patch for sem_lock()?
>
> Other options:
>
> 1) I don't like
>
> #define smp_mb__after_unlock_wait() smp_rmb()
>
> I think it is too specific: the last block in sem_lock uses
>
> if (sma->complex_count == 0) {
> smp_rmb();
> return;
> }
See below.
>
> 2) What about
>
> #define smp_aquire__after_control_barrier() smp_rmb()
I agree with any naming. The only point of the new helper is that we can
factor out the comment, otherwise we would need to repeat it again and again.
> @@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
> * Thus: if is now 0, then it will stay 0.
> */
> if (sma->complex_count == 0) {
> - /* fast path successful! */
> + /*
> + * Fast path successful!
> + * We only need a final memory barrier.
> + * (see sem_wait_array() for details).
> + */
> + smp_rmb();
> +
I'll try to read this again tomorrow, but so far I am confused.
Most probably I missed something, but this looks unneeded at first glance.
We already have another smp_rmb() above this check. And it should act as
a "final" barrier, or we can not trust this ->complex_count check ?
And (if I am right) this means that the comment above that rmb() should
be updated. And that is why I think the helper makes sense, the comment
should be almost the same as in sem_wait_array().
If not, could you please spell to explain why do we need another rmb() ?
Oleg.
Hi Oleg,
On 02/26/2015 08:29 PM, Oleg Nesterov wrote:
>> @@ -341,7 +359,13 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
>> * Thus: if is now 0, then it will stay 0.
>> */
>> if (sma->complex_count == 0) {
>> - /* fast path successful! */
>> + /*
>> + * Fast path successful!
>> + * We only need a final memory barrier.
>> + * (see sem_wait_array() for details).
>> + */
>> + smp_rmb();
>> +
> I'll try to read this again tomorrow, but so far I am confused.
>
> Most probably I missed something, but this looks unneeded at first glance.
No, my fault:
I thought long about sem_wait_array() and then I did copy&paste without
thinking properly.
The sequence is:
thread A:
spin_lock(&local)
thread B:
complex_count=??;
spin_unlock(&global); <<< release_mb
thread A:
spin_unlock_wait(&global); <<< control_mb
smb_mb__after_control_barrier(); <<< acquire_mb
<<< now everything from thread B is visible.
<<< and: thread B has dropped the lock, it can't change any
protected var
<<< and: a new thread C can't acquire a lock, we hold &local.
if (complex_count == 0) goto success;
I'll update the patch.
(cc stable, starting from 3.10...)
--
Manfred