In queue_read_lock_slowpath, once the writer count drops to 0, we need
to increment the reader count to take the lock, and then call
rspin_until_writer_unlock to check again whether an incoming writer
stole the lock in the gap. But rspin_until_writer_unlock only checks
the writer count, namely the low 8 bits of lock->cnts, so there is no
need to subtract the reader count unit first. Remove that subtraction
to make the code clearer; rspin_until_writer_unlock now simply takes
the actual lock->cnts as its 2nd argument.

Also change the code comments in queue_write_lock_slowpath to make
them more exact and explicit.

Signed-off-by: Baoquan He <[email protected]>
---
kernel/locking/qrwlock.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index f956ede..ae66c10 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -76,7 +76,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
 	while (atomic_read(&lock->cnts) & _QW_WMASK)
 		cpu_relax_lowlatency();

-	cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
+	cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
 	rspin_until_writer_unlock(lock, cnts);

 	/*
@@ -97,14 +97,14 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
 	/* Put the writer into the wait queue */
 	arch_spin_lock(&lock->lock);

-	/* Try to acquire the lock directly if no reader is present */
+	/* Try to acquire the lock directly if no reader and writer is present */
 	if (!atomic_read(&lock->cnts) &&
 	    (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0))
 		goto unlock;

 	/*
-	 * Set the waiting flag to notify readers that a writer is pending,
-	 * or wait for a previous writer to go away.
+	 * Wait for a previous writer to go away, then set the waiting flag to
+	 * notify readers that a writer is pending.
 	 */
 	for (;;) {
 		cnts = atomic_read(&lock->cnts);
--
1.8.5.3
On Tue, Dec 16, 2014 at 02:00:40PM +0800, Baoquan He wrote:
> In queue_read_lock_slowpath, when writer count becomes 0, we need
> increment the read count and get the lock. Then need call
> rspin_until_writer_unlock to check again if an incoming writer
> steals the lock in the gap. But in rspin_until_writer_unlock
> it only checks the writer count, namely low 8 bit of lock->cnts,
> no need to subtract the reader count unit specifically. So remove
> that subtraction to make it clearer, rspin_until_writer_unlock
> just takes the actual lock->cnts as the 2nd argument.
>
> And also change the code comment in queue_write_lock_slowpath to
> make it more exact and explicit.
>
> Signed-off-by: Baoquan He <[email protected]>
> ---
> kernel/locking/qrwlock.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index f956ede..ae66c10 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -76,7 +76,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
> while (atomic_read(&lock->cnts) & _QW_WMASK)
> cpu_relax_lowlatency();
>
> - cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
> + cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
> rspin_until_writer_unlock(lock, cnts);
Did you actually look at the ASM generated? I suspect your change makes
it bigger.
On 12/16/14 at 10:01am, Peter Zijlstra wrote:
> On Tue, Dec 16, 2014 at 02:00:40PM +0800, Baoquan He wrote:
> > In queue_read_lock_slowpath, when writer count becomes 0, we need
> > increment the read count and get the lock. Then need call
> > rspin_until_writer_unlock to check again if an incoming writer
> > steals the lock in the gap. But in rspin_until_writer_unlock
> > it only checks the writer count, namely low 8 bit of lock->cnts,
> > no need to subtract the reader count unit specifically. So remove
> > that subtraction to make it clearer, rspin_until_writer_unlock
> > just takes the actual lock->cnts as the 2nd argument.
> >
> > And also change the code comment in queue_write_lock_slowpath to
> > make it more exact and explicit.
> >
> > Signed-off-by: Baoquan He <[email protected]>
> > ---
> > kernel/locking/qrwlock.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> > index f956ede..ae66c10 100644
> > --- a/kernel/locking/qrwlock.c
> > +++ b/kernel/locking/qrwlock.c
> > @@ -76,7 +76,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
> > while (atomic_read(&lock->cnts) & _QW_WMASK)
> > cpu_relax_lowlatency();
> >
> > - cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
> > + cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
> > rspin_until_writer_unlock(lock, cnts);
>
> Did you actually look at the ASM generated? I suspect your change makes
> it bigger.
It does make it bigger, but that doesn't matter, because
rspin_until_writer_unlock only compares (cnts & _QW_WMASK) with
_QW_LOCKED. So using the incremented reader count doesn't affect
the result. In any case, rspin_until_writer_unlock re-reads the actual
lock->cnts on its next loop iteration. I can't see why we need to
subtract that reader count increment specifically.

When I first read this code, I thought the subtraction served some
special purpose. In the end I realized it doesn't, so it isn't needed.
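
The masking argument above can be checked with a small user-space
sketch. The constant values are written from memory of
include/asm-generic/qrwlock.h and should be treated as assumptions;
writer_locked() and check_agrees() are hypothetical helpers, not kernel
functions. The sketch only shows that adding or removing _QR_BIAS
never changes the low 8 bits that the writer check looks at.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, as recalled from include/asm-generic/qrwlock.h */
#define _QW_LOCKED 0xffU               /* a writer holds the lock */
#define _QW_WMASK  0xffU               /* writer byte: low 8 bits */
#define _QR_SHIFT  8                   /* reader count starts at bit 8 */
#define _QR_BIAS   (1U << _QR_SHIFT)   /* one reader */

/* Hypothetical helper: the test that the spin loop applies to cnts. */
static int writer_locked(uint32_t cnts)
{
	return (cnts & _QW_WMASK) == _QW_LOCKED;
}

/*
 * Hypothetical helper: build the value atomic_add_return() would hand
 * back, then compare the writer check with and without "- _QR_BIAS".
 * Returns 1 when both forms give the same answer.
 */
static int check_agrees(uint32_t writer_byte, uint32_t readers_before)
{
	uint32_t after = (readers_before << _QR_SHIFT) + _QR_BIAS + writer_byte;
	uint32_t before = after - _QR_BIAS;

	return writer_locked(after) == writer_locked(before);
}
```

Because _QR_BIAS is a multiple of 256, the subtraction only touches
bits 8 and above, so check_agrees() returns 1 for any writer byte and
any reader count.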
On 12/16/2014 10:36 AM, Baoquan He wrote:
> On 12/16/14 at 10:01am, Peter Zijlstra wrote:
>> On Tue, Dec 16, 2014 at 02:00:40PM +0800, Baoquan He wrote:
>>> In queue_read_lock_slowpath, when writer count becomes 0, we need
>>> increment the read count and get the lock. Then need call
>>> rspin_until_writer_unlock to check again if an incoming writer
>>> steals the lock in the gap. But in rspin_until_writer_unlock
>>> it only checks the writer count, namely low 8 bit of lock->cnts,
>>> no need to subtract the reader count unit specifically. So remove
>>> that subtraction to make it clearer, rspin_until_writer_unlock
>>> just takes the actual lock->cnts as the 2nd argument.
>>>
>>> And also change the code comment in queue_write_lock_slowpath to
>>> make it more exact and explicit.
>>>
>>> Signed-off-by: Baoquan He <[email protected]>
>>> ---
>>> kernel/locking/qrwlock.c | 8 ++++----
>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
>>> index f956ede..ae66c10 100644
>>> --- a/kernel/locking/qrwlock.c
>>> +++ b/kernel/locking/qrwlock.c
>>> @@ -76,7 +76,7 @@ void queue_read_lock_slowpath(struct qrwlock *lock)
>>>  	while (atomic_read(&lock->cnts) & _QW_WMASK)
>>>  		cpu_relax_lowlatency();
>>>
>>> -	cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
>>> +	cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
>>>  	rspin_until_writer_unlock(lock, cnts);
>> Did you actually look at the ASM generated? I suspect your change makes
>> it bigger.
>
> It does make it bigger. But it doesn't matter. Because in
> rspin_until_writer_unlock it only compares (cnts & _QW_WMASK)
> with _QW_LOCKED. So using incremented reader count doesn't impact
> the result. Anyway it will get the actual lock->cnts in
> rspin_until_writer_unlock in next loop. I can't see why we need
> subtract that reader count increment specifically.
>
> When I read this code, thought there's some special usage. Finally I
> realized it doesn't have special usage, and doesn't have to do that.
The "- _QR_BIAS" expression was added to simulate xadd() which is
present in x86, but not in some other architectures. There is no
equivalent functionality in the set of atomic helper functions. Anyway,
I have no objection to the change as it is in the slowpath.
Acked-by: Waiman Long <[email protected]>
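
For readers unfamiliar with the xadd idiom mentioned above, here is a
minimal user-space sketch. It uses C11 atomic_fetch_add() as a stand-in
for x86 xadd (both return the value held before the addition), and a
hypothetical add_return() wrapper as a stand-in for the kernel's
atomic_add_return(), which returns the value after the addition. None
of these helper names come from the kernel; the _QR_BIAS value is an
assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define _QR_BIAS (1U << 8)   /* assumed: one reader unit */

/* Stand-in for atomic_add_return(): add i and return the NEW value. */
static uint32_t add_return(uint32_t i, atomic_uint *v)
{
	return atomic_fetch_add(v, i) + i;
}

/* Old value recovered by subtracting the bias again (the removed form). */
static uint32_t old_via_add_return(atomic_uint *v)
{
	return add_return(_QR_BIAS, v) - _QR_BIAS;
}

/* Old value returned directly, as an xadd-style primitive would do. */
static uint32_t old_via_fetch_add(atomic_uint *v)
{
	return atomic_fetch_add(v, _QR_BIAS);
}
```

Both helpers leave the counter incremented by _QR_BIAS and return the
pre-increment value, which is the sense in which "- _QR_BIAS" emulated
xadd.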