2017-03-01 15:54:23

by Peter Zijlstra

Subject: Re: [locking/ww_mutex] 2a0c112828 WARNING: CPU: 0 PID: 18 at kernel/locking/mutex.c:305 __ww_mutex_wakeup_for_backoff

On Wed, Mar 01, 2017 at 11:40:43PM +0800, Fengguang Wu wrote:
> Thanks for the patch! I applied it on top of "locking/ww_mutex:
> Add kselftests for ww_mutex stress" and no longer see "bad unlock
> balance detected", but I do see this warning. Attached is the new
> dmesg, which is a bit large due to lots of repeated errors.

So with all the various patches it works for me.

I also have the following on top, which I wrote while I was looking
through this code trying to figure out what was happening.

Chris, does this make sense to you?

It makes each loop a fully new 'instance'; otherwise we never update
the ww_class->stamp and the threads will always have the same order.
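
For reference, the stamp in question is taken when the acquire context
is initialised. A paraphrased sketch of the relevant part of
ww_acquire_init() (from include/linux/ww_mutex.h; other field
initialisation omitted, and helper names may differ between versions):

static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
				   struct ww_class *ww_class)
{
	ctx->task = current;
	/*
	 * One fresh, monotonically increasing stamp per acquire
	 * context. Under wound/wait the older (lower) stamp wins, so a
	 * context that is never re-initialised keeps the same priority
	 * relative to all the other workers forever.
	 */
	ctx->stamp = atomic_long_inc_return_relaxed(&ww_class->stamp);
	ctx->acquired = 0;
}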

---

diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
index da6c9a34f62f..d0fd06429c9d 100644
--- a/kernel/locking/test-ww_mutex.c
+++ b/kernel/locking/test-ww_mutex.c
@@ -398,12 +398,11 @@ static void stress_inorder_work(struct work_struct *work)
 	if (!order)
 		return;
 
-	ww_acquire_init(&ctx, &ww_class);
-
 	do {
 		int contended = -1;
 		int n, err;
 
+		ww_acquire_init(&ctx, &ww_class);
retry:
 		err = 0;
 		for (n = 0; n < nlocks; n++) {
@@ -433,9 +432,9 @@ static void stress_inorder_work(struct work_struct *work)
 			       __func__, err);
 			break;
 		}
-	} while (--stress->nloops);
 
-	ww_acquire_fini(&ctx);
+		ww_acquire_fini(&ctx);
+	} while (--stress->nloops);
 
 	kfree(order);
 	kfree(stress);
@@ -470,9 +469,9 @@ static void stress_reorder_work(struct work_struct *work)
 	kfree(order);
 	order = NULL;
 
-	ww_acquire_init(&ctx, &ww_class);
-
 	do {
+		ww_acquire_init(&ctx, &ww_class);
+
 		list_for_each_entry(ll, &locks, link) {
 			err = ww_mutex_lock(ll->lock, &ctx);
 			if (!err)
@@ -495,9 +494,9 @@ static void stress_reorder_work(struct work_struct *work)
 		dummy_load(stress);
 		list_for_each_entry(ll, &locks, link)
 			ww_mutex_unlock(ll->lock);
-	} while (--stress->nloops);
 
-	ww_acquire_fini(&ctx);
+		ww_acquire_fini(&ctx);
+	} while (--stress->nloops);
 
 out:
 	list_for_each_entry_safe(ll, ln, &locks, link)
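
With this applied, each loop iteration is a self-contained acquire
cycle. Condensed to its shape (a sketch only; the lock/unlock loop and
the -EDEADLK backoff/retry path are elided as comments):

	do {
		/* take a fresh stamp for this iteration */
		ww_acquire_init(&ctx, &ww_class);

		/* lock all nlocks mutexes against ctx, backing off and
		 * retrying on -EDEADLK, then unlock them all */

		/* this locking instance is done */
		ww_acquire_fini(&ctx);
	} while (--stress->nloops);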


2017-03-01 16:12:15

by Chris Wilson

Subject: Re: [locking/ww_mutex] 2a0c112828 WARNING: CPU: 0 PID: 18 at kernel/locking/mutex.c:305 __ww_mutex_wakeup_for_backoff

On Wed, Mar 01, 2017 at 04:54:14PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 01, 2017 at 11:40:43PM +0800, Fengguang Wu wrote:
> > Thanks for the patch! I applied it on top of "locking/ww_mutex:
> > Add kselftests for ww_mutex stress" and no longer see "bad unlock
> > balance detected", but I do see this warning. Attached is the new
> > dmesg, which is a bit large due to lots of repeated errors.
>
> So with all the various patches it works for me.
>
> I also have the following on top, which I wrote while I was looking
> through this code trying to figure out what was happening.
>
> Chris, does this make sense to you?
>
> It makes each loop a fully new 'instance'; otherwise we never update
> the ww_class->stamp and the threads will always have the same order.

Sounds OK. I just thought the stamp order of the threads was
immaterial - with each test doing a different sequence of locks and
each being identical in behaviour, it would not matter which had
priority; there would have been some shuffling no matter what.
However, for the purpose of testing, having each iteration be a new
locking instance does make its behaviour more like that of a typical
user.
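
For context, "which had priority" is decided purely by comparing the
stamps; roughly along these lines (a condensed sketch of the helper in
kernel/locking/mutex.c, ignoring version-specific tie-breaking):

/* Is context 'a' younger (stamped later) than 'b'? Younger contexts
 * are the ones that get wounded or have to back off. */
static inline bool __ww_ctx_stamp_after(struct ww_acquire_ctx *a,
					struct ww_acquire_ctx *b)
{
	return (signed long)(a->stamp - b->stamp) > 0;
}

So with the stamps frozen, the same worker would win every conflict on
every iteration.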
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2017-03-01 16:27:07

by Chris Wilson

Subject: Re: [locking/ww_mutex] 2a0c112828 WARNING: CPU: 0 PID: 18 at kernel/locking/mutex.c:305 __ww_mutex_wakeup_for_backoff

On Wed, Mar 01, 2017 at 04:11:48PM +0000, Chris Wilson wrote:
> On Wed, Mar 01, 2017 at 04:54:14PM +0100, Peter Zijlstra wrote:
> > On Wed, Mar 01, 2017 at 11:40:43PM +0800, Fengguang Wu wrote:
> > > Thanks for the patch! I applied it on top of "locking/ww_mutex:
> > > Add kselftests for ww_mutex stress" and no longer see "bad unlock
> > > balance detected", but I do see this warning. Attached is the new
> > > dmesg, which is a bit large due to lots of repeated errors.
> >
> > So with all the various patches it works for me.
> >
> > I also have the following on top, which I wrote while I was looking
> > through this code trying to figure out what was happening.
> >
> > Chris, does this make sense to you?
> >
> > It makes each loop a fully new 'instance'; otherwise we never update
> > the ww_class->stamp and the threads will always have the same order.
>
> Sounds OK. I just thought the stamp order of the threads was
> immaterial - with each test doing a different sequence of locks and
> each being identical in behaviour, it would not matter which had
> priority; there would have been some shuffling no matter what.
> However, for the purpose of testing, having each iteration be a new
> locking instance does make its behaviour more like that of a typical
> user.

Correcting myself, the workers didn't reorder the locks, so changing the
stamp does make the test more interesting.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2017-03-01 17:42:37

by Fengguang Wu

Subject: Re: [locking/ww_mutex] 2a0c112828 WARNING: CPU: 0 PID: 18 at kernel/locking/mutex.c:305 __ww_mutex_wakeup_for_backoff

On Wed, Mar 01, 2017 at 05:54:06PM +0100, Peter Zijlstra wrote:
>On Wed, Mar 01, 2017 at 04:26:48PM +0000, Chris Wilson wrote:
>> On Wed, Mar 01, 2017 at 04:11:48PM +0000, Chris Wilson wrote:
>> > On Wed, Mar 01, 2017 at 04:54:14PM +0100, Peter Zijlstra wrote:
>> > > On Wed, Mar 01, 2017 at 11:40:43PM +0800, Fengguang Wu wrote:
>> > > > Thanks for the patch! I applied it on top of "locking/ww_mutex:
>> > > > Add kselftests for ww_mutex stress" and no longer see "bad unlock
>> > > > balance detected", but I do see this warning. Attached is the new
>> > > > dmesg, which is a bit large due to lots of repeated errors.
>> > >
>> > > So with all the various patches it works for me.
>> > >
>> > > I also have the following on top, which I wrote while I was looking
>> > > through this code trying to figure out what was happening.
>> > >
>> > > Chris, does this make sense to you?
>> > >
>> > > It makes each loop a fully new 'instance'; otherwise we never update
>> > > the ww_class->stamp and the threads will always have the same order.
>> >
>> > Sounds OK. I just thought the stamp order of the threads was
>> > immaterial - with each test doing a different sequence of locks and
>> > each being identical in behaviour, it would not matter which had
>> > priority; there would have been some shuffling no matter what.
>> > However, for the purpose of testing, having each iteration be a new
>> > locking instance does make its behaviour more like that of a typical
>> > user.
>>
>> Correcting myself, the workers didn't reorder the locks, so changing the
>> stamp does make the test more interesting.
>
>OK, so I'll go write a Changelog for it then ;-) And stick your ACK on.

With both patches in this thread, all 110 boots are successful without
a single warning.

Tested-by: Fengguang Wu <[email protected]>

Thanks,
Fengguang

2017-03-01 21:44:50

by Peter Zijlstra

Subject: Re: [locking/ww_mutex] 2a0c112828 WARNING: CPU: 0 PID: 18 at kernel/locking/mutex.c:305 __ww_mutex_wakeup_for_backoff

On Wed, Mar 01, 2017 at 04:26:48PM +0000, Chris Wilson wrote:
> On Wed, Mar 01, 2017 at 04:11:48PM +0000, Chris Wilson wrote:
> > On Wed, Mar 01, 2017 at 04:54:14PM +0100, Peter Zijlstra wrote:
> > > On Wed, Mar 01, 2017 at 11:40:43PM +0800, Fengguang Wu wrote:
> > > > Thanks for the patch! I applied it on top of "locking/ww_mutex:
> > > > Add kselftests for ww_mutex stress" and no longer see "bad unlock
> > > > balance detected", but I do see this warning. Attached is the new
> > > > dmesg, which is a bit large due to lots of repeated errors.
> > >
> > > So with all the various patches it works for me.
> > >
> > > I also have the following on top, which I wrote while I was looking
> > > through this code trying to figure out what was happening.
> > >
> > > Chris, does this make sense to you?
> > >
> > > It makes each loop a fully new 'instance'; otherwise we never update
> > > the ww_class->stamp and the threads will always have the same order.
> >
> > Sounds OK. I just thought the stamp order of the threads was
> > immaterial - with each test doing a different sequence of locks and
> > each being identical in behaviour, it would not matter which had
> > priority; there would have been some shuffling no matter what.
> > However, for the purpose of testing, having each iteration be a new
> > locking instance does make its behaviour more like that of a typical
> > user.
>
> Correcting myself, the workers didn't reorder the locks, so changing the
> stamp does make the test more interesting.

OK, so I'll go write a Changelog for it then ;-) And stick your ACK on.