2016-03-16 23:22:20

by Joel A Fernandes

[permalink] [raw]
Subject: RFC on fixing mutex spinning on owner

Hi,

On a fairly recent kernel and android userspace, I am seeing that with
i915 driver is in a spin loop waiting for mutex owner to release it
(mutex_spin_on_owner). I believe this because the owner of the mutex
is running on another CPU and the expectation is the mutex owner
releases the mutex or goes to sleep soon, so we avoid sleeping if we
fail to acquire mutex and continue to spin and try to acquire it much
like a spinlock (while disabling preemption through out the spinning).

My question is, what if the owner cannot or doesn't want to sleep and
holds the mutex runs for a while while holding it. (Lets also assume
that all other tasks are sleeping on the mutex owner's CPU so its not
preempted).

In this case, does it make sense to time out the spinning after a
while? Because preemption is disabled during the spinning so this
spinning business seems a very very bad thing.

Should the code holding the mutex and running (the owner) be fixed to
not hold mutex for a while? Or would a patch introducing a timeout of
a certain threshold on the spinning be welcomed?

To give numbers, I am seeing spinning of as long as 20 ms in the worst
case, while the mutex owner holds the mutex for 22 ms. The ftrace
preemptoff tracer goes off.

Thanks for any advice on what the right fix of the problem should be.

Best,
Joel


2016-03-16 23:35:37

by Greg KH

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Wed, Mar 16, 2016 at 04:22:17PM -0700, Joel Fernandes wrote:
> Hi,
>
> On a fairly recent kernel and android userspace, I am seeing that with
> i915 driver is in a spin loop waiting for mutex owner to release it
> (mutex_spin_on_owner).

Why not provide a backtrace to the drm and i915 developers so that they
can work on this? The mailing lists you sent this to can't do much
about this, sorry...

thanks,

greg k-h

2016-03-16 23:39:00

by Joel A Fernandes

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

Hi Greg,

On Wed, Mar 16, 2016 at 4:35 PM, Greg Kroah-Hartman <[email protected]> wrote:
> On Wed, Mar 16, 2016 at 04:22:17PM -0700, Joel Fernandes wrote:
>> Hi,
>>
>> On a fairly recent kernel and android userspace, I am seeing that with
>> i915 driver is in a spin loop waiting for mutex owner to release it
>> (mutex_spin_on_owner).
>
> Why not provide a backtrace to the drm and i915 developers so that they
> can work on this? The mailing lists you sent this to can't do much
> about this, sorry...

I am not sure if the problem is with the i915 driver, because the
mutex spin on owner stuff is mutex related so the mutex design may
potentially need a tweak (I mentioned a proposal of adding mutex
spinning time outs).
Also since this is latency issue related (I mentioned preemptoff
tracer and preempt disabled), I sent it to linux-rt-users. Thanks for
the tip about sending it to i915 developers, incase no one here has a
say in the matter, I can drop them a note later as well.

Thanks,

Joel

2016-03-17 02:17:56

by Steven Rostedt

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Wed, 16 Mar 2016 16:38:56 -0700
Joel Fernandes <[email protected]> wrote:

> I am not sure if the problem is with the i915 driver, because the
> mutex spin on owner stuff is mutex related so the mutex design may
> potentially need a tweak (I mentioned a proposal of adding mutex
> spinning time outs).
> Also since this is latency issue related (I mentioned preemptoff
> tracer and preempt disabled), I sent it to linux-rt-users. Thanks for
> the tip about sending it to i915 developers, incase no one here has a
> say in the matter, I can drop them a note later as well.

Actually, the preempt off section here is not really an issue:

rcu_read_lock();
while (owner_running(lock, owner)) {
if (need_resched())
break;

cpu_relax_lowlatency();
}
rcu_read_unlock();

Although preemption may be disabled, that "need_resched()" check will
break out of the loop if a higher priority task were to want to run on
this CPU.

I probably should add a hook there to let the preemptoff tracer know
that this is not an issue.

Thanks for the report.

-- Steve

2016-03-17 05:35:25

by Joel A Fernandes

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Wed, Mar 16, 2016 at 7:17 PM, Steven Rostedt <[email protected]> wrote:
>
> On Wed, 16 Mar 2016 16:38:56 -0700
> Joel Fernandes <[email protected]> wrote:
>
> > I am not sure if the problem is with the i915 driver, because the
> > mutex spin on owner stuff is mutex related so the mutex design may
> > potentially need a tweak (I mentioned a proposal of adding mutex
> > spinning time outs).
> > Also since this is latency issue related (I mentioned preemptoff
> > tracer and preempt disabled), I sent it to linux-rt-users. Thanks for
> > the tip about sending it to i915 developers, incase no one here has a
> > say in the matter, I can drop them a note later as well.
>
> Actually, the preempt off section here is not really an issue:
>
> rcu_read_lock();
> while (owner_running(lock, owner)) {
> if (need_resched())
> break;
>
> cpu_relax_lowlatency();
> }
> rcu_read_unlock();
>
> Although preemption may be disabled, that "need_resched()" check will
> break out of the loop if a higher priority task were to want to run on
> this CPU.
>
> I probably should add a hook there to let the preemptoff tracer know
> that this is not an issue.

Thanks Steve! That makes sense. If you do end up adding this hook to
the tracer, I would appreciate it if you could Cc me on the patch so I
could back port it to my kernel as well.

Also, since we are on the topic of preemptoff tracer, I posted a patch
[1] few days ago fixing another issue, if it looks Ok to you could you
pick it up? I had CC'd you on it.

[1] http://www.mail-archive.com/[email protected]/msg1099561.html

Thanks,
Joel

2016-03-17 07:36:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Wed, Mar 16, 2016 at 10:17:51PM -0400, Steven Rostedt wrote:
> Actually, the preempt off section here is not really an issue:
>
> rcu_read_lock();
> while (owner_running(lock, owner)) {
> if (need_resched())
> break;
>
> cpu_relax_lowlatency();
> }
> rcu_read_unlock();
>
> Although preemption may be disabled, that "need_resched()" check will
> break out of the loop if a higher priority task were to want to run on
> this CPU.
>
> I probably should add a hook there to let the preemptoff tracer know
> that this is not an issue.

Urgh, there's a lot of such spin loops all over, that's going to be a
pain to annotate all.

2016-03-17 09:09:17

by Nicholas Mc Guire

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, Mar 17, 2016 at 08:36:05AM +0100, Peter Zijlstra wrote:
> On Wed, Mar 16, 2016 at 10:17:51PM -0400, Steven Rostedt wrote:
> > Actually, the preempt off section here is not really an issue:
> >
> > rcu_read_lock();
> > while (owner_running(lock, owner)) {
> > if (need_resched())
> > break;
> >
> > cpu_relax_lowlatency();
> > }
> > rcu_read_unlock();
> >
> > Although preemption may be disabled, that "need_resched()" check will
> > break out of the loop if a higher priority task were to want to run on
> > this CPU.
> >
> > I probably should add a hook there to let the preemptoff tracer know
> > that this is not an issue.
>
> Urgh, there's a lot of such spin loops all over, that's going to be a
> pain to annotate all.

scanning for that patter with a quite relaxed spatch did not
turn up more than a hand full:

@resched_spin exists@
position p;
@@

(
* while@p (...) {
...
if (need_resched() || ...)
break;
...
\(cpu_relax\|cpu_relax_lowlatency\)();
}
|
* while@p (!need_resched()) {
...
\(cpu_relax\|cpu_relax_lowlatency\)();
}
)

@script:python@
p << resched_spin.p;
@@
print "%s:%s " % (p[0].file,p[0].line)


is this making some wrong assumptions here or is this
really so infrequent ?

thx!
hofrat

2016-03-17 10:18:29

by Peter Zijlstra

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, Mar 17, 2016 at 08:05:26AM +0000, Nicholas Mc Guire wrote:
> scanning for that patter with a quite relaxed spatch did not
> turn up more than a hand full:
>
> @resched_spin exists@
> position p;
> @@
>
> (
> * while@p (...) {
> ...
> if (need_resched() || ...)
> break;
> ...
> \(cpu_relax\|cpu_relax_lowlatency\)();
> }
> |
> * while@p (!need_resched()) {
> ...
> \(cpu_relax\|cpu_relax_lowlatency\)();
> }
> )
>
> @script:python@
> p << resched_spin.p;
> @@
> print "%s:%s " % (p[0].file,p[0].line)
>
>
> is this making some wrong assumptions here or is this
> really so infrequent ?

Ah, a tool, nice! :-)

There should be one in the osq_lock, one in mutex, one in rwsem-xadd,
davidlohr was adding one to rt_mutex; and these are the one I can
remember from the top of my head.

But I would not be surprised if there are a fair few more. Also for (;;)
is a frequent loop pattern.

My biggest worry is how to not forget adding this annotation if we
create yet another instance of this. The tool could help I suppose.

Also, maybe the tracer should measure the time from need_resched()
getting true until the next preemption point, instead of the entire time
preemption was disabled. Which would avoid the entire issue altogether.

2016-03-17 11:17:51

by Thomas Gleixner

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> Also, maybe the tracer should measure the time from need_resched()
> getting true until the next preemption point, instead of the entire time
> preemption was disabled. Which would avoid the entire issue altogether.

Well, that only gives you the information on a actual preemption, but not
information about long preempt disabled regions which can cause a problem
eventually.

Thanks,

tglx

2016-03-17 12:06:36

by Steven Rostedt

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, 17 Mar 2016 12:16:11 +0100 (CET)
Thomas Gleixner <[email protected]> wrote:

> On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> > Also, maybe the tracer should measure the time from need_resched()
> > getting true until the next preemption point, instead of the entire time
> > preemption was disabled. Which would avoid the entire issue altogether.
>
> Well, that only gives you the information on a actual preemption, but not
> information about long preempt disabled regions which can cause a problem
> eventually.
>

Actually, I was thinking the reverse. If need_resched() is called and
is false, then do a reset of the preemption time. But if need_resched()
is true, then do nothing, as that would measure the total time preempt
disable was set and a task could not schedule.

Question is, should this be a hook and each location audited, or add
this to need_resched() itself?

-- Steve

2016-03-17 12:08:36

by Steven Rostedt

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Wed, 16 Mar 2016 22:35:21 -0700
Joel Fernandes <[email protected]> wrote:


> Also, since we are on the topic of preemptoff tracer, I posted a patch
> [1] few days ago fixing another issue, if it looks Ok to you could you
> pick it up? I had CC'd you on it.
>
> [1] http://www.mail-archive.com/[email protected]/msg1099561.html

Yeah I saw that. I've been busy debugging work code. In other words,
doing my day job ;-) I'm hoping to spend today working on mainline
again.

Thanks,

-- Steve

2016-03-17 12:14:55

by Peter Zijlstra

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, Mar 17, 2016 at 08:06:29AM -0400, Steven Rostedt wrote:
> On Thu, 17 Mar 2016 12:16:11 +0100 (CET)
> Thomas Gleixner <[email protected]> wrote:
>
> > On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> > > Also, maybe the tracer should measure the time from need_resched()
> > > getting true until the next preemption point, instead of the entire time
> > > preemption was disabled. Which would avoid the entire issue altogether.
> >
> > Well, that only gives you the information on a actual preemption, but not
> > information about long preempt disabled regions which can cause a problem
> > eventually.
> >
>
> Actually, I was thinking the reverse. If need_resched() is called and
> is false, then do a reset of the preemption time. But if need_resched()
> is true, then do nothing, as that would measure the total time preempt
> disable was set and a task could not schedule.
>
> Question is, should this be a hook and each location audited, or add
> this to need_resched() itself?

Is anybody calling need_resched() and then not doing anything with the
value?

2016-03-17 13:34:31

by Thomas Gleixner

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> On Thu, Mar 17, 2016 at 08:06:29AM -0400, Steven Rostedt wrote:
> > On Thu, 17 Mar 2016 12:16:11 +0100 (CET)
> > Thomas Gleixner <[email protected]> wrote:
> >
> > > On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> > > > Also, maybe the tracer should measure the time from need_resched()
> > > > getting true until the next preemption point, instead of the entire time
> > > > preemption was disabled. Which would avoid the entire issue altogether.
> > >
> > > Well, that only gives you the information on a actual preemption, but not
> > > information about long preempt disabled regions which can cause a problem
> > > eventually.
> > >
> >
> > Actually, I was thinking the reverse. If need_resched() is called and
> > is false, then do a reset of the preemption time. But if need_resched()
> > is true, then do nothing, as that would measure the total time preempt
> > disable was set and a task could not schedule.
> >
> > Question is, should this be a hook and each location audited, or add
> > this to need_resched() itself?
>
> Is anybody calling need_resched() and then not doing anything with the
> value?

Probably not. So Stevens idea makes a lot of sense.

Thanks,

tglx


2016-03-17 16:40:41

by Nicholas Mc Guire

[permalink] [raw]
Subject: Re: RFC on fixing mutex spinning on owner

On Thu, Mar 17, 2016 at 02:32:54PM +0100, Thomas Gleixner wrote:
> On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> > On Thu, Mar 17, 2016 at 08:06:29AM -0400, Steven Rostedt wrote:
> > > On Thu, 17 Mar 2016 12:16:11 +0100 (CET)
> > > Thomas Gleixner <[email protected]> wrote:
> > >
> > > > On Thu, 17 Mar 2016, Peter Zijlstra wrote:
> > > > > Also, maybe the tracer should measure the time from need_resched()
> > > > > getting true until the next preemption point, instead of the entire time
> > > > > preemption was disabled. Which would avoid the entire issue altogether.
> > > >
> > > > Well, that only gives you the information on a actual preemption, but not
> > > > information about long preempt disabled regions which can cause a problem
> > > > eventually.
> > > >
> > >
> > > Actually, I was thinking the reverse. If need_resched() is called and
> > > is false, then do a reset of the preemption time. But if need_resched()
> > > is true, then do nothing, as that would measure the total time preempt
> > > disable was set and a task could not schedule.
> > >
> > > Question is, should this be a hook and each location audited, or add
> > > this to need_resched() itself?
> >
> > Is anybody calling need_resched() and then not doing anything with the
> > value?
>
> Probably not. So Stevens idea makes a lot of sense.
>

Provably not (provided I did not screw up the spec). So all need_resched
are ither in a if|while condition, an argument to a function or part of a
return value.

<snip>
@need_resched exists@
identifier func,val;
statement S1,S2;
expression E1,E2;
position p;
@@

(
if(<+... need_resched()@p ...+>) S1 else S2
|
while (<+... need_resched()@p ...+>) S1
//|
// Do while not supported in coccinelle :(
// do { ... } while (<+... need_resched()@p ...+>);
|
val = need_resched()@p ? E1 : E2;
|
func(..., need_resched()@p, ...);
|
return <+... need_resched()@p ...+>;
)

@need_resched_unused exists@
position p != need_resched.p;
@@

* need_resched()@p

@script:python@
p << need_resched_unused.p;
@@
print "need_resched() in %s line %s not used" % (p[0].file,p[0].line)
<snip>

except for the unhandled do{...} while(need_resched()); cases as cocci
currently does not support do{}while(); constructs this does not report
any unused need_resched() cases and the 5 reported findings:
need_resched() in ./include/net/busy_poll.h line 112 not used
need_resched() in ./kernel/sched/core.c line 3453 not used
need_resched() in ./kernel/sched/core.c line 3573 not used
need_resched() in ./kernel/sched/core.c line 3546 not used
need_resched() in ./kernel/sched/core.c line 3406 not used
where manually checked and are all do{}while(need_resched()); cases.

thx!
hofrat