2015-02-24 00:49:05

by Jörn Engel

[permalink] [raw]
Subject: RFC: revert 43fa5460fe60

Hello Steven!

I came across a silly problem that tempted me to revert 43fa5460fe60.
We had a high-priority realtime thread woken, TIF_NEED_RESCHED was set
for the running thread and the realtime thread didn't run for >2s.
Problem was a system call that mapped a ton of device memory and never
hit a cond_resched() point. Obvious solution is to fix the long-running
system call.

Applying that solution quickly turns into a game of whack-a-mole. Not
the worst game in the world and all those moles surely deserve a solid
hit on the head. But what is annoying in my case is that I had plenty
of idle cpus during the entire time and the high-priority thread was
allowed to run anywhere. So if the thread had been moved to a different
runqueue immediately there would have been zero delay. Sure, the cache
is semi-cold or the migration may even be cross-package. That is a
tradeoff we are willing to make and we set the cpumask explicitly that
way. We want this thread to run quickly, anywhere.

So we could check for currently idle cpus when waking realtime threads.
If there are any, immediately move the woken thread over. Maybe have a
check whether the running thread on the current cpu is in a syscall and
retain current behaviour if not.

Now this is not quite the same as reverting 43fa5460fe60 and I would
like to verify the idea before I spend time on a patch you would never
consider merging anyway.

J?rn

--
As more agents are added, systems become more reliable in the total-effort
case, but less reliable in the weakest-link case. What are the implications?
Well, software companies schould hire more software testers and fewer (but
more competent) programmers.
-- Ross Anderson


2015-02-24 00:52:44

by Jörn Engel

[permalink] [raw]
Subject: Re: RFC: revert 43fa5460fe60

Looks like Gregory Haskins' email bounces. Should have figured as much.

J?rn

--
The only real mistake is the one from which we learn nothing.
-- John Powell

2015-02-24 07:30:26

by Ingo Molnar

[permalink] [raw]
Subject: Re: RFC: revert 43fa5460fe60 ("sched: Try not to migrate higher priority RT tasks")


* J?rn Engel <[email protected]> wrote:

> Hello Steven!
>
> I came across a silly problem that tempted me to revert 43fa5460fe60.

So just to save everyone having to go to the Git tree to
remember which patch that was:

43fa5460fe60 ("sched: Try not to migrate higher priority RT tasks")

Thanks,

Ingo

2015-02-24 15:33:49

by Steven Rostedt

[permalink] [raw]
Subject: Re: RFC: revert 43fa5460fe60

On Tue, 24 Feb 2015 09:55:09 -0500
Jörn Engel <[email protected]> wrote:

> Hello Steven!
>
> I came across a silly problem that tempted me to revert 43fa5460fe60.
> We had a high-priority realtime thread woken, TIF_NEED_RESCHED was set
> for the running thread and the realtime thread didn't run for >2s.
> Problem was a system call that mapped a ton of device memory and never
> hit a cond_resched() point. Obvious solution is to fix the long-running
> system call.

I'm assuming that you are running a non PREEMPT kernel
(PREEMPT_VOLUNTARY doesn't count). If that's the case, then it is very
much likely that you will hit long latencies. Regardless of this change
or not (you will continue to need to whack-a-mole here).

>
> Applying that solution quickly turns into a game of whack-a-mole. Not
> the worst game in the world and all those moles surely deserve a solid
> hit on the head. But what is annoying in my case is that I had plenty
> of idle cpus during the entire time and the high-priority thread was
> allowed to run anywhere. So if the thread had been moved to a different
> runqueue immediately there would have been zero delay. Sure, the cache
> is semi-cold or the migration may even be cross-package. That is a
> tradeoff we are willing to make and we set the cpumask explicitly that
> way. We want this thread to run quickly, anywhere.
>
> So we could check for currently idle cpus when waking realtime threads.
> If there are any, immediately move the woken thread over. Maybe have a
> check whether the running thread on the current cpu is in a syscall and
> retain current behaviour if not.
>
> Now this is not quite the same as reverting 43fa5460fe60 and I would
> like to verify the idea before I spend time on a patch you would never
> consider merging anyway.

The thing is, the patch you want to revert helped a lot for
CONFIG_PREEMPT kernels. It doesn't make sense to gut a change for a non
CONFIG_PREEMPT kernel as that already suffers huge latencies.

This is considering that you are indeed not running a CONFIG_PREEMPT
kernel, as you stated that it doesn't preempt until it hits a
cond_resched().

-- Steve

2015-02-24 17:19:14

by Jörn Engel

[permalink] [raw]
Subject: Re: RFC: revert 43fa5460fe60

On Tue, Feb 24, 2015 at 10:33:44AM -0500, Steven Rostedt wrote:
> On Tue, 24 Feb 2015 09:55:09 -0500
> J?rn Engel <[email protected]> wrote:
>
> > I came across a silly problem that tempted me to revert 43fa5460fe60.
> > We had a high-priority realtime thread woken, TIF_NEED_RESCHED was set
> > for the running thread and the realtime thread didn't run for >2s.
> > Problem was a system call that mapped a ton of device memory and never
> > hit a cond_resched() point. Obvious solution is to fix the long-running
> > system call.
>
> I'm assuming that you are running a non PREEMPT kernel
> (PREEMPT_VOLUNTARY doesn't count). If that's the case, then it is very
> much likely that you will hit long latencies. Regardless of this change
> or not (you will continue to need to whack-a-mole here).

Correct.

> > Applying that solution quickly turns into a game of whack-a-mole. Not
> > the worst game in the world and all those moles surely deserve a solid
> > hit on the head. But what is annoying in my case is that I had plenty
> > of idle cpus during the entire time and the high-priority thread was
> > allowed to run anywhere. So if the thread had been moved to a different
> > runqueue immediately there would have been zero delay. Sure, the cache
> > is semi-cold or the migration may even be cross-package. That is a
> > tradeoff we are willing to make and we set the cpumask explicitly that
> > way. We want this thread to run quickly, anywhere.
> >
> > So we could check for currently idle cpus when waking realtime threads.
> > If there are any, immediately move the woken thread over. Maybe have a
> > check whether the running thread on the current cpu is in a syscall and
> > retain current behaviour if not.
> >
> > Now this is not quite the same as reverting 43fa5460fe60 and I would
> > like to verify the idea before I spend time on a patch you would never
> > consider merging anyway.
>
> The thing is, the patch you want to revert helped a lot for
> CONFIG_PREEMPT kernels. It doesn't make sense to gut a change for a non
> CONFIG_PREEMPT kernel as that already suffers huge latencies.
>
> This is considering that you are indeed not running a CONFIG_PREEMPT
> kernel, as you stated that it doesn't preempt until it hits a
> cond_resched().

Well, reverting was my first instinct, but for different reasons I think
it is wrong. Simply reverting can result in the high priority thread
moving from one cpu with a running process to a different cpu with a
running process. In both cases you may trip over a mole, so nothing
much is gained.

But if you know that the destination cpu is idle, you can avoid any
moles, give or take a small race window maybe. The moles are still
present and you still need some debug tool to detect them and fix them
over time. But as cpus increase over time, your chances of getting
lucky in spite of bad kernel code also increase.

Is that a worthwhile approach, at least for non PREEMPT?

J?rn

--
Schr?dinger's cat is <BLINK>not</BLINK> dead.
-- Illiad

2015-02-24 17:41:18

by Steven Rostedt

[permalink] [raw]
Subject: Re: RFC: revert 43fa5460fe60

On Tue, 24 Feb 2015 09:19:06 -0800
Jörn Engel <[email protected]> wrote:

> Well, reverting was my first instinct, but for different reasons I think
> it is wrong. Simply reverting can result in the high priority thread
> moving from one cpu with a running process to a different cpu with a
> running process. In both cases you may trip over a mole, so nothing
> much is gained.
>
> But if you know that the destination cpu is idle, you can avoid any
> moles, give or take a small race window maybe. The moles are still
> present and you still need some debug tool to detect them and fix them
> over time. But as cpus increase over time, your chances of getting
> lucky in spite of bad kernel code also increase.
>
> Is that a worthwhile approach, at least for non PREEMPT?

I don't know. Could probably add it if CONFIG_PREEMPT is not set. Just
check if an idle CPU is available in that case and move it, as non
PREEMPT kernels will have long latencies anyway.

-- Steve