2019-08-01 12:12:03

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH 0/5] Fix FIFO-99 abuse

I noticed a bunch of kthreads defaulted to FIFO-99, fix them.

The generic default is FIFO-50, the admin will have to configure the system
anyway.

For some the purpose is to be above OTHER and then FIFO-1 really is sufficient.


2019-08-01 13:38:35

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On 08/01/19 13:13, Peter Zijlstra wrote:
> I noticed a bunch of kthreads defaulted to FIFO-99, fix them.
>
> The generic default is FIFO-50, the admin will have to configure the system
> anyway.
>
> For some the purpose is to be above OTHER and then FIFO-1 really is sufficient.

I was looking in this area too and was thinking of a way to consolidate the
creation of RT/DL tasks in the kernel and the way we set the priority.

Does it make sense to create a new header for RT priorities for kthreads
created in the kernel so that we can easily track and rationale about the
relative priorities of in-kernel RT tasks?

When working in the FW world such a header helped a lot in understanding what
runs at each priority level and how to reason about what priority level makes
sense for a new item. It could be a nice single point of reference; even for
admins.

Cheers

--
Qais Yousef

2019-08-02 09:46:05

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On Thu, Aug 01, 2019 at 02:17:07PM +0100, Qais Yousef wrote:
> On 08/01/19 13:13, Peter Zijlstra wrote:
> > I noticed a bunch of kthreads defaulted to FIFO-99, fix them.
> >
> > The generic default is FIFO-50, the admin will have to configure the system
> > anyway.
> >
> > For some the purpose is to be above OTHER and then FIFO-1 really is sufficient.
>
> I was looking in this area too and was thinking of a way to consolidate the
> creation of RT/DL tasks in the kernel and the way we set the priority.
>
> Does it make sense to create a new header for RT priorities for kthreads
> created in the kernel so that we can easily track and rationale about the
> relative priorities of in-kernel RT tasks?
>
> When working in the FW world such a header helped a lot in understanding what
> runs at each priority level and how to reason about what priority level makes
> sense for a new item. It could be a nice single point of reference; even for
> admins.

Well, SCHED_FIFO is a broken scheduler model; that is, it is
fundamentally incapable of resource management, which is the one thing
an OS really should be doing.

This is of course the reason it is limited to privileged users only.

Worse still; it is fundamentally impossible to compose static priority
workloads. You cannot take two correctly working static prio workloads
and smash them together and still expect them to work.

For this reason 'all' FIFO tasks the kernel creates are basically at:

MAX_RT_PRIO / 2

The administrator _MUST_ configure the system, the kernel simply doesn't
know enough information to make a sensible choice.

Now, Geert suggested so make make a define for that, but how about we do
something like:

/*
* ${the above explanation}
*/
int kernel_setscheduler_fifo(struct task_struct *p)
{
struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
}

And then take away sched_setscheduler*().

2019-08-02 13:33:13

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On Fri, 2 Aug 2019, Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 02:17:07PM +0100, Qais Yousef wrote:
> > On 08/01/19 13:13, Peter Zijlstra wrote:
> > > I noticed a bunch of kthreads defaulted to FIFO-99, fix them.
> > >
> > > The generic default is FIFO-50, the admin will have to configure the system
> > > anyway.
> > >
> > > For some the purpose is to be above OTHER and then FIFO-1 really is sufficient.
> >
> > I was looking in this area too and was thinking of a way to consolidate the
> > creation of RT/DL tasks in the kernel and the way we set the priority.
> >
> > Does it make sense to create a new header for RT priorities for kthreads
> > created in the kernel so that we can easily track and rationale about the
> > relative priorities of in-kernel RT tasks?
> >
> > When working in the FW world such a header helped a lot in understanding what
> > runs at each priority level and how to reason about what priority level makes
> > sense for a new item. It could be a nice single point of reference; even for
> > admins.
>
> Well, SCHED_FIFO is a broken scheduler model; that is, it is
> fundamentally incapable of resource management, which is the one thing
> an OS really should be doing.
>
> This is of course the reason it is limited to privileged users only.
>
> Worse still; it is fundamentally impossible to compose static priority
> workloads. You cannot take two correctly working static prio workloads
> and smash them together and still expect them to work.
>
> For this reason 'all' FIFO tasks the kernel creates are basically at:
>
> MAX_RT_PRIO / 2
>
> The administrator _MUST_ configure the system, the kernel simply doesn't
> know enough information to make a sensible choice.
>
> Now, Geert suggested so make make a define for that, but how about we do
> something like:
>
> /*
> * ${the above explanation}
> */
> int kernel_setscheduler_fifo(struct task_struct *p)
> {
> struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
> return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
> }
>
> And then take away sched_setscheduler*().

Yes, please.

tglx

2019-08-02 13:38:27

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On 08/02/19 11:32, Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 02:17:07PM +0100, Qais Yousef wrote:
> > On 08/01/19 13:13, Peter Zijlstra wrote:
> > > I noticed a bunch of kthreads defaulted to FIFO-99, fix them.
> > >
> > > The generic default is FIFO-50, the admin will have to configure the system
> > > anyway.
> > >
> > > For some the purpose is to be above OTHER and then FIFO-1 really is sufficient.
> >
> > I was looking in this area too and was thinking of a way to consolidate the
> > creation of RT/DL tasks in the kernel and the way we set the priority.
> >
> > Does it make sense to create a new header for RT priorities for kthreads
> > created in the kernel so that we can easily track and rationale about the
> > relative priorities of in-kernel RT tasks?
> >
> > When working in the FW world such a header helped a lot in understanding what
> > runs at each priority level and how to reason about what priority level makes
> > sense for a new item. It could be a nice single point of reference; even for
> > admins.
>
> Well, SCHED_FIFO is a broken scheduler model; that is, it is
> fundamentally incapable of resource management, which is the one thing
> an OS really should be doing.
>
> This is of course the reason it is limited to privileged users only.
>
> Worse still; it is fundamentally impossible to compose static priority
> workloads. You cannot take two correctly working static prio workloads
> and smash them together and still expect them to work.
>
> For this reason 'all' FIFO tasks the kernel creates are basically at:
>
> MAX_RT_PRIO / 2
>
> The administrator _MUST_ configure the system, the kernel simply doesn't
> know enough information to make a sensible choice.
>
> Now, Geert suggested so make make a define for that, but how about we do
> something like:
>
> /*
> * ${the above explanation}
> */
> int kernel_setscheduler_fifo(struct task_struct *p)
> {
> struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
> return sched_setscheduler_nocheck(p, SCHED_FIFO, &sp);
> }
>
> And then take away sched_setscheduler*().

Yes a somewhat enforced default makes more sense to me. I assume you no longer
want to put the kthreads that just need to be above OTHER in FIFO-1?

While at it, since we will cram all kthreads on the same priority, isn't
a SCHED_RR a better choice now? I think the probability of a clash is pretty
low, but when it happens, shouldn't we try to guarantee some fairness?

--
Qais Yousef

2019-08-02 23:21:08

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On Fri, Aug 02, 2019 at 11:26:12AM +0100, Qais Yousef wrote:

> Yes a somewhat enforced default makes more sense to me. I assume you no longer
> want to put the kthreads that just need to be above OTHER in FIFO-1?

I'm not sure, maybe, there's not that many of them, but possibly we add
another interface for them.

> While at it, since we will cram all kthreads on the same priority, isn't
> a SCHED_RR a better choice now? I think the probability of a clash is pretty
> low, but when it happens, shouldn't we try to guarantee some fairness?

It's never been a problem, and aside from these few straggler threads,
everybody has effectively been there already for years, so if it were a
problem someone would've complained by now.

Also; like said before, the admin had better configure.

Also also, RR-SMP is actually broken (and nobody has cared enough to
bother fixing it).

2019-08-03 14:03:30

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On 08/02/19 14:41, Peter Zijlstra wrote:
> On Fri, Aug 02, 2019 at 11:26:12AM +0100, Qais Yousef wrote:
>
> > Yes a somewhat enforced default makes more sense to me. I assume you no longer
> > want to put the kthreads that just need to be above OTHER in FIFO-1?
>
> I'm not sure, maybe, there's not that many of them, but possibly we add
> another interface for them.

By the way, did you see this one which is set to priority 16?

https://elixir.bootlin.com/linux/v5.3-rc2/source/drivers/gpu/drm/msm/msm_drv.c#L523

>
> > While at it, since we will cram all kthreads on the same priority, isn't
> > a SCHED_RR a better choice now? I think the probability of a clash is pretty
> > low, but when it happens, shouldn't we try to guarantee some fairness?
>
> It's never been a problem, and aside from these few straggler threads,
> everybody has effectively been there already for years, so if it were a
> problem someone would've complained by now.

Usually they can run on enough CPUs so a real clash is definitely hard.

I'm trying to collect data on that, if I find something interesting I'll share
it.

>
> Also; like said before, the admin had better configure.

I agree. But I don't think an 'admin' is an easily defined entity for all
systems. On mobile, is it the SoC vendor, Android framework, or the
handset/platform vendor/integrator?

In a *real* realtime system I think things are better defined. But usage of RT
tasks on generic systems is the confusing part. There's no real ownership and
things are more ad-hoc.

>
> Also also, RR-SMP is actually broken (and nobody has cared enough to
> bother fixing it).

If you can give me enough pointers to understand the problem I might be able to
bother with it :-)

Thanks

--
Qais Yousef

2019-08-03 14:10:33

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On Fri, Aug 02, 2019 at 03:08:54PM +0100, Qais Yousef wrote:
> On 08/02/19 14:41, Peter Zijlstra wrote:
> > On Fri, Aug 02, 2019 at 11:26:12AM +0100, Qais Yousef wrote:
> >
> > > Yes a somewhat enforced default makes more sense to me. I assume you no longer
> > > want to put the kthreads that just need to be above OTHER in FIFO-1?
> >
> > I'm not sure, maybe, there's not that many of them, but possibly we add
> > another interface for them.
>
> By the way, did you see this one which is set to priority 16?
>
> https://elixir.bootlin.com/linux/v5.3-rc2/source/drivers/gpu/drm/msm/msm_drv.c#L523

I did, I ignored it because it wasn't 99 or something silly like that,
but I'd definitely mop it up when doing the proposed.

> > Also; like said before, the admin had better configure.
>
> I agree. But I don't think an 'admin' is an easily defined entity for all
> systems. On mobile, is it the SoC vendor, Android framework, or the
> handset/platform vendor/integrator?

Mostly Android I suspect, but if SoC specific drivers have RT threads,
it's their responsibility to integrate properly with the rest of Android
of course.

> > Also also, RR-SMP is actually broken (and nobody has cared enough to
> > bother fixing it).
>
> If you can give me enough pointers to understand the problem I might be able to
> bother with it :-)

So the push-pull balancer we have (designed for FIFO but also applied to
RR) will only move a task if the destination CPU has a lower prio. In
the case where one CPU has 3 tasks and the other 1, and they're all the
same prio, it does nothing. For FIFO that is fine, for RR, not so much.

Because then the one CPU will RR between 3 tasks, giving each task
1/3rd, while the other will only run the one task.

2019-08-03 14:33:49

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH 0/5] Fix FIFO-99 abuse

On 08/02/19 16:33, Peter Zijlstra wrote:
> > > Also also, RR-SMP is actually broken (and nobody has cared enough to
> > > bother fixing it).
> >
> > If you can give me enough pointers to understand the problem I might be able to
> > bother with it :-)
>
> So the push-pull balancer we have (designed for FIFO but also applied to
> RR) will only move a task if the destination CPU has a lower prio. In
> the case where one CPU has 3 tasks and the other 1, and they're all the
> same prio, it does nothing. For FIFO that is fine, for RR, not so much.
>
> Because then the one CPU will RR between 3 tasks, giving each task
> 1/3rd, while the other will only run the one task.

Okay so what we need to do is keep a count of number of RR tasks on each CPU
(which we already do IIRC) and take that into account in
find_loweset_rq()/cpupri_find().

Let me see if I can create a test case then hack something.

Thanks

--
Qais Yousef