Hi Peter,

On Wed, Mar 30, 2022 at 08:23:27PM +0200 Peter Zijlstra wrote:
> On Mon, Mar 28, 2022 at 12:44:54PM -0400, Steven Rostedt wrote:
> > On Mon, 28 Mar 2022 17:56:07 +0200
> > Peter Zijlstra <[email protected]> wrote:
> >
> > > > echo $$ > test/cgroup.procs
> > > > taskset -c 1 bash -c "while true; do let i++; done" --> will be throttled
> > >
> > > Of course.. I'm arguing that bandwidth control and NOHZ_FULL are somewhat
> > > mutually exclusive, use-case wise. So I really don't get why you'd want
> > > them both.
> >
> > Is it?
> >
> > One use case I can see for having both is for having a deadline task that
> > needs to get something done in a tight deadline. NOHZ_FULL means "do not
> > interrupt this task when it is the top priority task on the CPU and is
> > running in user space".
>
> This is absolute batshit.. It means no such thing. We'll happily wake
> another task to this CPU and re-enable the tick any instant.
>
> Worse; the use-case at hand pertains to cfs bandwidth control, which
> pretty much guarantees there *will* be an interrupt.

The problem is (at least in some cases) that container orchestration userspace
code allocates a whole CPU by setting quota == period, or 3 CPUs as
quota == 3*period, etc.

In cases where an isolated task is expected to run uninterrupted (only task in
the system affined to that cpu, nohz_full, nocbs etc.) you can end up with it
getting throttled even though it theoretically has enough bandwidth for the full
cpu and therefore should never get throttled.

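To make the quota == period allocation concrete, here is a minimal sketch. It
assumes the cgroup v2 cpu.max interface ("<quota> <period>", in microseconds)
and the default 100ms period; cpu_max_for is a hypothetical helper, not
something any particular orchestrator ships.

```shell
#!/bin/sh
# Hypothetical helper: print a cgroup v2 cpu.max value ("<quota> <period>",
# both in microseconds) that grants NCPUS whole CPUs, i.e.
# quota == NCPUS * period.
cpu_max_for() {
    ncpus=$1
    period_us=${2:-100000}   # assumed default: the 100ms CFS period
    echo "$((ncpus * period_us)) $period_us"
}

cpu_max_for 1    # one full CPU:    100000 100000
cpu_max_for 3    # three full CPUs: 300000 100000

# Applying a value needs root and an existing group. "test" is the group name
# from the example quoted earlier in the thread; the mount point is assumed:
#   cpu_max_for 1 > /sys/fs/cgroup/test/cpu.max
```

With a setting like this the group is nominally entitled to the whole CPU, which
is why throttling of the isolated task is unexpected.
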
There are radio network setups where the packet processing is isolated
like this but the system as a whole is managed by container orchestration so
everything has cfs bandwidth quotas set.

I don't think generally the bandwidth controls in these cases are used for
CPU sharing (quota < period). I agree that doesn't make much sense with NOHZ_FULL
and won't work right.

It's doled out as full cpu(s) in these cases.

That's not a VM case so is likely different from the one that started this thread
but I thought I should mention it.

Cheers,
Phil
>
> > Why is it mutually exclusive to have a deadline task that does not want to
> > be interrupted by timer interrupts?
>
> This has absolutely nothing to do with deadline tasks, nada, noppes.
>
> > Just because the biggest pushers of NOHZ_FULL is for those that are running
> > RT tasks completely in user space and even want to fault if it ever goes
> > into the kernel, doesn't mean that's the only use case.
>
> Because there's costs associated with the whole thing. system entry/exit
> get far more expensive. It just doesn't make much sense to use NOHZ_FULL
> if you're not absolutely limiting system entry.
>
> > Chengming brought up VMs. That's a case to want to control the bandwidth,
> > but also not interrupt them with timer interrupts when they are running as
> > the top priority task on a CPU.
>
> It's CFS, there is nothing top priority about that.
>
--
Hi,

On 2022/3/31 03:14, Phil Auld wrote:
> Hi Peter,
>
> On Wed, Mar 30, 2022 at 08:23:27PM +0200 Peter Zijlstra wrote:
>> On Mon, Mar 28, 2022 at 12:44:54PM -0400, Steven Rostedt wrote:
>>> On Mon, 28 Mar 2022 17:56:07 +0200
>>> Peter Zijlstra <[email protected]> wrote:
>>>
>>>>> echo $$ > test/cgroup.procs
>>>>> taskset -c 1 bash -c "while true; do let i++; done" --> will be throttled
>>>>
>>>> Of course.. I'm arguing that bandwidth control and NOHZ_FULL are somewhat
>>>> mutually exclusive, use-case wise. So I really don't get why you'd want
>>>> them both.
>>>
>>> Is it?
>>>
>>> One use case I can see for having both is for having a deadline task that
>>> needs to get something done in a tight deadline. NOHZ_FULL means "do not
>>> interrupt this task when it is the top priority task on the CPU and is
>>> running in user space".
>>
>> This is absolute batshit.. It means no such thing. We'll happily wake
>> another task to this CPU and re-enable the tick any instant.
>>
>> Worse; the use-case at hand pertains to cfs bandwidth control, which
>> pretty much guarantees there *will* be an interrupt.
>
> The problem is (at least in some cases) that container orchestration userspace
> code allocates a whole CPU by setting quota == period. Or 3 cpus as 3*period etc.
>
> In cases where an isolated task is expected to run uninterrupted (only task in
> the system affined to that cpu, nohz_full, nocbs etc) you can end up with it
> getting throttled even though it theoretically has enough bandwidth for the full
> cpu and therefore should never get throttled.
>
> There are radio network setups where the packet processing is isolated
> like this but the system as a whole is managed by container orchestration so
> everything has cfs bandwidth quotas set.
>
> I don't think generally the bandwidth controls in these cases are used for
> CPU sharing (quota < period). I agree that doesn't make much sense with NOHZ_FULL
> and won't work right.
>
> It's doled out as full cpu(s) in these cases.
>
> That's not a VM case so is likely different from the one that started this thread
> but I thought I should mention it.

Yes, it's a different use-case from ours. Thanks for sharing with us. I should
put these in the patch log and send an updated version.

Thanks.

>
>
> Cheers,
> Phil
>
>>
>>> Why is it mutually exclusive to have a deadline task that does not want to
>>> be interrupted by timer interrupts?
>>
>> This has absolutely nothing to do with deadline tasks, nada, noppes.
>>
>>> Just because the biggest pushers of NOHZ_FULL is for those that are running
>>> RT tasks completely in user space and even want to fault if it ever goes
>>> into the kernel, doesn't mean that's the only use case.
>>
>> Because there's costs associated with the whole thing. system entry/exit
>> get far more expensive. It just doesn't make much sense to use NOHZ_FULL
>> if you're not absolutely limiting system entry.
>>
>>> Chengming brought up VMs. That's a case to want to control the bandwidth,
>>> but also not interrupt them with timer interrupts when they are running as
>>> the top priority task on a CPU.
>>
>> It's CFS, there is nothing top priority about that.
>>
>