Date: Mon, 9 Feb 2015 10:14:36 -0800
From: Steven Noonan <steven@uplinklabs.net>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
        Thomas Gleixner <tglx@linutronix.de>, jacob.jun.pan@intel.com,
        Arjan van de Ven <arjan@linux.intel.com>,
        Linux Kernel mailing List <linux-kernel@vger.kernel.org>,
        =?utf-8?B?RnLDqWTDqXJpYw==?= Weisbecker <fweisbec@gmail.com>,
        frederic@kernel.org, Daniel Lezcano <daniel.lezcano@linaro.org>,
        Amit Kucheria <amit.kucheria@linaro.org>,
        Eduardo Valentin <edubezval@gmail.com>,
        Viresh Kumar <viresh.kumar@linaro.org>, rui.zhang@intel.com
Subject: Re: [PATCH V2] idle/intel_powerclamp: Redesign idle injection to use
 bandwidth control mechanism
Message-ID: <20150209181436.GA29721@rincewind.us-west-2.compute.internal>
References: <20150209044852.6231.66456.stgit@preeti.in.ibm.com>
 <CAKbGBLhHUActyWB2UjC1Uw=BW-93icONV4Pxa2WvjJEUtAHy7g@mail.gmail.com>
 <54D89F46.4030803@linux.vnet.ibm.com>
 <CAKbGBLg2c=Fsgucu0VvvHGnAzxXjSfVrGW4UNQJ29xB40QoX-Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <CAKbGBLg2c=Fsgucu0VvvHGnAzxXjSfVrGW4UNQJ29xB40QoX-Q@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 40410
Lines: 803

On Mon, Feb 9, 2015 at 9:56 AM, Steven Noonan <steven@uplinklabs.net> wrote:
> On Mon, Feb 9, 2015 at 3:51 AM, Preeti U Murthy
> <preeti@linux.vnet.ibm.com> wrote:
>> Hi Steven,
>>
>> On 02/09/2015 01:02 PM, Steven Noonan wrote:
>>> On Sun, Feb 8, 2015 at 8:49 PM, Preeti U Murthy
>>> <preeti@linux.vnet.ibm.com> wrote:
>>>> The powerclamp driver injects idle periods to stay within the thermal constraints.
>>>> The driver does a fake idle by spawning per-cpu threads that call the mwait
>>>> instruction. This behavior of fake idle can confuse the other kernel subsystems.
>>>> For instance it calls into the nohz tick handlers, which are meant to be called
>>>> only by the idle thread. It sets the state of the tick in each cpu to idle and
>>>> stops the tick, when there are tasks on the runqueue. As a result the callers of
>>>> idle_cpu()/ tick_nohz_tick_stopped() see different states of the cpu; while the
>>>> former thinks that the cpu is busy, the latter thinks that it is idle. The outcome
>>>> may be  inconsistency in the scheduler/nohz states which can lead to serious
>>>> consequences. One of them was reported on this thread:
>>>> https://lkml.org/lkml/2014/12/11/365.
>>>>
>>>> Thomas posted out a patch to disable the powerclamp driver from calling into the
>>>> tick nohz code which has taken care of the above regression for the moment. However
>>>> powerclamp driver as a result, will not be able to inject idle periods due to the
>>>> presence of periodic ticks. With the current design of fake idle, we cannot move
>>>> towards a better solution.
>>>> https://lkml.org/lkml/2014/12/18/169
>>>>
>>>> This patch aims at removing the concept of fake idle and instead makes the cpus
>>>> truly idle by throttling the runqueues during the idle injection periods. The situation
>>>> is in fact very similar to throttling of cfs_rqs when they exceed their bandwidths.
>>>> The idle injection metrics can be mapped to the bandwidth control metrics 'quota' and
>>>> 'period' to achieve the same result. When the powerclamping is begun or when the
>>>> clamping controls have been modified, the bandwidth for the root task group is set.
>>>> The 'quota' will be the amount of time that the system needs to be busy and 'period'
>>>> will be the sum of this busy duration and the idle duration as calculated by the driver.
>>>> This gets rid of per-cpu kthreads, control cpu, hotplug notifiers and clamping mask since
>>>> the thread starting powerclamping will set the bandwidth and throttling of all cpus will
>>>> automatically fall in place. None of the other cpus need be bothered about this. This
>>>> simplifies the design of the driver.
>>>>
>>>> Of course this is only if the idle injection metrics can be conveniently transformed
>>>> into bandwidth control metrics. There are a couple of other primary concerns around if
>>>> doing the below two in this patch is valid.
>>>> a. This patch exports the functions to set the quota and period of task groups.
>>>> b. This patch removes the constraint of not being able to set the root task grp's bandwidth.
>>>>
>>>> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>>>
>>> This doesn't compile.
>>
>> Thanks for reporting this! I realized that I had not compiled in the powerclamp driver
>> as a module while compile testing it. I was focusing on the issues with the design and
>> failed to cross verify this. Apologies for the inconvenience.
>>
>> Find the diff compile tested below.
>>
>> I also realized that clamp_cpus() that sets the bandwidth cannot be called from
>> multiple places. Currently I am calling it from end_powerclamp(), when the user changes the
>> idle clamping duration and from a queued timer. This will require synchronization between
>> callers which is not really called for. The queued wakeup_timer alone can re-evaluate the
>> clamping metrics after every throttle-unthrottle period and this should suffice as far
>> as I can see. Thoughts ?
>
> Hmm, I've had two system lockups so far while running a kernel with
> intel_powerclamp loaded. Both times it slowly ground to a halt and
> processes piled up...

I let the hung system sit for a while and it eventually partially recovered and
came back with this in dmesg (9:24 is around system boot time, 9:51 is around
when it went out to lunch, and 10:01 is when it came back):

Feb 09 09:24:16 osprey kernel: wlp2s0: authenticate with e0:3f:49:28:3b:8c
Feb 09 09:24:16 osprey kernel: wlp2s0: send auth to e0:3f:49:28:3b:8c (try 1/3)
Feb 09 09:24:16 osprey kernel: wlp2s0: authenticated
Feb 09 09:24:16 osprey kernel: wlp2s0: associate with e0:3f:49:28:3b:8c (try 1/3)
Feb 09 09:24:16 osprey kernel: wlp2s0: RX AssocResp from e0:3f:49:28:3b:8c (capab=0x1011 status=0 aid=4)
Feb 09 09:24:16 osprey kernel: wlp2s0: associated
Feb 09 09:50:19 osprey kernel: intel_powerclamp: Start idle injection to reduce power
Feb 09 09:50:23 osprey kernel: intel_powerclamp: Stop forced idle injection
Feb 09 09:50:27 osprey kernel: intel_powerclamp: Start idle injection to reduce power
Feb 09 09:50:31 osprey kernel: intel_powerclamp: Stop forced idle injection
Feb 09 09:50:31 osprey kernel: intel_powerclamp: Start idle injection to reduce power
Feb 09 09:50:57 osprey kernel: Watchdog[1271]: segfault at 0 ip 00007f580734ad7b sp 00007f57f04175a0 error 6 in chromium[7f580310e000+66c6000]
Feb 09 09:51:46 osprey kernel: INFO: rcu_preempt self-detected stall on CPU
Feb 09 09:51:46 osprey kernel:         0: (1 GPs behind) idle=d1d/2/0 softirq=48032/48032 last_accelerate: a04e/8abc, nonlazy_posted: 0, L.
Feb 09 09:51:46 osprey kernel:          (t=60000 jiffies g=40078 c=40077 q=69418)
Feb 09 09:51:46 osprey kernel: Task dump for CPU 0:
Feb 09 09:51:46 osprey kernel: swapper/0       R  running task        0     0      0 0x00000008
Feb 09 09:51:46 osprey kernel:  0000000000000000 ffff88041ea03ae0 ffffffff81099946 0000000000000000
Feb 09 09:51:46 osprey kernel:  ffffffff81a56f00 ffff88041ea03b00 ffffffff8109ce8d ffff88041ea0d300
Feb 09 09:51:46 osprey kernel:  0000000000000001 ffff88041ea03b30 ffffffff810cd391 ffff88041ea0d300
Feb 09 09:51:46 osprey kernel: Call Trace:
Feb 09 09:51:46 osprey kernel:  <IRQ>  [<ffffffff81099946>] sched_show_task+0xb6/0x120
Feb 09 09:51:46 osprey kernel:  [<ffffffff8109ce8d>] dump_cpu_task+0x3d/0x50
Feb 09 09:51:46 osprey kernel:  [<ffffffff810cd391>] rcu_dump_cpu_stacks+0x91/0xd0
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d15c2>] rcu_check_callbacks+0x4d2/0x7e0
Feb 09 09:51:46 osprey kernel:  [<ffffffff812c6e73>] ? __this_cpu_preempt_check+0x13/0x20
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d7f46>] ? hrtimer_run_queues+0x36/0x110
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d68cb>] update_process_times+0x4b/0x80
Feb 09 09:51:46 osprey kernel:  [<ffffffff810e5e65>] tick_sched_handle.isra.13+0x25/0x60
Feb 09 09:51:46 osprey kernel:  [<ffffffff810e5ee5>] tick_sched_timer+0x45/0x80
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d6e87>] __run_hrtimer+0x77/0x250
Feb 09 09:51:46 osprey kernel:  [<ffffffff810e5ea0>] ? tick_sched_handle.isra.13+0x60/0x60
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d79d7>] hrtimer_interrupt+0x107/0x250
Feb 09 09:51:46 osprey kernel:  [<ffffffff8104acbb>] local_apic_timer_interrupt+0x3b/0x70
Feb 09 09:51:46 osprey kernel:  [<ffffffff8152a8a3>] smp_apic_timer_interrupt+0x43/0x60
Feb 09 09:51:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 09:51:46 osprey kernel:  [<ffffffffa0b0728b>] ? 0xffffffffa0b0728b
Feb 09 09:51:46 osprey kernel:  [<ffffffffa0b0718d>] ? 0xffffffffa0b0718d
Feb 09 09:51:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 09:51:46 osprey kernel:  [<ffffffffa0b075e5>] 0xffffffffa0b075e5
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d54ca>] call_timer_fn+0x3a/0x160
Feb 09 09:51:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 09:51:46 osprey kernel:  [<ffffffff810d5dc4>] run_timer_softirq+0x264/0x310
Feb 09 09:51:46 osprey kernel:  [<ffffffff81073ab6>] __do_softirq+0xf6/0x2e0
Feb 09 09:51:46 osprey kernel:  [<ffffffff81073efe>] irq_exit+0x8e/0xb0
Feb 09 09:51:46 osprey kernel:  [<ffffffff8152a8a8>] smp_apic_timer_interrupt+0x48/0x60
Feb 09 09:51:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 09:51:46 osprey kernel:  <EOI>  [<ffffffff813e5db5>] ? cpuidle_enter_state+0x65/0x1a0
Feb 09 09:51:46 osprey kernel:  [<ffffffff813e5da1>] ? cpuidle_enter_state+0x51/0x1a0
Feb 09 09:51:46 osprey kernel:  [<ffffffff813e5fd7>] cpuidle_enter+0x17/0x20
Feb 09 09:51:46 osprey kernel:  [<ffffffff810af014>] cpu_startup_entry+0x3f4/0x460
Feb 09 09:51:46 osprey kernel:  [<ffffffff815187f5>] rest_init+0x85/0x90
Feb 09 09:51:46 osprey kernel:  [<ffffffff81ba0f7e>] start_kernel+0x445/0x452
Feb 09 09:51:46 osprey kernel:  [<ffffffff81ba0120>] ? early_idt_handlers+0x120/0x120
Feb 09 09:51:46 osprey kernel:  [<ffffffff81ba04d7>] x86_64_start_reservations+0x2a/0x2c
Feb 09 09:51:46 osprey kernel:  [<ffffffff81ba061c>] x86_64_start_kernel+0x143/0x152
Feb 09 09:54:46 osprey kernel: INFO: rcu_preempt self-detected stall on CPU
Feb 09 09:54:46 osprey kernel:         0: (1 GPs behind) idle=d1d/2/0 softirq=48032/48032 last_accelerate: a04e/49df, nonlazy_posted: 0, L.
Feb 09 09:54:46 osprey kernel:          (t=240003 jiffies g=40078 c=40077 q=72706)
Feb 09 09:54:46 osprey kernel: Task dump for CPU 0:
Feb 09 09:54:46 osprey kernel: swapper/0       R  running task        0     0      0 0x00000008
Feb 09 09:54:46 osprey kernel:  0000000000000000 ffff88041ea03ae0 ffffffff81099946 0000000000000000
Feb 09 09:54:46 osprey kernel:  ffffffff81a56f00 ffff88041ea03b00 ffffffff8109ce8d ffff88041ea0d300
Feb 09 09:54:46 osprey kernel:  0000000000000001 ffff88041ea03b30 ffffffff810cd391 ffff88041ea0d300
Feb 09 09:54:46 osprey kernel: Call Trace:
Feb 09 09:54:46 osprey kernel:  <IRQ>  [<ffffffff81099946>] sched_show_task+0xb6/0x120
Feb 09 09:54:46 osprey kernel:  [<ffffffff8109ce8d>] dump_cpu_task+0x3d/0x50
Feb 09 09:54:46 osprey kernel:  [<ffffffff810cd391>] rcu_dump_cpu_stacks+0x91/0xd0
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d15c2>] rcu_check_callbacks+0x4d2/0x7e0
Feb 09 09:54:46 osprey kernel:  [<ffffffff812c6e73>] ? __this_cpu_preempt_check+0x13/0x20
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d7f46>] ? hrtimer_run_queues+0x36/0x110
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d68cb>] update_process_times+0x4b/0x80
Feb 09 09:54:46 osprey kernel:  [<ffffffff810e5e65>] tick_sched_handle.isra.13+0x25/0x60
Feb 09 09:54:46 osprey kernel:  [<ffffffff810e5ee5>] tick_sched_timer+0x45/0x80
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d6e87>] __run_hrtimer+0x77/0x250
Feb 09 09:54:46 osprey kernel:  [<ffffffff810e5ea0>] ? tick_sched_handle.isra.13+0x60/0x60
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d79d7>] hrtimer_interrupt+0x107/0x250
Feb 09 09:54:46 osprey kernel:  [<ffffffff8104acbb>] local_apic_timer_interrupt+0x3b/0x70
Feb 09 09:54:46 osprey kernel:  [<ffffffff8152a8a3>] smp_apic_timer_interrupt+0x43/0x60
Feb 09 09:54:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 09:54:46 osprey kernel:  [<ffffffffa0b0726d>] ? 0xffffffffa0b0726d
Feb 09 09:54:46 osprey kernel:  [<ffffffffa0b0718d>] ? 0xffffffffa0b0718d
Feb 09 09:54:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 09:54:46 osprey kernel:  [<ffffffffa0b075e5>] 0xffffffffa0b075e5
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d54ca>] call_timer_fn+0x3a/0x160
Feb 09 09:54:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 09:54:46 osprey kernel:  [<ffffffff810d5dc4>] run_timer_softirq+0x264/0x310
Feb 09 09:54:46 osprey kernel:  [<ffffffff81073ab6>] __do_softirq+0xf6/0x2e0
Feb 09 09:54:46 osprey kernel:  [<ffffffff81073efe>] irq_exit+0x8e/0xb0
Feb 09 09:54:46 osprey kernel:  [<ffffffff8152a8a8>] smp_apic_timer_interrupt+0x48/0x60
Feb 09 09:54:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 09:54:46 osprey kernel:  <EOI>  [<ffffffff813e5db5>] ? cpuidle_enter_state+0x65/0x1a0
Feb 09 09:54:46 osprey kernel:  [<ffffffff813e5da1>] ? cpuidle_enter_state+0x51/0x1a0
Feb 09 09:54:46 osprey kernel:  [<ffffffff813e5fd7>] cpuidle_enter+0x17/0x20
Feb 09 09:54:46 osprey kernel:  [<ffffffff810af014>] cpu_startup_entry+0x3f4/0x460
Feb 09 09:54:46 osprey kernel:  [<ffffffff815187f5>] rest_init+0x85/0x90
Feb 09 09:54:46 osprey kernel:  [<ffffffff81ba0f7e>] start_kernel+0x445/0x452
Feb 09 09:54:46 osprey kernel:  [<ffffffff81ba0120>] ? early_idt_handlers+0x120/0x120
Feb 09 09:54:46 osprey kernel:  [<ffffffff81ba04d7>] x86_64_start_reservations+0x2a/0x2c
Feb 09 09:54:46 osprey kernel:  [<ffffffff81ba061c>] x86_64_start_kernel+0x143/0x152
Feb 09 09:57:46 osprey kernel: INFO: rcu_preempt self-detected stall on CPU
Feb 09 09:57:46 osprey kernel:         0: (1 GPs behind) idle=d1d/2/0 softirq=48032/48032 last_accelerate: a04e/0902, nonlazy_posted: 0, L.
Feb 09 09:57:46 osprey kernel:          (t=420006 jiffies g=40078 c=40077 q=76104)
Feb 09 09:57:46 osprey kernel: Task dump for CPU 0:
Feb 09 09:57:46 osprey kernel: swapper/0       R  running task        0     0      0 0x00000008
Feb 09 09:57:46 osprey kernel:  0000000000000000 ffff88041ea03ae0 ffffffff81099946 0000000000000000
Feb 09 09:57:46 osprey kernel:  ffffffff81a56f00 ffff88041ea03b00 ffffffff8109ce8d ffff88041ea0d300
Feb 09 09:57:46 osprey kernel:  0000000000000001 ffff88041ea03b30 ffffffff810cd391 ffff88041ea0d300
Feb 09 09:57:46 osprey kernel: Call Trace:
Feb 09 09:57:46 osprey kernel:  <IRQ>  [<ffffffff81099946>] sched_show_task+0xb6/0x120
Feb 09 09:57:46 osprey kernel:  [<ffffffff8109ce8d>] dump_cpu_task+0x3d/0x50
Feb 09 09:57:46 osprey kernel:  [<ffffffff810cd391>] rcu_dump_cpu_stacks+0x91/0xd0
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d15c2>] rcu_check_callbacks+0x4d2/0x7e0
Feb 09 09:57:46 osprey kernel:  [<ffffffff812c6e73>] ? __this_cpu_preempt_check+0x13/0x20
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d7f46>] ? hrtimer_run_queues+0x36/0x110
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d68cb>] update_process_times+0x4b/0x80
Feb 09 09:57:46 osprey kernel:  [<ffffffff810e5e65>] tick_sched_handle.isra.13+0x25/0x60
Feb 09 09:57:46 osprey kernel:  [<ffffffff810e5ee5>] tick_sched_timer+0x45/0x80
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d6e87>] __run_hrtimer+0x77/0x250
Feb 09 09:57:46 osprey kernel:  [<ffffffff810e5ea0>] ? tick_sched_handle.isra.13+0x60/0x60
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d79d7>] hrtimer_interrupt+0x107/0x250
Feb 09 09:57:46 osprey kernel:  [<ffffffff8104acbb>] local_apic_timer_interrupt+0x3b/0x70
Feb 09 09:57:46 osprey kernel:  [<ffffffff8152a8a3>] smp_apic_timer_interrupt+0x43/0x60
Feb 09 09:57:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 09:57:46 osprey kernel:  [<ffffffffa0b0726d>] ? 0xffffffffa0b0726d
Feb 09 09:57:46 osprey kernel:  [<ffffffffa0b0718d>] ? 0xffffffffa0b0718d
Feb 09 09:57:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 09:57:46 osprey kernel:  [<ffffffffa0b075e5>] 0xffffffffa0b075e5
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d54ca>] call_timer_fn+0x3a/0x160
Feb 09 09:57:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 09:57:46 osprey kernel:  [<ffffffff810d5dc4>] run_timer_softirq+0x264/0x310
Feb 09 09:57:46 osprey kernel:  [<ffffffff81073ab6>] __do_softirq+0xf6/0x2e0
Feb 09 09:57:46 osprey kernel:  [<ffffffff81073efe>] irq_exit+0x8e/0xb0
Feb 09 09:57:46 osprey kernel:  [<ffffffff8152a8a8>] smp_apic_timer_interrupt+0x48/0x60
Feb 09 09:57:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 09:57:46 osprey kernel:  <EOI>  [<ffffffff813e5db5>] ? cpuidle_enter_state+0x65/0x1a0
Feb 09 09:57:46 osprey kernel:  [<ffffffff813e5da1>] ? cpuidle_enter_state+0x51/0x1a0
Feb 09 09:57:46 osprey kernel:  [<ffffffff813e5fd7>] cpuidle_enter+0x17/0x20
Feb 09 09:57:46 osprey kernel:  [<ffffffff810af014>] cpu_startup_entry+0x3f4/0x460
Feb 09 09:57:46 osprey kernel:  [<ffffffff815187f5>] rest_init+0x85/0x90
Feb 09 09:57:46 osprey kernel:  [<ffffffff81ba0f7e>] start_kernel+0x445/0x452
Feb 09 09:57:46 osprey kernel:  [<ffffffff81ba0120>] ? early_idt_handlers+0x120/0x120
Feb 09 09:57:46 osprey kernel:  [<ffffffff81ba04d7>] x86_64_start_reservations+0x2a/0x2c
Feb 09 09:57:46 osprey kernel:  [<ffffffff81ba061c>] x86_64_start_kernel+0x143/0x152
Feb 09 10:00:46 osprey kernel: INFO: rcu_preempt self-detected stall on CPU
Feb 09 10:00:46 osprey kernel:         0: (1 GPs behind) idle=d1d/2/0 softirq=48032/48032 last_accelerate: a04e/c825, nonlazy_posted: 0, L.
Feb 09 10:00:46 osprey kernel:          (t=600009 jiffies g=40078 c=40077 q=79433)
Feb 09 10:00:46 osprey kernel: Task dump for CPU 0:
Feb 09 10:00:46 osprey kernel: swapper/0       R  running task        0     0      0 0x00000008
Feb 09 10:00:46 osprey kernel:  0000000000000000 ffff88041ea03ae0 ffffffff81099946 0000000000000000
Feb 09 10:00:46 osprey kernel:  ffffffff81a56f00 ffff88041ea03b00 ffffffff8109ce8d ffff88041ea0d300
Feb 09 10:00:46 osprey kernel:  0000000000000001 ffff88041ea03b30 ffffffff810cd391 ffff88041ea0d300
Feb 09 10:00:46 osprey kernel: Call Trace:
Feb 09 10:00:46 osprey kernel:  <IRQ>  [<ffffffff81099946>] sched_show_task+0xb6/0x120
Feb 09 10:00:46 osprey kernel:  [<ffffffff8109ce8d>] dump_cpu_task+0x3d/0x50
Feb 09 10:00:46 osprey kernel:  [<ffffffff810cd391>] rcu_dump_cpu_stacks+0x91/0xd0
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d15c2>] rcu_check_callbacks+0x4d2/0x7e0
Feb 09 10:00:46 osprey kernel:  [<ffffffff812c6e73>] ? __this_cpu_preempt_check+0x13/0x20
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d7f46>] ? hrtimer_run_queues+0x36/0x110
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d68cb>] update_process_times+0x4b/0x80
Feb 09 10:00:46 osprey kernel:  [<ffffffff810e5e65>] tick_sched_handle.isra.13+0x25/0x60
Feb 09 10:00:46 osprey kernel:  [<ffffffff810e5ee5>] tick_sched_timer+0x45/0x80
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d6e87>] __run_hrtimer+0x77/0x250
Feb 09 10:00:46 osprey kernel:  [<ffffffff810e5ea0>] ? tick_sched_handle.isra.13+0x60/0x60
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d79d7>] hrtimer_interrupt+0x107/0x250
Feb 09 10:00:46 osprey kernel:  [<ffffffff8104acbb>] local_apic_timer_interrupt+0x3b/0x70
Feb 09 10:00:46 osprey kernel:  [<ffffffff8152a8a3>] smp_apic_timer_interrupt+0x43/0x60
Feb 09 10:00:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 10:00:46 osprey kernel:  [<ffffffffa0b0726d>] ? 0xffffffffa0b0726d
Feb 09 10:00:46 osprey kernel:  [<ffffffffa0b0718d>] ? 0xffffffffa0b0718d
Feb 09 10:00:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 10:00:46 osprey kernel:  [<ffffffffa0b075e5>] 0xffffffffa0b075e5
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d54ca>] call_timer_fn+0x3a/0x160
Feb 09 10:00:46 osprey kernel:  [<ffffffffa0b075d0>] ? 0xffffffffa0b075d0
Feb 09 10:00:46 osprey kernel:  [<ffffffff810d5dc4>] run_timer_softirq+0x264/0x310
Feb 09 10:00:46 osprey kernel:  [<ffffffff81073ab6>] __do_softirq+0xf6/0x2e0
Feb 09 10:00:46 osprey kernel:  [<ffffffff81073efe>] irq_exit+0x8e/0xb0
Feb 09 10:00:46 osprey kernel:  [<ffffffff8152a8a8>] smp_apic_timer_interrupt+0x48/0x60
Feb 09 10:00:46 osprey kernel:  [<ffffffff81528aed>] apic_timer_interrupt+0x6d/0x80
Feb 09 10:00:46 osprey kernel:  <EOI>  [<ffffffff813e5db5>] ? cpuidle_enter_state+0x65/0x1a0
Feb 09 10:00:46 osprey kernel:  [<ffffffff813e5da1>] ? cpuidle_enter_state+0x51/0x1a0
Feb 09 10:00:46 osprey kernel:  [<ffffffff813e5fd7>] cpuidle_enter+0x17/0x20
Feb 09 10:00:46 osprey kernel:  [<ffffffff810af014>] cpu_startup_entry+0x3f4/0x460
Feb 09 10:00:46 osprey kernel:  [<ffffffff815187f5>] rest_init+0x85/0x90
Feb 09 10:00:46 osprey kernel:  [<ffffffff81ba0f7e>] start_kernel+0x445/0x452
Feb 09 10:00:46 osprey kernel:  [<ffffffff81ba0120>] ? early_idt_handlers+0x120/0x120
Feb 09 10:00:46 osprey kernel:  [<ffffffff81ba04d7>] x86_64_start_reservations+0x2a/0x2c
Feb 09 10:00:46 osprey kernel:  [<ffffffff81ba061c>] x86_64_start_kernel+0x143/0x152

I couldn't do much at this point. While I had some interactivity, processes
were failing to spawn, so i had to just 'echo s > /proc/sysrq-trigger; echo b >
/proc/sysrq-trigger'.

>> ----------------------------------------------------------------------------------
>>
>> V2 of intel_powerclamp driver
>>
>> From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>>
>>
>> ---
>>  drivers/thermal/Kconfig            |    1
>>  drivers/thermal/intel_powerclamp.c |  301 ++++++++++--------------------------
>>  include/linux/sched.h              |    9 +
>>  kernel/sched/core.c                |    6 -
>>  kernel/sched/sched.h               |    5 -
>>  5 files changed, 95 insertions(+), 227 deletions(-)
>>
>> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
>> index af40db0..4b7cd02 100644
>> --- a/drivers/thermal/Kconfig
>> +++ b/drivers/thermal/Kconfig
>> @@ -233,6 +233,7 @@ config INTEL_POWERCLAMP
>>         depends on THERMAL
>>         depends on X86
>>         depends on CPU_SUP_INTEL
>> +       select CFS_BANDWIDTH
>>         help
>>           Enable this to enable Intel PowerClamp idle injection driver. This
>>           enforce idle time which results in more package C-state residency. The
>> diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
>> index 6ceebd6..4bd07bb 100644
>> --- a/drivers/thermal/intel_powerclamp.c
>> +++ b/drivers/thermal/intel_powerclamp.c
>> @@ -51,6 +51,7 @@
>>  #include <linux/debugfs.h>
>>  #include <linux/seq_file.h>
>>  #include <linux/sched/rt.h>
>> +#include <linux/sched.h>
>>
>>  #include <asm/nmi.h>
>>  #include <asm/msr.h>
>> @@ -78,20 +79,18 @@ static struct dentry *debug_dir;
>>  static unsigned int set_target_ratio;
>>  static unsigned int current_ratio;
>>  static bool should_skip;
>> +static unsigned int count = 0;
>>  static bool reduce_irq;
>>  static atomic_t idle_wakeup_counter;
>> -static unsigned int control_cpu; /* The cpu assigned to collect stat and update
>> -                                 * control parameters. default to BSP but BSP
>> -                                 * can be offlined.
>> -                                 */
>>  static bool clamping;
>>
>> +/* Timer that evaluates bandwidth reset if clamping knobs have changed */
>> +static void clamp_timer_fn(unsigned long foo);
>> +DEFINE_TIMER(wakeup_timer, clamp_timer_fn, 0, 0);
>> +
>> +static void clamp_cpus(void);
>>
>> -static struct task_struct * __percpu *powerclamp_thread;
>>  static struct thermal_cooling_device *cooling_dev;
>> -static unsigned long *cpu_clamping_mask;  /* bit map for tracking per cpu
>> -                                          * clamping thread
>> -                                          */
>>
>>  static unsigned int duration;
>>  static unsigned int pkg_cstate_ratio_cur;
>> @@ -163,7 +162,7 @@ static int window_size_set(const char *arg, const struct kernel_param *kp)
>>         smp_mb();
>>
>>  exit_win:
>> -
>> +       clamp_cpus();
>>         return ret;
>>  }
>>
>> @@ -256,10 +255,6 @@ static u64 pkg_state_counter(void)
>>         return count;
>>  }
>>
>> -static void noop_timer(unsigned long foo)
>> -{
>> -       /* empty... just the fact that we get the interrupt wakes us up */
>> -}
>>
>>  static unsigned int get_compensation(int ratio)
>>  {
>> @@ -362,100 +357,77 @@ static bool powerclamp_adjust_controls(unsigned int target_ratio,
>>         return set_target_ratio + guard <= current_ratio;
>>  }
>>
>> -static int clamp_thread(void *arg)
>> +static void clamp_cpus(void)
>>  {
>> -       int cpunr = (unsigned long)arg;
>> -       DEFINE_TIMER(wakeup_timer, noop_timer, 0, 0);
>> -       static const struct sched_param param = {
>> -               .sched_priority = MAX_USER_RT_PRIO/2,
>> -       };
>> -       unsigned int count = 0;
>>         unsigned int target_ratio;
>> +       u64 quota, period;
>> +       int sleeptime;
>> +       unsigned long target_jiffies;
>> +       unsigned int guard;
>> +       unsigned int compensation;
>> +       int interval; /* jiffies to sleep for each attempt */
>> +       unsigned int duration_jiffies;
>> +       unsigned int window_size_now;
>> +
>> +again:
>> +       if (clamping == false)
>> +               goto out;
>> +
>> +       quota = RUNTIME_INF;
>> +       period = NSEC_PER_SEC;
>> +       compensation = 0;
>> +       duration_jiffies = msecs_to_jiffies(duration);
>> +       /*
>> +        * make sure user selected ratio does not take effect until
>> +        * the next round. adjust target_ratio if user has changed
>> +        * target such that we can converge quickly.
>> +        */
>> +       target_ratio = set_target_ratio;
>> +       guard = 1 + target_ratio/20;
>> +       window_size_now = window_size;
>>
>> -       set_bit(cpunr, cpu_clamping_mask);
>> -       set_freezable();
>> -       init_timer_on_stack(&wakeup_timer);
>> -       sched_setscheduler(current, SCHED_FIFO, &param);
>> -
>> -       while (true == clamping && !kthread_should_stop() &&
>> -               cpu_online(cpunr)) {
>> -               int sleeptime;
>> -               unsigned long target_jiffies;
>> -               unsigned int guard;
>> -               unsigned int compensation = 0;
>> -               int interval; /* jiffies to sleep for each attempt */
>> -               unsigned int duration_jiffies = msecs_to_jiffies(duration);
>> -               unsigned int window_size_now;
>> -
>> -               try_to_freeze();
>> -               /*
>> -                * make sure user selected ratio does not take effect until
>> -                * the next round. adjust target_ratio if user has changed
>> -                * target such that we can converge quickly.
>> -                */
>> -               target_ratio = set_target_ratio;
>> -               guard = 1 + target_ratio/20;
>> -               window_size_now = window_size;
>> -               count++;
>> -
>> -               /*
>> -                * systems may have different ability to enter package level
>> -                * c-states, thus we need to compensate the injected idle ratio
>> -                * to achieve the actual target reported by the HW.
>> -                */
>> -               compensation = get_compensation(target_ratio);
>> -               interval = duration_jiffies*100/(target_ratio+compensation);
>> -
>> -               /* align idle time */
>> -               target_jiffies = roundup(jiffies, interval);
>> -               sleeptime = target_jiffies - jiffies;
>> -               if (sleeptime <= 0)
>> -                       sleeptime = 1;
>> -               schedule_timeout_interruptible(sleeptime);
>> -               /*
>> -                * only elected controlling cpu can collect stats and update
>> -                * control parameters.
>> -                */
>> -               if (cpunr == control_cpu && !(count%window_size_now)) {
>> -                       should_skip =
>> -                               powerclamp_adjust_controls(target_ratio,
>> -                                                       guard, window_size_now);
>> -                       smp_mb();
>> -               }
>> -
>> -               if (should_skip)
>> -                       continue;
>> -
>> -               target_jiffies = jiffies + duration_jiffies;
>> -               mod_timer(&wakeup_timer, target_jiffies);
>> -               if (unlikely(local_softirq_pending()))
>> -                       continue;
>> -               /*
>> -                * stop tick sched during idle time, interrupts are still
>> -                * allowed. thus jiffies are updated properly.
>> -                */
>> -               preempt_disable();
>> -               /* mwait until target jiffies is reached */
>> -               while (time_before(jiffies, target_jiffies)) {
>> -                       unsigned long ecx = 1;
>> -                       unsigned long eax = target_mwait;
>> -
>> -                       /*
>> -                        * REVISIT: may call enter_idle() to notify drivers who
>> -                        * can save power during cpu idle. same for exit_idle()
>> -                        */
>> -                       local_touch_nmi();
>> -                       stop_critical_timings();
>> -                       mwait_idle_with_hints(eax, ecx);
>> -                       start_critical_timings();
>> -                       atomic_inc(&idle_wakeup_counter);
>> -               }
>> -               preempt_enable();
>> +       /*
>> +        * systems may have different ability to enter package level
>> +        * c-states, thus we need to compensate the injected idle ratio
>> +        * to achieve the actual target reported by the HW.
>> +        */
>> +       compensation = get_compensation(target_ratio);
>> +       interval = duration_jiffies*100/(target_ratio+compensation);
>> +
>> +       /* align idle time */
>> +       target_jiffies = roundup(jiffies, interval);
>> +       sleeptime = target_jiffies - jiffies;
>> +       if (sleeptime <= 0)
>> +               sleeptime = 1;
>> +
>> +       if (!(count%window_size_now)) {
>> +               should_skip =
>> +                       powerclamp_adjust_controls(target_ratio,
>> +                                               guard, window_size_now);
>> +               smp_mb();
>>         }
>> -       del_timer_sync(&wakeup_timer);
>> -       clear_bit(cpunr, cpu_clamping_mask);
>>
>> -       return 0;
>> +       if (should_skip)
>> +               goto again;
>> +
>> +       target_jiffies = jiffies + sleeptime + duration_jiffies;
>> +       mod_timer(&wakeup_timer, target_jiffies);
>> +       if (unlikely(local_softirq_pending()))
>> +               goto again;
>> +
>> +       quota = jiffies_to_usecs(sleeptime);
>> +       period = jiffies_to_usecs(sleeptime + duration_jiffies);
>> +
>> +out:
>> +       tg_set_cfs_quota(&root_task_group, quota);
>> +       tg_set_cfs_period(&root_task_group, period);
>> +}
>> +
>> +static void clamp_timer_fn(unsigned long foo)
>> +{
>> +       /* Evaluate to see if clamping controls need to be adjusted */
>> +       count++;
>> +       clamp_cpus();
>>  }
>>
>>  /*
>> @@ -501,8 +473,7 @@ static void poll_pkg_cstate(struct work_struct *dummy)
>>
>>  static int start_power_clamp(void)
>>  {
>> -       unsigned long cpu;
>> -       struct task_struct *thread;
>> +       clamping = true;
>>
>>         /* check if pkg cstate counter is completely 0, abort in this case */
>>         if (!has_pkg_state_counter()) {
>> @@ -511,108 +482,21 @@ static int start_power_clamp(void)
>>         }
>>
>>         set_target_ratio = clamp(set_target_ratio, 0U, MAX_TARGET_RATIO - 1);
>> -       /* prevent cpu hotplug */
>> -       get_online_cpus();
>> -
>> -       /* prefer BSP */
>> -       control_cpu = 0;
>> -       if (!cpu_online(control_cpu))
>> -               control_cpu = smp_processor_id();
>>
>> -       clamping = true;
>>         schedule_delayed_work(&poll_pkg_cstate_work, 0);
>> -
>> -       /* start one thread per online cpu */
>> -       for_each_online_cpu(cpu) {
>> -               struct task_struct **p =
>> -                       per_cpu_ptr(powerclamp_thread, cpu);
>> -
>> -               thread = kthread_create_on_node(clamp_thread,
>> -                                               (void *) cpu,
>> -                                               cpu_to_node(cpu),
>> -                                               "kidle_inject/%ld", cpu);
>> -               /* bind to cpu here */
>> -               if (likely(!IS_ERR(thread))) {
>> -                       kthread_bind(thread, cpu);
>> -                       wake_up_process(thread);
>> -                       *p = thread;
>> -               }
>> -
>> -       }
>> -       put_online_cpus();
>> +       clamp_cpus();
>>
>>         return 0;
>>  }
>>
>>  static void end_power_clamp(void)
>>  {
>> -       int i;
>> -       struct task_struct *thread;
>> -
>>         clamping = false;
>> -       /*
>> -        * make clamping visible to other cpus and give per cpu clamping threads
>> -        * sometime to exit, or gets killed later.
>> -        */
>> -       smp_mb();
>> -       msleep(20);
>> -       if (bitmap_weight(cpu_clamping_mask, num_possible_cpus())) {
>> -               for_each_set_bit(i, cpu_clamping_mask, num_possible_cpus()) {
>> -                       pr_debug("clamping thread for cpu %d alive, kill\n", i);
>> -                       thread = *per_cpu_ptr(powerclamp_thread, i);
>> -                       kthread_stop(thread);
>> -               }
>> -       }
>> -}
>>
>> -static int powerclamp_cpu_callback(struct notifier_block *nfb,
>> -                               unsigned long action, void *hcpu)
>> -{
>> -       unsigned long cpu = (unsigned long)hcpu;
>> -       struct task_struct *thread;
>> -       struct task_struct **percpu_thread =
>> -               per_cpu_ptr(powerclamp_thread, cpu);
>> -
>> -       if (false == clamping)
>> -               goto exit_ok;
>> -
>> -       switch (action) {
>> -       case CPU_ONLINE:
>> -               thread = kthread_create_on_node(clamp_thread,
>> -                                               (void *) cpu,
>> -                                               cpu_to_node(cpu),
>> -                                               "kidle_inject/%lu", cpu);
>> -               if (likely(!IS_ERR(thread))) {
>> -                       kthread_bind(thread, cpu);
>> -                       wake_up_process(thread);
>> -                       *percpu_thread = thread;
>> -               }
>> -               /* prefer BSP as controlling CPU */
>> -               if (cpu == 0) {
>> -                       control_cpu = 0;
>> -                       smp_mb();
>> -               }
>> -               break;
>> -       case CPU_DEAD:
>> -               if (test_bit(cpu, cpu_clamping_mask)) {
>> -                       pr_err("cpu %lu dead but powerclamping thread is not\n",
>> -                               cpu);
>> -                       kthread_stop(*percpu_thread);
>> -               }
>> -               if (cpu == control_cpu) {
>> -                       control_cpu = smp_processor_id();
>> -                       smp_mb();
>> -               }
>> -       }
>> -
>> -exit_ok:
>> -       return NOTIFY_OK;
>> +       clamp_cpus();
>> +       del_timer_sync(&wakeup_timer);
>>  }
>>
>> -static struct notifier_block powerclamp_cpu_notifier = {
>> -       .notifier_call = powerclamp_cpu_callback,
>> -};
>> -
>>  static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
>>                                  unsigned long *state)
>>  {
>> @@ -656,6 +540,7 @@ static int powerclamp_set_cur_state(struct thermal_cooling_device *cdev,
>>         }
>>
>>  exit_set:
>> +       clamp_cpus();
>>         return ret;
>>  }
>>
>> @@ -716,7 +601,6 @@ static int powerclamp_debug_show(struct seq_file *m, void *unused)
>>  {
>>         int i = 0;
>>
>> -       seq_printf(m, "controlling cpu: %d\n", control_cpu);
>>         seq_printf(m, "pct confidence steady dynamic (compensation)\n");
>>         for (i = 0; i < MAX_TARGET_RATIO; i++) {
>>                 seq_printf(m, "%d\t%lu\t%lu\t%lu\n",
>> @@ -762,33 +646,20 @@ file_error:
>>  static int powerclamp_init(void)
>>  {
>>         int retval;
>> -       int bitmap_size;
>> -
>> -       bitmap_size = BITS_TO_LONGS(num_possible_cpus()) * sizeof(long);
>> -       cpu_clamping_mask = kzalloc(bitmap_size, GFP_KERNEL);
>> -       if (!cpu_clamping_mask)
>> -               return -ENOMEM;
>>
>>         /* probe cpu features and ids here */
>>         retval = powerclamp_probe();
>>         if (retval)
>> -               goto exit_free;
>> +               goto exit;
>>
>>         /* set default limit, maybe adjusted during runtime based on feedback */
>>         window_size = 2;
>> -       register_hotcpu_notifier(&powerclamp_cpu_notifier);
>> -
>> -       powerclamp_thread = alloc_percpu(struct task_struct *);
>> -       if (!powerclamp_thread) {
>> -               retval = -ENOMEM;
>> -               goto exit_unregister;
>> -       }
>>
>>         cooling_dev = thermal_cooling_device_register("intel_powerclamp", NULL,
>>                                                 &powerclamp_cooling_ops);
>>         if (IS_ERR(cooling_dev)) {
>>                 retval = -ENODEV;
>> -               goto exit_free_thread;
>> +               goto exit;
>>         }
>>
>>         if (!duration)
>> @@ -798,23 +669,15 @@ static int powerclamp_init(void)
>>
>>         return 0;
>>
>> -exit_free_thread:
>> -       free_percpu(powerclamp_thread);
>> -exit_unregister:
>> -       unregister_hotcpu_notifier(&powerclamp_cpu_notifier);
>> -exit_free:
>> -       kfree(cpu_clamping_mask);
>> +exit:
>>         return retval;
>>  }
>>  module_init(powerclamp_init);
>>
>>  static void powerclamp_exit(void)
>>  {
>> -       unregister_hotcpu_notifier(&powerclamp_cpu_notifier);
>>         end_power_clamp();
>> -       free_percpu(powerclamp_thread);
>>         thermal_cooling_device_unregister(cooling_dev);
>> -       kfree(cpu_clamping_mask);
>>
>>         cancel_delayed_work_sync(&poll_pkg_cstate_work);
>>         debugfs_remove_recursive(debug_dir);
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 8db31ef..2493942 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -163,6 +163,11 @@ extern void get_avenrun(unsigned long *loads, unsigned long offset, int shift);
>>         load += n*(FIXED_1-exp); \
>>         load >>= FSHIFT;
>>
>> +/*
>> + * single value that denotes runtime == period, ie unlimited time.
>> + */
>> +#define RUNTIME_INF    ((u64)~0ULL)
>> +
>>  extern unsigned long total_forks;
>>  extern int nr_threads;
>>  DECLARE_PER_CPU(unsigned long, process_counts);
>> @@ -3002,6 +3007,10 @@ extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
>>
>>  #ifdef CONFIG_CGROUP_SCHED
>>  extern struct task_group root_task_group;
>> +#ifdef CONFIG_CFS_BANDWIDTH
>> +extern int tg_set_cfs_quota(struct task_group *tg, long cfs_quota_us);
>> +extern int tg_set_cfs_period(struct task_group *tg, long cfs_period_us);
>> +#endif /* CONFIG_CFS_BANDWIDTH */
>>  #endif /* CONFIG_CGROUP_SCHED */
>>
>>  extern int task_can_switch_user(struct user_struct *up,
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index e628cb1..7471b06 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -7097,6 +7097,7 @@ int in_sched_functions(unsigned long addr)
>>   * Every task in system belongs to this group at bootup.
>>   */
>>  struct task_group root_task_group;
>> +EXPORT_SYMBOL_GPL(root_task_group);
>>  LIST_HEAD(task_groups);
>>  #endif
>>
>> @@ -8059,9 +8060,6 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota)
>>         int i, ret = 0, runtime_enabled, runtime_was_enabled;
>>         struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
>>
>> -       if (tg == &root_task_group)
>> -               return -EINVAL;
>> -
>>         /*
>>          * Ensure we have at some amount of bandwidth every period.  This is
>>          * to prevent reaching a state of large arrears when throttled via
>> @@ -8141,6 +8139,7 @@ int tg_set_cfs_quota(struct task_group *tg, long cfs_quota_us)
>>
>>         return tg_set_cfs_bandwidth(tg, period, quota);
>>  }
>> +EXPORT_SYMBOL_GPL(tg_set_cfs_quota);
>>
>>  long tg_get_cfs_quota(struct task_group *tg)
>>  {
>> @@ -8164,6 +8163,7 @@ int tg_set_cfs_period(struct task_group *tg, long cfs_period_us)
>>
>>         return tg_set_cfs_bandwidth(tg, period, quota);
>>  }
>> +EXPORT_SYMBOL_GPL(tg_set_cfs_period);
>>
>>  long tg_get_cfs_period(struct task_group *tg)
>>  {
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 9a2a45c..20493e4 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -72,11 +72,6 @@ extern void update_cpu_load_active(struct rq *this_rq);
>>   * These are the 'tuning knobs' of the scheduler:
>>   */
>>
>> -/*
>> - * single value that denotes runtime == period, ie unlimited time.
>> - */
>> -#define RUNTIME_INF    ((u64)~0ULL)
>> -
>>  static inline int fair_policy(int policy)
>>  {
>>         return policy == SCHED_NORMAL || policy == SCHED_BATCH;
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/