Subject: Re: [PATCH] sched/rt: RT_RUNTIME_GREED sched feature
From: Daniel Bristot de Oliveira
To: Steven Rostedt, Christoph Lameter
Cc: Daniel Bristot de Oliveira, Ingo Molnar, Peter Zijlstra, linux-rt-users, LKML
Date: Mon, 7 Nov 2016 21:06:50 +0100
Message-ID: <1e79f711-95f1-da2f-f572-1ac4329c8be7@bristot.me>
In-Reply-To: <20161107150003.66777b43@gandalf.local.home>
References: <20161107133207.4282de69@gandalf.local.home> <20161107144738.4811a5dd@gandalf.local.home> <20161107150003.66777b43@gandalf.local.home>

On 11/07/2016 09:00 PM, Steven Rostedt wrote:
> On Mon, 7 Nov 2016 13:54:12 -0600 (CST)
> Christoph Lameter wrote:
>
>> On Mon, 7 Nov 2016, Steven Rostedt wrote:
>>
>>> On Mon, 7 Nov 2016 13:30:15 -0600 (CST)
>>> Christoph Lameter wrote:
>>>
>>>> SCHED_RR tasks alternately running on on cpu can cause endless deferral of
>>>> kworker threads. With the global effect of the OS processing reserved
>>>> it may be the case that the processor we are executing never gets any
>>>> time. And if that kworker threads role is releasing a mutex (like the
>>>> cgroup_lock) then deadlocks can result.
>>>
>>> I believe SCHED_RR tasks will still throttle if they use up too much of
>>> the CPU. But I still don't see how this patch helps your situation.
>>
>> The kworker thread will be able to make progress? Or am I not reading this
>> correctly?
>>
>
> If kworker is SCHED_OTHER, then it will be able to make progress if the
> RT tasks are throttled.
>
> What Daniel's patch does, is to turn off throttling of the RT tasks if
> there's no other task on the run queue.

Here is an example with two spinning RR tasks (f-22466 & f-22473) and
one SCHED_OTHER task (o-22506):

  f-22466 [002] d... 79045.641364: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=o next_pid=22506 next_prio=120
  o-22506 [002] d... 79045.690379: sched_switch: prev_comm=o prev_pid=22506 prev_prio=120 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79045.725359: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79045.825356: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79045.925350: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79046.025346: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79046.125346: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79046.225337: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79046.325333: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79046.425328: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79046.525324: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79046.625319: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79046.641320: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=o next_pid=22506 next_prio=120
  o-22506 [002] d... 79046.690335: sched_switch: prev_comm=o prev_pid=22506 prev_prio=120 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94

Throttling is per-rq, so even if the RR tasks keep switching between
each other, throttling still takes place if there is any SCHED_OTHER
task on the run queue. In Christoph's case, the other task will be the
kworker, as shown below:

  f-22466 [002] d... 79294.430542: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79294.483539: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=kworker/2:1 next_pid=22198 next_prio=120
  kworker/2:1-22198 [002] d... 79294.483544: sched_switch: prev_comm=kworker/2:1 prev_pid=22198 prev_prio=120 prev_state=S ==> next_comm=f next_pid=22473 next_prio=94
  f-22473 [002] d... 79294.530537: sched_switch: prev_comm=f prev_pid=22473 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22466 next_prio=94
  f-22466 [002] d... 79294.630541: sched_switch: prev_comm=f prev_pid=22466 prev_prio=94 prev_state=R ==> next_comm=f next_pid=22473 next_prio=94

The throttling allowed the kworker to run, but once the kworker went to
sleep, the RT tasks started to run again. With the previous behavior,
the system would either go idle, or the kworker would starve because
the runtime became infinite for the RR tasks.

-- Daniel
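
For readers who want to play with the idea, the behavior described above can be approximated with a toy model. This is only a sketch and not the kernel code or the patch itself: simulate(), its parameters and the 1 ms tick are hypothetical names and simplifications. The 1000 ms period and 950 ms runtime are assumed to mirror the default sched_rt_period_us and sched_rt_runtime_us values, which is consistent with the roughly 49 ms that the "o" task gets per second in the first trace. The RT tasks are assumed to spin, so they are always runnable.

```python
from collections import Counter

PERIOD_MS = 1000   # assumed: mirrors the default sched_rt_period_us (1000000)
RUNTIME_MS = 950   # assumed: mirrors the default sched_rt_runtime_us (950000)


def simulate(periods, other_work_ms, greed, runtime_ms=RUNTIME_MS):
    """Toy model of one run queue, advanced in 1 ms ticks.

    periods       -- number of RT periods to simulate
    other_work_ms -- work queued for a SCHED_OTHER task at the start of
                     each period (0 means no such task; leftover work is
                     simply dropped at the next period in this model)
    greed         -- if True, RT tasks are throttled only while the
                     SCHED_OTHER task has pending work
    runtime_ms    -- RT budget per period; None models throttling disabled
    Returns a Counter with milliseconds spent in 'rt', 'other' and 'idle'.
    """
    usage = Counter()
    for _ in range(periods):
        rt_used = 0                # RT budget is replenished every period
        pending = other_work_ms
        for _ in range(PERIOD_MS):
            has_budget = runtime_ms is None or rt_used < runtime_ms
            if has_budget or (greed and pending == 0):
                usage["rt"] += 1       # spinning RT tasks keep the CPU
                rt_used += 1
            elif pending > 0:
                usage["other"] += 1    # RT throttled, SCHED_OTHER task runs
                pending -= 1
            else:
                usage["idle"] += 1     # RT throttled, nothing else to run


    return usage


if __name__ == "__main__":
    scenarios = [
        ("spinning other task, no greed", dict(other_work_ms=1000, greed=False)),
        ("kworker (1 ms), no greed", dict(other_work_ms=1, greed=False)),
        ("kworker (1 ms), greed", dict(other_work_ms=1, greed=True)),
        ("kworker (1 ms), throttling off",
         dict(other_work_ms=1, greed=False, runtime_ms=None)),
    ]
    for name, kwargs in scenarios:
        print(name, dict(simulate(1, **kwargs)))
```

In this model, a 1 ms kworker without greed leaves the CPU idle for the remaining 49 ms of the period, and with throttling disabled it never runs at all. With greed enabled it gets its 1 ms and the RT tasks take the CPU back for the rest of the period, which is the pattern visible in the second trace.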