Subject: Re: [PATCH] sched/rt: RT_RUNTIME_GREED sched feature
From: Daniel Bristot de Oliveira
To: Peter Zijlstra, Steven Rostedt
Cc: Daniel Bristot de Oliveira, Ingo Molnar, Christoph Lameter, linux-rt-users, LKML, Tommaso Cucinotta
Date: Tue, 8 Nov 2016 20:29:49 +0100

On 11/08/2016 07:05 PM, Peter Zijlstra wrote:
>> > I know what we want to do, but there's some momentous problems that
>> > need to be solved first.
> Like what?

The problem is that, with RT_RUNTIME_SHARE, a CPU will almost always
borrow enough runtime to let a CPU-intensive rt task run forever...
well, not forever, but until the system crashes because a kworker
starved on this CPU. Kworkers are sched fair by design, and users do not
always have a way to avoid them on an isolated CPU, for example.

The user can then disable RT_RUNTIME_SHARE, but then the CPU will go
idle for (period - runtime) in each period... throwing CPU time in the
trash.

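To put a number on the (period - runtime) gap: a small sketch of the
arithmetic only, not of the scheduler. The values are an assumption (the
usual defaults of sched_rt_period_us and sched_rt_runtime_us); the
function names are made up for illustration.

```python
# Toy arithmetic only; this does not model the scheduler itself.
# Assumed values: the usual defaults of sched_rt_period_us and
# sched_rt_runtime_us. The discussion does not say which values are in use.
RT_PERIOD_US = 1_000_000
RT_RUNTIME_US = 950_000


def throttled_time_us(period_us: int, runtime_us: int) -> int:
    """Time per period during which rt tasks are throttled.

    With RT_RUNTIME_SHARE disabled and no fair task to run, this is the
    time the CPU sits idle in every period.
    """
    if not 0 <= runtime_us <= period_us:
        raise ValueError("expected 0 <= runtime_us <= period_us")
    return period_us - runtime_us


def throttled_fraction(period_us: int, runtime_us: int) -> float:
    """Same quantity, as a fraction of the period."""
    return throttled_time_us(period_us, runtime_us) / period_us


if __name__ == "__main__":
    gap = throttled_time_us(RT_PERIOD_US, RT_RUNTIME_US)
    share = throttled_fraction(RT_PERIOD_US, RT_RUNTIME_US)
    print(f"throttled per period: {gap} us ({share:.1%} of the period)")
```

With those assumed values the gap is 50000 us per period, i.e. 5% of the
CPU goes unused whenever there is no fair task waiting for it.
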
>> > Until then, we may be forced to continue with
>> > hacks.
> Well, the more ill specified hacks we put in, the harder it will be to
> replace because people will end up depending on it.

The proposed patch seems to match the behavior users expect from rt
throttling: they want a safeguard for fair tasks while allowing -rt
tasks to run as much as possible. I see (and completely agree) that a DL
server for fair/rt tasks would be the best way to deal with this
problem, but it will take some time until such a solution is available
:-(. We even discussed this at Retis today, but yeah, it will take some
time even in the best case.

(Thinking aloud... a DL server would react like the proposed patch, in
the sense that it would not be activated without tasks to run, and it
would return the CPU to other tasks if the tasks inside the server
finish their job before the end of the DL server runtime...)

-- Daniel