Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752702AbbEHMsK (ORCPT ); Fri, 8 May 2015 08:48:10 -0400 Received: from bombadil.infradead.org ([198.137.202.9]:40289 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750851AbbEHMsH (ORCPT ); Fri, 8 May 2015 08:48:07 -0400 Date: Fri, 8 May 2015 14:47:48 +0200 From: Peter Zijlstra To: Ingo Molnar , Linus Torvalds , Oleg Nesterov Cc: linux-kernel@vger.kernel.org, ja@ssi.bg, neilb@suse.de Subject: [PATCH] sched: Introduce TASK_NOLOAD and TASK_IDLE Message-ID: <20150508124748.GH27504@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3180 Lines: 79 Subject: sched: Introduce TASK_NOLOAD and TASK_IDLE From: Peter Zijlstra Date: Fri May 8 14:23:45 CEST 2015 Currently people use TASK_INTERRUPTIBLE to idle kthreads and wait for 'work' because TASK_UNINTERRUPTIBLE contributes to the loadavg. Having all idle kthreads contribute to the loadavg is somewhat silly. Now mostly this works OK, because kthreads have all their signals masked. However there's a few sites where this is causing problems and TASK_UNINTERRUPTIBLE should be used, except for that loadavg issue. This patch adds TASK_NOLOAD which, when combined with TASK_UNINTERRUPTIBLE avoids the loadavg accounting. As most of imagined usage sites are loops where a thread wants to idle, waiting for work, a helper TASK_IDLE is introduced. Cc: Julian Anastasov Cc: NeilBrown Cc: Oleg Nesterov Cc: Linus Torvalds Signed-off-by: Peter Zijlstra (Intel) --- include/linux/sched.h | 10 +++++++--- include/trace/events/sched.h | 3 ++- 2 files changed, 9 insertions(+), 4 deletions(-) --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -213,9 +213,10 @@ print_cfs_rq(struct seq_file *m, int cpu #define TASK_WAKEKILL 128 #define TASK_WAKING 256 #define TASK_PARKED 512 -#define TASK_STATE_MAX 1024 +#define TASK_NOLOAD 1024 +#define TASK_STATE_MAX 2048 -#define TASK_STATE_TO_CHAR_STR "RSDTtXZxKWP" +#define TASK_STATE_TO_CHAR_STR "RSDTtXZxKWPN" extern char ___assert_task_state[1 - 2*!!( sizeof(TASK_STATE_TO_CHAR_STR)-1 != ilog2(TASK_STATE_MAX)+1)]; @@ -225,6 +226,8 @@ extern char ___assert_task_state[1 - 2*! #define TASK_STOPPED (TASK_WAKEKILL | __TASK_STOPPED) #define TASK_TRACED (TASK_WAKEKILL | __TASK_TRACED) +#define TASK_IDLE (TASK_UNINTERRUPTIBLE | TASK_NOLOAD) + /* Convenience macros for the sake of wake_up */ #define TASK_NORMAL (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE) #define TASK_ALL (TASK_NORMAL | __TASK_STOPPED | __TASK_TRACED) @@ -240,7 +243,8 @@ extern char ___assert_task_state[1 - 2*! ((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0) #define task_contributes_to_load(task) \ ((task->state & TASK_UNINTERRUPTIBLE) != 0 && \ - (task->flags & PF_FROZEN) == 0) + (task->flags & PF_FROZEN) == 0 && \ + (task->state & TASK_NOLOAD) == 0) #ifdef CONFIG_DEBUG_ATOMIC_SLEEP --- a/include/trace/events/sched.h +++ b/include/trace/events/sched.h @@ -147,7 +147,8 @@ TRACE_EVENT(sched_switch, __print_flags(__entry->prev_state & (TASK_STATE_MAX-1), "|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" }, { 64, "x" }, - { 128, "K" }, { 256, "W" }, { 512, "P" }) : "R", + { 128, "K" }, { 256, "W" }, { 512, "P" }, + { 1024, "N" }) : "R", __entry->prev_state & TASK_STATE_MAX ? "+" : "", __entry->next_comm, __entry->next_pid, __entry->next_prio) ); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/