Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp4957440ybb; Tue, 24 Mar 2020 08:21:46 -0700 (PDT) X-Google-Smtp-Source: ADFU+vviOQo1hr8w4idC5KLUD8ibI6156JVASH53uIA3t8RpxJVUoi7KKCu8+byQX/NiMYyx8hl6 X-Received: by 2002:a4a:db87:: with SMTP id s7mr2133447oou.38.1585063306180; Tue, 24 Mar 2020 08:21:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585063306; cv=none; d=google.com; s=arc-20160816; b=uLLU73/HRqx3m3Kn2CjUecaF5hzmeLpYm2r+WZcDYeD0TdNSmsP54sTyxw/djIf9Im 5xuuwx78+QlFDS7zHp5xRr9e/Bx5AKhxtq2L+O271+0WRM4Deqzs0TI7n/o/9N783iua xRuBxUhT2gBSBY0JbXu2L2lgOhFlxyvez6nVZqcvAMAMj1lXJqO0He+ZFCFTNwAVmIVq o8S3vcMa4niBLdFs5ysgIxxFKg3crYYkPl7C5AZnIbvmD3b2rw+KvDhytOFa84vL+Hoo 3GvLC1J6oZg4Uuj5wC6itYdQTrV0uoi03559STDegB4ufHXmSmS3Mt0V9saF0yr9O2bG B7EA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=PsOvQBlzuy3xnUZlgfJeTMdXK5YEHhrw4gbm6DAYH3A=; b=YBx9Q0hnWgtQN3hocvjAjTDRO4BkQqTa+nj6iRmTbwr6WoY4oXlI7i8GHMEWg6ftNn DQnrBJXX9tAGAZux2Vmd4/ngVKiBEgAJXujd7SHO/5GRZj+7RpE+76lw7RPWFLbCGZTK inMgCYzWEzAtyfRfCGa52pXdrJO7CRfDzA2O80JNQLUzD1zcXQJaEQz3gNHrX7M2btUY lmNfP3oaPSSx5AECKuhuNTa0cfl5pdXx+6dNGPAen8R0zVUnYc5e9Jqj5dD1H9KzcUMs /aKsCJLPo/ehI4Yy+tQIfyIQ6RYGpt+Vn8nCtPVznDTixq51F3XyIW3FLUhTwtsJ5R95 o7WQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=FVMJqjGc; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w8si6477230oib.105.2020.03.24.08.21.31; Tue, 24 Mar 2020 08:21:46 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=FVMJqjGc; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728350AbgCXPUy (ORCPT + 99 others); Tue, 24 Mar 2020 11:20:54 -0400 Received: from us-smtp-delivery-74.mimecast.com ([216.205.24.74]:57776 "EHLO us-smtp-delivery-74.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727589AbgCXPUw (ORCPT ); Tue, 24 Mar 2020 11:20:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1585063250; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=PsOvQBlzuy3xnUZlgfJeTMdXK5YEHhrw4gbm6DAYH3A=; b=FVMJqjGchEpEl9Gy5oCDj1O/KkjFlyAiU2q6U9f5Ef92hYvah9wh8jKAGRKOvooDXHVCnB 6AGwF3I4Tcd2WjiRI5LWKo5RIXb5SO50TQ7RDRdvMLx5UExXTFwCY/FYmSXJzgjKgevbAU LyuumphgxDayUnvTzEh9B18U4M7UwNE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-301-1Ol68rMINbiaMZT6kIM8vQ-1; Tue, 24 Mar 2020 11:20:44 -0400 X-MC-Unique: 1Ol68rMINbiaMZT6kIM8vQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8B07918CA244; Tue, 24 Mar 2020 15:20:42 +0000 (UTC) Received: from fuller.cnet (ovpn-116-9.gru2.redhat.com [10.97.116.9]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 0CCA61001938; Tue, 24 Mar 2020 15:20:42 +0000 (UTC) Received: by fuller.cnet (Postfix, from userid 1000) id 773EF4198B3A; Tue, 24 Mar 2020 12:20:16 -0300 (-03) Date: Tue, 24 Mar 2020 12:20:16 -0300 From: Marcelo Tosatti To: Thomas Gleixner Cc: Chris Friesen , linux-kernel@vger.kernel.org, Christoph Lameter , Jim Somerville , Andrew Morton , Frederic Weisbecker , Peter Zijlstra Subject: [PATCH v2] isolcpus: affine kernel threads to specified cpumask Message-ID: <20200324152016.GA25422@fuller.cnet> References: <20200323135414.GA28634@fuller.cnet> <87k13boxcn.fsf@nanos.tec.linutronix.de> <87imiuq0cg.fsf@nanos.tec.linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87imiuq0cg.fsf@nanos.tec.linutronix.de> User-Agent: Mutt/1.10.1 (2018-07-13) X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This is a kernel enhancement to configure the cpu affinity of kernel threads via kernel boot option isolcpus=no_kthreads,, When this option is specified, the cpumask is immediately applied upon thread launch. This does not affect kernel threads that specify cpu and node. This allows CPU isolation (that is not allowing certain threads to execute on certain CPUs) without using the isolcpus=domain parameter, making it possible to enable load balancing on such CPUs during runtime (see Note-1: this is based off on Wind River's patch at https://github.com/starlingx-staging/stx-integ/blob/master/kernel/kernel-std/centos/patches/affine-compute-kernel-threads.patch Difference being that this patch is limited to modifying kernel thread cpumask: Behaviour of other threads can be controlled via cgroups or sched_setaffinity. Note-2: MontaVista's patch was based off Christoph Lameter's patch at https://lwn.net/Articles/565932/ with the only difference being the kernel parameter changed from kthread to kthread_cpus. Signed-off-by: Marcelo Tosatti --- v2: use isolcpus= subcommand (Thomas Gleixner) Documentation/admin-guide/kernel-parameters.txt | 8 ++++++++ include/linux/cpumask.h | 5 +++++ include/linux/sched/isolation.h | 1 + init/main.c | 1 + kernel/cpu.c | 13 +++++++++++++ kernel/kthread.c | 4 ++-- kernel/sched/isolation.c | 6 ++++++ 7 files changed, 36 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index c07815d230bc..7318e3057383 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1959,6 +1959,14 @@ the CPU affinity syscalls or cpuset. begins at 0 and the maximum value is "number of CPUs in system - 1". + When using cpusets, use the isolcpus option no_kthreads + to avoid creation of kernel threads on isolated CPUs. + + no_kthreads + Adjust the CPU affinity mask of unbound kernel threads to + not contain CPUs on the isolated list. This complements + the isolation provided by the cpusets mechanism described + above. managed_irq diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index d5cc88514aee..a0dc4b12e048 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -55,6 +55,7 @@ extern unsigned int nr_cpu_ids; * cpu_present_mask - has bit 'cpu' set iff cpu is populated * cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler * cpu_active_mask - has bit 'cpu' set iff cpu available to migration + * cpu_kthread_mask - has bit 'cpu' set iff general kernel threads allowed * * If !CONFIG_HOTPLUG_CPU, present == possible, and active == online. * @@ -91,10 +92,12 @@ extern struct cpumask __cpu_possible_mask; extern struct cpumask __cpu_online_mask; extern struct cpumask __cpu_present_mask; extern struct cpumask __cpu_active_mask; +extern struct cpumask __cpu_kthread_mask; #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask) #define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask) #define cpu_present_mask ((const struct cpumask *)&__cpu_present_mask) #define cpu_active_mask ((const struct cpumask *)&__cpu_active_mask) +#define cpu_kthread_mask ((const struct cpumask *)&__cpu_kthread_mask) extern atomic_t __num_online_cpus; @@ -145,6 +148,8 @@ static inline unsigned int cpumask_check(unsigned int cpu) return cpu; } +int __init init_kthread_cpumask(void); + #if NR_CPUS == 1 /* Uniprocessor. Assume all masks are "1". */ static inline unsigned int cpumask_first(const struct cpumask *srcp) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h index 0fbcbacd1b29..d002332d00eb 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -14,6 +14,7 @@ enum hk_flags { HK_FLAG_DOMAIN = (1 << 5), HK_FLAG_WQ = (1 << 6), HK_FLAG_MANAGED_IRQ = (1 << 7), + HK_FLAG_NO_KTHREADS = (1 << 8), }; #ifdef CONFIG_CPU_ISOLATION diff --git a/init/main.c b/init/main.c index ee4947af823f..69f528ddc477 100644 --- a/init/main.c +++ b/init/main.c @@ -618,6 +618,7 @@ noinline void __ref rest_init(void) int pid; rcu_scheduler_starting(); + init_kthread_cpumask(); /* * We need to spawn init first so that it obtains pid 1, however * the init task will end up wanting to create kthreads, which, if diff --git a/kernel/cpu.c b/kernel/cpu.c index 9c706af713fb..c549ad8e6596 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -2311,9 +2311,22 @@ EXPORT_SYMBOL(__cpu_present_mask); struct cpumask __cpu_active_mask __read_mostly; EXPORT_SYMBOL(__cpu_active_mask); +struct cpumask __cpu_kthread_mask __read_mostly; +EXPORT_SYMBOL(__cpu_kthread_mask); + atomic_t __num_online_cpus __read_mostly; EXPORT_SYMBOL(__num_online_cpus); +int __init init_kthread_cpumask(void) +{ + const struct cpumask *kthread_mask; + + kthread_mask = housekeeping_cpumask(HK_FLAG_NO_KTHREADS); + cpumask_copy(&__cpu_kthread_mask, kthread_mask); + + return 0; +} + void init_cpu_present(const struct cpumask *src) { cpumask_copy(&__cpu_present_mask, src); diff --git a/kernel/kthread.c b/kernel/kthread.c index b262f47046ca..be9c8d53a986 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -347,7 +347,7 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data), * The kernel thread should not inherit these properties. */ sched_setscheduler_nocheck(task, SCHED_NORMAL, ¶m); - set_cpus_allowed_ptr(task, cpu_all_mask); + set_cpus_allowed_ptr(task, cpu_kthread_mask); } kfree(create); return task; @@ -572,7 +572,7 @@ int kthreadd(void *unused) /* Setup a clean context for our children to inherit. */ set_task_comm(tsk, "kthreadd"); ignore_signals(tsk); - set_cpus_allowed_ptr(tsk, cpu_all_mask); + set_cpus_allowed_ptr(tsk, cpu_kthread_mask); set_mems_allowed(node_states[N_MEMORY]); current->flags |= PF_NOFREEZE; diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 008d6ac2342b..e9d48729efd4 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -169,6 +169,12 @@ static int __init housekeeping_isolcpus_setup(char *str) continue; } + if (!strncmp(str, "no_kthreads,", 12)) { + str += 12; + flags |= HK_FLAG_NO_KTHREADS; + continue; + } + pr_warn("isolcpus: Error, unknown flag\n"); return 0; }