Date: Wed, 10 Apr 2019 13:44:53 -0400
From: Phil Auld
To: Joel Savitz
Cc: linux-kernel@vger.kernel.org, Waiman Long, Tejun Heo, Li Zefan,
	cgroups@vger.kernel.org
Subject: Re: [PATCH v2] cpuset: restore sanity to cpuset_cpus_allowed_fallback()
Message-ID: <20190410174452.GI10132@pauld.bos.csb>
References: <20190409204003.6428-1-jsavitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190409204003.6428-1-jsavitz@redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)

On Tue, Apr 09, 2019 at 04:40:03PM -0400 Joel Savitz wrote:
> If a process is limited by taskset (i.e. cpuset) to only be allowed to
> run on cpu N, and then cpu N is offlined via hotplug, the process will
> be assigned the current value of its cpuset cgroup's effective_cpus field
> in a call to do_set_cpus_allowed() in cpuset_cpus_allowed_fallback().
> This argument's value does not make sense for this case, because
> task_cs(tsk)->effective_cpus is modified by cpuset_hotplug_workfn()
> to reflect the new value of cpu_active_mask after cpu N is removed from
> the mask. While this may make sense for the cgroup affinity mask, it
> does not make sense on a per-task basis, as a task that was previously
> limited to only be run on cpu N will be limited to every cpu _except_ for
> cpu N after it is offlined/onlined via hotplug.
>
> Pre-patch behavior:
>
> $ grep Cpus /proc/$$/status
> Cpus_allowed:       ff
> Cpus_allowed_list:  0-7
>
> $ taskset -p 4 $$
> pid 19202's current affinity mask: f
> pid 19202's new affinity mask: 4
>
> $ grep Cpus /proc/self/status
> Cpus_allowed:       04
> Cpus_allowed_list:  2
>
> # echo off > /sys/devices/system/cpu/cpu2/online
> $ grep Cpus /proc/$$/status
> Cpus_allowed:       0b
> Cpus_allowed_list:  0-1,3
>
> # echo on > /sys/devices/system/cpu/cpu2/online
> $ grep Cpus /proc/$$/status
> Cpus_allowed:       0b
> Cpus_allowed_list:  0-1,3
>
> On a patched system, the final grep produces the following
> output instead:
>
> $ grep Cpus /proc/$$/status
> Cpus_allowed:       ff
> Cpus_allowed_list:  0-7
>
> This patch changes the above behavior by instead resetting the mask to
> task_cs(tsk)->cpus_allowed by default, and cpu_possible_mask in legacy
> mode.
>
> This fallback mechanism is only triggered if _every_ other valid avenue
> has been traveled, and it is the last resort before calling BUG().
>
> Signed-off-by: Joel Savitz
> ---
>  kernel/cgroup/cpuset.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 4834c4214e9c..6c9deb2cc687 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -3255,10 +3255,23 @@ void cpuset_cpus_allowed(struct task_struct *tsk, struct cpumask *pmask)
>  	spin_unlock_irqrestore(&callback_lock, flags);
>  }
>  
> +/**
> + * cpuset_cpus_allowed_fallback - final fallback before complete catastrophe.
> + * @tsk: pointer to task_struct with which the scheduler is struggling
> + *
> + * Description: In the case that the scheduler cannot find an allowed cpu in
> + * tsk->cpus_allowed, we fall back to task_cs(tsk)->cpus_allowed. In legacy
> + * mode however, this value is the same as task_cs(tsk)->effective_cpus,
> + * which will not contain a sane cpumask during cases such as cpu hotplugging.
> + * This is the absolute last resort for the scheduler and it is only used if
> + * _every_ other avenue has been traveled.
> + **/
> +
>  void cpuset_cpus_allowed_fallback(struct task_struct *tsk)
>  {
>  	rcu_read_lock();
> -	do_set_cpus_allowed(tsk, task_cs(tsk)->effective_cpus);
> +	do_set_cpus_allowed(tsk, is_in_v2_mode() ?
> +		task_cs(tsk)->cpus_allowed : cpu_possible_mask);
>  	rcu_read_unlock();
>  
>  	/*
> -- 
> 2.18.1
> 

Fwiw,

Acked-by: Phil Auld

-- 
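
[Editorial note, not part of the thread above: for readers who want to reproduce
the before/after comparison from the commit message without relying on
"grep Cpus /proc/$$/status", here is a minimal user-space sketch. It is not part
of the patch, and the program/file names below are only illustrative. It prints
a task's allowed CPUs via sched_getaffinity(2); pass it the pid of a shell that
was pinned with taskset, then run it again after the offline/online cycle shown
in the transcript.]

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
	/* pid 0 means "the calling thread"; pass e.g. $$ to inspect a shell */
	pid_t pid = (argc > 1) ? (pid_t)atoi(argv[1]) : 0;
	cpu_set_t mask;

	if (sched_getaffinity(pid, sizeof(mask), &mask) != 0) {
		perror("sched_getaffinity");
		return EXIT_FAILURE;
	}

	/* print every cpu still present in the task's allowed mask */
	printf("allowed cpus (%d total):", CPU_COUNT(&mask));
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &mask))
			printf(" %d", cpu);
	printf("\n");

	return EXIT_SUCCESS;
}

[Built with e.g. "gcc -o show_cpus show_cpus.c" and run as "./show_cpus $$"
in place of the grep calls above, it should report only cpu 2 after the
"taskset -p 4 $$" step; after the cpu2 offline/online cycle, an unpatched
kernel still reports 0-1,3 for the shell, while a patched kernel reports the
restored 0-7, matching the transcript in the commit message.]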