From: Waiman Long
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Phil Auld, Brent Rowsell, Peter Hunt,
    Waiman Long
Subject: [PATCH v3] sched/core: Use empty mask to reset cpumasks in sched_setaffinity()
Date: Thu, 3 Aug 2023 22:32:18 -0400
Message-Id: <20230804023218.75544-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
cpumask"), user provided CPU affinity via sched_setaffinity(2) is
preserved even if the task is being moved to a different cpuset.
However, that affinity is also inherited by any subsequently created
child processes, which may not want or be aware of that affinity.

One way to solve this problem is to provide a way to back off from that
user provided CPU affinity.
This patch implements such a scheme by using an empty cpumask to signal
a reset of the cpumasks to the default as allowed by the current cpuset.

Before this patch, passing in an empty cpumask to sched_setaffinity(2)
always returns an -EINVAL error. With this patch, an alternative error
of -ENODEV will be returned if sched_setaffinity(2) has been called
before to set up user_cpus_ptr. In this case, the user_cpus_ptr that
stores the user provided affinity will be cleared and the task's CPU
affinity will be reset to that of the current cpuset. This alternative
error code of -ENODEV signals that no CPU is specified and, at the same
time, that the CPU affinity has been reset to the cpuset default as a
side effect.

If sched_setaffinity(2) has not been called previously, an -EINVAL
error will be returned with an empty cpumask just like before. Tests or
tools that rely on the behavior that an empty cpumask will return an
error code will not be affected.

We will have to update the sched_setaffinity(2) manpage to document this
possible side effect of passing in an empty cpumask.
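The return-code rules described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: struct task_model and model_setaffinity() are hypothetical names, a uint64_t bitmap stands in for struct cpumask, and has_user_cpus stands in for a non-NULL p->user_cpus_ptr.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a task; not a kernel structure. */
struct task_model {
	uint64_t cpuset_cpus;	/* CPUs allowed by the current cpuset */
	uint64_t cpus_mask;	/* effective CPU affinity */
	uint64_t user_cpus;	/* user requested mask, if any */
	bool has_user_cpus;	/* models p->user_cpus_ptr != NULL */
};

/* Returns 0 or a negative errno value, mirroring the described behavior. */
static int model_setaffinity(struct task_model *t, uint64_t in_mask)
{
	bool reset_cpumasks = (in_mask == 0);
	uint64_t new_mask;

	assert(t);

	if (reset_cpumasks && t->has_user_cpus) {
		/* Clear the user request, fall back to the cpuset default. */
		t->has_user_cpus = false;
		t->user_cpus = 0;
		t->cpus_mask = t->cpuset_cpus;
		/* Still report an error so existing callers see a failure. */
		return -ENODEV;
	}

	new_mask = in_mask & t->cpuset_cpus;
	if (new_mask == 0)
		return -EINVAL;	/* no usable CPU; state is left untouched */

	t->user_cpus = in_mask;
	t->has_user_cpus = true;
	t->cpus_mask = new_mask;
	return 0;
}
```

In this model an empty mask never succeeds: it yields -EINVAL when no user mask was set earlier and -ENODEV (plus the reset) when one was, which matches the compatibility argument made above.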

Signed-off-by: Waiman Long
---
 kernel/sched/core.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c52c2eba7c73..3ef7397f2a61 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8317,7 +8317,12 @@ __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
 	}
 
 	cpuset_cpus_allowed(p, cpus_allowed);
-	cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
+
+	/* Default to cpus_allowed with NULL new_mask */
+	if (ctx->new_mask)
+		cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
+	else
+		cpumask_copy(new_mask, cpus_allowed);
 
 	ctx->new_mask = new_mask;
 	ctx->flags |= SCA_CHECK;
@@ -8366,6 +8371,7 @@ __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
 
 long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 {
+	bool reset_cpumasks = cpumask_empty(in_mask);
 	struct affinity_context ac;
 	struct cpumask *user_mask;
 	struct task_struct *p;
@@ -8403,15 +8409,26 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 		goto out_put_task;
 
 	/*
-	 * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
-	 * alloc_user_cpus_ptr() returns NULL.
+	 * If an empty cpumask is passed in and user_cpus_ptr is set,
+	 * clear user_cpus_ptr and reset the current cpu affinity to the
+	 * default for the current cpuset. If user_cpus_ptr isn't set,
+	 * -EINVAL will be returned as before.
 	 */
-	user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
-	if (user_mask) {
-		cpumask_copy(user_mask, in_mask);
-	} else if (IS_ENABLED(CONFIG_SMP)) {
-		retval = -ENOMEM;
-		goto out_put_task;
+	if (reset_cpumasks && p->user_cpus_ptr) {
+		in_mask = NULL;	/* To be updated in __sched_setaffinity */
+		user_mask = NULL;
+	} else {
+		/*
+		 * With non-SMP configs, user_cpus_ptr/user_mask isn't used
+		 * and alloc_user_cpus_ptr() returns NULL.
+		 */
+		user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
+		if (user_mask) {
+			cpumask_copy(user_mask, in_mask);
+		} else if (IS_ENABLED(CONFIG_SMP)) {
+			retval = -ENOMEM;
+			goto out_put_task;
+		}
 	}
 
 	ac = (struct affinity_context){
@@ -8423,6 +8440,13 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 	retval = __sched_setaffinity(p, &ac);
 	kfree(ac.user_mask);
 
+	/*
+	 * Force an error return (-ENODEV), if no error yet, for the empty
+	 * cpumask case to avoid breaking existing tests.
+	 */
+	if (reset_cpumasks && !retval)
+		retval = -ENODEV;
+
 out_put_task:
 	put_task_struct(p);
 	return retval;
-- 
2.31.1