From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
    Andrew Morton, Roman Gushchin, Phil Auld, Peter Zijlstra,
    Juri Lelli, Frederic Weisbecker, Marcelo Tosatti,
    Michal Koutný, Waiman Long
Subject: [PATCH v3 4/9] cgroup/cpuset: Enable event notification when partition becomes invalid
Date: Tue, 20 Jul 2021 10:18:29 -0400
Message-Id: <20210720141834.10624-5-longman@redhat.com>
In-Reply-To: <20210720141834.10624-1-longman@redhat.com>
References: <20210720141834.10624-1-longman@redhat.com>

A valid cpuset partition can become invalid if all of its CPUs are offlined
or otherwise removed. This can happen through external events, without
"cpuset.cpus.partition" being touched at all. Users who rely on a partition
being present currently have no simple way to be notified of such an event
other than constant periodic polling, which is both inefficient and
cumbersome. To make life easier for those users, event notification is now
enabled for "cpuset.cpus.partition" whenever it goes into or out of an
invalid partition state.
Suggested-by: Tejun Heo
Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 49 ++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 04a6951abe2a..2e34fc5b76f0 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -160,6 +160,9 @@ struct cpuset {
 	 */
 	int use_parent_ecpus;
 	int child_ecpus_count;
+
+	/* Handle for cpuset.cpus.partition */
+	struct cgroup_file partition_file;
 };
 
 /*
@@ -263,6 +266,19 @@ static inline int is_partition_root(const struct cpuset *cs)
 	return cs->partition_root_state > 0;
 }
 
+/*
+ * Send notification event of partition_root_state change when going into
+ * or out of PRS_ERROR which may be due to an external event like hotplug.
+ */
+static inline void notify_partition_change(struct cpuset *cs,
+					   int old_prs, int new_prs)
+{
+	if ((old_prs == new_prs) ||
+	    ((old_prs != PRS_ERROR) && (new_prs != PRS_ERROR)))
+		return;
+	cgroup_file_notify(&cs->partition_file);
+}
+
 static struct cpuset top_cpuset = {
 	.flags = ((1 << CS_ONLINE) | (1 << CS_CPU_EXCLUSIVE) |
 		  (1 << CS_MEM_EXCLUSIVE)),
@@ -1148,7 +1164,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd,
 	struct cpuset *parent = parent_cs(cpuset);
 	int adding;	/* Moving cpus from effective_cpus to subparts_cpus */
 	int deleting;	/* Moving cpus from subparts_cpus to effective_cpus */
-	int new_prs;
+	int old_prs, new_prs;
 	bool part_error = false;	/* Partition error? */
 
 	percpu_rwsem_assert_held(&cpuset_rwsem);
@@ -1184,7 +1200,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd,
 	 * A cpumask update cannot make parent's effective_cpus become empty.
 	 */
 	adding = deleting = false;
-	new_prs = cpuset->partition_root_state;
+	old_prs = new_prs = cpuset->partition_root_state;
 	if (cmd == partcmd_enable) {
 		cpumask_copy(tmp->addmask, cpuset->cpus_allowed);
 		adding = true;
@@ -1274,7 +1290,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd,
 				       parent->subparts_cpus);
 	}
 
-	if (!adding && !deleting && (new_prs == cpuset->partition_root_state))
+	if (!adding && !deleting && (new_prs == old_prs))
 		return 0;
 
 	/*
@@ -1302,9 +1318,11 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd,
 		parent->nr_subparts_cpus
			= cpumask_weight(parent->subparts_cpus);
 
-	if (cpuset->partition_root_state != new_prs)
+	if (old_prs != new_prs)
 		cpuset->partition_root_state = new_prs;
+
 	spin_unlock_irq(&callback_lock);
+	notify_partition_change(cpuset, old_prs, new_prs);
 
 	return cmd == partcmd_update;
 }
@@ -1326,7 +1344,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp)
 	struct cpuset *cp;
 	struct cgroup_subsys_state *pos_css;
 	bool need_rebuild_sched_domains = false;
-	int new_prs;
+	int old_prs, new_prs;
 
 	rcu_read_lock();
 	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
@@ -1366,8 +1384,8 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp)
 		 * update_tasks_cpumask() again for tasks in the parent
 		 * cpuset if the parent's subparts_cpus changes.
 		 */
-		new_prs = cp->partition_root_state;
-		if ((cp != cs) && new_prs) {
+		old_prs = new_prs = cp->partition_root_state;
+		if ((cp != cs) && old_prs) {
 			switch (parent->partition_root_state) {
 			case PRS_DISABLED:
 				/*
@@ -1438,10 +1456,11 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp)
 			}
 		}
 
-		if (new_prs != cp->partition_root_state)
+		if (new_prs != old_prs)
 			cp->partition_root_state = new_prs;
 
 		spin_unlock_irq(&callback_lock);
+		notify_partition_change(cp, old_prs, new_prs);
 
 		WARN_ON(!is_in_v2_mode() &&
 			!cpumask_equal(cp->cpus_allowed, cp->effective_cpus));
@@ -2023,6 +2042,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
 		spin_lock_irq(&callback_lock);
 		cs->partition_root_state = new_prs;
 		spin_unlock_irq(&callback_lock);
+		notify_partition_change(cs, old_prs, new_prs);
 	}
 
 	free_cpumasks(NULL, &tmpmask);
@@ -2708,6 +2728,7 @@ static struct cftype dfl_files[] = {
 		.write = sched_partition_write,
 		.private = FILE_PARTITION_ROOT,
 		.flags = CFTYPE_NOT_ON_ROOT,
+		.file_offset = offsetof(struct cpuset, partition_file),
 	},
 
 	{
@@ -3103,11 +3124,17 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
 	 */
 	if ((parent->partition_root_state == PRS_ERROR) ||
 	     cpumask_empty(&new_cpus)) {
+		int old_prs;
+
 		update_parent_subparts_cpumask(cs, partcmd_disable,
 					       NULL, tmp);
-		spin_lock_irq(&callback_lock);
-		cs->partition_root_state = PRS_ERROR;
-		spin_unlock_irq(&callback_lock);
+		old_prs = cs->partition_root_state;
+		if (old_prs != PRS_ERROR) {
+			spin_lock_irq(&callback_lock);
+			cs->partition_root_state = PRS_ERROR;
+			spin_unlock_irq(&callback_lock);
+			notify_partition_change(cs, old_prs, PRS_ERROR);
+		}
 	}
 	cpuset_force_rebuild();
 }
-- 
2.18.1