From: Juri Lelli
To: Peter Zijlstra, Ingo Molnar, Qais Yousef, Waiman Long, Tejun Heo,
    Zefan Li, Johannes Weiner, Hao Luo
Cc: Dietmar Eggemann, Steven Rostedt, linux-kernel@vger.kernel.org,
    luca.abeni@santannapisa.it, claudio@evidence.eu.com,
    tommaso.cucinotta@santannapisa.it, bristot@redhat.com,
    mathieu.poirier@linaro.org, cgroups@vger.kernel.org, Vincent Guittot,
    Wei Wang, Rick Yiu, Quentin Perret, Heiko Carstens, Vasily Gorbik,
    Alexander Gordeev, Sudeep Holla, Juri Lelli
Subject: [PATCH v2 3/6] sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
Date: Wed, 3 May 2023 09:22:25 +0200
Message-Id: <20230503072228.115707-4-juri.lelli@redhat.com>
In-Reply-To: <20230503072228.115707-1-juri.lelli@redhat.com>
References: <20230503072228.115707-1-juri.lelli@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Qais reported that iterating over all tasks when rebuilding root domains,
to find out which ones are DEADLINE and need their bandwidth correctly
restored on such root domains, can be a costly operation (10+ ms delays
on suspend-resume).

To fix the problem, keep track of the number of DEADLINE tasks belonging
to each cpuset and then use this information (in a followup patch) to
only perform the above iteration if DEADLINE tasks are actually present
in the cpuset for which a corresponding root domain is being rebuilt.

Reported-by: Qais Yousef
Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/
Signed-off-by: Juri Lelli
---
 include/linux/cpuset.h  |  4 ++++
 kernel/cgroup/cgroup.c  |  4 ++++
 kernel/cgroup/cpuset.c  | 25 +++++++++++++++++++++++++
 kernel/sched/deadline.c | 14 ++++++++++++++
 4 files changed, 47 insertions(+)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index f90e6325d707..d629094fac6e 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
 extern void cpuset_force_rebuild(void);
 extern void cpuset_update_active_cpus(void);
 extern void cpuset_wait_for_hotplug(void);
+extern void inc_dl_tasks_cs(struct task_struct *task);
+extern void dec_dl_tasks_cs(struct task_struct *task);
 extern void cpuset_lock(void);
 extern void cpuset_unlock(void);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
@@ -189,6 +191,8 @@ static inline void cpuset_update_active_cpus(void)
 
 static inline void cpuset_wait_for_hotplug(void) { }
 
+static inline void inc_dl_tasks_cs(struct task_struct *task) { }
+static inline void dec_dl_tasks_cs(struct task_struct *task) { }
 static inline void cpuset_lock(void) { }
 static inline void cpuset_unlock(void) { }
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 625d7483951c..9d809191a54f 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -57,6 +57,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -6683,6 +6684,9 @@ void cgroup_exit(struct task_struct *tsk)
 	list_add_tail(&tsk->cg_list, &cset->dying_tasks);
 	cset->nr_tasks--;
 
+	if (dl_task(tsk))
+		dec_dl_tasks_cs(tsk);
+
 	WARN_ON_ONCE(cgroup_task_frozen(tsk));
 	if (unlikely(!(tsk->flags & PF_KTHREAD) &&
 		     test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index ee66be215fb9..b9f4d5602517 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -193,6 +193,12 @@ struct cpuset {
 	int use_parent_ecpus;
 	int child_ecpus_count;
 
+	/*
+	 * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
+	 * know when to rebuild associated root domain bandwidth information.
+	 */
+	int nr_deadline_tasks;
+
 	/* Invalid partition error code, not lock protected */
 	enum prs_errcode prs_err;
 
@@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
 	return css_cs(cs->css.parent);
 }
 
+void inc_dl_tasks_cs(struct task_struct *p)
+{
+	struct cpuset *cs = task_cs(p);
+
+	cs->nr_deadline_tasks++;
+}
+
+void dec_dl_tasks_cs(struct task_struct *p)
+{
+	struct cpuset *cs = task_cs(p);
+
+	cs->nr_deadline_tasks--;
+}
+
 /* bits in struct cpuset flags field */
 typedef enum {
 	CS_ONLINE,
@@ -2497,6 +2517,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		ret = security_task_setscheduler(task);
 		if (ret)
 			goto out_unlock;
+
+		if (dl_task(task)) {
+			cs->nr_deadline_tasks++;
+			cpuset_attach_old_cs->nr_deadline_tasks--;
+		}
 	}
 
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 5a9a4b81c972..e11de074a6fd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -16,6 +16,8 @@
  *                    Fabio Checconi
  */
 
+#include
+
 /*
  * Default limits for DL period; on the top end we guard against small util
  * tasks still getting ridiculously long effective runtimes, on the bottom end we
@@ -2596,6 +2598,12 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
 	if (task_on_rq_queued(p) && p->dl.dl_runtime)
 		task_non_contending(p);
 
+	/*
+	 * In case a task is setscheduled out from SCHED_DEADLINE we need to
+	 * keep track of that on its cpuset (for correct bandwidth tracking).
+	 */
+	dec_dl_tasks_cs(p);
+
 	if (!task_on_rq_queued(p)) {
 		/*
 		 * Inactive timer is armed. However, p is leaving DEADLINE and
@@ -2636,6 +2644,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 	if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
 		put_task_struct(p);
 
+	/*
+	 * In case a task is setscheduled to SCHED_DEADLINE we need to keep
+	 * track of that on its cpuset (for correct bandwidth tracking).
+	 */
+	inc_dl_tasks_cs(p);
+
 	/* If p is not queued we will update its parameters at next wakeup. */
 	if (!task_on_rq_queued(p)) {
 		add_rq_bw(&p->dl, &rq->dl);
-- 
2.40.1
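
For readers who want to reason about the counter bookkeeping outside the kernel, here is a minimal user-space sketch of the idea. It is an illustration only, not kernel code: every name prefixed with model_ is hypothetical, locking is ignored entirely, and the early return in model_rebuild() is only an assumption about how the followup patch might use the counter. The four places that touch the counter loosely mirror the hunks above (switched_to_dl, switched_from_dl, cpuset_can_attach, cgroup_exit).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpuset; only the new counter is modelled. */
struct model_cpuset {
	int nr_deadline_tasks;
};

/* Hypothetical stand-in for struct task_struct. */
struct model_task {
	struct model_cpuset *cs;	/* cpuset the task currently belongs to */
	bool is_deadline;		/* true while the task is SCHED_DEADLINE */
};

/* Counterparts of inc_dl_tasks_cs() / dec_dl_tasks_cs() in the patch. */
static void model_inc_dl_tasks(struct model_task *t)
{
	t->cs->nr_deadline_tasks++;
}

static void model_dec_dl_tasks(struct model_task *t)
{
	t->cs->nr_deadline_tasks--;
}

/* Class change: plays the role of switched_to_dl() / switched_from_dl(). */
static void model_set_deadline(struct model_task *t, bool deadline)
{
	if (t->is_deadline == deadline)
		return;
	t->is_deadline = deadline;
	if (deadline)
		model_inc_dl_tasks(t);
	else
		model_dec_dl_tasks(t);
}

/* Migration between cpusets: plays the role of the cpuset_can_attach() hunk. */
static void model_attach(struct model_task *t, struct model_cpuset *dst)
{
	if (t->is_deadline) {
		dst->nr_deadline_tasks++;
		t->cs->nr_deadline_tasks--;
	}
	t->cs = dst;
}

/* Task exit: plays the role of the cgroup_exit() hunk. */
static void model_exit(struct model_task *t)
{
	if (t->is_deadline)
		model_dec_dl_tasks(t);
}

/*
 * Assumed shape of the followup optimisation: skip the task walk when the
 * cpuset holds no DEADLINE tasks. Returns how many tasks were visited.
 */
static size_t model_rebuild(const struct model_cpuset *cs,
			    const struct model_task *tasks, size_t n)
{
	size_t visited = 0;

	if (cs->nr_deadline_tasks == 0)
		return 0;	/* nothing to restore, no iteration needed */

	for (size_t i = 0; i < n; i++)
		if (tasks[i].cs == cs)
			visited++;

	return visited;
}
```

The property worth checking in such a model is that the counter returns to zero once every DEADLINE task has left the cpuset, by class change, migration, or exit. If that holds, the cheap check in model_rebuild() never skips a walk that was actually needed.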