Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp1180779rwr; Wed, 3 May 2023 11:09:55 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7NoQwO4iqpIwqqhRBHhbtT8jING8zMG2gmY39UOfIDbynyhQtEGTN95UQKXOx6AM6sfMTp X-Received: by 2002:a17:90b:1989:b0:24d:fab5:5f57 with SMTP id mv9-20020a17090b198900b0024dfab55f57mr3480026pjb.17.1683137394802; Wed, 03 May 2023 11:09:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683137394; cv=none; d=google.com; s=arc-20160816; b=By9vSwwOF3nnRqvx/8Gbhgxum6vg4x9+6x9Xj6FbMLq2nuNpV27bNcMnUJ3gUzK7Ia WPu83leq9RwTvZPHsBP4sN5IesJMA1p19lZRoogQL169PRGu13LC7SqzsdpG6fupwn0n fs0ruQMDsdU7cyLnjlfRsirIOWTesfJWn3ONO7Eg+vxlneDoQRn5FnG7/dnHTKZoTY52 3kKilqT6BCC3U7xQJdjkNABmr3FOcOXRG9+HmZIuYozKoSW1xjbd/7tCPA/N4T+Isxdi RVNtMW1yh6W+mMDdYtTx1ViBdu/20wRcL14RdH3RxaEqh+M/vnVU3I9g2wZZqN6uq2Jy LkDQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=AQpEwS4JBjiBUmzDrp0hg9pwaawUbOZgQLeijVoHEYM=; b=uoE7BB1lTqfawdmw6jSrpCzU3htClx4hrAT7TZmWn77Km0sMl0yDcNf9RQt5tZcRzB A1sKcEBzz5brH+9NxqQ/grznahku2Vh6TvW/qPtoQG0Dbm6YSMYtYkgnLYTYZZQFiQKQ TkZ87cM1GwhCSQTZyOsDAqdlCLRFCC9WhFYgNg4fM27MQojlUrWfXChD/D2Hvok+FPma aKRRckAL8ZWm8dQP2blQMVfLjXeRw9ukwGfykifB6WMdQYD6hqeRb48Zzbe9z6qu8Mdp nyRbzTWrN5tK9mjMdLTMYIl99sKGDDqyR1gVNlP/R9HU5Bvk18D3zAXQ41YrYCXX3voU AuLA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KVAdYGKg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q4-20020a17090a7a8400b0024e2c5e3a4asi4291006pjf.132.2023.05.03.11.09.42; Wed, 03 May 2023 11:09:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KVAdYGKg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230054AbjECR5N (ORCPT + 99 others); Wed, 3 May 2023 13:57:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44936 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229455AbjECR5M (ORCPT ); Wed, 3 May 2023 13:57:12 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8E1B72BE for ; Wed, 3 May 2023 10:56:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1683136577; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AQpEwS4JBjiBUmzDrp0hg9pwaawUbOZgQLeijVoHEYM=; b=KVAdYGKgbyLiolXPFs47a8Lz2DM95pE1EYVNpR/KYfbMFJb64CXhbljf+dBpsS352ekNn8 N5QqHVd/8vNLK0LZaOIftUBrqA7Zre5nhnhWqyG6/y44/Jf52ZzbVTtG9veJpHVi/hA25O qx7tO2KCYNSVxbYVUNL7ZpFqtqZVXx8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-298-NHidxL8xOX2KWq5WVmBCBw-1; Wed, 03 May 2023 13:56:12 -0400 X-MC-Unique: NHidxL8xOX2KWq5WVmBCBw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7FA53185A79C; Wed, 3 May 2023 17:56:11 +0000 (UTC) Received: from [10.22.17.228] (unknown [10.22.17.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id DA7B34020961; Wed, 3 May 2023 17:56:09 +0000 (UTC) Message-ID: <09fc4561-5ea9-8b5f-a784-37f578f152af@redhat.com> Date: Wed, 3 May 2023 13:56:09 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 Subject: Re: [PATCH v2 3/6] sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets Content-Language: en-US To: Juri Lelli , Peter Zijlstra , Ingo Molnar , Qais Yousef , Tejun Heo , Zefan Li , Johannes Weiner , Hao Luo Cc: Dietmar Eggemann , Steven Rostedt , linux-kernel@vger.kernel.org, luca.abeni@santannapisa.it, claudio@evidence.eu.com, tommaso.cucinotta@santannapisa.it, bristot@redhat.com, mathieu.poirier@linaro.org, cgroups@vger.kernel.org, Vincent Guittot , Wei Wang , Rick Yiu , Quentin Perret , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Sudeep Holla References: <20230503072228.115707-1-juri.lelli@redhat.com> <20230503072228.115707-4-juri.lelli@redhat.com> From: Waiman Long In-Reply-To: <20230503072228.115707-4-juri.lelli@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-6.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/3/23 03:22, Juri Lelli wrote: > Qais reported that iterating over all tasks when rebuilding root domains > for finding out which ones are DEADLINE and need their bandwidth > correctly restored on such root domains can be a costly operation (10+ > ms delays on suspend-resume). > > To fix the problem keep track of the number of DEADLINE tasks belonging > to each cpuset and then use this information (followup patch) to only > perform the above iteration if DEADLINE tasks are actually present in > the cpuset for which a corresponding root domain is being rebuilt. > > Reported-by: Qais Yousef > Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/ > Signed-off-by: Juri Lelli > --- > include/linux/cpuset.h | 4 ++++ > kernel/cgroup/cgroup.c | 4 ++++ > kernel/cgroup/cpuset.c | 25 +++++++++++++++++++++++++ > kernel/sched/deadline.c | 14 ++++++++++++++ > 4 files changed, 47 insertions(+) > > diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h > index f90e6325d707..d629094fac6e 100644 > --- a/include/linux/cpuset.h > +++ b/include/linux/cpuset.h > @@ -71,6 +71,8 @@ extern void cpuset_init_smp(void); > extern void cpuset_force_rebuild(void); > extern void cpuset_update_active_cpus(void); > extern void cpuset_wait_for_hotplug(void); > +extern void inc_dl_tasks_cs(struct task_struct *task); > +extern void dec_dl_tasks_cs(struct task_struct *task); > extern void cpuset_lock(void); > extern void cpuset_unlock(void); > extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask); > @@ -189,6 +191,8 @@ static inline void cpuset_update_active_cpus(void) > > static inline void cpuset_wait_for_hotplug(void) { } > > +static inline void inc_dl_tasks_cs(struct task_struct *task) { } > +static inline void dec_dl_tasks_cs(struct task_struct *task) { } > static inline void cpuset_lock(void) { } > static inline void cpuset_unlock(void) { } > > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c > index 625d7483951c..9d809191a54f 100644 > --- a/kernel/cgroup/cgroup.c > +++ b/kernel/cgroup/cgroup.c > @@ -57,6 +57,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -6683,6 +6684,9 @@ void cgroup_exit(struct task_struct *tsk) > list_add_tail(&tsk->cg_list, &cset->dying_tasks); > cset->nr_tasks--; > > + if (dl_task(tsk)) > + dec_dl_tasks_cs(tsk); > + > WARN_ON_ONCE(cgroup_task_frozen(tsk)); > if (unlikely(!(tsk->flags & PF_KTHREAD) && > test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags))) > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c > index ee66be215fb9..b9f4d5602517 100644 > --- a/kernel/cgroup/cpuset.c > +++ b/kernel/cgroup/cpuset.c > @@ -193,6 +193,12 @@ struct cpuset { > int use_parent_ecpus; > int child_ecpus_count; > > + /* > + * number of SCHED_DEADLINE tasks attached to this cpuset, so that we > + * know when to rebuild associated root domain bandwidth information. > + */ > + int nr_deadline_tasks; > + > /* Invalid partition error code, not lock protected */ > enum prs_errcode prs_err; > > @@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs) > return css_cs(cs->css.parent); > } > > +void inc_dl_tasks_cs(struct task_struct *p) > +{ > + struct cpuset *cs = task_cs(p); > + > + cs->nr_deadline_tasks++; > +} > + > +void dec_dl_tasks_cs(struct task_struct *p) > +{ > + struct cpuset *cs = task_cs(p); > + > + cs->nr_deadline_tasks--; > +} > + > /* bits in struct cpuset flags field */ > typedef enum { > CS_ONLINE, > @@ -2497,6 +2517,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) > ret = security_task_setscheduler(task); > if (ret) > goto out_unlock; > + > + if (dl_task(task)) { > + cs->nr_deadline_tasks++; > + cpuset_attach_old_cs->nr_deadline_tasks--; > + } > } > > /* > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 5a9a4b81c972..e11de074a6fd 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -16,6 +16,8 @@ > * Fabio Checconi > */ > > +#include > + > /* > * Default limits for DL period; on the top end we guard against small util > * tasks still getting ridiculously long effective runtimes, on the bottom end we > @@ -2596,6 +2598,12 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p) > if (task_on_rq_queued(p) && p->dl.dl_runtime) > task_non_contending(p); > > + /* > + * In case a task is setscheduled out from SCHED_DEADLINE we need to > + * keep track of that on its cpuset (for correct bandwidth tracking). > + */ > + dec_dl_tasks_cs(p); > + > if (!task_on_rq_queued(p)) { > /* > * Inactive timer is armed. However, p is leaving DEADLINE and > @@ -2636,6 +2644,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p) > if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1) > put_task_struct(p); > > + /* > + * In case a task is setscheduled to SCHED_DEADLINE we need to keep > + * track of that on its cpuset (for correct bandwidth tracking). > + */ > + inc_dl_tasks_cs(p); > + > /* If p is not queued we will update its parameters at next wakeup. */ > if (!task_on_rq_queued(p)) { > add_rq_bw(&p->dl, &rq->dl); Reviewed-by: Waiman Long