Date: Mon, 21 Jun 2021 18:22:43 +0200
From: Vincent Guittot
To: Odin Ugedal
Cc: Sachin Sant, open list, linuxppc-dev@lists.ozlabs.org, Peter Zijlstra
Subject: Re: [powerpc][5.13.0-rc7] Kernel warning (kernel/sched/fair.c:401) while running LTP tests
Message-ID: <20210621162243.GA29874@vingu-book>
References: <9D4A658A-5F77-4C33-904A-126E6052B205@linux.vnet.ibm.com> <6D1F875D-58E9-4A55-B0C3-21D5F31EDB76@linux.vnet.ibm.com>
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Mailing-List: linux-kernel@vger.kernel.org

On Monday 21 Jun 2021 at 14:42:23 (+0200), Odin Ugedal wrote:
> Hi,
>
> Did some more research, and it looks like this is what happens:
>
> $ tree /sys/fs/cgroup/ltp/ -d --charset=ascii
> /sys/fs/cgroup/ltp/
> |-- drain
> `-- test-6851
>     `-- level2
>         |-- level3a
>         |   |-- worker1
>         |   `-- worker2
>         `-- level3b
>             `-- worker3
>
> Timeline (ish):
> - worker3 gets throttled
> - level3b is decayed, since it has no more load
> - level2 gets throttled
> - worker3 gets unthrottled
> - level2 gets unthrottled
> - worker3 is added to the list
> - level3b is not added to the list, since nr_running==0 and it is decayed
>
> The attached diff (based on
> https://lore.kernel.org/lkml/20210518125202.78658-3-odin@uged.al/)
> fixes the issue for me. Not the most elegant solution, but the
> simplest one as of now, and it shows what is wrong.
>
> Any thoughts, Vincent?

I would prefer that we use the reason for adding the cfs_rq to the list instead. Something like the patch below should also fix the problem. It is based on a proposal I made to Rik some time ago when he tried to flatten the rq hierarchy:
https://lore.kernel.org/lkml/20190906191237.27006-6-riel@surriel.com/

This ensures that a cfs_rq is kept on the list whenever one of its children is still on the list.

---
 kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ea7de54cb022..e751061a9449 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3272,6 +3272,31 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
 
 #ifdef CONFIG_SMP
 #ifdef CONFIG_FAIR_GROUP_SCHED
+/*
+ * Because list_add_leaf_cfs_rq always places a child cfs_rq on the list
+ * immediately before a parent cfs_rq, and cfs_rqs are removed from the list
+ * bottom-up, we only have to test whether the cfs_rq before us on the list
+ * is our child.
+ * If cfs_rq is not on the list, test whether a child needs it to be added
+ * to connect a branch to the tree (see list_add_leaf_cfs_rq() for details).
+ */
+static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
+{
+	struct cfs_rq *prev_cfs_rq;
+	struct list_head *prev;
+
+	if (cfs_rq->on_list) {
+		prev = cfs_rq->leaf_cfs_rq_list.prev;
+	} else {
+		struct rq *rq = rq_of(cfs_rq);
+
+		prev = rq->tmp_alone_branch;
+	}
+
+	prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
+
+	return (prev_cfs_rq->tg->parent == cfs_rq->tg);
+}
 
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
@@ -3287,6 +3312,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	if (cfs_rq->avg.runnable_sum)
 		return false;
 
+	if (child_cfs_rq_on_list(cfs_rq))
+		return false;
+
 	return true;
 }
-- 
2.17.1

> Thanks
> Odin
>
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bfaa6e1f6067..aa32e9c29efd 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -376,7 +376,8 @@ static inline bool list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
>  	return false;
>  }
>
> -static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
> +/* Returns 1 if cfs_rq was present in the list and removed */
> +static inline bool list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
>  {
>  	if (cfs_rq->on_list) {
>  		struct rq *rq = rq_of(cfs_rq);
> @@ -393,7 +394,9 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
>
>  		list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
>  		cfs_rq->on_list = 0;
> +		return 1;
>  	}
> +	return 0;
>  }
>
>  static inline void assert_list_leaf_cfs_rq(struct rq *rq)
> @@ -3298,24 +3301,6 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
>
>  #ifdef CONFIG_SMP
>  #ifdef CONFIG_FAIR_GROUP_SCHED
> -
> -static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
> -{
> -	if (cfs_rq->load.weight)
> -		return false;
> -
> -	if (cfs_rq->avg.load_sum)
> -		return false;
> -
> -	if (cfs_rq->avg.util_sum)
> -		return false;
> -
> -	if (cfs_rq->avg.runnable_sum)
> -		return false;
> -
> -	return true;
> -}
> -
>  /**
>   * update_tg_load_avg - update the tg's load avg
>   * @cfs_rq: the cfs_rq whose avg changed
> @@ -4109,11 +4094,6 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>
>  #else /* CONFIG_SMP */
>
> -static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
> -{
> -	return true;
> -}
> -
>  #define UPDATE_TG	0x0
>  #define SKIP_AGE_LOAD	0x0
>  #define DO_ATTACH	0x0
> @@ -4771,10 +4751,11 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
>  	if (!cfs_rq->throttle_count) {
>  		cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
>  					     cfs_rq->throttled_clock_task;
> -
> -		/* Add cfs_rq with load or one or more already running entities to the list */
> -		if (!cfs_rq_is_decayed(cfs_rq) || cfs_rq->nr_running)
> +		if (cfs_rq->insert_on_unthrottle) {
>  			list_add_leaf_cfs_rq(cfs_rq);
> +			if (tg->parent)
> +				tg->parent->cfs_rq[cpu_of(rq)]->insert_on_unthrottle = true;
> +		}
>  	}
>
>  	return 0;
> @@ -4788,7 +4769,7 @@ static int tg_throttle_down(struct task_group *tg, void *data)
>  	/* group is entering throttled state, stop time */
>  	if (!cfs_rq->throttle_count) {
>  		cfs_rq->throttled_clock_task = rq_clock_task(rq);
> -		list_del_leaf_cfs_rq(cfs_rq);
> +		cfs_rq->insert_on_unthrottle = list_del_leaf_cfs_rq(cfs_rq);
>  	}
>  	cfs_rq->throttle_count++;
>
> @@ -8019,6 +8000,23 @@ static bool __update_blocked_others(struct rq *rq, bool *done)
>
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>
> +static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
> +{
> +	if (cfs_rq->load.weight)
> +		return false;
> +
> +	if (cfs_rq->avg.load_sum)
> +		return false;
> +
> +	if (cfs_rq->avg.util_sum)
> +		return false;
> +
> +	if (cfs_rq->avg.runnable_sum)
> +		return false;
> +
> +	return true;
> +}
> +
>  static bool __update_blocked_fair(struct rq *rq, bool *done)
>  {
>  	struct cfs_rq *cfs_rq, *pos;
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index a189bec13729..12a707d99ee6 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -602,6 +602,7 @@ struct cfs_rq {
>  	u64			throttled_clock_task_time;
>  	int			throttled;
>  	int			throttle_count;
> +	int			insert_on_unthrottle;
>  	struct list_head	throttled_list;
>  #endif /* CONFIG_CFS_BANDWIDTH */
>  #endif /* CONFIG_FAIR_GROUP_SCHED */