From: Vincent Guittot
Date: Fri, 28 Dec 2018 10:30:07 +0100
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages
To: Tejun Heo
Cc: Linus Torvalds, Sargun Dhillon, Xie XiuQi, Ingo Molnar, Peter Zijlstra, xiezhipeng1@huawei.com, huawei.libin@huawei.com, linux-kernel, Dmitry Adamushko, Rik van Riel
References: <1545879866-27809-1-git-send-email-xiexiuqi@huawei.com>
 <20181227102107.GA21156@linaro.org>
 <20181228011524.GF2509588@devbig004.ftw2.facebook.com>
 <20181228015352.GG2509588@devbig004.ftw2.facebook.com>
 <20181228020243.GH2509588@devbig004.ftw2.facebook.com>
In-Reply-To: <20181228020243.GH2509588@devbig004.ftw2.facebook.com>

On Fri, 28 Dec 2018 at 03:02, Tejun Heo wrote:
>
> On Thu, Dec 27, 2018 at 05:53:52PM -0800, Tejun Heo wrote:
> > Vincent knows that part way better than me but I think the safest way
> > would be doing the optimization removal iff tmp_alone_branch is
> > already pointing to leaf_cfs_rq_list.  IIUC, it's pointing to
> > something else only while a branch is being built and deferring
> > optimization removal by an avg update cycle isn't gonna make any
> > difference anyway.

But the lock should not be released while a branch is being built, and
tmp_alone_branch must always point to rq->leaf_cfs_rq_list at the end,
before the lock is released.

I think that there is a bigger problem with commit a9e7f6544b9c and
cfs_rq throttling. Let's take the example of the following topology:

TG2 --> TG1 --> root

1- The first time a task is enqueued, we add TG2 cfs_rq then TG1
cfs_rq to leaf_cfs_rq_list, and we are sure to do the whole branch in
one pass because it has never been used and can't be throttled, so
tmp_alone_branch will point to leaf_cfs_rq_list at the end.
2- Then TG1 is throttled.
3- We add TG3 as a new child of TG1.
4- The first enqueue of a task on TG3 adds TG3 cfs_rq just before TG1
cfs_rq, and tmp_alone_branch stays on rq->leaf_cfs_rq_list.

With commit a9e7f6544b9c, we can delete a cfs_rq from
rq->leaf_cfs_rq_list. So if the load of TG1 cfs_rq becomes null before
step 2 above, TG1 cfs_rq is removed from the list.
Then at step 4, TG3 cfs_rq is added at the beginning of
rq->leaf_cfs_rq_list, but tmp_alone_branch still points to TG3 cfs_rq
because its throttled parent can't be enqueued. When the lock is
released, tmp_alone_branch doesn't point to rq->leaf_cfs_rq_list
whereas it should.

So if TG3 cfs_rq is removed or destroyed before tmp_alone_branch
points to another TG cfs_rq, the next TG cfs_rq to be added will be
linked outside rq->leaf_cfs_rq_list.

In addition, we can break the ordering of the cfs_rqs in
rq->leaf_cfs_rq_list, but this ordering is used to update and
propagate the update from leaf to root.

>
> So, something like the following.  Xie, can you see whether the
> following patch resolves the problem?
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d1907506318a..88b9118b5191 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7698,7 +7698,8 @@ static void update_blocked_averages(int cpu)
>  		 * There can be a lot of idle CPU cgroups.  Don't let fully
>  		 * decayed cfs_rqs linger on the list.
>  		 */
> -		if (cfs_rq_is_decayed(cfs_rq))
> +		if (cfs_rq_is_decayed(cfs_rq) &&
> +		    rq->tmp_alone_branch == &rq->leaf_cfs_rq_list)
>  			list_del_leaf_cfs_rq(cfs_rq);

This patch reduces the number of cases, but I don't think it's enough
because it doesn't cover the case of unregister_fair_sched_group().
And we can still break the ordering of the cfs_rqs.

>
>  		/* Don't need periodic decay once load/util_avg are null */
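
The TG1/TG3 scenario described above can be sketched as a small
user-space model. This is only an illustration, not the code in
kernel/sched/fair.c: the struct and function names (rq_model,
cfs_rq_model, add_leaf, enqueue_model, run_scenario) are hypothetical,
the insertion rule is a simplified reading of the behaviour described
in this message, and the model assumes that the enqueue walk stops
below a throttled parent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, standing in for the kernel's list_head. */
struct node { struct node *prev, *next; };

struct cfs_rq_model {
	struct node link;
	struct cfs_rq_model *parent;	/* NULL for the root */
	bool on_list;
	bool throttled;
};

struct rq_model {
	struct node leaf_list;		/* models rq->leaf_cfs_rq_list */
	struct node *tmp_alone_branch;	/* models rq->tmp_alone_branch */
};

static void insert_between(struct node *n, struct node *prev, struct node *next)
{
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

static void remove_node(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Assumed insertion rule: a child is always placed before its parent. */
static void add_leaf(struct rq_model *rq, struct cfs_rq_model *c)
{
	if (c->on_list)
		return;
	if (c->parent && c->parent->on_list) {
		/* Parent already listed: insert just before it, branch is connected. */
		insert_between(&c->link, c->parent->link.prev, &c->parent->link);
		rq->tmp_alone_branch = &rq->leaf_list;
	} else if (!c->parent) {
		/* Root goes to the tail, branch is complete. */
		insert_between(&c->link, rq->leaf_list.prev, &rq->leaf_list);
		rq->tmp_alone_branch = &rq->leaf_list;
	} else {
		/* Parent not listed yet: start or extend an unconnected branch. */
		insert_between(&c->link, rq->tmp_alone_branch,
			       rq->tmp_alone_branch->next);
		rq->tmp_alone_branch = &c->link;
	}
	c->on_list = true;
}

/* Bottom-up walk; assumed to stop when the parent is throttled. */
static void enqueue_model(struct rq_model *rq, struct cfs_rq_model *c)
{
	while (c) {
		add_leaf(rq, c);
		if (c->parent && c->parent->throttled)
			break;
		c = c->parent;
	}
}

/*
 * Replays steps 1-4. Returns true if tmp_alone_branch points back to the
 * list head at the end, i.e. the invariant the message asks for holds.
 */
static bool run_scenario(bool tg1_removed_when_decayed)
{
	struct rq_model rq;
	struct cfs_rq_model root = { .parent = NULL };
	struct cfs_rq_model tg1 = { .parent = &root };
	struct cfs_rq_model tg2 = { .parent = &tg1 };
	struct cfs_rq_model tg3 = { .parent = &tg1 };

	rq.leaf_list.prev = rq.leaf_list.next = &rq.leaf_list;
	rq.tmp_alone_branch = &rq.leaf_list;

	enqueue_model(&rq, &tg2);		/* step 1: TG2, TG1, root */
	if (tg1_removed_when_decayed) {		/* removal of a decayed cfs_rq */
		remove_node(&tg1.link);
		tg1.on_list = false;
	}
	tg1.throttled = true;			/* step 2 */
	enqueue_model(&rq, &tg3);		/* steps 3-4 */

	return rq.tmp_alone_branch == &rq.leaf_list;
}
```

In this model, run_scenario(false) returns true: TG3 is linked just
before TG1 and tmp_alone_branch stays on the list head. With
run_scenario(true) it returns false: TG3 is added at the beginning of
the list and tmp_alone_branch is left pointing at TG3, which is the
state described above.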