From: Sargun Dhillon
Date: Thu, 27 Dec 2018 16:08:37 -0500
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages
To: Linus Torvalds
Cc: Vincent Guittot, Xie XiuQi, Ingo Molnar, Peter Zijlstra, xiezhipeng1@huawei.com, huawei.libin@huawei.com, linux-kernel, Dmitry Adamushko, Tejun Heo
References: <1545879866-27809-1-git-send-email-xiexiuqi@huawei.com> <20181227102107.GA21156@linaro.org>

On Thu, Dec 27, 2018 at 1:15 PM Linus Torvalds wrote:
>
> On Thu, Dec 27, 2018 at 9:02 AM Vincent Guittot
> wrote:
> >
> > In the original behavior, the cs_rq was removed from the list only
> > when the cgroup was removed.
> > patch a9e7f6544b9c (sched/fair: Fix O(nr_cgroups) in load balance
> > path) has added an optimization which remove the cfs_rq when there
> > were no blocked load to update in order to optimize the loop but it
> > has introduced a race condition that create this infinite loop. The
> > patch fixes the problem by removing the optimization.
> > I will look at re-adding the optimization once i will have afix for
> > the race condition
>
> Hmm. What's the race? We seem to take the rq lock for all the cases,
> but maybe I'm missing something?
>
> That commit a9e7f6544b9c is a year and a half old, why did this start
> being reported now?
>

This appears to have been broken since October on 4.18.5. We only
noticed it recently, with a workload that runs ridiculously parallel
compiles in cgroups that are rapidly churned.
It's also an awkward bug to catch, because none of the lockup detectors
caught it in our environment. The only reason we caught it was that it
was blocking other cores, and those other cores were missing IPIs,
resulting in catastrophic failure.

> [ goes off and looks ]
>
> Oh. unthrottle_cfs_rq -> enqueue_entity -> list_add_leaf_cfs_rq()
> doesn't actually seem to hold the rq lock at all. It's just called
> under a rcu read lock.
>
> So it all seems to depend on that "on_list" flag for exclusion. Which
> seems fundamentally racy, since it's not protected by a lock.
>
> So yeah, the whole logic seems to depend on "on_list is sticky and
> stays set until the whole task group is destroyed".
>
> So commit a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance
> path") would appear to be entirely wrong, because on_list isn't
> actually protected by a lock, and that can confuse things.
>
> But that still makes me go "how come is this only noticed 18 months
> after the fact"?
>
> So I'm probably still missing something.
>
> Tejun? PeterZ? Tell my why I'm being dense.
>
> Linus
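
To make the on_list concern quoted above concrete, here is a minimal
userspace sketch in C. It is not kernel code, and it does not claim to
reproduce the actual failure sequence in update_blocked_averages. The
names struct leaf, leaf_add(), bounded_walk() and
stale_flag_breaks_walk() are hypothetical and exist only for this
illustration; the list is merely modeled on the kernel's list_head. The
point it shows is narrow: when a flag is the only guard against double
insertion and the flag ever disagrees with the list, re-inserting an
already linked node leaves that node pointing at itself, so a walk from
the head never gets back to the head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical stand-in for a cfs_rq: a list link plus an on_list flag. */
struct leaf {
    struct list_node link;
    bool on_list;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert node right after head, in the style of the kernel's list_add(). */
static void list_add_head(struct list_node *node, struct list_node *head)
{
    struct list_node *next = head->next;

    next->prev = node;
    node->next = next;
    node->prev = head;
    head->next = node;
}

/* Add only if the flag says the node is not linked; the flag is the sole guard. */
static void leaf_add(struct leaf *l, struct list_node *head)
{
    if (l->on_list)
        return;
    list_add_head(&l->link, head);
    l->on_list = true;
}

/*
 * Walk the list, giving up after `limit` steps.
 * Returns the node count, or -1 if the walk did not get back to head.
 */
static int bounded_walk(const struct list_node *head, int limit)
{
    const struct list_node *pos;
    int n = 0;

    assert(limit > 0);
    for (pos = head->next; pos != head; pos = pos->next)
        if (++n > limit)
            return -1;
    return n;
}

/* Returns true if a stale on_list flag leaves the list unwalkable. */
static bool stale_flag_breaks_walk(void)
{
    struct list_node head;
    struct leaf a = { .on_list = false };

    list_init(&head);
    leaf_add(&a, &head);
    if (bounded_walk(&head, 16) != 1)
        return false;

    /* Assumption: some other path clears the flag without unlinking the node. */
    a.on_list = false;
    leaf_add(&a, &head);    /* second insert of an already linked node */

    /* The node's next pointer now refers to itself, so the walk cannot end. */
    return bounded_walk(&head, 16) == -1;
}
```

Calling stale_flag_breaks_walk() exercises both halves: a single guarded
insert walks cleanly, and the second insert after the flag goes stale
turns the walk into a self-loop. The step limit exists only so the
sketch can report the problem instead of spinning; an unbounded list
walk would simply never return.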