From: bsegall@google.com
To: Phil Auld
Cc: mingo@redhat.com, peterz@infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer
References: <20190301145209.GA9304@pauld.bos.csb>
 <20190304190510.GB5366@lorien.usersys.redhat.com>
 <20190305200554.GA8786@pauld.bos.csb>
 <20190306162313.GB8786@pauld.bos.csb>
Date: Wed, 06 Mar 2019 11:25:02 -0800
In-Reply-To: <20190306162313.GB8786@pauld.bos.csb> (Phil Auld's message of "Wed, 6 Mar 2019 11:23:13 -0500")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Phil Auld writes:

> On Tue, Mar 05, 2019 at 12:45:34PM -0800 bsegall@google.com wrote:
>> Phil Auld writes:
>>
>> > Interestingly, if I limit the number of child cgroups to the number of
>> > them I'm actually putting processes into (16 down from 2500) the problem
>> > does not reproduce.
>>
>> That is indeed interesting, and definitely not something we'd want to
>> matter. (Particularly if it's not root->a->b->c...->throttled_cgroup or
>> root->throttled->a->...->thread vs root->throttled_cgroup, which is what
>> I was originally thinking of)
>>
>
> The locking may be a red herring.
>
> The setup is root->throttled->a where a is 1-2500. There are 4 threads in
> each of the first 16 a groups. The parent, throttled, is where the
> cfs_period/quota_us are set.
>
> I wonder if the problem is the walk_tg_tree_from() call in unthrottle_cfs_rq().
>
> The distribute_cfs_runtime looks to be O(n * m) where n is number of
> throttled cfs_rqs and m is the number of child cgroups. But I'm not
> completely clear on how the hierarchical cgroups play together here.
>
> I'll pull on this thread some.
>
> Thanks for your input.
>
>
> Cheers,
> Phil

Yeah, that isn't under the cfs_b lock, but is still part of distribute
(and under rq lock, which might also matter). I was thinking too much
about just the cfs_b regions.

I'm not sure there's any good general optimization there.

I suppose cfs_rqs (tgs/cfs_bs?) could have a "nearest ancestor with a
quota" pointer and ones with quota could have a "descendants with quota"
list, parallel to the children/parent lists of tgs. Then
throttle/unthrottle would only have to visit these lists, and child
cgroups/cfs_rqs without their own quotas would just check
cfs_rq->nearest_quota_cfs_rq->throttle_count. throttled_clock_task_time
can also probably be tracked there.
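
To make the "nearest ancestor with a quota" idea a bit more concrete, here is
a rough userspace model in C. It is only a sketch under assumptions, not
kernel code: struct group, group_init(), adjust_throttle() and
group_throttled() are hypothetical names, and only throttle_count and the
nearest-quota pointer come from the suggestion above. Per-CPU cfs_rqs, locking
and throttled_clock_task_time are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task group; NOT the kernel's task_group/cfs_rq. */
struct group {
	struct group *parent;
	bool has_quota;
	struct group *nearest_quota;   /* self if has_quota, else nearest ancestor with a quota */
	struct group *quota_children;  /* nearest descendants that have their own quota */
	struct group *quota_sibling;   /* next entry in the quota ancestor's list */
	int throttle_count;            /* only meaningful when has_quota is set */
};

/* Link a new group under parent and maintain the quota-only lists. */
static void group_init(struct group *g, struct group *parent, bool has_quota)
{
	struct group *up = parent ? parent->nearest_quota : NULL;

	g->parent = parent;
	g->has_quota = has_quota;
	g->quota_children = NULL;
	g->quota_sibling = NULL;
	g->throttle_count = 0;

	if (!has_quota) {
		g->nearest_quota = up;
		return;
	}
	g->nearest_quota = g;
	if (up) {
		g->quota_sibling = up->quota_children;
		up->quota_children = g;
		/* Inherit the throttled state of the quota ancestors. */
		g->throttle_count = up->throttle_count;
	}
}

/*
 * Throttle (delta = +1) or unthrottle (delta = -1) a quota group.
 * Only groups with their own quota are visited; returns how many.
 */
static int adjust_throttle(struct group *q, int delta)
{
	int visited = 1;

	assert(q->has_quota);
	q->throttle_count += delta;
	for (struct group *c = q->quota_children; c; c = c->quota_sibling)
		visited += adjust_throttle(c, delta);
	return visited;
}

/* Groups without a quota just consult their nearest quota ancestor. */
static bool group_throttled(const struct group *g)
{
	return g->nearest_quota && g->nearest_quota->throttle_count > 0;
}
```

With the root->throttled->a layout from the report (2500 children, none with
its own quota), adjust_throttle() on the throttled group touches one node in
this model rather than walking every child, and each child answers
group_throttled() with a single pointer dereference. The cost moves to group
creation and to quota changes, which would have to re-link the lists.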