Date: Mon, 4 Oct 2021 19:20:14 +0100
From: Matthew Wilcox
To: Stephen Brennan
Cc: Andrew Morton, Konrad Wilk, Alexander Viro,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v6 0/1] proc: Allow pid_revalidate() during LOOKUP_RCU
In-Reply-To: <20211004175629.292270-1-stephen.s.brennan@oracle.com>

On Mon, Oct 04, 2021 at 10:56:28AM -0700, Stephen Brennan wrote:
> Problem Description:
>
> When running ~128 parallel instances of "TZ=/etc/localtime ps
> -fe >/dev/null" on a 128CPU machine, the %sys utilization reaches 97%,
> and perf shows the following code path as being responsible for heavy
> contention on the d_lockref spinlock:
>
>       walk_component()
>         lookup_fast()
>           d_revalidate()
>             pid_revalidate() // returns -ECHILD
>           unlazy_child()
>             lockref_get_not_dead(&nd->path.dentry->d_lockref) <-- contention
>
> The reason is that pid_revalidate() is triggering a drop from RCU to ref
> path walk mode. All concurrent path lookups thus try to grab a reference
> to the dentry for /proc/, before re-executing pid_revalidate() and then
> stepping into the /proc/$pid directory. Thus there is huge spinlock
> contention. This patch allows pid_revalidate() to execute in RCU mode,
> meaning that the path lookup can successfully enter the /proc/$pid
> directory while still in RCU mode. Later on, the path lookup may still
> drop into ref mode, but the contention will be much reduced at this
> point.
>
> By applying this patch, %sys utilization falls to around 85% under the
> same workload, and the number of ps processes executed per unit time
> increases by 3x-4x. Although this particular workload is a bit
> contrived, we have seen some large collections of eager monitoring
> scripts which produced similarly high %sys time due to contention in the
> /proc directory.

I think it's perhaps also worth noting that this is a performance
regression relative to ... v5.4? v4.14? I forget the details; do you
have those to hand, Stephen?

(Yes, this is a stupid workload. Yes, a customer really does have
this workload.)
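
For readers following along, the fallback described in the quoted cover
letter can be sketched as a small userspace toy. This is not kernel code
and not the patch itself: every toy_* name and TOY_LOOKUP_RCU are
hypothetical stand-ins, the model is single-threaded, and it only counts
how often a lookup would have to take a reference on the shared parent
dentry. It does not measure real spinlock contention.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's LOOKUP_RCU flag; value is arbitrary. */
#define TOY_LOOKUP_RCU 0x1u

/* Toy dentry: only counts how often a walker had to take a reference. */
struct toy_dentry {
	unsigned long ref_grabs;
};

typedef int (*toy_revalidate_fn)(unsigned int flags);

/* Assumed pre-patch behaviour: revalidation refuses to run in RCU mode. */
static int toy_revalidate_old(unsigned int flags)
{
	return (flags & TOY_LOOKUP_RCU) ? -ECHILD : 1;
}

/* Assumed post-patch behaviour: revalidation can answer in RCU mode too. */
static int toy_revalidate_new(unsigned int flags)
{
	(void)flags;
	return 1;
}

/*
 * One lookup step for a /proc/$pid component: try RCU mode first and,
 * on -ECHILD, drop to ref mode by taking a reference on the shared
 * parent, then revalidate again.
 */
static int toy_lookup_step(struct toy_dentry *parent, toy_revalidate_fn reval)
{
	int status = reval(TOY_LOOKUP_RCU);

	if (status == -ECHILD) {
		parent->ref_grabs++;	/* stands in for lockref_get_not_dead() */
		status = reval(0);
	}
	return status;
}

/* Run n lookups against one shared parent and report the reference grabs. */
static unsigned long toy_count_ref_grabs(unsigned int n, toy_revalidate_fn reval)
{
	struct toy_dentry proc_root = { 0 };

	for (unsigned int i = 0; i < n; i++) {
		int status = toy_lookup_step(&proc_root, reval);

		assert(status == 1);	/* every toy lookup ends up valid */
		(void)status;		/* keep NDEBUG builds warning-free */
	}
	return proc_root.ref_grabs;
}
```

In this toy, toy_count_ref_grabs(128, toy_revalidate_old) returns 128,
because every lookup touches the same parent counter, while
toy_count_ref_grabs(128, toy_revalidate_new) returns 0. In the real
kernel those 128 reference grabs would come from different CPUs hitting
the same d_lockref, which is the contention shown in the perf trace.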