From: Waiman Long
To: Alexander Viro, Jonathan Corbet, Luis Chamberlain, Kees Cook, Iurii Zaikin
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, Mauro Carvalho Chehab, Eric Biggers, Dave Chinner, Eric Sandeen, Waiman Long
Subject: [PATCH 00/11] fs/dcache: Limit # of negative dentries
Date: Wed, 26 Feb 2020 11:13:53 -0500
Message-Id: <20200226161404.14136-1-longman@redhat.com>

As there is no limit on the number of negative dentries, a sizeable
portion of system memory can be tied up in dentry cache slabs. Dentry
slabs are generally reclaimable if the dentries are in the LRUs. Still,
having too much memory used up by dentries can be problematic:

 1) When a filesystem with too many negative dentries is being
    unmounted, draining the dentries associated with the filesystem
    can take some time. To users, the system may seem to hang for a
    while. The long wait may also cause unexpected timeout errors or
    other warnings. This can happen when a long-running container
    with many negative dentries is being destroyed, for instance.

 2) Tying up too much memory in unused negative dentries means there
    is less memory available for other uses. Even though the kernel is
    able to reclaim unused dentries when running out of free memory,
    doing so still adds latency to the application, reducing its
    performance.

There are two different approaches to limiting negative dentries.

 1) Global reclaim
    Based on the total number of negative dentries as tracked by the
    nr_dentry_negative percpu count, a function can be activated to
    scan the various LRU lists to trim out excess negative dentries.
 2) Local reclaim
    By tracking the number of negative dentries under each directory,
    we can start the reclaim process if the number exceeds a certain
    limit.

The problem with global reclaim is that there are just too many LRU
lists that may need to be scanned for each filesystem. Especially
problematic is the fact that each memory cgroup can have its own LRU
lists. As memory cgroups can come and go at any time, scanning their
LRUs can be tricky.

Local reclaim does not have this problem, so it is used as the basis
for negative dentry reclaim in this patchset. Accurately tracking the
number of negative dentries in each directory can be costly in terms
of performance. As a result, this patchset estimates the number of
negative dentries present in a directory by looking at a newly added
children count and an opportunistically stored positive dentry count.

A new sysctl parameter "dentry-dir-max" is introduced which accepts a
value of 0 (default) for no limit or a positive integer of 256 or
more. Smaller values are forbidden to avoid excessive dentry count
checking, which can impact system performance.

The actual negative dentry reclaim is delegated to the system workqueue
to avoid adding excessive latency to normal filesystem operations.
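
To make the estimate and the limit check described above concrete, here
is a rough userspace model in C. It is only a sketch under stated
assumptions, not the code in the patches: all struct, field and function
names are hypothetical, the estimate is assumed to be "children minus
positives" clamped at zero (the positive count is only stored
opportunistically, so it may be stale), and the limit is assumed to be
compared against that estimate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every name below is hypothetical. */

/* Smallest nonzero dentry-dir-max value allowed by the cover letter. */
#define DIR_MAX_MIN 256UL

/* Per-directory counters as described in the text (field names assumed). */
struct dir_counts {
	unsigned long nr_children; /* all child dentries, positive or negative */
	unsigned long nr_positive; /* opportunistically stored, may be stale */
};

/* Accept 0 (no limit) or any value of 256 and up; reject 1..255. */
bool dir_max_valid(unsigned long val)
{
	return val == 0 || val >= DIR_MAX_MIN;
}

/*
 * Assumed estimate: children minus positives. Clamp at zero because a
 * stale positive count could exceed the current children count.
 */
unsigned long estimate_negative(const struct dir_counts *d)
{
	assert(d != NULL);
	if (d->nr_positive >= d->nr_children)
		return 0;
	return d->nr_children - d->nr_positive;
}

/*
 * Assumed trigger: reclaim is worth scheduling when a limit is set and
 * the estimate exceeds it. In the patchset the actual reclaim is then
 * handed off to a workqueue; that part is not modelled here.
 */
bool should_reclaim(const struct dir_counts *d, unsigned long dir_max)
{
	if (dir_max == 0)
		return false; /* 0 means no limit */
	return estimate_negative(d) > dir_max;
}
```

With 1000 children and a stored positive count of 200, this model
estimates 800 negative dentries, so a limit of 256 would trigger reclaim
while a limit of 0 (the default) never does.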

Waiman Long (11):
  fs/dcache: Fix incorrect accounting of negative dentries
  fs/dcache: Simplify __dentry_kill()
  fs/dcache: Add a counter to track number of children
  fs/dcache: Add sysctl parameter dentry-dir-max
  fs/dcache: Reclaim excessive negative dentries in directories
  fs/dcache: directory opportunistically stores # of positive dentries
  fs/dcache: Add static key negative_reclaim_enable
  fs/dcache: Limit dentry reclaim count in negative_reclaim_workfn()
  fs/dcache: Don't allow small values for dentry-dir-max
  fs/dcache: Kill off dentry as last resort
  fs/dcache: Track # of negative dentries reclaimed & killed

 Documentation/admin-guide/sysctl/fs.rst |  18 +
 fs/dcache.c                             | 428 +++++++++++++++++++++++-
 include/linux/dcache.h                  |  18 +-
 kernel/sysctl.c                         |  11 +
 4 files changed, 457 insertions(+), 18 deletions(-)

-- 
2.18.1