From: Miklos Szeredi
To: Al Viro
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, miklos@szeredi.hu
Subject: [PATCH] dcache: fix quadratic behavior with parallel shrinkers
Date: Thu, 3 May 2018 00:26:35 +0200
Message-Id: <20180502222635.1862-1-mszeredi@redhat.com>

When multiple shrinkers are operating on a directory containing many
dentries, it takes much longer than if only one shrinker is operating on
the directory.

Call the shrinker instances A and B, which shrink DIR containing NUM
dentries.

Assume A wins the race for DIR's d_lock; it then goes on to move all
unlinked dentries to its dispose list. When it's done, B will scan the
directory once again, but will find that all dentries are already being
shrunk, so it will have an empty dispose list. Both A and B will have
found NUM dentries (data.found == NUM).

Now comes the interesting part: A will proceed to shrink the dispose list
by killing individual dentries and decrementing the refcount of the parent
(which is DIR). NB: decrementing DIR's refcount will block if DIR's d_lock
is held. B will shrink a zero-size list and then immediately restart
scanning the directory, where it will lock DIR's d_lock, scan the
remaining dentries and find no dentry to dispose.

The result is that B does the directory scan over and over again, holding
DIR's d_lock, while A waits for a chance to decrement DIR's refcount and
makes very slow progress because of this. B is wasting time and holding
up A's progress at the same time.
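To make the "quadratic" part concrete, here is a toy user-space cost model
of the loop described above. It is only a sketch and not kernel code: the
function name rescan_work() and the "batch" parameter (how many dentries A
manages to kill between two consecutive scans by B) are assumptions made
for illustration. It counts how many dentries B visits in total before the
directory is empty.

```c
#include <assert.h>

/*
 * Toy cost model (an illustrative assumption, not kernel code).
 *
 * A kills `batch` dentries between two consecutive scans by B, and each
 * scan by B visits every dentry that is still present.  Returns the total
 * number of dentries B visits before the directory is empty.
 */
unsigned long long rescan_work(unsigned long long num,
			       unsigned long long batch)
{
	unsigned long long remaining = num;
	unsigned long long visited = 0;

	assert(batch > 0);

	while (remaining > 0) {
		/* B rescans everything that is still there. */
		visited += remaining;
		/* A makes `batch` dentries worth of progress. */
		remaining -= batch < remaining ? batch : remaining;
	}
	return visited;
}
```

With batch == 1 the total is NUM * (NUM + 1) / 2, i.e. quadratic in NUM.
The larger the batch A can complete between B's scans, the closer the
total gets to a single pass over the directory, which is the effect that
making B back off is meant to have.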

The proposed fix is to detect this situation in B (found some dentries,
but all are being shrunk already) and just sleep for some time before
retrying the scan. The sleep is proportional to the number of found
dentries.

Test script:

--- 8< --- 8< --- 8< --- 8< --- 8< ---
#!/bin/bash

TESTROOT=/var/tmp/test-root
SUBDIR=$TESTROOT/sub/dir

prepare()
{
	rm -rf $TESTROOT
	mkdir -p $SUBDIR

	for (( i = 0; i < 1000; i++ )); do
		for ((j = 0; j < 1000; j++)); do
			if test -e $SUBDIR/$i.$j; then
				echo "This should not happen!"
				exit 1
			fi
		done
		printf "%i (%s) ...\r" $((($i + 1) * $j)) `grep dentry /proc/slabinfo | sed -e "s/dentry *\([0-9]*\).*/\1/"`
	done
}

prepare
printf "\nStarting shrinking\n"
time rmdir $TESTROOT 2> /dev/null

prepare
printf "\nStarting parallel shrinking\n"
time (rmdir $SUBDIR & rmdir $TESTROOT 2> /dev/null & wait)
--- 8< --- 8< --- 8< --- 8< --- 8< ---

Signed-off-by: Miklos Szeredi
---
 fs/dcache.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 60df712262c2..ff250f3843d7 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include "mount.h"
 
@@ -1479,9 +1480,15 @@ void shrink_dcache_parent(struct dentry *parent)
 			continue;
 		}
 
-		cond_resched();
 		if (!data.found)
 			break;
+
+		/*
+		 * Wait some for other shrinkers to process these found
+		 * dentries.  This formula gives about 100ns on average per
+		 * dentry for large number of dentries.
+		 */
+		usleep_range(data.found / 15 + 1, data.found / 7 + 2);
 	}
 }
 EXPORT_SYMBOL(shrink_dcache_parent);
-- 
2.14.3
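
For reference, the arithmetic behind the usleep_range() arguments can be
reproduced in user space. The sketch below copies only the two integer
expressions from the patch. The struct and function names are made up for
illustration, and taking the midpoint of the range as the "average" sleep
is an assumption, since usleep_range() only promises a wakeup somewhere
between the two bounds.

```c
#include <assert.h>

/* Hypothetical helper type: the two usleep_range() bounds, in usecs. */
struct sleep_bounds {
	unsigned long min_us;
	unsigned long max_us;
};

/* Same integer arithmetic as the usleep_range() arguments in the patch. */
struct sleep_bounds shrink_sleep_bounds(unsigned long found)
{
	struct sleep_bounds b = {
		.min_us = found / 15 + 1,
		.max_us = found / 7 + 2,
	};

	assert(b.min_us <= b.max_us);
	return b;
}

/*
 * Midpoint of the range, expressed in nanoseconds per found dentry.
 * Using the midpoint as the average is an assumption of this sketch.
 */
double avg_ns_per_dentry(unsigned long found)
{
	struct sleep_bounds b = shrink_sleep_bounds(found);

	assert(found > 0);
	return (b.min_us + b.max_us) / 2.0 * 1000.0 / (double)found;
}
```

For found == 1000000 (the size used in the test script) the bounds are
66667us and 142859us, so the midpoint works out to roughly 105ns per
dentry, in line with the "about 100ns" in the comment. For small counts
the constant terms dominate: with found == 1 the range is 1us to 2us.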