Date: Sat, 3 Jun 2017 02:12:41 +0100
From: Al Viro
To: Khazhismel Kumykov
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Miklos Szeredi
Subject: Re: Hang/soft lockup in d_invalidate with simultaneous calls
Message-ID: <20170603011241.GH6365@ZenIV.linux.org.uk>

On Wed, May 17, 2017 at 02:58:11PM -0700, Khazhismel Kumykov wrote:

> Once the dentry is on a shrink list would
> it be unreachable anyways,

Why would it be?  Suppose e.g. shrink_dcache_parent() finds a dentry with
zero refcount; onto the shrink list it goes, right?  Then, before we
actually get around to evicting it, somebody goes and looks it up and
increments the refcount.  It's still on the shrink list at that point;
whoever did that lookup has no way to safely remove the sucker from the
list - only the owner of the shrink list can do that.  And that's precisely
what happens once that owner gets around to shrink_dentry_list():

		d_shrink_del(dentry);

		/*
		 * We found an inuse dentry which was not removed from
		 * the LRU because of laziness during lookup. Do not free it.
		 */
		if (dentry->d_lockref.count > 0) {
			spin_unlock(&dentry->d_lock);
			if (parent)
				spin_unlock(&parent->d_lock);
			continue;
		}

and we are done with it.

A dentry being on a shrink list is *NOT* unreachable; it might already have
been looked up since it was placed there, and it might be looked up at any
point up until shrink_dentry_list() gets to it.  We really can't quit
d_invalidate() until all those dentries are dealt with.

The shit hits the fan when you get a bunch of threads hitting d_invalidate()
in parallel on the same directory.  "Let's just go away and expect that
eventually they'll get evicted" is not a fix.  Each of them picks one victim
(after having skipped everything claimed by the others), proudly disposes of
it and then repeats everything from scratch.  Worse, if everything _is_
claimed, we'll just keep rescanning again and again until those dentries go
away.

Part of that could be relieved if we turned check_and_drop() into

static void check_and_drop(void *_data)
{
	struct detach_data *data = _data;

	if (!data->mountpoint && list_empty(&data->select.dispose))
		__d_drop(data->select.start);
}

It doesn't solve the entire problem, though - we could still get too many
threads into that thing before any of them gets to that __d_drop().  I
wonder if we should try to collect more victims per pass; that could lead
to contention of its own, but...

Anyway, could you try the one-liner as above and see what it does to your
reproducer?  I.e.

diff --git a/fs/dcache.c b/fs/dcache.c
index cddf39777835..21f8dd0002a6 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1494,7 +1494,7 @@ static void check_and_drop(void *_data)
 {
 	struct detach_data *data = _data;
 
-	if (!data->mountpoint && !data->select.found)
+	if (!data->mountpoint && list_empty(&data->select.dispose))
 		__d_drop(data->select.start);
 }
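
For context, this is roughly the retry loop all of those threads end up
spinning in - a paraphrase of d_invalidate() in fs/dcache.c as of this
thread, not a verbatim copy, with comments added:

	void d_invalidate(struct dentry *dentry)
	{
		/* already unhashed?  nothing to do */
		spin_lock(&dentry->d_lock);
		if (d_unhashed(dentry)) {
			spin_unlock(&dentry->d_lock);
			return;
		}
		spin_unlock(&dentry->d_lock);

		/* negative dentries can simply be dropped */
		if (!dentry->d_inode) {
			d_drop(dentry);
			return;
		}

		for (;;) {
			struct detach_data data;

			data.mountpoint = NULL;
			INIT_LIST_HEAD(&data.select.dispose);
			data.select.start = dentry;
			data.select.found = 0;

			/* walk the subtree, collecting unused dentries and
			 * submounts; check_and_drop() runs at the end of
			 * each walk */
			d_walk(dentry, &data, detach_and_collect, check_and_drop);

			if (data.select.found)
				shrink_dentry_list(&data.select.dispose);

			if (data.mountpoint) {
				detach_mounts(data.mountpoint);
				dput(data.mountpoint);
			}

			/* nothing left in the subtree - done; otherwise
			 * rescan the whole thing from scratch */
			if (!data.mountpoint && !data.select.found)
				break;

			cond_resched();
		}
	}

Every pass that finds anything - including dentries sitting on somebody
else's shrink list - sends us around that loop again.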
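
And the reason the new check is list_empty(&data->select.dispose) rather
than !data->select.found: the collection callback counts dentries that are
already claimed by another thread's shrink list in ->found without putting
them on our own dispose list.  Again a paraphrase of the relevant part of
select_collect() (as called via detach_and_collect()), not verbatim:

	if (dentry->d_flags & DCACHE_SHRINK_LIST) {
		/* already claimed by somebody else's shrink list -
		 * counted, but not ours to dispose of */
		data->found++;
	} else {
		if (dentry->d_flags & DCACHE_LRU_LIST)
			d_lru_del(dentry);
		if (!dentry->d_lockref.count) {
			/* unused and unclaimed - ours */
			d_shrink_add(dentry, &data->dispose);
			data->found++;
		}
	}

With the old check, a thread whose every candidate had been claimed by
others ends up with ->found non-zero and an empty dispose list, so it never
does the __d_drop() and just keeps rescanning until the others finish; with
the one-liner it unhashes the starting dentry on the first such pass, so
new lookups stop reaching the subtree.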