Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754250AbcCAXvJ (ORCPT ); Tue, 1 Mar 2016 18:51:09 -0500 Received: from mail333.us4.mandrillapp.com ([205.201.137.77]:42851 "EHLO mail333.us4.mandrillapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753563AbcCAXu7 (ORCPT ); Tue, 1 Mar 2016 18:50:59 -0500 DomainKey-Signature: a=rsa-sha1; c=nofws; q=dns; s=mandrill; d=linuxfoundation.org; b=YV4n3DYVu4Shp55BEXYuQ1zNPHo2xsapmth4fawi8XlUHkVDy+MEZ+Dmdt5Mpig+aHbZFuOObsRr mq0L5xdPenn9k0p8uhIUNWLCn96NslmApO6o5Ux61zJQxDecuy7oHsLNWOZT+nMD0/U3hQIRroDV eyNAM0k8fwnVE8X5TVw=; From: Greg Kroah-Hartman Subject: [PATCH 3.14 028/130] shrink_dentry_list(): take parents ->d_lock earlier X-Mailer: git-send-email 2.7.2 To: Cc: Greg Kroah-Hartman , , Al Viro Message-Id: <20160301234500.751381018@linuxfoundation.org> In-Reply-To: <20160301234459.768886030@linuxfoundation.org> References: <20160301234459.768886030@linuxfoundation.org> X-Report-Abuse: Please forward a copy of this message, including all headers, to abuse@mandrill.com X-Report-Abuse: You can also report abuse here: http://mandrillapp.com/contact/abuse?id=30481620.fc92e6bdf0024e94931a8d60ff2cb992 X-Mandrill-User: md_30481620 Date: Tue, 01 Mar 2016 23:50:58 +0000 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4642 Lines: 140 3.14-stable review patch. If anyone has any objections, please let me know. ------------------ From: Al Viro commit 046b961b45f93a92e4c70525a12f3d378bced130 upstream. The cause of livelocks there is that we are taking ->d_lock on dentry and its parent in the wrong order, forcing us to use trylock on the parent's one. d_walk() takes them in the right order, and unfortunately it's not hard to create a situation when shrink_dentry_list() can't make progress since trylock keeps failing, and shrink_dcache_parent() or check_submounts_and_drop() keeps calling d_walk() disrupting the very shrink_dentry_list() it's waiting for. Solution is straightforward - if that trylock fails, let's unlock the dentry itself and take locks in the right order. We need to stabilize ->d_parent without holding ->d_lock, but that's doable using RCU. And we'd better do that in the very beginning of the loop in shrink_dentry_list(), since the checks on refcount, etc. would need to be redone anyway. That deals with a half of the problem - killing dentries on the shrink list itself. Another one (dropping their parents) is in the next commit. locking parent is interesting - it would be easy to do rcu_read_lock(), lock whatever we think is a parent, lock dentry itself and check if the parent is still the right one. Except that we need to check that *before* locking the dentry, or we are risking taking ->d_lock out of order. Fortunately, once the D1 is locked, we can check if D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent can start or stop pointing to D1 only under D1->d_lock, so taking D1->d_lock is enough. In other words, the right solution is rcu_read_lock/lock what looks like parent right now/check if it's still our parent/rcu_read_unlock/lock the child. Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman --- fs/dcache.c | 53 +++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 41 insertions(+), 12 deletions(-) --- a/fs/dcache.c +++ b/fs/dcache.c @@ -528,6 +528,38 @@ failed: return dentry; /* try again with same dentry */ } +static inline struct dentry *lock_parent(struct dentry *dentry) +{ + struct dentry *parent = dentry->d_parent; + if (IS_ROOT(dentry)) + return NULL; + if (likely(spin_trylock(&parent->d_lock))) + return parent; + spin_unlock(&dentry->d_lock); + rcu_read_lock(); +again: + parent = ACCESS_ONCE(dentry->d_parent); + spin_lock(&parent->d_lock); + /* + * We can't blindly lock dentry until we are sure + * that we won't violate the locking order. + * Any changes of dentry->d_parent must have + * been done with parent->d_lock held, so + * spin_lock() above is enough of a barrier + * for checking if it's still our child. + */ + if (unlikely(parent != dentry->d_parent)) { + spin_unlock(&parent->d_lock); + goto again; + } + rcu_read_unlock(); + if (parent != dentry) + spin_lock(&dentry->d_lock); + else + parent = NULL; + return parent; +} + /* * This is dput * @@ -805,6 +837,8 @@ static void shrink_dentry_list(struct li struct inode *inode; dentry = list_entry(list->prev, struct dentry, d_lru); spin_lock(&dentry->d_lock); + parent = lock_parent(dentry); + /* * The dispose list is isolated and dentries are not accounted * to the LRU here, so we can simply remove it from the list @@ -818,6 +852,8 @@ static void shrink_dentry_list(struct li */ if ((int)dentry->d_lockref.count > 0) { spin_unlock(&dentry->d_lock); + if (parent) + spin_unlock(&parent->d_lock); continue; } @@ -825,6 +861,8 @@ static void shrink_dentry_list(struct li if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) { bool can_free = dentry->d_flags & DCACHE_MAY_FREE; spin_unlock(&dentry->d_lock); + if (parent) + spin_unlock(&parent->d_lock); if (can_free) dentry_free(dentry); continue; @@ -834,22 +872,13 @@ static void shrink_dentry_list(struct li if (inode && unlikely(!spin_trylock(&inode->i_lock))) { d_shrink_add(dentry, list); spin_unlock(&dentry->d_lock); + if (parent) + spin_unlock(&parent->d_lock); continue; } - parent = NULL; - if (!IS_ROOT(dentry)) { - parent = dentry->d_parent; - if (unlikely(!spin_trylock(&parent->d_lock))) { - if (inode) - spin_unlock(&inode->i_lock); - d_shrink_add(dentry, list); - spin_unlock(&dentry->d_lock); - continue; - } - } - __dentry_kill(dentry); + /* * We need to prune ancestors too. This is necessary to prevent * quadratic behavior of shrink_dcache_parent(), but is also