From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jan Kara
Subject: [PATCH 4.19 279/361] fsnotify: Fix busy inodes during unmount
Date: Sun, 11 Nov 2018 14:20:26 -0800
Message-Id: <20181111221655.682367480@linuxfoundation.org>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20181111221619.915519183@linuxfoundation.org>
References: <20181111221619.915519183@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jan Kara

commit 721fb6fbfd2132164c2e8777cc837f9b2c1794dc upstream.

Detaching of mark connector from fsnotify_put_mark() can race with
unmounting of the filesystem like:

  CPU1                                  CPU2
fsnotify_put_mark()
  spin_lock(&conn->lock);
  ...
  inode = fsnotify_detach_connector_from_object(conn)
  spin_unlock(&conn->lock);
                                        generic_shutdown_super()
                                          fsnotify_unmount_inodes()
                                            sees connector detached for inode
                                              -> nothing to do
                                          evict_inode()
                                            barfs on pending inode reference
  iput(inode);

Resulting in "Busy inodes after unmount" message and possible kernel
oops. Make fsnotify_unmount_inodes() properly wait for outstanding inode
references from detached connectors.

Note that the accounting of outstanding inode references in the
superblock can cause some cacheline contention on the counter. OTOH it
happens only during deletion of the last notification mark from an inode
(or during unlinking of watched inode) and that is not too bad. I have
measured time to create & delete inotify watch 100000 times from 64
processes in parallel (each process having its own inotify group and its
own file on a shared superblock) on a 64 CPU machine. Average and
standard deviation of 15 runs look like:

                Avg             Stddev
Vanilla         9.817400        0.276165
Fixed           9.710467        0.228294

So there's no statistically significant difference.
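As a rough check on the claim above, the two rows can be compared with Welch's t statistic. The sketch below assumes 15 runs for each kernel (the message says only "15 runs") and works with t squared so that no math library is needed. `welch_t_squared` is a hypothetical helper written for this note, not kernel code.

```c
/*
 * Square of Welch's t statistic for two samples, each described by its
 * mean, standard deviation and number of runs.
 *
 * Assumption: both rows of the table come from the same number of runs.
 */
static double welch_t_squared(double mean_a, double sd_a, int n_a,
			      double mean_b, double sd_b, int n_b)
{
	double diff = mean_a - mean_b;
	/* Squared standard error of the difference of the two means. */
	double se2 = sd_a * sd_a / n_a + sd_b * sd_b / n_b;

	return diff * diff / se2;
}
```

Called as welch_t_squared(9.817400, 0.276165, 15, 9.710467, 0.228294, 15), this gives t^2 of about 1.34 (t about 1.16). A two-sided 5% result at roughly 27 degrees of freedom needs t above about 2.05, i.e. t^2 above about 4.2, so the figures are consistent with the conclusion drawn above.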

Fixes: 6b3f05d24d35 ("fsnotify: Detach mark from object list when last reference is dropped")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman

---
 fs/notify/fsnotify.c |    3 +++
 fs/notify/mark.c     |   39 +++++++++++++++++++++++++++++++--------
 include/linux/fs.h   |    3 +++
 3 files changed, 37 insertions(+), 8 deletions(-)

--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -96,6 +96,9 @@ void fsnotify_unmount_inodes(struct supe
 
 	if (iput_inode)
 		iput(iput_inode);
+	/* Wait for outstanding inode references from connectors */
+	wait_var_event(&sb->s_fsnotify_inode_refs,
+		       !atomic_long_read(&sb->s_fsnotify_inode_refs));
 }
 
 /*
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -179,17 +179,20 @@ static void fsnotify_connector_destroy_w
 	}
 }
 
-static struct inode *fsnotify_detach_connector_from_object(
-					struct fsnotify_mark_connector *conn)
+static void *fsnotify_detach_connector_from_object(
+					struct fsnotify_mark_connector *conn,
+					unsigned int *type)
 {
 	struct inode *inode = NULL;
 
+	*type = conn->type;
 	if (conn->type == FSNOTIFY_OBJ_TYPE_DETACHED)
 		return NULL;
 
 	if (conn->type == FSNOTIFY_OBJ_TYPE_INODE) {
 		inode = fsnotify_conn_inode(conn);
 		inode->i_fsnotify_mask = 0;
+		atomic_long_inc(&inode->i_sb->s_fsnotify_inode_refs);
 	} else if (conn->type == FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
 		fsnotify_conn_mount(conn)->mnt_fsnotify_mask = 0;
 	}
@@ -211,10 +214,29 @@ static void fsnotify_final_mark_destroy(
 	fsnotify_put_group(group);
 }
 
+/* Drop object reference originally held by a connector */
+static void fsnotify_drop_object(unsigned int type, void *objp)
+{
+	struct inode *inode;
+	struct super_block *sb;
+
+	if (!objp)
+		return;
+	/* Currently only inode references are passed to be dropped */
+	if (WARN_ON_ONCE(type != FSNOTIFY_OBJ_TYPE_INODE))
+		return;
+	inode = objp;
+	sb = inode->i_sb;
+	iput(inode);
+	if (atomic_long_dec_and_test(&sb->s_fsnotify_inode_refs))
+		wake_up_var(&sb->s_fsnotify_inode_refs);
+}
+
 void fsnotify_put_mark(struct fsnotify_mark *mark)
 {
 	struct fsnotify_mark_connector *conn;
-	struct inode *inode = NULL;
+	void *objp = NULL;
+	unsigned int type = FSNOTIFY_OBJ_TYPE_DETACHED;
 	bool free_conn = false;
 
 	/* Catch marks that were actually never attached to object */
@@ -234,7 +256,7 @@ void fsnotify_put_mark(struct fsnotify_m
 	conn = mark->connector;
 	hlist_del_init_rcu(&mark->obj_list);
 	if (hlist_empty(&conn->list)) {
-		inode = fsnotify_detach_connector_from_object(conn);
+		objp = fsnotify_detach_connector_from_object(conn, &type);
 		free_conn = true;
 	} else {
 		__fsnotify_recalc_mask(conn);
@@ -242,7 +264,7 @@ void fsnotify_put_mark(struct fsnotify_m
 	mark->connector = NULL;
 	spin_unlock(&conn->lock);
 
-	iput(inode);
+	fsnotify_drop_object(type, objp);
 
 	if (free_conn) {
 		spin_lock(&destroy_lock);
@@ -709,7 +731,8 @@ void fsnotify_destroy_marks(fsnotify_con
 {
 	struct fsnotify_mark_connector *conn;
 	struct fsnotify_mark *mark, *old_mark = NULL;
-	struct inode *inode;
+	void *objp;
+	unsigned int type;
 
 	conn = fsnotify_grab_connector(connp);
 	if (!conn)
@@ -735,11 +758,11 @@ void fsnotify_destroy_marks(fsnotify_con
 	 * mark references get dropped. It would lead to strange results such
 	 * as delaying inode deletion or blocking unmount.
 	 */
-	inode = fsnotify_detach_connector_from_object(conn);
+	objp = fsnotify_detach_connector_from_object(conn, &type);
 	spin_unlock(&conn->lock);
 	if (old_mark)
 		fsnotify_put_mark(old_mark);
-	iput(inode);
+	fsnotify_drop_object(type, objp);
 }
 
 /*
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1428,6 +1428,9 @@ struct super_block {
 	/* Number of inodes with nlink == 0 but still referenced */
 	atomic_long_t s_remove_count;
 
+	/* Pending fsnotify inode refs */
+	atomic_long_t s_fsnotify_inode_refs;
+
 	/* Being remounted read-only */
 	int s_readonly_remount;
 
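
The accounting added by this patch can be illustrated with a small user-space model. This is only a sketch in C11 atomics: `struct sb_model` and the `model_*` helpers are hypothetical names standing in for the superblock counter, and the model is single-threaded, so it shows the counter protocol only and not the actual sleeping done by wait_var_event().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct super_block; only the new counter is modelled. */
struct sb_model {
	atomic_long fsnotify_inode_refs;
};

/* Mirrors the atomic_long_inc() added to the connector detach path. */
static void model_detach(struct sb_model *sb)
{
	atomic_fetch_add(&sb->fsnotify_inode_refs, 1);
}

/*
 * Mirrors the counter handling in fsnotify_drop_object(): returns true
 * when this was the last outstanding reference, which is the point where
 * the kernel code calls wake_up_var().
 */
static bool model_drop(struct sb_model *sb)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	return atomic_fetch_sub(&sb->fsnotify_inode_refs, 1) == 1;
}

/* The condition that fsnotify_unmount_inodes() waits for. */
static bool model_unmount_may_proceed(struct sb_model *sb)
{
	return atomic_load(&sb->fsnotify_inode_refs) == 0;
}
```

In the patch itself, iput() runs before the decrement in fsnotify_drop_object(), so the unmount path cannot observe a zero counter while an inode reference taken from a detached connector is still pending.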