From: Ben Hutchings
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
CC: akpm@linux-foundation.org, "Al Viro", "Oleg Nesterov"
Date: Sun, 11 Nov 2018 19:49:05 +0000
X-Mailer: LinuxStableQueue (scripts by bwh)
Subject: [PATCH 3.16 316/366] fix __legitimize_mnt()/mntput() race

3.16.61-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro

commit 119e1ef80ecfe0d1deb6378d4ab41f5b71519de1 upstream.

__legitimize_mnt() has two problems - one is that in case of success
the check of mount_lock is not ordered wrt the preceding increment of
the refcount, making it possible to have a successful __legitimize_mnt()
on one CPU just before the otherwise final mntput() on another,
with __legitimize_mnt() not seeing mntput() taking the lock and
mntput() not seeing the increment done by __legitimize_mnt().
Solved by a pair of barriers.

Another is that failure of __legitimize_mnt() on the second
read_seqretry() leaves us with a reference that'll need to be
dropped by the caller; however, if that races with the final mntput()
we can end up with the caller dropping rcu_read_lock() and doing
mntput() to release that reference - with the first mntput()
having freed the damn thing just as rcu_read_lock() had been
dropped.  Solution: in the "do mntput() yourself" failure case
grab mount_lock, check if MNT_DOOMED has been set by a racing
final mntput() that has missed our increment and if it has -
undo the increment and treat that as the "failure, caller doesn't
need to drop anything" case.

It's not easy to hit - the final mntput() has to come right
after the first read_seqretry() in __legitimize_mnt() *and*
manage to miss the increment done by __legitimize_mnt() before
the second read_seqretry() in there.  The things that are almost
impossible to hit on bare hardware are not impossible on SMP
KVM, though...

Reported-by: Oleg Nesterov
Fixes: 48a066e72d97 ("RCU'd vfsmounts")
Signed-off-by: Al Viro
[bwh: Backported to 3.16: __legitimize_mnt() has not been split out from
 legitimize_mnt(). Adjust the added return statement and comments
 accordingly.]
Signed-off-by: Ben Hutchings
---
 fs/namespace.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -592,12 +592,20 @@ bool legitimize_mnt(struct vfsmount *bas
 		return true;
 	mnt = real_mount(bastard);
 	mnt_add_count(mnt, 1);
+	smp_mb();			// see mntput_no_expire()
 	if (likely(!read_seqretry(&mount_lock, seq)))
 		return true;
 	if (bastard->mnt_flags & MNT_SYNC_UMOUNT) {
 		mnt_add_count(mnt, -1);
 		return false;
 	}
+	lock_mount_hash();
+	if (unlikely(bastard->mnt_flags & MNT_DOOMED)) {
+		mnt_add_count(mnt, -1);
+		unlock_mount_hash();
+		return false;
+	}
+	unlock_mount_hash();
 	rcu_read_unlock();
 	mntput(bastard);
 	rcu_read_lock();
@@ -984,6 +992,11 @@ put_again:
 		return;
 	}
 	lock_mount_hash();
+	/*
+	 * make sure that if legitimize_mnt() has not seen us grab
+	 * mount_lock, we'll see their refcount increment here.
+	 */
+	smp_mb();
 	mnt_add_count(mnt, -1);
 	if (mnt_get_count(mnt)) {
 		rcu_read_unlock();
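
Not part of the patch: a minimal userspace sketch of the two code paths
touched above, for anyone who wants to step through them.  It assumes C11
atomics as stand-ins for the kernel primitives
(atomic_thread_fence(memory_order_seq_cst) for smp_mb(), a toy spinlock
plus counter for mount_lock).  Every model_* name, the legit enum and the
racer hook are hypothetical and exist only in this sketch; MNT_SYNC_UMOUNT
and RCU are left out.  The racer callback plays another CPU running
between the first sequence check and the refcount increment, so the
failure paths can be exercised deterministically on one thread.  It does
not reproduce the race itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct model_mount {
	atomic_int count;	/* stands in for the mount refcount */
	atomic_uint seq;	/* stands in for mount_lock's sequence */
	atomic_bool doomed;	/* stands in for MNT_DOOMED */
	atomic_flag lock;	/* toy spinlock behind "mount_lock" */
};

enum legit { LEGIT_OK, LEGIT_FAILED, LEGIT_FAILED_PUT };

/* Models lock_mount_hash(): take the lock and bump the sequence. */
static void model_lock(struct model_mount *m)
{
	while (atomic_flag_test_and_set_explicit(&m->lock, memory_order_acquire))
		;
	atomic_fetch_add_explicit(&m->seq, 1, memory_order_relaxed);
}

/* Models unlock_mount_hash(): bump the sequence again and release. */
static void model_unlock(struct model_mount *m)
{
	atomic_fetch_add_explicit(&m->seq, 1, memory_order_release);
	atomic_flag_clear_explicit(&m->lock, memory_order_release);
}

/* Models read_seqretry(): true if a writer has locked since 'seq'. */
static bool model_seqretry(struct model_mount *m, unsigned int seq)
{
	return atomic_load_explicit(&m->seq, memory_order_acquire) != seq;
}

/* Models the locked tail of mntput_no_expire(); true if it was the final put. */
static bool model_mntput(struct model_mount *m)
{
	int left;

	model_lock(m);
	/* smp_mb(): if the reader missed our lock, we see its increment */
	atomic_thread_fence(memory_order_seq_cst);
	left = atomic_fetch_sub_explicit(&m->count, 1, memory_order_relaxed) - 1;
	assert(left >= 0);
	if (left == 0)
		atomic_store_explicit(&m->doomed, true, memory_order_relaxed);
	model_unlock(m);
	return left == 0;
}

/* Models the patched legitimize_mnt(); 'racer' (may be NULL) plays another CPU. */
static enum legit model_legitimize(struct model_mount *m, unsigned int seq,
				   bool (*racer)(struct model_mount *))
{
	if (model_seqretry(m, seq))
		return LEGIT_FAILED;		/* no reference taken */
	if (racer)
		(void)racer(m);			/* e.g. a racing mntput() */
	atomic_fetch_add_explicit(&m->count, 1, memory_order_relaxed);
	/* smp_mb(): order the increment before the second sequence check */
	atomic_thread_fence(memory_order_seq_cst);
	if (!model_seqretry(m, seq))
		return LEGIT_OK;
	model_lock(m);
	if (atomic_load_explicit(&m->doomed, memory_order_relaxed)) {
		/* the final put missed our increment: undo it */
		atomic_fetch_sub_explicit(&m->count, 1, memory_order_relaxed);
		model_unlock(m);
		return LEGIT_FAILED;		/* nothing for the caller to drop */
	}
	model_unlock(m);
	return LEGIT_FAILED_PUT;		/* reference must still be dropped */
}
```

With a starting refcount of 1 and model_mntput passed as the racer, the
racing put is the final one, so the model takes the new MNT_DOOMED branch,
undoes its increment and leaves nothing for the caller to drop.  With a
starting refcount of 2 the racing put is not final, and the model reports
that a reference is still held and has to be released, which is where the
3.16 code goes on to call mntput() itself.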