Date: Tue, 17 Feb 2015 15:12:24 -0500
From: Jeff Layton
To: Linus Torvalds
Cc: "J. Bruce Fields", "Kirill A. Shutemov", linux-fsdevel,
	Linux Kernel Mailing List, Christoph Hellwig, Dave Chinner,
	Sasha Levin
Subject: Re: [GIT PULL] please pull file-locking related changes for v3.20
Message-ID: <20150217151224.2dc31ad8@tlielax.poochiereds.net>

On Tue, 17 Feb 2015 11:41:40 -0800
Linus Torvalds wrote:

> On Tue, Feb 17, 2015 at 11:27 AM, Jeff Layton wrote:
> >
> > What about this instead then?
>
> No. Really.
>
> > - leave the "drop the spinlock" thing in place in flock_lock_file for
> >   v3.20
>
> No. The whole concept of "drop the lock in the middle" is *BROKEN*.
> It's seriously crap. It's not just a bug, it's a really fundamentally
> wrong thing to do.
>
> > - change locks_remove_flock to just walk the list and delete any locks
> >   associated with the filp being closed
>
> No. That's still wrong. You can have two people holding a write-lock.
> Seriously. That's *shit*.
>
> The "drop the spinlock in the middle" must go. There's not even any
> reason for it. Just get rid of it. There can be no deadlock if you get
> rid of it, because
>
>  - we hold the flc_lock over the whole event, so we can never see any
>    half-way state
>
>  - if we actually decide to sleep (due to conflicting locks) and
>    return FILE_LOCK_DEFERRED, we will drop the lock before actually
>    sleeping, so nobody else will be deadlocking on this file lock. So
>    any *other* person who tries to do an upgrade will not sleep,
>    because the pending upgrade will have moved to the blocking list
>    (that whole "locks_insert_block" part).
>
> Ergo, either we'll upgrade the lock (atomically, within flc_lock), or
> we will drop the lock (possibly moving it to the blocking list). I
> don't see a deadlock.
>
> I think your (and mine - but mine had the more fundamental problem of
> never setting "old_fl" correctly at all) patch had a deadlock because
> you didn't actually remove the old lock when you returned
> FILE_LOCK_DEFERRED.
>
> But I think the correct minimal patch is actually to just remove the
> "if (found)" statement.
>
>               Linus

I agree that there's no deadlock. I also agree that allowing two
LOCK_EX's (or a LOCK_SH + LOCK_EX) on the file is broken. I'm just
leery of making a user-visible change at this point. I'd prefer to let
something like that soak in linux-next for a while.
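For anyone following along, the construct being argued about looks
roughly like this (a paraphrased sketch of the upgrade/downgrade path
in flock_lock_file, not the verbatim fs/locks.c code; the identifiers
come from the discussion above and the patch below, and the exact body
of the window is from memory):

	/*
	 * Sketch: at this point the caller's old flock lock has already
	 * been deleted with locks_delete_lock_ctx(), but the replacement
	 * has not been inserted yet.
	 */
	if (found) {
		/*
		 * Drop flc_lock so a blocked task gets a chance to run
		 * before we retake it. This is the window in question:
		 * with flc_lock released, another task can set its own
		 * flock lock before we reacquire it and insert ours.
		 */
		spin_unlock(&ctx->flc_lock);
		cond_resched();
		spin_lock(&ctx->flc_lock);
	}

Removing that block (Linus's "just remove the 'if (found)' statement")
makes the delete-and-reinsert atomic under flc_lock.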
Another possibility is to keep dropping the spinlock, but check whether
someone set a new lock on the same filp in the loop after that. If they
have, then we could just remove that lock before adding the new one. I
don't think that would violate anything, since there are no atomicity
guarantees here. If you're setting locks on the same filp from multiple
tasks then you're simply asking for trouble.

I don't expect that most apps do that, though, but rather work on their
own open file descriptions. Those might get bitten, however, if we stop
dropping the spinlock there, since we'll be changing how flock's
fairness works. See the attached (untested) patch for what I'm
thinking.

If you still think that removing the "if (found)" clause is the right
thing to do, I'll go with that, but I do worry that we might break some
(fragile) app that relies on the way flock works today.

-- 
Jeff Layton

[Attachment: 0001-locks-ensure-that-we-can-t-set-multiple-flock-locks-.patch]

From 3212be05d47300fbb5718932f92b33acde3d219c Mon Sep 17 00:00:00 2001
From: Jeff Layton
Date: Tue, 17 Feb 2015 15:08:06 -0500
Subject: [PATCH] locks: ensure that we can't set multiple flock locks for
 the same filp

Currently, we'll drop the spinlock in the middle of flock_lock_file in
the event that we found a lock that needed to be removed prior to an
upgrade or downgrade. It's possible, however, for another task to race
in and set a lock on the same filp. If that happens, then we don't want
to set an additional lock, so just remove the one that raced in and set
our own.

Signed-off-by: Jeff Layton
---
 fs/locks.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index fe8f9f46445b..099b60a46ccc 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -864,7 +864,7 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
 static int flock_lock_file(struct file *filp, struct file_lock *request)
 {
 	struct file_lock *new_fl = NULL;
-	struct file_lock *fl;
+	struct file_lock *fl, *tmp;
 	struct file_lock_context *ctx;
 	struct inode *inode = file_inode(filp);
 	int error = 0;
@@ -912,7 +912,12 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 	}
 
 find_conflict:
-	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
+	list_for_each_entry_safe(fl, tmp, &ctx->flc_flock, fl_list) {
+		/* did someone set a lock on the same filp? */
+		if (fl->fl_file == filp) {
+			locks_delete_lock_ctx(fl, &dispose);
+			continue;
+		}
 		if (!flock_locks_conflict(request, fl))
 			continue;
 		error = -EAGAIN;
-- 
2.1.0
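P.S. Purely as an illustration of the "locks on the same filp from
multiple tasks" pattern discussed above, here is a hypothetical
userspace test (not something from this thread): after a fork() the
parent and child share one open file description, so their flock()
calls convert the single lock on that filp rather than contending as
independent locks. It is during such a conversion that the in-kernel
window sketched earlier opens up.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/file.h>
#include <sys/wait.h>

int main(void)
{
	int fd = open("/tmp/flock-demo", O_CREAT | O_RDWR, 0600);
	pid_t pid;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Both tasks will start out holding the same shared lock. */
	if (flock(fd, LOCK_SH) < 0)
		perror("flock(LOCK_SH)");

	pid = fork();		/* child inherits the same filp */
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	/*
	 * Parent and child both "upgrade" the same open file description.
	 * These calls operate on one flock lock, so both succeed; the
	 * interesting part is how the kernel serializes the conversion.
	 */
	if (flock(fd, LOCK_EX) < 0)
		perror("flock(LOCK_EX)");
	printf("%s: now holds LOCK_EX\n", pid ? "parent" : "child");

	flock(fd, LOCK_UN);
	if (pid)
		wait(NULL);
	return 0;
}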