Date: Tue, 17 Feb 2015 15:12:24 -0500
From: Jeff Layton
To: Linus Torvalds
Cc: "J. Bruce Fields", "Kirill A. Shutemov", linux-fsdevel,
	Linux Kernel Mailing List, Christoph Hellwig, Dave Chinner,
	Sasha Levin
Subject: Re: [GIT PULL] please pull file-locking related changes for v3.20
Message-ID: <20150217151224.2dc31ad8@tlielax.poochiereds.net>

On Tue, 17 Feb 2015 11:41:40 -0800
Linus Torvalds wrote:

> On Tue, Feb 17, 2015 at 11:27 AM, Jeff Layton wrote:
> >
> > What about this instead then?
>
> No. Really.
>
> > - leave the "drop the spinlock" thing in place in flock_lock_file for
> >   v3.20
>
> No. The whole concept of "drop the lock in the middle" is *BROKEN*.
> It's seriously crap. It's not just a bug, it's a really fundamentally
> wrong thing to do.
>
> > - change locks_remove_flock to just walk the list and delete any locks
> >   associated with the filp being closed
>
> No. That's still wrong. You can have two people holding a write-lock.
> Seriously. That's *shit*.
>
> The "drop the spinlock in the middle" must go. There's not even any
> reason for it. Just get rid of it. There can be no deadlock if you get
> rid of it, because
>
>  - we hold the flc_lock over the whole event, so we can never see any
>    half-way state
>
>  - if we actually decide to sleep (due to conflicting locks) and
>    return FILE_LOCK_DEFERRED, we will drop the lock before actually
>    sleeping, so nobody else will be deadlocking on this file lock. So
>    any *other* person who tries to do an upgrade will not sleep,
>    because the pending upgrade will have moved to the blocking list
>    (that whole "locks_insert_block" part).
>
> Ergo, either we'll upgrade the lock (atomically, within flc_lock), or
> we will drop the lock (possibly moving it to the blocking list). I
> don't see a deadlock.
>
> I think your (and mine - but mine had the more fundamental problem of
> never setting "old_fl" correctly at all) patch had a deadlock because
> you didn't actually remove the old lock when you returned
> FILE_LOCK_DEFERRED.
>
> But I think the correct minimal patch is actually to just remove the
> "if (found)" statement.
>
>               Linus

I agree that there's no deadlock. I also agree that allowing two
LOCK_EX's (or a LOCK_SH + LOCK_EX) on the file is broken. I'm just
leery of making a user-visible change at this point. I'd prefer to let
something like that soak in linux-next for a while.
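For anyone following along, the construct being argued about looks
roughly like this (a paraphrased sketch of the upgrade/downgrade path
in flock_lock_file, not the verbatim fs/locks.c code; the identifiers
come from the discussion above and the patch below, and the exact body
of the window is from memory):

	/*
	 * Sketch: at this point the caller's old flock lock has already
	 * been deleted with locks_delete_lock_ctx(), but the replacement
	 * has not been inserted yet.
	 */
	if (found) {
		/*
		 * Drop flc_lock so a blocked task gets a chance to run
		 * before we retake it. This is the window in question:
		 * with flc_lock released, another task can set its own
		 * flock lock before we reacquire it and insert ours.
		 */
		spin_unlock(&ctx->flc_lock);
		cond_resched();
		spin_lock(&ctx->flc_lock);
	}

Removing that block (Linus's "just remove the 'if (found)' statement")
makes the delete-and-reinsert atomic under flc_lock.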
Another possibility is to keep dropping the spinlock, but check whether
someone set a new lock on the same filp in the loop after that. If they
have, then we could just remove that lock before adding the new one. I
don't think that would violate anything, since there are no atomicity
guarantees here. If you're setting locks on the same filp from multiple
tasks then you're simply asking for trouble.

I don't expect that most apps do that, though, but rather work on their
own open file descriptions. Those might get bitten, however, if we stop
dropping the spinlock there, since we'll be changing how flock's
fairness works. See the attached (untested) patch for what I'm
thinking.

If you still think that removing the "if (found)" clause is the right
thing to do, I'll go with that, but I do worry that we might break some
(fragile) app that relies on the way flock works today.

-- 
Jeff Layton

[Attachment: 0001-locks-ensure-that-we-can-t-set-multiple-flock-locks-.patch]

From 3212be05d47300fbb5718932f92b33acde3d219c Mon Sep 17 00:00:00 2001
From: Jeff Layton
Date: Tue, 17 Feb 2015 15:08:06 -0500
Subject: [PATCH] locks: ensure that we can't set multiple flock locks for
 the same filp

Currently, we'll drop the spinlock in the middle of flock_lock_file in
the event that we found a lock that needed to be removed prior to an
upgrade or downgrade. It's possible, however, for another task to race
in and set a lock on the same filp. If that happens, then we don't want
to set an additional lock, so just remove the one that raced in and set
our own.

Signed-off-by: Jeff Layton
---
 fs/locks.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index fe8f9f46445b..099b60a46ccc 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -864,7 +864,7 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
 static int flock_lock_file(struct file *filp, struct file_lock *request)
 {
 	struct file_lock *new_fl = NULL;
-	struct file_lock *fl;
+	struct file_lock *fl, *tmp;
 	struct file_lock_context *ctx;
 	struct inode *inode = file_inode(filp);
 	int error = 0;
@@ -912,7 +912,12 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
 	}
 
 find_conflict:
-	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
+	list_for_each_entry_safe(fl, tmp, &ctx->flc_flock, fl_list) {
+		/* did someone set a lock on the same filp? */
+		if (fl->fl_file == filp) {
+			locks_delete_lock_ctx(fl, &dispose);
+			continue;
+		}
 		if (!flock_locks_conflict(request, fl))
 			continue;
 		error = -EAGAIN;
-- 
2.1.0
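P.S. Purely as an illustration of the "locks on the same filp from
multiple tasks" pattern discussed above, here is a hypothetical
userspace test (not something from this thread): after a fork() the
parent and child share one open file description, so their flock()
calls convert the single lock on that filp rather than contending as
independent locks. It is during such a conversion that the in-kernel
window sketched earlier opens up.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/file.h>
#include <sys/wait.h>

int main(void)
{
	int fd = open("/tmp/flock-demo", O_CREAT | O_RDWR, 0600);
	pid_t pid;

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Both tasks will start out holding the same shared lock. */
	if (flock(fd, LOCK_SH) < 0)
		perror("flock(LOCK_SH)");

	pid = fork();		/* child inherits the same filp */
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	/*
	 * Parent and child both "upgrade" the same open file description.
	 * These calls operate on one flock lock, so both succeed; the
	 * interesting part is how the kernel serializes the conversion.
	 */
	if (flock(fd, LOCK_EX) < 0)
		perror("flock(LOCK_EX)");
	printf("%s: now holds LOCK_EX\n", pid ? "parent" : "child");

	flock(fd, LOCK_UN);
	if (pid)
		wait(NULL);
	return 0;
}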