v3:
- break out a ceph locking cleanup patch into a separate one earlier
in the series
- switch back to using the i_lock to protect assignment of i_flctx.
Using cmpxchg() was subject to races that I couldn't quite get a
grip on. Eventually I may switch it back, but it probably doesn't
provide much benefit.
- add a patch to clean up some comments that refer to i_flock
- use list_empty_careful in lockless checks rather than list_empty
v2:
- bugfix to the flc_posix list ordering that broke the split/merge code
- don't use i_lock to manage the i_flctx pointer. Do so locklessly
via cmpxchg().
- reordering of the patches to make the set bisectable. As a result
the new spinlock is not introduced until near the end of the set
- some cleanup of the lm_change operation was added, made possible
by the move to standard list_heads for tracking locks
- it takes greater pains to avoid spinlocking by checking when the
lists are empty in addition to whether the i_flctx pointer is NULL
- a patch was added to keep count of the number of locks, so we can
avoid having to do count/alloc/populate in ceph and cifs
This is the third iteration of this patchset. The second was posted
on January 8th, under the cover letter entitled:
[PATCH v2 00/10] locks: saner method for managing file locks
The big change here is that it once again uses the i_lock to protect the
i_flctx pointer assignment. In principle we should be able to use
cmpxchg instead, but that seems leave open a race that causes
locks_remove_file to miss recently-added locks. Given that is not a
super latency-sensitive codepath, an extra spinlock shouldn't make much
difference.
Many thanks to Sasha Levin who helped me nail a race that would leave
leftover locks on the i_flock list, and David Howells who convinced
me to just go ahead back to using the i_lock to protect that field.
There are some other minor changes, but overall it's pretty similar
to the last set. I still plan to merge this for v3.20.
------------------------[snip]-------------------------
We currently manage all file_locks via a singly-linked list. This is
problematic for a number of reasons:
- we have to protect all file locks with the same spinlock (or
equivalent). Currently that uses the i_lock, but Christoph has voiced
objections due to the potential for contention with other i_lock
users. He'd like to see us move to using a different lock.
- we have to walk through irrelevant file locks in order to get to the
ones we're interested in. For instance, POSIX locks are at the end
of the list, so we have to skip over all of the flock locks and
leases before we can work with them.
- the singly-linked list is primitive and difficult to work with. We
have to keep track of the "before" pointer and it's easy to get that
wrong.
Cleaning all of this up is complicated by the fact that no one really
wants to grow struct inode in order to do so. We have a single pointer
in the inode now and I don't think we want to use any more.
One possibility that Trond raised was to move this to an hlist, but
that doesn't do anything about the desire for a new spinlock.
This patchset takes the approach of replacing the i_flock list with a
new struct file_lock_context that is allocated when we intend to add a
new file lock to an inode. The file_lock_context is only freed when we
destroy the inode.
Within that, we have separate (and standard!) lists for each lock type,
and a dedicated spinlock for managing those lists. In principle we could
even consider adding separate locks for each, but I didn't bother with
that for now.
For now, the code is still pretty "raw" and isn't bisectable. This is
just a RFC for the basic approach. This is probably v3.19 material at
best.
Anyone have thoughts or comments on the basic approach?
Jeff Layton (13):
locks: add new struct list_head to struct file_lock
locks: have locks_release_file use flock_lock_file to release generic
flock locks
locks: add a new struct file_locking_context pointer to struct inode
ceph: move spinlocking into ceph_encode_locks_to_buffer and
ceph_count_locks
locks: move flock locks to file_lock_context
locks: convert posix locks to file_lock_context
locks: convert lease handling to file_lock_context
locks: remove i_flock field from struct inode
locks: add a dedicated spinlock to protect i_flctx lists
locks: clean up the lm_change prototype
locks: keep a count of locks on the flctx lists
locks: consolidate NULL i_flctx checks in locks_remove_file
locks: update comments that refer to inode->i_flock
fs/ceph/locks.c | 64 +++---
fs/ceph/mds_client.c | 4 -
fs/cifs/file.c | 34 +--
fs/inode.c | 3 +-
fs/lockd/svcsubs.c | 26 ++-
fs/locks.c | 569 +++++++++++++++++++++++++++------------------------
fs/nfs/delegation.c | 23 ++-
fs/nfs/nfs4state.c | 70 ++++---
fs/nfs/pagelist.c | 6 +-
fs/nfs/write.c | 41 +++-
fs/nfsd/nfs4state.c | 21 +-
fs/read_write.c | 2 +-
include/linux/fs.h | 52 +++--
13 files changed, 510 insertions(+), 405 deletions(-)
--
2.1.0
From: Jeff Layton <[email protected]>
...that we can use to queue file_locks to per-ctx list_heads. Go ahead
and convert locks_delete_lock and locks_dispose_list to use it instead
of the fl_block list.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/locks.c | 8 +++++---
include/linux/fs.h | 1 +
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index 59e2f905e4ff..bfe5f17401de 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -207,6 +207,7 @@ static struct kmem_cache *filelock_cache __read_mostly;
static void locks_init_lock_heads(struct file_lock *fl)
{
INIT_HLIST_NODE(&fl->fl_link);
+ INIT_LIST_HEAD(&fl->fl_list);
INIT_LIST_HEAD(&fl->fl_block);
init_waitqueue_head(&fl->fl_wait);
}
@@ -243,6 +244,7 @@ EXPORT_SYMBOL_GPL(locks_release_private);
void locks_free_lock(struct file_lock *fl)
{
BUG_ON(waitqueue_active(&fl->fl_wait));
+ BUG_ON(!list_empty(&fl->fl_list));
BUG_ON(!list_empty(&fl->fl_block));
BUG_ON(!hlist_unhashed(&fl->fl_link));
@@ -257,8 +259,8 @@ locks_dispose_list(struct list_head *dispose)
struct file_lock *fl;
while (!list_empty(dispose)) {
- fl = list_first_entry(dispose, struct file_lock, fl_block);
- list_del_init(&fl->fl_block);
+ fl = list_first_entry(dispose, struct file_lock, fl_list);
+ list_del_init(&fl->fl_list);
locks_free_lock(fl);
}
}
@@ -691,7 +693,7 @@ static void locks_delete_lock(struct file_lock **thisfl_p,
locks_unlink_lock(thisfl_p);
if (dispose)
- list_add(&fl->fl_block, dispose);
+ list_add(&fl->fl_list, dispose);
else
locks_free_lock(fl);
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 42efe13077b6..cd6818115162 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -934,6 +934,7 @@ int locks_in_grace(struct net *);
*/
struct file_lock {
struct file_lock *fl_next; /* singly linked list for this inode */
+ struct list_head fl_list; /* link into file_lock_context */
struct hlist_node fl_link; /* node in global lists */
struct list_head fl_block; /* circular list of blocked processes */
fl_owner_t fl_owner;
--
2.1.0
From: Jeff Layton <[email protected]>
...instead of open-coding it and removing flock locks directly. This
helps consolidate the flock lock removal logic into a single spot.
Signed-off-by: Jeff Layton <[email protected]>
---
fs/locks.c | 49 +++++++++++++++++++++++++++++++------------------
1 file changed, 31 insertions(+), 18 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index bfe5f17401de..ae1e7cf721d6 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2360,6 +2360,30 @@ void locks_remove_posix(struct file *filp, fl_owner_t owner)
EXPORT_SYMBOL(locks_remove_posix);
+static void
+locks_remove_flock(struct file *filp)
+{
+ struct file_lock fl = {
+ .fl_owner = filp,
+ .fl_pid = current->tgid,
+ .fl_file = filp,
+ .fl_flags = FL_FLOCK,
+ .fl_type = F_UNLCK,
+ .fl_end = OFFSET_MAX,
+ };
+
+ if (!file_inode(filp)->i_flock)
+ return;
+
+ if (filp->f_op->flock)
+ filp->f_op->flock(filp, F_SETLKW, &fl);
+ else
+ flock_lock_file(filp, &fl);
+
+ if (fl.fl_ops && fl.fl_ops->fl_release_private)
+ fl.fl_ops->fl_release_private(&fl);
+}
+
/*
* This function is called on the last close of an open file.
*/
@@ -2370,24 +2394,14 @@ void locks_remove_file(struct file *filp)
struct file_lock **before;
LIST_HEAD(dispose);
- if (!inode->i_flock)
- return;
-
+ /* remove any OFD locks */
locks_remove_posix(filp, filp);
- if (filp->f_op->flock) {
- struct file_lock fl = {
- .fl_owner = filp,
- .fl_pid = current->tgid,
- .fl_file = filp,
- .fl_flags = FL_FLOCK,
- .fl_type = F_UNLCK,
- .fl_end = OFFSET_MAX,
- };
- filp->f_op->flock(filp, F_SETLKW, &fl);
- if (fl.fl_ops && fl.fl_ops->fl_release_private)
- fl.fl_ops->fl_release_private(&fl);
- }
+ /* remove flock locks */
+ locks_remove_flock(filp);
+
+ if (!inode->i_flock)
+ return;
spin_lock(&inode->i_lock);
before = &inode->i_flock;
@@ -2407,8 +2421,7 @@ void locks_remove_file(struct file *filp)
* some info about it and then just remove it from
* the list.
*/
- WARN(!IS_FLOCK(fl),
- "leftover lock: dev=%u:%u ino=%lu type=%hhd flags=0x%x start=%lld end=%lld\n",
+ WARN(1, "leftover lock: dev=%u:%u ino=%lu type=%hhd flags=0x%x start=%lld end=%lld\n",
MAJOR(inode->i_sb->s_dev),
MINOR(inode->i_sb->s_dev), inode->i_ino,
fl->fl_type, fl->fl_flags,
--
2.1.0
From: Jeff Layton <[email protected]>
The current scheme of using the i_flock list is really difficult to
manage. There is also a legitimate desire for a per-inode spinlock to
manage these lists that isn't the i_lock.
Start conversion to a new scheme to eventually replace the old i_flock
list with a new "file_lock_context" object.
We start by adding a new i_flctx to struct inode. For now, it lives in
parallel with i_flock list, but will eventually replace it. The idea is
to allocate a structure to sit in that pointer and act as a locus for
all things file locking.
We allocate a file_lock_context for an inode when the first lock is
added to it, and it's only freed when the inode is freed. We use the
i_lock to protect the assignment, but afterward it should mostly be
accessed locklessly.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/inode.c | 3 ++-
fs/locks.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 11 +++++++++++
3 files changed, 57 insertions(+), 1 deletion(-)
diff --git a/fs/inode.c b/fs/inode.c
index aa149e7262ac..f30872ade6d7 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -194,7 +194,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
#ifdef CONFIG_FSNOTIFY
inode->i_fsnotify_mask = 0;
#endif
-
+ inode->i_flctx = NULL;
this_cpu_inc(nr_inodes);
return 0;
@@ -237,6 +237,7 @@ void __destroy_inode(struct inode *inode)
BUG_ON(inode_has_buffers(inode));
security_inode_free(inode);
fsnotify_inode_delete(inode);
+ locks_free_lock_context(inode->i_flctx);
if (!inode->i_nlink) {
WARN_ON(atomic_long_read(&inode->i_sb->s_remove_count) == 0);
atomic_long_dec(&inode->i_sb->s_remove_count);
diff --git a/fs/locks.c b/fs/locks.c
index ae1e7cf721d6..526d5fca67c8 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -202,8 +202,49 @@ static DEFINE_HASHTABLE(blocked_hash, BLOCKED_HASH_BITS);
*/
static DEFINE_SPINLOCK(blocked_lock_lock);
+static struct kmem_cache *flctx_cache __read_mostly;
static struct kmem_cache *filelock_cache __read_mostly;
+static struct file_lock_context *
+locks_get_lock_context(struct inode *inode)
+{
+ struct file_lock_context *new;
+
+ if (likely(inode->i_flctx))
+ goto out;
+
+ new = kmem_cache_alloc(flctx_cache, GFP_KERNEL);
+ if (!new)
+ goto out;
+
+ INIT_LIST_HEAD(&new->flc_flock);
+
+ /*
+ * Assign the pointer if it's not already assigned. If it is, then
+ * free the context we just allocated.
+ */
+ spin_lock(&inode->i_lock);
+ if (likely(!inode->i_flctx)) {
+ inode->i_flctx = new;
+ new = NULL;
+ }
+ spin_unlock(&inode->i_lock);
+
+ if (new)
+ kmem_cache_free(flctx_cache, new);
+out:
+ return inode->i_flctx;
+}
+
+void
+locks_free_lock_context(struct file_lock_context *ctx)
+{
+ if (ctx) {
+ WARN_ON_ONCE(!list_empty(&ctx->flc_flock));
+ kmem_cache_free(flctx_cache, ctx);
+ }
+}
+
static void locks_init_lock_heads(struct file_lock *fl)
{
INIT_HLIST_NODE(&fl->fl_link);
@@ -2636,6 +2677,9 @@ static int __init filelock_init(void)
{
int i;
+ flctx_cache = kmem_cache_create("file_lock_ctx",
+ sizeof(struct file_lock_context), 0, SLAB_PANIC, NULL);
+
filelock_cache = kmem_cache_create("file_lock_cache",
sizeof(struct file_lock), 0, SLAB_PANIC, NULL);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cd6818115162..dec0d38b05de 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -626,6 +626,7 @@ struct inode {
#endif
const struct file_operations *i_fop; /* former ->i_op->default_file_ops */
struct file_lock *i_flock;
+ struct file_lock_context *i_flctx;
struct address_space i_data;
struct list_head i_devices;
union {
@@ -965,6 +966,10 @@ struct file_lock {
} fl_u;
};
+struct file_lock_context {
+ struct list_head flc_flock;
+};
+
/* The following constant reflects the upper bound of the file/locking space */
#ifndef OFFSET_MAX
#define INT_LIMIT(x) (~((x)1 << (sizeof(x)*8 - 1)))
@@ -991,6 +996,7 @@ extern int fcntl_setlease(unsigned int fd, struct file *filp, long arg);
extern int fcntl_getlease(struct file *filp);
/* fs/locks.c */
+void locks_free_lock_context(struct file_lock_context *ctx);
void locks_free_lock(struct file_lock *fl);
extern void locks_init_lock(struct file_lock *);
extern struct file_lock * locks_alloc_lock(void);
@@ -1048,6 +1054,11 @@ static inline int fcntl_getlease(struct file *filp)
return F_UNLCK;
}
+static inline void
+locks_free_lock_context(struct file_lock_context *ctx)
+{
+}
+
static inline void locks_init_lock(struct file_lock *fl)
{
return;
--
2.1.0
From: Jeff Layton <[email protected]>
There is only a single call site for each of these functions, and the
caller takes the i_lock prior to calling them and drops it just
afterward. Move the spinlocking into the functions instead.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/ceph/locks.c | 4 ++++
fs/ceph/mds_client.c | 4 ----
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index c35c5c614e38..366dc2412605 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -251,12 +251,14 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
*fcntl_count = 0;
*flock_count = 0;
+ spin_lock(&inode->i_lock);
for (lock = inode->i_flock; lock != NULL; lock = lock->fl_next) {
if (lock->fl_flags & FL_POSIX)
++(*fcntl_count);
else if (lock->fl_flags & FL_FLOCK)
++(*flock_count);
}
+ spin_unlock(&inode->i_lock);
dout("counted %d flock locks and %d fcntl locks",
*flock_count, *fcntl_count);
}
@@ -279,6 +281,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
dout("encoding %d flock and %d fcntl locks", num_flock_locks,
num_fcntl_locks);
+ spin_lock(&inode->i_lock);
for (lock = inode->i_flock; lock != NULL; lock = lock->fl_next) {
if (lock->fl_flags & FL_POSIX) {
++seen_fcntl;
@@ -306,6 +309,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
}
}
fail:
+ spin_unlock(&inode->i_lock);
return err;
}
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index d2171f4a6980..5f62fb7a5d0a 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2700,20 +2700,16 @@ static int encode_caps_cb(struct inode *inode, struct ceph_cap *cap,
struct ceph_filelock *flocks;
encode_again:
- spin_lock(&inode->i_lock);
ceph_count_locks(inode, &num_fcntl_locks, &num_flock_locks);
- spin_unlock(&inode->i_lock);
flocks = kmalloc((num_fcntl_locks+num_flock_locks) *
sizeof(struct ceph_filelock), GFP_NOFS);
if (!flocks) {
err = -ENOMEM;
goto out_free;
}
- spin_lock(&inode->i_lock);
err = ceph_encode_locks_to_buffer(inode, flocks,
num_fcntl_locks,
num_flock_locks);
- spin_unlock(&inode->i_lock);
if (err) {
kfree(flocks);
if (err == -ENOSPC)
--
2.1.0
From: Jeff Layton <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/ceph/locks.c | 23 ++++++++++++++++-------
fs/locks.c | 54 ++++++++++++++++++++++++++++++++++-------------------
fs/nfs/delegation.c | 19 +++++++++++++++++--
fs/nfs/nfs4state.c | 42 +++++++++++++++++++++++++++++++++++++++--
fs/nfs/pagelist.c | 6 ++++++
fs/nfs/write.c | 43 +++++++++++++++++++++++++++++++++++++-----
6 files changed, 152 insertions(+), 35 deletions(-)
diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index 366dc2412605..917656ea8dcf 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -239,14 +239,16 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl)
return err;
}
-/**
- * Must be called with lock_flocks() already held. Fills in the passed
- * counter variables, so you can prepare pagelist metadata before calling
- * ceph_encode_locks.
+/*
+ * Fills in the passed counter variables, so you can prepare pagelist metadata
+ * before calling ceph_encode_locks.
+ *
+ * FIXME: add counters to struct file_lock_context so we don't need to do this?
*/
void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
{
struct file_lock *lock;
+ struct file_lock_context *ctx;
*fcntl_count = 0;
*flock_count = 0;
@@ -255,7 +257,11 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
for (lock = inode->i_flock; lock != NULL; lock = lock->fl_next) {
if (lock->fl_flags & FL_POSIX)
++(*fcntl_count);
- else if (lock->fl_flags & FL_FLOCK)
+ }
+
+ ctx = inode->i_flctx;
+ if (ctx) {
+ list_for_each_entry(lock, &ctx->flc_flock, fl_list)
++(*flock_count);
}
spin_unlock(&inode->i_lock);
@@ -273,6 +279,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
int num_fcntl_locks, int num_flock_locks)
{
struct file_lock *lock;
+ struct file_lock_context *ctx;
int err = 0;
int seen_fcntl = 0;
int seen_flock = 0;
@@ -295,8 +302,10 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
++l;
}
}
- for (lock = inode->i_flock; lock != NULL; lock = lock->fl_next) {
- if (lock->fl_flags & FL_FLOCK) {
+
+ ctx = inode->i_flctx;
+ if (ctx) {
+ list_for_each_entry(lock, &ctx->flc_flock, fl_list) {
++seen_flock;
if (seen_flock > num_flock_locks) {
err = -ENOSPC;
diff --git a/fs/locks.c b/fs/locks.c
index 526d5fca67c8..055df53f19de 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -694,6 +694,14 @@ static void locks_insert_lock(struct file_lock **pos, struct file_lock *fl)
locks_insert_global_locks(fl);
}
+static void
+locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before)
+{
+ fl->fl_nspid = get_pid(task_tgid(current));
+ list_add_tail(&fl->fl_list, before);
+ locks_insert_global_locks(fl);
+}
+
/**
* locks_delete_lock - Delete a lock and then free it.
* @thisfl_p: pointer that points to the fl_next field of the previous
@@ -739,6 +747,18 @@ static void locks_delete_lock(struct file_lock **thisfl_p,
locks_free_lock(fl);
}
+static void
+locks_delete_lock_ctx(struct file_lock *fl, struct list_head *dispose)
+{
+ locks_delete_global_locks(fl);
+ if (fl->fl_nspid) {
+ put_pid(fl->fl_nspid);
+ fl->fl_nspid = NULL;
+ }
+ locks_wake_up_blocks(fl);
+ list_move(&fl->fl_list, dispose);
+}
+
/* Determine if lock sys_fl blocks lock caller_fl. Common functionality
* checks for shared/exclusive status of overlapping locks.
*/
@@ -888,12 +908,17 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
static int flock_lock_file(struct file *filp, struct file_lock *request)
{
struct file_lock *new_fl = NULL;
- struct file_lock **before;
- struct inode * inode = file_inode(filp);
+ struct file_lock *fl;
+ struct file_lock_context *ctx;
+ struct inode *inode = file_inode(filp);
int error = 0;
- int found = 0;
+ bool found = false;
LIST_HEAD(dispose);
+ ctx = locks_get_lock_context(inode);
+ if (!ctx)
+ return -ENOMEM;
+
if (!(request->fl_flags & FL_ACCESS) && (request->fl_type != F_UNLCK)) {
new_fl = locks_alloc_lock();
if (!new_fl)
@@ -904,18 +929,13 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
if (request->fl_flags & FL_ACCESS)
goto find_conflict;
- for_each_lock(inode, before) {
- struct file_lock *fl = *before;
- if (IS_POSIX(fl))
- break;
- if (IS_LEASE(fl))
- continue;
+ list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
if (filp != fl->fl_file)
continue;
if (request->fl_type == fl->fl_type)
goto out;
- found = 1;
- locks_delete_lock(before, &dispose);
+ found = true;
+ locks_delete_lock_ctx(fl, &dispose);
break;
}
@@ -936,12 +956,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
}
find_conflict:
- for_each_lock(inode, before) {
- struct file_lock *fl = *before;
- if (IS_POSIX(fl))
- break;
- if (IS_LEASE(fl))
- continue;
+ list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
if (!flock_locks_conflict(request, fl))
continue;
error = -EAGAIN;
@@ -954,7 +969,7 @@ find_conflict:
if (request->fl_flags & FL_ACCESS)
goto out;
locks_copy_lock(new_fl, request);
- locks_insert_lock(before, new_fl);
+ locks_insert_lock_ctx(new_fl, &ctx->flc_flock);
new_fl = NULL;
error = 0;
@@ -2412,8 +2427,9 @@ locks_remove_flock(struct file *filp)
.fl_type = F_UNLCK,
.fl_end = OFFSET_MAX,
};
+ struct file_lock_context *flctx = file_inode(filp)->i_flctx;
- if (!file_inode(filp)->i_flock)
+ if (!flctx || list_empty(&flctx->flc_flock))
return;
if (filp->f_op->flock)
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 7f3f60641344..9f9f67b17e2b 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -85,15 +85,16 @@ static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_
{
struct inode *inode = state->inode;
struct file_lock *fl;
+ struct file_lock_context *flctx;
int status = 0;
- if (inode->i_flock == NULL)
+ if (inode->i_flock == NULL && inode->i_flctx == NULL)
goto out;
/* Protect inode->i_flock using the i_lock */
spin_lock(&inode->i_lock);
for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
- if (!(fl->fl_flags & (FL_POSIX|FL_FLOCK)))
+ if (!(fl->fl_flags & (FL_POSIX)))
continue;
if (nfs_file_open_context(fl->fl_file) != ctx)
continue;
@@ -103,6 +104,20 @@ static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_
goto out;
spin_lock(&inode->i_lock);
}
+
+ flctx = inode->i_flctx;
+ if (flctx) {
+ list_for_each_entry(fl, &flctx->flc_flock, fl_list) {
+ if (nfs_file_open_context(fl->fl_file) != ctx)
+ continue;
+ spin_unlock(&inode->i_lock);
+ status = nfs4_lock_delegation_recall(fl, state,
+ stateid);
+ if (status < 0)
+ goto out;
+ spin_lock(&inode->i_lock);
+ }
+ }
spin_unlock(&inode->i_lock);
out:
return status;
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 5194933ed419..65c404bf61ae 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1366,8 +1366,9 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
struct nfs_inode *nfsi = NFS_I(inode);
struct file_lock *fl;
int status = 0;
+ struct file_lock_context *flctx = inode->i_flctx;
- if (inode->i_flock == NULL)
+ if (inode->i_flock == NULL && flctx == NULL)
return 0;
/* Guard against delegation returns and new lock/unlock calls */
@@ -1375,7 +1376,7 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
/* Protect inode->i_flock using the BKL */
spin_lock(&inode->i_lock);
for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
- if (!(fl->fl_flags & (FL_POSIX|FL_FLOCK)))
+ if (!(fl->fl_flags & FL_POSIX))
continue;
if (nfs_file_open_context(fl->fl_file)->state != state)
continue;
@@ -1408,6 +1409,43 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
}
spin_lock(&inode->i_lock);
}
+
+ if (!flctx)
+ goto out_unlock;
+
+ list_for_each_entry(fl, &flctx->flc_flock, fl_list) {
+ if (nfs_file_open_context(fl->fl_file)->state != state)
+ continue;
+ spin_unlock(&inode->i_lock);
+ status = ops->recover_lock(state, fl);
+ switch (status) {
+ case 0:
+ break;
+ case -ESTALE:
+ case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_STALE_STATEID:
+ case -NFS4ERR_BAD_STATEID:
+ case -NFS4ERR_EXPIRED:
+ case -NFS4ERR_NO_GRACE:
+ case -NFS4ERR_STALE_CLIENTID:
+ case -NFS4ERR_BADSESSION:
+ case -NFS4ERR_BADSLOT:
+ case -NFS4ERR_BAD_HIGH_SLOT:
+ case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
+ goto out;
+ default:
+ pr_err("NFS: %s: unhandled error %d\n",
+ __func__, status);
+ case -ENOMEM:
+ case -NFS4ERR_DENIED:
+ case -NFS4ERR_RECLAIM_BAD:
+ case -NFS4ERR_RECLAIM_CONFLICT:
+ /* kill_proc(fl->fl_pid, SIGLOST, 1); */
+ status = 0;
+ }
+ spin_lock(&inode->i_lock);
+ }
+out_unlock:
spin_unlock(&inode->i_lock);
out:
up_write(&nfsi->rwsem);
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 2b5e769beb16..a3b62e15b444 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -826,6 +826,7 @@ static bool nfs_can_coalesce_requests(struct nfs_page *prev,
struct nfs_pageio_descriptor *pgio)
{
size_t size;
+ struct file_lock_context *flctx;
if (prev) {
if (!nfs_match_open_context(req->wb_context, prev->wb_context))
@@ -834,6 +835,11 @@ static bool nfs_can_coalesce_requests(struct nfs_page *prev,
!nfs_match_lock_context(req->wb_lock_context,
prev->wb_lock_context))
return false;
+ flctx = req->wb_context->dentry->d_inode->i_flctx;
+ if (flctx != NULL && !list_empty_careful(&flctx->flc_flock) &&
+ !nfs_match_lock_context(req->wb_lock_context,
+ prev->wb_lock_context))
+ return false;
if (req_offset(req) != req_offset(prev) + prev->wb_bytes)
return false;
if (req->wb_page == prev->wb_page) {
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index af3af685a9e3..e072aeb34195 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1113,6 +1113,11 @@ int nfs_flush_incompatible(struct file *file, struct page *page)
do_flush |= l_ctx->lockowner.l_owner != current->files
|| l_ctx->lockowner.l_pid != current->tgid;
}
+ if (l_ctx && ctx->dentry->d_inode->i_flctx &&
+ !list_empty_careful(&ctx->dentry->d_inode->i_flctx->flc_flock)) {
+ do_flush |= l_ctx->lockowner.l_owner != current->files
+ || l_ctx->lockowner.l_pid != current->tgid;
+ }
nfs_release_request(req);
if (!do_flush)
return 0;
@@ -1170,6 +1175,13 @@ out:
return PageUptodate(page) != 0;
}
+static bool
+is_whole_file_wrlock(struct file_lock *fl)
+{
+ return fl->fl_start == 0 && fl->fl_end == OFFSET_MAX &&
+ fl->fl_type == F_WRLCK;
+}
+
/* If we know the page is up to date, and we're not using byte range locks (or
* if we have the whole file locked for writing), it may be more efficient to
* extend the write to cover the entire page in order to avoid fragmentation
@@ -1180,17 +1192,38 @@ out:
*/
static int nfs_can_extend_write(struct file *file, struct page *page, struct inode *inode)
{
+ int ret;
+ struct file_lock_context *flctx = inode->i_flctx;
+ struct file_lock *fl;
+
if (file->f_flags & O_DSYNC)
return 0;
if (!nfs_write_pageuptodate(page, inode))
return 0;
if (NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
return 1;
- if (inode->i_flock == NULL || (inode->i_flock->fl_start == 0 &&
- inode->i_flock->fl_end == OFFSET_MAX &&
- inode->i_flock->fl_type != F_RDLCK))
- return 1;
- return 0;
+ if (!inode->i_flock && !flctx)
+ return 0;
+
+ /* Check to see if there are whole file write locks */
+ spin_lock(&inode->i_lock);
+ ret = 0;
+
+ fl = inode->i_flock;
+ if (fl && is_whole_file_wrlock(fl)) {
+ ret = 1;
+ goto out;
+ }
+
+ if (!list_empty(&flctx->flc_flock)) {
+ fl = list_first_entry(&flctx->flc_flock, struct file_lock,
+ fl_list);
+ if (fl->fl_type == F_WRLCK)
+ ret = 1;
+ }
+out:
+ spin_unlock(&inode->i_lock);
+ return ret;
}
/*
--
2.1.0
From: Jeff Layton <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/ceph/locks.c | 58 +++++++++++++---------------
fs/cifs/file.c | 26 +++++--------
fs/lockd/svcsubs.c | 20 ++++++----
fs/locks.c | 108 +++++++++++++++++++++++++++-------------------------
fs/nfs/delegation.c | 28 +++++---------
fs/nfs/nfs4state.c | 52 +++++--------------------
fs/nfs/pagelist.c | 8 ++--
fs/nfs/write.c | 30 +++++++--------
fs/nfsd/nfs4state.c | 18 +++++----
fs/read_write.c | 2 +-
include/linux/fs.h | 3 +-
11 files changed, 155 insertions(+), 198 deletions(-)
diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index 917656ea8dcf..19beeed83233 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -253,18 +253,15 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
*fcntl_count = 0;
*flock_count = 0;
- spin_lock(&inode->i_lock);
- for (lock = inode->i_flock; lock != NULL; lock = lock->fl_next) {
- if (lock->fl_flags & FL_POSIX)
- ++(*fcntl_count);
- }
-
ctx = inode->i_flctx;
if (ctx) {
+ spin_lock(&inode->i_lock);
+ list_for_each_entry(lock, &ctx->flc_posix, fl_list)
+ ++(*fcntl_count);
list_for_each_entry(lock, &ctx->flc_flock, fl_list)
++(*flock_count);
+ spin_unlock(&inode->i_lock);
}
- spin_unlock(&inode->i_lock);
dout("counted %d flock locks and %d fcntl locks",
*flock_count, *fcntl_count);
}
@@ -279,7 +276,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
int num_fcntl_locks, int num_flock_locks)
{
struct file_lock *lock;
- struct file_lock_context *ctx;
+ struct file_lock_context *ctx = inode->i_flctx;
int err = 0;
int seen_fcntl = 0;
int seen_flock = 0;
@@ -288,34 +285,31 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
dout("encoding %d flock and %d fcntl locks", num_flock_locks,
num_fcntl_locks);
+ if (!ctx)
+ return 0;
+
spin_lock(&inode->i_lock);
- for (lock = inode->i_flock; lock != NULL; lock = lock->fl_next) {
- if (lock->fl_flags & FL_POSIX) {
- ++seen_fcntl;
- if (seen_fcntl > num_fcntl_locks) {
- err = -ENOSPC;
- goto fail;
- }
- err = lock_to_ceph_filelock(lock, &flocks[l]);
- if (err)
- goto fail;
- ++l;
+ list_for_each_entry(lock, &ctx->flc_flock, fl_list) {
+ ++seen_fcntl;
+ if (seen_fcntl > num_fcntl_locks) {
+ err = -ENOSPC;
+ goto fail;
}
+ err = lock_to_ceph_filelock(lock, &flocks[l]);
+ if (err)
+ goto fail;
+ ++l;
}
-
- ctx = inode->i_flctx;
- if (ctx) {
- list_for_each_entry(lock, &ctx->flc_flock, fl_list) {
- ++seen_flock;
- if (seen_flock > num_flock_locks) {
- err = -ENOSPC;
- goto fail;
- }
- err = lock_to_ceph_filelock(lock, &flocks[l]);
- if (err)
- goto fail;
- ++l;
+ list_for_each_entry(lock, &ctx->flc_flock, fl_list) {
+ ++seen_flock;
+ if (seen_flock > num_flock_locks) {
+ err = -ENOSPC;
+ goto fail;
}
+ err = lock_to_ceph_filelock(lock, &flocks[l]);
+ if (err)
+ goto fail;
+ ++l;
}
fail:
spin_unlock(&inode->i_lock);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 96b7e9b7706d..ea78f6f81ce2 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1109,11 +1109,6 @@ cifs_push_mandatory_locks(struct cifsFileInfo *cfile)
return rc;
}
-/* copied from fs/locks.c with a name change */
-#define cifs_for_each_lock(inode, lockp) \
- for (lockp = &inode->i_flock; *lockp != NULL; \
- lockp = &(*lockp)->fl_next)
-
struct lock_to_push {
struct list_head llist;
__u64 offset;
@@ -1128,8 +1123,9 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
{
struct inode *inode = cfile->dentry->d_inode;
struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
- struct file_lock *flock, **before;
- unsigned int count = 0, i = 0;
+ struct file_lock *flock;
+ struct file_lock_context *flctx = inode->i_flctx;
+ unsigned int count = 0, i;
int rc = 0, xid, type;
struct list_head locks_to_send, *el;
struct lock_to_push *lck, *tmp;
@@ -1137,10 +1133,12 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
xid = get_xid();
+ if (!flctx)
+ goto out;
+
spin_lock(&inode->i_lock);
- cifs_for_each_lock(inode, before) {
- if ((*before)->fl_flags & FL_POSIX)
- count++;
+ list_for_each(el, &flctx->flc_posix) {
+ count++;
}
spin_unlock(&inode->i_lock);
@@ -1151,7 +1149,7 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
* added to the list while we are holding cinode->lock_sem that
* protects locking operations of this inode.
*/
- for (; i < count; i++) {
+ for (i = 0; i < count; i++) {
lck = kmalloc(sizeof(struct lock_to_push), GFP_KERNEL);
if (!lck) {
rc = -ENOMEM;
@@ -1162,10 +1160,7 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
el = locks_to_send.next;
spin_lock(&inode->i_lock);
- cifs_for_each_lock(inode, before) {
- flock = *before;
- if ((flock->fl_flags & FL_POSIX) == 0)
- continue;
+ list_for_each_entry(flock, &flctx->flc_posix, fl_list) {
if (el == &locks_to_send) {
/*
* The list ended. We don't have enough allocated
@@ -1185,7 +1180,6 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
lck->length = length;
lck->type = type;
lck->offset = flock->fl_start;
- el = el->next;
}
spin_unlock(&inode->i_lock);
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index d12ff4e2dbe7..5300bb53835f 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -164,12 +164,15 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
{
struct inode *inode = nlmsvc_file_inode(file);
struct file_lock *fl;
+ struct file_lock_context *flctx = inode->i_flctx;
struct nlm_host *lockhost;
+ if (!flctx || list_empty_careful(&flctx->flc_posix))
+ return 0;
again:
file->f_locks = 0;
spin_lock(&inode->i_lock);
- for (fl = inode->i_flock; fl; fl = fl->fl_next) {
+ list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
if (fl->fl_lmops != &nlmsvc_lock_operations)
continue;
@@ -223,18 +226,21 @@ nlm_file_inuse(struct nlm_file *file)
{
struct inode *inode = nlmsvc_file_inode(file);
struct file_lock *fl;
+ struct file_lock_context *flctx = inode->i_flctx;
if (file->f_count || !list_empty(&file->f_blocks) || file->f_shares)
return 1;
- spin_lock(&inode->i_lock);
- for (fl = inode->i_flock; fl; fl = fl->fl_next) {
- if (fl->fl_lmops == &nlmsvc_lock_operations) {
- spin_unlock(&inode->i_lock);
- return 1;
+ if (flctx && !list_empty_careful(&flctx->flc_posix)) {
+ spin_lock(&inode->i_lock);
+ list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
+ if (fl->fl_lmops == &nlmsvc_lock_operations) {
+ spin_unlock(&inode->i_lock);
+ return 1;
+ }
}
+ spin_unlock(&inode->i_lock);
}
- spin_unlock(&inode->i_lock);
file->f_locks = 0;
return 0;
}
diff --git a/fs/locks.c b/fs/locks.c
index 055df53f19de..e50bb4d9e757 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -157,9 +157,6 @@ static int target_leasetype(struct file_lock *fl)
int leases_enable = 1;
int lease_break_time = 45;
-#define for_each_lock(inode, lockp) \
- for (lockp = &inode->i_flock; *lockp != NULL; lockp = &(*lockp)->fl_next)
-
/*
* The global file_lock_list is only used for displaying /proc/locks, so we
* keep a list on each CPU, with each list protected by its own spinlock via
@@ -218,6 +215,7 @@ locks_get_lock_context(struct inode *inode)
goto out;
INIT_LIST_HEAD(&new->flc_flock);
+ INIT_LIST_HEAD(&new->flc_posix);
/*
* Assign the pointer if it's not already assigned. If it is, then
@@ -241,6 +239,7 @@ locks_free_lock_context(struct file_lock_context *ctx)
{
if (ctx) {
WARN_ON_ONCE(!list_empty(&ctx->flc_flock));
+ WARN_ON_ONCE(!list_empty(&ctx->flc_posix));
kmem_cache_free(flctx_cache, ctx);
}
}
@@ -809,21 +808,26 @@ void
posix_test_lock(struct file *filp, struct file_lock *fl)
{
struct file_lock *cfl;
+ struct file_lock_context *ctx;
struct inode *inode = file_inode(filp);
+ ctx = inode->i_flctx;
+ if (!ctx || list_empty_careful(&ctx->flc_posix)) {
+ fl->fl_type = F_UNLCK;
+ return;
+ }
+
spin_lock(&inode->i_lock);
- for (cfl = file_inode(filp)->i_flock; cfl; cfl = cfl->fl_next) {
- if (!IS_POSIX(cfl))
- continue;
- if (posix_locks_conflict(fl, cfl))
- break;
+ list_for_each_entry(cfl, &ctx->flc_posix, fl_list) {
+ if (posix_locks_conflict(fl, cfl)) {
+ locks_copy_conflock(fl, cfl);
+ if (cfl->fl_nspid)
+ fl->fl_pid = pid_vnr(cfl->fl_nspid);
+ goto out;
+ }
}
- if (cfl) {
- locks_copy_conflock(fl, cfl);
- if (cfl->fl_nspid)
- fl->fl_pid = pid_vnr(cfl->fl_nspid);
- } else
- fl->fl_type = F_UNLCK;
+ fl->fl_type = F_UNLCK;
+out:
spin_unlock(&inode->i_lock);
return;
}
@@ -983,16 +987,20 @@ out:
static int __posix_lock_file(struct inode *inode, struct file_lock *request, struct file_lock *conflock)
{
- struct file_lock *fl;
+ struct file_lock *fl, *tmp;
struct file_lock *new_fl = NULL;
struct file_lock *new_fl2 = NULL;
struct file_lock *left = NULL;
struct file_lock *right = NULL;
- struct file_lock **before;
+ struct file_lock_context *ctx;
int error;
bool added = false;
LIST_HEAD(dispose);
+ ctx = locks_get_lock_context(inode);
+ if (!ctx)
+ return -ENOMEM;
+
/*
* We may need two file_lock structures for this operation,
* so we get them in advance to avoid races.
@@ -1013,8 +1021,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
* blocker's list of waiters and the global blocked_hash.
*/
if (request->fl_type != F_UNLCK) {
- for_each_lock(inode, before) {
- fl = *before;
+ list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
if (!IS_POSIX(fl))
continue;
if (!posix_locks_conflict(request, fl))
@@ -1044,29 +1051,25 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
if (request->fl_flags & FL_ACCESS)
goto out;
- /*
- * Find the first old lock with the same owner as the new lock.
- */
-
- before = &inode->i_flock;
-
- /* First skip locks owned by other processes. */
- while ((fl = *before) && (!IS_POSIX(fl) ||
- !posix_same_owner(request, fl))) {
- before = &fl->fl_next;
+ /* Find the first old lock with the same owner as the new lock */
+ list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
+ if (posix_same_owner(request, fl))
+ break;
}
/* Process locks with this owner. */
- while ((fl = *before) && posix_same_owner(request, fl)) {
- /* Detect adjacent or overlapping regions (if same lock type)
- */
+ list_for_each_entry_safe_from(fl, tmp, &ctx->flc_posix, fl_list) {
+ if (!posix_same_owner(request, fl))
+ break;
+
+ /* Detect adjacent or overlapping regions (if same lock type) */
if (request->fl_type == fl->fl_type) {
/* In all comparisons of start vs end, use
* "start - 1" rather than "end + 1". If end
* is OFFSET_MAX, end + 1 will become negative.
*/
if (fl->fl_end < request->fl_start - 1)
- goto next_lock;
+ continue;
/* If the next lock in the list has entirely bigger
* addresses than the new one, insert the lock here.
*/
@@ -1087,18 +1090,17 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
else
request->fl_end = fl->fl_end;
if (added) {
- locks_delete_lock(before, &dispose);
+ locks_delete_lock_ctx(fl, &dispose);
continue;
}
request = fl;
added = true;
- }
- else {
+ } else {
/* Processing for different lock types is a bit
* more complex.
*/
if (fl->fl_end < request->fl_start)
- goto next_lock;
+ continue;
if (fl->fl_start > request->fl_end)
break;
if (request->fl_type == F_UNLCK)
@@ -1117,7 +1119,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
* one (This may happen several times).
*/
if (added) {
- locks_delete_lock(before, &dispose);
+ locks_delete_lock_ctx(fl, &dispose);
continue;
}
/*
@@ -1133,15 +1135,11 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
locks_copy_lock(new_fl, request);
request = new_fl;
new_fl = NULL;
- locks_delete_lock(before, &dispose);
- locks_insert_lock(before, request);
+ locks_insert_lock_ctx(request, &fl->fl_list);
+ locks_delete_lock_ctx(fl, &dispose);
added = true;
}
}
- /* Go on to next lock.
- */
- next_lock:
- before = &fl->fl_next;
}
/*
@@ -1166,7 +1164,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
goto out;
}
locks_copy_lock(new_fl, request);
- locks_insert_lock(before, new_fl);
+ locks_insert_lock_ctx(new_fl, &fl->fl_list);
new_fl = NULL;
}
if (right) {
@@ -1177,7 +1175,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
left = new_fl2;
new_fl2 = NULL;
locks_copy_lock(left, right);
- locks_insert_lock(before, left);
+ locks_insert_lock_ctx(left, &fl->fl_list);
}
right->fl_start = request->fl_end + 1;
locks_wake_up_blocks(right);
@@ -1257,22 +1255,29 @@ EXPORT_SYMBOL(posix_lock_file_wait);
*/
int locks_mandatory_locked(struct file *file)
{
+ int ret;
struct inode *inode = file_inode(file);
+ struct file_lock_context *ctx;
struct file_lock *fl;
+ ctx = inode->i_flctx;
+ if (!ctx || list_empty_careful(&ctx->flc_posix))
+ return 0;
+
/*
* Search the lock list for this inode for any POSIX locks.
*/
spin_lock(&inode->i_lock);
- for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
- if (!IS_POSIX(fl))
- continue;
+ ret = 0;
+ list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
if (fl->fl_owner != current->files &&
- fl->fl_owner != file)
+ fl->fl_owner != file) {
+ ret = -EAGAIN;
break;
+ }
}
spin_unlock(&inode->i_lock);
- return fl ? -EAGAIN : 0;
+ return ret;
}
/**
@@ -2389,13 +2394,14 @@ out:
void locks_remove_posix(struct file *filp, fl_owner_t owner)
{
struct file_lock lock;
+ struct file_lock_context *ctx = file_inode(filp)->i_flctx;
/*
* If there are no locks held on this file, we don't need to call
* posix_lock_file(). Another process could be setting a lock on this
* file at the same time, but we wouldn't remove that lock anyway.
*/
- if (!file_inode(filp)->i_flock)
+ if (!ctx || list_empty(&ctx->flc_posix))
return;
lock.fl_type = F_UNLCK;
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 9f9f67b17e2b..3fb1caa3874d 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -85,17 +85,17 @@ static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_
{
struct inode *inode = state->inode;
struct file_lock *fl;
- struct file_lock_context *flctx;
+ struct file_lock_context *flctx = inode->i_flctx;
+ struct list_head *list;
int status = 0;
- if (inode->i_flock == NULL && inode->i_flctx == NULL)
+ if (flctx == NULL)
goto out;
- /* Protect inode->i_flock using the i_lock */
+ list = &flctx->flc_posix;
spin_lock(&inode->i_lock);
- for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
- if (!(fl->fl_flags & (FL_POSIX)))
- continue;
+restart:
+ list_for_each_entry(fl, list, fl_list) {
if (nfs_file_open_context(fl->fl_file) != ctx)
continue;
spin_unlock(&inode->i_lock);
@@ -104,19 +104,9 @@ static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_
goto out;
spin_lock(&inode->i_lock);
}
-
- flctx = inode->i_flctx;
- if (flctx) {
- list_for_each_entry(fl, &flctx->flc_flock, fl_list) {
- if (nfs_file_open_context(fl->fl_file) != ctx)
- continue;
- spin_unlock(&inode->i_lock);
- status = nfs4_lock_delegation_recall(fl, state,
- stateid);
- if (status < 0)
- goto out;
- spin_lock(&inode->i_lock);
- }
+ if (list == &flctx->flc_posix) {
+ list = &flctx->flc_flock;
+ goto restart;
}
spin_unlock(&inode->i_lock);
out:
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 65c404bf61ae..6084c267f3a0 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1367,53 +1367,18 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
struct file_lock *fl;
int status = 0;
struct file_lock_context *flctx = inode->i_flctx;
+ struct list_head *list;
- if (inode->i_flock == NULL && flctx == NULL)
+ if (flctx == NULL)
return 0;
+ list = &flctx->flc_posix;
+
/* Guard against delegation returns and new lock/unlock calls */
down_write(&nfsi->rwsem);
- /* Protect inode->i_flock using the BKL */
spin_lock(&inode->i_lock);
- for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
- if (!(fl->fl_flags & FL_POSIX))
- continue;
- if (nfs_file_open_context(fl->fl_file)->state != state)
- continue;
- spin_unlock(&inode->i_lock);
- status = ops->recover_lock(state, fl);
- switch (status) {
- case 0:
- break;
- case -ESTALE:
- case -NFS4ERR_ADMIN_REVOKED:
- case -NFS4ERR_STALE_STATEID:
- case -NFS4ERR_BAD_STATEID:
- case -NFS4ERR_EXPIRED:
- case -NFS4ERR_NO_GRACE:
- case -NFS4ERR_STALE_CLIENTID:
- case -NFS4ERR_BADSESSION:
- case -NFS4ERR_BADSLOT:
- case -NFS4ERR_BAD_HIGH_SLOT:
- case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
- goto out;
- default:
- printk(KERN_ERR "NFS: %s: unhandled error %d\n",
- __func__, status);
- case -ENOMEM:
- case -NFS4ERR_DENIED:
- case -NFS4ERR_RECLAIM_BAD:
- case -NFS4ERR_RECLAIM_CONFLICT:
- /* kill_proc(fl->fl_pid, SIGLOST, 1); */
- status = 0;
- }
- spin_lock(&inode->i_lock);
- }
-
- if (!flctx)
- goto out_unlock;
-
- list_for_each_entry(fl, &flctx->flc_flock, fl_list) {
+restart:
+ list_for_each_entry(fl, list, fl_list) {
if (nfs_file_open_context(fl->fl_file)->state != state)
continue;
spin_unlock(&inode->i_lock);
@@ -1445,7 +1410,10 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
}
spin_lock(&inode->i_lock);
}
-out_unlock:
+ if (list == &flctx->flc_posix) {
+ list = &flctx->flc_flock;
+ goto restart;
+ }
spin_unlock(&inode->i_lock);
out:
up_write(&nfsi->rwsem);
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index a3b62e15b444..29c7f33c9cf1 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -831,12 +831,10 @@ static bool nfs_can_coalesce_requests(struct nfs_page *prev,
if (prev) {
if (!nfs_match_open_context(req->wb_context, prev->wb_context))
return false;
- if (req->wb_context->dentry->d_inode->i_flock != NULL &&
- !nfs_match_lock_context(req->wb_lock_context,
- prev->wb_lock_context))
- return false;
flctx = req->wb_context->dentry->d_inode->i_flctx;
- if (flctx != NULL && !list_empty_careful(&flctx->flc_flock) &&
+ if (flctx != NULL &&
+ !(list_empty_careful(&flctx->flc_posix) &&
+ list_empty_careful(&flctx->flc_flock)) &&
!nfs_match_lock_context(req->wb_lock_context,
prev->wb_lock_context))
return false;
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index e072aeb34195..784c13485b3f 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1091,6 +1091,7 @@ int nfs_flush_incompatible(struct file *file, struct page *page)
{
struct nfs_open_context *ctx = nfs_file_open_context(file);
struct nfs_lock_context *l_ctx;
+ struct file_lock_context *flctx = file_inode(file)->i_flctx;
struct nfs_page *req;
int do_flush, status;
/*
@@ -1109,12 +1110,9 @@ int nfs_flush_incompatible(struct file *file, struct page *page)
do_flush = req->wb_page != page || req->wb_context != ctx;
/* for now, flush if more than 1 request in page_group */
do_flush |= req->wb_this_page != req;
- if (l_ctx && ctx->dentry->d_inode->i_flock != NULL) {
- do_flush |= l_ctx->lockowner.l_owner != current->files
- || l_ctx->lockowner.l_pid != current->tgid;
- }
- if (l_ctx && ctx->dentry->d_inode->i_flctx &&
- !list_empty_careful(&ctx->dentry->d_inode->i_flctx->flc_flock)) {
+ if (l_ctx && flctx &&
+ !(list_empty_careful(&flctx->flc_posix) &&
+ list_empty_careful(&flctx->flc_flock))) {
do_flush |= l_ctx->lockowner.l_owner != current->files
|| l_ctx->lockowner.l_pid != current->tgid;
}
@@ -1202,26 +1200,24 @@ static int nfs_can_extend_write(struct file *file, struct page *page, struct ino
return 0;
if (NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
return 1;
- if (!inode->i_flock && !flctx)
+ if (!flctx || (list_empty_careful(&flctx->flc_flock) &&
+ list_empty_careful(&flctx->flc_posix)))
return 0;
/* Check to see if there are whole file write locks */
- spin_lock(&inode->i_lock);
ret = 0;
-
- fl = inode->i_flock;
- if (fl && is_whole_file_wrlock(fl)) {
- ret = 1;
- goto out;
- }
-
- if (!list_empty(&flctx->flc_flock)) {
+ spin_lock(&inode->i_lock);
+ if (!list_empty(&flctx->flc_posix)) {
+ fl = list_first_entry(&flctx->flc_posix, struct file_lock,
+ fl_list);
+ if (is_whole_file_wrlock(fl))
+ ret = 1;
+ } else if (!list_empty(&flctx->flc_flock)) {
fl = list_first_entry(&flctx->flc_flock, struct file_lock,
fl_list);
if (fl->fl_type == F_WRLCK)
ret = 1;
}
-out:
spin_unlock(&inode->i_lock);
return ret;
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c06a1ba80d73..fad821991369 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5556,10 +5556,11 @@ out_nfserr:
static bool
check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
{
- struct file_lock **flpp;
+ struct file_lock *fl;
int status = false;
struct file *filp = find_any_file(fp);
struct inode *inode;
+ struct file_lock_context *flctx;
if (!filp) {
/* Any valid lock stateid should have some sort of access */
@@ -5568,15 +5569,18 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
}
inode = file_inode(filp);
+ flctx = inode->i_flctx;
- spin_lock(&inode->i_lock);
- for (flpp = &inode->i_flock; *flpp != NULL; flpp = &(*flpp)->fl_next) {
- if ((*flpp)->fl_owner == (fl_owner_t)lowner) {
- status = true;
- break;
+ if (flctx && !list_empty_careful(&flctx->flc_posix)) {
+ spin_lock(&inode->i_lock);
+ list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
+ if (fl->fl_owner == (fl_owner_t)lowner) {
+ status = true;
+ break;
+ }
}
+ spin_unlock(&inode->i_lock);
}
- spin_unlock(&inode->i_lock);
fput(filp);
return status;
}
diff --git a/fs/read_write.c b/fs/read_write.c
index c0805c93b6fa..4060691e78f7 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -358,7 +358,7 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t
return retval;
}
- if (unlikely(inode->i_flock && mandatory_lock(inode))) {
+ if (unlikely(inode->i_flctx && mandatory_lock(inode))) {
retval = locks_mandatory_area(
read_write == READ ? FLOCK_VERIFY_READ : FLOCK_VERIFY_WRITE,
inode, file, pos, count);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index dec0d38b05de..571f113588e9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -968,6 +968,7 @@ struct file_lock {
struct file_lock_context {
struct list_head flc_flock;
+ struct list_head flc_posix;
};
/* The following constant reflects the upper bound of the file/locking space */
@@ -1971,7 +1972,7 @@ static inline int locks_verify_truncate(struct inode *inode,
struct file *filp,
loff_t size)
{
- if (inode->i_flock && mandatory_lock(inode))
+ if (inode->i_flctx && mandatory_lock(inode))
return locks_mandatory_area(
FLOCK_VERIFY_WRITE, inode, filp,
size < inode->i_size ? size : inode->i_size,
--
2.1.0
From: Jeff Layton <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/locks.c | 252 +++++++++++++++++++++--------------------------------
include/linux/fs.h | 5 +-
2 files changed, 102 insertions(+), 155 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index e50bb4d9e757..d46e70567b99 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -216,6 +216,7 @@ locks_get_lock_context(struct inode *inode)
INIT_LIST_HEAD(&new->flc_flock);
INIT_LIST_HEAD(&new->flc_posix);
+ INIT_LIST_HEAD(&new->flc_lease);
/*
* Assign the pointer if it's not already assigned. If it is, then
@@ -240,6 +241,7 @@ locks_free_lock_context(struct file_lock_context *ctx)
if (ctx) {
WARN_ON_ONCE(!list_empty(&ctx->flc_flock));
WARN_ON_ONCE(!list_empty(&ctx->flc_posix));
+ WARN_ON_ONCE(!list_empty(&ctx->flc_lease));
kmem_cache_free(flctx_cache, ctx);
}
}
@@ -677,22 +679,6 @@ static void locks_wake_up_blocks(struct file_lock *blocker)
spin_unlock(&blocked_lock_lock);
}
-/* Insert file lock fl into an inode's lock list at the position indicated
- * by pos. At the same time add the lock to the global file lock list.
- *
- * Must be called with the i_lock held!
- */
-static void locks_insert_lock(struct file_lock **pos, struct file_lock *fl)
-{
- fl->fl_nspid = get_pid(task_tgid(current));
-
- /* insert into file's list */
- fl->fl_next = *pos;
- *pos = fl;
-
- locks_insert_global_locks(fl);
-}
-
static void
locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before)
{
@@ -701,63 +687,28 @@ locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before)
locks_insert_global_locks(fl);
}
-/**
- * locks_delete_lock - Delete a lock and then free it.
- * @thisfl_p: pointer that points to the fl_next field of the previous
- * inode->i_flock list entry
- *
- * Unlink a lock from all lists and free the namespace reference, but don't
- * free it yet. Wake up processes that are blocked waiting for this lock and
- * notify the FS that the lock has been cleared.
- *
- * Must be called with the i_lock held!
- */
-static void locks_unlink_lock(struct file_lock **thisfl_p)
+static void
+locks_unlink_lock_ctx(struct file_lock *fl)
{
- struct file_lock *fl = *thisfl_p;
-
locks_delete_global_locks(fl);
-
- *thisfl_p = fl->fl_next;
- fl->fl_next = NULL;
-
+ list_del_init(&fl->fl_list);
if (fl->fl_nspid) {
put_pid(fl->fl_nspid);
fl->fl_nspid = NULL;
}
-
locks_wake_up_blocks(fl);
}
-/*
- * Unlink a lock from all lists and free it.
- *
- * Must be called with i_lock held!
- */
-static void locks_delete_lock(struct file_lock **thisfl_p,
- struct list_head *dispose)
+static void
+locks_delete_lock_ctx(struct file_lock *fl, struct list_head *dispose)
{
- struct file_lock *fl = *thisfl_p;
-
- locks_unlink_lock(thisfl_p);
+ locks_unlink_lock_ctx(fl);
if (dispose)
list_add(&fl->fl_list, dispose);
else
locks_free_lock(fl);
}
-static void
-locks_delete_lock_ctx(struct file_lock *fl, struct list_head *dispose)
-{
- locks_delete_global_locks(fl);
- if (fl->fl_nspid) {
- put_pid(fl->fl_nspid);
- fl->fl_nspid = NULL;
- }
- locks_wake_up_blocks(fl);
- list_move(&fl->fl_list, dispose);
-}
-
/* Determine if lock sys_fl blocks lock caller_fl. Common functionality
* checks for shared/exclusive status of overlapping locks.
*/
@@ -1376,7 +1327,7 @@ int lease_modify(struct file_lock **before, int arg, struct list_head *dispose)
printk(KERN_ERR "locks_delete_lock: fasync == %p\n", fl->fl_fasync);
fl->fl_fasync = NULL;
}
- locks_delete_lock(before, dispose);
+ locks_delete_lock_ctx(fl, dispose);
}
return 0;
}
@@ -1392,20 +1343,17 @@ static bool past_time(unsigned long then)
static void time_out_leases(struct inode *inode, struct list_head *dispose)
{
- struct file_lock **before;
- struct file_lock *fl;
+ struct file_lock_context *ctx = inode->i_flctx;
+ struct file_lock *fl, *tmp;
lockdep_assert_held(&inode->i_lock);
- before = &inode->i_flock;
- while ((fl = *before) && IS_LEASE(fl) && lease_breaking(fl)) {
+ list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list) {
trace_time_out_leases(inode, fl);
if (past_time(fl->fl_downgrade_time))
- lease_modify(before, F_RDLCK, dispose);
+ lease_modify(&fl, F_RDLCK, dispose);
if (past_time(fl->fl_break_time))
- lease_modify(before, F_UNLCK, dispose);
- if (fl == *before) /* lease_modify may have freed fl */
- before = &fl->fl_next;
+ lease_modify(&fl, F_UNLCK, dispose);
}
}
@@ -1419,11 +1367,12 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
static bool
any_leases_conflict(struct inode *inode, struct file_lock *breaker)
{
+ struct file_lock_context *ctx = inode->i_flctx;
struct file_lock *fl;
lockdep_assert_held(&inode->i_lock);
- for (fl = inode->i_flock ; fl && IS_LEASE(fl); fl = fl->fl_next) {
+ list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
if (leases_conflict(fl, breaker))
return true;
}
@@ -1447,7 +1396,8 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
{
int error = 0;
struct file_lock *new_fl;
- struct file_lock *fl, **before;
+ struct file_lock_context *ctx = inode->i_flctx;
+ struct file_lock *fl;
unsigned long break_time;
int want_write = (mode & O_ACCMODE) != O_RDONLY;
LIST_HEAD(dispose);
@@ -1457,6 +1407,12 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
return PTR_ERR(new_fl);
new_fl->fl_flags = type;
+ /* typically we will check that ctx is non-NULL before calling */
+ if (!ctx) {
+ WARN_ON_ONCE(1);
+ return error;
+ }
+
spin_lock(&inode->i_lock);
time_out_leases(inode, &dispose);
@@ -1471,9 +1427,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
break_time++; /* so that 0 means no break time */
}
- for (before = &inode->i_flock;
- ((fl = *before) != NULL) && IS_LEASE(fl);
- before = &fl->fl_next) {
+ list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
if (!leases_conflict(fl, new_fl))
continue;
if (want_write) {
@@ -1482,17 +1436,16 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
fl->fl_flags |= FL_UNLOCK_PENDING;
fl->fl_break_time = break_time;
} else {
- if (lease_breaking(inode->i_flock))
+ if (lease_breaking(fl))
continue;
fl->fl_flags |= FL_DOWNGRADE_PENDING;
fl->fl_downgrade_time = break_time;
}
if (fl->fl_lmops->lm_break(fl))
- locks_delete_lock(before, &dispose);
+ locks_delete_lock_ctx(fl, &dispose);
}
- fl = inode->i_flock;
- if (!fl || !IS_LEASE(fl))
+ if (list_empty(&ctx->flc_lease))
goto out;
if (mode & O_NONBLOCK) {
@@ -1502,12 +1455,13 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
}
restart:
- break_time = inode->i_flock->fl_break_time;
+ fl = list_first_entry(&ctx->flc_lease, struct file_lock, fl_list);
+ break_time = fl->fl_break_time;
if (break_time != 0)
break_time -= jiffies;
if (break_time == 0)
break_time++;
- locks_insert_block(inode->i_flock, new_fl);
+ locks_insert_block(fl, new_fl);
trace_break_lease_block(inode, new_fl);
spin_unlock(&inode->i_lock);
locks_dispose_list(&dispose);
@@ -1525,10 +1479,8 @@ restart:
time_out_leases(inode, &dispose);
if (any_leases_conflict(inode, new_fl))
goto restart;
-
error = 0;
}
-
out:
spin_unlock(&inode->i_lock);
locks_dispose_list(&dispose);
@@ -1550,13 +1502,17 @@ EXPORT_SYMBOL(__break_lease);
void lease_get_mtime(struct inode *inode, struct timespec *time)
{
bool has_lease = false;
- struct file_lock *flock;
+ struct file_lock_context *ctx = inode->i_flctx;
+ struct file_lock *fl;
- if (inode->i_flock) {
+ if (ctx && !list_empty_careful(&ctx->flc_lease)) {
spin_lock(&inode->i_lock);
- flock = inode->i_flock;
- if (flock && IS_LEASE(flock) && (flock->fl_type == F_WRLCK))
- has_lease = true;
+ if (!list_empty(&ctx->flc_lease)) {
+ fl = list_first_entry(&ctx->flc_lease,
+ struct file_lock, fl_list);
+ if (fl->fl_type == F_WRLCK)
+ has_lease = true;
+ }
spin_unlock(&inode->i_lock);
}
@@ -1595,20 +1551,22 @@ int fcntl_getlease(struct file *filp)
{
struct file_lock *fl;
struct inode *inode = file_inode(filp);
+ struct file_lock_context *ctx = inode->i_flctx;
int type = F_UNLCK;
LIST_HEAD(dispose);
- spin_lock(&inode->i_lock);
- time_out_leases(file_inode(filp), &dispose);
- for (fl = file_inode(filp)->i_flock; fl && IS_LEASE(fl);
- fl = fl->fl_next) {
- if (fl->fl_file == filp) {
+ if (ctx && !list_empty_careful(&ctx->flc_lease)) {
+ spin_lock(&inode->i_lock);
+ time_out_leases(file_inode(filp), &dispose);
+ list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
+ if (fl->fl_file != filp)
+ continue;
type = target_leasetype(fl);
break;
}
+ spin_unlock(&inode->i_lock);
+ locks_dispose_list(&dispose);
}
- spin_unlock(&inode->i_lock);
- locks_dispose_list(&dispose);
return type;
}
@@ -1641,9 +1599,10 @@ check_conflicting_open(const struct dentry *dentry, const long arg)
static int
generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **priv)
{
- struct file_lock *fl, **before, **my_before = NULL, *lease;
+ struct file_lock *fl, *my_fl = NULL, *lease;
struct dentry *dentry = filp->f_path.dentry;
struct inode *inode = dentry->d_inode;
+ struct file_lock_context *ctx;
bool is_deleg = (*flp)->fl_flags & FL_DELEG;
int error;
LIST_HEAD(dispose);
@@ -1651,6 +1610,10 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
lease = *flp;
trace_generic_add_lease(inode, lease);
+ ctx = locks_get_lock_context(inode);
+ if (!ctx)
+ return -ENOMEM;
+
/*
* In the delegation case we need mutual exclusion with
* a number of operations that take the i_mutex. We trylock
@@ -1684,13 +1647,12 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
* except for this filp.
*/
error = -EAGAIN;
- for (before = &inode->i_flock;
- ((fl = *before) != NULL) && IS_LEASE(fl);
- before = &fl->fl_next) {
+ list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
if (fl->fl_file == filp) {
- my_before = before;
+ my_fl = fl;
continue;
}
+
/*
* No exclusive leases if someone else has a lease on
* this file:
@@ -1705,9 +1667,8 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
goto out;
}
- if (my_before != NULL) {
- lease = *my_before;
- error = lease->fl_lmops->lm_change(my_before, arg, &dispose);
+ if (my_fl != NULL) {
+ error = lease->fl_lmops->lm_change(&my_fl, arg, &dispose);
if (error)
goto out;
goto out_setup;
@@ -1717,7 +1678,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
if (!leases_enable)
goto out;
- locks_insert_lock(before, lease);
+ locks_insert_lock_ctx(lease, &ctx->flc_lease);
/*
* The check in break_lease() is lockless. It's possible for another
* open to race in after we did the earlier check for a conflicting
@@ -1729,8 +1690,10 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
*/
smp_mb();
error = check_conflicting_open(dentry, arg);
- if (error)
- goto out_unlink;
+ if (error) {
+ locks_unlink_lock_ctx(lease);
+ goto out;
+ }
out_setup:
if (lease->fl_lmops->lm_setup)
@@ -1740,33 +1703,35 @@ out:
locks_dispose_list(&dispose);
if (is_deleg)
mutex_unlock(&inode->i_mutex);
- if (!error && !my_before)
+ if (!error && !my_fl)
*flp = NULL;
return error;
-out_unlink:
- locks_unlink_lock(before);
- goto out;
}
static int generic_delete_lease(struct file *filp)
{
int error = -EAGAIN;
- struct file_lock *fl, **before;
+ struct file_lock *fl, *victim = NULL;
struct dentry *dentry = filp->f_path.dentry;
struct inode *inode = dentry->d_inode;
+ struct file_lock_context *ctx = inode->i_flctx;
LIST_HEAD(dispose);
+ if (!ctx) {
+ trace_generic_delete_lease(inode, NULL);
+ return error;
+ }
+
spin_lock(&inode->i_lock);
- time_out_leases(inode, &dispose);
- for (before = &inode->i_flock;
- ((fl = *before) != NULL) && IS_LEASE(fl);
- before = &fl->fl_next) {
- if (fl->fl_file == filp)
+ list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
+ if (fl->fl_file == filp) {
+ victim = fl;
break;
+ }
}
trace_generic_delete_lease(inode, fl);
- if (fl && IS_LEASE(fl))
- error = fl->fl_lmops->lm_change(before, F_UNLCK, &dispose);
+ if (victim)
+ error = fl->fl_lmops->lm_change(&victim, F_UNLCK, &dispose);
spin_unlock(&inode->i_lock);
locks_dispose_list(&dispose);
return error;
@@ -2447,56 +2412,37 @@ locks_remove_flock(struct file *filp)
fl.fl_ops->fl_release_private(&fl);
}
+static void
+locks_remove_lease(struct file *filp)
+{
+ struct inode *inode = file_inode(filp);
+ struct file_lock_context *ctx = inode->i_flctx;
+ struct file_lock *fl, *tmp;
+ LIST_HEAD(dispose);
+
+ if (!ctx || list_empty(&ctx->flc_lease))
+ return;
+
+ spin_lock(&inode->i_lock);
+ list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list)
+ lease_modify(&fl, F_UNLCK, &dispose);
+ spin_unlock(&inode->i_lock);
+ locks_dispose_list(&dispose);
+}
+
/*
* This function is called on the last close of an open file.
*/
void locks_remove_file(struct file *filp)
{
- struct inode * inode = file_inode(filp);
- struct file_lock *fl;
- struct file_lock **before;
- LIST_HEAD(dispose);
-
/* remove any OFD locks */
locks_remove_posix(filp, filp);
/* remove flock locks */
locks_remove_flock(filp);
- if (!inode->i_flock)
- return;
-
- spin_lock(&inode->i_lock);
- before = &inode->i_flock;
-
- while ((fl = *before) != NULL) {
- if (fl->fl_file == filp) {
- if (IS_LEASE(fl)) {
- lease_modify(before, F_UNLCK, &dispose);
- continue;
- }
-
- /*
- * There's a leftover lock on the list of a type that
- * we didn't expect to see. Most likely a classic
- * POSIX lock that ended up not getting released
- * properly, or that raced onto the list somehow. Log
- * some info about it and then just remove it from
- * the list.
- */
- WARN(1, "leftover lock: dev=%u:%u ino=%lu type=%hhd flags=0x%x start=%lld end=%lld\n",
- MAJOR(inode->i_sb->s_dev),
- MINOR(inode->i_sb->s_dev), inode->i_ino,
- fl->fl_type, fl->fl_flags,
- fl->fl_start, fl->fl_end);
-
- locks_delete_lock(before, &dispose);
- continue;
- }
- before = &fl->fl_next;
- }
- spin_unlock(&inode->i_lock);
- locks_dispose_list(&dispose);
+ /* remove any leases */
+ locks_remove_lease(filp);
}
/**
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 571f113588e9..2ddec3cf81b9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -969,6 +969,7 @@ struct file_lock {
struct file_lock_context {
struct list_head flc_flock;
struct list_head flc_posix;
+ struct list_head flc_lease;
};
/* The following constant reflects the upper bound of the file/locking space */
@@ -1990,7 +1991,7 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
* end up racing with tasks trying to set a new lease on this file.
*/
smp_mb();
- if (inode->i_flock)
+ if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
return __break_lease(inode, mode, FL_LEASE);
return 0;
}
@@ -2003,7 +2004,7 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
* end up racing with tasks trying to set a new lease on this file.
*/
smp_mb();
- if (inode->i_flock)
+ if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
return __break_lease(inode, mode, FL_DELEG);
return 0;
}
--
2.1.0
From: Jeff Layton <[email protected]>
Nothing uses it anymore. Also add a forward declaration for struct
file_lock to silence some compiler warnings that the removal triggers.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
include/linux/fs.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2ddec3cf81b9..ce0873af0b97 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -625,7 +625,6 @@ struct inode {
atomic_t i_readcount; /* struct files open RO */
#endif
const struct file_operations *i_fop; /* former ->i_op->default_file_ops */
- struct file_lock *i_flock;
struct file_lock_context *i_flctx;
struct address_space i_data;
struct list_head i_devices;
@@ -886,6 +885,8 @@ static inline struct file *get_file(struct file *f)
/* legacy typedef, should eventually be removed */
typedef void *fl_owner_t;
+struct file_lock;
+
struct file_lock_operations {
void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
void (*fl_release_private)(struct file_lock *);
--
2.1.0
From: Jeff Layton <[email protected]>
We can now add a dedicated spinlock without expanding struct inode.
Change to using that to protect the various i_flctx lists.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/ceph/locks.c | 8 ++---
fs/cifs/file.c | 8 ++---
fs/lockd/svcsubs.c | 12 ++++----
fs/locks.c | 87 +++++++++++++++++++++++++++--------------------------
fs/nfs/delegation.c | 8 ++---
fs/nfs/nfs4state.c | 8 ++---
fs/nfs/write.c | 4 +--
fs/nfsd/nfs4state.c | 4 +--
include/linux/fs.h | 1 +
9 files changed, 71 insertions(+), 69 deletions(-)
diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index 19beeed83233..0303da8e3233 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -255,12 +255,12 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
ctx = inode->i_flctx;
if (ctx) {
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
list_for_each_entry(lock, &ctx->flc_posix, fl_list)
++(*fcntl_count);
list_for_each_entry(lock, &ctx->flc_flock, fl_list)
++(*flock_count);
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
}
dout("counted %d flock locks and %d fcntl locks",
*flock_count, *fcntl_count);
@@ -288,7 +288,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
if (!ctx)
return 0;
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
list_for_each_entry(lock, &ctx->flc_flock, fl_list) {
++seen_fcntl;
if (seen_fcntl > num_fcntl_locks) {
@@ -312,7 +312,7 @@ int ceph_encode_locks_to_buffer(struct inode *inode,
++l;
}
fail:
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
return err;
}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index ea78f6f81ce2..b65166eb111e 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1136,11 +1136,11 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
if (!flctx)
goto out;
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
list_for_each(el, &flctx->flc_posix) {
count++;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
INIT_LIST_HEAD(&locks_to_send);
@@ -1159,7 +1159,7 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
}
el = locks_to_send.next;
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
list_for_each_entry(flock, &flctx->flc_posix, fl_list) {
if (el == &locks_to_send) {
/*
@@ -1181,7 +1181,7 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
lck->type = type;
lck->offset = flock->fl_start;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
list_for_each_entry_safe(lck, tmp, &locks_to_send, llist) {
int stored_rc;
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index 5300bb53835f..665ef5a05183 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -171,7 +171,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
return 0;
again:
file->f_locks = 0;
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
if (fl->fl_lmops != &nlmsvc_lock_operations)
continue;
@@ -183,7 +183,7 @@ again:
if (match(lockhost, host)) {
struct file_lock lock = *fl;
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
lock.fl_type = F_UNLCK;
lock.fl_start = 0;
lock.fl_end = OFFSET_MAX;
@@ -195,7 +195,7 @@ again:
goto again;
}
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
return 0;
}
@@ -232,14 +232,14 @@ nlm_file_inuse(struct nlm_file *file)
return 1;
if (flctx && !list_empty_careful(&flctx->flc_posix)) {
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
if (fl->fl_lmops == &nlmsvc_lock_operations) {
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
return 1;
}
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
}
file->f_locks = 0;
return 0;
diff --git a/fs/locks.c b/fs/locks.c
index d46e70567b99..a268d959ccd6 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -161,7 +161,7 @@ int lease_break_time = 45;
* The global file_lock_list is only used for displaying /proc/locks, so we
* keep a list on each CPU, with each list protected by its own spinlock via
* the file_lock_lglock. Note that alterations to the list also require that
- * the relevant i_lock is held.
+ * the relevant flc_lock is held.
*/
DEFINE_STATIC_LGLOCK(file_lock_lglock);
static DEFINE_PER_CPU(struct hlist_head, file_lock_list);
@@ -189,13 +189,13 @@ static DEFINE_HASHTABLE(blocked_hash, BLOCKED_HASH_BITS);
* contrast to those that are acting as records of acquired locks).
*
* Note that when we acquire this lock in order to change the above fields,
- * we often hold the i_lock as well. In certain cases, when reading the fields
+ * we often hold the flc_lock as well. In certain cases, when reading the fields
* protected by this lock, we can skip acquiring it iff we already hold the
- * i_lock.
+ * flc_lock.
*
* In particular, adding an entry to the fl_block list requires that you hold
- * both the i_lock and the blocked_lock_lock (acquired in that order). Deleting
- * an entry from the list however only requires the file_lock_lock.
+ * both the flc_lock and the blocked_lock_lock (acquired in that order).
+ * Deleting an entry from the list however only requires the file_lock_lock.
*/
static DEFINE_SPINLOCK(blocked_lock_lock);
@@ -214,6 +214,7 @@ locks_get_lock_context(struct inode *inode)
if (!new)
goto out;
+ spin_lock_init(&new->flc_lock);
INIT_LIST_HEAD(&new->flc_flock);
INIT_LIST_HEAD(&new->flc_posix);
INIT_LIST_HEAD(&new->flc_lease);
@@ -557,7 +558,7 @@ static int posix_same_owner(struct file_lock *fl1, struct file_lock *fl2)
return fl1->fl_owner == fl2->fl_owner;
}
-/* Must be called with the i_lock held! */
+/* Must be called with the flc_lock held! */
static void locks_insert_global_locks(struct file_lock *fl)
{
lg_local_lock(&file_lock_lglock);
@@ -566,12 +567,12 @@ static void locks_insert_global_locks(struct file_lock *fl)
lg_local_unlock(&file_lock_lglock);
}
-/* Must be called with the i_lock held! */
+/* Must be called with the flc_lock held! */
static void locks_delete_global_locks(struct file_lock *fl)
{
/*
* Avoid taking lock if already unhashed. This is safe since this check
- * is done while holding the i_lock, and new insertions into the list
+ * is done while holding the flc_lock, and new insertions into the list
* also require that it be held.
*/
if (hlist_unhashed(&fl->fl_link))
@@ -623,10 +624,10 @@ static void locks_delete_block(struct file_lock *waiter)
* the order they blocked. The documentation doesn't require this but
* it seems like the reasonable thing to do.
*
- * Must be called with both the i_lock and blocked_lock_lock held. The fl_block
- * list itself is protected by the blocked_lock_lock, but by ensuring that the
- * i_lock is also held on insertions we can avoid taking the blocked_lock_lock
- * in some cases when we see that the fl_block list is empty.
+ * Must be called with both the flc_lock and blocked_lock_lock held. The
+ * fl_block list itself is protected by the blocked_lock_lock, but by ensuring
+ * that the flc_lock is also held on insertions we can avoid taking the
+ * blocked_lock_lock in some cases when we see that the fl_block list is empty.
*/
static void __locks_insert_block(struct file_lock *blocker,
struct file_lock *waiter)
@@ -638,7 +639,7 @@ static void __locks_insert_block(struct file_lock *blocker,
locks_insert_global_blocked(waiter);
}
-/* Must be called with i_lock held. */
+/* Must be called with flc_lock held. */
static void locks_insert_block(struct file_lock *blocker,
struct file_lock *waiter)
{
@@ -650,15 +651,15 @@ static void locks_insert_block(struct file_lock *blocker,
/*
* Wake up processes blocked waiting for blocker.
*
- * Must be called with the inode->i_lock held!
+ * Must be called with the inode->flc_lock held!
*/
static void locks_wake_up_blocks(struct file_lock *blocker)
{
/*
* Avoid taking global lock if list is empty. This is safe since new
- * blocked requests are only added to the list under the i_lock, and
- * the i_lock is always held here. Note that removal from the fl_block
- * list does not require the i_lock, so we must recheck list_empty()
+ * blocked requests are only added to the list under the flc_lock, and
+ * the flc_lock is always held here. Note that removal from the fl_block
+ * list does not require the flc_lock, so we must recheck list_empty()
* after acquiring the blocked_lock_lock.
*/
if (list_empty(&blocker->fl_block))
@@ -768,7 +769,7 @@ posix_test_lock(struct file *filp, struct file_lock *fl)
return;
}
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
list_for_each_entry(cfl, &ctx->flc_posix, fl_list) {
if (posix_locks_conflict(fl, cfl)) {
locks_copy_conflock(fl, cfl);
@@ -779,7 +780,7 @@ posix_test_lock(struct file *filp, struct file_lock *fl)
}
fl->fl_type = F_UNLCK;
out:
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
return;
}
EXPORT_SYMBOL(posix_test_lock);
@@ -880,7 +881,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
return -ENOMEM;
}
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
if (request->fl_flags & FL_ACCESS)
goto find_conflict;
@@ -905,9 +906,9 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
* give it the opportunity to lock the file.
*/
if (found) {
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
cond_resched();
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
}
find_conflict:
@@ -929,7 +930,7 @@ find_conflict:
error = 0;
out:
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
if (new_fl)
locks_free_lock(new_fl);
locks_dispose_list(&dispose);
@@ -965,7 +966,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
new_fl2 = locks_alloc_lock();
}
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
/*
* New lock request. Walk all POSIX locks and look for conflicts. If
* there are any, either return error or put the request on the
@@ -1136,7 +1137,7 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
locks_wake_up_blocks(left);
}
out:
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
/*
* Free any unused locks.
*/
@@ -1218,7 +1219,7 @@ int locks_mandatory_locked(struct file *file)
/*
* Search the lock list for this inode for any POSIX locks.
*/
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
ret = 0;
list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
if (fl->fl_owner != current->files &&
@@ -1227,7 +1228,7 @@ int locks_mandatory_locked(struct file *file)
break;
}
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
return ret;
}
@@ -1346,7 +1347,7 @@ static void time_out_leases(struct inode *inode, struct list_head *dispose)
struct file_lock_context *ctx = inode->i_flctx;
struct file_lock *fl, *tmp;
- lockdep_assert_held(&inode->i_lock);
+ lockdep_assert_held(&ctx->flc_lock);
list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list) {
trace_time_out_leases(inode, fl);
@@ -1370,7 +1371,7 @@ any_leases_conflict(struct inode *inode, struct file_lock *breaker)
struct file_lock_context *ctx = inode->i_flctx;
struct file_lock *fl;
- lockdep_assert_held(&inode->i_lock);
+ lockdep_assert_held(&ctx->flc_lock);
list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
if (leases_conflict(fl, breaker))
@@ -1413,7 +1414,7 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
return error;
}
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
time_out_leases(inode, &dispose);
@@ -1463,11 +1464,11 @@ restart:
break_time++;
locks_insert_block(fl, new_fl);
trace_break_lease_block(inode, new_fl);
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
error = wait_event_interruptible_timeout(new_fl->fl_wait,
!new_fl->fl_next, break_time);
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
trace_break_lease_unblock(inode, new_fl);
locks_delete_block(new_fl);
if (error >= 0) {
@@ -1482,7 +1483,7 @@ restart:
error = 0;
}
out:
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
locks_free_lock(new_fl);
return error;
@@ -1506,14 +1507,14 @@ void lease_get_mtime(struct inode *inode, struct timespec *time)
struct file_lock *fl;
if (ctx && !list_empty_careful(&ctx->flc_lease)) {
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
if (!list_empty(&ctx->flc_lease)) {
fl = list_first_entry(&ctx->flc_lease,
struct file_lock, fl_list);
if (fl->fl_type == F_WRLCK)
has_lease = true;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
}
if (has_lease)
@@ -1556,7 +1557,7 @@ int fcntl_getlease(struct file *filp)
LIST_HEAD(dispose);
if (ctx && !list_empty_careful(&ctx->flc_lease)) {
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
time_out_leases(file_inode(filp), &dispose);
list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
if (fl->fl_file != filp)
@@ -1564,7 +1565,7 @@ int fcntl_getlease(struct file *filp)
type = target_leasetype(fl);
break;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
}
return type;
@@ -1632,7 +1633,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
return -EINVAL;
}
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
time_out_leases(inode, &dispose);
error = check_conflicting_open(dentry, arg);
if (error)
@@ -1699,7 +1700,7 @@ out_setup:
if (lease->fl_lmops->lm_setup)
lease->fl_lmops->lm_setup(lease, priv);
out:
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
if (is_deleg)
mutex_unlock(&inode->i_mutex);
@@ -1722,7 +1723,7 @@ static int generic_delete_lease(struct file *filp)
return error;
}
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
list_for_each_entry(fl, &ctx->flc_lease, fl_list) {
if (fl->fl_file == filp) {
victim = fl;
@@ -1732,7 +1733,7 @@ static int generic_delete_lease(struct file *filp)
trace_generic_delete_lease(inode, fl);
if (victim)
error = fl->fl_lmops->lm_change(&victim, F_UNLCK, &dispose);
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
return error;
}
@@ -2423,10 +2424,10 @@ locks_remove_lease(struct file *filp)
if (!ctx || list_empty(&ctx->flc_lease))
return;
- spin_lock(&inode->i_lock);
+ spin_lock(&ctx->flc_lock);
list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list)
lease_modify(&fl, F_UNLCK, &dispose);
- spin_unlock(&inode->i_lock);
+ spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
}
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 3fb1caa3874d..8cdb2b28a104 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -93,22 +93,22 @@ static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_
goto out;
list = &flctx->flc_posix;
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
restart:
list_for_each_entry(fl, list, fl_list) {
if (nfs_file_open_context(fl->fl_file) != ctx)
continue;
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
status = nfs4_lock_delegation_recall(fl, state, stateid);
if (status < 0)
goto out;
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
}
if (list == &flctx->flc_posix) {
list = &flctx->flc_flock;
goto restart;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
out:
return status;
}
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 6084c267f3a0..a3bb22ab68c5 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1376,12 +1376,12 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
/* Guard against delegation returns and new lock/unlock calls */
down_write(&nfsi->rwsem);
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
restart:
list_for_each_entry(fl, list, fl_list) {
if (nfs_file_open_context(fl->fl_file)->state != state)
continue;
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
status = ops->recover_lock(state, fl);
switch (status) {
case 0:
@@ -1408,13 +1408,13 @@ restart:
/* kill_proc(fl->fl_pid, SIGLOST, 1); */
status = 0;
}
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
}
if (list == &flctx->flc_posix) {
list = &flctx->flc_flock;
goto restart;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
out:
up_write(&nfsi->rwsem);
return status;
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 784c13485b3f..4ae66f416eb9 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1206,7 +1206,7 @@ static int nfs_can_extend_write(struct file *file, struct page *page, struct ino
/* Check to see if there are whole file write locks */
ret = 0;
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
if (!list_empty(&flctx->flc_posix)) {
fl = list_first_entry(&flctx->flc_posix, struct file_lock,
fl_list);
@@ -1218,7 +1218,7 @@ static int nfs_can_extend_write(struct file *file, struct page *page, struct ino
if (fl->fl_type == F_WRLCK)
ret = 1;
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
return ret;
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index fad821991369..80242f5bd621 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -5572,14 +5572,14 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
flctx = inode->i_flctx;
if (flctx && !list_empty_careful(&flctx->flc_posix)) {
- spin_lock(&inode->i_lock);
+ spin_lock(&flctx->flc_lock);
list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
if (fl->fl_owner == (fl_owner_t)lowner) {
status = true;
break;
}
}
- spin_unlock(&inode->i_lock);
+ spin_unlock(&flctx->flc_lock);
}
fput(filp);
return status;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ce0873af0b97..32eafa9b5c9f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -968,6 +968,7 @@ struct file_lock {
};
struct file_lock_context {
+ spinlock_t flc_lock;
struct list_head flc_flock;
struct list_head flc_posix;
struct list_head flc_lease;
--
2.1.0
From: Jeff Layton <[email protected]>
Now that we use standard list_heads for tracking leases, we can have
lm_change take a pointer to the lease to be modified instead of a
double pointer.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/locks.c | 13 ++++++-------
fs/nfsd/nfs4state.c | 3 ++-
include/linux/fs.h | 6 +++---
3 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index a268d959ccd6..864f2460a0ad 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1309,9 +1309,8 @@ static void lease_clear_pending(struct file_lock *fl, int arg)
}
/* We already had a lease on this file; just change its type */
-int lease_modify(struct file_lock **before, int arg, struct list_head *dispose)
+int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
{
- struct file_lock *fl = *before;
int error = assign_type(fl, arg);
if (error)
@@ -1352,9 +1351,9 @@ static void time_out_leases(struct inode *inode, struct list_head *dispose)
list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list) {
trace_time_out_leases(inode, fl);
if (past_time(fl->fl_downgrade_time))
- lease_modify(&fl, F_RDLCK, dispose);
+ lease_modify(fl, F_RDLCK, dispose);
if (past_time(fl->fl_break_time))
- lease_modify(&fl, F_UNLCK, dispose);
+ lease_modify(fl, F_UNLCK, dispose);
}
}
@@ -1669,7 +1668,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
}
if (my_fl != NULL) {
- error = lease->fl_lmops->lm_change(&my_fl, arg, &dispose);
+ error = lease->fl_lmops->lm_change(my_fl, arg, &dispose);
if (error)
goto out;
goto out_setup;
@@ -1732,7 +1731,7 @@ static int generic_delete_lease(struct file *filp)
}
trace_generic_delete_lease(inode, fl);
if (victim)
- error = fl->fl_lmops->lm_change(&victim, F_UNLCK, &dispose);
+ error = fl->fl_lmops->lm_change(victim, F_UNLCK, &dispose);
spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
return error;
@@ -2426,7 +2425,7 @@ locks_remove_lease(struct file *filp)
spin_lock(&ctx->flc_lock);
list_for_each_entry_safe(fl, tmp, &ctx->flc_lease, fl_list)
- lease_modify(&fl, F_UNLCK, &dispose);
+ lease_modify(fl, F_UNLCK, &dispose);
spin_unlock(&ctx->flc_lock);
locks_dispose_list(&dispose);
}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 80242f5bd621..532a60cca2fb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3477,7 +3477,8 @@ nfsd_break_deleg_cb(struct file_lock *fl)
}
static int
-nfsd_change_deleg_cb(struct file_lock **onlist, int arg, struct list_head *dispose)
+nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
+ struct list_head *dispose)
{
if (arg & F_UNLCK)
return lease_modify(onlist, arg, dispose);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 32eafa9b5c9f..94e706a0a408 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -900,7 +900,7 @@ struct lock_manager_operations {
void (*lm_notify)(struct file_lock *); /* unblock callback */
int (*lm_grant)(struct file_lock *, int);
bool (*lm_break)(struct file_lock *);
- int (*lm_change)(struct file_lock **, int, struct list_head *);
+ int (*lm_change)(struct file_lock *, int, struct list_head *);
void (*lm_setup)(struct file_lock *, void **);
};
@@ -1021,7 +1021,7 @@ extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int t
extern void lease_get_mtime(struct inode *, struct timespec *time);
extern int generic_setlease(struct file *, long, struct file_lock **, void **priv);
extern int vfs_setlease(struct file *, long, struct file_lock **, void **);
-extern int lease_modify(struct file_lock **, int, struct list_head *);
+extern int lease_modify(struct file_lock *, int, struct list_head *);
#else /* !CONFIG_FILE_LOCKING */
static inline int fcntl_getlk(struct file *file, unsigned int cmd,
struct flock __user *user)
@@ -1153,7 +1153,7 @@ static inline int vfs_setlease(struct file *filp, long arg,
return -EINVAL;
}
-static inline int lease_modify(struct file_lock **before, int arg,
+static inline int lease_modify(struct file_lock *fl, int arg,
struct list_head *dispose)
{
return -EINVAL;
--
2.1.0
From: Jeff Layton <[email protected]>
This makes things a bit more efficient in the cifs and ceph lock
pushing code.
Signed-off-by: Jeff Layton <[email protected]>
Acked-by: Christoph Hellwig <[email protected]>
---
fs/ceph/locks.c | 11 ++---------
fs/cifs/file.c | 14 ++++----------
fs/locks.c | 45 +++++++++++++++++++++++++++++----------------
include/linux/fs.h | 3 +++
4 files changed, 38 insertions(+), 35 deletions(-)
diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index 0303da8e3233..06ea5cd05cd9 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -242,12 +242,9 @@ int ceph_flock(struct file *file, int cmd, struct file_lock *fl)
/*
* Fills in the passed counter variables, so you can prepare pagelist metadata
* before calling ceph_encode_locks.
- *
- * FIXME: add counters to struct file_lock_context so we don't need to do this?
*/
void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
{
- struct file_lock *lock;
struct file_lock_context *ctx;
*fcntl_count = 0;
@@ -255,12 +252,8 @@ void ceph_count_locks(struct inode *inode, int *fcntl_count, int *flock_count)
ctx = inode->i_flctx;
if (ctx) {
- spin_lock(&ctx->flc_lock);
- list_for_each_entry(lock, &ctx->flc_posix, fl_list)
- ++(*fcntl_count);
- list_for_each_entry(lock, &ctx->flc_flock, fl_list)
- ++(*flock_count);
- spin_unlock(&ctx->flc_lock);
+ *fcntl_count = ctx->flc_posix_cnt;
+ *flock_count = ctx->flc_flock_cnt;
}
dout("counted %d flock locks and %d fcntl locks",
*flock_count, *fcntl_count);
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index b65166eb111e..8c2ca6f62bad 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1125,7 +1125,7 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
struct file_lock *flock;
struct file_lock_context *flctx = inode->i_flctx;
- unsigned int count = 0, i;
+ unsigned int i;
int rc = 0, xid, type;
struct list_head locks_to_send, *el;
struct lock_to_push *lck, *tmp;
@@ -1136,20 +1136,14 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
if (!flctx)
goto out;
- spin_lock(&flctx->flc_lock);
- list_for_each(el, &flctx->flc_posix) {
- count++;
- }
- spin_unlock(&flctx->flc_lock);
-
INIT_LIST_HEAD(&locks_to_send);
/*
- * Allocating count locks is enough because no FL_POSIX locks can be
- * added to the list while we are holding cinode->lock_sem that
+ * Allocating flc_posix_cnt locks is enough because no FL_POSIX locks
+ * can be added to the list while we are holding cinode->lock_sem that
* protects locking operations of this inode.
*/
- for (i = 0; i < count; i++) {
+ for (i = 0; i < flctx->flc_posix_cnt; i++) {
lck = kmalloc(sizeof(struct lock_to_push), GFP_KERNEL);
if (!lck) {
rc = -ENOMEM;
diff --git a/fs/locks.c b/fs/locks.c
index 864f2460a0ad..bd578700342d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -681,18 +681,21 @@ static void locks_wake_up_blocks(struct file_lock *blocker)
}
static void
-locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before)
+locks_insert_lock_ctx(struct file_lock *fl, int *counter,
+ struct list_head *before)
{
fl->fl_nspid = get_pid(task_tgid(current));
list_add_tail(&fl->fl_list, before);
+ ++*counter;
locks_insert_global_locks(fl);
}
static void
-locks_unlink_lock_ctx(struct file_lock *fl)
+locks_unlink_lock_ctx(struct file_lock *fl, int *counter)
{
locks_delete_global_locks(fl);
list_del_init(&fl->fl_list);
+ --*counter;
if (fl->fl_nspid) {
put_pid(fl->fl_nspid);
fl->fl_nspid = NULL;
@@ -701,9 +704,10 @@ locks_unlink_lock_ctx(struct file_lock *fl)
}
static void
-locks_delete_lock_ctx(struct file_lock *fl, struct list_head *dispose)
+locks_delete_lock_ctx(struct file_lock *fl, int *counter,
+ struct list_head *dispose)
{
- locks_unlink_lock_ctx(fl);
+ locks_unlink_lock_ctx(fl, counter);
if (dispose)
list_add(&fl->fl_list, dispose);
else
@@ -891,7 +895,7 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
if (request->fl_type == fl->fl_type)
goto out;
found = true;
- locks_delete_lock_ctx(fl, &dispose);
+ locks_delete_lock_ctx(fl, &ctx->flc_flock_cnt, &dispose);
break;
}
@@ -925,7 +929,7 @@ find_conflict:
if (request->fl_flags & FL_ACCESS)
goto out;
locks_copy_lock(new_fl, request);
- locks_insert_lock_ctx(new_fl, &ctx->flc_flock);
+ locks_insert_lock_ctx(new_fl, &ctx->flc_flock_cnt, &ctx->flc_flock);
new_fl = NULL;
error = 0;
@@ -1042,7 +1046,8 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
else
request->fl_end = fl->fl_end;
if (added) {
- locks_delete_lock_ctx(fl, &dispose);
+ locks_delete_lock_ctx(fl, &ctx->flc_posix_cnt,
+ &dispose);
continue;
}
request = fl;
@@ -1071,7 +1076,8 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
* one (This may happen several times).
*/
if (added) {
- locks_delete_lock_ctx(fl, &dispose);
+ locks_delete_lock_ctx(fl,
+ &ctx->flc_posix_cnt, &dispose);
continue;
}
/*
@@ -1087,8 +1093,10 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
locks_copy_lock(new_fl, request);
request = new_fl;
new_fl = NULL;
- locks_insert_lock_ctx(request, &fl->fl_list);
- locks_delete_lock_ctx(fl, &dispose);
+ locks_insert_lock_ctx(request,
+ &ctx->flc_posix_cnt, &fl->fl_list);
+ locks_delete_lock_ctx(fl,
+ &ctx->flc_posix_cnt, &dispose);
added = true;
}
}
@@ -1116,7 +1124,8 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
goto out;
}
locks_copy_lock(new_fl, request);
- locks_insert_lock_ctx(new_fl, &fl->fl_list);
+ locks_insert_lock_ctx(new_fl, &ctx->flc_posix_cnt,
+ &fl->fl_list);
new_fl = NULL;
}
if (right) {
@@ -1127,7 +1136,8 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str
left = new_fl2;
new_fl2 = NULL;
locks_copy_lock(left, right);
- locks_insert_lock_ctx(left, &fl->fl_list);
+ locks_insert_lock_ctx(left, &ctx->flc_posix_cnt,
+ &fl->fl_list);
}
right->fl_start = request->fl_end + 1;
locks_wake_up_blocks(right);
@@ -1311,6 +1321,7 @@ static void lease_clear_pending(struct file_lock *fl, int arg)
/* We already had a lease on this file; just change its type */
int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
{
+ struct file_lock_context *flctx;
int error = assign_type(fl, arg);
if (error)
@@ -1320,6 +1331,7 @@ int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
if (arg == F_UNLCK) {
struct file *filp = fl->fl_file;
+ flctx = file_inode(filp)->i_flctx;
f_delown(filp);
filp->f_owner.signum = 0;
fasync_helper(0, fl->fl_file, 0, &fl->fl_fasync);
@@ -1327,7 +1339,7 @@ int lease_modify(struct file_lock *fl, int arg, struct list_head *dispose)
printk(KERN_ERR "locks_delete_lock: fasync == %p\n", fl->fl_fasync);
fl->fl_fasync = NULL;
}
- locks_delete_lock_ctx(fl, dispose);
+ locks_delete_lock_ctx(fl, &flctx->flc_lease_cnt, dispose);
}
return 0;
}
@@ -1442,7 +1454,8 @@ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
fl->fl_downgrade_time = break_time;
}
if (fl->fl_lmops->lm_break(fl))
- locks_delete_lock_ctx(fl, &dispose);
+ locks_delete_lock_ctx(fl, &ctx->flc_lease_cnt,
+ &dispose);
}
if (list_empty(&ctx->flc_lease))
@@ -1678,7 +1691,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
if (!leases_enable)
goto out;
- locks_insert_lock_ctx(lease, &ctx->flc_lease);
+ locks_insert_lock_ctx(lease, &ctx->flc_lease_cnt, &ctx->flc_lease);
/*
* The check in break_lease() is lockless. It's possible for another
* open to race in after we did the earlier check for a conflicting
@@ -1691,7 +1704,7 @@ generic_add_lease(struct file *filp, long arg, struct file_lock **flp, void **pr
smp_mb();
error = check_conflicting_open(dentry, arg);
if (error) {
- locks_unlink_lock_ctx(lease);
+ locks_unlink_lock_ctx(lease, &ctx->flc_lease_cnt);
goto out;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 94e706a0a408..f87cb2f03103 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -972,6 +972,9 @@ struct file_lock_context {
struct list_head flc_flock;
struct list_head flc_posix;
struct list_head flc_lease;
+ int flc_flock_cnt;
+ int flc_posix_cnt;
+ int flc_lease_cnt;
};
/* The following constant reflects the upper bound of the file/locking space */
--
2.1.0
We have each of the locks_remove_* variants doing this individually.
Have the caller do it instead, and have locks_remove_flock and
locks_remove_lease just assume that it's a valid pointer.
Signed-off-by: Jeff Layton <[email protected]>
---
fs/locks.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index bd578700342d..2fc36b3772a0 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2400,6 +2400,7 @@ void locks_remove_posix(struct file *filp, fl_owner_t owner)
EXPORT_SYMBOL(locks_remove_posix);
+/* The i_flctx must be valid when calling into here */
static void
locks_remove_flock(struct file *filp)
{
@@ -2413,7 +2414,7 @@ locks_remove_flock(struct file *filp)
};
struct file_lock_context *flctx = file_inode(filp)->i_flctx;
- if (!flctx || list_empty(&flctx->flc_flock))
+ if (list_empty(&flctx->flc_flock))
return;
if (filp->f_op->flock)
@@ -2425,6 +2426,7 @@ locks_remove_flock(struct file *filp)
fl.fl_ops->fl_release_private(&fl);
}
+/* The i_flctx must be valid when calling into here */
static void
locks_remove_lease(struct file *filp)
{
@@ -2433,7 +2435,7 @@ locks_remove_lease(struct file *filp)
struct file_lock *fl, *tmp;
LIST_HEAD(dispose);
- if (!ctx || list_empty(&ctx->flc_lease))
+ if (list_empty(&ctx->flc_lease))
return;
spin_lock(&ctx->flc_lock);
@@ -2448,6 +2450,9 @@ locks_remove_lease(struct file *filp)
*/
void locks_remove_file(struct file *filp)
{
+ if (!file_inode(filp)->i_flctx)
+ return;
+
/* remove any OFD locks */
locks_remove_posix(filp, filp);
--
2.1.0
Signed-off-by: Jeff Layton <[email protected]>
---
fs/locks.c | 2 +-
include/linux/fs.h | 19 ++++++++++---------
2 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/fs/locks.c b/fs/locks.c
index 2fc36b3772a0..4d0d41163a50 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2212,7 +2212,7 @@ again:
*/
/*
* we need that spin_lock here - it prevents reordering between
- * update of inode->i_flock and check for it done in close().
+ * update of i_flctx->flc_posix and check for it done in close().
* rcu_read_lock() wouldn't do.
*/
spin_lock(¤t->files->file_lock);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f87cb2f03103..ddd2fa7cefd3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -925,12 +925,11 @@ int locks_in_grace(struct net *);
* FIXME: should we create a separate "struct lock_request" to help distinguish
* these two uses?
*
- * The i_flock list is ordered by:
+ * The varous i_flctx lists are ordered by:
*
- * 1) lock type -- FL_LEASEs first, then FL_FLOCK, and finally FL_POSIX
- * 2) lock owner
- * 3) lock range start
- * 4) lock range end
+ * 1) lock owner
+ * 2) lock range start
+ * 3) lock range end
*
* Obviously, the last two criteria only matter for POSIX locks.
*/
@@ -1992,8 +1991,9 @@ static inline int break_lease(struct inode *inode, unsigned int mode)
{
/*
* Since this check is lockless, we must ensure that any refcounts
- * taken are done before checking inode->i_flock. Otherwise, we could
- * end up racing with tasks trying to set a new lease on this file.
+ * taken are done before checking i_flctx->flc_lease. Otherwise, we
+ * could end up racing with tasks trying to set a new lease on this
+ * file.
*/
smp_mb();
if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
@@ -2005,8 +2005,9 @@ static inline int break_deleg(struct inode *inode, unsigned int mode)
{
/*
* Since this check is lockless, we must ensure that any refcounts
- * taken are done before checking inode->i_flock. Otherwise, we could
- * end up racing with tasks trying to set a new lease on this file.
+ * taken are done before checking i_flctx->flc_lease. Otherwise, we
+ * could end up racing with tasks trying to set a new lease on this
+ * file.
*/
smp_mb();
if (inode->i_flctx && !list_empty_careful(&inode->i_flctx->flc_lease))
--
2.1.0
I applied Jeff's patch to my orangefs-3.19-rc5 tree. Orangefs
doesn't yet support the lock callout for file_operations, but
we have been experimenting with some ideas that would allow
Orangefs to honor locks in our distributed environment: basically
posix locks for each kernel client plus meta data in our userspace
servers.
Anyhow, the lock callout in the Orangefs patch I've posted
just returns ENOSYS. I have added posix locks in a current
test, so now I have:
int pvfs2_lock(struct file *filp, int cmd, struct file_lock *fl)
{
int rc;
rc = posix_lock_file(filp, fl, NULL);
return rc;
}
So... while this isn't safe for our real world distributed environment,
it makes the kernel with this version of the Orangefs kernel client
loaded on it think that locks work.
Besides observing that locks seem to work by running some programs
that need locks (like svn/sqlite) I also ran xfstest generic/131.
Without Jeff's patch, xfstest generic/131 fails:
29:Verify that F_GETLK for F_WRLCK doesn't require that file be
opened for write
Same with Jeff's patch.
Jeff's patch applied painlessly, and my simple tests didn't uncover
any problems with Jeff's implementation of separate lists for different
lock types, so that's good.
What surprised me, though, is that the posix lock code failed
to get all the way through xfstest generic/131... maybe you
all knew that already?
-Mike
On Thu, Jan 22, 2015 at 9:27 AM, Jeff Layton
<[email protected]> wrote:
> v3:
> - break out a ceph locking cleanup patch into a separate one earlier
> in the series
> - switch back to using the i_lock to protect assignment of i_flctx.
> Using cmpxchg() was subject to races that I couldn't quite get a
> grip on. Eventually I may switch it back, but it probably doesn't
> provide much benefit.
> - add a patch to clean up some comments that refer to i_flock
> - use list_empty_careful in lockless checks rather than list_empty
>
> v2:
> - bugfix to the flc_posix list ordering that broke the split/merge code
> - don't use i_lock to manage the i_flctx pointer. Do so locklessly
> via cmpxchg().
> - reordering of the patches to make the set bisectable. As a result
> the new spinlock is not introduced until near the end of the set
> - some cleanup of the lm_change operation was added, made possible
> by the move to standard list_heads for tracking locks
> - it takes greater pains to avoid spinlocking by checking when the
> lists are empty in addition to whether the i_flctx pointer is NULL
> - a patch was added to keep count of the number of locks, so we can
> avoid having to do count/alloc/populate in ceph and cifs
>
> This is the third iteration of this patchset. The second was posted
> on January 8th, under the cover letter entitled:
>
> [PATCH v2 00/10] locks: saner method for managing file locks
>
> The big change here is that it once again uses the i_lock to protect the
> i_flctx pointer assignment. In principle we should be able to use
> cmpxchg instead, but that seems leave open a race that causes
> locks_remove_file to miss recently-added locks. Given that is not a
> super latency-sensitive codepath, an extra spinlock shouldn't make much
> difference.
>
> Many thanks to Sasha Levin who helped me nail a race that would leave
> leftover locks on the i_flock list, and David Howells who convinced
> me to just go ahead back to using the i_lock to protect that field.
>
> There are some other minor changes, but overall it's pretty similar
> to the last set. I still plan to merge this for v3.20.
>
> ------------------------[snip]-------------------------
>
> We currently manage all file_locks via a singly-linked list. This is
> problematic for a number of reasons:
>
> - we have to protect all file locks with the same spinlock (or
> equivalent). Currently that uses the i_lock, but Christoph has voiced
> objections due to the potential for contention with other i_lock
> users. He'd like to see us move to using a different lock.
>
> - we have to walk through irrelevant file locks in order to get to the
> ones we're interested in. For instance, POSIX locks are at the end
> of the list, so we have to skip over all of the flock locks and
> leases before we can work with them.
>
> - the singly-linked list is primitive and difficult to work with. We
> have to keep track of the "before" pointer and it's easy to get that
> wrong.
>
> Cleaning all of this up is complicated by the fact that no one really
> wants to grow struct inode in order to do so. We have a single pointer
> in the inode now and I don't think we want to use any more.
>
> One possibility that Trond raised was to move this to an hlist, but
> that doesn't do anything about the desire for a new spinlock.
>
> This patchset takes the approach of replacing the i_flock list with a
> new struct file_lock_context that is allocated when we intend to add a
> new file lock to an inode. The file_lock_context is only freed when we
> destroy the inode.
>
> Within that, we have separate (and standard!) lists for each lock type,
> and a dedicated spinlock for managing those lists. In principle we could
> even consider adding separate locks for each, but I didn't bother with
> that for now.
>
> For now, the code is still pretty "raw" and isn't bisectable. This is
> just a RFC for the basic approach. This is probably v3.19 material at
> best.
>
> Anyone have thoughts or comments on the basic approach?
>
> Jeff Layton (13):
> locks: add new struct list_head to struct file_lock
> locks: have locks_release_file use flock_lock_file to release generic
> flock locks
> locks: add a new struct file_locking_context pointer to struct inode
> ceph: move spinlocking into ceph_encode_locks_to_buffer and
> ceph_count_locks
> locks: move flock locks to file_lock_context
> locks: convert posix locks to file_lock_context
> locks: convert lease handling to file_lock_context
> locks: remove i_flock field from struct inode
> locks: add a dedicated spinlock to protect i_flctx lists
> locks: clean up the lm_change prototype
> locks: keep a count of locks on the flctx lists
> locks: consolidate NULL i_flctx checks in locks_remove_file
> locks: update comments that refer to inode->i_flock
>
> fs/ceph/locks.c | 64 +++---
> fs/ceph/mds_client.c | 4 -
> fs/cifs/file.c | 34 +--
> fs/inode.c | 3 +-
> fs/lockd/svcsubs.c | 26 ++-
> fs/locks.c | 569 +++++++++++++++++++++++++++------------------------
> fs/nfs/delegation.c | 23 ++-
> fs/nfs/nfs4state.c | 70 ++++---
> fs/nfs/pagelist.c | 6 +-
> fs/nfs/write.c | 41 +++-
> fs/nfsd/nfs4state.c | 21 +-
> fs/read_write.c | 2 +-
> include/linux/fs.h | 52 +++--
> 13 files changed, 510 insertions(+), 405 deletions(-)
>
> --
> 2.1.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2 Feb 2015 15:29:33 -0500
Mike Marshall <[email protected]> wrote:
> I applied Jeff's patch to my orangefs-3.19-rc5 tree. Orangefs
> doesn't yet support the lock callout for file_operations, but
> we have been experimenting with some ideas that would allow
> Orangefs to honor locks in our distributed environment: basically
> posix locks for each kernel client plus meta data in our userspace
> servers.
>
> Anyhow, the lock callout in the Orangefs patch I've posted
> just returns ENOSYS. I have added posix locks in a current
> test, so now I have:
>
> int pvfs2_lock(struct file *filp, int cmd, struct file_lock *fl)
> {
> int rc;
>
> rc = posix_lock_file(filp, fl, NULL);
>
> return rc;
> }
>
> So... while this isn't safe for our real world distributed environment,
> it makes the kernel with this version of the Orangefs kernel client
> loaded on it think that locks work.
>
> Besides observing that locks seem to work by running some programs
> that need locks (like svn/sqlite) I also ran xfstest generic/131.
>
> Without Jeff's patch, xfstest generic/131 fails:
>
> 29:Verify that F_GETLK for F_WRLCK doesn't require that file be
> opened for write
>
> Same with Jeff's patch.
>
> Jeff's patch applied painlessly, and my simple tests didn't uncover
> any problems with Jeff's implementation of separate lists for different
> lock types, so that's good.
>
> What surprised me, though, is that the posix lock code failed
> to get all the way through xfstest generic/131... maybe you
> all knew that already?
>
> -Mike
>
Hmm... I haven't seen 131 fail like this, but your code above looks
wrong. You're ignoring the "cmd" argument, so even F_GETLK requests are
setting a lock.
I think you might want to do something like:
int pvfs2_lock(struct file *filp, int cmd, struct file_lock *fl)
{
if (cmd == F_GETLK)
return posix_test_lock(filp, fl);
return posix_lock_file(filp, fl, fl);
}
...untested of course, but you get the idea.
> On Thu, Jan 22, 2015 at 9:27 AM, Jeff Layton
> <[email protected]> wrote:
> > v3:
> > - break out a ceph locking cleanup patch into a separate one earlier
> > in the series
> > - switch back to using the i_lock to protect assignment of i_flctx.
> > Using cmpxchg() was subject to races that I couldn't quite get a
> > grip on. Eventually I may switch it back, but it probably doesn't
> > provide much benefit.
> > - add a patch to clean up some comments that refer to i_flock
> > - use list_empty_careful in lockless checks rather than list_empty
> >
> > v2:
> > - bugfix to the flc_posix list ordering that broke the split/merge code
> > - don't use i_lock to manage the i_flctx pointer. Do so locklessly
> > via cmpxchg().
> > - reordering of the patches to make the set bisectable. As a result
> > the new spinlock is not introduced until near the end of the set
> > - some cleanup of the lm_change operation was added, made possible
> > by the move to standard list_heads for tracking locks
> > - it takes greater pains to avoid spinlocking by checking when the
> > lists are empty in addition to whether the i_flctx pointer is NULL
> > - a patch was added to keep count of the number of locks, so we can
> > avoid having to do count/alloc/populate in ceph and cifs
> >
> > This is the third iteration of this patchset. The second was posted
> > on January 8th, under the cover letter entitled:
> >
> > [PATCH v2 00/10] locks: saner method for managing file locks
> >
> > The big change here is that it once again uses the i_lock to protect the
> > i_flctx pointer assignment. In principle we should be able to use
> > cmpxchg instead, but that seems leave open a race that causes
> > locks_remove_file to miss recently-added locks. Given that is not a
> > super latency-sensitive codepath, an extra spinlock shouldn't make much
> > difference.
> >
> > Many thanks to Sasha Levin who helped me nail a race that would leave
> > leftover locks on the i_flock list, and David Howells who convinced
> > me to just go ahead back to using the i_lock to protect that field.
> >
> > There are some other minor changes, but overall it's pretty similar
> > to the last set. I still plan to merge this for v3.20.
> >
> > ------------------------[snip]-------------------------
> >
> > We currently manage all file_locks via a singly-linked list. This is
> > problematic for a number of reasons:
> >
> > - we have to protect all file locks with the same spinlock (or
> > equivalent). Currently that uses the i_lock, but Christoph has voiced
> > objections due to the potential for contention with other i_lock
> > users. He'd like to see us move to using a different lock.
> >
> > - we have to walk through irrelevant file locks in order to get to the
> > ones we're interested in. For instance, POSIX locks are at the end
> > of the list, so we have to skip over all of the flock locks and
> > leases before we can work with them.
> >
> > - the singly-linked list is primitive and difficult to work with. We
> > have to keep track of the "before" pointer and it's easy to get that
> > wrong.
> >
> > Cleaning all of this up is complicated by the fact that no one really
> > wants to grow struct inode in order to do so. We have a single pointer
> > in the inode now and I don't think we want to use any more.
> >
> > One possibility that Trond raised was to move this to an hlist, but
> > that doesn't do anything about the desire for a new spinlock.
> >
> > This patchset takes the approach of replacing the i_flock list with a
> > new struct file_lock_context that is allocated when we intend to add a
> > new file lock to an inode. The file_lock_context is only freed when we
> > destroy the inode.
> >
> > Within that, we have separate (and standard!) lists for each lock type,
> > and a dedicated spinlock for managing those lists. In principle we could
> > even consider adding separate locks for each, but I didn't bother with
> > that for now.
> >
> > For now, the code is still pretty "raw" and isn't bisectable. This is
> > just a RFC for the basic approach. This is probably v3.19 material at
> > best.
> >
> > Anyone have thoughts or comments on the basic approach?
> >
> > Jeff Layton (13):
> > locks: add new struct list_head to struct file_lock
> > locks: have locks_release_file use flock_lock_file to release generic
> > flock locks
> > locks: add a new struct file_locking_context pointer to struct inode
> > ceph: move spinlocking into ceph_encode_locks_to_buffer and
> > ceph_count_locks
> > locks: move flock locks to file_lock_context
> > locks: convert posix locks to file_lock_context
> > locks: convert lease handling to file_lock_context
> > locks: remove i_flock field from struct inode
> > locks: add a dedicated spinlock to protect i_flctx lists
> > locks: clean up the lm_change prototype
> > locks: keep a count of locks on the flctx lists
> > locks: consolidate NULL i_flctx checks in locks_remove_file
> > locks: update comments that refer to inode->i_flock
> >
> > fs/ceph/locks.c | 64 +++---
> > fs/ceph/mds_client.c | 4 -
> > fs/cifs/file.c | 34 +--
> > fs/inode.c | 3 +-
> > fs/lockd/svcsubs.c | 26 ++-
> > fs/locks.c | 569 +++++++++++++++++++++++++++------------------------
> > fs/nfs/delegation.c | 23 ++-
> > fs/nfs/nfs4state.c | 70 ++++---
> > fs/nfs/pagelist.c | 6 +-
> > fs/nfs/write.c | 41 +++-
> > fs/nfsd/nfs4state.c | 21 +-
> > fs/read_write.c | 2 +-
> > include/linux/fs.h | 52 +++--
> > 13 files changed, 510 insertions(+), 405 deletions(-)
> >
> > --
> > 2.1.0
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Jeff Layton <[email protected]>
Yes, 131 goes all the way through now... thanks for fixing my code in
your thread <g>...
int pvfs2_lock(struct file *filp, int cmd, struct file_lock *fl)
{
int rc = 0;
if (cmd == F_GETLK)
posix_test_lock(filp, fl);
else
rc = posix_lock_file(filp, fl, NULL);
return rc;
}
-Mike
On Mon, Feb 2, 2015 at 3:42 PM, Jeff Layton <[email protected]> wrote:
> On Mon, 2 Feb 2015 15:29:33 -0500
> Mike Marshall <[email protected]> wrote:
>
>> I applied Jeff's patch to my orangefs-3.19-rc5 tree. Orangefs
>> doesn't yet support the lock callout for file_operations, but
>> we have been experimenting with some ideas that would allow
>> Orangefs to honor locks in our distributed environment: basically
>> posix locks for each kernel client plus meta data in our userspace
>> servers.
>>
>> Anyhow, the lock callout in the Orangefs patch I've posted
>> just returns ENOSYS. I have added posix locks in a current
>> test, so now I have:
>>
>> int pvfs2_lock(struct file *filp, int cmd, struct file_lock *fl)
>> {
>> int rc;
>>
>> rc = posix_lock_file(filp, fl, NULL);
>>
>> return rc;
>> }
>>
>> So... while this isn't safe for our real world distributed environment,
>> it makes the kernel with this version of the Orangefs kernel client
>> loaded on it think that locks work.
>>
>> Besides observing that locks seem to work by running some programs
>> that need locks (like svn/sqlite) I also ran xfstest generic/131.
>>
>> Without Jeff's patch, xfstest generic/131 fails:
>>
>> 29:Verify that F_GETLK for F_WRLCK doesn't require that file be
>> opened for write
>>
>> Same with Jeff's patch.
>>
>> Jeff's patch applied painlessly, and my simple tests didn't uncover
>> any problems with Jeff's implementation of separate lists for different
>> lock types, so that's good.
>>
>> What surprised me, though, is that the posix lock code failed
>> to get all the way through xfstest generic/131... maybe you
>> all knew that already?
>>
>> -Mike
>>
>
> Hmm... I haven't seen 131 fail like this, but your code above looks
> wrong. You're ignoring the "cmd" argument, so even F_GETLK requests are
> setting a lock.
>
> I think you might want to do something like:
>
> int pvfs2_lock(struct file *filp, int cmd, struct file_lock *fl)
> {
> if (cmd == F_GETLK)
> return posix_test_lock(filp, fl);
> return posix_lock_file(filp, fl, fl);
> }
>
> ...untested of course, but you get the idea.
>
>
>> On Thu, Jan 22, 2015 at 9:27 AM, Jeff Layton
>> <[email protected]> wrote:
>> > v3:
>> > - break out a ceph locking cleanup patch into a separate one earlier
>> > in the series
>> > - switch back to using the i_lock to protect assignment of i_flctx.
>> > Using cmpxchg() was subject to races that I couldn't quite get a
>> > grip on. Eventually I may switch it back, but it probably doesn't
>> > provide much benefit.
>> > - add a patch to clean up some comments that refer to i_flock
>> > - use list_empty_careful in lockless checks rather than list_empty
>> >
>> > v2:
>> > - bugfix to the flc_posix list ordering that broke the split/merge code
>> > - don't use i_lock to manage the i_flctx pointer. Do so locklessly
>> > via cmpxchg().
>> > - reordering of the patches to make the set bisectable. As a result
>> > the new spinlock is not introduced until near the end of the set
>> > - some cleanup of the lm_change operation was added, made possible
>> > by the move to standard list_heads for tracking locks
>> > - it takes greater pains to avoid spinlocking by checking when the
>> > lists are empty in addition to whether the i_flctx pointer is NULL
>> > - a patch was added to keep count of the number of locks, so we can
>> > avoid having to do count/alloc/populate in ceph and cifs
>> >
>> > This is the third iteration of this patchset. The second was posted
>> > on January 8th, under the cover letter entitled:
>> >
>> > [PATCH v2 00/10] locks: saner method for managing file locks
>> >
>> > The big change here is that it once again uses the i_lock to protect the
>> > i_flctx pointer assignment. In principle we should be able to use
>> > cmpxchg instead, but that seems leave open a race that causes
>> > locks_remove_file to miss recently-added locks. Given that is not a
>> > super latency-sensitive codepath, an extra spinlock shouldn't make much
>> > difference.
>> >
>> > Many thanks to Sasha Levin who helped me nail a race that would leave
>> > leftover locks on the i_flock list, and David Howells who convinced
>> > me to just go ahead back to using the i_lock to protect that field.
>> >
>> > There are some other minor changes, but overall it's pretty similar
>> > to the last set. I still plan to merge this for v3.20.
>> >
>> > ------------------------[snip]-------------------------
>> >
>> > We currently manage all file_locks via a singly-linked list. This is
>> > problematic for a number of reasons:
>> >
>> > - we have to protect all file locks with the same spinlock (or
>> > equivalent). Currently that uses the i_lock, but Christoph has voiced
>> > objections due to the potential for contention with other i_lock
>> > users. He'd like to see us move to using a different lock.
>> >
>> > - we have to walk through irrelevant file locks in order to get to the
>> > ones we're interested in. For instance, POSIX locks are at the end
>> > of the list, so we have to skip over all of the flock locks and
>> > leases before we can work with them.
>> >
>> > - the singly-linked list is primitive and difficult to work with. We
>> > have to keep track of the "before" pointer and it's easy to get that
>> > wrong.
>> >
>> > Cleaning all of this up is complicated by the fact that no one really
>> > wants to grow struct inode in order to do so. We have a single pointer
>> > in the inode now and I don't think we want to use any more.
>> >
>> > One possibility that Trond raised was to move this to an hlist, but
>> > that doesn't do anything about the desire for a new spinlock.
>> >
>> > This patchset takes the approach of replacing the i_flock list with a
>> > new struct file_lock_context that is allocated when we intend to add a
>> > new file lock to an inode. The file_lock_context is only freed when we
>> > destroy the inode.
>> >
>> > Within that, we have separate (and standard!) lists for each lock type,
>> > and a dedicated spinlock for managing those lists. In principle we could
>> > even consider adding separate locks for each, but I didn't bother with
>> > that for now.
>> >
>> > For now, the code is still pretty "raw" and isn't bisectable. This is
>> > just a RFC for the basic approach. This is probably v3.19 material at
>> > best.
>> >
>> > Anyone have thoughts or comments on the basic approach?
>> >
>> > Jeff Layton (13):
>> > locks: add new struct list_head to struct file_lock
>> > locks: have locks_release_file use flock_lock_file to release generic
>> > flock locks
>> > locks: add a new struct file_locking_context pointer to struct inode
>> > ceph: move spinlocking into ceph_encode_locks_to_buffer and
>> > ceph_count_locks
>> > locks: move flock locks to file_lock_context
>> > locks: convert posix locks to file_lock_context
>> > locks: convert lease handling to file_lock_context
>> > locks: remove i_flock field from struct inode
>> > locks: add a dedicated spinlock to protect i_flctx lists
>> > locks: clean up the lm_change prototype
>> > locks: keep a count of locks on the flctx lists
>> > locks: consolidate NULL i_flctx checks in locks_remove_file
>> > locks: update comments that refer to inode->i_flock
>> >
>> > fs/ceph/locks.c | 64 +++---
>> > fs/ceph/mds_client.c | 4 -
>> > fs/cifs/file.c | 34 +--
>> > fs/inode.c | 3 +-
>> > fs/lockd/svcsubs.c | 26 ++-
>> > fs/locks.c | 569 +++++++++++++++++++++++++++------------------------
>> > fs/nfs/delegation.c | 23 ++-
>> > fs/nfs/nfs4state.c | 70 ++++---
>> > fs/nfs/pagelist.c | 6 +-
>> > fs/nfs/write.c | 41 +++-
>> > fs/nfsd/nfs4state.c | 21 +-
>> > fs/read_write.c | 2 +-
>> > include/linux/fs.h | 52 +++--
>> > 13 files changed, 510 insertions(+), 405 deletions(-)
>> >
>> > --
>> > 2.1.0
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>> > the body of a message to [email protected]
>> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> --
> Jeff Layton <[email protected]>