Highlights include:
- Stable patch to fix a deadlock between resends and LAYOUTRETURN
- Stable patch to not clear the layout stateid if LAYOUTRETURN is outstanding
- Stable patch to clear NFS_LAYOUT_RETURN_REQUESTED when invalidating the layout
- Stable patch to not send LAYOUTGET until the layout error has been reported
- Patches to embed LAYOUTRETURN operations in the COMPOUNDs for CLOSE and
DELEGRETURN. This fixes a set of races for the case where the server
implements return-on-close, and where the client and server disagree
on whether or not the layout was returned due to parallel OPEN and CLOSE
calls.
Trond Myklebust (26):
pNFS: Fix a deadlock between read resends and layoutreturn
pNFS: Don't clear the layout stateid if a layout return is outstanding
pNFS: Clear NFS_LAYOUT_RETURN_REQUESTED when invalidating the layout
stateid
pNFS: Force a retry of LAYOUTGET if the stateid doesn't match our
cache
pNFS: On error, do not send LAYOUTGET until the LAYOUTRETURN has
completed
pNFS: Fix race in pnfs_wait_on_layoutreturn
pNFS: consolidate the different range intersection tests
pNFS: Delay getting the layout header in CB_LAYOUTRECALL handlers
pNFS: Do not free layout segments that are marked for return
NFSv4: Ignore LAYOUTRETURN result if the layout doesn't match or is
invalid
pNFS: Remove spurious wake up in pnfs_layout_remove_lseg()
pNFS: Skip checking for return-on-close if the layout is invalid
pNFS: Get rid of unnecessary layout parameter in encode_layoutreturn
callback
pNFS: Don't mark layout segments invalid on layoutreturn in pnfs_roc
NFSv4: Fix missing operation accounting in NFS4_dec_delegreturn_sz
NFSv4: Add encode/decode of the layoutreturn op in CLOSE
NFSv4: Add encode/decode of the layoutreturn op in DELEGRETURN
pNFS: Clean up - add a helper to initialise struct layoutreturn_args
pNFS: Enable layoutreturn operation for return-on-close
pNFS: Prevent unnecessary layoutreturns after delegreturn
pNFS: Clear all layout segment state in
pnfs_mark_layout_stateid_invalid
pNFS: Fix bugs in _pnfs_return_layout
pNFS: Sync the layout state bits in pnfs_cache_lseg_for_layoutreturn
pNFS: Don't mark the layout as freed if the last lseg is marked for
return
pNFS: Wait on outstanding layoutreturns to complete in pnfs_roc()
pNFS: Skip invalid stateids when doing a bulk destroy
fs/nfs/callback_proc.c | 99 +++++---
fs/nfs/flexfilelayout/flexfilelayout.c | 8 +-
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 33 +--
fs/nfs/nfs4proc.c | 134 +++++++----
fs/nfs/nfs4xdr.c | 45 +++-
fs/nfs/objlayout/objlayout.c | 4 +-
fs/nfs/objlayout/objlayout.h | 1 -
fs/nfs/pnfs.c | 381 ++++++++++++++++++------------
fs/nfs/pnfs.h | 72 ++++--
include/linux/nfs_xdr.h | 6 +
10 files changed, 506 insertions(+), 277 deletions(-)
--
2.9.3
If we no longer hold any layout segments, we're normally expected to
consider the layout stateid to be invalid. However we cannot assume this
if we're about to, or in the process of sending a layoutreturn.
Fixes: 334a8f37115b ("pNFS: Don't forget the layout stateid if...")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v4.8+
---
fs/nfs/pnfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 206d560c74f4..09bbb07d0494 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -411,7 +411,9 @@ pnfs_layout_remove_lseg(struct pnfs_layout_hdr *lo,
list_del_init(&lseg->pls_list);
/* Matched by pnfs_get_layout_hdr in pnfs_layout_insert_lseg */
atomic_dec(&lo->plh_refcount);
- if (list_empty(&lo->plh_segs)) {
+ if (list_empty(&lo->plh_segs) &&
+ !test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags) &&
+ !test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags)) {
if (atomic_read(&lo->plh_outstanding) == 0)
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
clear_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
--
2.9.3
We must not call nfs_pageio_init_read() on a new nfs_pageio_descriptor
while holding a reference to a layout segment, as that can deadlock
pnfs_update_layout().
Fixes: d67ae825a59d6 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v4.0+
---
fs/nfs/flexfilelayout/flexfilelayout.c | 4 ++++
fs/nfs/pnfs.c | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 98ace127bf86..a5c38889e7ae 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -28,6 +28,9 @@
static struct group_info *ff_zero_group;
+static void ff_layout_read_record_layoutstats_done(struct rpc_task *task,
+ struct nfs_pgio_header *hdr);
+
static struct pnfs_layout_hdr *
ff_layout_alloc_layout_hdr(struct inode *inode, gfp_t gfp_flags)
{
@@ -1293,6 +1296,7 @@ static int ff_layout_read_done_cb(struct rpc_task *task,
hdr->pgio_mirror_idx + 1,
&hdr->pgio_mirror_idx))
goto out_eagain;
+ ff_layout_read_record_layoutstats_done(task, hdr);
pnfs_read_resend_pnfs(hdr);
return task->tk_status;
case -NFS4ERR_RESET_TO_MDS:
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 259ef85f435a..206d560c74f4 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2286,6 +2286,10 @@ void pnfs_read_resend_pnfs(struct nfs_pgio_header *hdr)
struct nfs_pageio_descriptor pgio;
if (!test_and_set_bit(NFS_IOHDR_REDO, &hdr->flags)) {
+ /* Prevent deadlocks with layoutreturn! */
+ pnfs_put_lseg(hdr->lseg);
+ hdr->lseg = NULL;
+
nfs_pageio_init_read(&pgio, hdr->inode, false,
hdr->completion_ops);
hdr->task.tk_status = nfs_pageio_resend(&pgio, hdr);
--
2.9.3
We must ensure that we don't schedule a layoutreturn if the layout stateid
has been marked as invalid.
Fixes: 2a59a0411671e ("pNFS: Fix pnfs_set_layout_stateid() to clear...")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v4.8+
---
fs/nfs/pnfs.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 09bbb07d0494..d4e06b7459f4 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -299,6 +299,14 @@ pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo)
}
}
+static void
+pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
+{
+ lo->plh_return_iomode = 0;
+ lo->plh_return_seq = 0;
+ clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
+}
+
/*
* Mark a pnfs_layout_hdr and all associated layout segments as invalid
*
@@ -317,6 +325,7 @@ pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
};
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
+ pnfs_clear_layoutreturn_info(lo);
return pnfs_mark_matching_lsegs_invalid(lo, lseg_list, &range, 0);
}
@@ -818,14 +827,6 @@ pnfs_destroy_all_layouts(struct nfs_client *clp)
pnfs_destroy_layouts_byclid(clp, false);
}
-static void
-pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
-{
- lo->plh_return_iomode = 0;
- lo->plh_return_seq = 0;
- clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
-}
-
/* update lo->plh_stateid with new if is more recent */
void
pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo, const nfs4_stateid *new,
--
2.9.3
If the server sends us a completely new stateid, and the client thinks
it already holds a layout, then force a retry of the LAYOUTGET after
invalidating the existing layout in order to avoid corruption due to
races.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index d4e06b7459f4..9b7e88b7edfc 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1844,7 +1844,10 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
goto out_forget;
}
- if (nfs4_stateid_match_other(&lo->plh_stateid, &res->stateid)) {
+ if (!pnfs_layout_is_valid(lo)) {
+ /* We have a completely new layout */
+ pnfs_set_layout_stateid(lo, &res->stateid, true);
+ } else if (nfs4_stateid_match_other(&lo->plh_stateid, &res->stateid)) {
/* existing state ID, make sure the sequence number matches. */
if (pnfs_layout_stateid_blocked(lo, &res->stateid)) {
dprintk("%s forget reply due to sequence\n", __func__);
@@ -1854,12 +1857,10 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
} else {
/*
* We got an entirely new state ID. Mark all segments for the
- * inode invalid, and don't bother validating the stateid
- * sequence number.
+ * inode invalid, and retry the layoutget
*/
pnfs_mark_layout_stateid_invalid(lo, &free_me);
-
- pnfs_set_layout_stateid(lo, &res->stateid, true);
+ goto out_forget;
}
pnfs_get_lseg(lseg);
--
2.9.3
If there is an I/O error, we should not call LAYOUTGET until the
LAYOUTRETURN that reports the error is complete.
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v4.8+
---
fs/nfs/pnfs.c | 6 +++++-
fs/nfs/pnfs.h | 1 +
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 9b7e88b7edfc..06bfcd277006 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -947,6 +947,7 @@ static void pnfs_clear_layoutcommit(struct inode *inode,
void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo)
{
clear_bit_unlock(NFS_LAYOUT_RETURN, &lo->plh_flags);
+ clear_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags);
smp_mb__after_atomic();
wake_up_bit(&lo->plh_flags, NFS_LAYOUT_RETURN);
rpc_wake_up(&NFS_SERVER(lo->plh_inode)->roc_rpcwaitq);
@@ -960,8 +961,9 @@ pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo,
/* Serialise LAYOUTGET/LAYOUTRETURN */
if (atomic_read(&lo->plh_outstanding) != 0)
return false;
- if (test_and_set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags))
+ if (test_and_set_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags))
return false;
+ set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags);
pnfs_get_layout_hdr(lo);
if (test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags)) {
if (stateid != NULL) {
@@ -1954,6 +1956,8 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
spin_lock(&inode->i_lock);
pnfs_set_plh_return_info(lo, range.iomode, 0);
+ /* Block LAYOUTGET */
+ set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags);
/*
* mark all matching lsegs so that we are sure to have no live
* segments at hand when sending layoutreturn. See pnfs_put_lseg()
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 5c295512c967..44cad8afda0e 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -96,6 +96,7 @@ enum {
NFS_LAYOUT_RW_FAILED, /* get rw layout failed stop trying */
NFS_LAYOUT_BULK_RECALL, /* bulk recall affecting layout */
NFS_LAYOUT_RETURN, /* layoutreturn in progress */
+ NFS_LAYOUT_RETURN_LOCK, /* Serialise layoutreturn */
NFS_LAYOUT_RETURN_REQUESTED, /* Return this layout ASAP */
NFS_LAYOUT_INVALID_STID, /* layout stateid id is invalid */
NFS_LAYOUT_FIRST_LAYOUTGET, /* Serialize first layoutget */
--
2.9.3
Both pnfs.c and the flexfiles code have their own versions of the
range intersection testing, and the "end_offset" helper.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 33 +++++++----------------------
fs/nfs/pnfs.c | 35 +++----------------------------
fs/nfs/pnfs.h | 32 ++++++++++++++++++++++++++++
3 files changed, 43 insertions(+), 57 deletions(-)
diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index f7a3f6b05369..6cf545c06cd3 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -198,22 +198,13 @@ static bool ff_layout_mirror_valid(struct pnfs_layout_segment *lseg,
return true;
}
-static u64
-end_offset(u64 start, u64 len)
-{
- u64 end;
-
- end = start + len;
- return end >= start ? end : NFS4_MAX_UINT64;
-}
-
static void extend_ds_error(struct nfs4_ff_layout_ds_err *err,
u64 offset, u64 length)
{
u64 end;
- end = max_t(u64, end_offset(err->offset, err->length),
- end_offset(offset, length));
+ end = max_t(u64, pnfs_end_offset(err->offset, err->length),
+ pnfs_end_offset(offset, length));
err->offset = min_t(u64, err->offset, offset);
err->length = end - err->offset;
}
@@ -235,9 +226,9 @@ ff_ds_error_match(const struct nfs4_ff_layout_ds_err *e1,
ret = memcmp(&e1->deviceid, &e2->deviceid, sizeof(e1->deviceid));
if (ret != 0)
return ret;
- if (end_offset(e1->offset, e1->length) < e2->offset)
+ if (pnfs_end_offset(e1->offset, e1->length) < e2->offset)
return -1;
- if (e1->offset > end_offset(e2->offset, e2->length))
+ if (e1->offset > pnfs_end_offset(e2->offset, e2->length))
return 1;
/* If ranges overlap or are contiguous, they are the same */
return 0;
@@ -457,16 +448,6 @@ nfs4_ff_find_or_create_ds_client(struct pnfs_layout_segment *lseg, u32 ds_idx,
}
}
-static bool is_range_intersecting(u64 offset1, u64 length1,
- u64 offset2, u64 length2)
-{
- u64 end1 = end_offset(offset1, length1);
- u64 end2 = end_offset(offset2, length2);
-
- return (end1 == NFS4_MAX_UINT64 || end1 > offset2) &&
- (end2 == NFS4_MAX_UINT64 || end2 > offset1);
-}
-
/* called with inode i_lock held */
int ff_layout_encode_ds_ioerr(struct nfs4_flexfile_layout *flo,
struct xdr_stream *xdr, int *count,
@@ -476,8 +457,10 @@ int ff_layout_encode_ds_ioerr(struct nfs4_flexfile_layout *flo,
__be32 *p;
list_for_each_entry_safe(err, n, &flo->error_list, list) {
- if (!is_range_intersecting(err->offset, err->length,
- range->offset, range->length))
+ if (!pnfs_is_range_intersecting(err->offset,
+ pnfs_end_offset(err->offset, err->length),
+ range->offset,
+ pnfs_end_offset(range->offset, range->length)))
continue;
/* offset(8) + length(8) + stateid(NFS4_STATEID_SIZE)
* + array length + deviceid(NFS4_DEVICEID4_SIZE)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index e17d9ec82ca7..d70cc467a87b 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -500,15 +500,6 @@ pnfs_put_lseg_locked(struct pnfs_layout_segment *lseg)
}
EXPORT_SYMBOL_GPL(pnfs_put_lseg_locked);
-static u64
-end_offset(u64 start, u64 len)
-{
- u64 end;
-
- end = start + len;
- return end >= start ? end : NFS4_MAX_UINT64;
-}
-
/*
* is l2 fully contained in l1?
* start1 end1
@@ -521,33 +512,13 @@ pnfs_lseg_range_contained(const struct pnfs_layout_range *l1,
const struct pnfs_layout_range *l2)
{
u64 start1 = l1->offset;
- u64 end1 = end_offset(start1, l1->length);
+ u64 end1 = pnfs_end_offset(start1, l1->length);
u64 start2 = l2->offset;
- u64 end2 = end_offset(start2, l2->length);
+ u64 end2 = pnfs_end_offset(start2, l2->length);
return (start1 <= start2) && (end1 >= end2);
}
-/*
- * is l1 and l2 intersecting?
- * start1 end1
- * [----------------------------------)
- * start2 end2
- * [----------------)
- */
-static bool
-pnfs_lseg_range_intersecting(const struct pnfs_layout_range *l1,
- const struct pnfs_layout_range *l2)
-{
- u64 start1 = l1->offset;
- u64 end1 = end_offset(start1, l1->length);
- u64 start2 = l2->offset;
- u64 end2 = end_offset(start2, l2->length);
-
- return (end1 == NFS4_MAX_UINT64 || end1 > start2) &&
- (end2 == NFS4_MAX_UINT64 || end2 > start1);
-}
-
static bool pnfs_lseg_dec_and_remove_zero(struct pnfs_layout_segment *lseg,
struct list_head *tmp_list)
{
@@ -2069,7 +2040,7 @@ pnfs_generic_pg_test(struct nfs_pageio_descriptor *pgio,
*
*/
if (pgio->pg_lseg) {
- seg_end = end_offset(pgio->pg_lseg->pls_range.offset,
+ seg_end = pnfs_end_offset(pgio->pg_lseg->pls_range.offset,
pgio->pg_lseg->pls_range.length);
req_start = req_offset(req);
WARN_ON_ONCE(req_start >= seg_end);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 44cad8afda0e..337dad382b6a 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -560,6 +560,38 @@ pnfs_copy_range(struct pnfs_layout_range *dst,
memcpy(dst, src, sizeof(*dst));
}
+static inline u64
+pnfs_end_offset(u64 start, u64 len)
+{
+ if (NFS4_MAX_UINT64 - start <= len)
+ return NFS4_MAX_UINT64;
+ return start + len;
+}
+
+/*
+ * Are 2 ranges intersecting?
+ * start1 end1
+ * [----------------------------------)
+ * start2 end2
+ * [----------------)
+ */
+static inline bool
+pnfs_is_range_intersecting(u64 start1, u64 end1, u64 start2, u64 end2)
+{
+ return (end1 == NFS4_MAX_UINT64 || start2 < end1) &&
+ (end2 == NFS4_MAX_UINT64 || start1 < end2);
+}
+
+static inline bool
+pnfs_lseg_range_intersecting(const struct pnfs_layout_range *l1,
+ const struct pnfs_layout_range *l2)
+{
+ u64 end1 = pnfs_end_offset(l1->offset, l1->length);
+ u64 end2 = pnfs_end_offset(l2->offset, l2->length);
+
+ return pnfs_is_range_intersecting(l1->offset, end1, l2->offset, end2);
+}
+
extern unsigned int layoutstats_timer;
#ifdef NFS_DEBUG
--
2.9.3
We must put the task to sleep while holding the inode->i_lock in order
to ensure atomicity with the test for NFS_LAYOUT_RETURN.
Fixes: 500d701f336b ("NFS41: make close wait for layoutreturn")
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 06bfcd277006..e17d9ec82ca7 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1257,13 +1257,11 @@ bool pnfs_wait_on_layoutreturn(struct inode *ino, struct rpc_task *task)
* i_lock */
spin_lock(&ino->i_lock);
lo = nfsi->layout;
- if (lo && test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags))
+ if (lo && test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags)) {
+ rpc_sleep_on(&NFS_SERVER(ino)->roc_rpcwaitq, task, NULL);
sleep = true;
+ }
spin_unlock(&ino->i_lock);
-
- if (sleep)
- rpc_sleep_on(&NFS_SERVER(ino)->roc_rpcwaitq, task, NULL);
-
return sleep;
}
--
2.9.3
Instead of grabbing the layout, we want to get the inode so that we
can reduce races between layoutget and layoutrecall when the server
does not support call referring.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/callback_proc.c | 99 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 67 insertions(+), 32 deletions(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index e9aa235e9d10..f073a6d2c6a5 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -110,20 +110,52 @@ __be32 nfs4_callback_recall(struct cb_recallargs *args, void *dummy,
#if defined(CONFIG_NFS_V4_1)
/*
- * Lookup a layout by filehandle.
+ * Lookup a layout inode by stateid
*
- * Note: gets a refcount on the layout hdr and on its respective inode.
- * Caller must put the layout hdr and the inode.
+ * Note: returns a refcount on the inode and superblock
+ */
+static struct inode *nfs_layout_find_inode_by_stateid(struct nfs_client *clp,
+ const nfs4_stateid *stateid)
+{
+ struct nfs_server *server;
+ struct inode *inode;
+ struct pnfs_layout_hdr *lo;
+
+restart:
+ list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
+ list_for_each_entry(lo, &server->layouts, plh_layouts) {
+ if (stateid != NULL &&
+ !nfs4_stateid_match_other(stateid, &lo->plh_stateid))
+ continue;
+ inode = igrab(lo->plh_inode);
+ if (!inode)
+ continue;
+ if (!nfs_sb_active(inode->i_sb)) {
+ rcu_read_lock();
+ spin_unlock(&clp->cl_lock);
+ iput(inode);
+ spin_lock(&clp->cl_lock);
+ goto restart;
+ }
+ return inode;
+ }
+ }
+
+ return NULL;
+}
+
+/*
+ * Lookup a layout inode by filehandle.
+ *
+ * Note: returns a refcount on the inode and superblock
*
- * TODO: keep track of all layouts (and delegations) in a hash table
- * hashed by filehandle.
*/
-static struct pnfs_layout_hdr * get_layout_by_fh_locked(struct nfs_client *clp,
- struct nfs_fh *fh)
+static struct inode *nfs_layout_find_inode_by_fh(struct nfs_client *clp,
+ const struct nfs_fh *fh)
{
struct nfs_server *server;
struct nfs_inode *nfsi;
- struct inode *ino;
+ struct inode *inode;
struct pnfs_layout_hdr *lo;
restart:
@@ -134,37 +166,38 @@ static struct pnfs_layout_hdr * get_layout_by_fh_locked(struct nfs_client *clp,
continue;
if (nfsi->layout != lo)
continue;
- ino = igrab(lo->plh_inode);
- if (!ino)
- break;
- spin_lock(&ino->i_lock);
- /* Is this layout in the process of being freed? */
- if (nfsi->layout != lo) {
- spin_unlock(&ino->i_lock);
- iput(ino);
+ inode = igrab(lo->plh_inode);
+ if (!inode)
+ continue;
+ if (!nfs_sb_active(inode->i_sb)) {
+ rcu_read_lock();
+ spin_unlock(&clp->cl_lock);
+ iput(inode);
+ spin_lock(&clp->cl_lock);
goto restart;
}
- pnfs_get_layout_hdr(lo);
- spin_unlock(&ino->i_lock);
- return lo;
+ return inode;
}
}
return NULL;
}
-static struct pnfs_layout_hdr * get_layout_by_fh(struct nfs_client *clp,
- struct nfs_fh *fh)
+static struct inode *nfs_layout_find_inode(struct nfs_client *clp,
+ const struct nfs_fh *fh,
+ const nfs4_stateid *stateid)
{
- struct pnfs_layout_hdr *lo;
+ struct inode *inode;
spin_lock(&clp->cl_lock);
rcu_read_lock();
- lo = get_layout_by_fh_locked(clp, fh);
+ inode = nfs_layout_find_inode_by_stateid(clp, stateid);
+ if (!inode)
+ inode = nfs_layout_find_inode_by_fh(clp, fh);
rcu_read_unlock();
spin_unlock(&clp->cl_lock);
- return lo;
+ return inode;
}
/*
@@ -213,18 +246,20 @@ static u32 initiate_file_draining(struct nfs_client *clp,
u32 rv = NFS4ERR_NOMATCHING_LAYOUT;
LIST_HEAD(free_me_list);
- lo = get_layout_by_fh(clp, &args->cbl_fh);
- if (!lo) {
- trace_nfs4_cb_layoutrecall_file(clp, &args->cbl_fh, NULL,
- &args->cbl_stateid, -rv);
+ ino = nfs_layout_find_inode(clp, &args->cbl_fh, &args->cbl_stateid);
+ if (!ino)
goto out;
- }
- ino = lo->plh_inode;
pnfs_layoutcommit_inode(ino, false);
spin_lock(&ino->i_lock);
+ lo = NFS_I(ino)->layout;
+ if (!lo) {
+ spin_unlock(&ino->i_lock);
+ goto out;
+ }
+ pnfs_get_layout_hdr(lo);
rv = pnfs_check_callback_stateid(lo, &args->cbl_stateid);
if (rv != NFS_OK)
goto unlock;
@@ -258,10 +293,10 @@ static u32 initiate_file_draining(struct nfs_client *clp,
/* Free all lsegs that are attached to commit buckets */
nfs_commit_inode(ino, 0);
pnfs_put_layout_hdr(lo);
+out:
trace_nfs4_cb_layoutrecall_file(clp, &args->cbl_fh, ino,
&args->cbl_stateid, -rv);
- iput(ino);
-out:
+ nfs_iput_and_deactive(ino);
return rv;
}
--
2.9.3
We may want to process and transmit layout stat information for the
layout segments that are being returned, so we should defer freeing
them until after the layoutreturn has completed.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 15 +++--------
fs/nfs/pnfs.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++-------
fs/nfs/pnfs.h | 6 ++++-
3 files changed, 73 insertions(+), 22 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 917a6db5c84f..561b21e4a930 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -8572,21 +8572,12 @@ static void nfs4_layoutreturn_release(void *calldata)
{
struct nfs4_layoutreturn *lrp = calldata;
struct pnfs_layout_hdr *lo = lrp->args.layout;
- LIST_HEAD(freeme);
dprintk("--> %s\n", __func__);
- spin_lock(&lo->plh_inode->i_lock);
- if (lrp->res.lrs_present) {
- pnfs_mark_matching_lsegs_invalid(lo, &freeme,
- &lrp->args.range,
- be32_to_cpu(lrp->args.stateid.seqid));
- pnfs_set_layout_stateid(lo, &lrp->res.stateid, true);
- } else
- pnfs_mark_layout_stateid_invalid(lo, &freeme);
- pnfs_clear_layoutreturn_waitbit(lo);
- spin_unlock(&lo->plh_inode->i_lock);
+ pnfs_layoutreturn_free_lsegs(lo, &lrp->args.range,
+ be32_to_cpu(lrp->args.stateid.seqid),
+ lrp->res.lrs_present ? &lrp->res.stateid : NULL);
nfs4_sequence_free_slot(&lrp->res.seq_res);
- pnfs_free_lseg_list(&freeme);
pnfs_put_layout_hdr(lrp->args.layout);
nfs_iput_and_deactive(lrp->inode);
kfree(calldata);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index d70cc467a87b..555151b6d31b 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -54,6 +54,10 @@ static DEFINE_SPINLOCK(pnfs_spinlock);
static LIST_HEAD(pnfs_modules_tbl);
static void pnfs_layoutreturn_before_put_layout_hdr(struct pnfs_layout_hdr *lo);
+static void pnfs_free_returned_lsegs(struct pnfs_layout_hdr *lo,
+ struct list_head *free_me,
+ const struct pnfs_layout_range *range,
+ u32 seq);
/* Return the registered pnfs layout driver module matching given id */
static struct pnfs_layoutdriver_type *
@@ -326,6 +330,7 @@ pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
pnfs_clear_layoutreturn_info(lo);
+ pnfs_free_returned_lsegs(lo, lseg_list, &range, 0);
return pnfs_mark_matching_lsegs_invalid(lo, lseg_list, &range, 0);
}
@@ -405,9 +410,10 @@ pnfs_init_lseg(struct pnfs_layout_hdr *lo, struct pnfs_layout_segment *lseg,
static void pnfs_free_lseg(struct pnfs_layout_segment *lseg)
{
- struct inode *ino = lseg->pls_layout->plh_inode;
-
- NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg);
+ if (lseg != NULL) {
+ struct inode *inode = lseg->pls_layout->plh_inode;
+ NFS_SERVER(inode)->pnfs_curr_ld->free_lseg(lseg);
+ }
}
static void
@@ -430,6 +436,18 @@ pnfs_layout_remove_lseg(struct pnfs_layout_hdr *lo,
rpc_wake_up(&NFS_SERVER(inode)->roc_rpcwaitq);
}
+static bool
+pnfs_cache_lseg_for_layoutreturn(struct pnfs_layout_hdr *lo,
+ struct pnfs_layout_segment *lseg)
+{
+ if (test_and_clear_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags) &&
+ pnfs_layout_is_valid(lo)) {
+ list_move_tail(&lseg->pls_list, &lo->plh_return_segs);
+ return true;
+ }
+ return false;
+}
+
void
pnfs_put_lseg(struct pnfs_layout_segment *lseg)
{
@@ -453,6 +471,8 @@ pnfs_put_lseg(struct pnfs_layout_segment *lseg)
}
pnfs_get_layout_hdr(lo);
pnfs_layout_remove_lseg(lo, lseg);
+ if (pnfs_cache_lseg_for_layoutreturn(lo, lseg))
+ lseg = NULL;
spin_unlock(&inode->i_lock);
pnfs_free_lseg(lseg);
pnfs_put_layout_hdr(lo);
@@ -493,9 +513,11 @@ pnfs_put_lseg_locked(struct pnfs_layout_segment *lseg)
struct pnfs_layout_hdr *lo = lseg->pls_layout;
if (test_bit(NFS_LSEG_VALID, &lseg->pls_flags))
return;
- pnfs_get_layout_hdr(lo);
pnfs_layout_remove_lseg(lo, lseg);
- pnfs_free_lseg_async(lseg);
+ if (!pnfs_cache_lseg_for_layoutreturn(lo, lseg)) {
+ pnfs_get_layout_hdr(lo);
+ pnfs_free_lseg_async(lseg);
+ }
}
}
EXPORT_SYMBOL_GPL(pnfs_put_lseg_locked);
@@ -619,6 +641,20 @@ pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
return remaining;
}
+static void
+pnfs_free_returned_lsegs(struct pnfs_layout_hdr *lo,
+ struct list_head *free_me,
+ const struct pnfs_layout_range *range,
+ u32 seq)
+{
+ struct pnfs_layout_segment *lseg, *next;
+
+ list_for_each_entry_safe(lseg, next, &lo->plh_return_segs, pls_list) {
+ if (pnfs_match_lseg_recall(lseg, range, seq))
+ list_move_tail(&lseg->pls_list, free_me);
+ }
+}
+
/* note free_me must contain lsegs from a single layout_hdr */
void
pnfs_free_lseg_list(struct list_head *free_me)
@@ -915,7 +951,7 @@ static void pnfs_clear_layoutcommit(struct inode *inode,
}
}
-void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo)
+static void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo)
{
clear_bit_unlock(NFS_LAYOUT_RETURN, &lo->plh_flags);
clear_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags);
@@ -924,6 +960,27 @@ void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo)
rpc_wake_up(&NFS_SERVER(lo->plh_inode)->roc_rpcwaitq);
}
+void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo,
+ const struct pnfs_layout_range *range,
+ u32 seq,
+ const nfs4_stateid *stateid)
+{
+ struct inode *inode = lo->plh_inode;
+ LIST_HEAD(freeme);
+
+ spin_lock(&inode->i_lock);
+ if (stateid) {
+ pnfs_mark_matching_lsegs_invalid(lo, &freeme, range, seq);
+ pnfs_free_returned_lsegs(lo, &freeme, range, seq);
+ pnfs_set_layout_stateid(lo, stateid, true);
+ } else
+ pnfs_mark_layout_stateid_invalid(lo, &freeme);
+ pnfs_clear_layoutreturn_waitbit(lo);
+ spin_unlock(&inode->i_lock);
+ pnfs_free_lseg_list(&freeme);
+
+}
+
static bool
pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo,
nfs4_stateid *stateid,
@@ -1349,6 +1406,7 @@ alloc_init_layout_hdr(struct inode *ino,
atomic_set(&lo->plh_refcount, 1);
INIT_LIST_HEAD(&lo->plh_layouts);
INIT_LIST_HEAD(&lo->plh_segs);
+ INIT_LIST_HEAD(&lo->plh_return_segs);
INIT_LIST_HEAD(&lo->plh_bulk_destroy);
lo->plh_inode = ino;
lo->plh_lc_cred = get_rpccred(ctx->cred);
@@ -1920,7 +1978,6 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
.offset = 0,
.length = NFS4_MAX_UINT64,
};
- LIST_HEAD(free_me);
bool return_now = false;
spin_lock(&inode->i_lock);
@@ -1932,7 +1989,7 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
* segments at hand when sending layoutreturn. See pnfs_put_lseg()
* for how it works.
*/
- if (!pnfs_mark_matching_lsegs_return(lo, &free_me, &range, 0)) {
+ if (!pnfs_mark_matching_lsegs_return(lo, &lo->plh_return_segs, &range, 0)) {
nfs4_stateid stateid;
enum pnfs_iomode iomode;
@@ -1944,7 +2001,6 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
spin_unlock(&inode->i_lock);
nfs_commit_inode(inode, 0);
}
- pnfs_free_lseg_list(&free_me);
}
EXPORT_SYMBOL_GPL(pnfs_error_mark_layout_for_return);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 337dad382b6a..a382710edf40 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -191,6 +191,7 @@ struct pnfs_layout_hdr {
struct list_head plh_layouts; /* other client layouts */
struct list_head plh_bulk_destroy;
struct list_head plh_segs; /* layout segments list */
+ struct list_head plh_return_segs; /* invalid layout segments */
unsigned long plh_block_lgets; /* block LAYOUTGET if >0 */
unsigned long plh_retry_timestamp;
unsigned long plh_flags;
@@ -293,7 +294,10 @@ struct pnfs_layout_segment *pnfs_update_layout(struct inode *ino,
enum pnfs_iomode iomode,
bool strict_iomode,
gfp_t gfp_flags);
-void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo);
+void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo,
+ const struct pnfs_layout_range *range,
+ u32 seq,
+ const nfs4_stateid *stateid);
void pnfs_generic_layout_insert_lseg(struct pnfs_layout_hdr *lo,
struct pnfs_layout_segment *lseg,
--
2.9.3
There is no change to the value of NFS_LAYOUT_RETURN, so we should
not be waking up the RPC call.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 471018f27c8d..a64d1e40dba9 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -420,8 +420,6 @@ static void
pnfs_layout_remove_lseg(struct pnfs_layout_hdr *lo,
struct pnfs_layout_segment *lseg)
{
- struct inode *inode = lo->plh_inode;
-
WARN_ON(test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
list_del_init(&lseg->pls_list);
/* Matched by pnfs_get_layout_hdr in pnfs_layout_insert_lseg */
@@ -433,7 +431,6 @@ pnfs_layout_remove_lseg(struct pnfs_layout_hdr *lo,
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
clear_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
}
- rpc_wake_up(&NFS_SERVER(inode)->roc_rpcwaitq);
}
static bool
--
2.9.3
Fix a potential race with CB_LAYOUTRECALL in which the server recalls the
remaining layout segments while our LAYOUTRETURN is still in transit.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 3 +--
fs/nfs/pnfs.c | 8 +++++++-
fs/nfs/pnfs.h | 2 +-
3 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 561b21e4a930..87972cfa62bc 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -8574,8 +8574,7 @@ static void nfs4_layoutreturn_release(void *calldata)
struct pnfs_layout_hdr *lo = lrp->args.layout;
dprintk("--> %s\n", __func__);
- pnfs_layoutreturn_free_lsegs(lo, &lrp->args.range,
- be32_to_cpu(lrp->args.stateid.seqid),
+ pnfs_layoutreturn_free_lsegs(lo, &lrp->args.stateid, &lrp->args.range,
lrp->res.lrs_present ? &lrp->res.stateid : NULL);
nfs4_sequence_free_slot(&lrp->res.seq_res);
pnfs_put_layout_hdr(lrp->args.layout);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 555151b6d31b..471018f27c8d 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -961,20 +961,26 @@ static void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo)
}
void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo,
+ const nfs4_stateid *arg_stateid,
const struct pnfs_layout_range *range,
- u32 seq,
const nfs4_stateid *stateid)
{
struct inode *inode = lo->plh_inode;
LIST_HEAD(freeme);
spin_lock(&inode->i_lock);
+ if (!pnfs_layout_is_valid(lo) || !arg_stateid ||
+ !nfs4_stateid_match_other(&lo->plh_stateid, arg_stateid))
+ goto out_unlock;
if (stateid) {
+ u32 seq = be32_to_cpu(arg_stateid->seqid);
+
pnfs_mark_matching_lsegs_invalid(lo, &freeme, range, seq);
pnfs_free_returned_lsegs(lo, &freeme, range, seq);
pnfs_set_layout_stateid(lo, stateid, true);
} else
pnfs_mark_layout_stateid_invalid(lo, &freeme);
+out_unlock:
pnfs_clear_layoutreturn_waitbit(lo);
spin_unlock(&inode->i_lock);
pnfs_free_lseg_list(&freeme);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index a382710edf40..bc9a3aa31d3c 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -295,8 +295,8 @@ struct pnfs_layout_segment *pnfs_update_layout(struct inode *ino,
bool strict_iomode,
gfp_t gfp_flags);
void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo,
+ const nfs4_stateid *arg_stateid,
const struct pnfs_layout_range *range,
- u32 seq,
const nfs4_stateid *stateid);
void pnfs_generic_layout_insert_lseg(struct pnfs_layout_hdr *lo,
--
2.9.3
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index a64d1e40dba9..330f3a012f8e 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1190,7 +1190,8 @@ bool pnfs_roc(struct inode *ino)
spin_lock(&ino->i_lock);
lo = nfsi->layout;
- if (!lo || test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
+ if (!lo || !pnfs_layout_is_valid(lo) ||
+ test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
goto out_noroc;
/* no roc if we hold a delegation */
--
2.9.3
The parameter is already present in the "args" structure.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 4 ++--
fs/nfs/nfs4xdr.c | 8 ++++----
fs/nfs/objlayout/objlayout.c | 4 ++--
fs/nfs/objlayout/objlayout.h | 1 -
fs/nfs/pnfs.h | 3 +--
5 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index a5c38889e7ae..90462a2a9237 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -2012,10 +2012,10 @@ ff_layout_alloc_deviceid_node(struct nfs_server *server,
}
static void
-ff_layout_encode_layoutreturn(struct pnfs_layout_hdr *lo,
- struct xdr_stream *xdr,
+ff_layout_encode_layoutreturn(struct xdr_stream *xdr,
const struct nfs4_layoutreturn_args *args)
{
+ struct pnfs_layout_hdr *lo = args->layout;
struct nfs4_flexfile_layout *flo = FF_LAYOUT_FROM_HDR(lo);
__be32 *start;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index b54931503872..86f72ae605c8 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -2013,6 +2013,7 @@ encode_layoutreturn(struct xdr_stream *xdr,
const struct nfs4_layoutreturn_args *args,
struct compound_hdr *hdr)
{
+ const struct pnfs_layoutdriver_type *lr_ops = NFS_SERVER(args->inode)->pnfs_curr_ld;
__be32 *p;
encode_op_hdr(xdr, OP_LAYOUTRETURN, decode_layoutreturn_maxsz, hdr);
@@ -2027,10 +2028,9 @@ encode_layoutreturn(struct xdr_stream *xdr,
spin_lock(&args->inode->i_lock);
encode_nfs4_stateid(xdr, &args->stateid);
spin_unlock(&args->inode->i_lock);
- if (NFS_SERVER(args->inode)->pnfs_curr_ld->encode_layoutreturn) {
- NFS_SERVER(args->inode)->pnfs_curr_ld->encode_layoutreturn(
- NFS_I(args->inode)->layout, xdr, args);
- } else
+ if (lr_ops->encode_layoutreturn)
+ lr_ops->encode_layoutreturn(xdr, args);
+ else
encode_uint32(xdr, 0);
}
diff --git a/fs/nfs/objlayout/objlayout.c b/fs/nfs/objlayout/objlayout.c
index 919efd4a1a23..2a4cdce939a0 100644
--- a/fs/nfs/objlayout/objlayout.c
+++ b/fs/nfs/objlayout/objlayout.c
@@ -504,10 +504,10 @@ encode_accumulated_error(struct objlayout *objlay, __be32 *p)
}
void
-objlayout_encode_layoutreturn(struct pnfs_layout_hdr *pnfslay,
- struct xdr_stream *xdr,
+objlayout_encode_layoutreturn(struct xdr_stream *xdr,
const struct nfs4_layoutreturn_args *args)
{
+ struct pnfs_layout_hdr *pnfslay = args->layout;
struct objlayout *objlay = OBJLAYOUT(pnfslay);
struct objlayout_io_res *oir, *tmp;
__be32 *start;
diff --git a/fs/nfs/objlayout/objlayout.h b/fs/nfs/objlayout/objlayout.h
index 2641dbad345c..fc94a5872ed4 100644
--- a/fs/nfs/objlayout/objlayout.h
+++ b/fs/nfs/objlayout/objlayout.h
@@ -175,7 +175,6 @@ extern void objlayout_encode_layoutcommit(
const struct nfs4_layoutcommit_args *);
extern void objlayout_encode_layoutreturn(
- struct pnfs_layout_hdr *,
struct xdr_stream *,
const struct nfs4_layoutreturn_args *);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index bc9a3aa31d3c..75ff9392127f 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -172,8 +172,7 @@ struct pnfs_layoutdriver_type {
(struct nfs_server *server, struct pnfs_device *pdev,
gfp_t gfp_flags);
- void (*encode_layoutreturn) (struct pnfs_layout_hdr *layoutid,
- struct xdr_stream *xdr,
+ void (*encode_layoutreturn) (struct xdr_stream *xdr,
const struct nfs4_layoutreturn_args *args);
void (*cleanup_layoutcommit) (struct nfs4_layoutcommit_data *data);
--
2.9.3
The layoutreturn call will take care of invalidating the layout segments
once the call is successful.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 330f3a012f8e..d7b5ad437b14 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1205,22 +1205,28 @@ bool pnfs_roc(struct inode *ino)
goto out_noroc;
}
- /* always send layoutreturn if being marked so */
- if (test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags))
- layoutreturn = pnfs_prepare_layoutreturn(lo,
- &stateid, NULL);
- list_for_each_entry_safe(lseg, tmp, &lo->plh_segs, pls_list)
+ list_for_each_entry_safe(lseg, tmp, &lo->plh_segs, pls_list) {
/* If we are sending layoutreturn, invalidate all valid lsegs */
- if (layoutreturn || test_bit(NFS_LSEG_ROC, &lseg->pls_flags)) {
+ if (test_bit(NFS_LSEG_ROC, &lseg->pls_flags)) {
mark_lseg_invalid(lseg, &tmp_list);
found = true;
}
+ }
+
+ /* always send layoutreturn if being marked so */
+ if (test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags)) {
+ layoutreturn = pnfs_prepare_layoutreturn(lo,
+ &stateid, NULL);
+ if (layoutreturn)
+ goto out_noroc;
+ }
+
/* ROC in two conditions:
* 1. there are ROC lsegs
* 2. we don't send layoutreturn
*/
- if (found && !layoutreturn) {
+ if (found) {
/* lo ref dropped in pnfs_roc_release() */
pnfs_get_layout_hdr(lo);
roc = true;
--
2.9.3
Add XDR encoding for the layoutreturn op, and storage for the layoutreturn
arguments to the CLOSE compound.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 49 ++++++++++++++++++++++++++++++++++++++++---------
fs/nfs/nfs4xdr.c | 26 ++++++++++++++++++++++++++
include/linux/nfs_xdr.h | 3 +++
3 files changed, 69 insertions(+), 9 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 87972cfa62bc..9b9b0eabef9b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3035,10 +3035,14 @@ struct nfs4_closedata {
struct nfs4_state *state;
struct nfs_closeargs arg;
struct nfs_closeres res;
+ struct {
+ struct nfs4_layoutreturn_args arg;
+ struct nfs4_layoutreturn_res res;
+ u32 roc_barrier;
+ bool roc;
+ } lr;
struct nfs_fattr fattr;
unsigned long timestamp;
- bool roc;
- u32 roc_barrier;
};
static void nfs4_free_closedata(void *data)
@@ -3047,7 +3051,7 @@ static void nfs4_free_closedata(void *data)
struct nfs4_state_owner *sp = calldata->state->owner;
struct super_block *sb = calldata->state->inode->i_sb;
- if (calldata->roc)
+ if (calldata->lr.roc)
pnfs_roc_release(calldata->state->inode);
nfs4_put_open_state(calldata->state);
nfs_free_seqid(calldata->arg.seqid);
@@ -3067,15 +3071,41 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
if (!nfs4_sequence_done(task, &calldata->res.seq_res))
return;
trace_nfs4_close(state, &calldata->arg, &calldata->res, task->tk_status);
+
+ /* Handle Layoutreturn errors */
+ if (calldata->arg.lr_args && task->tk_status != 0) {
+ switch (calldata->res.lr_ret) {
+ default:
+ calldata->res.lr_ret = -NFS4ERR_NOMATCHING_LAYOUT;
+ break;
+ case 0:
+ calldata->arg.lr_args = NULL;
+ calldata->res.lr_res = NULL;
+ break;
+ case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_DELEG_REVOKED:
+ case -NFS4ERR_EXPIRED:
+ case -NFS4ERR_BAD_STATEID:
+ case -NFS4ERR_OLD_STATEID:
+ case -NFS4ERR_UNKNOWN_LAYOUTTYPE:
+ case -NFS4ERR_WRONG_CRED:
+ calldata->arg.lr_args = NULL;
+ calldata->res.lr_res = NULL;
+ calldata->res.lr_ret = 0;
+ rpc_restart_call_prepare(task);
+ return;
+ }
+ }
+
/* hmm. we are done with the inode, and in the process of freeing
* the state_owner. we keep this around to process errors
*/
switch (task->tk_status) {
case 0:
res_stateid = &calldata->res.stateid;
- if (calldata->roc)
+ if (calldata->lr.roc)
pnfs_roc_set_barrier(state->inode,
- calldata->roc_barrier);
+ calldata->lr.roc_barrier);
renew_lease(server, calldata->timestamp);
break;
case -NFS4ERR_ADMIN_REVOKED:
@@ -3151,7 +3181,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
goto out_no_action;
}
- if (nfs4_wait_on_layoutreturn(inode, task)) {
+ if (!calldata->arg.lr_args && nfs4_wait_on_layoutreturn(inode, task)) {
nfs_release_seqid(calldata->arg.seqid);
goto out_wait;
}
@@ -3165,8 +3195,8 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
else
calldata->arg.bitmask = NULL;
}
- if (calldata->roc)
- pnfs_roc_get_barrier(inode, &calldata->roc_barrier);
+ if (calldata->lr.roc)
+ pnfs_roc_get_barrier(inode, &calldata->lr.roc_barrier);
calldata->arg.share_access =
nfs4_map_atomic_open_share(NFS_SERVER(inode),
@@ -3250,7 +3280,8 @@ int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait)
calldata->res.fattr = &calldata->fattr;
calldata->res.seqid = calldata->arg.seqid;
calldata->res.server = server;
- calldata->roc = nfs4_roc(state->inode);
+ calldata->res.lr_ret = -NFS4ERR_NOMATCHING_LAYOUT;
+ calldata->lr.roc = nfs4_roc(state->inode);
nfs_sb_active(calldata->inode->i_sb);
msg.rpc_argp = &calldata->arg;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 0a82b3fb2d27..73d2a68f0698 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -415,6 +415,8 @@ static int nfs4_stat_to_errno(int);
#else /* CONFIG_NFS_V4_1 */
#define encode_sequence_maxsz 0
#define decode_sequence_maxsz 0
+#define encode_layoutreturn_maxsz 0
+#define decode_layoutreturn_maxsz 0
#endif /* CONFIG_NFS_V4_1 */
#define NFS4_enc_compound_sz (1024) /* XXX: large enough? */
@@ -508,11 +510,13 @@ static int nfs4_stat_to_errno(int);
#define NFS4_enc_close_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
+ encode_layoutreturn_maxsz + \
encode_close_maxsz + \
encode_getattr_maxsz)
#define NFS4_dec_close_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
+ decode_layoutreturn_maxsz + \
decode_close_maxsz + \
decode_getattr_maxsz)
#define NFS4_enc_setattr_sz (compound_encode_hdr_maxsz + \
@@ -2061,6 +2065,13 @@ static void encode_free_stateid(struct xdr_stream *xdr,
encode_op_hdr(xdr, OP_FREE_STATEID, decode_free_stateid_maxsz, hdr);
encode_nfs4_stateid(xdr, &args->stateid);
}
+#else
+static inline void
+encode_layoutreturn(struct xdr_stream *xdr,
+ const struct nfs4_layoutreturn_args *args,
+ struct compound_hdr *hdr)
+{
+}
#endif /* CONFIG_NFS_V4_1 */
/*
@@ -2248,6 +2259,8 @@ static void nfs4_xdr_enc_close(struct rpc_rqst *req, struct xdr_stream *xdr,
encode_compound_hdr(xdr, req, &hdr);
encode_sequence(xdr, &args->seq_args, &hdr);
encode_putfh(xdr, args->fh, &hdr);
+ if (args->lr_args)
+ encode_layoutreturn(xdr, args->lr_args, &hdr);
encode_close(xdr, args, &hdr);
if (args->bitmask != NULL)
encode_getfattr(xdr, args->bitmask, &hdr);
@@ -6088,6 +6101,13 @@ static int decode_free_stateid(struct xdr_stream *xdr,
res->status = decode_op_hdr(xdr, OP_FREE_STATEID);
return res->status;
}
+#else
+static inline
+int decode_layoutreturn(struct xdr_stream *xdr,
+ struct nfs4_layoutreturn_res *res)
+{
+ return 0;
+}
#endif /* CONFIG_NFS_V4_1 */
/*
@@ -6440,6 +6460,12 @@ static int nfs4_xdr_dec_close(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
status = decode_putfh(xdr);
if (status)
goto out;
+ if (res->lr_res) {
+ status = decode_layoutreturn(xdr, res->lr_res);
+ res->lr_ret = status;
+ if (status)
+ goto out;
+ }
status = decode_close(xdr, res);
if (status != 0)
goto out;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index beb1e10f446e..44ed64bb66ae 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -469,6 +469,7 @@ struct nfs_closeargs {
fmode_t fmode;
u32 share_access;
const u32 * bitmask;
+ struct nfs4_layoutreturn_args *lr_args;
};
struct nfs_closeres {
@@ -477,6 +478,8 @@ struct nfs_closeres {
struct nfs_fattr * fattr;
struct nfs_seqid * seqid;
const struct nfs_server *server;
+ struct nfs4_layoutreturn_res *lr_res;
+ int lr_ret;
};
/*
* * Arguments to the lock,lockt, and locku call.
--
2.9.3
Add XDR encoding for the layoutreturn op, and storage for the layoutreturn
arguments to the DELEGRETURN compound.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 50 ++++++++++++++++++++++++++++++++++++++++---------
fs/nfs/nfs4xdr.c | 10 ++++++++++
include/linux/nfs_xdr.h | 3 +++
3 files changed, 54 insertions(+), 9 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9b9b0eabef9b..765af663257a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5608,11 +5608,15 @@ struct nfs4_delegreturndata {
struct nfs_fh fh;
nfs4_stateid stateid;
unsigned long timestamp;
+ struct {
+ struct nfs4_layoutreturn_args arg;
+ struct nfs4_layoutreturn_res res;
+ u32 roc_barrier;
+ bool roc;
+ } lr;
struct nfs_fattr fattr;
int rpc_status;
struct inode *inode;
- bool roc;
- u32 roc_barrier;
};
static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
@@ -5623,6 +5627,32 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
return;
trace_nfs4_delegreturn_exit(&data->args, &data->res, task->tk_status);
+
+ /* Handle Layoutreturn errors */
+ if (data->args.lr_args && task->tk_status != 0) {
+ switch(data->res.lr_ret) {
+ default:
+ data->res.lr_ret = -NFS4ERR_NOMATCHING_LAYOUT;
+ break;
+ case 0:
+ data->args.lr_args = NULL;
+ data->res.lr_res = NULL;
+ break;
+ case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_DELEG_REVOKED:
+ case -NFS4ERR_EXPIRED:
+ case -NFS4ERR_BAD_STATEID:
+ case -NFS4ERR_OLD_STATEID:
+ case -NFS4ERR_UNKNOWN_LAYOUTTYPE:
+ case -NFS4ERR_WRONG_CRED:
+ data->args.lr_args = NULL;
+ data->res.lr_res = NULL;
+ data->res.lr_ret = 0;
+ rpc_restart_call_prepare(task);
+ return;
+ }
+ }
+
switch (task->tk_status) {
case 0:
renew_lease(data->res.server, data->timestamp);
@@ -5646,8 +5676,8 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
}
}
data->rpc_status = task->tk_status;
- if (data->roc && data->rpc_status == 0)
- pnfs_roc_set_barrier(data->inode, data->roc_barrier);
+ if (data->lr.roc && data->rpc_status == 0)
+ pnfs_roc_set_barrier(data->inode, data->lr.roc_barrier);
}
static void nfs4_delegreturn_release(void *calldata)
@@ -5656,7 +5686,7 @@ static void nfs4_delegreturn_release(void *calldata)
struct inode *inode = data->inode;
if (inode) {
- if (data->roc)
+ if (data->lr.roc)
pnfs_roc_release(inode);
nfs_iput_and_deactive(inode);
}
@@ -5669,11 +5699,12 @@ static void nfs4_delegreturn_prepare(struct rpc_task *task, void *data)
d_data = (struct nfs4_delegreturndata *)data;
- if (nfs4_wait_on_layoutreturn(d_data->inode, task))
+ if (!d_data->args.lr_args &&
+ nfs4_wait_on_layoutreturn(d_data->inode, task))
return;
- if (d_data->roc)
- pnfs_roc_get_barrier(d_data->inode, &d_data->roc_barrier);
+ if (d_data->lr.roc)
+ pnfs_roc_get_barrier(d_data->inode, &d_data->lr.roc_barrier);
nfs4_setup_sequence(d_data->res.server,
&d_data->args.seq_args,
@@ -5720,12 +5751,13 @@ static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co
nfs4_stateid_copy(&data->stateid, stateid);
data->res.fattr = &data->fattr;
data->res.server = server;
+ data->res.lr_ret = -NFS4ERR_NOMATCHING_LAYOUT;
nfs_fattr_init(data->res.fattr);
data->timestamp = jiffies;
data->rpc_status = 0;
data->inode = nfs_igrab_and_active(inode);
if (data->inode)
- data->roc = nfs4_roc(inode);
+ data->lr.roc = nfs4_roc(inode);
task_setup_data.callback_data = data;
msg.rpc_argp = &data->args;
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 73d2a68f0698..84fdcf27138c 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -710,11 +710,13 @@ static int nfs4_stat_to_errno(int);
#define NFS4_enc_delegreturn_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
+ encode_layoutreturn_maxsz + \
encode_delegreturn_maxsz + \
encode_getattr_maxsz)
#define NFS4_dec_delegreturn_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
+ decode_layoutreturn_maxsz + \
decode_delegreturn_maxsz + \
decode_getattr_maxsz)
#define NFS4_enc_getacl_sz (compound_encode_hdr_maxsz + \
@@ -2683,6 +2685,8 @@ static void nfs4_xdr_enc_delegreturn(struct rpc_rqst *req,
encode_compound_hdr(xdr, req, &hdr);
encode_sequence(xdr, &args->seq_args, &hdr);
encode_putfh(xdr, args->fhandle, &hdr);
+ if (args->lr_args)
+ encode_layoutreturn(xdr, args->lr_args, &hdr);
encode_getfattr(xdr, args->bitmask, &hdr);
encode_delegreturn(xdr, args->stateid, &hdr);
encode_nops(&hdr);
@@ -6942,6 +6946,12 @@ static int nfs4_xdr_dec_delegreturn(struct rpc_rqst *rqstp,
status = decode_putfh(xdr);
if (status != 0)
goto out;
+ if (res->lr_res) {
+ status = decode_layoutreturn(xdr, res->lr_res);
+ res->lr_ret = status;
+ if (status)
+ goto out;
+ }
status = decode_getfattr(xdr, res->fattr, res->server);
if (status != 0)
goto out;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 44ed64bb66ae..bfbd0cace91b 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -552,12 +552,15 @@ struct nfs4_delegreturnargs {
const struct nfs_fh *fhandle;
const nfs4_stateid *stateid;
const u32 * bitmask;
+ struct nfs4_layoutreturn_args *lr_args;
};
struct nfs4_delegreturnres {
struct nfs4_sequence_res seq_res;
struct nfs_fattr * fattr;
struct nfs_server *server;
+ struct nfs4_layoutreturn_res *lr_res;
+ int lr_ret;
};
/*
--
2.9.3
We need to account for the reply to the PUTFH operation in the
DELEGRETURN compound.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4xdr.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 86f72ae605c8..0a82b3fb2d27 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -710,6 +710,7 @@ static int nfs4_stat_to_errno(int);
encode_getattr_maxsz)
#define NFS4_dec_delegreturn_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
+ decode_putfh_maxsz + \
decode_delegreturn_maxsz + \
decode_getattr_maxsz)
#define NFS4_enc_getacl_sz (compound_encode_hdr_maxsz + \
--
2.9.3
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index d7b5ad437b14..a93afdd37203 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1014,6 +1014,23 @@ pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo,
return true;
}
+static void
+pnfs_init_layoutreturn_args(struct nfs4_layoutreturn_args *args,
+ struct pnfs_layout_hdr *lo,
+ const nfs4_stateid *stateid,
+ enum pnfs_iomode iomode)
+{
+ struct inode *inode = lo->plh_inode;
+
+ args->layout_type = NFS_SERVER(inode)->pnfs_curr_ld->id;
+ args->inode = inode;
+ args->range.iomode = iomode;
+ args->range.offset = 0;
+ args->range.length = NFS4_MAX_UINT64;
+ args->layout = lo;
+ nfs4_stateid_copy(&args->stateid, stateid);
+}
+
static int
pnfs_send_layoutreturn(struct pnfs_layout_hdr *lo, const nfs4_stateid *stateid,
enum pnfs_iomode iomode, bool sync)
@@ -1032,13 +1049,7 @@ pnfs_send_layoutreturn(struct pnfs_layout_hdr *lo, const nfs4_stateid *stateid,
goto out;
}
- nfs4_stateid_copy(&lrp->args.stateid, stateid);
- lrp->args.layout_type = NFS_SERVER(ino)->pnfs_curr_ld->id;
- lrp->args.inode = ino;
- lrp->args.range.iomode = iomode;
- lrp->args.range.offset = 0;
- lrp->args.range.length = NFS4_MAX_UINT64;
- lrp->args.layout = lo;
+ pnfs_init_layoutreturn_args(&lrp->args, lo, stateid, iomode);
lrp->clp = NFS_SERVER(ino)->nfs_client;
lrp->cred = lo->plh_lc_cred;
--
2.9.3
If we cannot grab the inode or superblock, then we cannot pin the
layout header, and so we cannot send a layoutreturn as part of an
async delegreturn call. In this case, we currently end up sending
an extra layoutreturn after the delegreturn. Since the layout was
implicitly returned by the delegreturn, that just gets a BAD_STATEID.
The fix is to simply complete the return-on-close immediately.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 221d97de0e2c..5593b088c561 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5744,14 +5744,16 @@ static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co
nfs_fattr_init(data->res.fattr);
data->timestamp = jiffies;
data->rpc_status = 0;
+ data->lr.roc = pnfs_roc(inode, &data->lr.arg, &data->lr.res, cred);
data->inode = nfs_igrab_and_active(inode);
if (data->inode) {
- data->lr.roc = pnfs_roc(inode, &data->lr.arg, &data->lr.res,
- cred);
if (data->lr.roc) {
data->args.lr_args = &data->lr.arg;
data->res.lr_res = &data->lr.res;
}
+ } else if (data->lr.roc) {
+ pnfs_roc_release(&data->lr.arg, &data->lr.res, 0);
+ data->lr.roc = false;
}
task_setup_data.callback_data = data;
--
2.9.3
When the layout state is invalidated, then so is the layout segment
state, and hence we do need to clean up the state bits.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index f61cb81eb5ab..76b18518881e 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -58,6 +58,8 @@ static void pnfs_free_returned_lsegs(struct pnfs_layout_hdr *lo,
struct list_head *free_me,
const struct pnfs_layout_range *range,
u32 seq);
+static bool pnfs_lseg_dec_and_remove_zero(struct pnfs_layout_segment *lseg,
+ struct list_head *tmp_list);
/* Return the registered pnfs layout driver module matching given id */
static struct pnfs_layoutdriver_type *
@@ -311,6 +313,18 @@ pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
}
+static void
+pnfs_clear_lseg_state(struct pnfs_layout_segment *lseg,
+ struct list_head *free_me)
+{
+ clear_bit(NFS_LSEG_ROC, &lseg->pls_flags);
+ clear_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags);
+ if (test_and_clear_bit(NFS_LSEG_VALID, &lseg->pls_flags))
+ pnfs_lseg_dec_and_remove_zero(lseg, free_me);
+ if (test_and_clear_bit(NFS_LSEG_LAYOUTCOMMIT, &lseg->pls_flags))
+ pnfs_lseg_dec_and_remove_zero(lseg, free_me);
+}
+
/*
* Mark a pnfs_layout_hdr and all associated layout segments as invalid
*
@@ -327,11 +341,14 @@ pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
.offset = 0,
.length = NFS4_MAX_UINT64,
};
+ struct pnfs_layout_segment *lseg, *next;
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
pnfs_clear_layoutreturn_info(lo);
+ list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list)
+ pnfs_clear_lseg_state(lseg, lseg_list);
pnfs_free_returned_lsegs(lo, lseg_list, &range, 0);
- return pnfs_mark_matching_lsegs_invalid(lo, lseg_list, &range, 0);
+ return !list_empty(&lo->plh_segs);
}
static int
--
2.9.3
Amend the pnfs return on close helper functions to enable sending the
layoutreturn op in CLOSE/DELEGRETURN. This closes a potential race between
CLOSE/DELEGRETURN and parallel OPEN calls to the same file, and allows the
client and the server to agree on whether or not there is an outstanding
layout.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 45 ++++++++----------
fs/nfs/pnfs.c | 139 ++++++++++++++++++++++++------------------------------
fs/nfs/pnfs.h | 30 ++++++------
3 files changed, 96 insertions(+), 118 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 765af663257a..221d97de0e2c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3052,7 +3052,8 @@ static void nfs4_free_closedata(void *data)
struct super_block *sb = calldata->state->inode->i_sb;
if (calldata->lr.roc)
- pnfs_roc_release(calldata->state->inode);
+ pnfs_roc_release(&calldata->lr.arg, &calldata->lr.res,
+ calldata->res.lr_ret);
nfs4_put_open_state(calldata->state);
nfs_free_seqid(calldata->arg.seqid);
nfs4_put_state_owner(sp);
@@ -3103,9 +3104,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
switch (task->tk_status) {
case 0:
res_stateid = &calldata->res.stateid;
- if (calldata->lr.roc)
- pnfs_roc_set_barrier(state->inode,
- calldata->lr.roc_barrier);
renew_lease(server, calldata->timestamp);
break;
case -NFS4ERR_ADMIN_REVOKED:
@@ -3181,7 +3179,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
goto out_no_action;
}
- if (!calldata->arg.lr_args && nfs4_wait_on_layoutreturn(inode, task)) {
+ if (!calldata->lr.roc && nfs4_wait_on_layoutreturn(inode, task)) {
nfs_release_seqid(calldata->arg.seqid);
goto out_wait;
}
@@ -3195,8 +3193,6 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
else
calldata->arg.bitmask = NULL;
}
- if (calldata->lr.roc)
- pnfs_roc_get_barrier(inode, &calldata->lr.roc_barrier);
calldata->arg.share_access =
nfs4_map_atomic_open_share(NFS_SERVER(inode),
@@ -3223,13 +3219,6 @@ static const struct rpc_call_ops nfs4_close_ops = {
.rpc_release = nfs4_free_closedata,
};
-static bool nfs4_roc(struct inode *inode)
-{
- if (!nfs_have_layout(inode))
- return false;
- return pnfs_roc(inode);
-}
-
/*
* It is possible for data to be read/written from a mem-mapped file
* after the sys_close call (which hits the vfs layer as a flush).
@@ -3281,7 +3270,12 @@ int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait)
calldata->res.seqid = calldata->arg.seqid;
calldata->res.server = server;
calldata->res.lr_ret = -NFS4ERR_NOMATCHING_LAYOUT;
- calldata->lr.roc = nfs4_roc(state->inode);
+ calldata->lr.roc = pnfs_roc(state->inode,
+ &calldata->lr.arg, &calldata->lr.res, msg.rpc_cred);
+ if (calldata->lr.roc) {
+ calldata->arg.lr_args = &calldata->lr.arg;
+ calldata->res.lr_res = &calldata->lr.res;
+ }
nfs_sb_active(calldata->inode->i_sb);
msg.rpc_argp = &calldata->arg;
@@ -5676,8 +5670,6 @@ static void nfs4_delegreturn_done(struct rpc_task *task, void *calldata)
}
}
data->rpc_status = task->tk_status;
- if (data->lr.roc && data->rpc_status == 0)
- pnfs_roc_set_barrier(data->inode, data->lr.roc_barrier);
}
static void nfs4_delegreturn_release(void *calldata)
@@ -5687,7 +5679,8 @@ static void nfs4_delegreturn_release(void *calldata)
if (inode) {
if (data->lr.roc)
- pnfs_roc_release(inode);
+ pnfs_roc_release(&data->lr.arg, &data->lr.res,
+ data->res.lr_ret);
nfs_iput_and_deactive(inode);
}
kfree(calldata);
@@ -5699,13 +5692,9 @@ static void nfs4_delegreturn_prepare(struct rpc_task *task, void *data)
d_data = (struct nfs4_delegreturndata *)data;
- if (!d_data->args.lr_args &&
- nfs4_wait_on_layoutreturn(d_data->inode, task))
+ if (!d_data->lr.roc && nfs4_wait_on_layoutreturn(d_data->inode, task))
return;
- if (d_data->lr.roc)
- pnfs_roc_get_barrier(d_data->inode, &d_data->lr.roc_barrier);
-
nfs4_setup_sequence(d_data->res.server,
&d_data->args.seq_args,
&d_data->res.seq_res,
@@ -5756,8 +5745,14 @@ static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co
data->timestamp = jiffies;
data->rpc_status = 0;
data->inode = nfs_igrab_and_active(inode);
- if (data->inode)
- data->lr.roc = nfs4_roc(inode);
+ if (data->inode) {
+ data->lr.roc = pnfs_roc(inode, &data->lr.arg, &data->lr.res,
+ cred);
+ if (data->lr.roc) {
+ data->args.lr_args = &data->lr.arg;
+ data->res.lr_res = &data->lr.res;
+ }
+ }
task_setup_data.callback_data = data;
msg.rpc_argp = &data->args;
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index a93afdd37203..f61cb81eb5ab 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -984,6 +984,20 @@ void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo,
}
+static void
+pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
+ u32 seq)
+{
+ if (lo->plh_return_iomode != 0 && lo->plh_return_iomode != iomode)
+ iomode = IOMODE_ANY;
+ lo->plh_return_iomode = iomode;
+ set_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
+ if (seq != 0) {
+ WARN_ON_ONCE(lo->plh_return_seq != 0 && lo->plh_return_seq != seq);
+ lo->plh_return_seq = seq;
+ }
+}
+
static bool
pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo,
nfs4_stateid *stateid,
@@ -1188,17 +1202,22 @@ pnfs_commit_and_return_layout(struct inode *inode)
return ret;
}
-bool pnfs_roc(struct inode *ino)
+bool pnfs_roc(struct inode *ino,
+ struct nfs4_layoutreturn_args *args,
+ struct nfs4_layoutreturn_res *res,
+ const struct rpc_cred *cred)
{
struct nfs_inode *nfsi = NFS_I(ino);
struct nfs_open_context *ctx;
struct nfs4_state *state;
struct pnfs_layout_hdr *lo;
- struct pnfs_layout_segment *lseg, *tmp;
+ struct pnfs_layout_segment *lseg, *next;
nfs4_stateid stateid;
- LIST_HEAD(tmp_list);
- bool found = false, layoutreturn = false, roc = false;
+ enum pnfs_iomode iomode = 0;
+ bool layoutreturn = false, roc = false;
+ if (!nfs_have_layout(ino))
+ return false;
spin_lock(&ino->i_lock);
lo = nfsi->layout;
if (!lo || !pnfs_layout_is_valid(lo) ||
@@ -1217,83 +1236,63 @@ bool pnfs_roc(struct inode *ino)
}
- list_for_each_entry_safe(lseg, tmp, &lo->plh_segs, pls_list) {
+ list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list) {
/* If we are sending layoutreturn, invalidate all valid lsegs */
- if (test_bit(NFS_LSEG_ROC, &lseg->pls_flags)) {
- mark_lseg_invalid(lseg, &tmp_list);
- found = true;
- }
+ if (!test_and_clear_bit(NFS_LSEG_ROC, &lseg->pls_flags))
+ continue;
+ /*
+ * Note: mark lseg for return so pnfs_layout_remove_lseg
+ * doesn't invalidate the layout for us.
+ */
+ set_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags);
+ if (!mark_lseg_invalid(lseg, &lo->plh_return_segs))
+ continue;
+ pnfs_set_plh_return_info(lo, lseg->pls_range.iomode, 0);
}
- /* always send layoutreturn if being marked so */
- if (test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags)) {
- layoutreturn = pnfs_prepare_layoutreturn(lo,
- &stateid, NULL);
- if (layoutreturn)
- goto out_noroc;
- }
+ if (!test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags))
+ goto out_noroc;
/* ROC in two conditions:
* 1. there are ROC lsegs
* 2. we don't send layoutreturn
*/
- if (found) {
- /* lo ref dropped in pnfs_roc_release() */
- pnfs_get_layout_hdr(lo);
- roc = true;
- }
+ /* lo ref dropped in pnfs_roc_release() */
+ layoutreturn = pnfs_prepare_layoutreturn(lo, &stateid, &iomode);
+ /* If the creds don't match, we can't compound the layoutreturn */
+ if (!layoutreturn || cred != lo->plh_lc_cred)
+ goto out_noroc;
+
+ roc = layoutreturn;
+ pnfs_init_layoutreturn_args(args, lo, &stateid, iomode);
+ res->lrs_present = 0;
+ layoutreturn = false;
out_noroc:
spin_unlock(&ino->i_lock);
- pnfs_free_lseg_list(&tmp_list);
pnfs_layoutcommit_inode(ino, true);
if (layoutreturn)
- pnfs_send_layoutreturn(lo, &stateid, IOMODE_ANY, true);
+ pnfs_send_layoutreturn(lo, &stateid, iomode, true);
return roc;
}
-void pnfs_roc_release(struct inode *ino)
+void pnfs_roc_release(struct nfs4_layoutreturn_args *args,
+ struct nfs4_layoutreturn_res *res,
+ int ret)
{
- struct pnfs_layout_hdr *lo;
+ struct pnfs_layout_hdr *lo = args->layout;
+ const nfs4_stateid *arg_stateid = NULL;
+ const nfs4_stateid *res_stateid = NULL;
- spin_lock(&ino->i_lock);
- lo = NFS_I(ino)->layout;
- pnfs_clear_layoutreturn_waitbit(lo);
- if (atomic_dec_and_test(&lo->plh_refcount)) {
- pnfs_detach_layout_hdr(lo);
- spin_unlock(&ino->i_lock);
- pnfs_free_layout_hdr(lo);
- } else
- spin_unlock(&ino->i_lock);
-}
-
-void pnfs_roc_set_barrier(struct inode *ino, u32 barrier)
-{
- struct pnfs_layout_hdr *lo;
-
- spin_lock(&ino->i_lock);
- lo = NFS_I(ino)->layout;
- if (pnfs_seqid_is_newer(barrier, lo->plh_barrier))
- lo->plh_barrier = barrier;
- spin_unlock(&ino->i_lock);
- trace_nfs4_layoutreturn_on_close(ino, 0);
-}
-
-void pnfs_roc_get_barrier(struct inode *ino, u32 *barrier)
-{
- struct nfs_inode *nfsi = NFS_I(ino);
- struct pnfs_layout_hdr *lo;
- u32 current_seqid;
-
- spin_lock(&ino->i_lock);
- lo = nfsi->layout;
- current_seqid = be32_to_cpu(lo->plh_stateid.seqid);
-
- /* Since close does not return a layout stateid for use as
- * a barrier, we choose the worst-case barrier.
- */
- *barrier = current_seqid + atomic_read(&lo->plh_outstanding);
- spin_unlock(&ino->i_lock);
+ if (ret == 0) {
+ arg_stateid = &args->stateid;
+ if (res->lrs_present)
+ res_stateid = &res->stateid;
+ }
+ pnfs_layoutreturn_free_lsegs(lo, arg_stateid, &args->range,
+ res_stateid);
+ pnfs_put_layout_hdr(lo);
+ trace_nfs4_layoutreturn_on_close(args->inode, 0);
}
bool pnfs_wait_on_layoutreturn(struct inode *ino, struct rpc_task *task)
@@ -1931,20 +1930,6 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
return ERR_PTR(-EAGAIN);
}
-static void
-pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
- u32 seq)
-{
- if (lo->plh_return_iomode != 0 && lo->plh_return_iomode != iomode)
- iomode = IOMODE_ANY;
- lo->plh_return_iomode = iomode;
- set_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
- if (seq != 0) {
- WARN_ON_ONCE(lo->plh_return_seq != 0 && lo->plh_return_seq != seq);
- lo->plh_return_seq = seq;
- }
-}
-
/**
* pnfs_mark_matching_lsegs_return - Free or return matching layout segments
* @lo: pointer to layout header
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 75ff9392127f..f55c065664e1 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -271,10 +271,13 @@ int pnfs_mark_matching_lsegs_return(struct pnfs_layout_hdr *lo,
u32 seq);
int pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
struct list_head *lseg_list);
-bool pnfs_roc(struct inode *ino);
-void pnfs_roc_release(struct inode *ino);
-void pnfs_roc_set_barrier(struct inode *ino, u32 barrier);
-void pnfs_roc_get_barrier(struct inode *ino, u32 *barrier);
+bool pnfs_roc(struct inode *ino,
+ struct nfs4_layoutreturn_args *args,
+ struct nfs4_layoutreturn_res *res,
+ const struct rpc_cred *cred);
+void pnfs_roc_release(struct nfs4_layoutreturn_args *args,
+ struct nfs4_layoutreturn_res *res,
+ int ret);
bool pnfs_wait_on_layoutreturn(struct inode *ino, struct rpc_task *task);
void pnfs_set_layoutcommit(struct inode *, struct pnfs_layout_segment *, loff_t);
void pnfs_cleanup_layoutcommit(struct nfs4_layoutcommit_data *data);
@@ -666,23 +669,18 @@ pnfs_layoutcommit_outstanding(struct inode *inode)
static inline bool
-pnfs_roc(struct inode *ino)
+pnfs_roc(struct inode *ino,
+ struct nfs4_layoutreturn_args *args,
+ struct nfs4_layoutreturn_res *res,
+ const struct rpc_cred *cred)
{
return false;
}
static inline void
-pnfs_roc_release(struct inode *ino)
-{
-}
-
-static inline void
-pnfs_roc_set_barrier(struct inode *ino, u32 barrier)
-{
-}
-
-static inline void
-pnfs_roc_get_barrier(struct inode *ino, u32 *barrier)
+pnfs_roc_release(struct nfs4_layoutreturn_args *args,
+ struct nfs4_layoutreturn_res *res,
+ int ret)
{
}
--
2.9.3
Ensure that the layout state bits are synced when we cache a layout
segment for layoutreturn using an appropriate call to
pnfs_set_plh_return_info.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 29 +++++++++++++++--------------
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index a3a51860563c..08acfa49f115 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -306,6 +306,20 @@ pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo)
}
static void
+pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
+ u32 seq)
+{
+ if (lo->plh_return_iomode != 0 && lo->plh_return_iomode != iomode)
+ iomode = IOMODE_ANY;
+ lo->plh_return_iomode = iomode;
+ set_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
+ if (seq != 0) {
+ WARN_ON_ONCE(lo->plh_return_seq != 0 && lo->plh_return_seq != seq);
+ lo->plh_return_seq = seq;
+ }
+}
+
+static void
pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
{
lo->plh_return_iomode = 0;
@@ -456,6 +470,7 @@ pnfs_cache_lseg_for_layoutreturn(struct pnfs_layout_hdr *lo,
{
if (test_and_clear_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags) &&
pnfs_layout_is_valid(lo)) {
+ pnfs_set_plh_return_info(lo, lseg->pls_range.iomode, 0);
list_move_tail(&lseg->pls_list, &lo->plh_return_segs);
return true;
}
@@ -1001,20 +1016,6 @@ void pnfs_layoutreturn_free_lsegs(struct pnfs_layout_hdr *lo,
}
-static void
-pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
- u32 seq)
-{
- if (lo->plh_return_iomode != 0 && lo->plh_return_iomode != iomode)
- iomode = IOMODE_ANY;
- lo->plh_return_iomode = iomode;
- set_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
- if (seq != 0) {
- WARN_ON_ONCE(lo->plh_return_seq != 0 && lo->plh_return_seq != seq);
- lo->plh_return_seq = seq;
- }
-}
-
static bool
pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo,
nfs4_stateid *stateid,
--
2.9.3
Address another memory leak.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 08acfa49f115..57ec46b57364 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -455,6 +455,8 @@ pnfs_layout_remove_lseg(struct pnfs_layout_hdr *lo,
list_del_init(&lseg->pls_list);
/* Matched by pnfs_get_layout_hdr in pnfs_layout_insert_lseg */
atomic_dec(&lo->plh_refcount);
+ if (test_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags))
+ return;
if (list_empty(&lo->plh_segs) &&
!test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags) &&
!test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags)) {
--
2.9.3
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 57ec46b57364..550010826bdd 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1245,11 +1245,20 @@ bool pnfs_roc(struct inode *ino,
if (!nfs_have_layout(ino))
return false;
+retry:
spin_lock(&ino->i_lock);
lo = nfsi->layout;
if (!lo || !pnfs_layout_is_valid(lo) ||
test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
goto out_noroc;
+ if (test_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) {
+ pnfs_get_layout_hdr(lo);
+ spin_unlock(&ino->i_lock);
+ wait_on_bit(&lo->plh_flags, NFS_LAYOUT_RETURN,
+ TASK_UNINTERRUPTIBLE);
+ pnfs_put_layout_hdr(lo);
+ goto retry;
+ }
/* no roc if we hold a delegation */
if (nfs4_check_delegation(ino, FMODE_READ))
--
2.9.3
We need to honour the NFS_LAYOUT_RETURN_REQUESTED bit regardless of
whether or not there are layout segments pending.
Furthermore, we should ensure that we leave the plh_return_segs list
empty.
This patch fixes a memory leak of the layout segments on plh_return_segs.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 76b18518881e..a3a51860563c 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1145,7 +1145,7 @@ _pnfs_return_layout(struct inode *ino)
struct nfs_inode *nfsi = NFS_I(ino);
LIST_HEAD(tmp_list);
nfs4_stateid stateid;
- int status = 0, empty;
+ int status = 0;
bool send;
dprintk("NFS: %s for inode %lu\n", __func__, ino->i_ino);
@@ -1159,7 +1159,14 @@ _pnfs_return_layout(struct inode *ino)
}
/* Reference matched in nfs4_layoutreturn_release */
pnfs_get_layout_hdr(lo);
- empty = list_empty(&lo->plh_segs);
+ /* Is there an outstanding layoutreturn ? */
+ if (test_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags)) {
+ spin_unlock(&ino->i_lock);
+ if (wait_on_bit(&lo->plh_flags, NFS_LAYOUT_RETURN,
+ TASK_UNINTERRUPTIBLE))
+ goto out_put_layout_hdr;
+ spin_lock(&ino->i_lock);
+ }
pnfs_clear_layoutcommit(ino, &tmp_list);
pnfs_mark_matching_lsegs_invalid(lo, &tmp_list, NULL, 0);
@@ -1173,7 +1180,7 @@ _pnfs_return_layout(struct inode *ino)
}
/* Don't send a LAYOUTRETURN if list was initially empty */
- if (empty) {
+ if (!test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags)) {
spin_unlock(&ino->i_lock);
dprintk("NFS: %s no layout segments to return\n", __func__);
goto out_put_layout_hdr;
--
2.9.3
If the layout stateid is already invalid, we have no work to do.
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 550010826bdd..0b25a1c820ba 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -750,6 +750,8 @@ pnfs_layout_bulk_destroy_byserver_locked(struct nfs_client *clp,
struct inode *inode;
list_for_each_entry_safe(lo, next, &server->layouts, plh_layouts) {
+ if (test_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags))
+ continue;
inode = igrab(lo->plh_inode);
if (inode == NULL)
continue;
--
2.9.3