These are on top of the current pnfs-submit rebased onto 2.6.37-rc5,
per Benny's suggestion.
More wave2 (CB_LAYOUTRECALL and LAYOUTRETURN) patches, based on
internal review as well as comments from Trond. These implement a completely
forgetful model. The client will *never* send LAYOUTRETURN. We also
don't do any waiting in LAYOUTGET, instead falling back to MDS at the
smallest sign of trouble.
patches 1 - 7 revert previous roc changes, which are unnecessary given
that we are not returning anything
patch 8 reverts a fix already done in mainline in a different way
patch 9 prevents a pointless reordering in the cumulative diff
patch 10 fixes a COMMIT bug tickled by the new code
patches 11 - 20 remove LAYOUTRETURN and fix various review issues
patch 21 just brings an existing post-submit patch into the submit branch
patch 22 turns off LAYOUTCOMMIT. It is a temporary stop gap until
code can be rearranged so the LAYOUTCOMMIT code can be easily pushed
outside of the pnfs-submit branch
Fred
Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfs/nfs4filelayout.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index fb0efda..328c504 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -596,7 +596,8 @@ filelayout_commit(struct nfs_write_data *data, int sync)
}
dprintk("%s: Initiating commit: %llu USE DS:\n",
__func__, file_offset);
- print_ds(ds);
+ ifdebug(FACILITY)
+ print_ds(ds);
/* Send COMMIT to data server */
nfs_initiate_commit(dsdata, clnt, call_ops, sync);
--
1.7.2.1
On Dec 15, 2010, at 10:56 AM, Benny Halevy wrote:
> On 2010-12-15 17:43, Fred Isaman wrote:
>>
>> On Dec 15, 2010, at 10:29 AM, Benny Halevy wrote:
>>
>>> On 2010-12-15 16:11, Fred Isaman wrote:
>>>>
>>>> On Dec 15, 2010, at 8:57 AM, Benny Halevy wrote:
>>>>
>>>>> On 2010-12-10 03:22, Fred Isaman wrote:
>>>>>> It was checking that at least one known bit was set. It needs to check
>>>>>> no unknown bit was set. From RFC5661, section 20.6.3:
>>>>>>
>>>>>> When a bit is set in the type mask that corresponds to an undefined
>>>>>> type of recallable object, NFS4ERR_INVAL MUST be returned.
>>>>>>
>>>>>> Signed-off-by: Fred Isaman <[email protected]>
>>>>>> ---
>>>>>> fs/nfs/callback.h | 1 +
>>>>>> fs/nfs/callback_proc.c | 27 ++++-----------------------
>>>>>> 2 files changed, 5 insertions(+), 23 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
>>>>>> index b16dd1f..616c5c1 100644
>>>>>> --- a/fs/nfs/callback.h
>>>>>> +++ b/fs/nfs/callback.h
>>>>>> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
>>>>>> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
>>>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
>>>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
>>>>>> +#define RCA4_TYPE_MASK_ALL 0xf31f
>>>>>>
>>>>>> struct cb_recallanyargs {
>>>>>> struct sockaddr *craa_addr;
>>>>>> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
>>>>>> index 61b3c66..d4aec46 100644
>>>>>> --- a/fs/nfs/callback_proc.c
>>>>>> +++ b/fs/nfs/callback_proc.c
>>>>>> @@ -661,28 +661,10 @@ out_putclient:
>>>>>> goto out;
>>>>>> }
>>>>>>
>>>>>> -static inline bool
>>>>>> -validate_bitmap_values(const unsigned long *mask)
>>>>>> +static bool
>>>>>> +validate_bitmap_values(unsigned long mask)
>>>>>> {
>>>>>> - int i;
>>>>>> -
>>>>>> - if (*mask == 0)
>>>>>> - return true;
>>>>>> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
>>>>>> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
>>>>>> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
>>>>>> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
>>>>>> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
>>>>>> - return true;
>>>>>> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
>>>>>> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
>>>>>> - if (test_bit(i, mask))
>>>>>> - return true;
>>>>>> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
>>>>>> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
>>>>>> - if (test_bit(i, mask))
>>>>>> - return true;
>>>>>> - return false;
>>>>>> + return mask & ~RCA4_TYPE_MASK_ALL;
>>>>>
>>>>> Hmm, shouldn't that be
>>>>> return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
>>>>>
>>>>> Benny
>>>>>
>>>>
>>>> Yes, you are right.
>>>
>>> OK. This is fixed in my branch to be released asap.
>>
>> Thanks. I have a bunch more minor code cleanups that I'll send once I see what you have, unless you want them immediately. I'll also have a rebase of the pnfs-submit branch with the wave2 patches pushed down to the bottom shortly after I see your branch.
>>
>
> I prefer that you send them now.
OK, I'll send them in a few minutes.
>
>>> I've reverted large parts of this patchset in the post-submit stream
>>> to restore layoutcommit and layoutreturn, but not their embedding in
>>> the CLOSE compound. I also kept the cleanups and bug fixes.
>>> I'll send out the post-submit when it's ready.
>>> Some more work will be required to restore the original patches'
>>> authors and sign-offs.
>>>
>>> This is the list as of now:
>>>
>>> pick af44531 Revert "pnfs-submit: wave2: remove forgotten layoutreturn struct definitions"
>>> pick c465549 Revert "pnfs-submit: Turn off layoutcommits"
>>> pick 0f4ba67 Revert "pnfs-submit: wave2: remove all LAYOUTRETURN code"
>>> pick 486db47 Revert "pnfs-submit: wave2: Remove LAYOUTRETURN from return on close"
>>>
>>> pick 484c935 FIXME: roc should return layout on last close
>>> (This patch just adds a FIXME comment.)
>>>
>>
>> The above looks good.
>>
>>> pick 8698772 Revert "pnfs-submit: wave2: remove cl_layoutrecalls list"
>>> pick 263879b Revert "pnfs-submit: wave2: Pull out all recall initiated LAYOUTRETURNS"
>>> pick 693765f Revert "pnfs-submit: wave2: Don't wait in layoutget"
>>
>> Just note that Trond was highly resistant to the above.
>>
>
> I understand that in the forgetful client model, but eventually returning the layout
> on close will be more efficient overall than having the server poll the client.
>
>>> pick de56e11 Revert "pnfs-submit: wave2: check that partial LAYOUTGET return is ignored"
>>>
>>
>> We need some sort of check that we got what we asked for, given that the xdr code can chop the server's reply.
>
> The post-submit world is supposed to handle partial layouts correctly, as this is required for
> the obj and block layouts so we can't just blindly toss away partial layouts.
>
> If the files layout driver won't support partial layouts we can restrict this (sigh, ugly)
> only for it.
>
> Benny
Note the problem right now is that the xdr decoding function will only parse the first array element, and toss the rest. That is the real reason for the check, and something that needs to be fixed.
Fred
>
>>
>> Fred
>>
>>> Anything else you had in mind?
>>>
>>> Benny
>>>
>>>>
>>>> Fred
>>>>
>>>>>> }
>>>>>>
>>>>>> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>>>> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>>>> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>>>>>>
>>>>>> status = cpu_to_be32(NFS4ERR_INVAL);
>>>>>> - if (!validate_bitmap_values((const unsigned long *)
>>>>>> - &args->craa_type_mask))
>>>>>> + if (!validate_bitmap_values(args->craa_type_mask))
>>>>>> goto out;
>>>>>>
>>>>>> status = cpu_to_be32(NFS4_OK);
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>>> the body of a message to [email protected]
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
This reverts commit 4cce4b96893b2d76a3e65c6dc84dc86425ee718b.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 23 ++++++++++++-----------
fs/nfs/nfs4xdr.c | 27 +++++++++++++++++----------
fs/nfs/pnfs.c | 21 ++++++++++++---------
include/linux/nfs_xdr.h | 19 ++++++++++---------
4 files changed, 51 insertions(+), 39 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 3a1e578..f993a4a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5530,7 +5530,7 @@ static void nfs4_layoutcommit_prepare(struct rpc_task *task, void *data)
{
struct nfs4_layoutcommit_data *ldata =
(struct nfs4_layoutcommit_data *)data;
- struct nfs_server *server = NFS_SERVER(ldata->inode);
+ struct nfs_server *server = NFS_SERVER(ldata->args.inode);
if (nfs4_setup_sequence(server, NULL, &ldata->args.seq_args,
&ldata->res.seq_res, 1, task))
@@ -5543,7 +5543,7 @@ nfs4_layoutcommit_done(struct rpc_task *task, void *calldata)
{
struct nfs4_layoutcommit_data *data =
(struct nfs4_layoutcommit_data *)calldata;
- struct nfs_server *server = NFS_SERVER(data->inode);
+ struct nfs_server *server = NFS_SERVER(data->args.inode);
if (!nfs4_sequence_done(task, &data->res.seq_res))
return;
@@ -5560,8 +5560,8 @@ static void nfs4_layoutcommit_release(void *lcdata)
(struct nfs4_layoutcommit_data *)lcdata;
/* Matched by get_layout in pnfs_layoutcommit_inode */
- put_layout_hdr(data->inode);
- put_rpccred(data->args.cred);
+ put_layout_hdr(data->args.inode);
+ put_rpccred(data->cred);
kfree(lcdata);
}
@@ -5579,11 +5579,11 @@ nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data, int issync)
.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LAYOUTCOMMIT],
.rpc_argp = &data->args,
.rpc_resp = &data->res,
- .rpc_cred = data->args.cred,
+ .rpc_cred = data->cred,
};
struct rpc_task_setup task_setup_data = {
.task = &data->task,
- .rpc_client = NFS_CLIENT(data->inode),
+ .rpc_client = NFS_CLIENT(data->args.inode),
.rpc_message = &msg,
.callback_ops = &nfs4_layoutcommit_ops,
.callback_data = data,
@@ -5592,12 +5592,13 @@ nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data, int issync)
struct rpc_task *task;
int status = 0;
- dprintk("NFS: initiating layoutcommit call. %llu@%llu lbw: %llu "
+ dprintk("NFS: %4d initiating layoutcommit call. %llu@%llu lbw: %llu "
"type: %d issync %d\n",
- data->args.op.range.length,
- data->args.op.range.offset,
- data->args.op.lastbytewritten,
- data->args.op.layout_type, issync);
+ data->task.tk_pid,
+ data->args.range.length,
+ data->args.range.offset,
+ data->args.lastbytewritten,
+ data->args.layout_type, issync);
task = rpc_run_task(&task_setup_data);
if (IS_ERR(task))
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index dc63895..eb5c922 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1868,7 +1868,7 @@ encode_layoutget(struct xdr_stream *xdr,
static int
encode_layoutcommit(struct xdr_stream *xdr,
- const struct nfs4_layoutcommit_op_args *args,
+ const struct nfs4_layoutcommit_args *args,
struct compound_hdr *hdr)
{
__be32 *p;
@@ -1885,7 +1885,14 @@ encode_layoutcommit(struct xdr_stream *xdr,
p = xdr_encode_opaque_fixed(p, args->stateid.data, NFS4_STATEID_SIZE);
*p++ = cpu_to_be32(1); /* newoffset = TRUE */
p = xdr_encode_hyper(p, args->lastbytewritten);
- *p = cpu_to_be32(0); /* nt_timechanged = FALSE */
+ *p = cpu_to_be32(args->time_modify_changed != 0);
+ if (args->time_modify_changed) {
+ p = reserve_space(xdr, 12);
+ *p++ = cpu_to_be32(0);
+ *p++ = cpu_to_be32(args->time_modify.tv_sec);
+ *p = cpu_to_be32(args->time_modify.tv_nsec);
+ }
+
p = reserve_space(xdr, 4);
*p = cpu_to_be32(args->layout_type);
@@ -2812,7 +2819,7 @@ static int nfs4_xdr_enc_layoutcommit(struct rpc_rqst *req, uint32_t *p,
encode_compound_hdr(&xdr, req, &hdr);
encode_sequence(&xdr, &args->seq_args, &hdr);
encode_putfh(&xdr, args->fh, &hdr);
- encode_layoutcommit(&xdr, &args->op, &hdr);
+ encode_layoutcommit(&xdr, args, &hdr);
encode_getfattr(&xdr, args->bitmask, &hdr);
encode_nops(&hdr);
return 0;
@@ -5301,10 +5308,10 @@ out_overflow:
return -EIO;
}
-static int decode_layoutcommit(struct xdr_stream *xdr)
+static int decode_layoutcommit(struct xdr_stream *xdr,
+ struct rpc_rqst *req,
+ struct nfs4_layoutcommit_res *res)
{
- u32 sizechanged;
- u64 newsize;
__be32 *p;
int status;
@@ -5315,13 +5322,13 @@ static int decode_layoutcommit(struct xdr_stream *xdr)
p = xdr_inline_decode(xdr, 4);
if (unlikely(!p))
goto out_overflow;
- sizechanged = be32_to_cpup(p);
+ res->sizechanged = be32_to_cpup(p);
- if (sizechanged) {
+ if (res->sizechanged) {
p = xdr_inline_decode(xdr, 8);
if (unlikely(!p))
goto out_overflow;
- xdr_decode_hyper(p, &newsize);
+ xdr_decode_hyper(p, &res->newsize);
}
return 0;
out_overflow:
@@ -6456,7 +6463,7 @@ static int nfs4_xdr_dec_layoutcommit(struct rpc_rqst *rqstp, uint32_t *p,
status = decode_putfh(&xdr);
if (status)
goto out;
- status = decode_layoutcommit(&xdr);
+ status = decode_layoutcommit(&xdr, rqstp, res);
if (status)
goto out;
decode_getfattr(&xdr, res->fattr, res->server,
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 36955e1..8dbac82 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1203,18 +1203,21 @@ pnfs_layoutcommit_setup(struct inode *inode,
dprintk("--> %s\n", __func__);
- data->inode = inode;
+ data->args.inode = inode;
data->args.fh = NFS_FH(inode);
- data->args.op.layout_type = nfss->pnfs_curr_ld->id;
+ data->args.layout_type = nfss->pnfs_curr_ld->id;
data->res.fattr = &data->fattr;
nfs_fattr_init(&data->fattr);
+ /* TODO: Need to determine the correct values */
+ data->args.time_modify_changed = 0;
+
/* Set values from inode so it can be reset
*/
- data->args.op.range.iomode = IOMODE_RW;
- data->args.op.range.offset = write_begin_pos;
- data->args.op.range.length = write_end_pos - write_begin_pos + 1;
- data->args.op.lastbytewritten = min(write_end_pos,
+ data->args.range.iomode = IOMODE_RW;
+ data->args.range.offset = write_begin_pos;
+ data->args.range.length = write_end_pos - write_begin_pos + 1;
+ data->args.lastbytewritten = min(write_end_pos,
i_size_read(inode) - 1);
data->args.bitmask = nfss->attr_bitmask;
data->res.server = nfss;
@@ -1254,12 +1257,12 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
*/
write_begin_pos = nfsi->layout->write_begin_pos;
write_end_pos = nfsi->layout->write_end_pos;
- data->args.cred = nfsi->layout->cred;
+ data->cred = nfsi->layout->cred;
nfsi->layout->write_begin_pos = 0;
nfsi->layout->write_end_pos = 0;
nfsi->layout->cred = NULL;
__clear_bit(NFS_LAYOUT_NEED_LCOMMIT, &nfsi->layout->plh_flags);
- memcpy(data->args.op.stateid.data, nfsi->layout->stateid.data,
+ memcpy(data->args.stateid.data, nfsi->layout->stateid.data,
NFS4_STATEID_SIZE);
/* Reference for layoutcommit matched in pnfs_layoutcommit_release */
@@ -1272,7 +1275,7 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
write_end_pos);
if (status) {
/* The layout driver failed to setup the layoutcommit */
- put_rpccred(data->args.cred);
+ put_rpccred(data->cred);
put_layout_hdr(inode);
goto out_free;
}
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 1b7364f..ece0b2e 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -225,24 +225,25 @@ struct nfs4_layoutget {
struct pnfs_layout_segment **lsegpp;
};
-struct nfs4_layoutcommit_op_args {
+struct nfs4_layoutcommit_args {
nfs4_stateid stateid;
__u64 lastbytewritten;
+ __u32 time_modify_changed;
+ struct timespec time_modify;
+ const u32 *bitmask;
+ struct nfs_fh *fh;
+ struct inode *inode;
/* Values set by layout driver */
struct pnfs_layout_range range;
__u32 layout_type;
-};
-
-struct nfs4_layoutcommit_args {
- struct nfs4_layoutcommit_op_args op;
- const u32 *bitmask;
- struct nfs_fh *fh;
- struct rpc_cred *cred;
+ void *layoutdriver_data;
struct nfs4_sequence_args seq_args;
};
struct nfs4_layoutcommit_res {
+ __u32 sizechanged;
+ __u64 newsize;
struct nfs_fattr *fattr;
const struct nfs_server *server;
struct nfs4_sequence_res seq_res;
@@ -250,7 +251,7 @@ struct nfs4_layoutcommit_res {
struct nfs4_layoutcommit_data {
struct rpc_task task;
- struct inode *inode;
+ struct rpc_cred *cred;
struct nfs_fattr fattr;
struct nfs4_layoutcommit_args args;
struct nfs4_layoutcommit_res res;
--
1.7.2.1
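The encode_layoutcommit() hunk above reserves 12 bytes for the modification time because XDR lays out an nfstime4 as a 64-bit seconds field followed by a 32-bit nanoseconds field, all big-endian. A minimal userspace sketch of that layout follows; put_be32() and encode_nfstime4() are hypothetical helpers written for illustration, not kernel functions. Unlike the patch, which writes 0 for the high word of seconds, the sketch writes the full 64-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in big-endian (XDR) order. */
static uint8_t *put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
	return p + 4;
}

/*
 * Hypothetical helper: encode an nfstime4 (int64 seconds, uint32
 * nseconds) into buf, which must hold at least 12 bytes.
 * Returns the number of bytes written.
 */
static size_t encode_nfstime4(uint8_t *buf, int64_t sec, uint32_t nsec)
{
	uint8_t *p = buf;
	uint64_t s = (uint64_t)sec;

	p = put_be32(p, (uint32_t)(s >> 32));	/* high word of seconds */
	p = put_be32(p, (uint32_t)s);		/* low word of seconds */
	p = put_be32(p, nsec);			/* nanoseconds */
	return (size_t)(p - buf);
}
```

Writing a constant 0 for the high word gives the same bytes as long as tv_sec is non-negative and fits in 32 bits.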
On 2010-12-10 03:22, Fred Isaman wrote:
> It was checking that at least one known bit was set. It needs to check
> no unknown bit was set. From RFC5661, section 20.6.3:
>
> When a bit is set in the type mask that corresponds to an undefined
> type of recallable object, NFS4ERR_INVAL MUST be returned.
>
> Signed-off-by: Fred Isaman <[email protected]>
> ---
> fs/nfs/callback.h | 1 +
> fs/nfs/callback_proc.c | 27 ++++-----------------------
> 2 files changed, 5 insertions(+), 23 deletions(-)
>
> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
> index b16dd1f..616c5c1 100644
> --- a/fs/nfs/callback.h
> +++ b/fs/nfs/callback.h
> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
> +#define RCA4_TYPE_MASK_ALL 0xf31f
>
> struct cb_recallanyargs {
> struct sockaddr *craa_addr;
> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
> index 61b3c66..d4aec46 100644
> --- a/fs/nfs/callback_proc.c
> +++ b/fs/nfs/callback_proc.c
> @@ -661,28 +661,10 @@ out_putclient:
> goto out;
> }
>
> -static inline bool
> -validate_bitmap_values(const unsigned long *mask)
> +static bool
> +validate_bitmap_values(unsigned long mask)
> {
> - int i;
> -
> - if (*mask == 0)
> - return true;
> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
> - return true;
> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
> - if (test_bit(i, mask))
> - return true;
> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
> - if (test_bit(i, mask))
> - return true;
> - return false;
> + return mask & ~RCA4_TYPE_MASK_ALL;
Hmm, shouldn't that be
return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
Benny
> }
>
> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>
> status = cpu_to_be32(NFS4ERR_INVAL);
> - if (!validate_bitmap_values((const unsigned long *)
> - &args->craa_type_mask))
> + if (!validate_bitmap_values(args->craa_type_mask))
> goto out;
>
> status = cpu_to_be32(NFS4_OK);
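For reference, the corrected check can be sketched in userspace as below. The function name type_mask_is_valid() is a hypothetical stand-in for validate_bitmap_values(); the mask value is the one from the patch. With a bool return type, the line as originally posted evaluates to true exactly when an unknown bit is set, which is the opposite of what the caller in nfs4_callback_recallany() expects.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Union of the known recallable-object type bits: 0-4, 8-9 and 12-15.
 * Same value as RCA4_TYPE_MASK_ALL in the patch.
 */
#define RCA4_TYPE_MASK_ALL 0xf31fu

/* Hypothetical userspace stand-in for validate_bitmap_values(). */
static bool type_mask_is_valid(uint32_t mask)
{
	/* Valid only if no bit outside the known set is present. */
	return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
}
```

An empty mask and any combination of known bits pass; a mask with, say, bit 5 set fails and would be answered with NFS4ERR_INVAL.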
by removing state manager invocation. This also has the advantage
that it avoids a current bug where we don't set an inode on bulk
LAYOUTRETURNs, since we no longer send any.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/callback.h | 5 --
fs/nfs/callback_proc.c | 103 +++++++++++++----------------------------------
fs/nfs/nfs4_fs.h | 1 -
fs/nfs/nfs4state.c | 4 --
4 files changed, 29 insertions(+), 84 deletions(-)
diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index 616c5c1..7f55c7e 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -167,7 +167,6 @@ extern unsigned nfs4_callback_layoutrecall(
extern bool matches_outstanding_recall(struct inode *ino,
struct pnfs_layout_range *range);
extern void notify_drained(struct nfs_client *clp, u64 mask);
-extern void nfs_client_return_layouts(struct nfs_client *clp);
static inline void put_session_client(struct nfs4_session *session)
{
@@ -183,10 +182,6 @@ find_client_from_cps(struct cb_process_state *cps, struct sockaddr *addr)
#else /* CONFIG_NFS_V4_1 */
-static inline void nfs_client_return_layouts(struct nfs_client *clp)
-{
-}
-
static inline struct nfs_client *
find_client_from_cps(struct cb_process_state *cps, struct sockaddr *addr)
{
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 4cd7e84..97e1c96 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -163,88 +163,30 @@ matches_outstanding_recall(struct inode *ino, struct pnfs_layout_range *range)
return rv;
}
-/* Send a synchronous LAYOUTRETURN. By the time this is called, we know
- * all IO has been drained, any matching lsegs deleted, and that no
- * overlapping LAYOUTGETs will be sent or processed for the duration
- * of this call.
- * Note that it is possible that when this is called, the stateid has
- * been invalidated. But will not be cleared, so can still use.
- */
-static int
-pnfs_send_layoutreturn(struct nfs_client *clp,
- struct pnfs_cb_lrecall_info *cb_info)
-{
- struct cb_layoutrecallargs *args = &cb_info->pcl_args;
- struct nfs4_layoutreturn *lrp;
-
- lrp = kzalloc(sizeof(*lrp), GFP_KERNEL);
- if (!lrp)
- return -ENOMEM;
- lrp->args.reclaim = 0;
- lrp->args.layout_type = args->cbl_layout_type;
- lrp->args.return_type = args->cbl_recall_type;
- lrp->clp = clp;
- if (args->cbl_recall_type == RETURN_FILE) {
- lrp->args.range = args->cbl_range;
- lrp->args.inode = cb_info->pcl_ino;
- } else {
- lrp->args.range.iomode = IOMODE_ANY;
- lrp->args.inode = NULL;
- }
- return nfs4_proc_layoutreturn(lrp, true);
-}
-
-/* Called by state manager to finish CB_LAYOUTRECALLS initiated by
- * nfs4_callback_layoutrecall().
- */
-void nfs_client_return_layouts(struct nfs_client *clp)
-{
- struct pnfs_cb_lrecall_info *cb_info;
-
- spin_lock(&clp->cl_lock);
- while (true) {
- if (list_empty(&clp->cl_layoutrecalls)) {
- spin_unlock(&clp->cl_lock);
- break;
- }
- cb_info = list_first_entry(&clp->cl_layoutrecalls,
- struct pnfs_cb_lrecall_info,
- pcl_list);
- spin_unlock(&clp->cl_lock);
- if (atomic_read(&cb_info->pcl_count) != 0)
- break;
- /* What do on error return? These layoutreturns are
- * required by the protocol. So if do not get
- * successful reply, probably have to do something
- * more drastic.
- */
- pnfs_send_layoutreturn(clp, cb_info);
- spin_lock(&clp->cl_lock);
- /* Removing from the list unblocks LAYOUTGETs */
- list_del(&cb_info->pcl_list);
- clp->cl_cb_lrecall_count--;
- clp->cl_drain_notification[1 << cb_info->pcl_notify_bit] = NULL;
- kfree(cb_info);
- }
-}
-
void notify_drained(struct nfs_client *clp, u64 mask)
{
atomic_t **ptr = clp->cl_drain_notification;
- bool done = false;
/* clp lock not needed except to remove used up entries */
/* Should probably use functions defined in bitmap.h */
while (mask) {
- if ((mask & 1) && (atomic_dec_and_test(*ptr)))
- done = true;
+ if ((mask & 1) && (atomic_dec_and_test(*ptr))) {
+ struct pnfs_cb_lrecall_info *cb_info;
+
+ cb_info = container_of(*ptr,
+ struct pnfs_cb_lrecall_info,
+ pcl_count);
+ spin_lock(&clp->cl_lock);
+ /* Removing from the list unblocks LAYOUTGETs */
+ list_del(&cb_info->pcl_list);
+ clp->cl_cb_lrecall_count--;
+ clp->cl_drain_notification[1 << cb_info->pcl_notify_bit] = NULL;
+ spin_unlock(&clp->cl_lock);
+ kfree(cb_info);
+ }
mask >>= 1;
ptr++;
}
- if (done) {
- set_bit(NFS4CLNT_LAYOUT_RECALL, &clp->cl_state);
- nfs4_schedule_state_manager(clp);
- }
}
static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
@@ -269,8 +211,9 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
* does having a layout ref keep ino around?
* It should.
*/
- /* We need to hold the reference until any
- * potential LAYOUTRETURN is finished.
+ /* Without this, layout can be freed as soon
+ * as we release cl_lock. Matched in
+ * do_callback_layoutrecall.
*/
get_layout_hdr(lo);
cb_info->pcl_ino = lo->inode;
@@ -389,6 +332,18 @@ static u32 do_callback_layoutrecall(struct nfs_client *clp,
res = NFS4ERR_NOMATCHING_LAYOUT;
}
kfree(new);
+ } else {
+ /* We are currently using a referenced layout */
+ if (args->cbl_recall_type == RETURN_FILE) {
+ struct pnfs_layout_hdr *lo;
+
+ lo = NFS_I(new->pcl_ino)->layout;
+ spin_lock(&lo->inode->i_lock);
+ lo->plh_block_lgets--;
+ spin_unlock(&lo->inode->i_lock);
+ put_layout_hdr(new->pcl_ino);
+ }
+ res = NFS4ERR_DELAY;
}
out:
dprintk("%s returning %i\n", __func__, res);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index fe5c07d..15fea61 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -46,7 +46,6 @@ enum nfs4_client_state {
NFS4CLNT_DELEGRETURN,
NFS4CLNT_SESSION_RESET,
NFS4CLNT_RECALL_SLOT,
- NFS4CLNT_LAYOUT_RECALL,
};
enum nfs4_session_state {
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 466fc8b..6a1eb41 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1576,10 +1576,6 @@ static void nfs4_state_manager(struct nfs_client *clp)
nfs_client_return_marked_delegations(clp);
continue;
}
- if (test_and_clear_bit(NFS4CLNT_LAYOUT_RECALL, &clp->cl_state)) {
- nfs_client_return_layouts(clp);
- continue;
- }
/* Recall session slots */
if (test_and_clear_bit(NFS4CLNT_RECALL_SLOT, &clp->cl_state)
&& nfs4_has_session(clp)) {
--
1.7.2.1
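The reworked notify_drained() above walks the mask one bit at a time, decrements the counter behind each set bit, and retires the recall once its counter reaches zero. A simplified userspace sketch of that pattern, using C11 atomics, is below. struct recall_table and drain_notify_sketch() are hypothetical names for illustration; the kernel code additionally takes cl_lock while unlinking the entry and frees it, which the sketch leaves out.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECALLS 64

/* Hypothetical table: one outstanding-I/O counter per recall slot. */
struct recall_table {
	atomic_int *pending[MAX_RECALLS];
	int completed;			/* recalls fully drained so far */
};

/* For each bit set in mask, drop one reference on that slot's counter. */
static void drain_notify_sketch(struct recall_table *t, uint64_t mask)
{
	for (size_t bit = 0; mask != 0; mask >>= 1, bit++) {
		if (!(mask & 1) || t->pending[bit] == NULL)
			continue;
		/*
		 * atomic_fetch_sub() returns the previous value, so 1
		 * means this call took the counter to zero.
		 */
		if (atomic_fetch_sub(t->pending[bit], 1) == 1) {
			t->pending[bit] = NULL;	/* slot can be reused */
			t->completed++;
		}
	}
}
```

Only the caller that takes a counter to zero retires the slot, so no separate "done" flag or state manager wakeup is needed.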
and neither is SERVERFAULT, which pretty much means our only option
on memory allocation failure is DELAY.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/callback_proc.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 2274b6f..61b3c66 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -348,7 +348,7 @@ static u32 do_callback_layoutrecall(struct nfs_client *clp,
dprintk("%s enter, type=%i\n", __func__, args->cbl_recall_type);
new = kmalloc(sizeof(*new), GFP_KERNEL);
if (!new) {
- res = NFS4ERR_RESOURCE;
+ res = NFS4ERR_DELAY;
goto out;
}
memcpy(&new->pcl_args, args, sizeof(*args));
--
1.7.2.1
On Dec 15, 2010, at 8:57 AM, Benny Halevy wrote:
> On 2010-12-10 03:22, Fred Isaman wrote:
>> It was checking that at least one known bit was set. It needs to check
>> no unknown bit was set. From RFC5661, section 20.6.3:
>>
>> When a bit is set in the type mask that corresponds to an undefined
>> type of recallable object, NFS4ERR_INVAL MUST be returned.
>>
>> Signed-off-by: Fred Isaman <[email protected]>
>> ---
>> fs/nfs/callback.h | 1 +
>> fs/nfs/callback_proc.c | 27 ++++-----------------------
>> 2 files changed, 5 insertions(+), 23 deletions(-)
>>
>> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
>> index b16dd1f..616c5c1 100644
>> --- a/fs/nfs/callback.h
>> +++ b/fs/nfs/callback.h
>> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
>> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
>> +#define RCA4_TYPE_MASK_ALL 0xf31f
>>
>> struct cb_recallanyargs {
>> struct sockaddr *craa_addr;
>> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
>> index 61b3c66..d4aec46 100644
>> --- a/fs/nfs/callback_proc.c
>> +++ b/fs/nfs/callback_proc.c
>> @@ -661,28 +661,10 @@ out_putclient:
>> goto out;
>> }
>>
>> -static inline bool
>> -validate_bitmap_values(const unsigned long *mask)
>> +static bool
>> +validate_bitmap_values(unsigned long mask)
>> {
>> - int i;
>> -
>> - if (*mask == 0)
>> - return true;
>> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
>> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
>> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
>> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
>> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
>> - return true;
>> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
>> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
>> - if (test_bit(i, mask))
>> - return true;
>> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
>> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
>> - if (test_bit(i, mask))
>> - return true;
>> - return false;
>> + return mask & ~RCA4_TYPE_MASK_ALL;
>
> Hmm, shouldn't that be
> return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
>
> Benny
>
Yes, you are right.
Fred
>> }
>>
>> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>>
>> status = cpu_to_be32(NFS4ERR_INVAL);
>> - if (!validate_bitmap_values((const unsigned long *)
>> - &args->craa_type_mask))
>> + if (!validate_bitmap_values(args->craa_type_mask))
>> goto out;
>>
>> status = cpu_to_be32(NFS4_OK);
Either a bad server reply, or our ignoring of multiple array segments in
a reply, can cause a reply to not meet our requirements. Ensure
that we ignore such replies.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/pnfs.c | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 3ee9621..2e35706 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -909,6 +909,17 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;
int status = 0;
+ /* Verify we got what we asked for.
+ * Note that because the xdr parsing only accepts a single
+ * element array, this can fail even if the server is behaving
+ * correctly.
+ */
+ if (lgp->args.range.iomode > res->range.iomode ||
+ res->range.offset != 0 ||
+ res->range.length != NFS4_MAX_UINT64) {
+ status = -EINVAL;
+ goto out;
+ }
/* Inject layout blob into I/O device driver */
lseg = NFS_SERVER(ino)->pnfs_curr_ld->alloc_lseg(lo, res);
if (!lseg || IS_ERR(lseg)) {
--
1.7.2.1
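The check added to pnfs_layout_process() above can be read as a small predicate: accept the reply only if its iomode is at least as strong as the one requested and it covers the whole file. A userspace sketch under those assumptions follows; struct layout_range and layout_covers_request() are hypothetical names, and the iomode values are the ones RFC 5661 assigns to LAYOUTIOMODE4_READ and LAYOUTIOMODE4_RW.

```c
#include <stdbool.h>
#include <stdint.h>

#define NFS4_MAX_UINT64 (~(uint64_t)0)

enum { IOMODE_READ = 1, IOMODE_RW = 2 };

/* Hypothetical stand-in for struct pnfs_layout_range. */
struct layout_range {
	uint32_t iomode;
	uint64_t offset;
	uint64_t length;
};

/* True if the returned layout satisfies a whole-file request. */
static bool layout_covers_request(const struct layout_range *asked,
				  const struct layout_range *got)
{
	/* A READ layout cannot satisfy an RW request. */
	if (asked->iomode > got->iomode)
		return false;
	/* Partial layouts are ignored in this model. */
	return got->offset == 0 && got->length == NFS4_MAX_UINT64;
}
```

As the patch comment notes, a well-behaved server can still fail this test when its reply carries more than one layout segment, because only the first array element is decoded.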
On 2010-12-15 16:11, Fred Isaman wrote:
>
> On Dec 15, 2010, at 8:57 AM, Benny Halevy wrote:
>
>> On 2010-12-10 03:22, Fred Isaman wrote:
>>> It was checking that at least one known bit was set. It needs to check
>>> no unknown bit was set. From RFC5661, section 20.6.3:
>>>
>>> When a bit is set in the type mask that corresponds to an undefined
>>> type of recallable object, NFS4ERR_INVAL MUST be returned.
>>>
>>> Signed-off-by: Fred Isaman <[email protected]>
>>> ---
>>> fs/nfs/callback.h | 1 +
>>> fs/nfs/callback_proc.c | 27 ++++-----------------------
>>> 2 files changed, 5 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
>>> index b16dd1f..616c5c1 100644
>>> --- a/fs/nfs/callback.h
>>> +++ b/fs/nfs/callback.h
>>> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
>>> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
>>> +#define RCA4_TYPE_MASK_ALL 0xf31f
>>>
>>> struct cb_recallanyargs {
>>> struct sockaddr *craa_addr;
>>> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
>>> index 61b3c66..d4aec46 100644
>>> --- a/fs/nfs/callback_proc.c
>>> +++ b/fs/nfs/callback_proc.c
>>> @@ -661,28 +661,10 @@ out_putclient:
>>> goto out;
>>> }
>>>
>>> -static inline bool
>>> -validate_bitmap_values(const unsigned long *mask)
>>> +static bool
>>> +validate_bitmap_values(unsigned long mask)
>>> {
>>> - int i;
>>> -
>>> - if (*mask == 0)
>>> - return true;
>>> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
>>> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
>>> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
>>> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
>>> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
>>> - return true;
>>> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
>>> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
>>> - if (test_bit(i, mask))
>>> - return true;
>>> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
>>> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
>>> - if (test_bit(i, mask))
>>> - return true;
>>> - return false;
>>> + return mask & ~RCA4_TYPE_MASK_ALL;
>>
>> Hmm, shouldn't that be
>> return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
>>
>> Benny
>>
>
> Yes, you are right.
OK. This is fixed in my branch to be released asap.
I've reverted large parts of this patchset in the post-submit stream
to restore layoutcommit and layoutreturn, but not their embedding in
the CLOSE compound. I also kept the cleanups and bug fixes.
I'll send out the post-submit when it's ready.
Some more work will be required to restore the original patches'
authors and sign-offs.
This is the list as of now:
pick af44531 Revert "pnfs-submit: wave2: remove forgotten layoutreturn struct definitions"
pick c465549 Revert "pnfs-submit: Turn off layoutcommits"
pick 0f4ba67 Revert "pnfs-submit: wave2: remove all LAYOUTRETURN code"
pick 486db47 Revert "pnfs-submit: wave2: Remove LAYOUTRETURN from return on close"
pick 484c935 FIXME: roc should return layout on last close
(This patch just adds a FIXME comment.)
pick 8698772 Revert "pnfs-submit: wave2: remove cl_layoutrecalls list"
pick 263879b Revert "pnfs-submit: wave2: Pull out all recall initiated LAYOUTRETURNS"
pick 693765f Revert "pnfs-submit: wave2: Don't wait in layoutget"
pick de56e11 Revert "pnfs-submit: wave2: check that partial LAYOUTGET return is ignored"
Anything else you had in mind?
Benny
>
> Fred
>
>>> }
>>>
>>> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>>>
>>> status = cpu_to_be32(NFS4ERR_INVAL);
>>> - if (!validate_bitmap_values((const unsigned long *)
>>> - &args->craa_type_mask))
>>> + if (!validate_bitmap_values(args->craa_type_mask))
>>> goto out;
>>>
>>> status = cpu_to_be32(NFS4_OK);
>
This reverts commit a512f09586628b295c960e8df35d42726180f96f.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 55 ++++++++++++++++++++++++++++++++--------------------
fs/nfs/nfs4xdr.c | 34 --------------------------------
fs/nfs/pnfs.h | 24 -----------------------
3 files changed, 34 insertions(+), 79 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index dbe518c..3a1e578 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -74,6 +74,8 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
struct nfs_fattr *fattr, struct iattr *sattr,
struct nfs4_state *state);
+static void nfs4_layoutreturn_set_stateid(struct inode *ino,
+ struct nfs4_layoutreturn_res *res);
/* Prevent leaks of NFSv4 errors into userland */
static int nfs4_map_errors(int err)
@@ -1842,8 +1844,17 @@ static void nfs4_free_closedata(void *data)
nfs_free_seqid(calldata->arg.seqid);
nfs4_put_state_owner(sp);
path_put(&calldata->path);
- if (calldata->res.op_bitmask & NFS4_HAS_LAYOUTRETURN)
- nfs4_layoutreturn_file_release(calldata->inode);
+ if (calldata->res.op_bitmask & NFS4_HAS_LAYOUTRETURN) {
+ struct pnfs_layout_hdr *lo = NFS_I(calldata->inode)->layout;
+
+ spin_lock(&lo->inode->i_lock);
+ lo->plh_block_lgets--;
+ lo->plh_outstanding--;
+ if (!pnfs_layoutgets_blocked(lo, NULL))
+ rpc_wake_up(&NFS_I(lo->inode)->lo_rpcwaitq_stateid);
+ spin_unlock(&lo->inode->i_lock);
+ put_layout_hdr(lo->inode);
+ }
kfree(calldata);
}
@@ -1932,13 +1943,18 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
/* Are there layout segments to return on close? */
if (pnfs_roc(calldata)) {
struct nfs_inode *nfsi = NFS_I(calldata->inode);
-
if (pnfs_return_layout_barrier(nfsi,
&calldata->arg.lr_args.range)) {
dprintk("%s: waiting on barrier\n", __func__);
/* FIXME race with wake here */
rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
- nfs4_layoutreturn_file_release(calldata->inode);
+ spin_lock(&calldata->inode->i_lock);
+ nfsi->layout->plh_block_lgets--;
+ nfsi->layout->plh_outstanding--;
+ if (!pnfs_layoutgets_blocked(nfsi->layout, NULL))
+ rpc_wake_up(&nfsi->lo_rpcwaitq_stateid);
+ spin_unlock(&calldata->inode->i_lock);
+ put_layout_hdr(calldata->inode);
return;
}
}
@@ -5620,8 +5636,8 @@ nfs4_layoutreturn_prepare(struct rpc_task *task, void *calldata)
rpc_call_start(task);
}
-void nfs4_layoutreturn_set_stateid(struct inode *ino,
- struct nfs4_layoutreturn_res *res)
+static void nfs4_layoutreturn_set_stateid(struct inode *ino,
+ struct nfs4_layoutreturn_res *res)
{
struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
@@ -5656,26 +5672,23 @@ static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
dprintk("<-- %s\n", __func__);
}
-void nfs4_layoutreturn_file_release(struct inode *ino)
-{
- struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
-
- spin_lock(&ino->i_lock);
- lo->plh_block_lgets--;
- lo->plh_outstanding--;
- if (!pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
- spin_unlock(&ino->i_lock);
- put_layout_hdr(ino);
-}
-
static void nfs4_layoutreturn_release(void *calldata)
{
struct nfs4_layoutreturn *lrp = calldata;
dprintk("--> %s return_type %d\n", __func__, lrp->args.return_type);
- if (lrp->args.return_type == RETURN_FILE)
- nfs4_layoutreturn_file_release(lrp->args.inode);
+ if (lrp->args.return_type == RETURN_FILE) {
+ struct inode *ino = lrp->args.inode;
+ struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
+
+ spin_lock(&ino->i_lock);
+ lo->plh_block_lgets--;
+ lo->plh_outstanding--;
+ if (!pnfs_layoutgets_blocked(lo, NULL))
+ rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
+ spin_unlock(&ino->i_lock);
+ put_layout_hdr(ino);
+ }
kfree(calldata);
dprintk("<-- %s\n", __func__);
}
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 8dbfbb0..b9fe438 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -338,10 +338,6 @@ static int nfs4_stat_to_errno(int);
#else /* CONFIG_NFS_V4_1 */
#define encode_sequence_maxsz 0
#define decode_sequence_maxsz 0
-#define encode_layoutcommit_maxsz 0
-#define decode_layoutcommit_maxsz 0
-#define encode_layoutreturn_maxsz 0
-#define decode_layoutreturn_maxsz 0
#endif /* CONFIG_NFS_V4_1 */
#define NFS4_enc_compound_sz (1024) /* XXX: large enough? */
@@ -1933,22 +1929,6 @@ encode_layoutreturn(struct xdr_stream *xdr,
hdr->nops++;
hdr->replen += decode_layoutreturn_maxsz;
}
-#else
-static int
-encode_layoutcommit(struct xdr_stream *xdr,
- const struct nfs4_layoutcommit_op_args *args,
- struct compound_hdr *hdr)
-{
- return 0;
-}
-
-static void
-encode_layoutreturn(struct xdr_stream *xdr,
- const struct nfs4_layoutreturn_args *args,
- struct compound_hdr *hdr)
-{
-}
-
#endif /* CONFIG_NFS_V4_1 */
/*
@@ -5352,20 +5332,6 @@ out_overflow:
print_overflow_msg(__func__, xdr);
return -EIO;
}
-
-#else
-
-static int decode_layoutcommit(struct xdr_stream *xdr)
-{
- return 0;
-}
-
-static int decode_layoutreturn(struct xdr_stream *xdr,
- struct nfs4_layoutreturn_res *res)
-{
- return 0;
-}
-
#endif /* CONFIG_NFS_V4_1 */
/*
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index c527e55..8ef47e9 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -188,9 +188,6 @@ extern int nfs4_proc_layoutget(struct nfs4_layoutget *lgp);
extern int nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data,
int issync);
extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
-extern void nfs4_layoutreturn_file_release(struct inode *ino);
-extern void nfs4_layoutreturn_set_stateid(struct inode *ino,
- struct nfs4_layoutreturn_res *res);
/* pnfs.c */
void get_layout_hdr(struct pnfs_layout_hdr *lo);
@@ -393,27 +390,6 @@ pnfs_pageio_init_write(struct nfs_pageio_descriptor *pgio, struct inode *ino)
pgio->pg_lseg = NULL;
}
-static inline void nfs4_layoutreturn_file_release(struct inode *ino)
-{
-}
-
-static inline bool pnfs_roc(struct nfs4_closedata *data)
-{
- return false;
-}
-
-static inline bool pnfs_return_layout_barrier(struct nfs_inode *nfsi,
- struct pnfs_layout_range *range)
-{
- BUG();
- return false;
-}
-
-static inline void nfs4_layoutreturn_set_stateid(struct inode *ino,
- struct nfs4_layoutreturn_res *res)
-{
-}
-
#endif /* CONFIG_NFS_V4_1 */
#endif /* FS_NFS_PNFS_H */
--
1.7.2.1
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 4 ++--
fs/nfs/pnfs.c | 11 ++++++-----
fs/nfs/pnfs.h | 2 +-
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index adcab30..c4dc5b1 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5607,7 +5607,7 @@ static void nfs4_layoutreturn_release(void *calldata)
spin_lock(&ino->i_lock);
lo->plh_block_lgets--;
- lo->plh_outstanding--;
+ atomic_dec(&lo->plh_outstanding);
spin_unlock(&ino->i_lock);
put_layout_hdr(ino);
}
@@ -5644,7 +5644,7 @@ int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool issync)
/* FIXME we should test for BULK here */
spin_lock(&lo->inode->i_lock);
BUG_ON(lo->plh_block_lgets == 0);
- lo->plh_outstanding++;
+ atomic_inc(&lo->plh_outstanding);
spin_unlock(&lo->inode->i_lock);
}
task = rpc_run_task(&task_setup_data);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index f9757ff..27a1973 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -505,7 +505,8 @@ pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid,
return true;
return lo->plh_block_lgets ||
test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
- (list_empty(&lo->segs) && (lo->plh_outstanding > lget));
+ (list_empty(&lo->segs) &&
+ (atomic_read(&lo->plh_outstanding) > lget));
}
int
@@ -868,7 +869,7 @@ pnfs_update_layout(struct inode *ino,
if (pnfs_layoutgets_blocked(lo, NULL, 0))
goto out_unlock;
- lo->plh_outstanding++;
+ atomic_inc(&lo->plh_outstanding);
get_layout_hdr(lo); /* Matched in pnfs_layoutget_release */
if (list_empty(&lo->segs)) {
@@ -883,18 +884,18 @@ pnfs_update_layout(struct inode *ino,
spin_unlock(&ino->i_lock);
lseg = send_layoutget(lo, ctx, &arg);
- spin_lock(&ino->i_lock);
if (!lseg) {
+ spin_lock(&ino->i_lock);
if (list_empty(&lo->segs)) {
spin_lock(&clp->cl_lock);
list_del_init(&lo->layouts);
spin_unlock(&clp->cl_lock);
clear_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
}
+ spin_unlock(&ino->i_lock);
}
- lo->plh_outstanding--;
+ atomic_dec(&lo->plh_outstanding);
put_layout_hdr(ino);
- spin_unlock(&ino->i_lock);
out:
dprintk("%s end, state 0x%lx lseg %p\n", __func__,
nfsi->layout->plh_flags, lseg);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 1ccc35d..b5a30b8 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -100,7 +100,7 @@ struct pnfs_layout_hdr {
struct list_head segs; /* layout segments list */
int roc_iomode;/* return on close iomode, 0=none */
nfs4_stateid stateid;
- unsigned long plh_outstanding; /* number of RPCs out */
+ atomic_t plh_outstanding; /* number of RPCs out */
unsigned long plh_block_lgets; /* block LAYOUTGET if >0 */
u32 plh_barrier; /* ignore lower seqids */
unsigned long plh_flags;
--
1.7.2.1
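The effect of the patch above can be illustrated with a small user-space sketch. It uses C11 <stdatomic.h> in place of the kernel's atomic_t, and struct layout_hdr_sketch with its helpers is a hypothetical stand-in for the relevant fields of struct pnfs_layout_hdr, not the real structure. The blocked test is simplified: the stateid and bulk-recall checks are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct pnfs_layout_hdr. */
struct layout_hdr_sketch {
	atomic_int outstanding;    /* RPCs in flight; updated without a lock */
	unsigned long block_lgets; /* in the patch this still needs i_lock */
	bool segs_empty;           /* stands in for list_empty(&lo->segs) */
};

/* Counterpart of atomic_inc(&lo->plh_outstanding). */
static void outstanding_inc(struct layout_hdr_sketch *lo)
{
	atomic_fetch_add(&lo->outstanding, 1);
}

/* Counterpart of atomic_dec(&lo->plh_outstanding). */
static void outstanding_dec(struct layout_hdr_sketch *lo)
{
	atomic_fetch_sub(&lo->outstanding, 1);
}

/*
 * Simplified form of the test in pnfs_layoutgets_blocked(): block if
 * explicitly asked to, or if no segment is held yet and more than
 * 'lget' RPCs are already outstanding.
 */
static bool layoutgets_blocked(struct layout_hdr_sketch *lo, int lget)
{
	return lo->block_lgets ||
	       (lo->segs_empty && atomic_load(&lo->outstanding) > lget);
}
```

Because the counter is atomic, the decrement after send_layoutget() no longer has to retake the inode lock on the success path, which is what the pnfs_update_layout() hunk does.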
Recent changes to close can delay pending layoutcommit until umount,
when the async layoutcommits can come trickling in after we have destroyed
the session. Since the file layout does not need them, just turn them off for
the moment. Non-file layouts will probably have to trigger them in
some fashion at close.
A better solution is to just push all the layoutcommit code outside
of the pnfs-submit branch. This is really just a stop gap until code
is rearranged to make that easier.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9b15535..224bdfe 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3098,7 +3098,6 @@ static void pnfs4_update_write_done(struct nfs_inode *nfsi, struct nfs_write_dat
{
#ifdef CONFIG_NFS_V4_1
pnfs_update_last_write(nfsi, data->args.offset, data->res.count);
- pnfs_need_layoutcommit(nfsi, data->args.context);
#endif /* CONFIG_NFS_V4_1 */
}
--
1.7.2.1
On 12/10/2010 03:22 AM, Fred Isaman wrote:
> Recent changes to close can delay pending layoutcommit until umount,
> when the async layoutcommits can come trickling in after we have destroyed
> the session.
Then "Recent changes" are broken and should be fixed. It was fine
before. New broken code is not acceptable.
> Since the file layout does not need them, just turn them off for
> the moment. Non-file layouts will probably have to trigger them in
> some fashion at close.
>
Rrrr. Are we back to this argument? We sit down, win an argument,
and 2 weeks later you are back on it as if we never talked about it.
NO!!! Only "coherent clustered filesystems" do not need them. It has
nothing to do with layout type. A non-clustered aggregated parallel
filesystem will need them just the same as blocks and objects.
AND THE STD DOES NOT GIVE YOU A CHOICE!!!
> A better solution is to just push all the layoutcommit code outside
> of the pnfs-submit branch. This is really just a stop gap until code
> is rearranged to make that easier.
>
Then all this is not finished. Please keep it in the shop until the
final solution is presented and we can actually see the new compared
to the old system. Until then we should keep what worked and was tested.
Boaz
> Signed-off-by: Fred Isaman <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 9b15535..224bdfe 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -3098,7 +3098,6 @@ static void pnfs4_update_write_done(struct nfs_inode *nfsi, struct nfs_write_dat
> {
> #ifdef CONFIG_NFS_V4_1
> pnfs_update_last_write(nfsi, data->args.offset, data->res.count);
> - pnfs_need_layoutcommit(nfsi, data->args.context);
> #endif /* CONFIG_NFS_V4_1 */
> }
>
This reverts commit ed527fc7fba08f0665cc6b59d9130a6bd09a2659.
Conflicts:
fs/nfs/pnfs.c
fs/nfs/pnfs.h
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 39 +++------------------------------------
fs/nfs/pnfs.c | 6 ++----
fs/nfs/pnfs.h | 14 --------------
3 files changed, 5 insertions(+), 54 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2811d97..dbe518c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1880,16 +1880,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
nfs4_close_clear_stateid_flags(state,
calldata->arg.fmode);
break;
- case -NFS4ERR_DELEG_REVOKED:
- if (calldata->res.op_bitmask & (NFS4_HAS_LAYOUTCOMMIT |
- NFS4_HAS_LAYOUTRETURN)) {
- pnfs_mark_layout_revoked(calldata->inode);
- /* Retry without layout operations as
- * pnfs_roc will find roc_iomode==0 next time around
- */
- rpc_restart_call_prepare(task);
- break;
- }
case -NFS4ERR_STALE_STATEID:
case -NFS4ERR_OLD_STATEID:
case -NFS4ERR_BAD_STATEID:
@@ -5646,7 +5636,6 @@ void nfs4_layoutreturn_set_stateid(struct inode *ino,
static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
{
struct nfs4_layoutreturn *lrp = calldata;
- struct inode *ino = lrp->args.inode;
struct nfs_server *server;
dprintk("--> %s\n", __func__);
@@ -5655,50 +5644,28 @@ static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
return;
if (lrp->args.return_type == RETURN_FILE)
- server = NFS_SERVER(ino);
+ server = NFS_SERVER(lrp->args.inode);
else
server = NULL;
if (nfs4_async_handle_error(task, server, NULL, lrp->clp) == -EAGAIN) {
nfs_restart_rpc(task, lrp->clp);
return;
}
- switch (task->tk_status) {
- case -NFS4ERR_DELEG_REVOKED:
- task->tk_status = 0; /* TODO: revalidate remaining layouts? */
- if (lrp->args.return_type == RETURN_FILE)
- pnfs_mark_layout_revoked(ino);
- break;
- case 0:
- if (lrp->args.return_type == RETURN_FILE)
- nfs4_layoutreturn_set_stateid(lrp->args.inode, &lrp->res);
- }
+ if ((task->tk_status == 0) && (lrp->args.return_type == RETURN_FILE))
+ nfs4_layoutreturn_set_stateid(lrp->args.inode, &lrp->res);
dprintk("<-- %s\n", __func__);
}
void nfs4_layoutreturn_file_release(struct inode *ino)
{
struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
- LIST_HEAD(tmp_list);
spin_lock(&ino->i_lock);
- if (test_bit(NFS_LAYOUT_REVOKED, &lo->plh_flags)) {
- struct pnfs_layout_range range = {
- .iomode = IOMODE_ANY,
- .offset = 0,
- .length = NFS4_MAX_UINT64,
- };
-
- /* layout driver's free_lseg may block, hence we don't
- * call pnfs_free_lseg_list under the spin_lock */
- pnfs_clear_lseg_list(lo, &tmp_list, &range);
- clear_bit(NFS_LAYOUT_REVOKED, &lo->plh_flags);
- }
lo->plh_block_lgets--;
lo->plh_outstanding--;
if (!pnfs_layoutgets_blocked(lo, NULL))
rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
spin_unlock(&ino->i_lock);
- pnfs_free_lseg_list(&tmp_list);
put_layout_hdr(ino);
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index a21debe..d3ce095 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -376,7 +376,7 @@ static void mark_lseg_invalid(struct pnfs_layout_segment *lseg,
}
/* Returns false if there was nothing to do, true otherwise */
-bool
+static bool
pnfs_clear_lseg_list(struct pnfs_layout_hdr *lo, struct list_head *tmp_list,
struct pnfs_layout_range *range)
{
@@ -629,12 +629,10 @@ pnfs_roc(struct nfs4_closedata *data)
spin_lock(&data->inode->i_lock);
lo = NFS_I(data->inode)->layout;
if (!lo || lo->roc_iomode == 0 ||
- test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
- test_bit(NFS_LAYOUT_REVOKED, &lo->plh_flags))
+ test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
goto out_nolayout;
range.iomode = lo->roc_iomode;
- lo->roc_iomode = 0;
list_for_each_entry_safe(lseg, tmp, &lo->segs, fi_list)
if (should_free_lseg(&lseg->range, &range)) {
mark_lseg_invalid(lseg, &tmp_list);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 12fe7ab..c527e55 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -56,7 +56,6 @@ enum {
NFS_LAYOUT_RW_FAILED, /* get rw layout failed stop trying */
NFS_LAYOUT_BULK_RECALL, /* bulk recall affecting layout */
NFS_LAYOUT_NEED_LCOMMIT, /* LAYOUTCOMMIT needed */
- NFS_LAYOUT_REVOKED, /* layout revoked by the server */
};
/* Per-layout driver specific registration structure */
@@ -220,8 +219,6 @@ void pnfs_pageio_init_read(struct nfs_pageio_descriptor *, struct inode *,
void pnfs_pageio_init_write(struct nfs_pageio_descriptor *, struct inode *);
bool pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid);
int pnfs_layout_process(struct nfs4_layoutget *lgp);
-bool pnfs_clear_lseg_list(struct pnfs_layout_hdr *, struct list_head *tmp_list,
- struct pnfs_layout_range *);
void pnfs_free_lseg_list(struct list_head *tmp_list);
void pnfs_destroy_layout(struct nfs_inode *);
void pnfs_destroy_all_layouts(struct nfs_client *);
@@ -293,12 +290,6 @@ layoutcommit_needed(struct nfs_inode *nfsi)
test_bit(NFS_LAYOUT_NEED_LCOMMIT, &nfsi->layout->plh_flags);
}
-static inline void
-pnfs_mark_layout_revoked(struct inode *ino)
-{
- set_bit(NFS_LAYOUT_REVOKED, &NFS_I(ino)->layout->plh_flags);
-}
-
#else /* CONFIG_NFS_V4_1 */
static inline void pnfs_destroy_all_layouts(struct nfs_client *clp)
@@ -423,11 +414,6 @@ static inline void nfs4_layoutreturn_set_stateid(struct inode *ino,
{
}
-static inline void
-pnfs_mark_layout_revoked(struct inode *ino)
-{
-}
-
#endif /* CONFIG_NFS_V4_1 */
#endif /* FS_NFS_PNFS_H */
--
1.7.2.1
On Dec 15, 2010, at 10:29 AM, Benny Halevy wrote:
> On 2010-12-15 16:11, Fred Isaman wrote:
>>
>> On Dec 15, 2010, at 8:57 AM, Benny Halevy wrote:
>>
>>> On 2010-12-10 03:22, Fred Isaman wrote:
>>>> It was checking that at least one known bit was set. It needs to check
>>>> no unknown bit was set. From RFC5661, section 20.6.3:
>>>>
>>>> When a bit is set in the type mask that corresponds to an undefined
>>>> type of recallable object, NFS4ERR_INVAL MUST be returned.
>>>>
>>>> Signed-off-by: Fred Isaman <[email protected]>
>>>> ---
>>>> fs/nfs/callback.h | 1 +
>>>> fs/nfs/callback_proc.c | 27 ++++-----------------------
>>>> 2 files changed, 5 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
>>>> index b16dd1f..616c5c1 100644
>>>> --- a/fs/nfs/callback.h
>>>> +++ b/fs/nfs/callback.h
>>>> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
>>>> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
>>>> +#define RCA4_TYPE_MASK_ALL 0xf31f
>>>>
>>>> struct cb_recallanyargs {
>>>> struct sockaddr *craa_addr;
>>>> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
>>>> index 61b3c66..d4aec46 100644
>>>> --- a/fs/nfs/callback_proc.c
>>>> +++ b/fs/nfs/callback_proc.c
>>>> @@ -661,28 +661,10 @@ out_putclient:
>>>> goto out;
>>>> }
>>>>
>>>> -static inline bool
>>>> -validate_bitmap_values(const unsigned long *mask)
>>>> +static bool
>>>> +validate_bitmap_values(unsigned long mask)
>>>> {
>>>> - int i;
>>>> -
>>>> - if (*mask == 0)
>>>> - return true;
>>>> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
>>>> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
>>>> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
>>>> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
>>>> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
>>>> - return true;
>>>> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
>>>> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
>>>> - if (test_bit(i, mask))
>>>> - return true;
>>>> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
>>>> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
>>>> - if (test_bit(i, mask))
>>>> - return true;
>>>> - return false;
>>>> + return mask & ~RCA4_TYPE_MASK_ALL;
>>>
>>> Hmm, shouldn't that be
>>> return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
>>>
>>> Benny
>>>
>>
>> Yes, you are right.
>
> OK. This is fixed in my branch to be released asap.
Thanks. I have a bunch more minor code cleanups that I'll send once I see what you have, unless you want them immediately. I'll also have a rebase of the pnfs-submit branch with the wave2 patches pushed down to the bottom shortly after I see your branch.
> I've reverted large parts of this patchset in the post-submit stream
> to restore layoutcommit and layoutreturn, but not their embedding in
> the CLOSE compound. I also kept the cleanups and bug fixes.
> I'll send out the post-submit when it's ready.
> Some more work will be required to restore the original patches
> author and signoffees.
>
> This is the list as of now:
>
> pick af44531 Revert "pnfs-submit: wave2: remove forgotten layoutreturn struct definitions"
> pick c465549 Revert "pnfs-submit: Turn off layoutcommits"
> pick 0f4ba67 Revert "pnfs-submit: wave2: remove all LAYOUTRETURN code"
> pick 486db47 Revert "pnfs-submit: wave2: Remove LAYOUTRETURN from return on close"
>
> pick 484c935 FIXME: roc should return layout on last close
> (This patch just adds a FIXME comment.)
>
The above looks good.
> pick 8698772 Revert "pnfs-submit: wave2: remove cl_layoutrecalls list"
> pick 263879b Revert "pnfs-submit: wave2: Pull out all recall initiated LAYOUTRETURNS"
> pick 693765f Revert "pnfs-submit: wave2: Don't wait in layoutget"
Just note that Trond was highly resistant to the above.
> pick de56e11 Revert "pnfs-submit: wave2: check that partial LAYOUTGET return is ignored"
>
We need some sort of check that we got what we asked for, given that the XDR code can chop the server's reply.
Fred
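One possible shape for such a check, as a user-space sketch only: compare the byte range that came back against the range that was requested and treat anything that does not fully cover it as a failed LAYOUTGET. The struct and both helpers below are hypothetical names for illustration, not what either branch implements, and iomode is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define NFS4_MAX_UINT64 (~(uint64_t)0)

/* Hypothetical byte range; a length of NFS4_MAX_UINT64 means "to EOF". */
struct layout_range_sketch {
	uint64_t offset;
	uint64_t length;
};

/* Exclusive end of the range, saturating at NFS4_MAX_UINT64. */
static uint64_t range_end(const struct layout_range_sketch *r)
{
	if (r->length == NFS4_MAX_UINT64 ||
	    r->offset + r->length < r->offset) /* unsigned wraparound */
		return NFS4_MAX_UINT64;
	return r->offset + r->length;
}

/* True if 'got' covers every byte of 'asked'. */
static bool range_covers(const struct layout_range_sketch *got,
			 const struct layout_range_sketch *asked)
{
	return got->offset <= asked->offset &&
	       range_end(got) >= range_end(asked);
}
```

A caller in this forgetful model would fall back to the MDS whenever range_covers() returns false, rather than trying to use a partial layout.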
> Anything else you had in mind?
>
> Benny
>
>>
>> Fred
>>
>>>> }
>>>>
>>>> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>>>>
>>>> status = cpu_to_be32(NFS4ERR_INVAL);
>>>> - if (!validate_bitmap_values((const unsigned long *)
>>>> - &args->craa_type_mask))
>>>> + if (!validate_bitmap_values(args->craa_type_mask))
>>>> goto out;
>>>>
>>>> status = cpu_to_be32(NFS4_OK);
>>
Remove boilerplate nfs4proc and nfs4xdr code.
pnfs_return_layout is only called by evict_inode, so just have it forget everything.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 118 -----------------------------------------------------
fs/nfs/nfs4xdr.c | 111 -------------------------------------------------
fs/nfs/pnfs.c | 62 +---------------------------
fs/nfs/pnfs.h | 2 -
4 files changed, 1 insertions(+), 292 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 57f5a8a..9b15535 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5554,124 +5554,6 @@ out:
return 0;
}
-static void
-nfs4_layoutreturn_prepare(struct rpc_task *task, void *calldata)
-{
- struct nfs4_layoutreturn *lrp = calldata;
-
- dprintk("--> %s\n", __func__);
- if (lrp->args.return_type == RETURN_FILE) {
- struct nfs_inode *nfsi = NFS_I(lrp->args.inode);
-
- if (pnfs_return_layout_barrier(nfsi, &lrp->args.range)) {
- dprintk("%s: waiting on barrier\n", __func__);
- rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
- return;
- }
- }
- if (nfs41_setup_sequence(lrp->clp->cl_session, &lrp->args.seq_args,
- &lrp->res.seq_res, 0, task))
- return;
- rpc_call_start(task);
-}
-
-static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
-{
- struct nfs4_layoutreturn *lrp = calldata;
- struct nfs_server *server;
-
- dprintk("--> %s\n", __func__);
-
- if (!nfs4_sequence_done(task, &lrp->res.seq_res))
- return;
-
- if (lrp->args.return_type == RETURN_FILE)
- server = NFS_SERVER(lrp->args.inode);
- else
- server = NULL;
- if (nfs4_async_handle_error(task, server, NULL, lrp->clp) == -EAGAIN) {
- nfs_restart_rpc(task, lrp->clp);
- return;
- }
- if ((task->tk_status == 0) && (lrp->args.return_type == RETURN_FILE)) {
- struct pnfs_layout_hdr *lo = NFS_I(lrp->args.inode)->layout;
-
- spin_lock(&lo->inode->i_lock);
- if (lrp->res.lrs_present)
- pnfs_set_layout_stateid(lo, &lrp->res.stateid, true);
- else
- BUG_ON(!list_empty(&lo->segs));
- spin_unlock(&lo->inode->i_lock);
- }
- dprintk("<-- %s\n", __func__);
-}
-
-static void nfs4_layoutreturn_release(void *calldata)
-{
- struct nfs4_layoutreturn *lrp = calldata;
-
- dprintk("--> %s return_type %d\n", __func__, lrp->args.return_type);
- if (lrp->args.return_type == RETURN_FILE) {
- struct inode *ino = lrp->args.inode;
- struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
-
- spin_lock(&ino->i_lock);
- lo->plh_block_lgets--;
- atomic_dec(&lo->plh_outstanding);
- spin_unlock(&ino->i_lock);
- put_layout_hdr(ino);
- }
- kfree(calldata);
- dprintk("<-- %s\n", __func__);
-}
-
-static const struct rpc_call_ops nfs4_layoutreturn_call_ops = {
- .rpc_call_prepare = nfs4_layoutreturn_prepare,
- .rpc_call_done = nfs4_layoutreturn_done,
- .rpc_release = nfs4_layoutreturn_release,
-};
-
-int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool issync)
-{
- struct rpc_task *task;
- struct rpc_message msg = {
- .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LAYOUTRETURN],
- .rpc_argp = &lrp->args,
- .rpc_resp = &lrp->res,
- };
- struct rpc_task_setup task_setup_data = {
- .rpc_client = lrp->clp->cl_rpcclient,
- .rpc_message = &msg,
- .callback_ops = &nfs4_layoutreturn_call_ops,
- .callback_data = lrp,
- .flags = RPC_TASK_ASYNC,
- };
- int status = 0;
-
- dprintk("--> %s\n", __func__);
- if (lrp->args.return_type == RETURN_FILE) {
- struct pnfs_layout_hdr *lo = NFS_I(lrp->args.inode)->layout;
- /* FIXME we should test for BULK here */
- spin_lock(&lo->inode->i_lock);
- BUG_ON(lo->plh_block_lgets == 0);
- atomic_inc(&lo->plh_outstanding);
- spin_unlock(&lo->inode->i_lock);
- }
- task = rpc_run_task(&task_setup_data);
- if (IS_ERR(task))
- return PTR_ERR(task);
- if (!issync)
- goto out;
- status = nfs4_wait_for_completion_rpc_task(task);
- if (status != 0)
- goto out;
- status = task->tk_status;
-out:
- dprintk("<-- %s\n", __func__);
- rpc_put_task(task);
- return status;
-}
-
static int
_nfs4_proc_getdeviceinfo(struct nfs_server *server, struct pnfs_device *pdev)
{
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 7f92bfa..4564021 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -329,12 +329,6 @@ static int nfs4_stat_to_errno(int);
op_encode_hdr_maxsz + \
encode_stateid_maxsz)
#define decode_layoutcommit_maxsz (3 + op_decode_hdr_maxsz)
-#define encode_layoutreturn_maxsz (8 + op_encode_hdr_maxsz + \
- encode_stateid_maxsz + \
- 1 /* FIXME: opaque lrf_body always empty at
- *the moment */)
-#define decode_layoutreturn_maxsz (op_decode_hdr_maxsz + \
- 1 + decode_stateid_maxsz)
#else /* CONFIG_NFS_V4_1 */
#define encode_sequence_maxsz 0
#define decode_sequence_maxsz 0
@@ -748,14 +742,6 @@ static int nfs4_stat_to_errno(int);
decode_putfh_maxsz + \
decode_layoutcommit_maxsz + \
decode_getattr_maxsz)
-#define NFS4_enc_layoutreturn_sz (compound_encode_hdr_maxsz + \
- encode_sequence_maxsz + \
- encode_putfh_maxsz + \
- encode_layoutreturn_maxsz)
-#define NFS4_dec_layoutreturn_sz (compound_decode_hdr_maxsz + \
- decode_sequence_maxsz + \
- decode_putfh_maxsz + \
- decode_layoutreturn_maxsz)
#define NFS4_enc_dswrite_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz +\
encode_putfh_maxsz + \
@@ -1902,36 +1888,6 @@ encode_layoutcommit(struct xdr_stream *xdr,
return 0;
}
-static void
-encode_layoutreturn(struct xdr_stream *xdr,
- const struct nfs4_layoutreturn_args *args,
- struct compound_hdr *hdr)
-{
- nfs4_stateid stateid;
- __be32 *p;
-
- p = reserve_space(xdr, 20);
- *p++ = cpu_to_be32(OP_LAYOUTRETURN);
- *p++ = cpu_to_be32(args->reclaim);
- *p++ = cpu_to_be32(args->layout_type);
- *p++ = cpu_to_be32(args->range.iomode);
- *p = cpu_to_be32(args->return_type);
- if (args->return_type == RETURN_FILE) {
- p = reserve_space(xdr, 16 + NFS4_STATEID_SIZE);
- p = xdr_encode_hyper(p, args->range.offset);
- p = xdr_encode_hyper(p, args->range.length);
- spin_lock(&args->inode->i_lock);
- memcpy(stateid.data, NFS_I(args->inode)->layout->stateid.data,
- NFS4_STATEID_SIZE);
- spin_unlock(&args->inode->i_lock);
- p = xdr_encode_opaque_fixed(p, &stateid.data,
- NFS4_STATEID_SIZE);
- p = reserve_space(xdr, 4);
- *p = cpu_to_be32(0);
- }
- hdr->nops++;
- hdr->replen += decode_layoutreturn_maxsz;
-}
#endif /* CONFIG_NFS_V4_1 */
/*
@@ -2822,26 +2778,6 @@ static int nfs4_xdr_enc_layoutcommit(struct rpc_rqst *req, uint32_t *p,
}
/*
- * Encode LAYOUTRETURN request
- */
-static int nfs4_xdr_enc_layoutreturn(struct rpc_rqst *req, uint32_t *p,
- struct nfs4_layoutreturn_args *args)
-{
- struct xdr_stream xdr;
- struct compound_hdr hdr = {
- .minorversion = nfs4_xdr_minorversion(&args->seq_args),
- };
-
- xdr_init_encode(&xdr, &req->rq_snd_buf, p);
- encode_compound_hdr(&xdr, req, &hdr);
- encode_sequence(&xdr, &args->seq_args, &hdr);
- encode_putfh(&xdr, NFS_FH(args->inode), &hdr);
- encode_layoutreturn(&xdr, args, &hdr);
- encode_nops(&hdr);
- return 0;
-}
-
-/*
* Encode a pNFS File Layout Data Server WRITE request
*/
static int nfs4_xdr_enc_dswrite(struct rpc_rqst *req, uint32_t *p,
@@ -5283,27 +5219,6 @@ out_overflow:
return -EIO;
}
-static int decode_layoutreturn(struct xdr_stream *xdr,
- struct nfs4_layoutreturn_res *res)
-{
- __be32 *p;
- int status;
-
- status = decode_op_hdr(xdr, OP_LAYOUTRETURN);
- if (status)
- return status;
- p = xdr_inline_decode(xdr, 4);
- if (unlikely(!p))
- goto out_overflow;
- res->lrs_present = be32_to_cpup(p);
- if (res->lrs_present)
- status = decode_stateid(xdr, &res->stateid);
- return status;
-out_overflow:
- print_overflow_msg(__func__, xdr);
- return -EIO;
-}
-
static int decode_layoutcommit(struct xdr_stream *xdr,
struct rpc_rqst *req,
struct nfs4_layoutcommit_res *res)
@@ -6409,31 +6324,6 @@ out:
}
/*
- * Decode LAYOUTRETURN response
- */
-static int nfs4_xdr_dec_layoutreturn(struct rpc_rqst *rqstp, uint32_t *p,
- struct nfs4_layoutreturn_res *res)
-{
- struct xdr_stream xdr;
- struct compound_hdr hdr;
- int status;
-
- xdr_init_decode(&xdr, &rqstp->rq_rcv_buf, p);
- status = decode_compound_hdr(&xdr, &hdr);
- if (status)
- goto out;
- status = decode_sequence(&xdr, &res->seq_res, rqstp);
- if (status)
- goto out;
- status = decode_putfh(&xdr);
- if (status)
- goto out;
- status = decode_layoutreturn(&xdr, res);
-out:
- return status;
-}
-
-/*
* Decode LAYOUTCOMMIT response
*/
static int nfs4_xdr_dec_layoutcommit(struct rpc_rqst *rqstp, uint32_t *p,
@@ -6711,7 +6601,6 @@ struct rpc_procinfo nfs4_procedures[] = {
PROC(GETDEVICEINFO, enc_getdeviceinfo, dec_getdeviceinfo),
PROC(LAYOUTGET, enc_layoutget, dec_layoutget),
PROC(LAYOUTCOMMIT, enc_layoutcommit, dec_layoutcommit),
- PROC(LAYOUTRETURN, enc_layoutreturn, dec_layoutreturn),
PROC(PNFS_WRITE, enc_dswrite, dec_dswrite),
PROC(PNFS_COMMIT, enc_dscommit, dec_dscommit),
#endif /* CONFIG_NFS_V4_1 */
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index e76d4f8..c22e439 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -599,55 +599,6 @@ void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
}
}
-/* Return true if there is layout based io in progress in the given range.
- * Assumes range has already been marked invalid, and layout marked to
- * prevent any new lseg from being inserted.
- */
-bool
-pnfs_return_layout_barrier(struct nfs_inode *nfsi,
- struct pnfs_layout_range *range)
-{
- struct pnfs_layout_segment *lseg;
- bool ret = false;
-
- spin_lock(&nfsi->vfs_inode.i_lock);
- list_for_each_entry(lseg, &nfsi->layout->segs, fi_list)
- if (should_free_lseg(&lseg->range, range)) {
- ret = true;
- break;
- }
- spin_unlock(&nfsi->vfs_inode.i_lock);
- dprintk("%s:Return %d\n", __func__, ret);
- return ret;
-}
-
-static int
-return_layout(struct inode *ino, struct pnfs_layout_range *range, bool wait)
-{
- struct nfs4_layoutreturn *lrp;
- struct nfs_server *server = NFS_SERVER(ino);
- int status = -ENOMEM;
-
- dprintk("--> %s\n", __func__);
-
- lrp = kzalloc(sizeof(*lrp), GFP_KERNEL);
- if (lrp == NULL) {
- put_layout_hdr(ino);
- goto out;
- }
- lrp->args.reclaim = 0;
- lrp->args.layout_type = server->pnfs_curr_ld->id;
- lrp->args.return_type = RETURN_FILE;
- lrp->args.range = *range;
- lrp->args.inode = ino;
- lrp->clp = server->nfs_client;
-
- status = nfs4_proc_layoutreturn(lrp, wait);
-out:
- dprintk("<-- %s status: %d\n", __func__, status);
- return status;
-}
-
/* Initiates a LAYOUTRETURN(FILE) */
int
_pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
@@ -673,21 +624,10 @@ _pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
goto out;
}
lo->plh_block_lgets++;
- /* Reference matched in nfs4_layoutreturn_release */
- get_layout_hdr(lo);
spin_unlock(&ino->i_lock);
pnfs_free_lseg_list(&tmp_list);
- if (layoutcommit_needed(nfsi)) {
- status = pnfs_layoutcommit_inode(ino, wait);
- if (status) {
- /* Return layout even if layoutcommit fails */
- dprintk("%s: layoutcommit failed, status=%d. "
- "Returning layout anyway\n",
- __func__, status);
- }
- }
- status = return_layout(ino, &arg, wait);
+ /* Don't need to wait since this is followed by call to end_writeback */
out:
dprintk("<-- %s status: %d\n", __func__, status);
return status;
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index d999e38..0ddab0d 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -183,7 +183,6 @@ extern int nfs4_proc_getdeviceinfo(struct nfs_server *server,
extern int nfs4_proc_layoutget(struct nfs4_layoutget *lgp);
extern int nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data,
int issync);
-extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
/* pnfs.c */
void get_layout_hdr(struct pnfs_layout_hdr *lo);
@@ -193,7 +192,6 @@ bool should_free_lseg(struct pnfs_layout_range *lseg_range,
struct pnfs_layout_segment *
pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
enum pnfs_iomode access_type);
-bool pnfs_return_layout_barrier(struct nfs_inode *, struct pnfs_layout_range *);
int _pnfs_return_layout(struct inode *, struct pnfs_layout_range *, bool wait);
void set_pnfs_layoutdriver(struct nfs_server *, u32 id);
void unset_pnfs_layoutdriver(struct nfs_server *);
--
1.7.2.1
On 12/16/2010 04:13 PM, Fred Isaman wrote:
> On Thu, Dec 16, 2010 at 7:47 AM, Boaz Harrosh <[email protected]> wrote:
>> On 12/10/2010 03:22 AM, Fred Isaman wrote:
>>> Recent changes to close can delay pending layoutcommit until umount,
>>> when the async layoutcommits can come trickling in after we have destroyed
>>> the session.
>>
>> Then "Recent changes" are broken and should be fixed. It was fine
>> before. New broken code is not acceptable.
>>
>>> Since file does not need them, just turn them off for
>>> the moment. Non-file layouts will probably have to trigger them in
>>> some fashion at close.
>>>
>>
>> Rrrr. Are we back to this argument? We settle an argument,
>> and 2 weeks later you are back on it as if we never talked about it.
>>
>> NO!!! Only "coherent clustered filesystems" do not need them. It has
>> nothing to do with layout type. A non-clustered aggregated parallel
>> filesystem will need them just the same as blocks and objects.
>>
>> AND THE STD DOES NOT GIVE YOU A CHOICE!!!
>
> You keep saying this, but just repeating it does not convince me.
> Could you please take the time to explain *why* they are needed. A
> separate thread in the ietf list would be great. Because right now,
> Andy and I are coding and preparing for submission to Trond under the
> assumption that they are possibly a nice optimization, but are never
> actually needed for the file layout.
>
> Fred
>
I have, many times. You are being forgetful. At the last bakeathon Mike, I
think, or someone, gave a proposal talk saying "since layoutcommit is useless,
let's make it optional". And we went into a long argument about that, until
finally Trond came to the rescue and explained very clearly, better than
I ever could, that if you want to keep current reboot semantics a client
must send a successful layoutcommit after all successful DS commits before
it can free its dirty pages. layoutcommit is the writing of meta-data after
the write of data, and is just as important a part of a file.
Consider an spNFS type filesystem that keeps all meta-data on the MDS and
the data on disjoint independent DSs. None of the servers see the same
data and the MDS is just another client + meta information. layoutcommit
is the point where data is exposed outside and also the point where meta-data
is persisted (somewhere). Without layoutcommits I cannot build such
simple, scalable and possibly reliable FSs. The authors of the STD did
envision such systems and invented the layoutcommit. (There is the issue of
the changed_attribute of a crashing writing client, but this is solvable with
very simple MDS-to-DS communication, and is another matter.)
The fact that we had the above talk also proves another thing. That the
STD does not make layoutcommit optional, hence the request to make it
optional.
And nothing of the above is new. I have stated that many times and
you keep "forgetting".
BTW: An object based filesystem could be implemented just as well
over files as over objects, save the RAID5 stuff: objects being
partition/objectid numeric file names, and uid/gid games for fencing
off / access permission. In such a system layoutcommits serve the
same purpose as in objects, but with a files layout.
You can call me and we can talk about it if you need to. But do
you agree that the current STD text mandates it?
Boaz
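The ordering rule described in the mail above (all DS COMMITs succeed, then a LAYOUTCOMMIT succeeds, and only then may dirty pages be freed) can be sketched roughly as follows. This is a user-space illustration only, not kernel code: `struct write_batch`, `enum page_state` and `batch_release_pages()` are hypothetical names invented for the example and appear nowhere in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state, for illustration only. */
enum page_state { PAGE_DIRTY, PAGE_DS_COMMITTED, PAGE_CLEAN };

/* Hypothetical batch of pages written through one layout. */
struct write_batch {
	enum page_state *pages;
	size_t npages;
	bool layoutcommit_ok;	/* set once LAYOUTCOMMIT succeeded */
};

/*
 * Release (mark clean) the pages of a batch only if every page was
 * committed on its data server AND the LAYOUTCOMMIT to the MDS
 * succeeded.  Returns false and leaves the pages untouched otherwise.
 */
bool batch_release_pages(struct write_batch *b)
{
	size_t i;

	assert(b != NULL);
	for (i = 0; i < b->npages; i++)
		if (b->pages[i] != PAGE_DS_COMMITTED)
			return false;	/* a DS COMMIT is still missing */
	if (!b->layoutcommit_ok)
		return false;		/* meta-data not yet persisted */
	for (i = 0; i < b->npages; i++)
		b->pages[i] = PAGE_CLEAN;
	return true;
}
```

Under the opposing view in this thread (file layouts on a coherent clustered filesystem), the `layoutcommit_ok` check would be treated as an optimization rather than a precondition; the sketch only illustrates the stricter reading.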
This reverts commit 361887febfa01e06c5ae7c38cc6867da3e4cf31d.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4xdr.c | 11 ++---------
fs/nfs/pnfs.c | 9 +--------
include/linux/nfs_xdr.h | 2 --
3 files changed, 3 insertions(+), 19 deletions(-)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index b9fe438..dc63895 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -439,15 +439,13 @@ static int nfs4_stat_to_errno(int);
encode_putfh_maxsz + \
encode_close_maxsz + \
encode_getattr_maxsz + \
- encode_layoutreturn_maxsz + \
- encode_layoutcommit_maxsz)
+ encode_layoutreturn_maxsz)
#define NFS4_dec_close_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
decode_close_maxsz + \
decode_getattr_maxsz + \
- decode_layoutreturn_maxsz + \
- decode_layoutcommit_maxsz)
+ decode_layoutreturn_maxsz)
#define NFS4_enc_setattr_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
@@ -2138,8 +2136,6 @@ static int nfs4_xdr_enc_close(struct rpc_rqst *req, __be32 *p, struct nfs_closea
encode_compound_hdr(&xdr, req, &hdr);
encode_sequence(&xdr, &args->seq_args, &hdr);
encode_putfh(&xdr, args->fh, &hdr);
- if (args->op_bitmask & NFS4_HAS_LAYOUTCOMMIT) /* layoutcommit set */
- encode_layoutcommit(&xdr, &args->lc_args, &hdr);
encode_close(&xdr, args, &hdr);
encode_getfattr(&xdr, args->bitmask, &hdr);
if (args->op_bitmask & NFS4_HAS_LAYOUTRETURN) /* layoutreturn set */
@@ -5709,9 +5705,6 @@ static int nfs4_xdr_dec_close(struct rpc_rqst *rqstp, __be32 *p, struct nfs_clos
status = decode_putfh(&xdr);
if (status)
goto out;
- /* We pay no attention to the layoutcommit return */
- if (res->op_bitmask & NFS4_HAS_LAYOUTCOMMIT)
- decode_layoutcommit(&xdr);
status = decode_close(&xdr, res);
if (status != 0)
goto out;
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 1d3c849..4746b20 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -612,6 +612,7 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
* Return on close
*
* No LAYOUTRETURNS can be sent when BULK RECALL flag is set.
+ * FIXME: add layoutcommit operation if layoutcommit_needed is true.
*/
bool
pnfs_roc(struct nfs4_closedata *data)
@@ -639,14 +640,6 @@ pnfs_roc(struct nfs4_closedata *data)
}
if (found == false)
goto out_nolayout;
-
- /* Add layoutcommit operation if needed */
- if (layoutcommit_needed(NFS_I(data->inode))) {
- pnfs_layoutcommit_setup(data->inode, &data->arg.lc_args, false);
- data->res.op_bitmask |= NFS4_HAS_LAYOUTCOMMIT;
- data->arg.op_bitmask |= NFS4_HAS_LAYOUTCOMMIT;
- }
-
/* Stop new and drop response to outstanding LAYOUTGETS */
lo->plh_block_lgets++;
lo->plh_outstanding++;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 5ec855b..1b7364f 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -352,7 +352,6 @@ struct nfs_open_confirmres {
/* op_bitmask bits */
#define NFS4_HAS_LAYOUTRETURN 0x01
-#define NFS4_HAS_LAYOUTCOMMIT 0x02
struct nfs_closeargs {
struct nfs_fh * fh;
@@ -361,7 +360,6 @@ struct nfs_closeargs {
fmode_t fmode;
const u32 * bitmask;
u32 op_bitmask; /* which optional ops to encode */
- struct nfs4_layoutcommit_op_args lc_args; /* optional */
struct nfs4_layoutreturn_args lr_args; /* optional */
struct nfs4_sequence_args seq_args;
};
--
1.7.2.1
remove explicit LAYOUTRETURN call before CLOSE
ensure draining of io and forgetting of layouts marked roc before CLOSE
update barrier on return from CLOSE
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4proc.c | 14 ++++++-
fs/nfs/nfs4state.c | 16 +-------
fs/nfs/pnfs.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++----
fs/nfs/pnfs.h | 38 +++++++++++++------
5 files changed, 134 insertions(+), 37 deletions(-)
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index a917872..d58a130 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -242,7 +242,7 @@ extern int nfs4_proc_async_renew(struct nfs_client *, struct rpc_cred *);
extern int nfs4_proc_renew(struct nfs_client *, struct rpc_cred *);
extern int nfs4_init_clientid(struct nfs_client *, struct rpc_cred *);
extern int nfs41_init_clientid(struct nfs_client *, struct rpc_cred *);
-extern int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, int wait);
+extern int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, int wait, bool roc);
extern int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle);
extern int nfs4_proc_fs_locations(struct inode *dir, const struct qstr *name,
struct nfs4_fs_locations *fs_locations, struct page *page);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c4dc5b1..57f5a8a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1841,6 +1841,8 @@ struct nfs4_closedata {
struct nfs_closeres res;
struct nfs_fattr fattr;
unsigned long timestamp;
+ bool roc;
+ u32 roc_barrier;
};
static void nfs4_free_closedata(void *data)
@@ -1848,6 +1850,7 @@ static void nfs4_free_closedata(void *data)
struct nfs4_closedata *calldata = data;
struct nfs4_state_owner *sp = calldata->state->owner;
+ pnfs_roc_release(calldata->roc, calldata->state->inode);
nfs4_put_open_state(calldata->state);
nfs_free_seqid(calldata->arg.seqid);
nfs4_put_state_owner(sp);
@@ -1880,6 +1883,8 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
*/
switch (task->tk_status) {
case 0:
+ pnfs_roc_set_barrier(calldata->roc, state->inode,
+ calldata->roc_barrier);
nfs_set_open_stateid(state, &calldata->res.stateid, 0);
renew_lease(server, calldata->timestamp);
nfs4_close_clear_stateid_flags(state,
@@ -1932,8 +1937,11 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
return;
}
- if (calldata->arg.fmode == 0)
+ if (calldata->arg.fmode == 0) {
task->tk_msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_CLOSE];
+ pnfs_roc_drain(calldata->roc, state->inode,
+ &calldata->roc_barrier, task);
+ }
nfs_fattr_init(calldata->res.fattr);
calldata->timestamp = jiffies;
@@ -1961,7 +1969,7 @@ static const struct rpc_call_ops nfs4_close_ops = {
*
* NOTE: Caller must be holding the sp->so_owner semaphore!
*/
-int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, int wait)
+int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, int wait, bool roc)
{
struct nfs_server *server = NFS_SERVER(state->inode);
struct nfs4_closedata *calldata;
@@ -1996,6 +2004,7 @@ int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, i
calldata->res.fattr = &calldata->fattr;
calldata->res.seqid = calldata->arg.seqid;
calldata->res.server = server;
+ calldata->roc = roc;
path_get(path);
calldata->path = *path;
@@ -2013,6 +2022,7 @@ int nfs4_do_close(struct path *path, struct nfs4_state *state, gfp_t gfp_mask, i
out_free_calldata:
kfree(calldata);
out:
+ pnfs_roc_release(roc, state->inode);
nfs4_put_open_state(state);
nfs4_put_state_owner(sp);
return status;
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 6a1eb41..bca8386 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -619,21 +619,9 @@ static void __nfs4_close(struct path *path, struct nfs4_state *state,
nfs4_put_open_state(state);
nfs4_put_state_owner(owner);
} else {
- u32 roc_iomode;
- struct nfs_inode *nfsi = NFS_I(state->inode);
-
- if (has_layout(nfsi) &&
- (roc_iomode = pnfs_layout_roc_iomode(nfsi)) != 0) {
- struct pnfs_layout_range range = {
- .iomode = roc_iomode,
- .offset = 0,
- .length = NFS4_MAX_UINT64,
- };
-
- pnfs_return_layout(state->inode, &range, wait);
- }
+ bool roc = pnfs_roc(state->inode);
- nfs4_do_close(path, state, gfp_mask, wait);
+ nfs4_do_close(path, state, gfp_mask, wait, roc);
}
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 69f5e7b..e76d4f8 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -479,9 +479,12 @@ pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo, const nfs4_stateid *new,
newseq = be32_to_cpu(new->stateid.seqid);
if ((int)(newseq - oldseq) > 0) {
memcpy(&lo->stateid, &new->stateid, sizeof(new->stateid));
- if (update_barrier)
- lo->plh_barrier = be32_to_cpu(new->stateid.seqid);
- else {
+ if (update_barrier) {
+ u32 new_barrier = be32_to_cpu(new->stateid.seqid);
+
+ if ((int)(new_barrier - lo->plh_barrier))
+ lo->plh_barrier = new_barrier;
+ } else {
/* Because of wraparound, we want to keep the barrier
* "close" to the current seqids. It needs to be
* within 2**31 to count as "behind", so if it
@@ -690,6 +693,91 @@ out:
return status;
}
+bool pnfs_roc(struct inode *ino)
+{
+ struct pnfs_layout_hdr *lo;
+ struct pnfs_layout_segment *lseg, *tmp;
+ LIST_HEAD(tmp_list);
+ bool found = false;
+
+ spin_lock(&ino->i_lock);
+ lo = NFS_I(ino)->layout;
+ if (!lo || !test_and_clear_bit(NFS_LAYOUT_ROC, &lo->plh_flags) ||
+ test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
+ goto out_nolayout;
+ list_for_each_entry_safe(lseg, tmp, &lo->segs, fi_list)
+ if (test_bit(NFS_LSEG_ROC, &lseg->pls_flags)) {
+ mark_lseg_invalid(lseg, &tmp_list);
+ found = true;
+ }
+ if (!found)
+ goto out_nolayout;
+ lo->plh_block_lgets++;
+ get_layout_hdr(lo); /* matched in pnfs_roc_release */
+ spin_unlock(&ino->i_lock);
+ pnfs_free_lseg_list(&tmp_list);
+ return true;
+
+out_nolayout:
+ spin_unlock(&ino->i_lock);
+ return false;
+}
+
+void pnfs_roc_release(bool needed, struct inode *ino)
+{
+ if (needed) {
+ struct pnfs_layout_hdr *lo;
+
+ spin_lock(&ino->i_lock);
+ lo = NFS_I(ino)->layout;
+ lo->plh_block_lgets--;
+ put_layout_hdr_locked(lo);
+ spin_unlock(&ino->i_lock);
+ }
+}
+
+void pnfs_roc_set_barrier(bool needed, struct inode *ino, u32 barrier)
+{
+ if (needed) {
+ struct pnfs_layout_hdr *lo;
+
+ spin_lock(&ino->i_lock);
+ lo = NFS_I(ino)->layout;
+ if ((int)(barrier - lo->plh_barrier) > 0)
+ lo->plh_barrier = barrier;
+ spin_unlock(&ino->i_lock);
+ }
+}
+
+void pnfs_roc_drain(bool needed, struct inode *ino, u32 *barrier,
+ struct rpc_task *task)
+{
+ struct nfs_inode *nfsi = NFS_I(ino);
+ struct pnfs_layout_segment *lseg;
+ bool found = false;
+
+ if (!needed)
+ return;
+ spin_lock(&ino->i_lock);
+ list_for_each_entry(lseg, &nfsi->layout->segs, fi_list)
+ if (test_bit(NFS_LSEG_ROC, &lseg->pls_flags)) {
+ rpc_sleep_on(&NFS_I(ino)->lo_rpcwaitq, task, NULL);
+ found = true;
+ break;
+ }
+ if (!found) {
+ struct pnfs_layout_hdr *lo = nfsi->layout;
+ u32 current_seqid = be32_to_cpu(lo->stateid.stateid.seqid);
+
+ /* Since close does not return a layout stateid for use as
+ * a barrier, we choose the worst-case barrier.
+ */
+ *barrier = current_seqid + atomic_read(&lo->plh_outstanding);
+ }
+ spin_unlock(&ino->i_lock);
+ return;
+}
+
/*
* Compare two layout segments for sorting into layout cache.
* We want to preferentially return RW over RO layouts, so ensure those
@@ -958,11 +1046,8 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
pnfs_insert_layout(lo, lseg);
if (res->return_on_close) {
- /* FI: This needs to be re-examined. At lo level,
- * all it needs is a bit indicating whether any of
- * the lsegs in the list have the flags set.
- */
- lo->roc_iomode |= res->range.iomode;
+ set_bit(NFS_LSEG_ROC, &lseg->pls_flags);
+ set_bit(NFS_LAYOUT_ROC, &lo->plh_flags);
}
/* Done processing layoutget. Set the layout stateid */
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 82b9a7e..d999e38 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -35,6 +35,7 @@
enum {
NFS_LSEG_VALID = 0, /* cleared when lseg is recalled/returned */
+ NFS_LSEG_ROC, /* roc bit received from server */
};
struct pnfs_layout_segment {
@@ -60,6 +61,7 @@ enum {
NFS_LAYOUT_RW_FAILED, /* get rw layout failed stop trying */
NFS_LAYOUT_BULK_RECALL, /* bulk recall affecting layout */
NFS_LAYOUT_NEED_LCOMMIT, /* LAYOUTCOMMIT needed */
+ NFS_LAYOUT_ROC, /* some lseg had roc bit set */
};
/* Per-layout driver specific registration structure */
@@ -102,7 +104,6 @@ struct pnfs_layout_hdr {
struct list_head layouts; /* other client layouts */
struct list_head plh_bulk_recall; /* clnt list of bulk recalls */
struct list_head segs; /* layout segments list */
- int roc_iomode;/* return on close iomode, 0=none */
nfs4_stateid stateid;
atomic_t plh_outstanding; /* number of RPCs out */
unsigned long plh_block_lgets; /* block LAYOUTGET if >0 */
@@ -223,6 +224,11 @@ int pnfs_choose_layoutget_stateid(nfs4_stateid *dst,
void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
struct pnfs_layout_range *range,
struct list_head *tmp_list);
+bool pnfs_roc(struct inode *ino);
+void pnfs_roc_release(bool needed, struct inode *ino);
+void pnfs_roc_set_barrier(bool needed, struct inode *ino, u32 barrier);
+void pnfs_roc_drain(bool needed, struct inode *ino, u32 *barrier,
+ struct rpc_task *task);
static inline bool
has_layout(struct nfs_inode *nfsi)
@@ -248,14 +254,6 @@ static inline int pnfs_enabled_sb(struct nfs_server *nfss)
return nfss->pnfs_curr_ld != NULL;
}
-/* Should the pNFS client commit and return the layout on close
- */
-static inline int
-pnfs_layout_roc_iomode(struct nfs_inode *nfsi)
-{
- return nfsi->layout->roc_iomode;
-}
-
static inline int pnfs_return_layout(struct inode *ino,
struct pnfs_layout_range *range,
bool wait)
@@ -345,10 +343,26 @@ pnfs_ld_layoutret_on_setattr(struct inode *inode)
return false;
}
-static inline int
-pnfs_layout_roc_iomode(struct nfs_inode *nfsi)
+static inline bool
+pnfs_roc(struct inode *ino)
+{
+ return false;
+}
+
+static inline void
+pnfs_roc_release(bool needed, struct inode *ino)
+{
+}
+
+static inline void
+pnfs_roc_set_barrier(bool needed, struct inode *ino, u32 barrier)
+{
+}
+
+static inline void
+pnfs_roc_drain(bool needed, struct inode *ino, u32 *barrier,
+ struct rpc_task *task)
{
- return 0;
}
static inline int pnfs_return_layout(struct inode *ino,
--
1.7.2.1
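The barrier handling in the patch above compares 32-bit layout stateid seqids with a signed difference so that the comparison keeps working across wraparound, and pnfs_roc_drain picks a worst-case barrier by adding the outstanding count to the current seqid. A minimal user-space sketch of that arithmetic follows. The helper names (`seqid_after`, `barrier_advance`, `worst_case_barrier`, `seqid_demo`) are invented for this example, and the reading that each outstanding RPC can advance the seqid by at most one is an assumption, not something the patch states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if seqid a is "after" seqid b.  Only meaningful while the two
 * values are within 2^31 of each other, as the comment in
 * pnfs_set_layout_stateid notes.
 */
static bool seqid_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Move the barrier forward only; an older candidate is ignored. */
static void barrier_advance(uint32_t *barrier, uint32_t candidate)
{
	if (seqid_after(candidate, *barrier))
		*barrier = candidate;
}

/*
 * Worst-case barrier when the reply carries no layout stateid
 * (assumption: each outstanding RPC bumps the seqid at most once).
 * Unsigned addition wraps modulo 2^32, matching the seqid space.
 */
static uint32_t worst_case_barrier(uint32_t current_seqid,
				   uint32_t outstanding)
{
	return current_seqid + outstanding;
}

/* Small walk-through of the three helpers. */
void seqid_demo(void)
{
	uint32_t barrier = 0xfffffffeu;

	barrier_advance(&barrier, 3);	/* 3 is after 0xfffffffe (wrapped) */
	assert(barrier == 3);
	barrier_advance(&barrier, 1);	/* 1 is before 3: no change */
	assert(barrier == 3);
	assert(worst_case_barrier(0xffffffffu, 2) == 1);
}
```

The forward-only update mirrors the `(int)(barrier - lo->plh_barrier) > 0` test in pnfs_roc_set_barrier; a plain `a > b` comparison would give the wrong answer once the seqid wraps.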
On 12/10/2010 03:22 AM, Fred Isaman wrote:
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -599,55 +599,6 @@ void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
> }
> }
>
> -/* Return true if there is layout based io in progress in the given range.
> - * Assumes range has already been marked invalid, and layout marked to
> - * prevent any new lseg from being inserted.
> - */
> -bool
> -pnfs_return_layout_barrier(struct nfs_inode *nfsi,
> - struct pnfs_layout_range *range)
> -{
> - struct pnfs_layout_segment *lseg;
> - bool ret = false;
> -
> - spin_lock(&nfsi->vfs_inode.i_lock);
> - list_for_each_entry(lseg, &nfsi->layout->segs, fi_list)
> - if (should_free_lseg(&lseg->range, range)) {
> - ret = true;
> - break;
> - }
> - spin_unlock(&nfsi->vfs_inode.i_lock);
> - dprintk("%s:Return %d\n", __func__, ret);
> - return ret;
> -}
> -
> -static int
> -return_layout(struct inode *ino, struct pnfs_layout_range *range, bool wait)
> -{
> - struct nfs4_layoutreturn *lrp;
> - struct nfs_server *server = NFS_SERVER(ino);
> - int status = -ENOMEM;
> -
> - dprintk("--> %s\n", __func__);
> -
> - lrp = kzalloc(sizeof(*lrp), GFP_KERNEL);
> - if (lrp == NULL) {
> - put_layout_hdr(ino);
> - goto out;
> - }
> - lrp->args.reclaim = 0;
> - lrp->args.layout_type = server->pnfs_curr_ld->id;
> - lrp->args.return_type = RETURN_FILE;
> - lrp->args.range = *range;
> - lrp->args.inode = ino;
> - lrp->clp = server->nfs_client;
> -
> - status = nfs4_proc_layoutreturn(lrp, wait);
> -out:
> - dprintk("<-- %s status: %d\n", __func__, status);
> - return status;
> -}
> -
> /* Initiates a LAYOUTRETURN(FILE) */
> int
> _pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
> @@ -673,21 +624,10 @@ _pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
> goto out;
> }
> lo->plh_block_lgets++;
> - /* Reference matched in nfs4_layoutreturn_release */
> - get_layout_hdr(lo);
> spin_unlock(&ino->i_lock);
> pnfs_free_lseg_list(&tmp_list);
>
> - if (layoutcommit_needed(nfsi)) {
> - status = pnfs_layoutcommit_inode(ino, wait);
> - if (status) {
> - /* Return layout even if layoutcommit fails */
> - dprintk("%s: layoutcommit failed, status=%d. "
> - "Returning layout anyway\n",
> - __func__, status);
> - }
> - }
> - status = return_layout(ino, &arg, wait);
You are also removing the layoutcommit.
1. You have not stated it anywhere, and sneaked it in silently.
2. If you are removing layoutcommit please do that in a different
patch with its own comment and explanation.
3. How come? Forgetful or not, layoutcommits are a different issue
and must be done correctly when writing !!!?!
Boaz
> + /* Don't need to wait since this is followed by call to end_writeback */
> out:
> dprintk("<-- %s status: %d\n", __func__, status);
> return status;
> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
> index d999e38..0ddab0d 100644
> --- a/fs/nfs/pnfs.h
> +++ b/fs/nfs/pnfs.h
> @@ -183,7 +183,6 @@ extern int nfs4_proc_getdeviceinfo(struct nfs_server *server,
> extern int nfs4_proc_layoutget(struct nfs4_layoutget *lgp);
> extern int nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data,
> int issync);
> -extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
>
> /* pnfs.c */
> void get_layout_hdr(struct pnfs_layout_hdr *lo);
> @@ -193,7 +192,6 @@ bool should_free_lseg(struct pnfs_layout_range *lseg_range,
> struct pnfs_layout_segment *
> pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
> enum pnfs_iomode access_type);
> -bool pnfs_return_layout_barrier(struct nfs_inode *, struct pnfs_layout_range *);
> int _pnfs_return_layout(struct inode *, struct pnfs_layout_range *, bool wait);
> void set_pnfs_layoutdriver(struct nfs_server *, u32 id);
> void unset_pnfs_layoutdriver(struct nfs_server *);
This reverts commit b2ae4b414a3a56d0dff75b1278c7c359fd81b4b7.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4proc.c | 73 +++++++++++++---------------------------------
fs/nfs/nfs4state.c | 17 ++++++++++-
fs/nfs/nfs4xdr.c | 14 +--------
fs/nfs/pnfs.c | 64 ++++-------------------------------------
fs/nfs/pnfs.h | 1 -
include/linux/nfs_xdr.h | 19 ------------
6 files changed, 45 insertions(+), 143 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f993a4a..5f6120a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -74,8 +74,6 @@ static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle,
static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
struct nfs_fattr *fattr, struct iattr *sattr,
struct nfs4_state *state);
-static void nfs4_layoutreturn_set_stateid(struct inode *ino,
- struct nfs4_layoutreturn_res *res);
/* Prevent leaks of NFSv4 errors into userland */
static int nfs4_map_errors(int err)
@@ -1835,6 +1833,16 @@ static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
return err;
}
+struct nfs4_closedata {
+ struct path path;
+ struct inode *inode;
+ struct nfs4_state *state;
+ struct nfs_closeargs arg;
+ struct nfs_closeres res;
+ struct nfs_fattr fattr;
+ unsigned long timestamp;
+};
+
static void nfs4_free_closedata(void *data)
{
struct nfs4_closedata *calldata = data;
@@ -1844,17 +1852,6 @@ static void nfs4_free_closedata(void *data)
nfs_free_seqid(calldata->arg.seqid);
nfs4_put_state_owner(sp);
path_put(&calldata->path);
- if (calldata->res.op_bitmask & NFS4_HAS_LAYOUTRETURN) {
- struct pnfs_layout_hdr *lo = NFS_I(calldata->inode)->layout;
-
- spin_lock(&lo->inode->i_lock);
- lo->plh_block_lgets--;
- lo->plh_outstanding--;
- if (!pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(lo->inode)->lo_rpcwaitq_stateid);
- spin_unlock(&lo->inode->i_lock);
- put_layout_hdr(lo->inode);
- }
kfree(calldata);
}
@@ -1884,9 +1881,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
switch (task->tk_status) {
case 0:
nfs_set_open_stateid(state, &calldata->res.stateid, 0);
- if (calldata->res.op_bitmask & NFS4_HAS_LAYOUTRETURN)
- nfs4_layoutreturn_set_stateid(calldata->inode,
- &calldata->res.lr_res);
renew_lease(server, calldata->timestamp);
nfs4_close_clear_stateid_flags(state,
calldata->arg.fmode);
@@ -1938,27 +1932,8 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
return;
}
- if (calldata->arg.fmode == 0) {
+ if (calldata->arg.fmode == 0)
task->tk_msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_CLOSE];
- /* Are there layout segments to return on close? */
- if (pnfs_roc(calldata)) {
- struct nfs_inode *nfsi = NFS_I(calldata->inode);
- if (pnfs_return_layout_barrier(nfsi,
- &calldata->arg.lr_args.range)) {
- dprintk("%s: waiting on barrier\n", __func__);
- /* FIXME race with wake here */
- rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
- spin_lock(&calldata->inode->i_lock);
- nfsi->layout->plh_block_lgets--;
- nfsi->layout->plh_outstanding--;
- if (!pnfs_layoutgets_blocked(nfsi->layout, NULL))
- rpc_wake_up(&nfsi->lo_rpcwaitq_stateid);
- spin_unlock(&calldata->inode->i_lock);
- put_layout_hdr(calldata->inode);
- return;
- }
- }
- }
nfs_fattr_init(calldata->res.fattr);
calldata->timestamp = jiffies;
@@ -5626,7 +5601,6 @@ nfs4_layoutreturn_prepare(struct rpc_task *task, void *calldata)
if (pnfs_return_layout_barrier(nfsi, &lrp->args.range)) {
dprintk("%s: waiting on barrier\n", __func__);
- /* FIXME race with wake here */
rpc_sleep_on(&nfsi->lo_rpcwaitq, task, NULL);
return;
}
@@ -5637,19 +5611,6 @@ nfs4_layoutreturn_prepare(struct rpc_task *task, void *calldata)
rpc_call_start(task);
}
-static void nfs4_layoutreturn_set_stateid(struct inode *ino,
- struct nfs4_layoutreturn_res *res)
-{
- struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
-
- spin_lock(&ino->i_lock);
- if (res->lrs_present)
- pnfs_set_layout_stateid(lo, &res->stateid, true);
- else
- BUG_ON(!list_empty(&lo->segs));
- spin_unlock(&ino->i_lock);
-}
-
static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
{
struct nfs4_layoutreturn *lrp = calldata;
@@ -5668,8 +5629,16 @@ static void nfs4_layoutreturn_done(struct rpc_task *task, void *calldata)
nfs_restart_rpc(task, lrp->clp);
return;
}
- if ((task->tk_status == 0) && (lrp->args.return_type == RETURN_FILE))
- nfs4_layoutreturn_set_stateid(lrp->args.inode, &lrp->res);
+ if ((task->tk_status == 0) && (lrp->args.return_type == RETURN_FILE)) {
+ struct pnfs_layout_hdr *lo = NFS_I(lrp->args.inode)->layout;
+
+ spin_lock(&lo->inode->i_lock);
+ if (lrp->res.lrs_present)
+ pnfs_set_layout_stateid(lo, &lrp->res.stateid, true);
+ else
+ BUG_ON(!list_empty(&lo->segs));
+ spin_unlock(&lo->inode->i_lock);
+ }
dprintk("<-- %s\n", __func__);
}
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 304cd30..466fc8b 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -618,8 +618,23 @@ static void __nfs4_close(struct path *path, struct nfs4_state *state,
if (!call_close) {
nfs4_put_open_state(state);
nfs4_put_state_owner(owner);
- } else
+ } else {
+ u32 roc_iomode;
+ struct nfs_inode *nfsi = NFS_I(state->inode);
+
+ if (has_layout(nfsi) &&
+ (roc_iomode = pnfs_layout_roc_iomode(nfsi)) != 0) {
+ struct pnfs_layout_range range = {
+ .iomode = roc_iomode,
+ .offset = 0,
+ .length = NFS4_MAX_UINT64,
+ };
+
+ pnfs_return_layout(state->inode, &range, wait);
+ }
+
nfs4_do_close(path, state, gfp_mask, wait);
+ }
}
void nfs4_close_state(struct path *path, struct nfs4_state *state, fmode_t fmode)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index eb5c922..7f92bfa 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -438,14 +438,12 @@ static int nfs4_stat_to_errno(int);
encode_sequence_maxsz + \
encode_putfh_maxsz + \
encode_close_maxsz + \
- encode_getattr_maxsz + \
- encode_layoutreturn_maxsz)
+ encode_getattr_maxsz)
#define NFS4_dec_close_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
decode_close_maxsz + \
- decode_getattr_maxsz + \
- decode_layoutreturn_maxsz)
+ decode_getattr_maxsz)
#define NFS4_enc_setattr_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
@@ -2145,8 +2143,6 @@ static int nfs4_xdr_enc_close(struct rpc_rqst *req, __be32 *p, struct nfs_closea
encode_putfh(&xdr, args->fh, &hdr);
encode_close(&xdr, args, &hdr);
encode_getfattr(&xdr, args->bitmask, &hdr);
- if (args->op_bitmask & NFS4_HAS_LAYOUTRETURN) /* layoutreturn set */
- encode_layoutreturn(&xdr, &args->lr_args, &hdr);
encode_nops(&hdr);
return 0;
}
@@ -5723,12 +5719,6 @@ static int nfs4_xdr_dec_close(struct rpc_rqst *rqstp, __be32 *p, struct nfs_clos
*/
decode_getfattr(&xdr, res->fattr, res->server,
!RPC_IS_ASYNC(rqstp->rq_task));
- /*
- * With the forgetful model, we pay no attention to the
- * layoutreturn status.
- */
- if (res->op_bitmask & NFS4_HAS_LAYOUTRETURN)
- decode_layoutreturn(&xdr, &res->lr_res);
out:
return status;
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 8dbac82..3ee9621 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -608,63 +608,6 @@ pnfs_return_layout_barrier(struct nfs_inode *nfsi,
return ret;
}
-/*
- * Return on close
- *
- * No LAYOUTRETURNS can be sent when BULK RECALL flag is set.
- * FIXME: add layoutcommit operation if layoutcommit_needed is true.
- */
-bool
-pnfs_roc(struct nfs4_closedata *data)
-{
- struct nfs4_layoutreturn_args *lr_args = &data->arg.lr_args;
- struct pnfs_layout_hdr *lo;
- struct pnfs_layout_segment *lseg, *tmp;
- struct pnfs_layout_range range = {
- .length = NFS4_MAX_UINT64,
- };
- LIST_HEAD(tmp_list);
- bool found = false;
-
- spin_lock(&data->inode->i_lock);
- lo = NFS_I(data->inode)->layout;
- if (!lo || lo->roc_iomode == 0 ||
- test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
- goto out_nolayout;
-
- range.iomode = lo->roc_iomode;
- list_for_each_entry_safe(lseg, tmp, &lo->segs, fi_list)
- if (should_free_lseg(&lseg->range, &range)) {
- mark_lseg_invalid(lseg, &tmp_list);
- found = true;
- }
- if (found == false)
- goto out_nolayout;
- /* Stop new and drop response to outstanding LAYOUTGETS */
- lo->plh_block_lgets++;
- lo->plh_outstanding++;
- /* Reference matched in pnfs_layoutreturn_release */
- get_layout_hdr(lo);
-
- spin_unlock(&data->inode->i_lock);
-
- pnfs_free_lseg_list(&tmp_list);
-
- lr_args->reclaim = 0;
- lr_args->layout_type = NFS_SERVER(data->inode)->pnfs_curr_ld->id;
- lr_args->return_type = RETURN_FILE;
- lr_args->range = range;
- lr_args->inode = data->inode;
- data->res.op_bitmask |= NFS4_HAS_LAYOUTRETURN;
- data->arg.op_bitmask |= NFS4_HAS_LAYOUTRETURN;
-
- return true;
-
-out_nolayout:
- spin_unlock(&data->inode->i_lock);
- return false;
-}
-
static int
return_layout(struct inode *ino, struct pnfs_layout_range *range, bool wait)
{
@@ -1000,8 +943,13 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
*lgp->lsegpp = lseg;
pnfs_insert_layout(lo, lseg);
- if (res->return_on_close)
+ if (res->return_on_close) {
+ /* FI: This needs to be re-examined. At lo level,
+ * all it needs is a bit indicating whether any of
+ * the lsegs in the list have the flags set.
+ */
lo->roc_iomode |= res->range.iomode;
+ }
/* Done processing layoutget. Set the layout stateid */
pnfs_set_layout_stateid(lo, &res->stateid, false);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 7029926..4b6065f 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -230,7 +230,6 @@ void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
struct pnfs_layout_range *range,
int notify_bit, atomic_t *notify_count,
struct list_head *tmp_list);
-bool pnfs_roc(struct nfs4_closedata *data);
static inline bool
has_layout(struct nfs_inode *nfsi)
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ece0b2e..a651574 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -350,18 +350,12 @@ struct nfs_open_confirmres {
/*
* Arguments to the close call.
*/
-
-/* op_bitmask bits */
-#define NFS4_HAS_LAYOUTRETURN 0x01
-
struct nfs_closeargs {
struct nfs_fh * fh;
nfs4_stateid * stateid;
struct nfs_seqid * seqid;
fmode_t fmode;
const u32 * bitmask;
- u32 op_bitmask; /* which optional ops to encode */
- struct nfs4_layoutreturn_args lr_args; /* optional */
struct nfs4_sequence_args seq_args;
};
@@ -370,21 +364,8 @@ struct nfs_closeres {
struct nfs_fattr * fattr;
struct nfs_seqid * seqid;
const struct nfs_server *server;
- u32 op_bitmask; /* which optional ops encoded */
- struct nfs4_layoutreturn_res lr_res; /* optional */
struct nfs4_sequence_res seq_res;
};
-
-struct nfs4_closedata {
- struct path path;
- struct inode *inode;
- struct nfs4_state *state;
- struct nfs_closeargs arg;
- struct nfs_closeres res;
- struct nfs_fattr fattr;
- unsigned long timestamp;
-};
-
/*
* * Arguments to the lock,lockt, and locku call.
* */
--
1.7.2.1
This reverts commit 462e3c258daa62a5331a0febff3e224da6af5481.
The memset is not needed, since layouttype is already initialized by
commit 85e174ba6b ("NFS: set layout driver").
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/client.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index dbf43e7..172175f 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -950,7 +950,6 @@ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, str
goto out_error;
}
- memset(&fsinfo, 0, sizeof(fsinfo));
fsinfo.fattr = fattr;
fsinfo.layouttype = 0;
error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo);
--
1.7.2.1
instead, just go directly to the MDS. This simplifies things a lot. The
primary problem that prompted this is that plh_outstanding accounting
was not correct, and there is no easy way to make it correct if it is
incremented within layoutget_prepare (due to ctrl-c issues).
This patch basically does three things:
- remove the rpc waitqs
- push plh_outstanding accounting outside of the layoutget rpc code
- shift pnfs_layoutgets_blocked() up and make static
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/callback_proc.c | 4 ---
fs/nfs/client.c | 2 -
fs/nfs/inode.c | 1 -
fs/nfs/nfs4proc.c | 45 +--------------------------------
fs/nfs/pnfs.c | 59 +++++++++++++++++++++++---------------------
fs/nfs/pnfs.h | 1 -
include/linux/nfs_fs.h | 1 -
include/linux/nfs_fs_sb.h | 1 -
8 files changed, 33 insertions(+), 81 deletions(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index d4aec46..4cd7e84 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -224,7 +224,6 @@ void nfs_client_return_layouts(struct nfs_client *clp)
list_del(&cb_info->pcl_list);
clp->cl_cb_lrecall_count--;
clp->cl_drain_notification[1 << cb_info->pcl_notify_bit] = NULL;
- rpc_wake_up(&clp->cl_rpcwaitq_recall);
kfree(cb_info);
}
}
@@ -376,7 +375,6 @@ static u32 do_callback_layoutrecall(struct nfs_client *clp,
list_del(&new->pcl_list);
clp->cl_cb_lrecall_count--;
clp->cl_drain_notification[1 << bit_num] = NULL;
- rpc_wake_up(&clp->cl_rpcwaitq_recall);
spin_unlock(&clp->cl_lock);
if (res == NFS4_OK) {
if (args->cbl_recall_type == RETURN_FILE) {
@@ -385,8 +383,6 @@ static u32 do_callback_layoutrecall(struct nfs_client *clp,
lo = NFS_I(new->pcl_ino)->layout;
spin_lock(&lo->inode->i_lock);
lo->plh_block_lgets--;
- if (!pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(lo->inode)->lo_rpcwaitq_stateid);
spin_unlock(&lo->inode->i_lock);
put_layout_hdr(new->pcl_ino);
}
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 172175f..f8e712f 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -159,8 +159,6 @@ static struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_
#if defined(CONFIG_NFS_V4_1)
INIT_LIST_HEAD(&clp->cl_layouts);
INIT_LIST_HEAD(&clp->cl_layoutrecalls);
- rpc_init_wait_queue(&clp->cl_rpcwaitq_recall,
- "NFS client CB_LAYOUTRECALLS");
#endif
nfs_fscache_get_client_cookie(clp);
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index bbeb337..54a2fc7 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1460,7 +1460,6 @@ static inline void nfs4_init_once(struct nfs_inode *nfsi)
nfsi->delegation_state = 0;
init_rwsem(&nfsi->rwsem);
rpc_init_wait_queue(&nfsi->lo_rpcwaitq, "pNFS Layoutreturn");
- rpc_init_wait_queue(&nfsi->lo_rpcwaitq_stateid, "pNFS Layoutstateid");
nfsi->layout = NULL;
#endif
}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 5f6120a..b161393 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5357,43 +5357,17 @@ static void
nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
{
struct nfs4_layoutget *lgp = calldata;
- struct inode *ino = lgp->args.inode;
- struct nfs_inode *nfsi = NFS_I(ino);
- struct nfs_server *server = NFS_SERVER(ino);
- struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;
+ struct nfs_server *server = NFS_SERVER(lgp->args.inode);
dprintk("--> %s\n", __func__);
- spin_lock(&clp->cl_lock);
- if (matches_outstanding_recall(ino, &lgp->args.range)) {
- rpc_sleep_on(&clp->cl_rpcwaitq_recall, task, NULL);
- spin_unlock(&clp->cl_lock);
- return;
- }
- spin_unlock(&clp->cl_lock);
/* Note the is a race here, where a CB_LAYOUTRECALL can come in
* right now covering the LAYOUTGET we are about to send.
* However, that is not so catastrophic, and there seems
* to be no way to prevent it completely.
*/
- spin_lock(&ino->i_lock);
- if (pnfs_layoutgets_blocked(nfsi->layout, NULL)) {
- rpc_sleep_on(&nfsi->lo_rpcwaitq_stateid, task, NULL);
- spin_unlock(&ino->i_lock);
- return;
- }
- /* This needs after above check but atomic with it in order to properly
- * serialize openstateid LAYOUTGETs.
- */
- nfsi->layout->plh_outstanding++;
- spin_unlock(&ino->i_lock);
-
if (nfs4_setup_sequence(server, NULL, &lgp->args.seq_args,
- &lgp->res.seq_res, 0, task)) {
- spin_lock(&ino->i_lock);
- nfsi->layout->plh_outstanding--;
- spin_unlock(&ino->i_lock);
+ &lgp->res.seq_res, 0, task))
return;
- }
rpc_call_start(task);
}
@@ -5422,9 +5396,6 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
default:
if (nfs4_async_handle_error(task, NFS_SERVER(ino),
NULL, NULL) == -EAGAIN) {
- spin_lock(&ino->i_lock);
- NFS_I(ino)->layout->plh_outstanding--;
- spin_unlock(&ino->i_lock);
rpc_restart_call_prepare(task);
return;
}
@@ -5486,16 +5457,6 @@ int nfs4_proc_layoutget(struct nfs4_layoutget *lgp)
status = task->tk_status;
if (status == 0)
status = pnfs_layout_process(lgp);
- else {
- struct inode *ino = lgp->args.inode;
- struct pnfs_layout_hdr *lo = NFS_I(ino)->layout;
-
- spin_lock(&ino->i_lock);
- lo->plh_outstanding--;
- if (!pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
- spin_unlock(&ino->i_lock);
- }
rpc_put_task(task);
dprintk("<-- %s status=%d\n", __func__, status);
return status;
@@ -5654,8 +5615,6 @@ static void nfs4_layoutreturn_release(void *calldata)
spin_lock(&ino->i_lock);
lo->plh_block_lgets--;
lo->plh_outstanding--;
- if (!pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
spin_unlock(&ino->i_lock);
put_layout_hdr(ino);
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 2e35706..abb3eb0 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -297,8 +297,6 @@ _put_lseg_common(struct pnfs_layout_segment *lseg)
list_del_init(&lseg->layout->layouts);
spin_unlock(&clp->cl_lock);
clear_bit(NFS_LAYOUT_BULK_RECALL, &lseg->layout->plh_flags);
- if (!pnfs_layoutgets_blocked(lseg->layout, NULL))
- rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
}
rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq);
}
@@ -496,6 +494,20 @@ pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo, const nfs4_stateid *new,
}
}
+/* lget is set to 1 if called from inside send_layoutget call chain */
+static bool
+pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid,
+ int lget)
+{
+ assert_spin_locked(&lo->inode->i_lock);
+ if ((stateid) &&
+ (int)(lo->plh_barrier - be32_to_cpu(stateid->stateid.seqid)) >= 0)
+ return true;
+ return lo->plh_block_lgets ||
+ test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
+ (list_empty(&lo->segs) && (lo->plh_outstanding > lget));
+}
+
int
pnfs_choose_layoutget_stateid(nfs4_stateid *dst, struct pnfs_layout_hdr *lo,
struct nfs4_state *open_state)
@@ -504,8 +516,7 @@ pnfs_choose_layoutget_stateid(nfs4_stateid *dst, struct pnfs_layout_hdr *lo,
dprintk("--> %s\n", __func__);
spin_lock(&lo->inode->i_lock);
- if (lo->plh_block_lgets ||
- test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
+ if (pnfs_layoutgets_blocked(lo, NULL, 1)) {
/* We avoid -EAGAIN, as that has special meaning to
* some callers.
*/
@@ -720,9 +731,6 @@ pnfs_insert_layout(struct pnfs_layout_hdr *lo,
}
if (!found) {
list_add_tail(&lseg->fi_list, &lo->segs);
- if (list_is_singular(&lo->segs) &&
- !pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(lo->inode)->lo_rpcwaitq_stateid);
dprintk("%s: inserted lseg %p "
"iomode %d offset %llu length %llu at tail\n",
__func__, lseg, lseg->range.iomode,
@@ -839,6 +847,13 @@ pnfs_update_layout(struct inode *ino,
if (!pnfs_enabled_sb(NFS_SERVER(ino)))
return NULL;
+ spin_lock(&clp->cl_lock);
+ if (matches_outstanding_recall(ino, &arg)) {
+ dprintk("%s matches recall, use MDS\n", __func__);
+ spin_unlock(&clp->cl_lock);
+ return NULL;
+ }
+ spin_unlock(&clp->cl_lock);
spin_lock(&ino->i_lock);
lo = pnfs_find_alloc_layout(ino);
if (lo == NULL) {
@@ -855,6 +870,10 @@ pnfs_update_layout(struct inode *ino,
if (test_bit(lo_fail_bit(iomode), &nfsi->layout->plh_flags))
goto out_unlock;
+ if (pnfs_layoutgets_blocked(lo, NULL, 0))
+ goto out_unlock;
+ lo->plh_outstanding++;
+
get_layout_hdr(lo); /* Matched in pnfs_layoutget_release */
if (list_empty(&lo->segs)) {
/* The lo must be on the clp list if there is any
@@ -868,16 +887,17 @@ pnfs_update_layout(struct inode *ino,
spin_unlock(&ino->i_lock);
lseg = send_layoutget(lo, ctx, &arg);
+ spin_lock(&ino->i_lock);
if (!lseg) {
- spin_lock(&ino->i_lock);
if (list_empty(&lo->segs)) {
spin_lock(&clp->cl_lock);
list_del_init(&lo->layouts);
spin_unlock(&clp->cl_lock);
clear_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
}
- spin_unlock(&ino->i_lock);
}
+ lo->plh_outstanding--;
+ spin_unlock(&ino->i_lock);
out:
dprintk("%s end, state 0x%lx lseg %p\n", __func__,
nfsi->layout->plh_flags, lseg);
@@ -887,18 +907,6 @@ out_unlock:
goto out;
}
-bool
-pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid)
-{
- assert_spin_locked(&lo->inode->i_lock);
- if ((stateid) &&
- (int)(lo->plh_barrier - be32_to_cpu(stateid->stateid.seqid)) >= 0)
- return true;
- return lo->plh_block_lgets ||
- test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
- (list_empty(&lo->segs) && lo->plh_outstanding);
-}
-
int
pnfs_layout_process(struct nfs4_layoutget *lgp)
{
@@ -929,13 +937,11 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
status = PTR_ERR(lseg);
dprintk("%s: Could not allocate layout: error %d\n",
__func__, status);
- spin_lock(&ino->i_lock);
goto out;
}
spin_lock(&ino->i_lock);
/* decrement needs to be done before call to pnfs_layoutget_blocked */
- lo->plh_outstanding--;
spin_lock(&clp->cl_lock);
if (matches_outstanding_recall(ino, &res->range)) {
spin_unlock(&clp->cl_lock);
@@ -944,7 +950,7 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
}
spin_unlock(&clp->cl_lock);
- if (pnfs_layoutgets_blocked(lo, &res->stateid)) {
+ if (pnfs_layoutgets_blocked(lo, &res->stateid, 1)) {
dprintk("%s forget reply due to state\n", __func__);
goto out_forget_reply;
}
@@ -964,17 +970,14 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
/* Done processing layoutget. Set the layout stateid */
pnfs_set_layout_stateid(lo, &res->stateid, false);
-out:
- if (!pnfs_layoutgets_blocked(lo, NULL))
- rpc_wake_up(&NFS_I(ino)->lo_rpcwaitq_stateid);
spin_unlock(&ino->i_lock);
+out:
return status;
out_forget_reply:
spin_unlock(&ino->i_lock);
lseg->layout = lo;
NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg);
- spin_lock(&ino->i_lock);
goto out;
}
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 4b6065f..8d2ab18 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -214,7 +214,6 @@ enum pnfs_try_status pnfs_try_to_commit(struct nfs_write_data *,
void pnfs_pageio_init_read(struct nfs_pageio_descriptor *, struct inode *,
struct nfs_open_context *, struct list_head *);
void pnfs_pageio_init_write(struct nfs_pageio_descriptor *, struct inode *);
-bool pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid);
int pnfs_layout_process(struct nfs4_layoutget *lgp);
void pnfs_free_lseg_list(struct list_head *tmp_list);
void pnfs_destroy_layout(struct nfs_inode *);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index b4bb8d6..caed83e 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -191,7 +191,6 @@ struct nfs_inode {
/* pNFS layout information */
struct rpc_wait_queue lo_rpcwaitq;
- struct rpc_wait_queue lo_rpcwaitq_stateid;
struct pnfs_layout_hdr *layout;
#endif /* CONFIG_NFS_V4*/
#ifdef CONFIG_NFS_FSCACHE
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index c3127cc..956a103 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -87,7 +87,6 @@ struct nfs_client {
unsigned long cl_cb_lrecall_count;
#define PNFS_MAX_CB_LRECALLS (64)
atomic_t *cl_drain_notification[PNFS_MAX_CB_LRECALLS];
- struct rpc_wait_queue cl_rpcwaitq_recall;
struct pnfs_deviceid_cache *cl_devid_cache; /* pNFS deviceid cache */
#endif /* CONFIG_NFS_V4_1 */
--
1.7.2.1
This reverts commit aafe6a6b39b0ddf87496d3912b308bf08ed6b2b9.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4xdr.c | 35 +++++++++++++++++------------------
fs/nfs/pnfs.c | 1 -
2 files changed, 17 insertions(+), 19 deletions(-)
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index f3aaa21..8dbfbb0 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -441,17 +441,17 @@ static int nfs4_stat_to_errno(int);
#define NFS4_enc_close_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
- encode_layoutcommit_maxsz + \
- encode_layoutreturn_maxsz + \
encode_close_maxsz + \
- encode_getattr_maxsz)
+ encode_getattr_maxsz + \
+ encode_layoutreturn_maxsz + \
+ encode_layoutcommit_maxsz)
#define NFS4_dec_close_sz (compound_decode_hdr_maxsz + \
decode_sequence_maxsz + \
decode_putfh_maxsz + \
- decode_layoutcommit_maxsz + \
- decode_layoutreturn_maxsz + \
decode_close_maxsz + \
- decode_getattr_maxsz)
+ decode_getattr_maxsz + \
+ decode_layoutreturn_maxsz + \
+ decode_layoutcommit_maxsz)
#define NFS4_enc_setattr_sz (compound_encode_hdr_maxsz + \
encode_sequence_maxsz + \
encode_putfh_maxsz + \
@@ -2160,10 +2160,10 @@ static int nfs4_xdr_enc_close(struct rpc_rqst *req, __be32 *p, struct nfs_closea
encode_putfh(&xdr, args->fh, &hdr);
if (args->op_bitmask & NFS4_HAS_LAYOUTCOMMIT) /* layoutcommit set */
encode_layoutcommit(&xdr, &args->lc_args, &hdr);
- if (args->op_bitmask & NFS4_HAS_LAYOUTRETURN) /* layoutreturn set */
- encode_layoutreturn(&xdr, &args->lr_args, &hdr);
encode_close(&xdr, args, &hdr);
encode_getfattr(&xdr, args->bitmask, &hdr);
+ if (args->op_bitmask & NFS4_HAS_LAYOUTRETURN) /* layoutreturn set */
+ encode_layoutreturn(&xdr, &args->lr_args, &hdr);
encode_nops(&hdr);
return 0;
}
@@ -5743,16 +5743,9 @@ static int nfs4_xdr_dec_close(struct rpc_rqst *rqstp, __be32 *p, struct nfs_clos
status = decode_putfh(&xdr);
if (status)
goto out;
- if (res->op_bitmask & NFS4_HAS_LAYOUTCOMMIT) {
- status = decode_layoutcommit(&xdr);
- if (status)
- goto out;
- }
- if (res->op_bitmask & NFS4_HAS_LAYOUTRETURN) {
- status = decode_layoutreturn(&xdr, &res->lr_res);
- if (status)
- goto out;
- }
+ /* We pay no attention to the layoutcommit return */
+ if (res->op_bitmask & NFS4_HAS_LAYOUTCOMMIT)
+ decode_layoutcommit(&xdr);
status = decode_close(&xdr, res);
if (status != 0)
goto out;
@@ -5764,6 +5757,12 @@ static int nfs4_xdr_dec_close(struct rpc_rqst *rqstp, __be32 *p, struct nfs_clos
*/
decode_getfattr(&xdr, res->fattr, res->server,
!RPC_IS_ASYNC(rqstp->rq_task));
+ /*
+ * With the forgetful model, we pay no attention to the
+ * layoutreturn status.
+ */
+ if (res->op_bitmask & NFS4_HAS_LAYOUTRETURN)
+ decode_layoutreturn(&xdr, &res->lr_res);
out:
return status;
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index d3ce095..1d3c849 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -625,7 +625,6 @@ pnfs_roc(struct nfs4_closedata *data)
LIST_HEAD(tmp_list);
bool found = false;
- data->arg.op_bitmask = data->res.op_bitmask = 0;
spin_lock(&data->inode->i_lock);
lo = NFS_I(data->inode)->layout;
if (!lo || lo->roc_iomode == 0 ||
--
1.7.2.1
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/callback_xdr.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index 2e1a33b..3f4889f 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -641,8 +641,8 @@ preprocess_nfs41_op(int nop, unsigned int op_nr, struct callback_op **op)
*op = &callback_ops[op_nr];
break;
- case OP_CB_NOTIFY:
case OP_CB_NOTIFY_DEVICEID:
+ case OP_CB_NOTIFY:
case OP_CB_PUSH_DELEG:
case OP_CB_RECALLABLE_OBJ_AVAIL:
case OP_CB_WANTS_CANCELLED:
--
1.7.2.1
This reverts commit 536da3e86e791f72b2ef81831391d2f1e9dd38f7.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/pnfs.c | 106 +++++++++++++++++++++++++++++++--------------------------
fs/nfs/pnfs.h | 3 --
2 files changed, 58 insertions(+), 51 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 4746b20..36955e1 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1191,88 +1191,98 @@ pnfs_try_to_commit(struct nfs_write_data *data,
}
/*
- * Set up the arguments required for the RPC call.
+ * Set up the argument/result storage required for the RPC call.
*/
-void
+static int
pnfs_layoutcommit_setup(struct inode *inode,
- struct nfs4_layoutcommit_op_args *args, bool use_cred)
+ struct nfs4_layoutcommit_data *data,
+ loff_t write_begin_pos, loff_t write_end_pos)
{
- struct nfs_inode *nfsi = NFS_I(inode);
- loff_t write_begin_pos, write_end_pos;
+ struct nfs_server *nfss = NFS_SERVER(inode);
+ int result = 0;
dprintk("--> %s\n", __func__);
- assert_spin_locked(&inode->i_lock);
+ data->inode = inode;
+ data->args.fh = NFS_FH(inode);
+ data->args.op.layout_type = nfss->pnfs_curr_ld->id;
+ data->res.fattr = &data->fattr;
+ nfs_fattr_init(&data->fattr);
- /*
- * Clear layoutcommit properties in the inode so
- * new layoutcommit info can be generated
+ /* Set values from inode so it can be reset
*/
- write_begin_pos = nfsi->layout->write_begin_pos;
- write_end_pos = nfsi->layout->write_end_pos;
- nfsi->layout->write_begin_pos = 0;
- nfsi->layout->write_end_pos = 0;
- /* In the true case, caller has passed on the cred to another struct */
- if (use_cred == false)
- put_rpccred(nfsi->layout->cred);
- nfsi->layout->cred = NULL;
- __clear_bit(NFS_LAYOUT_NEED_LCOMMIT, &nfsi->layout->plh_flags);
- /* FIXME: figure out what to do here */
- memcpy(args->stateid.data, nfsi->layout->stateid.data,
- NFS4_STATEID_SIZE);
-
- args->layout_type = NFS_SERVER(inode)->pnfs_curr_ld->id;
-
- args->range.iomode = IOMODE_RW;
- args->range.offset = write_begin_pos;
- args->range.length = write_end_pos - write_begin_pos + 1;
- args->lastbytewritten = min(write_end_pos, i_size_read(inode) - 1);
+ data->args.op.range.iomode = IOMODE_RW;
+ data->args.op.range.offset = write_begin_pos;
+ data->args.op.range.length = write_end_pos - write_begin_pos + 1;
+ data->args.op.lastbytewritten = min(write_end_pos,
+ i_size_read(inode) - 1);
+ data->args.bitmask = nfss->attr_bitmask;
+ data->res.server = nfss;
+
+ dprintk("<-- %s Status %d\n", __func__, result);
+ return result;
}
-/*
- * Issue a async layoutcommit for an inode.
- * Returns 0 on success, negative value for error
+/* Issue a async layoutcommit for an inode.
*/
int
pnfs_layoutcommit_inode(struct inode *inode, int sync)
{
struct nfs4_layoutcommit_data *data;
- int status = -ENOMEM;
+ struct nfs_inode *nfsi = NFS_I(inode);
+ loff_t write_begin_pos;
+ loff_t write_end_pos;
+
+ int status = 0;
dprintk("%s Begin (sync:%d)\n", __func__, sync);
+ BUG_ON(!has_layout(nfsi));
+
data = kzalloc(sizeof(*data), GFP_NOFS);
if (!data)
- goto out;
+ return -ENOMEM;
- status = 0;
spin_lock(&inode->i_lock);
- if (!layoutcommit_needed(NFS_I(inode))) {
+ if (!layoutcommit_needed(nfsi)) {
spin_unlock(&inode->i_lock);
- kfree(data);
- goto out;
+ goto out_free;
}
- /* Use the layoutcommit cred */
- data->args.cred = NFS_I(inode)->layout->cred;
- /* Set up layoutcommit operation args */
- pnfs_layoutcommit_setup(inode, &data->args.op, true);
+ /* Clear layoutcommit properties in the inode so
+ * new lc info can be generated
+ */
+ write_begin_pos = nfsi->layout->write_begin_pos;
+ write_end_pos = nfsi->layout->write_end_pos;
+ data->args.cred = nfsi->layout->cred;
+ nfsi->layout->write_begin_pos = 0;
+ nfsi->layout->write_end_pos = 0;
+ nfsi->layout->cred = NULL;
+ __clear_bit(NFS_LAYOUT_NEED_LCOMMIT, &nfsi->layout->plh_flags);
+ memcpy(data->args.op.stateid.data, nfsi->layout->stateid.data,
+ NFS4_STATEID_SIZE);
/* Reference for layoutcommit matched in pnfs_layoutcommit_release */
get_layout_hdr(NFS_I(inode)->layout);
- spin_unlock(&inode->i_lock);
- data->args.fh = NFS_FH(inode);
- data->args.bitmask = NFS_SERVER(inode)->attr_bitmask;
+ spin_unlock(&inode->i_lock);
- data->inode = inode;
- data->res.server = NFS_SERVER(inode);
- data->res.fattr = &data->fattr;
- nfs_fattr_init(&data->fattr);
+ /* Set up layout commit args */
+ status = pnfs_layoutcommit_setup(inode, data, write_begin_pos,
+ write_end_pos);
+ if (status) {
+ /* The layout driver failed to setup the layoutcommit */
+ put_rpccred(data->args.cred);
+ put_layout_hdr(inode);
+ goto out_free;
+ }
status = nfs4_proc_layoutcommit(data, sync);
out:
dprintk("%s end (err:%d)\n", __func__, status);
return status;
+out_free:
+ kfree(data);
+ goto out;
}
/*
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 8ef47e9..7029926 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -231,9 +231,6 @@ void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
int notify_bit, atomic_t *notify_count,
struct list_head *tmp_list);
bool pnfs_roc(struct nfs4_closedata *data);
-void pnfs_layoutcommit_setup(struct inode *inode,
- struct nfs4_layoutcommit_op_args *args,
- bool use_cred);
static inline bool
has_layout(struct nfs_inode *nfsi)
--
1.7.2.1
The file layout driver could send the wrong data structure to the MDS
on a commit split between DS and MDS.
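
One way to read the fix, sketched in plain C: the original write data
should land in the slot that targets the MDS if there is one, otherwise
in the last slot, and every other slot gets a clone. The marker value
and the function name below are hypothetical (the marker stands in for
NFS4_PNFS_MAX_MULTI_CNT), and the sketch assumes at most one MDS entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker meaning "this entry goes to the MDS". */
#define MDS_INDEX_SKETCH 0xffffu

/*
 * Return the slot that should receive the original write data.
 * All other slots would receive clones of it.
 */
static size_t original_slot_sketch(const unsigned short *indices, size_t n)
{
	size_t i;

	assert(indices != NULL && n > 0);
	/* An MDS entry before the last slot takes the original. */
	for (i = 0; i + 1 < n; i++)
		if (indices[i] == MDS_INDEX_SKETCH)
			return i;
	/* Otherwise the last slot keeps the original, as before the fix. */
	return n - 1;
}
```

With indices {0, MDS, 2} the original goes to slot 1 and slots 0 and 2
are cloned; with no MDS entry the behavior matches the old code.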
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/nfs4filelayout.c | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 055816a..fb0efda 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -497,6 +497,7 @@ filelayout_commit(struct nfs_write_data *data, int sync)
struct list_head **ds_page_list = NULL;
u16 *indices_used;
int num_indices_seen = 0;
+ bool used_mds = false;
const struct rpc_call_ops *call_ops;
struct rpc_clnt *clnt;
struct nfs_write_data **clone_list = NULL;
@@ -546,11 +547,21 @@ filelayout_commit(struct nfs_write_data *data, int sync)
if (!clone_list)
goto mem_error;
for (i = 0; i < num_indices_seen - 1; i++) {
+ if (indices_used[i] == NFS4_PNFS_MAX_MULTI_CNT) {
+ used_mds = true;
+ clone_list[i] = data;
+ } else {
+ clone_list[i] = filelayout_clone_write_data(data);
+ if (!clone_list[i])
+ goto mem_error;
+ }
+ }
+ if (used_mds) {
clone_list[i] = filelayout_clone_write_data(data);
if (!clone_list[i])
goto mem_error;
- }
- clone_list[i] = data;
+ } else
+ clone_list[i] = data;
/*
* Now send off the RPCs to each ds. Note that it is important
* that any RPC to the MDS be sent last (or at least after all
--
1.7.2.1
Trond points out that, given the restriction that bulk recalls must be
serialized, and the fact that the DELAY response we send does not
obligate us to any of the restrictions that an OK response would,
we don't really need another per-client list and the locking complications
it incurs.
This patch:
- removes cl_layoutrecalls
- removes struct pnfs_cb_lrecall_info, used as entries in cl_layoutrecalls
- removes _recall_matches_lget, instead relying on bit tests
- changes the notification code; it is now used only to make the
NOMATCH/DELAY decision as late as possible
- adds a trigger_flush function
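
The serialization and late-decision idea can be sketched in userspace
C11 as below. This is only an illustration under assumptions: a single
atomic flag stands in for the NFS4CLNT_LAYOUTRECALL bit, a single
counter stands in for cl_drain_notify, and the segments_in_use
parameter is a hypothetical stand-in for however many references the
forget path still holds. All *_sketch names are made up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum recall_result_sketch { RECALL_DELAY_SKETCH, RECALL_NOMATCH_SKETCH };

struct client_sketch {
	atomic_flag recall_in_progress; /* stands in for NFS4CLNT_LAYOUTRECALL */
	atomic_int  drain_notify;       /* stands in for cl_drain_notify */
};

static void client_init_sketch(struct client_sketch *clp)
{
	atomic_flag_clear(&clp->recall_in_progress);
	atomic_init(&clp->drain_notify, 0);
}

/* Called as in-use segments are released. */
static void notify_drained_sketch(struct client_sketch *clp, int count)
{
	atomic_fetch_sub(&clp->drain_notify, count);
}

static enum recall_result_sketch
handle_recall_sketch(struct client_sketch *clp, int segments_in_use)
{
	enum recall_result_sketch res;

	assert(segments_in_use >= 0);
	/* Only one recall is processed at a time; a second one gets DELAY. */
	if (atomic_flag_test_and_set(&clp->recall_in_progress))
		return RECALL_DELAY_SKETCH;

	/* Hold one reference of our own while draining is initiated. */
	atomic_fetch_add(&clp->drain_notify, 1);
	atomic_fetch_add(&clp->drain_notify, segments_in_use);

	/*
	 * Decide as late as possible: atomic_fetch_sub() returns the old
	 * value, so 1 means nothing else is still holding the layout.
	 */
	if (atomic_fetch_sub(&clp->drain_notify, 1) == 1)
		res = RECALL_NOMATCH_SKETCH;
	else
		res = RECALL_DELAY_SKETCH;

	atomic_flag_clear(&clp->recall_in_progress);
	return res;
}
```

Since DELAY carries none of the obligations of an OK reply, no
per-recall bookkeeping has to outlive the callback, which is why the
per-client list can go away.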
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/callback.h | 9 ++-
fs/nfs/callback_proc.c | 167 +++++++-------------------------------------
fs/nfs/client.c | 1 -
fs/nfs/nfs4_fs.h | 1 +
fs/nfs/nfs4proc.c | 9 +--
fs/nfs/pnfs.c | 42 +++++-------
fs/nfs/pnfs.h | 12 +---
include/linux/nfs_fs_sb.h | 5 +-
8 files changed, 55 insertions(+), 191 deletions(-)
diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index 7f55c7e..19be056 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -154,6 +154,7 @@ struct cb_layoutrecallargs {
union {
struct {
struct nfs_fh cbl_fh;
+ struct inode *cbl_inode;
struct pnfs_layout_range cbl_range;
nfs4_stateid cbl_stateid;
};
@@ -164,9 +165,11 @@ struct cb_layoutrecallargs {
extern unsigned nfs4_callback_layoutrecall(
struct cb_layoutrecallargs *args,
void *dummy, struct cb_process_state *cps);
-extern bool matches_outstanding_recall(struct inode *ino,
- struct pnfs_layout_range *range);
-extern void notify_drained(struct nfs_client *clp, u64 mask);
+
+static inline void notify_drained(struct nfs_client *clp, int count)
+{
+ atomic_sub(count, &clp->cl_drain_notify);
+}
static inline void put_session_client(struct nfs4_session *session)
{
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 97e1c96..cbde28e 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -123,82 +123,21 @@ int nfs4_validate_delegation_stateid(struct nfs_delegation *delegation, const nf
#if defined(CONFIG_NFS_V4_1)
-static bool
-_recall_matches_lget(struct pnfs_cb_lrecall_info *cb_info,
- struct inode *ino, struct pnfs_layout_range *range)
-{
- struct cb_layoutrecallargs *cb_args = &cb_info->pcl_args;
-
- switch (cb_args->cbl_recall_type) {
- case RETURN_ALL:
- return true;
- case RETURN_FSID:
- return !memcmp(&NFS_SERVER(ino)->fsid, &cb_args->cbl_fsid,
- sizeof(struct nfs_fsid));
- case RETURN_FILE:
- return (ino == cb_info->pcl_ino) &&
- should_free_lseg(range, &cb_args->cbl_range);
- default:
- /* Should never hit here, as decode_layoutrecall_args()
- * will verify cb_info from server.
- */
- BUG();
- }
-}
-
-bool
-matches_outstanding_recall(struct inode *ino, struct pnfs_layout_range *range)
+static void trigger_flush(struct inode *ino)
{
- struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;
- struct pnfs_cb_lrecall_info *cb_info;
- bool rv = false;
-
- assert_spin_locked(&clp->cl_lock);
- list_for_each_entry(cb_info, &clp->cl_layoutrecalls, pcl_list) {
- if (_recall_matches_lget(cb_info, ino, range)) {
- rv = true;
- break;
- }
- }
- return rv;
+ write_inode_now(ino, 0);
}
-void notify_drained(struct nfs_client *clp, u64 mask)
-{
- atomic_t **ptr = clp->cl_drain_notification;
-
- /* clp lock not needed except to remove used up entries */
- /* Should probably use functions defined in bitmap.h */
- while (mask) {
- if ((mask & 1) && (atomic_dec_and_test(*ptr))) {
- struct pnfs_cb_lrecall_info *cb_info;
-
- cb_info = container_of(*ptr,
- struct pnfs_cb_lrecall_info,
- pcl_count);
- spin_lock(&clp->cl_lock);
- /* Removing from the list unblocks LAYOUTGETs */
- list_del(&cb_info->pcl_list);
- clp->cl_cb_lrecall_count--;
- clp->cl_drain_notification[1 << cb_info->pcl_notify_bit] = NULL;
- spin_unlock(&clp->cl_lock);
- kfree(cb_info);
- }
- mask >>= 1;
- ptr++;
- }
-}
-
-static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
+static int initiate_layout_draining(struct nfs_client *clp,
+ struct cb_layoutrecallargs *args)
{
- struct nfs_client *clp = cb_info->pcl_clp;
struct pnfs_layout_hdr *lo;
int rv = NFS4ERR_NOMATCHING_LAYOUT;
- struct cb_layoutrecallargs *args = &cb_info->pcl_args;
if (args->cbl_recall_type == RETURN_FILE) {
LIST_HEAD(free_me_list);
+ args->cbl_inode = NULL;
spin_lock(&clp->cl_lock);
list_for_each_entry(lo, &clp->cl_layouts, layouts) {
if (nfs_compare_fh(&args->cbl_fh,
@@ -207,16 +146,12 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags))
rv = NFS4ERR_DELAY;
else {
- /* FIXME I need to better understand igrab and
- * does having a layout ref keep ino around?
- * It should.
- */
/* Without this, layout can be freed as soon
* as we release cl_lock. Matched in
* do_callback_layoutrecall.
*/
get_layout_hdr(lo);
- cb_info->pcl_ino = lo->inode;
+ args->cbl_inode = lo->inode;
rv = NFS4_OK;
}
break;
@@ -227,12 +162,12 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
if (rv == NFS4_OK) {
lo->plh_block_lgets++;
nfs4_asynch_forget_layouts(lo, &args->cbl_range,
- cb_info->pcl_notify_bit,
- &cb_info->pcl_count,
&free_me_list);
}
pnfs_set_layout_stateid(lo, &args->cbl_stateid, true);
spin_unlock(&lo->inode->i_lock);
+ if (rv == NFS4_OK)
+ trigger_flush(lo->inode);
pnfs_free_lseg_list(&free_me_list);
} else {
struct pnfs_layout_hdr *tmp;
@@ -245,18 +180,12 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
};
spin_lock(&clp->cl_lock);
- /* Per RFC 5661, 12.5.5.2.1.5, bulk recall must be serialized */
- if (!list_is_singular(&clp->cl_layoutrecalls)) {
- spin_unlock(&clp->cl_lock);
- return NFS4ERR_DELAY;
- }
list_for_each_entry(lo, &clp->cl_layouts, layouts) {
if ((args->cbl_recall_type == RETURN_FSID) &&
memcmp(&NFS_SERVER(lo->inode)->fsid,
&args->cbl_fsid, sizeof(struct nfs_fsid)))
continue;
get_layout_hdr(lo);
- /* We could list_del(&lo->layouts) here */
BUG_ON(!list_empty(&lo->plh_bulk_recall));
list_add(&lo->plh_bulk_recall, &recall_list);
}
@@ -265,12 +194,10 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
&recall_list, plh_bulk_recall) {
spin_lock(&lo->inode->i_lock);
set_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
- nfs4_asynch_forget_layouts(lo, &range,
- cb_info->pcl_notify_bit,
- &cb_info->pcl_count,
- &free_me_list);
+ nfs4_asynch_forget_layouts(lo, &range, &free_me_list);
list_del_init(&lo->plh_bulk_recall);
spin_unlock(&lo->inode->i_lock);
+ trigger_flush(lo->inode);
put_layout_hdr(lo->inode);
rv = NFS4_OK;
}
@@ -282,69 +209,29 @@ static int initiate_layout_draining(struct pnfs_cb_lrecall_info *cb_info)
static u32 do_callback_layoutrecall(struct nfs_client *clp,
struct cb_layoutrecallargs *args)
{
- struct pnfs_cb_lrecall_info *new;
- atomic_t **ptr;
- int bit_num;
- u32 res;
+ u32 status, res = NFS4ERR_DELAY;
dprintk("%s enter, type=%i\n", __func__, args->cbl_recall_type);
- new = kmalloc(sizeof(*new), GFP_KERNEL);
- if (!new) {
- res = NFS4ERR_DELAY;
+ if (test_and_set_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state))
goto out;
- }
- memcpy(&new->pcl_args, args, sizeof(*args));
- atomic_set(&new->pcl_count, 1);
- new->pcl_clp = clp;
- new->pcl_ino = NULL;
- spin_lock(&clp->cl_lock);
- if (clp->cl_cb_lrecall_count >= PNFS_MAX_CB_LRECALLS) {
- kfree(new);
+ atomic_inc(&clp->cl_drain_notify);
+ status = initiate_layout_draining(clp, args);
+ if (atomic_dec_and_test(&clp->cl_drain_notify))
+ res = NFS4ERR_NOMATCHING_LAYOUT;
+ else
res = NFS4ERR_DELAY;
- spin_unlock(&clp->cl_lock);
- goto out;
- }
- clp->cl_cb_lrecall_count++;
- /* Adding to the list will block conflicting LGET activity */
- list_add_tail(&new->pcl_list, &clp->cl_layoutrecalls);
- for (bit_num = 0, ptr = clp->cl_drain_notification; *ptr; ptr++)
- bit_num++;
- *ptr = &new->pcl_count;
- new->pcl_notify_bit = bit_num;
- spin_unlock(&clp->cl_lock);
- res = initiate_layout_draining(new);
- if (res || atomic_dec_and_test(&new->pcl_count)) {
- spin_lock(&clp->cl_lock);
- list_del(&new->pcl_list);
- clp->cl_cb_lrecall_count--;
- clp->cl_drain_notification[1 << bit_num] = NULL;
- spin_unlock(&clp->cl_lock);
- if (res == NFS4_OK) {
- if (args->cbl_recall_type == RETURN_FILE) {
- struct pnfs_layout_hdr *lo;
-
- lo = NFS_I(new->pcl_ino)->layout;
- spin_lock(&lo->inode->i_lock);
- lo->plh_block_lgets--;
- spin_unlock(&lo->inode->i_lock);
- put_layout_hdr(new->pcl_ino);
- }
- res = NFS4ERR_NOMATCHING_LAYOUT;
- }
- kfree(new);
- } else {
- /* We are currently using a referenced layout */
- if (args->cbl_recall_type == RETURN_FILE) {
- struct pnfs_layout_hdr *lo;
+ if (status)
+ res = status;
+ else if (args->cbl_recall_type == RETURN_FILE) {
+ struct pnfs_layout_hdr *lo;
- lo = NFS_I(new->pcl_ino)->layout;
- spin_lock(&lo->inode->i_lock);
- lo->plh_block_lgets--;
- spin_unlock(&lo->inode->i_lock);
- put_layout_hdr(new->pcl_ino);
- }
- res = NFS4ERR_DELAY;
+ lo = NFS_I(args->cbl_inode)->layout;
+ spin_lock(&lo->inode->i_lock);
+ lo->plh_block_lgets--;
+ spin_unlock(&lo->inode->i_lock);
+ put_layout_hdr(args->cbl_inode);
}
+ clear_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state);
out:
dprintk("%s returning %i\n", __func__, res);
return res;
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index f8e712f..9042a7a 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -158,7 +158,6 @@ static struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_
clp->cl_machine_cred = cred;
#if defined(CONFIG_NFS_V4_1)
INIT_LIST_HEAD(&clp->cl_layouts);
- INIT_LIST_HEAD(&clp->cl_layoutrecalls);
#endif
nfs_fscache_get_client_cookie(clp);
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 15fea61..a917872 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -44,6 +44,7 @@ enum nfs4_client_state {
NFS4CLNT_RECLAIM_REBOOT,
NFS4CLNT_RECLAIM_NOGRACE,
NFS4CLNT_DELEGRETURN,
+ NFS4CLNT_LAYOUTRECALL,
NFS4CLNT_SESSION_RESET,
NFS4CLNT_RECALL_SLOT,
};
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index b161393..adcab30 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5378,14 +5378,8 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
dprintk("--> %s\n", __func__);
- if (!nfs4_sequence_done(task, &lgp->res.seq_res)) {
- /* layout code relies on fact that in this case
- * code falls back to tk_action=call_start, but not
- * back to rpc_prepare_task, to keep plh_outstanding
- * correct.
- */
+ if (!nfs4_sequence_done(task, &lgp->res.seq_res))
return;
- }
switch (task->tk_status) {
case 0:
break;
@@ -5408,7 +5402,6 @@ static void nfs4_layoutget_release(void *calldata)
struct nfs4_layoutget *lgp = calldata;
dprintk("--> %s\n", __func__);
- put_layout_hdr(lgp->args.inode);
if (lgp->res.layout.buf != NULL)
free_page((unsigned long) lgp->res.layout.buf);
put_nfs_open_context(lgp->args.ctx);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index abb3eb0..f9757ff 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -278,7 +278,7 @@ init_lseg(struct pnfs_layout_hdr *lo, struct pnfs_layout_segment *lseg)
smp_mb();
lseg->valid = true;
lseg->layout = lo;
- lseg->pls_notify_mask = 0;
+ lseg->pls_notify_count = 0;
}
static void
@@ -328,12 +328,12 @@ put_lseg(struct pnfs_layout_segment *lseg)
atomic_read(&lseg->pls_refcount), lseg->valid);
ino = lseg->layout->inode;
if (atomic_dec_and_lock(&lseg->pls_refcount, &ino->i_lock)) {
- u64 mask = lseg->pls_notify_mask;
+ int count = lseg->pls_notify_count;
_put_lseg_common(lseg);
spin_unlock(&ino->i_lock);
NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg);
- notify_drained(NFS_SERVER(ino)->nfs_client, mask);
+ notify_drained(NFS_SERVER(ino)->nfs_client, count);
/* Matched by get_layout_hdr_locked in pnfs_insert_layout */
put_layout_hdr(ino);
}
@@ -403,14 +403,14 @@ pnfs_free_lseg_list(struct list_head *free_me)
{
struct pnfs_layout_segment *lseg, *tmp;
struct inode *ino;
- u64 mask;
+ int count;
list_for_each_entry_safe(lseg, tmp, free_me, fi_list) {
BUG_ON(atomic_read(&lseg->pls_refcount) != 0);
ino = lseg->layout->inode;
- mask = lseg->pls_notify_mask;
+ count = lseg->pls_notify_count;
NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg);
- notify_drained(NFS_SERVER(ino)->nfs_client, mask);
+ notify_drained(NFS_SERVER(ino)->nfs_client, count);
/* Matched by get_layout_hdr_locked in pnfs_insert_layout */
put_layout_hdr(ino);
}
@@ -556,10 +556,8 @@ send_layoutget(struct pnfs_layout_hdr *lo,
BUG_ON(ctx == NULL);
lgp = kzalloc(sizeof(*lgp), GFP_KERNEL);
- if (lgp == NULL) {
- put_layout_hdr(ino);
+ if (lgp == NULL)
return NULL;
- }
lgp->args.minlength = NFS4_MAX_UINT64;
lgp->args.maxcount = PNFS_LAYOUT_MAXSIZE;
lgp->args.range.iomode = range->iomode;
@@ -583,7 +581,6 @@ send_layoutget(struct pnfs_layout_hdr *lo,
void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
struct pnfs_layout_range *range,
- int notify_bit, atomic_t *notify_count,
struct list_head *tmp_list)
{
struct pnfs_layout_segment *lseg, *tmp;
@@ -591,8 +588,8 @@ void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
assert_spin_locked(&lo->inode->i_lock);
list_for_each_entry_safe(lseg, tmp, &lo->segs, fi_list)
if (should_free_lseg(&lseg->range, range)) {
- lseg->pls_notify_mask |= (1 << notify_bit);
- atomic_inc(notify_count);
+ lseg->pls_notify_count++;
+ atomic_inc(&NFS_SERVER(lo->inode)->nfs_client->cl_drain_notify);
mark_lseg_invalid(lseg, tmp_list);
}
}
@@ -847,13 +844,6 @@ pnfs_update_layout(struct inode *ino,
if (!pnfs_enabled_sb(NFS_SERVER(ino)))
return NULL;
- spin_lock(&clp->cl_lock);
- if (matches_outstanding_recall(ino, &arg)) {
- dprintk("%s matches recall, use MDS\n", __func__);
- spin_unlock(&clp->cl_lock);
- return NULL;
- }
- spin_unlock(&clp->cl_lock);
spin_lock(&ino->i_lock);
lo = pnfs_find_alloc_layout(ino);
if (lo == NULL) {
@@ -861,6 +851,12 @@ pnfs_update_layout(struct inode *ino,
goto out_unlock;
}
+ /* Do we even need to bother with this? */
+ if (test_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state) ||
+ test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
+ dprintk("%s matches recall, use MDS\n", __func__);
+ goto out_unlock;
+ }
/* Check to see if the layout for the given range already exists */
lseg = pnfs_find_lseg(lo, &arg);
if (lseg)
@@ -897,6 +893,7 @@ pnfs_update_layout(struct inode *ino,
}
}
lo->plh_outstanding--;
+ put_layout_hdr(ino);
spin_unlock(&ino->i_lock);
out:
dprintk("%s end, state 0x%lx lseg %p\n", __func__,
@@ -941,14 +938,11 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
}
spin_lock(&ino->i_lock);
- /* decrement needs to be done before call to pnfs_layoutget_blocked */
- spin_lock(&clp->cl_lock);
- if (matches_outstanding_recall(ino, &res->range)) {
- spin_unlock(&clp->cl_lock);
+ if (test_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state) ||
+ test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
dprintk("%s forget reply due to recall\n", __func__);
goto out_forget_reply;
}
- spin_unlock(&clp->cl_lock);
if (pnfs_layoutgets_blocked(lo, &res->stateid, 1)) {
dprintk("%s forget reply due to state\n", __func__);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 8d2ab18..1ccc35d 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -39,7 +39,7 @@ struct pnfs_layout_segment {
atomic_t pls_refcount;
bool valid;
struct pnfs_layout_hdr *layout;
- u64 pls_notify_mask;
+ int pls_notify_count;
};
enum pnfs_try_status {
@@ -123,15 +123,6 @@ struct pnfs_device {
unsigned int pglen;
};
-struct pnfs_cb_lrecall_info {
- struct list_head pcl_list; /* hook into cl_layoutrecalls list */
- atomic_t pcl_count;
- int pcl_notify_bit;
- struct nfs_client *pcl_clp;
- struct inode *pcl_ino;
- struct cb_layoutrecallargs pcl_args;
-};
-
/*
* Device ID RCU cache. A device ID is unique per client ID and layout type.
*/
@@ -227,7 +218,6 @@ int pnfs_choose_layoutget_stateid(nfs4_stateid *dst,
struct nfs4_state *open_state);
void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
struct pnfs_layout_range *range,
- int notify_bit, atomic_t *notify_count,
struct list_head *tmp_list);
static inline bool
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 956a103..f6f0d87 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -83,10 +83,7 @@ struct nfs_client {
u32 cl_exchange_flags;
struct nfs4_session *cl_session; /* sharred session */
struct list_head cl_layouts;
- struct list_head cl_layoutrecalls;
- unsigned long cl_cb_lrecall_count;
-#define PNFS_MAX_CB_LRECALLS (64)
- atomic_t *cl_drain_notification[PNFS_MAX_CB_LRECALLS];
+ atomic_t cl_drain_notify;
struct pnfs_deviceid_cache *cl_devid_cache; /* pNFS deviceid cache */
#endif /* CONFIG_NFS_V4_1 */
--
1.7.2.1
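The recall accounting in the patch above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `client_model`, `handle_recall` and the `matched`/`drained` parameters are hypothetical names invented for illustration, C11 atomics stand in for the kernel's `atomic_t` and bit operations, and segments are reduced to plain counts. It shows the single-recall gate and the biased counter that decides between NFS4ERR_NOMATCHING_LAYOUT and NFS4ERR_DELAY.

```c
#include <assert.h>     /* only needed for checks written against this model */
#include <stdatomic.h>

enum recall_result { RECALL_NOMATCHING_LAYOUT, RECALL_DELAY };

/* Hypothetical stand-in for the two pieces of nfs_client state the patch uses. */
struct client_model {
    atomic_flag recall_in_progress; /* models the NFS4CLNT_LAYOUTRECALL bit */
    atomic_int drain_notify;        /* models cl_drain_notify */
};

static void client_model_init(struct client_model *c)
{
    atomic_flag_clear(&c->recall_in_progress);
    atomic_init(&c->drain_notify, 0);
}

/*
 * matched: number of segments the recall marks invalid.
 * drained: how many of those are already released before the reply is built.
 */
static enum recall_result handle_recall(struct client_model *c,
                                        int matched, int drained)
{
    enum recall_result res;
    int i;

    /* Only one recall is processed at a time; a second one is told to retry. */
    if (atomic_flag_test_and_set(&c->recall_in_progress))
        return RECALL_DELAY;

    /* Bias of one, so the counter cannot reach zero while we are still marking. */
    atomic_fetch_add(&c->drain_notify, 1);

    /* One count per segment marked invalid. */
    for (i = 0; i < matched; i++)
        atomic_fetch_add(&c->drain_notify, 1);

    /* Segments with no I/O in flight are released right away. */
    for (i = 0; i < drained && i < matched; i++)
        atomic_fetch_sub(&c->drain_notify, 1);

    /* Drop the bias; a previous value of 1 means everything has drained. */
    if (atomic_fetch_sub(&c->drain_notify, 1) == 1)
        res = RECALL_NOMATCHING_LAYOUT;
    else
        res = RECALL_DELAY;

    atomic_flag_clear(&c->recall_in_progress);
    return res;
}
```

In this model, a recall whose segments are all idle answers "no matching layout" immediately, while one with I/O still in flight answers "delay" and leaves the remaining count to be dropped as segments are released.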
This makes way for adding roc bit to lseg
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/pnfs.c | 18 ++++++++++--------
fs/nfs/pnfs.h | 6 +++++-
2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 27a1973..69f5e7b 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -276,7 +276,7 @@ init_lseg(struct pnfs_layout_hdr *lo, struct pnfs_layout_segment *lseg)
INIT_LIST_HEAD(&lseg->fi_list);
atomic_set(&lseg->pls_refcount, 1);
smp_mb();
- lseg->valid = true;
+ set_bit(NFS_LSEG_VALID, &lseg->pls_flags);
lseg->layout = lo;
lseg->pls_notify_count = 0;
}
@@ -286,7 +286,7 @@ _put_lseg_common(struct pnfs_layout_segment *lseg)
{
struct inode *ino = lseg->layout->inode;
- BUG_ON(lseg->valid == true);
+ BUG_ON(test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
list_del(&lseg->fi_list);
if (list_empty(&lseg->layout->segs)) {
struct nfs_client *clp;
@@ -309,7 +309,8 @@ put_lseg_locked(struct pnfs_layout_segment *lseg,
struct list_head *tmp_list)
{
dprintk("%s: lseg %p ref %d valid %d\n", __func__, lseg,
- atomic_read(&lseg->pls_refcount), lseg->valid);
+ atomic_read(&lseg->pls_refcount),
+ test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
if (atomic_dec_and_test(&lseg->pls_refcount)) {
_put_lseg_common(lseg);
list_add(&lseg->fi_list, tmp_list);
@@ -325,7 +326,8 @@ put_lseg(struct pnfs_layout_segment *lseg)
return;
dprintk("%s: lseg %p ref %d valid %d\n", __func__, lseg,
- atomic_read(&lseg->pls_refcount), lseg->valid);
+ atomic_read(&lseg->pls_refcount),
+ test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
ino = lseg->layout->inode;
if (atomic_dec_and_lock(&lseg->pls_refcount, &ino->i_lock)) {
int count = lseg->pls_notify_count;
@@ -363,8 +365,7 @@ static void mark_lseg_invalid(struct pnfs_layout_segment *lseg,
struct list_head *tmp_list)
{
assert_spin_locked(&lseg->layout->inode->i_lock);
- if (lseg->valid) {
- lseg->valid = false;
+ if (test_and_clear_bit(NFS_LSEG_VALID, &lseg->pls_flags)) {
/* Remove the reference keeping the lseg in the
* list. It will now be removed when all
* outstanding io is finished.
@@ -809,7 +810,8 @@ pnfs_find_lseg(struct pnfs_layout_hdr *lo,
assert_spin_locked(&lo->inode->i_lock);
list_for_each_entry(lseg, &lo->segs, fi_list) {
- if (lseg->valid && is_matching_lseg(lseg, range)) {
+ if (test_bit(NFS_LSEG_VALID, &lseg->pls_flags) &&
+ is_matching_lseg(lseg, range)) {
get_lseg(lseg);
ret = lseg;
break;
@@ -820,7 +822,7 @@ pnfs_find_lseg(struct pnfs_layout_hdr *lo,
dprintk("%s:Return lseg %p ref %d valid %d\n",
__func__, ret, ret ? atomic_read(&ret->pls_refcount) : 0,
- ret ? ret->valid : 0);
+ ret ? test_bit(NFS_LSEG_VALID, &ret->pls_flags) : 0);
return ret;
}
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index b5a30b8..82b9a7e 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -33,11 +33,15 @@
#include <linux/nfs_page.h>
#include "callback.h" /* for cb_layoutrecallargs */
+enum {
+ NFS_LSEG_VALID = 0, /* cleared when lseg is recalled/returned */
+};
+
struct pnfs_layout_segment {
struct list_head fi_list;
struct pnfs_layout_range range;
atomic_t pls_refcount;
- bool valid;
+ unsigned long pls_flags;
struct pnfs_layout_hdr *layout;
int pls_notify_count;
};
--
1.7.2.1
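As a rough illustration of why the patch above folds "check valid, then clear" into one `test_and_clear_bit()` call, here is a userspace sketch. The names `lseg_model`, `flag_set`, `flag_test_and_clear` and `mark_invalid` are hypothetical, and C11 atomics are used as an assumed stand-in for the kernel bit operations. The point it shows is that the list's reference is dropped exactly once, however many times the segment is marked invalid.

```c
#include <assert.h>     /* only needed for checks written against this model */
#include <stdatomic.h>
#include <stdbool.h>

enum { LSEG_VALID = 0 };    /* bit number, mirroring NFS_LSEG_VALID */

/* Hypothetical stand-in for the relevant fields of pnfs_layout_segment. */
struct lseg_model {
    atomic_ulong flags;     /* models pls_flags */
    int refcount;           /* models pls_refcount, simplified to a plain int */
};

static void lseg_model_init(struct lseg_model *s)
{
    atomic_init(&s->flags, 0UL);
    s->refcount = 1;        /* the reference that keeps the segment listed */
}

static void flag_set(struct lseg_model *s, int bit)
{
    atomic_fetch_or(&s->flags, 1UL << bit);
}

/* Clears the bit and reports whether it was set beforehand, in one atomic step. */
static bool flag_test_and_clear(struct lseg_model *s, int bit)
{
    unsigned long old = atomic_fetch_and(&s->flags, ~(1UL << bit));

    return (old >> bit) & 1UL;
}

/* Returns true only for the call that actually invalidated the segment. */
static bool mark_invalid(struct lseg_model *s)
{
    if (!flag_test_and_clear(s, LSEG_VALID))
        return false;       /* already invalid: do not drop the reference again */
    s->refcount--;          /* drop the list's reference exactly once */
    return true;
}
```

Using a flags word instead of a `bool` also leaves room for further per-segment bits, which is the stated purpose of the patch.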
On 2010-12-15 17:43, Fred Isaman wrote:
>
> On Dec 15, 2010, at 10:29 AM, Benny Halevy wrote:
>
>> On 2010-12-15 16:11, Fred Isaman wrote:
>>>
>>> On Dec 15, 2010, at 8:57 AM, Benny Halevy wrote:
>>>
>>>> On 2010-12-10 03:22, Fred Isaman wrote:
>>>>> It was checking that at least one known bit was set. It needs to check
>>>>> no unknown bit was set. From RFC5661, section 20.6.3:
>>>>>
>>>>> When a bit is set in the type mask that corresponds to an undefined
>>>>> type of recallable object, NFS4ERR_INVAL MUST be returned.
>>>>>
>>>>> Signed-off-by: Fred Isaman <[email protected]>
>>>>> ---
>>>>> fs/nfs/callback.h | 1 +
>>>>> fs/nfs/callback_proc.c | 27 ++++-----------------------
>>>>> 2 files changed, 5 insertions(+), 23 deletions(-)
>>>>>
>>>>> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
>>>>> index b16dd1f..616c5c1 100644
>>>>> --- a/fs/nfs/callback.h
>>>>> +++ b/fs/nfs/callback.h
>>>>> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
>>>>> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
>>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
>>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
>>>>> +#define RCA4_TYPE_MASK_ALL 0xf31f
>>>>>
>>>>> struct cb_recallanyargs {
>>>>> struct sockaddr *craa_addr;
>>>>> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
>>>>> index 61b3c66..d4aec46 100644
>>>>> --- a/fs/nfs/callback_proc.c
>>>>> +++ b/fs/nfs/callback_proc.c
>>>>> @@ -661,28 +661,10 @@ out_putclient:
>>>>> goto out;
>>>>> }
>>>>>
>>>>> -static inline bool
>>>>> -validate_bitmap_values(const unsigned long *mask)
>>>>> +static bool
>>>>> +validate_bitmap_values(unsigned long mask)
>>>>> {
>>>>> - int i;
>>>>> -
>>>>> - if (*mask == 0)
>>>>> - return true;
>>>>> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
>>>>> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
>>>>> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
>>>>> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
>>>>> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
>>>>> - return true;
>>>>> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
>>>>> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
>>>>> - if (test_bit(i, mask))
>>>>> - return true;
>>>>> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
>>>>> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
>>>>> - if (test_bit(i, mask))
>>>>> - return true;
>>>>> - return false;
>>>>> + return mask & ~RCA4_TYPE_MASK_ALL;
>>>>
>>>> Hmm, shouldn't that be
>>>> return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
>>>>
>>>> Benny
>>>>
>>>
>>> Yes, you are right.
>>
>> OK. This is fixed in my branch to be released asap.
>
> Thanks. I have a bunch more minor code cleanups that I'll send once I see what you have, unless you want them immediately. I'll also have a rebase of the pnfs-submit branch with the wave2 patches pushed down to the bottom shortly after I see your branch.
>
I prefer that you send them now.
>> I've reverted large parts of this patchset in the post-submit stream
>> to restore layoutcommit and layoutreturn, but not their embedding in
>> the CLOSE compound. I also kept the cleanups and bug fixes.
>> I'll send out the post-submit when it's ready.
>> Some more work will be required to restore the original patches
>> author and signoffees.
>>
>> This is the list as of now:
>>
>> pick af44531 Revert "pnfs-submit: wave2: remove forgotten layoutreturn struct definitions"
>> pick c465549 Revert "pnfs-submit: Turn off layoutcommits"
>> pick 0f4ba67 Revert "pnfs-submit: wave2: remove all LAYOUTRETURN code"
>> pick 486db47 Revert "pnfs-submit: wave2: Remove LAYOUTRETURN from return on close"
>>
>> pick 484c935 FIXME: roc should return layout on last close
>> (This patch just adds a FIXME comment.)
>>
>
> The above looks good.
>
>> pick 8698772 Revert "pnfs-submit: wave2: remove cl_layoutrecalls list"
>> pick 263879b Revert "pnfs-submit: wave2: Pull out all recall initiated LAYOUTRETURNS"
>> pick 693765f Revert "pnfs-submit: wave2: Don't wait in layoutget"
>
> Just note that Trond was highly resistant to the above.
>
I understand that in the forgetful client model but eventually returning the layout
on close will be more efficient overall vs. having the server poll the client.
>> pick de56e11 Revert "pnfs-submit: wave2: check that partial LAYOUTGET return is ignored"
>>
>
> We need some sort of check that we got what we asked for, given that the xdr code can chop the server's reply.
The post-submit world is supposed to handle partial layouts correctly, as this is required for
the obj and block layouts so we can't just blindly toss away partial layouts.
If the files layout driver won't support partial layouts we can restrict this (sigh, ugly)
only for it.
Benny
>
> Fred
>
>> Anything else you had in mind?
>>
>> Benny
>>
>>>
>>> Fred
>>>
>>>>> }
>>>>>
>>>>> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>>> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>>> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>>>>>
>>>>> status = cpu_to_be32(NFS4ERR_INVAL);
>>>>> - if (!validate_bitmap_values((const unsigned long *)
>>>>> - &args->craa_type_mask))
>>>>> + if (!validate_bitmap_values(args->craa_type_mask))
>>>>> goto out;
>>>>>
>>>>> status = cpu_to_be32(NFS4_OK);
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
It was checking that at least one known bit was set. It needs to check
no unknown bit was set. From RFC5661, section 20.6.3:
When a bit is set in the type mask that corresponds to an undefined
type of recallable object, NFS4ERR_INVAL MUST be returned.
Signed-off-by: Fred Isaman <[email protected]>
---
fs/nfs/callback.h | 1 +
fs/nfs/callback_proc.c | 27 ++++-----------------------
2 files changed, 5 insertions(+), 23 deletions(-)
diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index b16dd1f..616c5c1 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
#define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
#define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
#define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
+#define RCA4_TYPE_MASK_ALL 0xf31f
struct cb_recallanyargs {
struct sockaddr *craa_addr;
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 61b3c66..d4aec46 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -661,28 +661,10 @@ out_putclient:
goto out;
}
-static inline bool
-validate_bitmap_values(const unsigned long *mask)
+static bool
+validate_bitmap_values(unsigned long mask)
{
- int i;
-
- if (*mask == 0)
- return true;
- if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
- test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
- test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
- test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
- test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
- return true;
- for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
- i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
- if (test_bit(i, mask))
- return true;
- for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
- i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
- if (test_bit(i, mask))
- return true;
- return false;
+ return mask & ~RCA4_TYPE_MASK_ALL;
}
__be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
@@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
status = cpu_to_be32(NFS4ERR_INVAL);
- if (!validate_bitmap_values((const unsigned long *)
- &args->craa_type_mask))
+ if (!validate_bitmap_values(args->craa_type_mask))
goto out;
status = cpu_to_be32(NFS4_OK);
--
1.7.2.1
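For reference, a minimal userspace sketch of the check this patch is aiming for, using the `== 0` form suggested in review elsewhere in this thread. The `BIT_RANGE` helper and the `KNOWN_TYPE_MASK` name are illustrative assumptions, not part of the patch. Bit positions 0-4 and 8 are taken from RFC 5661 and do not appear in the hunk quoted above, which only shows 9, 12 and 15.

```c
#include <assert.h>     /* only needed for checks written against this sketch */
#include <stdbool.h>

/* Hypothetical helper: mask with bits lo..hi (inclusive) set. */
#define BIT_RANGE(lo, hi) (((1UL << ((hi) - (lo) + 1)) - 1) << (lo))

/*
 * Known recallable-object types:
 *   bits 0-4   delegations, file layout, block layout
 *   bits 8-9   object layouts
 *   bits 12-15 other layouts
 * Together these give 0xf31f, the RCA4_TYPE_MASK_ALL value in the patch.
 */
#define KNOWN_TYPE_MASK (BIT_RANGE(0, 4) | BIT_RANGE(8, 9) | BIT_RANGE(12, 15))

/* True when no bit outside the known set is present; a zero mask is valid. */
static bool validate_bitmap_values(unsigned long mask)
{
    return (mask & ~KNOWN_TYPE_MASK) == 0;
}
```

With this form the caller's `if (!validate_bitmap_values(...)) goto out;` returns NFS4ERR_INVAL exactly when an undefined bit is set, which is what the quoted RFC text requires.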
On Thu, Dec 16, 2010 at 7:47 AM, Boaz Harrosh <[email protected]> wrote:
> On 12/10/2010 03:22 AM, Fred Isaman wrote:
>> Recent changes to close can delay pending layoutcommit until umount,
>> when the async layoutcommits can come trickling in after we have destroyed
>> the session.
>
> Then "Recent changes" are broken and should be fixed. It was fine
> before. New broken code is not acceptable.
>
>> Since file does not need them, just turn them off for
>> the moment.  Non-file layouts will probably have to trigger them in
>> some fashion at close.
>>
>
> Rrrr. Are we back to this argument? We stand down win an argument
> and 2 weeks later you are back on it as if we never talked about it.
>
> NO!!! only "coherent clustered filesystems" do not need them. It has
> nothing to do with layout type. A non-clustered aggregated parallel
> filesystem will need them just the same as blocks and objects.
>
> AND THE STD DOES NOT GIVE YOU A CHOICE!!!
You keep saying this, but just repeating it does not convince me.
Could you please take the time to explain *why* they are needed. A
separate thread in the ietf list would be great. Because right now,
Andy and I are coding and preparing for submission to Trond under the
assumption that they are possibly a nice optimization, but are never
actually needed for the file layout.
Fred
>
>> A better solution is to just push all the layoutcommit code outside
>> of the pnfs-submit branch.  This is really just a stop gap until code
>> is rearranged to make that easier.
>>
>
> Then all this is not finished. Please keep it in the shops until the
> final solution is presented and we can actually see the new compared
> to the old system. Until then we should keep what worked and was tested.
>
> Boaz
>
>> Signed-off-by: Fred Isaman <[email protected]>
>> ---
>>  fs/nfs/nfs4proc.c |    1 -
>>  1 files changed, 0 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 9b15535..224bdfe 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -3098,7 +3098,6 @@ static void pnfs4_update_write_done(struct nfs_inode *nfsi, struct nfs_write_dat
>>  {
>>  #ifdef CONFIG_NFS_V4_1
>>       pnfs_update_last_write(nfsi, data->args.offset, data->res.count);
>> -     pnfs_need_layoutcommit(nfsi, data->args.context);
>>  #endif /* CONFIG_NFS_V4_1 */
>>  }
>>
>
>
On Thu, Dec 16, 2010 at 7:47 AM, Boaz Harrosh <[email protected]> wrote:
> On 12/10/2010 03:22 AM, Fred Isaman wrote:
>
>> --- a/fs/nfs/pnfs.c
>> +++ b/fs/nfs/pnfs.c
>> @@ -599,55 +599,6 @@ void nfs4_asynch_forget_layouts(struct pnfs_layout_hdr *lo,
>>               }
>>  }
>>
>> -/* Return true if there is layout based io in progress in the given range.
>> - * Assumes range has already been marked invalid, and layout marked to
>> - * prevent any new lseg from being inserted.
>> - */
>> -bool
>> -pnfs_return_layout_barrier(struct nfs_inode *nfsi,
>> -                        struct pnfs_layout_range *range)
>> -{
>> -     struct pnfs_layout_segment *lseg;
>> -     bool ret = false;
>> -
>> -     spin_lock(&nfsi->vfs_inode.i_lock);
>> -     list_for_each_entry(lseg, &nfsi->layout->segs, fi_list)
>> -             if (should_free_lseg(&lseg->range, range)) {
>> -                     ret = true;
>> -                     break;
>> -             }
>> -     spin_unlock(&nfsi->vfs_inode.i_lock);
>> -     dprintk("%s:Return %d\n", __func__, ret);
>> -     return ret;
>> -}
>> -
>> -static int
>> -return_layout(struct inode *ino, struct pnfs_layout_range *range, bool wait)
>> -{
>> -     struct nfs4_layoutreturn *lrp;
>> -     struct nfs_server *server = NFS_SERVER(ino);
>> -     int status = -ENOMEM;
>> -
>> -     dprintk("--> %s\n", __func__);
>> -
>> -     lrp = kzalloc(sizeof(*lrp), GFP_KERNEL);
>> -     if (lrp == NULL) {
>> -             put_layout_hdr(ino);
>> -             goto out;
>> -     }
>> -     lrp->args.reclaim = 0;
>> -     lrp->args.layout_type = server->pnfs_curr_ld->id;
>> -     lrp->args.return_type = RETURN_FILE;
>> -     lrp->args.range = *range;
>> -     lrp->args.inode = ino;
>> -     lrp->clp = server->nfs_client;
>> -
>> -     status = nfs4_proc_layoutreturn(lrp, wait);
>> -out:
>> -     dprintk("<-- %s status: %d\n", __func__, status);
>> -     return status;
>> -}
>> -
>>  /* Initiates a LAYOUTRETURN(FILE) */
>>  int
>>  _pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
>> @@ -673,21 +624,10 @@ _pnfs_return_layout(struct inode *ino, struct pnfs_layout_range *range,
>>               goto out;
>>       }
>>       lo->plh_block_lgets++;
>> -     /* Reference matched in nfs4_layoutreturn_release */
>> -     get_layout_hdr(lo);
>>       spin_unlock(&ino->i_lock);
>>       pnfs_free_lseg_list(&tmp_list);
>>
>> -     if (layoutcommit_needed(nfsi)) {
>> -             status = pnfs_layoutcommit_inode(ino, wait);
>> -             if (status) {
>> -                     /* Return layout even if layoutcommit fails */
>> -                     dprintk("%s: layoutcommit failed, status=%d. "
>> -                             "Returning layout anyway\n",
>> -                             __func__, status);
>> -             }
>> -     }
>> -     status = return_layout(ino, &arg, wait);
>
>
> You are also removing the layoutcommit.
> 1. You have not stated it anywhere, and sneaked it in silently
> 2. If you are removing layoutcommit please do that in a different
>    patch with its own comment and explanation.
> 3. How come? forgetful or not layoutcommits are a different issue
>    and must be done correctly when writing !!!?!
>
> Boaz
You are right, that should have been a separate patch.
Fred
>> +     /* Don't need to wait since this is followed by call to end_writeback */
>>  out:
>>       dprintk("<-- %s status: %d\n", __func__, status);
>>       return status;
>> diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
>> index d999e38..0ddab0d 100644
>> --- a/fs/nfs/pnfs.h
>> +++ b/fs/nfs/pnfs.h
>> @@ -183,7 +183,6 @@ extern int nfs4_proc_getdeviceinfo(struct nfs_server *server,
>>  extern int nfs4_proc_layoutget(struct nfs4_layoutget *lgp);
>>  extern int nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data,
>>                                   int issync);
>> -extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool wait);
>>
>>  /* pnfs.c */
>>  void get_layout_hdr(struct pnfs_layout_hdr *lo);
>> @@ -193,7 +192,6 @@ bool should_free_lseg(struct pnfs_layout_range *lseg_range,
>>  struct pnfs_layout_segment *
>>  pnfs_update_layout(struct inode *ino, struct nfs_open_context *ctx,
>>                   enum pnfs_iomode access_type);
>> -bool pnfs_return_layout_barrier(struct nfs_inode *, struct pnfs_layout_range *);
>>  int _pnfs_return_layout(struct inode *, struct pnfs_layout_range *, bool wait);
>>  void set_pnfs_layoutdriver(struct nfs_server *, u32 id);
>>  void unset_pnfs_layoutdriver(struct nfs_server *);
>
>
On 2010-12-15 17:59, Fred Isaman wrote:
>
> On Dec 15, 2010, at 10:56 AM, Benny Halevy wrote:
>
>> On 2010-12-15 17:43, Fred Isaman wrote:
>>>
>>> On Dec 15, 2010, at 10:29 AM, Benny Halevy wrote:
>>>
>>>> On 2010-12-15 16:11, Fred Isaman wrote:
>>>>>
>>>>> On Dec 15, 2010, at 8:57 AM, Benny Halevy wrote:
>>>>>
>>>>>> On 2010-12-10 03:22, Fred Isaman wrote:
>>>>>>> It was checking that at least one known bit was set. It needs to check
>>>>>>> no unknown bit was set. From RFC5661, section 20.6.3:
>>>>>>>
>>>>>>> When a bit is set in the type mask that corresponds to an undefined
>>>>>>> type of recallable object, NFS4ERR_INVAL MUST be returned.
>>>>>>>
>>>>>>> Signed-off-by: Fred Isaman <[email protected]>
>>>>>>> ---
>>>>>>> fs/nfs/callback.h | 1 +
>>>>>>> fs/nfs/callback_proc.c | 27 ++++-----------------------
>>>>>>> 2 files changed, 5 insertions(+), 23 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
>>>>>>> index b16dd1f..616c5c1 100644
>>>>>>> --- a/fs/nfs/callback.h
>>>>>>> +++ b/fs/nfs/callback.h
>>>>>>> @@ -126,6 +126,7 @@ extern int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation,
>>>>>>> #define RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9
>>>>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MIN 12
>>>>>>> #define RCA4_TYPE_MASK_OTHER_LAYOUT_MAX 15
>>>>>>> +#define RCA4_TYPE_MASK_ALL 0xf31f
>>>>>>>
>>>>>>> struct cb_recallanyargs {
>>>>>>> struct sockaddr *craa_addr;
>>>>>>> diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
>>>>>>> index 61b3c66..d4aec46 100644
>>>>>>> --- a/fs/nfs/callback_proc.c
>>>>>>> +++ b/fs/nfs/callback_proc.c
>>>>>>> @@ -661,28 +661,10 @@ out_putclient:
>>>>>>> goto out;
>>>>>>> }
>>>>>>>
>>>>>>> -static inline bool
>>>>>>> -validate_bitmap_values(const unsigned long *mask)
>>>>>>> +static bool
>>>>>>> +validate_bitmap_values(unsigned long mask)
>>>>>>> {
>>>>>>> - int i;
>>>>>>> -
>>>>>>> - if (*mask == 0)
>>>>>>> - return true;
>>>>>>> - if (test_bit(RCA4_TYPE_MASK_RDATA_DLG, mask) ||
>>>>>>> - test_bit(RCA4_TYPE_MASK_WDATA_DLG, mask) ||
>>>>>>> - test_bit(RCA4_TYPE_MASK_DIR_DLG, mask) ||
>>>>>>> - test_bit(RCA4_TYPE_MASK_FILE_LAYOUT, mask) ||
>>>>>>> - test_bit(RCA4_TYPE_MASK_BLK_LAYOUT, mask))
>>>>>>> - return true;
>>>>>>> - for (i = RCA4_TYPE_MASK_OBJ_LAYOUT_MIN;
>>>>>>> - i <= RCA4_TYPE_MASK_OBJ_LAYOUT_MAX; i++)
>>>>>>> - if (test_bit(i, mask))
>>>>>>> - return true;
>>>>>>> - for (i = RCA4_TYPE_MASK_OTHER_LAYOUT_MIN;
>>>>>>> - i <= RCA4_TYPE_MASK_OTHER_LAYOUT_MAX; i++)
>>>>>>> - if (test_bit(i, mask))
>>>>>>> - return true;
>>>>>>> - return false;
>>>>>>> + return mask & ~RCA4_TYPE_MASK_ALL;
>>>>>>
>>>>>> Hmm, shouldn't that be
>>>>>> return (mask & ~RCA4_TYPE_MASK_ALL) == 0;
>>>>>>
>>>>>> Benny
>>>>>>
>>>>>
>>>>> Yes, you are right.
>>>>
>>>> OK. This is fixed in my branch to be released asap.
>>>
>>> Thanks. I have a bunch more minor code cleanups that I'll send once I see what you have, unless you want them immediately. I'll also have a rebase of the pnfs-submit branch with the wave2 patches pushed down to the bottom shortly after I see your branch.
>>>
>>
>> I prefer that you send them now.
>
> OK, I'll send them in a few minutes.
>
Thanks. Got 'em.
>>
>>>> I've reverted large parts of this patchset in the post-submit stream
>>>> to restore layoutcommit and layoutreturn, but not their embedding in
>>>> the CLOSE compound. I also kept the cleanups and bug fixes.
>>>> I'll send out the post-submit when it's ready.
>>>> Some more work will be required to restore the original patches
>>>> author and signoffees.
>>>>
>>>> This is the list as of now:
>>>>
>>>> pick af44531 Revert "pnfs-submit: wave2: remove forgotten layoutreturn struct definitions"
>>>> pick c465549 Revert "pnfs-submit: Turn off layoutcommits"
>>>> pick 0f4ba67 Revert "pnfs-submit: wave2: remove all LAYOUTRETURN code"
>>>> pick 486db47 Revert "pnfs-submit: wave2: Remove LAYOUTRETURN from return on close"
>>>>
>>>> pick 484c935 FIXME: roc should return layout on last close
>>>> (This patch just adds a FIXME comment.)
>>>>
>>>
>>> The above looks good.
>>>
>>>> pick 8698772 Revert "pnfs-submit: wave2: remove cl_layoutrecalls list"
>>>> pick 263879b Revert "pnfs-submit: wave2: Pull out all recall initiated LAYOUTRETURNS"
>>>> pick 693765f Revert "pnfs-submit: wave2: Don't wait in layoutget"
>>>
>>> Just note that Trond was highly resistant to the above.
>>>
>>
>> I understand that in the forgetful client model, but eventually returning the layout
>> on close will be more efficient overall than having the server poll the client.
>>
>>>> pick de56e11 Revert "pnfs-submit: wave2: check that partial LAYOUTGET return is ignored"
>>>>
>>>
>>> We need some sort of check that we got what we asked for, given that the xdr code can chop the server's reply.
>>
>> The post-submit world is supposed to handle partial layouts correctly, as this is required for
>> the obj and block layouts so we can't just blindly toss away partial layouts.
>>
>> If the files layout driver won't support partial layouts, we can restrict this check
>> (sigh, ugly) to that driver only.
>>
>> Benny
>
> Note the problem right now is that the xdr decoding function will only parse the first array element, and toss the rest. That is the real reason for the check, and something that needs to be fixed.
>
I agree we need to implement getting an array of lsegs in the layoutget response,
but the problem with this solution is that it will also ignore partial layouts
provided as a single array element, a (more common, I assume) case that we need to support too :)
Benny
> Fred
>
>>
>>>
>>> Fred
>>>
>>>> Anything else you had in mind?
>>>>
>>>> Benny
>>>>
>>>>>
>>>>> Fred
>>>>>
>>>>>>> }
>>>>>>>
>>>>>>> __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>>>>> @@ -702,8 +684,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
>>>>>>> rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
>>>>>>>
>>>>>>> status = cpu_to_be32(NFS4ERR_INVAL);
>>>>>>> - if (!validate_bitmap_values((const unsigned long *)
>>>>>>> - &args->craa_type_mask))
>>>>>>> + if (!validate_bitmap_values(args->craa_type_mask))
>>>>>>> goto out;
>>>>>>>
>>>>>>> status = cpu_to_be32(NFS4_OK);
>>>>>
>>>
>