2010-05-20 16:05:30

by Andy Adamson

[permalink] [raw]
Subject: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

From: Andy Adamson <[email protected]>

Taking a reference on the nfs_open_context to signal that a layoutcommit is
needed is problematic because the last reference to the context triggers a
close (nfs_release). But, if the layout holds a reference on the
nfs_open_context, then close will not be triggered.

Since we only use the rpc credential from the layoutcommit_ctx, replace the
layoutcommit_ctx with the rpc_cred.

Hold a reference on the rpc_cred until the layoutcommit rpc returns. Note that
the rpc layer (rpcauth_bind) also references the rpc_cred.

If the layoutdriver fails to setup the layoutcommit, clear the layoutcommit
properties and put the credential.

Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/inode.c | 4 ++--
fs/nfs/nfs4state.c | 2 +-
fs/nfs/pnfs.c | 42 +++++++++++++++++-------------------------
fs/nfs/write.c | 2 +-
include/linux/nfs4_pnfs.h | 6 ++++++
include/linux/nfs_fs.h | 3 +--
include/linux/pnfs_xdr.h | 1 -
7 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 7fd33d9..2b8e8e6 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1110,7 +1110,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
/*
* file needs layout commit, server attributes may be stale
*/
- if (nfsi->layoutcommit_ctx && nfsi->change_attr >= fattr->change_attr) {
+ if (do_layoutcommit(nfsi) && nfsi->change_attr >= fattr->change_attr) {
dprintk("NFS: %s: layoutcommit is needed for file %s/%ld\n",
__func__, inode->i_sb->s_id, inode->i_ino);
return 0;
@@ -1331,7 +1331,7 @@ static void pnfs_alloc_init_inode(struct nfs_inode *nfsi)
nfsi->pnfs_layout_state = 0;
memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE);
nfsi->layout.roc_iomode = 0;
- nfsi->layoutcommit_ctx = NULL;
+ nfsi->lo_cred = NULL;
nfsi->pnfs_write_begin_pos = 0;
nfsi->pnfs_write_end_pos = 0;
#endif /* CONFIG_NFS_V4_1 */
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index d145de1..bf03fde 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -589,7 +589,7 @@ static void __nfs4_close(struct path *path, struct nfs4_state *state, fmode_t fm
#ifdef CONFIG_NFS_V4_1
struct nfs_inode *nfsi = NFS_I(state->inode);

- if (nfsi->layoutcommit_ctx)
+ if (do_layoutcommit(nfsi))
pnfs_layoutcommit_inode(state->inode, wait);
if (has_layout(nfsi) && nfsi->layout.roc_iomode) {
struct nfs4_pnfs_layout_segment range;
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 20285bc..cb8ff06 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -145,21 +145,19 @@ find_pnfs(u32 id, struct pnfs_module **module) {
return 0;
}

-/* Set context to indicate we require a layoutcommit
+/* Set lo_cred to indicate we require a layoutcommit
* If we don't even have a layout, we don't need to commit it.
*/
void
pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
{
- dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__,
- has_layout(nfsi), nfsi->layoutcommit_ctx, ctx);
+ dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
spin_lock(&nfsi->lo_lock);
- if (has_layout(nfsi) && !nfsi->layoutcommit_ctx) {
- nfsi->layoutcommit_ctx = get_nfs_open_context(ctx);
+ if (has_layout(nfsi) && !do_layoutcommit(nfsi)) {
+ nfsi->lo_cred = get_rpccred(ctx->state->owner->so_cred);
nfsi->change_attr++;
spin_unlock(&nfsi->lo_lock);
- dprintk("%s: Set layoutcommit_ctx=%p\n", __func__,
- nfsi->layoutcommit_ctx);
+ dprintk("%s: Set layoutcommit\n", __func__);
return;
}
spin_unlock(&nfsi->lo_lock);
@@ -755,7 +753,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
!pnfs_return_layout_barrier(nfsi, &arg));
}

- if (nfsi->layoutcommit_ctx) {
+ if (do_layoutcommit(nfsi)) {
status = pnfs_layoutcommit_inode(ino, wait);
if (status) {
dprintk("%s: layoutcommit failed, status=%d. "
@@ -1910,16 +1908,9 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit_data *data)

dprintk("%s: (status %d)\n", __func__, data->status);

- /* TODO: For now, set an error in the open context (just like
- * if a commit failed) We may want to do more, much more, like
- * replay all writes through the NFSv4
- * server, or something.
- */
- if (data->status < 0) {
+ if (data->status < 0)
printk(KERN_ERR "%s, Layoutcommit Failed! = %d\n",
__func__, data->status);
- data->ctx->error = data->status;
- }

/* TODO: Maybe we should avoid this by allowing the layout driver
* to directly xdr its layout on the wire.
@@ -1929,9 +1920,6 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit_data *data)
&nfsi->layout,
&data->args,
data->status);
-
- /* release the open_context acquired in pnfs_writeback_done */
- put_nfs_open_context(data->ctx);
}

/*
@@ -1995,30 +1983,34 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
return -ENOMEM;

spin_lock(&nfsi->lo_lock);
- if (!nfsi->layoutcommit_ctx)
+ if (!do_layoutcommit(nfsi))
goto out_unlock;

data->args.inode = inode;
- data->cred = nfsi->layoutcommit_ctx->cred;
- data->ctx = nfsi->layoutcommit_ctx;
+ data->cred = nfsi->lo_cred;

/* Set up layout commit args*/
status = pnfs_layoutcommit_setup(data, sync);
- if (status)
- goto out_unlock;

/* Clear layoutcommit properties in the inode so
* new lc info can be generated
*/
nfsi->pnfs_write_begin_pos = 0;
nfsi->pnfs_write_end_pos = 0;
- nfsi->layoutcommit_ctx = NULL;
+ nfsi->lo_cred = NULL;
+
+ if (status) {
+ /* The layout driver failed to setup the layoutcommit */
+ put_rpccred(data->cred);
+ goto out_unlock;
+ }

/* release lock on pnfs layoutcommit attrs */
spin_unlock(&nfsi->lo_lock);

data->is_sync = sync;
status = pnfs4_proc_layoutcommit(data);
+ put_rpccred(data->cred);
out:
dprintk("%s end (err:%d)\n", __func__, status);
return status;
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index a4c95a0..2c1918e 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1490,7 +1490,7 @@ static int nfs_commit_inode(struct inode *inode, int how)
nfs_wait_bit_killable,
TASK_KILLABLE);
#ifdef CONFIG_NFS_V4_1
- if (may_wait && NFS_I(inode)->layoutcommit_ctx) {
+ if (may_wait && do_layoutcommit(NFS_I(inode))) {
error = pnfs_layoutcommit_inode(inode, 1);
if (error < 0)
return error;
diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
index 4d47b48..c9ef43e 100644
--- a/include/linux/nfs4_pnfs.h
+++ b/include/linux/nfs4_pnfs.h
@@ -80,6 +80,12 @@ has_layout(struct nfs_inode *nfsi)
return nfsi->layout.ld_data != NULL;
}

+static inline bool
+do_layoutcommit(struct nfs_inode *nfsi)
+{
+ return nfsi->lo_cred != NULL;
+}
+
#endif /* CONFIG_NFS_V4_1 */

struct pnfs_layout_segment {
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 98a8dc0..478b00c 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -203,11 +203,10 @@ struct nfs_inode {
#define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */
#define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */
time_t pnfs_layout_suspend;
+ struct rpc_cred *lo_cred; /* layoutcommit credential */
wait_queue_head_t lo_waitq;
spinlock_t lo_lock;
struct pnfs_layout_type layout;
- /* use rpc_creds in this open_context to send LAYOUTCOMMIT to MDS */
- struct nfs_open_context *layoutcommit_ctx;
/* DH: These vars keep track of the maximum write range
* so the values can be used for layoutcommit.
*/
diff --git a/include/linux/pnfs_xdr.h b/include/linux/pnfs_xdr.h
index a0bf341..154b04e 100644
--- a/include/linux/pnfs_xdr.h
+++ b/include/linux/pnfs_xdr.h
@@ -86,7 +86,6 @@ struct pnfs_layoutcommit_data {
bool is_sync;
struct rpc_cred *cred;
struct nfs_fattr fattr;
- struct nfs_open_context *ctx;
struct pnfs_layoutcommit_arg args;
struct pnfs_layoutcommit_res res;
int status;
--
1.6.6



2010-05-24 13:43:08

by Andy Adamson

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On Mon, May 24, 2010 at 2:37 AM, Tao Guo <[email protected]> wrote:
> I see... But what if sync == 1? At that time, data must have already
> been freed, so you should not use data->cred anymore.

Agreed.

Since LAYOUTCOMMIT is not called after close, we can get rid of taking
the reference all together since a reference is taken on OPEN and
dropped on CLOSE.

-->Andy
>
> --
> tao.
>

2010-05-25 04:07:23

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On May. 24, 2010, 21:28 +0300, [email protected] wrote:
> From: Andy Adamson <[email protected]>
>
> Taking a reference on the nfs_open_context to signal that a layoutcommit is
> needed is problematic because the last reference to the context triggers a
> close (nfs_release). But, if the layout holds a reference on the
> nfs_open_context, then close will not be triggered.
>
> Since we only use the rpc credential from the layoutcommit_ctx, replace the
> layoutcommit_ctx with the rpc_cred.
>
> Hold a reference on the rpc_cred until the layoutcommit rpc returns.
>
> If the layoutdriver fails to setup the layoutcommit, clear the layoutcommit
> properties and put the credential.
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> fs/nfs/inode.c | 4 ++--
> fs/nfs/nfs4state.c | 2 +-
> fs/nfs/pnfs.c | 42 +++++++++++++++++-------------------------
> fs/nfs/write.c | 2 +-
> include/linux/nfs4_pnfs.h | 6 ++++++
> include/linux/nfs_fs.h | 3 +--
> include/linux/pnfs_xdr.h | 1 -
> 7 files changed, 28 insertions(+), 32 deletions(-)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 7fd33d9..2b8e8e6 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1110,7 +1110,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> /*
> * file needs layout commit, server attributes may be stale
> */
> - if (nfsi->layoutcommit_ctx && nfsi->change_attr >= fattr->change_attr) {
> + if (do_layoutcommit(nfsi) && nfsi->change_attr >= fattr->change_attr) {

Andy, the patch looks good overall, thanks!
One nit though: mind if I rename "do_layoutcommit" which reads as if it actually performs
the layoutcommit call to "layoutcommit_needed"?

Benny

> dprintk("NFS: %s: layoutcommit is needed for file %s/%ld\n",
> __func__, inode->i_sb->s_id, inode->i_ino);
> return 0;
> @@ -1331,7 +1331,7 @@ static void pnfs_alloc_init_inode(struct nfs_inode *nfsi)
> nfsi->pnfs_layout_state = 0;
> memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE);
> nfsi->layout.roc_iomode = 0;
> - nfsi->layoutcommit_ctx = NULL;
> + nfsi->lo_cred = NULL;
> nfsi->pnfs_write_begin_pos = 0;
> nfsi->pnfs_write_end_pos = 0;
> #endif /* CONFIG_NFS_V4_1 */
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index d145de1..bf03fde 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -589,7 +589,7 @@ static void __nfs4_close(struct path *path, struct nfs4_state *state, fmode_t fm
> #ifdef CONFIG_NFS_V4_1
> struct nfs_inode *nfsi = NFS_I(state->inode);
>
> - if (nfsi->layoutcommit_ctx)
> + if (do_layoutcommit(nfsi))
> pnfs_layoutcommit_inode(state->inode, wait);
> if (has_layout(nfsi) && nfsi->layout.roc_iomode) {
> struct nfs4_pnfs_layout_segment range;
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 20285bc..ee530c8 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -145,21 +145,19 @@ find_pnfs(u32 id, struct pnfs_module **module) {
> return 0;
> }
>
> -/* Set context to indicate we require a layoutcommit
> +/* Set lo_cred to indicate we require a layoutcommit
> * If we don't even have a layout, we don't need to commit it.
> */
> void
> pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
> {
> - dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__,
> - has_layout(nfsi), nfsi->layoutcommit_ctx, ctx);
> + dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
> spin_lock(&nfsi->lo_lock);
> - if (has_layout(nfsi) && !nfsi->layoutcommit_ctx) {
> - nfsi->layoutcommit_ctx = get_nfs_open_context(ctx);
> + if (has_layout(nfsi) && !do_layoutcommit(nfsi)) {
> + nfsi->lo_cred = get_rpccred(ctx->state->owner->so_cred);
> nfsi->change_attr++;
> spin_unlock(&nfsi->lo_lock);
> - dprintk("%s: Set layoutcommit_ctx=%p\n", __func__,
> - nfsi->layoutcommit_ctx);
> + dprintk("%s: Set layoutcommit\n", __func__);
> return;
> }
> spin_unlock(&nfsi->lo_lock);
> @@ -755,7 +753,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
> !pnfs_return_layout_barrier(nfsi, &arg));
> }
>
> - if (nfsi->layoutcommit_ctx) {
> + if (do_layoutcommit(nfsi)) {
> status = pnfs_layoutcommit_inode(ino, wait);
> if (status) {
> dprintk("%s: layoutcommit failed, status=%d. "
> @@ -1910,16 +1908,9 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit_data *data)
>
> dprintk("%s: (status %d)\n", __func__, data->status);
>
> - /* TODO: For now, set an error in the open context (just like
> - * if a commit failed) We may want to do more, much more, like
> - * replay all writes through the NFSv4
> - * server, or something.
> - */
> - if (data->status < 0) {
> + if (data->status < 0)
> printk(KERN_ERR "%s, Layoutcommit Failed! = %d\n",
> __func__, data->status);
> - data->ctx->error = data->status;
> - }
>
> /* TODO: Maybe we should avoid this by allowing the layout driver
> * to directly xdr its layout on the wire.
> @@ -1929,9 +1920,7 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit_data *data)
> &nfsi->layout,
> &data->args,
> data->status);
> -
> - /* release the open_context acquired in pnfs_writeback_done */
> - put_nfs_open_context(data->ctx);
> + put_rpccred(data->cred);
> }
>
> /*
> @@ -1995,24 +1984,27 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
> return -ENOMEM;
>
> spin_lock(&nfsi->lo_lock);
> - if (!nfsi->layoutcommit_ctx)
> + if (!do_layoutcommit(nfsi))
> goto out_unlock;
>
> data->args.inode = inode;
> - data->cred = nfsi->layoutcommit_ctx->cred;
> - data->ctx = nfsi->layoutcommit_ctx;
> + data->cred = nfsi->lo_cred;
>
> /* Set up layout commit args*/
> status = pnfs_layoutcommit_setup(data, sync);
> - if (status)
> - goto out_unlock;
>
> /* Clear layoutcommit properties in the inode so
> * new lc info can be generated
> */
> nfsi->pnfs_write_begin_pos = 0;
> nfsi->pnfs_write_end_pos = 0;
> - nfsi->layoutcommit_ctx = NULL;
> + nfsi->lo_cred = NULL;
> +
> + if (status) {
> + /* The layout driver failed to setup the layoutcommit */
> + put_rpccred(data->cred);
> + goto out_unlock;
> + }
>
> /* release lock on pnfs layoutcommit attrs */
> spin_unlock(&nfsi->lo_lock);
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index a4c95a0..2c1918e 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1490,7 +1490,7 @@ static int nfs_commit_inode(struct inode *inode, int how)
> nfs_wait_bit_killable,
> TASK_KILLABLE);
> #ifdef CONFIG_NFS_V4_1
> - if (may_wait && NFS_I(inode)->layoutcommit_ctx) {
> + if (may_wait && do_layoutcommit(NFS_I(inode))) {
> error = pnfs_layoutcommit_inode(inode, 1);
> if (error < 0)
> return error;
> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
> index 4d47b48..c9ef43e 100644
> --- a/include/linux/nfs4_pnfs.h
> +++ b/include/linux/nfs4_pnfs.h
> @@ -80,6 +80,12 @@ has_layout(struct nfs_inode *nfsi)
> return nfsi->layout.ld_data != NULL;
> }
>
> +static inline bool
> +do_layoutcommit(struct nfs_inode *nfsi)
> +{
> + return nfsi->lo_cred != NULL;
> +}
> +
> #endif /* CONFIG_NFS_V4_1 */
>
> struct pnfs_layout_segment {
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index 98a8dc0..478b00c 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -203,11 +203,10 @@ struct nfs_inode {
> #define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */
> #define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */
> time_t pnfs_layout_suspend;
> + struct rpc_cred *lo_cred; /* layoutcommit credential */
> wait_queue_head_t lo_waitq;
> spinlock_t lo_lock;
> struct pnfs_layout_type layout;
> - /* use rpc_creds in this open_context to send LAYOUTCOMMIT to MDS */
> - struct nfs_open_context *layoutcommit_ctx;
> /* DH: These vars keep track of the maximum write range
> * so the values can be used for layoutcommit.
> */
> diff --git a/include/linux/pnfs_xdr.h b/include/linux/pnfs_xdr.h
> index a0bf341..154b04e 100644
> --- a/include/linux/pnfs_xdr.h
> +++ b/include/linux/pnfs_xdr.h
> @@ -86,7 +86,6 @@ struct pnfs_layoutcommit_data {
> bool is_sync;
> struct rpc_cred *cred;
> struct nfs_fattr fattr;
> - struct nfs_open_context *ctx;
> struct pnfs_layoutcommit_arg args;
> struct pnfs_layoutcommit_res res;
> int status;

2010-05-25 06:56:19

by Benny Halevy

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On May. 25, 2010, 7:07 +0300, Benny Halevy <[email protected]> wrote:
> On May. 24, 2010, 21:28 +0300, [email protected] wrote:
>> From: Andy Adamson <[email protected]>
>>
>> Taking a reference on the nfs_open_context to signal that a layoutcommit is
>> needed is problematic because the last reference to the context triggers a
>> close (nfs_release). But, if the layout holds a reference on the
>> nfs_open_context, then close will not be triggered.
>>
>> Since we only use the rpc credential from the layoutcommit_ctx, replace the
>> layoutcommit_ctx with the rpc_cred.
>>
>> Hold a reference on the rpc_cred until the layoutcommit rpc returns.
>>
>> If the layoutdriver fails to setup the layoutcommit, clear the layoutcommit
>> properties and put the credential.
>>
>> Signed-off-by: Andy Adamson <[email protected]>
>> ---
>> fs/nfs/inode.c | 4 ++--
>> fs/nfs/nfs4state.c | 2 +-
>> fs/nfs/pnfs.c | 42 +++++++++++++++++-------------------------
>> fs/nfs/write.c | 2 +-
>> include/linux/nfs4_pnfs.h | 6 ++++++
>> include/linux/nfs_fs.h | 3 +--
>> include/linux/pnfs_xdr.h | 1 -
>> 7 files changed, 28 insertions(+), 32 deletions(-)
>>
>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>> index 7fd33d9..2b8e8e6 100644
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -1110,7 +1110,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
>> /*
>> * file needs layout commit, server attributes may be stale
>> */
>> - if (nfsi->layoutcommit_ctx && nfsi->change_attr >= fattr->change_attr) {
>> + if (do_layoutcommit(nfsi) && nfsi->change_attr >= fattr->change_attr) {
>
> Andy, the patch looks good overall, thanks!
> One nit though: mind if I rename "do_layoutcommit" which reads as if it actually performs
> the layoutcommit call to "layoutcommit_needed"?
>
> Benny
>

Committed at pnfs-all-2.6.34-2010-05-25
(with the renamed function)

Thanks!

Benny

>> dprintk("NFS: %s: layoutcommit is needed for file %s/%ld\n",
>> __func__, inode->i_sb->s_id, inode->i_ino);
>> return 0;
>> @@ -1331,7 +1331,7 @@ static void pnfs_alloc_init_inode(struct nfs_inode *nfsi)
>> nfsi->pnfs_layout_state = 0;
>> memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE);
>> nfsi->layout.roc_iomode = 0;
>> - nfsi->layoutcommit_ctx = NULL;
>> + nfsi->lo_cred = NULL;
>> nfsi->pnfs_write_begin_pos = 0;
>> nfsi->pnfs_write_end_pos = 0;
>> #endif /* CONFIG_NFS_V4_1 */
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index d145de1..bf03fde 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -589,7 +589,7 @@ static void __nfs4_close(struct path *path, struct nfs4_state *state, fmode_t fm
>> #ifdef CONFIG_NFS_V4_1
>> struct nfs_inode *nfsi = NFS_I(state->inode);
>>
>> - if (nfsi->layoutcommit_ctx)
>> + if (do_layoutcommit(nfsi))
>> pnfs_layoutcommit_inode(state->inode, wait);
>> if (has_layout(nfsi) && nfsi->layout.roc_iomode) {
>> struct nfs4_pnfs_layout_segment range;
>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>> index 20285bc..ee530c8 100644
>> --- a/fs/nfs/pnfs.c
>> +++ b/fs/nfs/pnfs.c
>> @@ -145,21 +145,19 @@ find_pnfs(u32 id, struct pnfs_module **module) {
>> return 0;
>> }
>>
>> -/* Set context to indicate we require a layoutcommit
>> +/* Set lo_cred to indicate we require a layoutcommit
>> * If we don't even have a layout, we don't need to commit it.
>> */
>> void
>> pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_context *ctx)
>> {
>> - dprintk("%s: has_layout=%d layoutcommit_ctx=%p ctx=%p\n", __func__,
>> - has_layout(nfsi), nfsi->layoutcommit_ctx, ctx);
>> + dprintk("%s: has_layout=%d ctx=%p\n", __func__, has_layout(nfsi), ctx);
>> spin_lock(&nfsi->lo_lock);
>> - if (has_layout(nfsi) && !nfsi->layoutcommit_ctx) {
>> - nfsi->layoutcommit_ctx = get_nfs_open_context(ctx);
>> + if (has_layout(nfsi) && !do_layoutcommit(nfsi)) {
>> + nfsi->lo_cred = get_rpccred(ctx->state->owner->so_cred);
>> nfsi->change_attr++;
>> spin_unlock(&nfsi->lo_lock);
>> - dprintk("%s: Set layoutcommit_ctx=%p\n", __func__,
>> - nfsi->layoutcommit_ctx);
>> + dprintk("%s: Set layoutcommit\n", __func__);
>> return;
>> }
>> spin_unlock(&nfsi->lo_lock);
>> @@ -755,7 +753,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs4_pnfs_layout_segment *range,
>> !pnfs_return_layout_barrier(nfsi, &arg));
>> }
>>
>> - if (nfsi->layoutcommit_ctx) {
>> + if (do_layoutcommit(nfsi)) {
>> status = pnfs_layoutcommit_inode(ino, wait);
>> if (status) {
>> dprintk("%s: layoutcommit failed, status=%d. "
>> @@ -1910,16 +1908,9 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit_data *data)
>>
>> dprintk("%s: (status %d)\n", __func__, data->status);
>>
>> - /* TODO: For now, set an error in the open context (just like
>> - * if a commit failed) We may want to do more, much more, like
>> - * replay all writes through the NFSv4
>> - * server, or something.
>> - */
>> - if (data->status < 0) {
>> + if (data->status < 0)
>> printk(KERN_ERR "%s, Layoutcommit Failed! = %d\n",
>> __func__, data->status);
>> - data->ctx->error = data->status;
>> - }
>>
>> /* TODO: Maybe we should avoid this by allowing the layout driver
>> * to directly xdr its layout on the wire.
>> @@ -1929,9 +1920,7 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit_data *data)
>> &nfsi->layout,
>> &data->args,
>> data->status);
>> -
>> - /* release the open_context acquired in pnfs_writeback_done */
>> - put_nfs_open_context(data->ctx);
>> + put_rpccred(data->cred);
>> }
>>
>> /*
>> @@ -1995,24 +1984,27 @@ pnfs_layoutcommit_inode(struct inode *inode, int sync)
>> return -ENOMEM;
>>
>> spin_lock(&nfsi->lo_lock);
>> - if (!nfsi->layoutcommit_ctx)
>> + if (!do_layoutcommit(nfsi))
>> goto out_unlock;
>>
>> data->args.inode = inode;
>> - data->cred = nfsi->layoutcommit_ctx->cred;
>> - data->ctx = nfsi->layoutcommit_ctx;
>> + data->cred = nfsi->lo_cred;
>>
>> /* Set up layout commit args*/
>> status = pnfs_layoutcommit_setup(data, sync);
>> - if (status)
>> - goto out_unlock;
>>
>> /* Clear layoutcommit properties in the inode so
>> * new lc info can be generated
>> */
>> nfsi->pnfs_write_begin_pos = 0;
>> nfsi->pnfs_write_end_pos = 0;
>> - nfsi->layoutcommit_ctx = NULL;
>> + nfsi->lo_cred = NULL;
>> +
>> + if (status) {
>> + /* The layout driver failed to setup the layoutcommit */
>> + put_rpccred(data->cred);
>> + goto out_unlock;
>> + }
>>
>> /* release lock on pnfs layoutcommit attrs */
>> spin_unlock(&nfsi->lo_lock);
>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>> index a4c95a0..2c1918e 100644
>> --- a/fs/nfs/write.c
>> +++ b/fs/nfs/write.c
>> @@ -1490,7 +1490,7 @@ static int nfs_commit_inode(struct inode *inode, int how)
>> nfs_wait_bit_killable,
>> TASK_KILLABLE);
>> #ifdef CONFIG_NFS_V4_1
>> - if (may_wait && NFS_I(inode)->layoutcommit_ctx) {
>> + if (may_wait && do_layoutcommit(NFS_I(inode))) {
>> error = pnfs_layoutcommit_inode(inode, 1);
>> if (error < 0)
>> return error;
>> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
>> index 4d47b48..c9ef43e 100644
>> --- a/include/linux/nfs4_pnfs.h
>> +++ b/include/linux/nfs4_pnfs.h
>> @@ -80,6 +80,12 @@ has_layout(struct nfs_inode *nfsi)
>> return nfsi->layout.ld_data != NULL;
>> }
>>
>> +static inline bool
>> +do_layoutcommit(struct nfs_inode *nfsi)
>> +{
>> + return nfsi->lo_cred != NULL;
>> +}
>> +
>> #endif /* CONFIG_NFS_V4_1 */
>>
>> struct pnfs_layout_segment {
>> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
>> index 98a8dc0..478b00c 100644
>> --- a/include/linux/nfs_fs.h
>> +++ b/include/linux/nfs_fs.h
>> @@ -203,11 +203,10 @@ struct nfs_inode {
>> #define NFS_INO_RW_LAYOUT_FAILED 1 /* get rw layout failed stop trying */
>> #define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout allocation */
>> time_t pnfs_layout_suspend;
>> + struct rpc_cred *lo_cred; /* layoutcommit credential */
>> wait_queue_head_t lo_waitq;
>> spinlock_t lo_lock;
>> struct pnfs_layout_type layout;
>> - /* use rpc_creds in this open_context to send LAYOUTCOMMIT to MDS */
>> - struct nfs_open_context *layoutcommit_ctx;
>> /* DH: These vars keep track of the maximum write range
>> * so the values can be used for layoutcommit.
>> */
>> diff --git a/include/linux/pnfs_xdr.h b/include/linux/pnfs_xdr.h
>> index a0bf341..154b04e 100644
>> --- a/include/linux/pnfs_xdr.h
>> +++ b/include/linux/pnfs_xdr.h
>> @@ -86,7 +86,6 @@ struct pnfs_layoutcommit_data {
>> bool is_sync;
>> struct rpc_cred *cred;
>> struct nfs_fattr fattr;
>> - struct nfs_open_context *ctx;
>> struct pnfs_layoutcommit_arg args;
>> struct pnfs_layoutcommit_res res;
>> int status;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2010-05-21 02:50:35

by Tao Guo

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On Fri, May 21, 2010 at 12:04 AM, <[email protected]> wrote:
> From: Andy Adamson <[email protected]>
>
> Taking a reference on the nfs_open_context to signal that a layoutcom=
mit is
> needed is problematic because the last reference to the context trigg=
ers a
> close (nfs_release). =C2=A0But, if the layout holds a reference on th=
e
> nfs_open_context, then close will not be triggered.
>
> Since we only use the rpc credential from the layoutcommit_ctx, repla=
ce the
> layoutcommit_ctx with the rpc_cred.
>
> Hold a reference on the rpc_cred until the layoutcommit rpc returns. =
Note that
> the rpc layer (rpcauth_bind) also references the rpc_cred.
>
> If the layoutdriver fails to setup the layoutcommit, clear the layout=
commit
> properties and put the credential.
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> =C2=A0fs/nfs/inode.c =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0=
=C2=A04 ++--
> =C2=A0fs/nfs/nfs4state.c =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 =C2=A02 =
+-
> =C2=A0fs/nfs/pnfs.c =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | =C2=A0=
42 +++++++++++++++++-------------------------
> =C2=A0fs/nfs/write.c =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0=
=C2=A02 +-
> =C2=A0include/linux/nfs4_pnfs.h | =C2=A0 =C2=A06 ++++++
> =C2=A0include/linux/nfs_fs.h =C2=A0 =C2=A0| =C2=A0 =C2=A03 +--
> =C2=A0include/linux/pnfs_xdr.h =C2=A0| =C2=A0 =C2=A01 -
> =C2=A07 files changed, 28 insertions(+), 32 deletions(-)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 7fd33d9..2b8e8e6 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1110,7 +1110,7 @@ static int nfs_update_inode(struct inode *inode=
, struct nfs_fattr *fattr)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0/*
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * file needs layout commit, server attrib=
utes may be stale
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 */
> - =C2=A0 =C2=A0 =C2=A0 if (nfsi->layoutcommit_ctx && nfsi->change_att=
r >=3D fattr->change_attr) {
> + =C2=A0 =C2=A0 =C2=A0 if (do_layoutcommit(nfsi) && nfsi->change_attr=
>=3D fattr->change_attr) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0dprintk("NFS: =
%s: layoutcommit is needed for file %s/%ld\n",
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0__func__, inode->i_sb->s_id, inode->i_ino);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return 0;
> @@ -1331,7 +1331,7 @@ static void pnfs_alloc_init_inode(struct nfs_in=
ode *nfsi)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->pnfs_layout_state =3D 0;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0memset(&nfsi->layout.stateid, 0, NFS4_STAT=
EID_SIZE);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->layout.roc_iomode =3D 0;
> - =C2=A0 =C2=A0 =C2=A0 nfsi->layoutcommit_ctx =3D NULL;
> + =C2=A0 =C2=A0 =C2=A0 nfsi->lo_cred =3D NULL;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->pnfs_write_begin_pos =3D 0;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->pnfs_write_end_pos =3D 0;
> =C2=A0#endif /* CONFIG_NFS_V4_1 */
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index d145de1..bf03fde 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -589,7 +589,7 @@ static void __nfs4_close(struct path *path, struc=
t nfs4_state *state, fmode_t fm
> =C2=A0#ifdef CONFIG_NFS_V4_1
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct nfs_ino=
de *nfsi =3D NFS_I(state->inode);
>
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (nfsi->layoutco=
mmit_ctx)
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (do_layoutcommi=
t(nfsi))
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0pnfs_layoutcommit_inode(state->inode, wait);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (has_layout=
(nfsi) && nfsi->layout.roc_iomode) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0struct nfs4_pnfs_layout_segment range;
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 20285bc..cb8ff06 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -145,21 +145,19 @@ find_pnfs(u32 id, struct pnfs_module **module) =
{
> =C2=A0 =C2=A0 =C2=A0 =C2=A0return 0;
> =C2=A0}
>
> -/* Set context to indicate we require a layoutcommit
> +/* Set lo_cred to indicate we require a layoutcommit
> =C2=A0* If we don't even have a layout, we don't need to commit it.
> =C2=A0*/
> =C2=A0void
> =C2=A0pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_=
context *ctx)
> =C2=A0{
> - =C2=A0 =C2=A0 =C2=A0 dprintk("%s: has_layout=3D%d layoutcommit_ctx=3D=
%p ctx=3D%p\n", __func__,
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 has_layout(nfsi), =
nfsi->layoutcommit_ctx, ctx);
> + =C2=A0 =C2=A0 =C2=A0 dprintk("%s: has_layout=3D%d ctx=3D%p\n", __fu=
nc__, has_layout(nfsi), ctx);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_lock(&nfsi->lo_lock);
> - =C2=A0 =C2=A0 =C2=A0 if (has_layout(nfsi) && !nfsi->layoutcommit_ct=
x) {
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsi->layoutcommit=
_ctx =3D get_nfs_open_context(ctx);
> + =C2=A0 =C2=A0 =C2=A0 if (has_layout(nfsi) && !do_layoutcommit(nfsi)=
) {
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 nfsi->lo_cred =3D =
get_rpccred(ctx->state->owner->so_cred);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->change_a=
ttr++;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_unlock(&n=
fsi->lo_lock);
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dprintk("%s: Set l=
ayoutcommit_ctx=3D%p\n", __func__,
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 nfsi->layoutcommit_ctx);
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 dprintk("%s: Set l=
ayoutcommit\n", __func__);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0}
> =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_unlock(&nfsi->lo_lock);
> @@ -755,7 +753,7 @@ _pnfs_return_layout(struct inode *ino, struct nfs=
4_pnfs_layout_segment *range,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0!pnfs_return_layout_barrier(n=
fsi, &arg));
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}
>
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (nfsi->layoutco=
mmit_ctx) {
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (do_layoutcommi=
t(nfsi)) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0status =3D pnfs_layoutcommit_inode(ino, wait);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0if (status) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0dprintk("%s: layoutcommit fai=
led, status=3D%d. "
> @@ -1910,16 +1908,9 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommi=
t_data *data)
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0dprintk("%s: (status %d)\n", __func__, dat=
a->status);
>
> - =C2=A0 =C2=A0 =C2=A0 /* TODO: For now, set an error in the open con=
text (just like
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0* if a commit failed) We may want to do =
more, much more, like
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0* replay all writes through the NFSv4
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0* server, or something.
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0*/
> - =C2=A0 =C2=A0 =C2=A0 if (data->status < 0) {
> + =C2=A0 =C2=A0 =C2=A0 if (data->status < 0)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0printk(KERN_ER=
R "%s, Layoutcommit Failed! =3D %d\n",
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 __func__, data->status);
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 data->ctx->error =3D=
data->status;
> - =C2=A0 =C2=A0 =C2=A0 }
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0/* TODO: Maybe we should avoid this by all=
owing the layout driver
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * to directly xdr its layout on the wire.
> @@ -1929,9 +1920,6 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommit=
_data *data)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0&nfsi->layout,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0&data->args,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0data->status);
> -
> - =C2=A0 =C2=A0 =C2=A0 /* release the open_context acquired in pnfs_w=
riteback_done */
> - =C2=A0 =C2=A0 =C2=A0 put_nfs_open_context(data->ctx);
> =C2=A0}
>
> =C2=A0/*
> @@ -1995,30 +1983,34 @@ pnfs_layoutcommit_inode(struct inode *inode, =
int sync)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return -ENOMEM=
;
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_lock(&nfsi->lo_lock);
> - =C2=A0 =C2=A0 =C2=A0 if (!nfsi->layoutcommit_ctx)
> + =C2=A0 =C2=A0 =C2=A0 if (!do_layoutcommit(nfsi))
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0goto out_unloc=
k;
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0data->args.inode =3D inode;
> - =C2=A0 =C2=A0 =C2=A0 data->cred =C2=A0=3D nfsi->layoutcommit_ctx->c=
red;
> - =C2=A0 =C2=A0 =C2=A0 data->ctx =3D nfsi->layoutcommit_ctx;
> + =C2=A0 =C2=A0 =C2=A0 data->cred =3D nfsi->lo_cred;
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0/* Set up layout commit args*/
> =C2=A0 =C2=A0 =C2=A0 =C2=A0status =3D pnfs_layoutcommit_setup(data, s=
ync);
> - =C2=A0 =C2=A0 =C2=A0 if (status)
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto out_unlock;
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0/* Clear layoutcommit properties in the in=
ode so
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * new lc info can be generated
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 */
> =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->pnfs_write_begin_pos =3D 0;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0nfsi->pnfs_write_end_pos =3D 0;
> - =C2=A0 =C2=A0 =C2=A0 nfsi->layoutcommit_ctx =3D NULL;
> + =C2=A0 =C2=A0 =C2=A0 nfsi->lo_cred =3D NULL;
> +
> + =C2=A0 =C2=A0 =C2=A0 if (status) {
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* The layout driv=
er failed to setup the layoutcommit */
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 put_rpccred(data->=
cred);
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto out_unlock;
> + =C2=A0 =C2=A0 =C2=A0 }
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0/* release lock on pnfs layoutcommit attrs=
*/
> =C2=A0 =C2=A0 =C2=A0 =C2=A0spin_unlock(&nfsi->lo_lock);
>
> =C2=A0 =C2=A0 =C2=A0 =C2=A0data->is_sync =3D sync;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0status =3D pnfs4_proc_layoutcommit(data);
> + =C2=A0 =C2=A0 =C2=A0 put_rpccred(data->cred);
Is this OK to put_rpccred here if sync =3D=3D 0? why not move it into
pnfs_layoutcommit_done just as the code did?
--tao

> =C2=A0out:
> =C2=A0 =C2=A0 =C2=A0 =C2=A0dprintk("%s end (err:%d)\n", __func__, sta=
tus);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0return status;
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index a4c95a0..2c1918e 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1490,7 +1490,7 @@ static int nfs_commit_inode(struct inode *inode=
, int how)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0n=
fs_wait_bit_killable,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0T=
ASK_KILLABLE);
> =C2=A0#ifdef CONFIG_NFS_V4_1
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (may_wait && NF=
S_I(inode)->layoutcommit_ctx) {
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (may_wait && do=
_layoutcommit(NFS_I(inode))) {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0error =3D pnfs_layoutcommit_inode(inode, 1);
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0if (error < 0)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0return error;
> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
> index 4d47b48..c9ef43e 100644
> --- a/include/linux/nfs4_pnfs.h
> +++ b/include/linux/nfs4_pnfs.h
> @@ -80,6 +80,12 @@ has_layout(struct nfs_inode *nfsi)
> =C2=A0 =C2=A0 =C2=A0 =C2=A0return nfsi->layout.ld_data !=3D NULL;
> =C2=A0}
>
> +static inline bool
> +do_layoutcommit(struct nfs_inode *nfsi)
> +{
> + =C2=A0 =C2=A0 =C2=A0 return nfsi->lo_cred !=3D NULL;
> +}
> +
> =C2=A0#endif /* CONFIG_NFS_V4_1 */
>
> =C2=A0struct pnfs_layout_segment {
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index 98a8dc0..478b00c 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -203,11 +203,10 @@ struct nfs_inode {
> =C2=A0#define NFS_INO_RW_LAYOUT_FAILED 1 =C2=A0 =C2=A0 /* get rw layo=
ut failed stop trying */
> =C2=A0#define NFS_INO_LAYOUT_ALLOC =C2=A0 =C2=A0 2 =C2=A0 =C2=A0 /* b=
it lock for layout allocation */
> =C2=A0 =C2=A0 =C2=A0 =C2=A0time_t pnfs_layout_suspend;
> + =C2=A0 =C2=A0 =C2=A0 struct rpc_cred =C2=A0 =C2=A0 =C2=A0 =C2=A0 *l=
o_cred; /* layoutcommit credential */
> =C2=A0 =C2=A0 =C2=A0 =C2=A0wait_queue_head_t lo_waitq;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0spinlock_t lo_lock;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct pnfs_layout_type layout;
> - =C2=A0 =C2=A0 =C2=A0 /* use rpc_creds in this open_context to send =
LAYOUTCOMMIT to MDS */
> - =C2=A0 =C2=A0 =C2=A0 struct nfs_open_context *layoutcommit_ctx;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0/* DH: These vars keep track of the maximu=
m write range
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 * so the values can be used for layoutcom=
mit.
> =C2=A0 =C2=A0 =C2=A0 =C2=A0 */
> diff --git a/include/linux/pnfs_xdr.h b/include/linux/pnfs_xdr.h
> index a0bf341..154b04e 100644
> --- a/include/linux/pnfs_xdr.h
> +++ b/include/linux/pnfs_xdr.h
> @@ -86,7 +86,6 @@ struct pnfs_layoutcommit_data {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0bool is_sync;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct rpc_cred *cred;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct nfs_fattr fattr;
> - =C2=A0 =C2=A0 =C2=A0 struct nfs_open_context *ctx;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct pnfs_layoutcommit_arg args;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct pnfs_layoutcommit_res res;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0int status;
> --
> 1.6.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to [email protected]
> More majordomo info at =C2=A0http://vger.kernel.org/majordomo-info.ht=
ml
>



--=20
tao.

2010-05-21 12:47:45

by Andy Adamson

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On Thu, May 20, 2010 at 10:50 PM, Tao Guo <[email protected]> wrote=
:
> On Fri, May 21, 2010 at 12:04 AM, =A0<[email protected]> wrote:
>> From: Andy Adamson <[email protected]>
>>
>> Taking a reference on the nfs_open_context to signal that a layoutco=
mmit is
>> needed is problematic because the last reference to the context trig=
gers a
>> close (nfs_release). =A0But, if the layout holds a reference on the
>> nfs_open_context, then close will not be triggered.
>>
>> Since we only use the rpc credential from the layoutcommit_ctx, repl=
ace the
>> layoutcommit_ctx with the rpc_cred.
>>
>> Hold a reference on the rpc_cred until the layoutcommit rpc returns.=
Note that
>> the rpc layer (rpcauth_bind) also references the rpc_cred.
>>
>> If the layoutdriver fails to setup the layoutcommit, clear the layou=
tcommit
>> properties and put the credential.
>>
>> Signed-off-by: Andy Adamson <[email protected]>
>> ---
>> =A0fs/nfs/inode.c =A0 =A0 =A0 =A0 =A0 =A0| =A0 =A04 ++--
>> =A0fs/nfs/nfs4state.c =A0 =A0 =A0 =A0| =A0 =A02 +-
>> =A0fs/nfs/pnfs.c =A0 =A0 =A0 =A0 =A0 =A0 | =A0 42 +++++++++++++++++-=
------------------------
>> =A0fs/nfs/write.c =A0 =A0 =A0 =A0 =A0 =A0| =A0 =A02 +-
>> =A0include/linux/nfs4_pnfs.h | =A0 =A06 ++++++
>> =A0include/linux/nfs_fs.h =A0 =A0| =A0 =A03 +--
>> =A0include/linux/pnfs_xdr.h =A0| =A0 =A01 -
>> =A07 files changed, 28 insertions(+), 32 deletions(-)
>>
>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>> index 7fd33d9..2b8e8e6 100644
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -1110,7 +1110,7 @@ static int nfs_update_inode(struct inode *inod=
e, struct nfs_fattr *fattr)
>> =A0 =A0 =A0 =A0/*
>> =A0 =A0 =A0 =A0 * file needs layout commit, server attributes may be=
stale
>> =A0 =A0 =A0 =A0 */
>> - =A0 =A0 =A0 if (nfsi->layoutcommit_ctx && nfsi->change_attr >=3D f=
attr->change_attr) {
>> + =A0 =A0 =A0 if (do_layoutcommit(nfsi) && nfsi->change_attr >=3D fa=
ttr->change_attr) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0dprintk("NFS: %s: layoutcommit is nee=
ded for file %s/%ld\n",
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0__func__, inode->i_sb=
->s_id, inode->i_ino);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return 0;
>> @@ -1331,7 +1331,7 @@ static void pnfs_alloc_init_inode(struct nfs_i=
node *nfsi)
>> =A0 =A0 =A0 =A0nfsi->pnfs_layout_state =3D 0;
>> =A0 =A0 =A0 =A0memset(&nfsi->layout.stateid, 0, NFS4_STATEID_SIZE);
>> =A0 =A0 =A0 =A0nfsi->layout.roc_iomode =3D 0;
>> - =A0 =A0 =A0 nfsi->layoutcommit_ctx =3D NULL;
>> + =A0 =A0 =A0 nfsi->lo_cred =3D NULL;
>> =A0 =A0 =A0 =A0nfsi->pnfs_write_begin_pos =3D 0;
>> =A0 =A0 =A0 =A0nfsi->pnfs_write_end_pos =3D 0;
>> =A0#endif /* CONFIG_NFS_V4_1 */
>> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
>> index d145de1..bf03fde 100644
>> --- a/fs/nfs/nfs4state.c
>> +++ b/fs/nfs/nfs4state.c
>> @@ -589,7 +589,7 @@ static void __nfs4_close(struct path *path, stru=
ct nfs4_state *state, fmode_t fm
>> =A0#ifdef CONFIG_NFS_V4_1
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct nfs_inode *nfsi =3D NFS_I(stat=
e->inode);
>>
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (nfsi->layoutcommit_ctx)
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (do_layoutcommit(nfsi))
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0pnfs_layoutcommit_ino=
de(state->inode, wait);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (has_layout(nfsi) && nfsi->layout.=
roc_iomode) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct nfs4_pnfs_layo=
ut_segment range;
>> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>> index 20285bc..cb8ff06 100644
>> --- a/fs/nfs/pnfs.c
>> +++ b/fs/nfs/pnfs.c
>> @@ -145,21 +145,19 @@ find_pnfs(u32 id, struct pnfs_module **module)=
{
>> =A0 =A0 =A0 =A0return 0;
>> =A0}
>>
>> -/* Set context to indicate we require a layoutcommit
>> +/* Set lo_cred to indicate we require a layoutcommit
>> =A0* If we don't even have a layout, we don't need to commit it.
>> =A0*/
>> =A0void
>> =A0pnfs_need_layoutcommit(struct nfs_inode *nfsi, struct nfs_open_co=
ntext *ctx)
>> =A0{
>> - =A0 =A0 =A0 dprintk("%s: has_layout=3D%d layoutcommit_ctx=3D%p ctx=
=3D%p\n", __func__,
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 has_layout(nfsi), nfsi->layoutcommit_c=
tx, ctx);
>> + =A0 =A0 =A0 dprintk("%s: has_layout=3D%d ctx=3D%p\n", __func__, ha=
s_layout(nfsi), ctx);
>> =A0 =A0 =A0 =A0spin_lock(&nfsi->lo_lock);
>> - =A0 =A0 =A0 if (has_layout(nfsi) && !nfsi->layoutcommit_ctx) {
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsi->layoutcommit_ctx =3D get_nfs_ope=
n_context(ctx);
>> + =A0 =A0 =A0 if (has_layout(nfsi) && !do_layoutcommit(nfsi)) {
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsi->lo_cred =3D get_rpccred(ctx->sta=
te->owner->so_cred);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0nfsi->change_attr++;
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0spin_unlock(&nfsi->lo_lock);
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s: Set layoutcommit_ctx=3D%p=
\n", __func__,
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 nfsi->layoutcommit_ctx=
);
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 dprintk("%s: Set layoutcommit\n", __fu=
nc__);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return;
>> =A0 =A0 =A0 =A0}
>> =A0 =A0 =A0 =A0spin_unlock(&nfsi->lo_lock);
>> @@ -755,7 +753,7 @@ _pnfs_return_layout(struct inode *ino, struct nf=
s4_pnfs_layout_segment *range,
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0!pnfs=
_return_layout_barrier(nfsi, &arg));
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>>
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (nfsi->layoutcommit_ctx) {
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (do_layoutcommit(nfsi)) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0status =3D pnfs_layou=
tcommit_inode(ino, wait);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (status) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0dprin=
tk("%s: layoutcommit failed, status=3D%d. "
>> @@ -1910,16 +1908,9 @@ pnfs_layoutcommit_done(struct pnfs_layoutcomm=
it_data *data)
>>
>> =A0 =A0 =A0 =A0dprintk("%s: (status %d)\n", __func__, data->status);
>>
>> - =A0 =A0 =A0 /* TODO: For now, set an error in the open context (ju=
st like
>> - =A0 =A0 =A0 =A0* if a commit failed) We may want to do more, much =
more, like
>> - =A0 =A0 =A0 =A0* replay all writes through the NFSv4
>> - =A0 =A0 =A0 =A0* server, or something.
>> - =A0 =A0 =A0 =A0*/
>> - =A0 =A0 =A0 if (data->status < 0) {
>> + =A0 =A0 =A0 if (data->status < 0)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0printk(KERN_ERR "%s, Layoutcommit Fai=
led! =3D %d\n",
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 __func__, data->status);
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 data->ctx->error =3D data->status;
>> - =A0 =A0 =A0 }
>>
>> =A0 =A0 =A0 =A0/* TODO: Maybe we should avoid this by allowing the l=
ayout driver
>> =A0 =A0 =A0 =A0 * to directly xdr its layout on the wire.
>> @@ -1929,9 +1920,6 @@ pnfs_layoutcommit_done(struct pnfs_layoutcommi=
t_data *data)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0&nfsi->layout,
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0&data->args,
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0data->status);
>> -
>> - =A0 =A0 =A0 /* release the open_context acquired in pnfs_writeback=
_done */
>> - =A0 =A0 =A0 put_nfs_open_context(data->ctx);
>> =A0}
>>
>> =A0/*
>> @@ -1995,30 +1983,34 @@ pnfs_layoutcommit_inode(struct inode *inode,=
int sync)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return -ENOMEM;
>>
>> =A0 =A0 =A0 =A0spin_lock(&nfsi->lo_lock);
>> - =A0 =A0 =A0 if (!nfsi->layoutcommit_ctx)
>> + =A0 =A0 =A0 if (!do_layoutcommit(nfsi))
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0goto out_unlock;
>>
>> =A0 =A0 =A0 =A0data->args.inode =3D inode;
>> - =A0 =A0 =A0 data->cred =A0=3D nfsi->layoutcommit_ctx->cred;
>> - =A0 =A0 =A0 data->ctx =3D nfsi->layoutcommit_ctx;
>> + =A0 =A0 =A0 data->cred =3D nfsi->lo_cred;
>>
>> =A0 =A0 =A0 =A0/* Set up layout commit args*/
>> =A0 =A0 =A0 =A0status =3D pnfs_layoutcommit_setup(data, sync);
>> - =A0 =A0 =A0 if (status)
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out_unlock;
>>
>> =A0 =A0 =A0 =A0/* Clear layoutcommit properties in the inode so
>> =A0 =A0 =A0 =A0 * new lc info can be generated
>> =A0 =A0 =A0 =A0 */
>> =A0 =A0 =A0 =A0nfsi->pnfs_write_begin_pos =3D 0;
>> =A0 =A0 =A0 =A0nfsi->pnfs_write_end_pos =3D 0;
>> - =A0 =A0 =A0 nfsi->layoutcommit_ctx =3D NULL;
>> + =A0 =A0 =A0 nfsi->lo_cred =3D NULL;
>> +
>> + =A0 =A0 =A0 if (status) {
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 /* The layout driver failed to setup t=
he layoutcommit */
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 put_rpccred(data->cred);
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 goto out_unlock;
>> + =A0 =A0 =A0 }
>>
>> =A0 =A0 =A0 =A0/* release lock on pnfs layoutcommit attrs */
>> =A0 =A0 =A0 =A0spin_unlock(&nfsi->lo_lock);
>>
>> =A0 =A0 =A0 =A0data->is_sync =3D sync;
>> =A0 =A0 =A0 =A0status =3D pnfs4_proc_layoutcommit(data);
>> + =A0 =A0 =A0 put_rpccred(data->cred);
> Is this OK to put_rpccred here if sync =3D=3D 0? why not move it into
> pnfs_layoutcommit_done just as the code did?

As I said in the patch comments, the RPC layer takes a reference on the=
cred,

-->Andy

> --tao
>
>> =A0out:
>> =A0 =A0 =A0 =A0dprintk("%s end (err:%d)\n", __func__, status);
>> =A0 =A0 =A0 =A0return status;
>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>> index a4c95a0..2c1918e 100644
>> --- a/fs/nfs/write.c
>> +++ b/fs/nfs/write.c
>> @@ -1490,7 +1490,7 @@ static int nfs_commit_inode(struct inode *inod=
e, int how)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0nfs_wait_bit_killable,
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0TASK_KILLABLE);
>> =A0#ifdef CONFIG_NFS_V4_1
>> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (may_wait && NFS_I(inode)->layoutco=
mmit_ctx) {
>> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (may_wait && do_layoutcommit(NFS_I(=
inode))) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0error =3D pnfs_layout=
commit_inode(inode, 1);
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (error < 0)
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0retur=
n error;
>> diff --git a/include/linux/nfs4_pnfs.h b/include/linux/nfs4_pnfs.h
>> index 4d47b48..c9ef43e 100644
>> --- a/include/linux/nfs4_pnfs.h
>> +++ b/include/linux/nfs4_pnfs.h
>> @@ -80,6 +80,12 @@ has_layout(struct nfs_inode *nfsi)
>> =A0 =A0 =A0 =A0return nfsi->layout.ld_data !=3D NULL;
>> =A0}
>>
>> +static inline bool
>> +do_layoutcommit(struct nfs_inode *nfsi)
>> +{
>> + =A0 =A0 =A0 return nfsi->lo_cred !=3D NULL;
>> +}
>> +
>> =A0#endif /* CONFIG_NFS_V4_1 */
>>
>> =A0struct pnfs_layout_segment {
>> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
>> index 98a8dc0..478b00c 100644
>> --- a/include/linux/nfs_fs.h
>> +++ b/include/linux/nfs_fs.h
>> @@ -203,11 +203,10 @@ struct nfs_inode {
>> =A0#define NFS_INO_RW_LAYOUT_FAILED 1 =A0 =A0 /* get rw layout faile=
d stop trying */
>> =A0#define NFS_INO_LAYOUT_ALLOC =A0 =A0 2 =A0 =A0 /* bit lock for la=
yout allocation */
>> =A0 =A0 =A0 =A0time_t pnfs_layout_suspend;
>> + =A0 =A0 =A0 struct rpc_cred =A0 =A0 =A0 =A0 *lo_cred; /* layoutcom=
mit credential */
>> =A0 =A0 =A0 =A0wait_queue_head_t lo_waitq;
>> =A0 =A0 =A0 =A0spinlock_t lo_lock;
>> =A0 =A0 =A0 =A0struct pnfs_layout_type layout;
>> - =A0 =A0 =A0 /* use rpc_creds in this open_context to send LAYOUTCO=
MMIT to MDS */
>> - =A0 =A0 =A0 struct nfs_open_context *layoutcommit_ctx;
>> =A0 =A0 =A0 =A0/* DH: These vars keep track of the maximum write ran=
ge
>> =A0 =A0 =A0 =A0 * so the values can be used for layoutcommit.
>> =A0 =A0 =A0 =A0 */
>> diff --git a/include/linux/pnfs_xdr.h b/include/linux/pnfs_xdr.h
>> index a0bf341..154b04e 100644
>> --- a/include/linux/pnfs_xdr.h
>> +++ b/include/linux/pnfs_xdr.h
>> @@ -86,7 +86,6 @@ struct pnfs_layoutcommit_data {
>> =A0 =A0 =A0 =A0bool is_sync;
>> =A0 =A0 =A0 =A0struct rpc_cred *cred;
>> =A0 =A0 =A0 =A0struct nfs_fattr fattr;
>> - =A0 =A0 =A0 struct nfs_open_context *ctx;
>> =A0 =A0 =A0 =A0struct pnfs_layoutcommit_arg args;
>> =A0 =A0 =A0 =A0struct pnfs_layoutcommit_res res;
>> =A0 =A0 =A0 =A0int status;
>> --
>> 1.6.6
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs"=
in
>> the body of a message to [email protected]
>> More majordomo info at =A0http://vger.kernel.org/majordomo-info.html
>>
>
>
>
> --
> tao.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" =
in
> the body of a message to [email protected]
> More majordomo info at =A0http://vger.kernel.org/majordomo-info.html
>

2010-05-24 06:37:46

by Tao Guo

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

I see... But what if sync == 1? At that time, data must have already
been freed, so you should not use data->cred anymore.

--
tao.

2010-05-24 16:02:52

by Tao Guo

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On Mon, May 24, 2010 at 9:43 PM, William A. (Andy) Adamson
<[email protected]> wrote:
> On Mon, May 24, 2010 at 2:37 AM, Tao Guo <[email protected]> wrote:
>> I see... But what if sync == 1? At that time, data must have already
>> been freed, so you should not use data->cred anymore.
>
> Agreed.
>
> Since LAYOUTCOMMIT is not called after close, we can get rid of taking
> the reference all together since a reference is taken on OPEN and
> dropped on CLOSE.
>
> -->Andy
I agree. However that will change several functions' interface I guess.

I still have a question: why some procedures need to set rpc_cred
whereas some needn't ? what is the connection between ctx's rpc_cred
and nfs4_state_owner's so_cred? Any guidance will be appreciated...

>>
>> --
>> tao.
>>
>



--
tao.

2010-05-24 17:25:53

by Andy Adamson

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On Mon, May 24, 2010 at 12:02 PM, Tao Guo <[email protected]> wrote:
> On Mon, May 24, 2010 at 9:43 PM, William A. (Andy) Adamson
> <[email protected]> wrote:
>> On Mon, May 24, 2010 at 2:37 AM, Tao Guo <[email protected]> wrote:
>>> I see... But what if sync == 1? At that time, data must have already
>>> been freed, so you should not use data->cred anymore.
>>
>> Agreed.
>>
>> Since LAYOUTCOMMIT is not called after close, we can get rid of taking
>> the reference all together since a reference is taken on OPEN and
>> dropped on CLOSE.

I forgot memory mapped files where writes can occur after the close,
and so layoutcommit can also occur after the close. So, I will keep
the get_rpccred in pnfs_need_layout commit and resend this patch with
the put_rpccred in the layoutcommit done routine.

>>
>> -->Andy
> I agree. However that will change several functions' interface I guess.
>
> I still have a question: why some procedures need to set rpc_cred
> whereas some needn't ? what is the connection between ctx's rpc_cred
> and nfs4_state_owner's so_cred? Any guidance will be appreciated...

Here's my take on an answer...

If an rpc_cred is not specified in the rpc_msg then the fsuid, fsgid,
and group info toe the current_cred is used to lookup an rpc_cred.
RPC's can be sent from several tasks - rpciod vrs the application task
for example. OPEN, CLOSE, LOCK, UNLOCK, READ, WRITE (and memmapped
I/O) should all be sent with the credential that opened the file. So,
in order to ensure this the nfs4_state_owner->so_cred is used for
these compounds even if the task is rpciod with root fsuid/fsgid. The
nfs_open_context rpc_cred is the nfs4_state_owner so_cred. The
nfs_open_context can exist longer than the nfs4_state_owner.

I think LAYOUTCOMMIT also wants to use the credential that did the I/O
- not for the file layout driver which doesn't care, but for the
object and block layout drivers that use the layout as a lock and
where a permissions check (?) is done on LAYOUTCOMMIT.

Currently, LAYOUTGET and LAYOUTRETURN don't set the rpc_cred. This is
because the layout can last past opens and closes. If we ever want to
support EXCHGID4_FLAG_BIND_PRINC_STATEID we will need to remember the
cred and set it for these calls.

-->Andy



>
>>>
>>> --
>>> tao.
>>>
>>
>
>
>
> --
> tao.
>

2010-05-24 18:20:59

by Boaz Harrosh

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred

On 05/24/2010 08:25 PM, William A. (Andy) Adamson wrote:
> On Mon, May 24, 2010 at 12:02 PM, Tao Guo <[email protected]> wrote:
>> On Mon, May 24, 2010 at 9:43 PM, William A. (Andy) Adamson
>> <[email protected]> wrote:
>>> On Mon, May 24, 2010 at 2:37 AM, Tao Guo <[email protected]> wrote:
>>>> I see... But what if sync == 1? At that time, data must have already
>>>> been freed, so you should not use data->cred anymore.
>>>
>>> Agreed.
>>>
>>> Since LAYOUTCOMMIT is not called after close, we can get rid of taking
>>> the reference all together since a reference is taken on OPEN and
>>> dropped on CLOSE.
>
> I forgot memory mapped files where writes can occur after the close,
> and so layoutcommit can also occur after the close. So, I will keep
> the get_rpccred in pnfs_need_layout commit and resend this patch with
> the put_rpccred in the layoutcommit done routine.
>

Andy I get hangs on:
data->is_sync = sync;
status = pnfs4_proc_layoutcommit(data);
+ put_rpccred(data->cred);

with sync==1 like Tao Guo said

could you send the updated patch?

Boaz

>>>
>>> -->Andy
>> I agree. However that will change several functions' interface I guess.
>>
>> I still have a question: why some procedures need to set rpc_cred
>> whereas some needn't ? what is the connection between ctx's rpc_cred
>> and nfs4_state_owner's so_cred? Any guidance will be appreciated...
>
> Here's my take on an answer...
>
> If an rpc_cred is not specified in the rpc_msg then the fsuid, fsgid,
> and group info toe the current_cred is used to lookup an rpc_cred.
> RPC's can be sent from several tasks - rpciod vrs the application task
> for example. OPEN, CLOSE, LOCK, UNLOCK, READ, WRITE (and memmapped
> I/O) should all be sent with the credential that opened the file. So,
> in order to ensure this the nfs4_state_owner->so_cred is used for
> these compounds even if the task is rpciod with root fsuid/fsgid. The
> nfs_open_context rpc_cred is the nfs4_state_owner so_cred. The
> nfs_open_context can exist longer than the nfs4_state_owner.
>
> I think LAYOUTCOMMIT also wants to use the credential that did the I/O
> - not for the file layout driver which doesn't care, but for the
> object and block layout drivers that use the layout as a lock and
> where a permissions check (?) is done on LAYOUTCOMMIT.
>
> Currently, LAYOUTGET and LAYOUTRETURN don't set the rpc_cred. This is
> because the layout can last past opens and closes. If we ever want to
> support EXCHGID4_FLAG_BIND_PRINC_STATEID we will need to remember the
> cred and set it for these calls.
>
> -->Andy
>
>
>
>>
>>>>
>>>> --
>>>> tao.
>>>>
>>>
>>
>>
>>
>> --
>> tao.
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


2010-05-24 18:22:55

by Andy Adamson

[permalink] [raw]
Subject: Re: [PATCH 1/1] SQUASHME pnfs-submit: replace layoutcommit_ctx with rpc_cred


On May 24, 2010, at 2:20 PM, Boaz Harrosh wrote:

> On 05/24/2010 08:25 PM, William A. (Andy) Adamson wrote:
>> On Mon, May 24, 2010 at 12:02 PM, Tao Guo <[email protected]>
>> wrote:
>>> On Mon, May 24, 2010 at 9:43 PM, William A. (Andy) Adamson
>>> <[email protected]> wrote:
>>>> On Mon, May 24, 2010 at 2:37 AM, Tao Guo <[email protected]>
>>>> wrote:
>>>>> I see... But what if sync == 1? At that time, data must have
>>>>> already
>>>>> been freed, so you should not use data->cred anymore.
>>>>
>>>> Agreed.
>>>>
>>>> Since LAYOUTCOMMIT is not called after close, we can get rid of
>>>> taking
>>>> the reference all together since a reference is taken on OPEN and
>>>> dropped on CLOSE.
>>
>> I forgot memory mapped files where writes can occur after the close,
>> and so layoutcommit can also occur after the close. So, I will keep
>> the get_rpccred in pnfs_need_layout commit and resend this patch with
>> the put_rpccred in the layoutcommit done routine.
>>
>
> Andy I get hangs on:
> data->is_sync = sync;
> status = pnfs4_proc_layoutcommit(data);
> + put_rpccred(data->cred);
>
> with sync==1 like Tao Guo said
>
> could you send the updated patch?

On it's way
>
> Boaz
>
>>>>
>>>> -->Andy
>>> I agree. However that will change several functions' interface I
>>> guess.
>>>
>>> I still have a question: why some procedures need to set rpc_cred
>>> whereas some needn't ? what is the connection between ctx's rpc_cred
>>> and nfs4_state_owner's so_cred? Any guidance will be appreciated...
>>
>> Here's my take on an answer...
>>
>> If an rpc_cred is not specified in the rpc_msg then the fsuid, fsgid,
>> and group info toe the current_cred is used to lookup an rpc_cred.
>> RPC's can be sent from several tasks - rpciod vrs the application
>> task
>> for example. OPEN, CLOSE, LOCK, UNLOCK, READ, WRITE (and memmapped
>> I/O) should all be sent with the credential that opened the file. So,
>> in order to ensure this the nfs4_state_owner->so_cred is used for
>> these compounds even if the task is rpciod with root fsuid/fsgid. The
>> nfs_open_context rpc_cred is the nfs4_state_owner so_cred. The
>> nfs_open_context can exist longer than the nfs4_state_owner.
>>
>> I think LAYOUTCOMMIT also wants to use the credential that did the
>> I/O
>> - not for the file layout driver which doesn't care, but for the
>> object and block layout drivers that use the layout as a lock and
>> where a permissions check (?) is done on LAYOUTCOMMIT.
>>
>> Currently, LAYOUTGET and LAYOUTRETURN don't set the rpc_cred. This is
>> because the layout can last past opens and closes. If we ever want to
>> support EXCHGID4_FLAG_BIND_PRINC_STATEID we will need to remember the
>> cred and set it for these calls.
>>
>> -->Andy
>>
>>
>>
>>>
>>>>>
>>>>> --
>>>>> tao.
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> tao.
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-
>> nfs" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
> in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html