From: Andy Adamson Subject: Re: [PATCH 3/5] SQUASHME: pnfs-submit: move pnfs_layout_suspend back to nfs_inode Date: Wed, 30 Jun 2010 11:13:24 -0400 Message-ID: <6434639C-16CE-43F0-BFCD-1AC4152690B9@netapp.com> References: <1277829737-5465-1-git-send-email-andros@netapp.com> <1277829737-5465-2-git-send-email-andros@netapp.com> <1277829737-5465-3-git-send-email-andros@netapp.com> <1277829737-5465-4-git-send-email-andros@netapp.com> <4C2B5994.7010801@panasas.com> Mime-Version: 1.0 (Apple Message framework v936) Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Cc: benny@panasas.com, linux-nfs@vger.kernel.org, Fred Isaman To: Boaz Harrosh Return-path: Received: from mx2.netapp.com ([216.240.18.37]:56697 "EHLO mx2.netapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754521Ab0F3PN0 (ORCPT ); Wed, 30 Jun 2010 11:13:26 -0400 In-Reply-To: <4C2B5994.7010801@panasas.com> Sender: linux-nfs-owner@vger.kernel.org List-ID: On Jun 30, 2010, at 10:49 AM, Boaz Harrosh wrote: > On 06/29/2010 07:42 PM, andros@netapp.com wrote: >> From: Fred Isaman >> >> Prepare to embed struct pnfs_layout_tupe into layout driver private >> structs >> and to make layout a pointer. Since a temp error on first LAYOUTGET >> erases the layout, the suspend timer needs to be moved. >> > > OK Fred, Andy. (I'm also referring to the 1/5 and 2/5 issues here) > > There is a total mess up. LAYOUTGET has nothing to do with layout > (.eg struct pnfs_layout_type) and the layout must not be deallocated > if there is any kind of error and/or if IO is actually to be done with > MDS or not. LAYOUTGET corresponds to layout_seg held on a list in > layout. > The list might be empty and so it's flags and state. pnfs_layout_type is the layout cache head structure which has no business in nfsi unless there are layout segments populating it. > > Half of the problem was before and this here made it worse. The > life time of nfsi->layout should be the same as nfsi itself just > as it was before. No, the nfs_inode exists with no regard to pnfs_layout_type (horrible name). > > Once the life_time of nfsi && nfsi->layout are unified then all your > problems go away because you are not checking for existence of nfsi > anywhere right? that's because there is no such problem. No, the problems don't go away simply because you statically allocate a cache head struct. We have problems with races and reference counting that exist independently of how a structure is allocated. We don't need to check for the existence of nfsi, The inode is guaranteed to exist until nfs_destroy_inode is called by the VFS. > > Look at it this way. If a layout_driver is in the game it gets to > allocate it's own area in NFS_I. part of that area is for generic > pnfs.c use, i.e struct pnfs_layout_type, the rest is for private use. > once existing it is part of the NFS_I life up till death. No. It should only exists when needed - just like nfsi->nfs_acl and nfsi->delegation etc. > > If it's not so it should be fixed, not some members leaking out > to generic structures. > > Boaz > >> Signed-off-by: Fred Isaman >> --- >> fs/nfs/inode.c | 1 + >> fs/nfs/pnfs.c | 8 ++++---- >> include/linux/nfs_fs.h | 2 +- >> 3 files changed, 6 insertions(+), 5 deletions(-) >> >> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c >> index 66d7be2..cb12d33 100644 >> --- a/fs/nfs/inode.c >> +++ b/fs/nfs/inode.c >> @@ -1420,6 +1420,7 @@ static void pnfs_init_once(struct nfs_inode >> *nfsi) >> #ifdef CONFIG_NFS_V4_1 >> init_waitqueue_head(&nfsi->lo_waitq); >> nfsi->pnfs_layout_state = 0; >> + nfsi->pnfs_layout_suspend = 0; >> seqlock_init(&nfsi->layout.seqlock); >> INIT_LIST_HEAD(&nfsi->layout.lo_layouts); >> INIT_LIST_HEAD(&nfsi->layout.segs); >> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c >> index d05cb03..7f6ace2 100644 >> --- a/fs/nfs/pnfs.c >> +++ b/fs/nfs/pnfs.c >> @@ -1105,12 +1105,12 @@ _pnfs_update_layout(struct inode *ino, >> >> /* if get layout already failed once goto out */ >> if (test_bit(lo_fail_bit(iomode), &nfsi->pnfs_layout_state)) { >> - if (unlikely(nfsi->layout.pnfs_layout_suspend && >> - get_seconds() >= nfsi->layout.pnfs_layout_suspend)) { >> + if (unlikely(nfsi->pnfs_layout_suspend && >> + get_seconds() >= nfsi->pnfs_layout_suspend)) { >> dprintk("%s: layout_get resumed\n", __func__); >> clear_bit(lo_fail_bit(iomode), >> &nfsi->pnfs_layout_state); >> - nfsi->layout.pnfs_layout_suspend = 0; >> + nfsi->pnfs_layout_suspend = 0; >> } else >> goto out_put; >> } >> @@ -1224,7 +1224,7 @@ pnfs_get_layout_done(struct >> nfs4_pnfs_layoutget *lgp, int rpc_status) >> } >> >> /* remember that get layout failed and suspend trying */ >> - nfsi->layout.pnfs_layout_suspend = suspend; >> + nfsi->pnfs_layout_suspend = suspend; >> set_bit(lo_fail_bit(lgp->args.lseg.iomode), >> &nfsi->pnfs_layout_state); >> dprintk("%s: layout_get suspended until %ld\n", >> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h >> index e216e24..cebc958 100644 >> --- a/include/linux/nfs_fs.h >> +++ b/include/linux/nfs_fs.h >> @@ -105,7 +105,6 @@ struct pnfs_layout_type { >> seqlock_t seqlock; /* Protects the stateid */ >> nfs4_stateid stateid; >> void *ld_data; /* layout driver private data */ >> - time_t pnfs_layout_suspend; >> struct rpc_cred *lo_cred; /* layoutcommit credential */ >> /* DH: These vars keep track of the maximum write range >> * so the values can be used for layoutcommit. >> @@ -208,6 +207,7 @@ struct nfs_inode { >> #define NFS_INO_RW_LAYOUT_FAILED 1 /* rw LAYOUTGET failed stop >> trying */ >> #define NFS_INO_LAYOUT_ALLOC 2 /* bit lock for layout >> allocation */ >> #define NFS_INO_LAYOUTCOMMIT 3 /* LAYOUTCOMMIT needed */ >> + time_t pnfs_layout_suspend; >> #endif /* CONFIG_NFS_V4_1 */ >> #endif /* CONFIG_NFS_V4*/ >> #ifdef CONFIG_NFS_FSCACHE >