by Myklebust, Trond

[permalink] [raw]

Subject: Re: [PATCH 06/15] pnfs: change how lsegs are removed from layout list

On Wed, 2010-12-22 at 17:08 -0500, Fred Isaman wrote:
> On Dec 22, 2010, at 4:43 PM, Trond Myklebust wrote:
>
> > On Tue, 2010-12-21 at 23:00 -0500, Fred Isaman wrote:
> >> This is to prepare the way for sensible io draining. Instead of just
> >> removing the lseg from the list, we instead clear the VALID flag
> >> (preventing new io from grabbing references to the lseg) and remove
> >> the reference holding it in the list. Thus the lseg will be removed
> >> once any io in progress completes and any references still held are
> >> dropped.
> >>
> >> Signed-off-by: Fred Isaman <[email protected]>
> >> ---
> >> fs/nfs/inode.c | 2 +-
> >> fs/nfs/pnfs.c | 121 ++++++++++++++++++++++++++++++++++++-------------------
> >> fs/nfs/pnfs.h | 8 +++-
> >> 3 files changed, 87 insertions(+), 44 deletions(-)
> >>
> >> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> >> index e67e31c..43a69da 100644
> >> --- a/fs/nfs/inode.c
> >> +++ b/fs/nfs/inode.c
> >
> >> -/* Called without i_lock held, as the free_lseg call may sleep */
> >> -static void
> >> -destroy_lseg(struct kref *kref)
> >> +static void free_lseg(struct pnfs_layout_segment *lseg)
> >> {
> >> - struct pnfs_layout_segment *lseg =
> >> - container_of(kref, struct pnfs_layout_segment, pls_refcount);
> >> struct inode *ino = lseg->pls_layout->plh_inode;
> >>
> >> - dprintk("--> %s\n", __func__);
> >> + BUG_ON(atomic_read(&lseg->pls_refcount) != 0);
> >> NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg);
> >> /* Matched by get_layout_hdr in pnfs_insert_layout */
> >> put_layout_hdr(ino);
> >> }
> >
> > <snip>
> >
> >> static void
> >> -pnfs_free_lseg_list(struct list_head *tmp_list)
> >> +pnfs_free_lseg_list(struct list_head *free_me)
> >> {
> >> - struct pnfs_layout_segment *lseg;
> >> + struct pnfs_layout_segment *lseg, *tmp;
> >>
> >> - while (!list_empty(tmp_list)) {
> >> - lseg = list_entry(tmp_list->next, struct pnfs_layout_segment,
> >> - pls_list);
> >> - dprintk("%s calling put_lseg on %p\n", __func__, lseg);
> >> - list_del(&lseg->pls_list);
> >> - put_lseg(lseg);
> >> - }
> >> + list_for_each_entry_safe(lseg, tmp, free_me, pls_list)
> >> + free_lseg(lseg);
> >> + INIT_LIST_HEAD(free_me);
> >> }
> >
> > The above looks very dubious to me. Why is this change needed, and what
> > guarantees do we have that free_lseg() will do the right thing w.r.t.
> > removing stuff from the list?
>
> Note that this is called with a list of lsegs which already have refcount==0 and have been removed from generic layer lists such that no new reference is coming. All that is expected of free_lseg is that the layout driver gets a chance to free any resources it has attached to the lseg, and that we release the reference the lseg holds on the layout header itself.
>
> The primary reason for this is that the file layouts ld->free_lseg call can call nfs_put_client, which can sleep. So this allows the ld->free_lseg to be postponed until we can drop locks, similar to the dispose_list function in fs/inode.c

Sure, but why do these things have to be on the free_me list when you
call ld->free_lseg(lseg)? Why do you suddenly need the INIT_LIST_HEAD()?

As I said, that all looks very dubious...

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com