2017-05-02 16:41:08

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 0/3] Fix up a couple of issues around layout handling

The main issue to be dealt with is a deadlock that can occur due to
an ABBA-type of situation between layoutget and layoutreturn.

(Resend - apologies for the confused first send)

Trond Myklebust (3):
pNFS: Don't clear the layout return info if there are segments to
return
pNFS: Fix a deadlock when coalescing writes and returning the layout
pNFS: Fix a typo in pnfs_generic_alloc_ds_commits

fs/nfs/pnfs.c | 10 +++++++---
fs/nfs/pnfs_nfs.c | 2 +-
2 files changed, 8 insertions(+), 4 deletions(-)

--
2.9.3



2017-05-02 16:41:09

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 2/3] pNFS: Fix a deadlock when coalescing writes and returning the layout

Consider the following deadlock:

Process P1 Process P2 Process P3
========== ========== ==========
lock_page(page)

lseg = pnfs_update_layout(inode)

lo = NFS_I(inode)->layout
pnfs_error_mark_layout_for_return(lo)

lock_page(page)

lseg = pnfs_update_layout(inode)

In this scenario,
- P1 has declared the layout to be in error, but P2 holds a reference to
a layout segment on that inode, so the layoutreturn is deferred.
- P2 is waiting for a page lock held by P3.
- P3 is asking for a new layout segment, but is blocked waiting
for the layoutreturn.

The fix is to ensure that pnfs_error_mark_layout_for_return() does
not set the NFS_LAYOUT_RETURN flag, which blocks P3. Instead, we allow
the latter to call LAYOUTGET so that it can make progress and unblock
P2.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index cea1e838efae..adc6ec28d4b5 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2063,8 +2063,6 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
return;
}
pnfs_set_plh_return_info(lo, range.iomode, 0);
- /* Block LAYOUTGET */
- set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags);
/*
* mark all matching lsegs so that we are sure to have no live
* segments at hand when sending layoutreturn. See pnfs_put_lseg()
--
2.9.3


2017-05-02 16:41:09

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 1/3] pNFS: Don't clear the layout return info if there are segments to return

In pnfs_clear_layoutreturn_info, ensure that we don't clear the layout
return info if there are new segments queued for return due to, for
instance, a race between a LAYOUTRETURN and a failed I/O attempt.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 140ecd7d350f..cea1e838efae 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -322,9 +322,15 @@ pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
static void
pnfs_clear_layoutreturn_info(struct pnfs_layout_hdr *lo)
{
+ struct pnfs_layout_segment *lseg;
lo->plh_return_iomode = 0;
lo->plh_return_seq = 0;
clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
+ list_for_each_entry(lseg, &lo->plh_segs, pls_list) {
+ if (!test_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags))
+ continue;
+ pnfs_set_plh_return_info(lo, lseg->pls_range.iomode, 0);
+ }
}

static void pnfs_clear_layoutreturn_waitbit(struct pnfs_layout_hdr *lo)
@@ -367,9 +373,9 @@ pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
struct pnfs_layout_segment *lseg, *next;

set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
- pnfs_clear_layoutreturn_info(lo);
list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list)
pnfs_clear_lseg_state(lseg, lseg_list);
+ pnfs_clear_layoutreturn_info(lo);
pnfs_free_returned_lsegs(lo, lseg_list, &range, 0);
if (test_bit(NFS_LAYOUT_RETURN, &lo->plh_flags) &&
!test_and_set_bit(NFS_LAYOUT_RETURN_LOCK, &lo->plh_flags))
--
2.9.3


2017-05-02 16:41:11

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 3/3] pNFS: Fix a typo in pnfs_generic_alloc_ds_commits

If the layout segment is invalid, we want to just resend the remaining
writes.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs_nfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c
index 7697ac0ff81a..ae600ab1a646 100644
--- a/fs/nfs/pnfs_nfs.c
+++ b/fs/nfs/pnfs_nfs.c
@@ -223,7 +223,7 @@ pnfs_generic_alloc_ds_commits(struct nfs_commit_info *cinfo,
*/
if (!pnfs_is_valid_lseg(bucket->clseg) &&
!test_bit(NFS_LSEG_LAYOUTRETURN, &bucket->clseg->pls_flags))
- continue;
+ break;
data = nfs_commitdata_alloc(false);
if (!data)
break;
--
2.9.3