2016-05-15 01:06:53

by Jeff Layton

Subject: [PATCH v3 00/13] pnfs: layout pipelining and related fixes

v3:
- move more of the LAYOUTGET error handling out of the state machine
- better tracepoints
- cleanup and fixes in pnfs_layout_process
- patches to prevent client from flipping to in-band I/O
- bugfixes

v2:
- rework of LAYOUTGET retry handling.

This is v3 of the set. In addition to the patches that provide layout
pipelining, this also has a few patches from Tom and Trond to stop
the client from flipping to in-band I/O quite as aggressively.

Testing this set with flexfiles has gone well. I'd definitely appreciate
testing in other pnfs environments, though.

Original cover letter from the RFC patchset follows:

--------------------------[snip]----------------------------

At Primary Data, one of the things we're most interested in is data
mobility. IOW, we want to be able to change the layout for an inode
seamlessly, with little interruption to I/O patterns.

The problem we have now is that CB_LAYOUTRECALLs interrupt I/O. When one
comes in, most pNFS servers refuse to hand out new layouts until the
recalled ones have been returned (or the client indicates that it no
longer knows about them). It doesn't have to be this way though. RFC5661
allows for concurrent LAYOUTGET and LAYOUTRETURN calls.

Furthermore, servers are expected to deal with old stateids in
LAYOUTRETURN. From RFC5661, section 18.44.3:

If the client returns the layout in response to a CB_LAYOUTRECALL
where the lor_recalltype field of the clora_recall field was
LAYOUTRECALL4_FILE, the client should use the lor_stateid value from
CB_LAYOUTRECALL as the value for lrf_stateid. Otherwise, it should
use logr_stateid (from a previous LAYOUTGET result) or lorr_stateid
(from a previous LAYOUTRETURN result). This is done to indicate the
point in time (in terms of layout stateid transitions) when the
recall was sent.

The way I'm interpreting this is that we can treat a LAYOUTRETURN with
an old stateid as returning all layouts that matched the given iomode
at the time that seqid was current.

With that, we can allow a LAYOUTGET on the same fh to proceed even when
there are still recalled layouts outstanding. This should allow the
client to pivot to a new layout while it's still draining I/Os
that are pinning the ones to be returned.

This patchset is a first draft of the client side piece that allows
this. Basically whenever we get a new layout segment, we'll tag it with
the seqid that was in the LAYOUTGET stateid that grants it.

When a CB_LAYOUTRECALL comes in, we tag the return seqid in the layout
header with the one that was in the request. When we do a LAYOUTRETURN
in response to a CB_LAYOUTRECALL, we craft the seqid such that we're
only returning the layouts that were recalled. Nothing that has been
granted since then will be returned.
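To make the seqid handling concrete, here's a rough sketch. It's not the
patch code itself, though it mirrors the pnfs_seqid_is_newer() helper and
the pls_seq field added later in the series; the lseg_covered_by_return()
wrapper is purely illustrative:

/* Wraparound-safe comparison of two layout stateid seqids. */
static bool pnfs_seqid_is_newer(u32 s1, u32 s2)
{
	return (s32)(s1 - s2) > 0;
}

/*
 * A segment is covered by a LAYOUTRETURN carrying "return_seq" only if
 * it was granted at or before that seqid; anything granted afterwards
 * stays put.
 */
static bool lseg_covered_by_return(const struct pnfs_layout_segment *lseg,
				   u32 return_seq)
{
	return !pnfs_seqid_is_newer(lseg->pls_seq, return_seq);
}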

I think I've done this in a way that the existing behavior is preserved
in the case where the server enforces the serialization of these
operations, but please do have a look and let me know if you see any
potential problems here. Testing this is still a WIP...

Jeff Layton (10):
pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit
set
pnfs: record sequence in pnfs_layout_segment when it's created
pnfs: keep track of the return sequence number in pnfs_layout_hdr
pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args
flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED
flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_ds
pnfs: fix bad error handling in send_layoutget
pnfs: lift retry logic from send_layoutget to pnfs_update_layout
pnfs: rework LAYOUTGET retry handling
pnfs: make pnfs_layout_process more robust

Tom Haynes (2):
pNFS/flexfiles: When checking for available DSes, conditionally check
for MDS io
pNFS/flexfiles: When initing reads or writes, we might have to retry
connecting to DSes

Trond Myklebust (1):
pNFS/flexfile: Fix erroneous fall back to read/write through the MDS

fs/nfs/callback_proc.c | 3 +-
fs/nfs/flexfilelayout/flexfilelayout.c | 63 ++++---
fs/nfs/flexfilelayout/flexfilelayout.h | 1 +
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 32 +++-
fs/nfs/nfs42proc.c | 2 +-
fs/nfs/nfs4proc.c | 116 +++++-------
fs/nfs/nfs4trace.h | 10 +-
fs/nfs/pnfs.c | 298 +++++++++++++++++-------------
fs/nfs/pnfs.h | 14 +-
include/linux/nfs4.h | 2 +
include/linux/nfs_xdr.h | 2 -
11 files changed, 293 insertions(+), 250 deletions(-)

--
2.5.5



2016-05-15 01:06:54

by Jeff Layton

Subject: [PATCH v3 01/13] pNFS/flexfile: Fix erroneous fall back to read/write through the MDS

From: Trond Myklebust <[email protected]>

This patch fixes a problem whereby the pNFS client falls back to doing
reads and writes through the metadata server even when the layout flag
FF_FLAGS_NO_IO_THRU_MDS is set.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 23 ++++++-----------------
1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 60d690dbc947..51f6660a2247 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1294,7 +1294,7 @@ ff_layout_set_layoutcommit(struct nfs_pgio_header *hdr)
}

static bool
-ff_layout_reset_to_mds(struct pnfs_layout_segment *lseg, int idx)
+ff_layout_device_unavailable(struct pnfs_layout_segment *lseg, int idx)
{
/* No mirroring for now */
struct nfs4_deviceid_node *node = FF_LAYOUT_DEVID_NODE(lseg, idx);
@@ -1331,16 +1331,10 @@ static int ff_layout_read_prepare_common(struct rpc_task *task,
rpc_exit(task, -EIO);
return -EIO;
}
- if (ff_layout_reset_to_mds(hdr->lseg, hdr->pgio_mirror_idx)) {
- dprintk("%s task %u reset io to MDS\n", __func__, task->tk_pid);
- if (ff_layout_has_available_ds(hdr->lseg))
- pnfs_read_resend_pnfs(hdr);
- else
- ff_layout_reset_read(hdr);
- rpc_exit(task, 0);
+ if (ff_layout_device_unavailable(hdr->lseg, hdr->pgio_mirror_idx)) {
+ rpc_exit(task, -EHOSTDOWN);
return -EAGAIN;
}
- hdr->pgio_done_cb = ff_layout_read_done_cb;

ff_layout_read_record_layoutstats_start(task, hdr);
return 0;
@@ -1530,14 +1524,8 @@ static int ff_layout_write_prepare_common(struct rpc_task *task,
return -EIO;
}

- if (ff_layout_reset_to_mds(hdr->lseg, hdr->pgio_mirror_idx)) {
- bool retry_pnfs;
-
- retry_pnfs = ff_layout_has_available_ds(hdr->lseg);
- dprintk("%s task %u reset io to %s\n", __func__,
- task->tk_pid, retry_pnfs ? "pNFS" : "MDS");
- ff_layout_reset_write(hdr, retry_pnfs);
- rpc_exit(task, 0);
+ if (ff_layout_device_unavailable(hdr->lseg, hdr->pgio_mirror_idx)) {
+ rpc_exit(task, -EHOSTDOWN);
return -EAGAIN;
}

@@ -1754,6 +1742,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr)
dprintk("%s USE DS: %s cl_count %d vers %d\n", __func__,
ds->ds_remotestr, atomic_read(&ds->ds_clp->cl_count), vers);

+ hdr->pgio_done_cb = ff_layout_read_done_cb;
atomic_inc(&ds->ds_clp->cl_count);
hdr->ds_clp = ds->ds_clp;
fh = nfs4_ff_layout_select_ds_fh(lseg, idx);
--
2.5.5


2016-05-15 01:06:55

by Jeff Layton

Subject: [PATCH v3 02/13] pNFS/flexfiles: When checking for available DSes, conditionally check for MDS io

From: Tom Haynes <[email protected]>

Whenever we check whether we have the needed number of DSes for the
action, we may also have to check whether IO is allowed to go to
the MDS.

[jlayton: fix merge conflict due to lack of localio patches here]

Signed-off-by: Tom Haynes <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 5 ++---
fs/nfs/flexfilelayout/flexfilelayout.h | 1 +
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 6 ++++++
3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 51f6660a2247..f538ca6bbe81 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1101,8 +1101,7 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task,
rpc_wake_up(&tbl->slot_tbl_waitq);
/* fall through */
default:
- if (ff_layout_no_fallback_to_mds(lseg) ||
- ff_layout_has_available_ds(lseg))
+ if (ff_layout_avoid_mds_available_ds(lseg))
return -NFS4ERR_RESET_TO_PNFS;
reset:
dprintk("%s Retry through MDS. Error %d\n", __func__,
@@ -1764,7 +1763,7 @@ ff_layout_read_pagelist(struct nfs_pgio_header *hdr)
return PNFS_ATTEMPTED;

out_failed:
- if (ff_layout_has_available_ds(lseg))
+ if (ff_layout_avoid_mds_available_ds(lseg))
return PNFS_TRY_AGAIN;
return PNFS_NOT_ATTEMPTED;
}
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.h b/fs/nfs/flexfilelayout/flexfilelayout.h
index 1318c77aeb35..b54058122647 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.h
+++ b/fs/nfs/flexfilelayout/flexfilelayout.h
@@ -191,4 +191,5 @@ nfs4_ff_find_or_create_ds_client(struct pnfs_layout_segment *lseg,
struct rpc_cred *ff_layout_get_ds_cred(struct pnfs_layout_segment *lseg,
u32 ds_idx, struct rpc_cred *mdscred);
bool ff_layout_has_available_ds(struct pnfs_layout_segment *lseg);
+bool ff_layout_avoid_mds_available_ds(struct pnfs_layout_segment *lseg);
#endif /* FS_NFS_NFS4FLEXFILELAYOUT_H */
diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index 56296f3df19c..c52ca75081a8 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -540,6 +540,12 @@ bool ff_layout_has_available_ds(struct pnfs_layout_segment *lseg)
return ff_rw_layout_has_available_ds(lseg);
}

+bool ff_layout_avoid_mds_available_ds(struct pnfs_layout_segment *lseg)
+{
+ return ff_layout_no_fallback_to_mds(lseg) ||
+ ff_layout_has_available_ds(lseg);
+}
+
module_param(dataserver_retrans, uint, 0644);
MODULE_PARM_DESC(dataserver_retrans, "The number of times the NFSv4.1 client "
"retries a request before it attempts further "
--
2.5.5


2016-05-15 01:06:56

by Jeff Layton

Subject: [PATCH v3 03/13] pNFS/flexfiles: When initing reads or writes, we might have to retry connecting to DSes

From: Tom Haynes <[email protected]>

If we are initializing reads or writes and cannot connect to a DS, then
check whether IO is allowed through the MDS. If it is allowed,
reset to the MDS. Otherwise, fail the layout segment and force a retry
with a new layout segment.

Signed-off-by: Tom Haynes <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 29 +++++++++++++++++++++++++----
1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index f538ca6bbe81..f58cd2a14d5e 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -847,8 +847,13 @@ ff_layout_pg_init_read(struct nfs_pageio_descriptor *pgio,
goto out_mds;

ds = ff_layout_choose_best_ds_for_read(pgio->pg_lseg, 0, &ds_idx);
- if (!ds)
- goto out_mds;
+ if (!ds) {
+ if (ff_layout_no_fallback_to_mds(pgio->pg_lseg))
+ goto out_pnfs;
+ else
+ goto out_mds;
+ }
+
mirror = FF_LAYOUT_COMP(pgio->pg_lseg, ds_idx);

pgio->pg_mirror_idx = ds_idx;
@@ -862,6 +867,12 @@ out_mds:
pnfs_put_lseg(pgio->pg_lseg);
pgio->pg_lseg = NULL;
nfs_pageio_reset_read_mds(pgio);
+ return;
+
+out_pnfs:
+ pnfs_set_lo_fail(pgio->pg_lseg);
+ pnfs_put_lseg(pgio->pg_lseg);
+ pgio->pg_lseg = NULL;
}

static void
@@ -904,8 +915,12 @@ ff_layout_pg_init_write(struct nfs_pageio_descriptor *pgio,

for (i = 0; i < pgio->pg_mirror_count; i++) {
ds = nfs4_ff_layout_prepare_ds(pgio->pg_lseg, i, true);
- if (!ds)
- goto out_mds;
+ if (!ds) {
+ if (ff_layout_no_fallback_to_mds(pgio->pg_lseg))
+ goto out_pnfs;
+ else
+ goto out_mds;
+ }
pgm = &pgio->pg_mirrors[i];
mirror = FF_LAYOUT_COMP(pgio->pg_lseg, i);
pgm->pg_bsize = mirror->mirror_ds->ds_versions[0].wsize;
@@ -917,6 +932,12 @@ out_mds:
pnfs_put_lseg(pgio->pg_lseg);
pgio->pg_lseg = NULL;
nfs_pageio_reset_write_mds(pgio);
+ return;
+
+out_pnfs:
+ pnfs_set_lo_fail(pgio->pg_lseg);
+ pnfs_put_lseg(pgio->pg_lseg);
+ pgio->pg_lseg = NULL;
}

static unsigned int
--
2.5.5


2016-05-15 01:06:58

by Jeff Layton

Subject: [PATCH v3 05/13] pnfs: record sequence in pnfs_layout_segment when it's created

In later patches, we're going to teach the client to be more selective
about how it returns layouts. This means keeping a record of what the
stateid's seqid was at the time that the server handed out a layout
segment.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/pnfs.c | 1 +
fs/nfs/pnfs.h | 1 +
2 files changed, 2 insertions(+)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 5b404d926e08..18b6f950e300 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1697,6 +1697,7 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)

init_lseg(lo, lseg);
lseg->pls_range = res->range;
+ lseg->pls_seq = be32_to_cpu(res->stateid.seqid);

spin_lock(&ino->i_lock);
if (pnfs_layoutgets_blocked(lo)) {
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 7222d3a35439..361fa5494aa5 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -64,6 +64,7 @@ struct pnfs_layout_segment {
struct list_head pls_lc_list;
struct pnfs_layout_range pls_range;
atomic_t pls_refcount;
+ u32 pls_seq;
unsigned long pls_flags;
struct pnfs_layout_hdr *pls_layout;
struct work_struct pls_work;
--
2.5.5


2016-05-15 01:06:56

by Jeff Layton

Subject: [PATCH v3 04/13] pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit set

Otherwise, we'll end up returning layouts that we've just received if
the client issues a new LAYOUTGET prior to the LAYOUTRETURN.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index f58cd2a14d5e..127a65fa639a 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -298,6 +298,8 @@ ff_lseg_merge(struct pnfs_layout_segment *new,
{
u64 new_end, old_end;

+ if (test_bit(NFS_LSEG_LAYOUTRETURN, &old->pls_flags))
+ return false;
if (new->pls_range.iomode != old->pls_range.iomode)
return false;
old_end = pnfs_calc_offset_end(old->pls_range.offset,
@@ -318,8 +320,6 @@ ff_lseg_merge(struct pnfs_layout_segment *new,
new_end);
if (test_bit(NFS_LSEG_ROC, &old->pls_flags))
set_bit(NFS_LSEG_ROC, &new->pls_flags);
- if (test_bit(NFS_LSEG_LAYOUTRETURN, &old->pls_flags))
- set_bit(NFS_LSEG_LAYOUTRETURN, &new->pls_flags);
return true;
}

--
2.5.5


2016-05-15 01:06:59

by Jeff Layton

Subject: [PATCH v3 06/13] pnfs: keep track of the return sequence number in pnfs_layout_hdr

When we want to selectively do a LAYOUTRETURN, we need to specify a
stateid that represents the most recent layout acquisition that is to be
returned.

When we mark a layout stateid to be returned, we update the return
sequence number in the layout header with that value, if it's newer
than the existing one. Then, when we go to do a LAYOUTRETURN on
layout header put, we overwrite the seqid in the stateid with the
saved one, and then zero it out.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/pnfs.c | 11 ++++++++---
fs/nfs/pnfs.h | 1 +
2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 18b6f950e300..39432a3705b4 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -899,6 +899,7 @@ pnfs_prepare_layoutreturn(struct pnfs_layout_hdr *lo)
if (test_and_set_bit(NFS_LAYOUT_RETURN, &lo->plh_flags))
return false;
lo->plh_return_iomode = 0;
+ lo->plh_return_seq = 0;
pnfs_get_layout_hdr(lo);
clear_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
return true;
@@ -969,6 +970,7 @@ static void pnfs_layoutreturn_before_put_layout_hdr(struct pnfs_layout_hdr *lo)
bool send;

nfs4_stateid_copy(&stateid, &lo->plh_stateid);
+ stateid.seqid = cpu_to_be32(lo->plh_return_seq);
iomode = lo->plh_return_iomode;
send = pnfs_prepare_layoutreturn(lo);
spin_unlock(&inode->i_lock);
@@ -1747,7 +1749,8 @@ out_forget_reply:
}

static void
-pnfs_set_plh_return_iomode(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode)
+pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
+ u32 seq)
{
if (lo->plh_return_iomode == iomode)
return;
@@ -1755,6 +1758,8 @@ pnfs_set_plh_return_iomode(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode)
iomode = IOMODE_ANY;
lo->plh_return_iomode = iomode;
set_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags);
+ if (!lo->plh_return_seq || pnfs_seqid_is_newer(seq, lo->plh_return_seq))
+ lo->plh_return_seq = seq;
}

/**
@@ -1793,7 +1798,7 @@ pnfs_mark_matching_lsegs_return(struct pnfs_layout_hdr *lo,
continue;
remaining++;
set_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags);
- pnfs_set_plh_return_iomode(lo, return_range->iomode);
+ pnfs_set_plh_return_info(lo, return_range->iomode, lseg->pls_seq);
}
return remaining;
}
@@ -1811,7 +1816,7 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
bool return_now = false;

spin_lock(&inode->i_lock);
- pnfs_set_plh_return_iomode(lo, range.iomode);
+ pnfs_set_plh_return_info(lo, range.iomode, lseg->pls_seq);
/*
* mark all matching lsegs so that we are sure to have no live
* segments at hand when sending layoutreturn. See pnfs_put_lseg()
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 361fa5494aa5..3476c9850678 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -195,6 +195,7 @@ struct pnfs_layout_hdr {
unsigned long plh_flags;
nfs4_stateid plh_stateid;
u32 plh_barrier; /* ignore lower seqids */
+ u32 plh_return_seq;
enum pnfs_iomode plh_return_iomode;
loff_t plh_lwb; /* last write byte for layoutcommit */
struct rpc_cred *plh_lc_cred; /* layoutcommit cred */
--
2.5.5


2016-05-15 01:07:00

by Jeff Layton

Subject: [PATCH v3 08/13] flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED

Setting just the NFS_LAYOUT_RETURN_REQUESTED flag doesn't do anything,
unless there are lsegs that are also being marked for return. At the
point where that happens this flag is also set, so these set_bit calls
don't do anything useful.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 2 --
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 8 +-------
2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 127a65fa639a..1c902b01db9e 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1269,8 +1269,6 @@ static int ff_layout_read_done_cb(struct rpc_task *task,
hdr->pgio_mirror_idx + 1,
&hdr->pgio_mirror_idx))
goto out_eagain;
- set_bit(NFS_LAYOUT_RETURN_REQUESTED,
- &hdr->lseg->pls_layout->plh_flags);
pnfs_read_resend_pnfs(hdr);
return task->tk_status;
case -NFS4ERR_RESET_TO_MDS:
diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index c52ca75081a8..5532cb14e800 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -393,13 +393,7 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx,
mirror, lseg->pls_range.offset,
lseg->pls_range.length, NFS4ERR_NXIO,
OP_ILLEGAL, GFP_NOIO);
- if (!fail_return) {
- if (ff_layout_has_available_ds(lseg))
- set_bit(NFS_LAYOUT_RETURN_REQUESTED,
- &lseg->pls_layout->plh_flags);
- else
- pnfs_error_mark_layout_for_return(ino, lseg);
- } else
+ if (fail_return || !ff_layout_has_available_ds(lseg))
pnfs_error_mark_layout_for_return(ino, lseg);
ds = NULL;
}
--
2.5.5


2016-05-15 01:07:00

by Jeff Layton

Subject: [PATCH v3 07/13] pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args

LAYOUTRETURN is "special" in that servers and clients are expected to
work with old stateids. When the client sends a LAYOUTRETURN with an old
stateid in it then the server is expected to only tear down layout
segments that were present when that seqid was current. Ensure that the
client handles its accounting accordingly.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/callback_proc.c | 3 ++-
fs/nfs/nfs42proc.c | 2 +-
fs/nfs/nfs4proc.c | 5 ++--
fs/nfs/pnfs.c | 64 +++++++++++++++++++++++++++++++++-----------------
fs/nfs/pnfs.h | 6 +++--
5 files changed, 52 insertions(+), 28 deletions(-)

diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 618ced381a14..755838df9996 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -217,7 +217,8 @@ static u32 initiate_file_draining(struct nfs_client *clp,
}

if (pnfs_mark_matching_lsegs_return(lo, &free_me_list,
- &args->cbl_range)) {
+ &args->cbl_range,
+ be32_to_cpu(args->cbl_stateid.seqid))) {
rv = NFS4_OK;
goto unlock;
}
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index dff83460e5a6..198bcc3e103d 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -232,7 +232,7 @@ nfs42_layoutstat_done(struct rpc_task *task, void *calldata)
* with the current stateid.
*/
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
- pnfs_mark_matching_lsegs_invalid(lo, &head, NULL);
+ pnfs_mark_matching_lsegs_invalid(lo, &head, NULL, 0);
spin_unlock(&inode->i_lock);
pnfs_free_lseg_list(&head);
} else
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index bc2676c95e1b..c0d75be8cb69 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7930,7 +7930,7 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
* with the current stateid.
*/
set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
- pnfs_mark_matching_lsegs_invalid(lo, &head, NULL);
+ pnfs_mark_matching_lsegs_invalid(lo, &head, NULL, 0);
spin_unlock(&inode->i_lock);
pnfs_free_lseg_list(&head);
} else
@@ -8122,7 +8122,8 @@ static void nfs4_layoutreturn_release(void *calldata)

dprintk("--> %s\n", __func__);
spin_lock(&lo->plh_inode->i_lock);
- pnfs_mark_matching_lsegs_invalid(lo, &freeme, &lrp->args.range);
+ pnfs_mark_matching_lsegs_invalid(lo, &freeme, &lrp->args.range,
+ be32_to_cpu(lrp->args.stateid.seqid));
pnfs_mark_layout_returned_if_empty(lo);
if (lrp->res.lrs_present)
pnfs_set_layout_stateid(lo, &lrp->res.stateid, true);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 39432a3705b4..e6cad5ee5d29 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -270,7 +270,7 @@ pnfs_mark_layout_stateid_invalid(struct pnfs_layout_hdr *lo,
};

set_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags);
- return pnfs_mark_matching_lsegs_invalid(lo, lseg_list, &range);
+ return pnfs_mark_matching_lsegs_invalid(lo, lseg_list, &range, 0);
}

static int
@@ -308,7 +308,7 @@ pnfs_layout_io_set_failed(struct pnfs_layout_hdr *lo, u32 iomode)

spin_lock(&inode->i_lock);
pnfs_layout_set_fail_bit(lo, pnfs_iomode_to_fail_bit(iomode));
- pnfs_mark_matching_lsegs_invalid(lo, &head, &range);
+ pnfs_mark_matching_lsegs_invalid(lo, &head, &range, 0);
spin_unlock(&inode->i_lock);
pnfs_free_lseg_list(&head);
dprintk("%s Setting layout IOMODE_%s fail bit\n", __func__,
@@ -522,13 +522,35 @@ static int mark_lseg_invalid(struct pnfs_layout_segment *lseg,
return rv;
}

-/* Returns count of number of matching invalid lsegs remaining in list
- * after call.
+/*
+ * Compare 2 layout stateid sequence ids, to see which is newer,
+ * taking into account wraparound issues.
+ */
+static bool pnfs_seqid_is_newer(u32 s1, u32 s2)
+{
+ return (s32)(s1 - s2) > 0;
+}
+
+/**
+ * pnfs_mark_matching_lsegs_invalid - tear down lsegs or mark them for later
+ * @lo: layout header containing the lsegs
+ * @tmp_list: list head where doomed lsegs should go
+ * @recall_range: optional recall range argument to match (may be NULL)
+ * @seq: only invalidate lsegs obtained prior to this sequence (may be 0)
+ *
+ * Walk the list of lsegs in the layout header, and tear down any that should
+ * be destroyed. If "recall_range" is specified then the segment must match
+ * that range. If "seq" is non-zero, then only match segments that were handed
+ * out at or before that sequence.
+ *
+ * Returns number of matching invalid lsegs remaining in list after scanning
+ * it and purging them.
*/
int
pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
struct list_head *tmp_list,
- const struct pnfs_layout_range *recall_range)
+ const struct pnfs_layout_range *recall_range,
+ u32 seq)
{
struct pnfs_layout_segment *lseg, *next;
int remaining = 0;
@@ -540,10 +562,12 @@ pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
list_for_each_entry_safe(lseg, next, &lo->plh_segs, pls_list)
if (!recall_range ||
should_free_lseg(&lseg->pls_range, recall_range)) {
- dprintk("%s: freeing lseg %p iomode %d "
+ if (seq && pnfs_seqid_is_newer(lseg->pls_seq, seq))
+ continue;
+ dprintk("%s: freeing lseg %p iomode %d seq %u"
"offset %llu length %llu\n", __func__,
- lseg, lseg->pls_range.iomode, lseg->pls_range.offset,
- lseg->pls_range.length);
+ lseg, lseg->pls_range.iomode, lseg->pls_seq,
+ lseg->pls_range.offset, lseg->pls_range.length);
if (!mark_lseg_invalid(lseg, tmp_list))
remaining++;
}
@@ -730,15 +754,6 @@ pnfs_destroy_all_layouts(struct nfs_client *clp)
pnfs_destroy_layouts_byclid(clp, false);
}

-/*
- * Compare 2 layout stateid sequence ids, to see which is newer,
- * taking into account wraparound issues.
- */
-static bool pnfs_seqid_is_newer(u32 s1, u32 s2)
-{
- return (s32)(s1 - s2) > 0;
-}
-
/* update lo->plh_stateid with new if is more recent */
void
pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo, const nfs4_stateid *new,
@@ -1014,7 +1029,7 @@ _pnfs_return_layout(struct inode *ino)
pnfs_get_layout_hdr(lo);
empty = list_empty(&lo->plh_segs);
pnfs_clear_layoutcommit(ino, &tmp_list);
- pnfs_mark_matching_lsegs_invalid(lo, &tmp_list, NULL);
+ pnfs_mark_matching_lsegs_invalid(lo, &tmp_list, NULL, 0);

if (NFS_SERVER(ino)->pnfs_curr_ld->return_range) {
struct pnfs_layout_range range = {
@@ -1721,7 +1736,7 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
* inode invalid, and don't bother validating the stateid
* sequence number.
*/
- pnfs_mark_matching_lsegs_invalid(lo, &free_me, NULL);
+ pnfs_mark_matching_lsegs_invalid(lo, &free_me, NULL, 0);

nfs4_stateid_copy(&lo->plh_stateid, &res->stateid);
lo->plh_barrier = be32_to_cpu(res->stateid.seqid);
@@ -1775,7 +1790,8 @@ pnfs_set_plh_return_info(struct pnfs_layout_hdr *lo, enum pnfs_iomode iomode,
int
pnfs_mark_matching_lsegs_return(struct pnfs_layout_hdr *lo,
struct list_head *tmp_list,
- const struct pnfs_layout_range *return_range)
+ const struct pnfs_layout_range *return_range,
+ u32 seq)
{
struct pnfs_layout_segment *lseg, *next;
int remaining = 0;
@@ -1798,8 +1814,11 @@ pnfs_mark_matching_lsegs_return(struct pnfs_layout_hdr *lo,
continue;
remaining++;
set_bit(NFS_LSEG_LAYOUTRETURN, &lseg->pls_flags);
- pnfs_set_plh_return_info(lo, return_range->iomode, lseg->pls_seq);
}
+
+ if (remaining)
+ pnfs_set_plh_return_info(lo, return_range->iomode, seq);
+
return remaining;
}

@@ -1822,7 +1841,8 @@ void pnfs_error_mark_layout_for_return(struct inode *inode,
* segments at hand when sending layoutreturn. See pnfs_put_lseg()
* for how it works.
*/
- if (!pnfs_mark_matching_lsegs_return(lo, &free_me, &range)) {
+ if (!pnfs_mark_matching_lsegs_return(lo, &free_me,
+ &range, lseg->pls_seq)) {
nfs4_stateid stateid;
enum pnfs_iomode iomode = lo->plh_return_iomode;

diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 3476c9850678..971068b58647 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -266,10 +266,12 @@ int pnfs_choose_layoutget_stateid(nfs4_stateid *dst,
struct nfs4_state *open_state);
int pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
struct list_head *tmp_list,
- const struct pnfs_layout_range *recall_range);
+ const struct pnfs_layout_range *recall_range,
+ u32 seq);
int pnfs_mark_matching_lsegs_return(struct pnfs_layout_hdr *lo,
struct list_head *tmp_list,
- const struct pnfs_layout_range *recall_range);
+ const struct pnfs_layout_range *recall_range,
+ u32 seq);
bool pnfs_roc(struct inode *ino);
void pnfs_roc_release(struct inode *ino);
void pnfs_roc_set_barrier(struct inode *ino, u32 barrier);
--
2.5.5


2016-05-15 01:07:01

by Jeff Layton

Subject: [PATCH v3 09/13] flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_ds

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayoutdev.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index 5532cb14e800..6efd060e731f 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -342,7 +342,23 @@ out:
return fh;
}

-/* Upon return, either ds is connected, or ds is NULL */
+/**
+ * nfs4_ff_layout_prepare_ds - prepare a DS connection for an RPC call
+ * @lseg: the layout segment we're operating on
+ * @ds_idx: index of the DS to use
+ * @fail_return: return layout on connect failure?
+ *
+ * Try to prepare a DS connection to accept an RPC call. This involves
+ * selecting a mirror to use and connecting the client to it if it's not
+ * already connected.
+ *
+ * Since we only need a single functioning mirror to satisfy a read, we don't
+ * want to return the layout if there is one. For writes though, any down
+ * mirror should result in a LAYOUTRETURN. @fail_return is how we distinguish
+ * between the two cases.
+ *
+ * Returns a pointer to a connected DS object on success or NULL on failure.
+ */
struct nfs4_pnfs_ds *
nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx,
bool fail_return)
--
2.5.5


2016-05-15 01:07:03

by Jeff Layton

Subject: [PATCH v3 10/13] pnfs: fix bad error handling in send_layoutget

Currently, the code will clear the fail bit if we get back a fatal
error. I don't think that's correct -- we only want to clear that
bit if the layoutget succeeds.

Fixes: 0bcbf039f6 (nfs: handle request add failure properly)
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/pnfs.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index e6cad5ee5d29..5f6ed295acb5 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -876,11 +876,13 @@ send_layoutget(struct pnfs_layout_hdr *lo,
lseg = nfs4_proc_layoutget(lgp, gfp_flags);
} while (lseg == ERR_PTR(-EAGAIN));

- if (IS_ERR(lseg) && !nfs_error_is_fatal(PTR_ERR(lseg)))
- lseg = NULL;
- else
+ if (IS_ERR(lseg)) {
+ if (!nfs_error_is_fatal(PTR_ERR(lseg)))
+ lseg = NULL;
+ } else {
pnfs_layout_clear_fail_bit(lo,
pnfs_iomode_to_fail_bit(range->iomode));
+ }

return lseg;
}
--
2.5.5


2016-05-15 01:07:03

by Jeff Layton

Subject: [PATCH v3 11/13] pnfs: lift retry logic from send_layoutget to pnfs_update_layout

If we get back something like NFS4ERR_OLD_STATEID, that will be
translated into -EAGAIN, and the do/while loop in send_layoutget
will drive the call again.

This is not quite what we want, I think. An error like that is a
sign that something has changed. That something could have been a
concurrent LAYOUTGET that would give us a usable lseg.

Lift the retry logic into pnfs_update_layout instead. That allows
us to redo the layout search, and may spare us from having to issue
an RPC.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/pnfs.c | 67 ++++++++++++++++++++++++++++++-----------------------------
1 file changed, 34 insertions(+), 33 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 5f6ed295acb5..5a8c19c57f16 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -839,7 +839,6 @@ send_layoutget(struct pnfs_layout_hdr *lo,
struct inode *ino = lo->plh_inode;
struct nfs_server *server = NFS_SERVER(ino);
struct nfs4_layoutget *lgp;
- struct pnfs_layout_segment *lseg;
loff_t i_size;

dprintk("--> %s\n", __func__);
@@ -849,42 +848,30 @@ send_layoutget(struct pnfs_layout_hdr *lo,
* store in lseg. If we race with a concurrent seqid morphing
* op, then re-send the LAYOUTGET.
*/
- do {
- lgp = kzalloc(sizeof(*lgp), gfp_flags);
- if (lgp == NULL)
- return NULL;
-
- i_size = i_size_read(ino);
-
- lgp->args.minlength = PAGE_SIZE;
- if (lgp->args.minlength > range->length)
- lgp->args.minlength = range->length;
- if (range->iomode == IOMODE_READ) {
- if (range->offset >= i_size)
- lgp->args.minlength = 0;
- else if (i_size - range->offset < lgp->args.minlength)
- lgp->args.minlength = i_size - range->offset;
- }
- lgp->args.maxcount = PNFS_LAYOUT_MAXSIZE;
- pnfs_copy_range(&lgp->args.range, range);
- lgp->args.type = server->pnfs_curr_ld->id;
- lgp->args.inode = ino;
- lgp->args.ctx = get_nfs_open_context(ctx);
- lgp->gfp_flags = gfp_flags;
- lgp->cred = lo->plh_lc_cred;
+ lgp = kzalloc(sizeof(*lgp), gfp_flags);
+ if (lgp == NULL)
+ return ERR_PTR(-ENOMEM);

- lseg = nfs4_proc_layoutget(lgp, gfp_flags);
- } while (lseg == ERR_PTR(-EAGAIN));
+ i_size = i_size_read(ino);

- if (IS_ERR(lseg)) {
- if (!nfs_error_is_fatal(PTR_ERR(lseg)))
- lseg = NULL;
- } else {
- pnfs_layout_clear_fail_bit(lo,
- pnfs_iomode_to_fail_bit(range->iomode));
+ lgp->args.minlength = PAGE_SIZE;
+ if (lgp->args.minlength > range->length)
+ lgp->args.minlength = range->length;
+ if (range->iomode == IOMODE_READ) {
+ if (range->offset >= i_size)
+ lgp->args.minlength = 0;
+ else if (i_size - range->offset < lgp->args.minlength)
+ lgp->args.minlength = i_size - range->offset;
}
+ lgp->args.maxcount = PNFS_LAYOUT_MAXSIZE;
+ pnfs_copy_range(&lgp->args.range, range);
+ lgp->args.type = server->pnfs_curr_ld->id;
+ lgp->args.inode = ino;
+ lgp->args.ctx = get_nfs_open_context(ctx);
+ lgp->gfp_flags = gfp_flags;
+ lgp->cred = lo->plh_lc_cred;

- return lseg;
+ return nfs4_proc_layoutget(lgp, gfp_flags);
}

static void pnfs_clear_layoutcommit(struct inode *inode,
@@ -1646,6 +1633,20 @@ lookup_again:
arg.length = PAGE_ALIGN(arg.length);

lseg = send_layoutget(lo, ctx, &arg, gfp_flags);
+ if (IS_ERR(lseg)) {
+ if (lseg == ERR_PTR(-EAGAIN)) {
+ if (first)
+ pnfs_clear_first_layoutget(lo);
+ pnfs_put_layout_hdr(lo);
+ goto lookup_again;
+ }
+
+ if (!nfs_error_is_fatal(PTR_ERR(lseg)))
+ lseg = NULL;
+ } else {
+ pnfs_layout_clear_fail_bit(lo, pnfs_iomode_to_fail_bit(iomode));
+ }
+
atomic_dec(&lo->plh_outstanding);
trace_pnfs_update_layout(ino, pos, count, iomode, lo,
PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET);
--
2.5.5


2016-05-15 01:07:06

by Jeff Layton

Subject: [PATCH v3 13/13] pnfs: make pnfs_layout_process more robust

pnfs_layout_process can return NULL if layoutgets are currently blocked.
Fix it to return -EAGAIN in that case, so we can handle it properly in
pnfs_update_layout.

Also, clean up and simplify the error handling -- eliminate "status" and
just use "lseg".

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/pnfs.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index d0760d30734d..4108e78ae3b6 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -1706,21 +1706,19 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
struct pnfs_layout_segment *lseg;
struct inode *ino = lo->plh_inode;
LIST_HEAD(free_me);
- int status = -EINVAL;

if (!pnfs_sanity_check_layout_range(&res->range))
- goto out;
+ return ERR_PTR(-EINVAL);

/* Inject layout blob into I/O device driver */
lseg = NFS_SERVER(ino)->pnfs_curr_ld->alloc_lseg(lo, res, lgp->gfp_flags);
- if (!lseg || IS_ERR(lseg)) {
+ if (IS_ERR_OR_NULL(lseg)) {
if (!lseg)
- status = -ENOMEM;
- else
- status = PTR_ERR(lseg);
- dprintk("%s: Could not allocate layout: error %d\n",
- __func__, status);
- goto out;
+ lseg = ERR_PTR(-ENOMEM);
+
+ dprintk("%s: Could not allocate layout: error %ld\n",
+ __func__, PTR_ERR(lseg));
+ return lseg;
}

init_lseg(lo, lseg);
@@ -1730,15 +1728,14 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
spin_lock(&ino->i_lock);
if (pnfs_layoutgets_blocked(lo)) {
dprintk("%s forget reply due to state\n", __func__);
- goto out_forget_reply;
+ goto out_forget;
}

if (nfs4_stateid_match_other(&lo->plh_stateid, &res->stateid)) {
/* existing state ID, make sure the sequence number matches. */
if (pnfs_layout_stateid_blocked(lo, &res->stateid)) {
dprintk("%s forget reply due to sequence\n", __func__);
- status = -EAGAIN;
- goto out_forget_reply;
+ goto out_forget;
}
pnfs_set_layout_stateid(lo, &res->stateid, false);
} else {
@@ -1764,14 +1761,12 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
spin_unlock(&ino->i_lock);
pnfs_free_lseg_list(&free_me);
return lseg;
-out:
- return ERR_PTR(status);

-out_forget_reply:
+out_forget:
spin_unlock(&ino->i_lock);
lseg->pls_layout = lo;
NFS_SERVER(ino)->pnfs_curr_ld->free_lseg(lseg);
- goto out;
+ return ERR_PTR(-EAGAIN);
}

static void
--
2.5.5


2016-05-15 01:07:05

by Jeff Layton

Subject: [PATCH v3 12/13] pnfs: rework LAYOUTGET retry handling

There are several problems in the way a stateid is selected for a
LAYOUTGET operation:

We pick a stateid to use in the RPC prepare op, but that makes
it difficult to serialize LAYOUTGETs that use the open stateid. That
serialization is done in pnfs_update_layout, which occurs well before
the rpc_prepare operation.

Between those two events, the i_lock is dropped and reacquired.
pnfs_update_layout can find that the list has lsegs in it and not do any
serialization, but then later pnfs_choose_layoutget_stateid ends up
choosing the open stateid.

This patch changes the client to select the stateid to use in the
LAYOUTGET earlier, when we're searching for a usable layout segment.
This way we can do it all while holding the i_lock the first time, and
ensure that we serialize any LAYOUTGET call that uses a non-layout
stateid.

This also means a rework of how LAYOUTGET replies are handled, as we
must now get the latest stateid if we want to retransmit in response
to a retryable error.

Most of those errors boil down to the fact that the layout state has
changed in some fashion. Thus, what we really want to do is to re-search
for a layout when it fails with a retryable error, so that we can avoid
reissuing the RPC at all if possible.

While the LAYOUTGET RPC is async, the initiating thread always waits for
it to complete, so it's effectively synchronous anyway. Currently, when
we need to retry a LAYOUTGET because of an error, we drive that retry
via the rpc state machine.

This means that once the call has been submitted, it runs until it
completes. So, we must move the error handling for this RPC out of the
rpc_call_done operation and into the caller.

In order to handle errors like NFS4ERR_DELAY properly, we must also
pass a pointer to the sliding timeout, which is now moved to the stack
in pnfs_update_layout.

The complicating errors are -NFS4ERR_RECALLCONFLICT and
-NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
up and return NULL back to the caller. So, there is some special
handling for those errors to ensure that the layers driving the retries
can handle that appropriately.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfs/nfs4proc.c | 111 +++++++++++++++----------------------
fs/nfs/nfs4trace.h | 10 +++-
fs/nfs/pnfs.c | 142 +++++++++++++++++++++++++-----------------------
fs/nfs/pnfs.h | 6 +-
include/linux/nfs4.h | 2 +
include/linux/nfs_xdr.h | 2 -
6 files changed, 131 insertions(+), 142 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c0d75be8cb69..1254ed84c760 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -416,6 +416,7 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
case -NFS4ERR_DELAY:
nfs_inc_server_stats(server, NFSIOS_DELAY);
case -NFS4ERR_GRACE:
+ case -NFS4ERR_RECALLCONFLICT:
exception->delay = 1;
return 0;

@@ -7824,40 +7825,34 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
struct nfs4_layoutget *lgp = calldata;
struct nfs_server *server = NFS_SERVER(lgp->args.inode);
struct nfs4_session *session = nfs4_get_session(server);
- int ret;

dprintk("--> %s\n", __func__);
- /* Note the is a race here, where a CB_LAYOUTRECALL can come in
- * right now covering the LAYOUTGET we are about to send.
- * However, that is not so catastrophic, and there seems
- * to be no way to prevent it completely.
- */
- if (nfs41_setup_sequence(session, &lgp->args.seq_args,
- &lgp->res.seq_res, task))
- return;
- ret = pnfs_choose_layoutget_stateid(&lgp->args.stateid,
- NFS_I(lgp->args.inode)->layout,
- &lgp->args.range,
- lgp->args.ctx->state);
- if (ret < 0)
- rpc_exit(task, ret);
+ nfs41_setup_sequence(session, &lgp->args.seq_args,
+ &lgp->res.seq_res, task);
+ dprintk("<-- %s\n", __func__);
}

static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
{
struct nfs4_layoutget *lgp = calldata;
+
+ dprintk("--> %s\n", __func__);
+ nfs41_sequence_done(task, &lgp->res.seq_res);
+ dprintk("<-- %s\n", __func__);
+}
+
+static int
+nfs4_layoutget_handle_exception(struct rpc_task *task,
+ struct nfs4_layoutget *lgp, struct nfs4_exception *exception)
+{
struct inode *inode = lgp->args.inode;
struct nfs_server *server = NFS_SERVER(inode);
struct pnfs_layout_hdr *lo;
- struct nfs4_state *state = NULL;
- unsigned long timeo, now, giveup;
+ int status = task->tk_status;

dprintk("--> %s tk_status => %d\n", __func__, -task->tk_status);

- if (!nfs41_sequence_done(task, &lgp->res.seq_res))
- goto out;
-
- switch (task->tk_status) {
+ switch (status) {
case 0:
goto out;

@@ -7867,57 +7862,39 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
* retry go inband.
*/
case -NFS4ERR_LAYOUTUNAVAILABLE:
- task->tk_status = -ENODATA;
+ status = -ENODATA;
goto out;
/*
* NFS4ERR_BADLAYOUT means the MDS cannot return a layout of
* length lgp->args.minlength != 0 (see RFC5661 section 18.43.3).
*/
case -NFS4ERR_BADLAYOUT:
- goto out_overflow;
+ status = -EOVERFLOW;
+ goto out;
/*
* NFS4ERR_LAYOUTTRYLATER is a conflict with another client
* (or clients) writing to the same RAID stripe except when
* the minlength argument is 0 (see RFC5661 section 18.43.3).
+ *
+ * Treat it like we would RECALLCONFLICT -- we retry for a little
+ * while, and then eventually give up.
*/
case -NFS4ERR_LAYOUTTRYLATER:
- if (lgp->args.minlength == 0)
- goto out_overflow;
- /*
- * NFS4ERR_RECALLCONFLICT is when conflict with self (must recall
- * existing layout before getting a new one).
- */
- case -NFS4ERR_RECALLCONFLICT:
- timeo = rpc_get_timeout(task->tk_client);
- giveup = lgp->args.timestamp + timeo;
- now = jiffies;
- if (time_after(giveup, now)) {
- unsigned long delay;
-
- /* Delay for:
- * - Not less then NFS4_POLL_RETRY_MIN.
- * - One last time a jiffie before we give up
- * - exponential backoff (time_now minus start_attempt)
- */
- delay = max_t(unsigned long, NFS4_POLL_RETRY_MIN,
- min((giveup - now - 1),
- now - lgp->args.timestamp));
-
- dprintk("%s: NFS4ERR_RECALLCONFLICT waiting %lu\n",
- __func__, delay);
- rpc_delay(task, delay);
- /* Do not call nfs4_async_handle_error() */
- goto out_restart;
+ if (lgp->args.minlength == 0) {
+ status = -EOVERFLOW;
+ goto out;
}
+ status = -NFS4ERR_RECALLCONFLICT;
break;
case -NFS4ERR_EXPIRED:
case -NFS4ERR_BAD_STATEID:
+ exception->timeout = 0;
spin_lock(&inode->i_lock);
if (nfs4_stateid_match(&lgp->args.stateid,
&lgp->args.ctx->state->stateid)) {
spin_unlock(&inode->i_lock);
/* If the open stateid was bad, then recover it. */
- state = lgp->args.ctx->state;
+ exception->state = lgp->args.ctx->state;
break;
}
lo = NFS_I(inode)->layout;
@@ -7935,20 +7912,18 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
pnfs_free_lseg_list(&head);
} else
spin_unlock(&inode->i_lock);
- goto out_restart;
+ status = -EAGAIN;
+ goto out;
}
- if (nfs4_async_handle_error(task, server, state, &lgp->timeout) == -EAGAIN)
- goto out_restart;
+
+ status = nfs4_handle_exception(server, status, exception);
+ if (status == 0)
+ status = task->tk_status;
+ if (exception->retry && status != -NFS4ERR_RECALLCONFLICT)
+ status = -EAGAIN;
out:
dprintk("<-- %s\n", __func__);
- return;
-out_restart:
- task->tk_status = 0;
- rpc_restart_call_prepare(task);
- return;
-out_overflow:
- task->tk_status = -EOVERFLOW;
- goto out;
+ return status;
}

static size_t max_response_pages(struct nfs_server *server)
@@ -8017,7 +7992,7 @@ static const struct rpc_call_ops nfs4_layoutget_call_ops = {
};

struct pnfs_layout_segment *
-nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
+nfs4_proc_layoutget(struct nfs4_layoutget *lgp, long *timeout, gfp_t gfp_flags)
{
struct inode *inode = lgp->args.inode;
struct nfs_server *server = NFS_SERVER(inode);
@@ -8037,6 +8012,7 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
.flags = RPC_TASK_ASYNC,
};
struct pnfs_layout_segment *lseg = NULL;
+ struct nfs4_exception exception = { .timeout = *timeout };
int status = 0;

dprintk("--> %s\n", __func__);
@@ -8050,7 +8026,6 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
return ERR_PTR(-ENOMEM);
}
lgp->args.layout.pglen = max_pages * PAGE_SIZE;
- lgp->args.timestamp = jiffies;

lgp->res.layoutp = &lgp->args.layout;
lgp->res.seq_res.sr_slot = NULL;
@@ -8060,13 +8035,17 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
if (IS_ERR(task))
return ERR_CAST(task);
status = nfs4_wait_for_completion_rpc_task(task);
- if (status == 0)
- status = task->tk_status;
+ if (status == 0) {
+ status = nfs4_layoutget_handle_exception(task, lgp, &exception);
+ *timeout = exception.timeout;
+ }
+
trace_nfs4_layoutget(lgp->args.ctx,
&lgp->args.range,
&lgp->res.range,
&lgp->res.stateid,
status);
+
/* if layoutp->len is 0, nfs4_layoutget_prepare called rpc_exit */
if (status == 0 && lgp->res.layoutp->len)
lseg = pnfs_layout_process(lgp);
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 2c8d05dae5b1..9c150b153782 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -1520,6 +1520,8 @@ DEFINE_NFS4_INODE_EVENT(nfs4_layoutreturn_on_close);
{ PNFS_UPDATE_LAYOUT_FOUND_CACHED, "found cached" }, \
{ PNFS_UPDATE_LAYOUT_RETURN, "layoutreturn" }, \
{ PNFS_UPDATE_LAYOUT_BLOCKED, "layouts blocked" }, \
+ { PNFS_UPDATE_LAYOUT_INVALID_OPEN, "invalid open" }, \
+ { PNFS_UPDATE_LAYOUT_RETRY, "retrying" }, \
{ PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET, "sent layoutget" })

TRACE_EVENT(pnfs_update_layout,
@@ -1528,9 +1530,10 @@ TRACE_EVENT(pnfs_update_layout,
u64 count,
enum pnfs_iomode iomode,
struct pnfs_layout_hdr *lo,
+ struct pnfs_layout_segment *lseg,
enum pnfs_update_layout_reason reason
),
- TP_ARGS(inode, pos, count, iomode, lo, reason),
+ TP_ARGS(inode, pos, count, iomode, lo, lseg, reason),
TP_STRUCT__entry(
__field(dev_t, dev)
__field(u64, fileid)
@@ -1540,6 +1543,7 @@ TRACE_EVENT(pnfs_update_layout,
__field(enum pnfs_iomode, iomode)
__field(int, layoutstateid_seq)
__field(u32, layoutstateid_hash)
+ __field(long, lseg)
__field(enum pnfs_update_layout_reason, reason)
),
TP_fast_assign(
@@ -1559,11 +1563,12 @@ TRACE_EVENT(pnfs_update_layout,
__entry->layoutstateid_seq = 0;
__entry->layoutstateid_hash = 0;
}
+ __entry->lseg = (long)lseg;
),
TP_printk(
"fileid=%02x:%02x:%llu fhandle=0x%08x "
"iomode=%s pos=%llu count=%llu "
- "layoutstateid=%d:0x%08x (%s)",
+ "layoutstateid=%d:0x%08x lseg=0x%lx (%s)",
MAJOR(__entry->dev), MINOR(__entry->dev),
(unsigned long long)__entry->fileid,
__entry->fhandle,
@@ -1571,6 +1576,7 @@ TRACE_EVENT(pnfs_update_layout,
(unsigned long long)__entry->pos,
(unsigned long long)__entry->count,
__entry->layoutstateid_seq, __entry->layoutstateid_hash,
+ __entry->lseg,
show_pnfs_update_layout_reason(__entry->reason)
)
);
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 5a8c19c57f16..d0760d30734d 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -796,45 +796,18 @@ pnfs_layoutgets_blocked(const struct pnfs_layout_hdr *lo)
test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
}

-int
-pnfs_choose_layoutget_stateid(nfs4_stateid *dst, struct pnfs_layout_hdr *lo,
- const struct pnfs_layout_range *range,
- struct nfs4_state *open_state)
-{
- int status = 0;
-
- dprintk("--> %s\n", __func__);
- spin_lock(&lo->plh_inode->i_lock);
- if (pnfs_layoutgets_blocked(lo)) {
- status = -EAGAIN;
- } else if (!nfs4_valid_open_stateid(open_state)) {
- status = -EBADF;
- } else if (list_empty(&lo->plh_segs) ||
- test_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags)) {
- int seq;
-
- do {
- seq = read_seqbegin(&open_state->seqlock);
- nfs4_stateid_copy(dst, &open_state->stateid);
- } while (read_seqretry(&open_state->seqlock, seq));
- } else
- nfs4_stateid_copy(dst, &lo->plh_stateid);
- spin_unlock(&lo->plh_inode->i_lock);
- dprintk("<-- %s\n", __func__);
- return status;
-}
-
/*
-* Get layout from server.
-* for now, assume that whole file layouts are requested.
-* arg->offset: 0
-* arg->length: all ones
-*/
+ * Get layout from server.
+ * for now, assume that whole file layouts are requested.
+ * arg->offset: 0
+ * arg->length: all ones
+ */
static struct pnfs_layout_segment *
send_layoutget(struct pnfs_layout_hdr *lo,
struct nfs_open_context *ctx,
+ nfs4_stateid *stateid,
const struct pnfs_layout_range *range,
- gfp_t gfp_flags)
+ long *timeout, gfp_t gfp_flags)
{
struct inode *ino = lo->plh_inode;
struct nfs_server *server = NFS_SERVER(ino);
@@ -868,10 +841,11 @@ send_layoutget(struct pnfs_layout_hdr *lo,
lgp->args.type = server->pnfs_curr_ld->id;
lgp->args.inode = ino;
lgp->args.ctx = get_nfs_open_context(ctx);
+ nfs4_stateid_copy(&lgp->args.stateid, stateid);
lgp->gfp_flags = gfp_flags;
lgp->cred = lo->plh_lc_cred;

- return nfs4_proc_layoutget(lgp, gfp_flags);
+ return nfs4_proc_layoutget(lgp, timeout, gfp_flags);
}

static void pnfs_clear_layoutcommit(struct inode *inode,
@@ -1511,27 +1485,30 @@ pnfs_update_layout(struct inode *ino,
.offset = pos,
.length = count,
};
- unsigned pg_offset;
+ unsigned pg_offset, seq;
struct nfs_server *server = NFS_SERVER(ino);
struct nfs_client *clp = server->nfs_client;
- struct pnfs_layout_hdr *lo;
+ struct pnfs_layout_hdr *lo = NULL;
struct pnfs_layout_segment *lseg = NULL;
+ nfs4_stateid stateid;
+ long timeout = 0;
+ unsigned long giveup = jiffies + rpc_get_timeout(server->client);
bool first;

if (!pnfs_enabled_sb(NFS_SERVER(ino))) {
- trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_NO_PNFS);
goto out;
}

if (iomode == IOMODE_READ && i_size_read(ino) == 0) {
- trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_RD_ZEROLEN);
goto out;
}

if (pnfs_within_mdsthreshold(ctx, ino, iomode)) {
- trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_MDSTHRESH);
goto out;
}
@@ -1542,14 +1519,14 @@ lookup_again:
lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
if (lo == NULL) {
spin_unlock(&ino->i_lock);
- trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_NOMEM);
goto out;
}

/* Do we even need to bother with this? */
if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_BULK_RECALL);
dprintk("%s matches recall, use MDS\n", __func__);
goto out_unlock;
@@ -1557,14 +1534,34 @@ lookup_again:

/* if LAYOUTGET already failed once we don't try again */
if (pnfs_layout_io_test_failed(lo, iomode)) {
- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_IO_TEST_FAIL);
goto out_unlock;
}

- first = list_empty(&lo->plh_segs);
- if (first) {
- /* The first layoutget for the file. Need to serialize per
+ lseg = pnfs_find_lseg(lo, &arg);
+ if (lseg) {
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
+ PNFS_UPDATE_LAYOUT_FOUND_CACHED);
+ goto out_unlock;
+ }
+
+ if (!nfs4_valid_open_stateid(ctx->state)) {
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
+ PNFS_UPDATE_LAYOUT_INVALID_OPEN);
+ goto out_unlock;
+ }
+
+ /*
+ * Choose a stateid for the LAYOUTGET. If we don't have a layout
+ * stateid, or it has been invalidated, then we must use the open
+ * stateid.
+ */
+ if (lo->plh_stateid.seqid == 0 ||
+ test_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags)) {
+
+ /*
+ * The first layoutget for the file. Need to serialize per
* RFC 5661 Errata 3208.
*/
if (test_and_set_bit(NFS_LAYOUT_FIRST_LAYOUTGET,
@@ -1573,18 +1570,17 @@ lookup_again:
wait_on_bit(&lo->plh_flags, NFS_LAYOUT_FIRST_LAYOUTGET,
TASK_UNINTERRUPTIBLE);
pnfs_put_layout_hdr(lo);
+ dprintk("%s retrying\n", __func__);
goto lookup_again;
}
+
+ first = true;
+ do {
+ seq = read_seqbegin(&ctx->state->seqlock);
+ nfs4_stateid_copy(&stateid, &ctx->state->stateid);
+ } while (read_seqretry(&ctx->state->seqlock, seq));
} else {
- /* Check to see if the layout for the given range
- * already exists
- */
- lseg = pnfs_find_lseg(lo, &arg);
- if (lseg) {
- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
- PNFS_UPDATE_LAYOUT_FOUND_CACHED);
- goto out_unlock;
- }
+ nfs4_stateid_copy(&stateid, &lo->plh_stateid);
}

/*
@@ -1599,15 +1595,17 @@ lookup_again:
pnfs_clear_first_layoutget(lo);
pnfs_put_layout_hdr(lo);
dprintk("%s retrying\n", __func__);
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo,
+ lseg, PNFS_UPDATE_LAYOUT_RETRY);
goto lookup_again;
}
- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_RETURN);
goto out_put_layout_hdr;
}

if (pnfs_layoutgets_blocked(lo)) {
- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_BLOCKED);
goto out_unlock;
}
@@ -1632,24 +1630,34 @@ lookup_again:
if (arg.length != NFS4_MAX_UINT64)
arg.length = PAGE_ALIGN(arg.length);

- lseg = send_layoutget(lo, ctx, &arg, gfp_flags);
- if (IS_ERR(lseg)) {
- if (lseg == ERR_PTR(-EAGAIN)) {
+ lseg = send_layoutget(lo, ctx, &stateid, &arg, &timeout, gfp_flags);
+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
+ PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET);
+ if (IS_ERR_OR_NULL(lseg)) {
+ switch(PTR_ERR(lseg)) {
+ case -NFS4ERR_RECALLCONFLICT:
+ if (time_after(jiffies, giveup))
+ lseg = NULL;
+ /* Fallthrough */
+ case -EAGAIN:
+ pnfs_put_layout_hdr(lo);
if (first)
pnfs_clear_first_layoutget(lo);
- pnfs_put_layout_hdr(lo);
- goto lookup_again;
+ if (lseg) {
+ trace_pnfs_update_layout(ino, pos, count,
+ iomode, lo, lseg, PNFS_UPDATE_LAYOUT_RETRY);
+ goto lookup_again;
+ }
+ break;
+ default:
+ if (!nfs_error_is_fatal(PTR_ERR(lseg)))
+ lseg = NULL;
}
-
- if (!nfs_error_is_fatal(PTR_ERR(lseg)))
- lseg = NULL;
} else {
pnfs_layout_clear_fail_bit(lo, pnfs_iomode_to_fail_bit(iomode));
}

atomic_dec(&lo->plh_outstanding);
- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
- PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET);
out_put_layout_hdr:
if (first)
pnfs_clear_first_layoutget(lo);
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 971068b58647..f9f3331bef49 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -228,7 +228,7 @@ extern void pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *);
extern int nfs4_proc_getdeviceinfo(struct nfs_server *server,
struct pnfs_device *dev,
struct rpc_cred *cred);
-extern struct pnfs_layout_segment* nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags);
+extern struct pnfs_layout_segment* nfs4_proc_layoutget(struct nfs4_layoutget *lgp, long *timeout, gfp_t gfp_flags);
extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool sync);

/* pnfs.c */
@@ -260,10 +260,6 @@ void pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo,
const nfs4_stateid *new,
bool update_barrier);
-int pnfs_choose_layoutget_stateid(nfs4_stateid *dst,
- struct pnfs_layout_hdr *lo,
- const struct pnfs_layout_range *range,
- struct nfs4_state *open_state);
int pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
struct list_head *tmp_list,
const struct pnfs_layout_range *recall_range,
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 011433478a14..f4870a330290 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -621,7 +621,9 @@ enum pnfs_update_layout_reason {
PNFS_UPDATE_LAYOUT_IO_TEST_FAIL,
PNFS_UPDATE_LAYOUT_FOUND_CACHED,
PNFS_UPDATE_LAYOUT_RETURN,
+ PNFS_UPDATE_LAYOUT_RETRY,
PNFS_UPDATE_LAYOUT_BLOCKED,
+ PNFS_UPDATE_LAYOUT_INVALID_OPEN,
PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET,
};

diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index cb9982d8f38f..a4cb8a33ae2c 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -233,7 +233,6 @@ struct nfs4_layoutget_args {
struct inode *inode;
struct nfs_open_context *ctx;
nfs4_stateid stateid;
- unsigned long timestamp;
struct nfs4_layoutdriver_data layout;
};

@@ -251,7 +250,6 @@ struct nfs4_layoutget {
struct nfs4_layoutget_res res;
struct rpc_cred *cred;
gfp_t gfp_flags;
- long timeout;
};

struct nfs4_getdeviceinfo_args {
--
2.5.5


2016-05-16 19:50:29

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3 10/13] pnfs: fix bad error handling in send_layoutget

On 5/14/16, 21:06, "Jeff Layton" <[email protected]> wrote:

>Currently, the code will clear the fail bit if we get back a fatal
>error. I don't think that's correct -- we only want to clear that
>bit if the layoutget succeeds.
>
>Fixes: 0bcbf039f6 (nfs: handle request add failure properly)
>Signed-off-by: Jeff Layton <[email protected]>
>---
> fs/nfs/pnfs.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
>diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
>index e6cad5ee5d29..5f6ed295acb5 100644
>--- a/fs/nfs/pnfs.c
>+++ b/fs/nfs/pnfs.c
>@@ -876,11 +876,13 @@ send_layoutget(struct pnfs_layout_hdr *lo,
> 		lseg = nfs4_proc_layoutget(lgp, gfp_flags);
> 	} while (lseg == ERR_PTR(-EAGAIN));
> 
>-	if (IS_ERR(lseg) && !nfs_error_is_fatal(PTR_ERR(lseg)))
>-		lseg = NULL;
>-	else
>+	if (IS_ERR(lseg)) {
>+		if (!nfs_error_is_fatal(PTR_ERR(lseg)))
>+			lseg = NULL;
>+	} else {
> 		pnfs_layout_clear_fail_bit(lo,
> 				pnfs_iomode_to_fail_bit(range->iomode));
>+	}

No… The intention was indeed that we clear the fail bit in all cases except fatal errors. Otherwise, the client won't attempt another layoutget.


2016-05-16 20:08:10

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 10/13] pnfs: fix bad error handling in send_layoutget

On Mon, 2016-05-16 at 19:50 +0000, Trond Myklebust wrote:
>
>
> On 5/14/16, 21:06, "Jeff Layton" <[email protected]> wrote:
>
> >Currently, the code will clear the fail bit if we get back a fatal
> >error. I don't think that's correct -- we only want to clear that
> >bit if the layoutget succeeds.
> >
> >Fixes: 0bcbf039f6 (nfs: handle request add failure properly)
> >Signed-off-by: Jeff Layton <[email protected]>
> >---
> > fs/nfs/pnfs.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> >diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> >index e6cad5ee5d29..5f6ed295acb5 100644
> >--- a/fs/nfs/pnfs.c
> >+++ b/fs/nfs/pnfs.c
> >@@ -876,11 +876,13 @@ send_layoutget(struct pnfs_layout_hdr *lo,
> > lseg = nfs4_proc_layoutget(lgp, gfp_flags);
> > } while (lseg == ERR_PTR(-EAGAIN));
> > 
> >- if (IS_ERR(lseg) && !nfs_error_is_fatal(PTR_ERR(lseg)))
> >-  lseg = NULL;
> >- else
> >+ if (IS_ERR(lseg)) {
> >+  if (!nfs_error_is_fatal(PTR_ERR(lseg)))
> >+  lseg = NULL;
> >+ } else {
> > pnfs_layout_clear_fail_bit(lo,
> > pnfs_iomode_to_fail_bit(range->iomode));
> >+ }
>
> No… The intention was indeed that we clear the fail bit in all cases
> except fatal errors. Otherwise, the client won’t attempt another
> layoutget.
>

Got it, thanks. Let me respin, retest and resend. Let me know if you
see any other problems and I'll get those fixed before the next resend.
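
Just to double-check that I have the intent right before respinning, send_layoutget() would keep the fail bit only across fatal errors, roughly like this (a sketch of the behavior you describe, not the tested patch):

	if (IS_ERR(lseg) && nfs_error_is_fatal(PTR_ERR(lseg)))
		return lseg;	/* fatal: leave the fail bit set */
	if (IS_ERR(lseg))
		lseg = NULL;	/* non-fatal: fall back to the MDS for this I/O */
	/* success or non-fatal error: allow future layoutgets again */
	pnfs_layout_clear_fail_bit(lo, pnfs_iomode_to_fail_bit(range->iomode));
	return lseg;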

Thanks!
--
Jeff Layton <[email protected]>

2016-05-17 02:08:15

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v3 12/13] pnfs: rework LAYOUTGET retry handling

On 5/14/16, 21:06, "Jeff Layton" <[email protected]> wrote:

>There are several problems in the way a stateid is selected for a
>LAYOUTGET operation:

[...]

>@@ -8060,13 +8035,17 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> 	if (IS_ERR(task))
> 		return ERR_CAST(task);
> 	status = nfs4_wait_for_completion_rpc_task(task);
>-	if (status == 0)
>-		status = task->tk_status;
>+	if (status == 0) {
>+		status = nfs4_layoutget_handle_exception(task, lgp, &exception);

This is borked… You're now returning an NFSv4 status as an ERR_PTR(), which is a big no-no!
MAX_ERRNO is 4095, while NFS4ERR_RECALLCONFLICT takes the value 10061.

[...]

>@@ -1632,24 +1630,34 @@ lookup_again:
> 	if (arg.length != NFS4_MAX_UINT64)
> 		arg.length = PAGE_ALIGN(arg.length);
> 
>-	lseg = send_layoutget(lo, ctx, &arg, gfp_flags);
>-	if (IS_ERR(lseg)) {
>-		if (lseg == ERR_PTR(-EAGAIN)) {
>+	lseg = send_layoutget(lo, ctx, &stateid, &arg, &timeout, gfp_flags);
>+	trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
>+				 PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET);
>+	if (IS_ERR_OR_NULL(lseg)) {
>+		switch(PTR_ERR(lseg)) {
>+		case -NFS4ERR_RECALLCONFLICT:

See comment above. This cannot work: the result will be that the IS_ERR_OR_NULL() will fall through, since NFS4ERR_RECALLCONFLICT > 4095.

[...]


2016-05-17 10:17:03

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 12/13] pnfs: rework LAYOUTGET retry handling

On Tue, 2016-05-17 at 02:07 +0000, Trond Myklebust wrote:
>
>
> On 5/14/16, 21:06, "Jeff Layton" <[email protected]> wrote:
>
> >There are several problems in the way a stateid is selected for a
> >LAYOUTGET operation:
> >
> >We pick a stateid to use in the RPC prepare op, but that makes
> >it difficult to serialize LAYOUTGETs that use the open stateid. That
> >serialization is done in pnfs_update_layout, which occurs well before
> >the rpc_prepare operation.
> >
> >Between those two events, the i_lock is dropped and reacquired.
> >pnfs_update_layout can find that the list has lsegs in it and not do any
> >serialization, but then later pnfs_choose_layoutget_stateid ends up
> >choosing the open stateid.
> >
> >This patch changes the client to select the stateid to use in the
> >LAYOUTGET earlier, when we're searching for a usable layout segment.
> >This way we can do it all while holding the i_lock the first time, and
> >ensure that we serialize any LAYOUTGET call that uses a non-layout
> >stateid.
> >
> >This also means a rework of how LAYOUTGET replies are handled, as we
> >must now get the latest stateid if we want to retransmit in response
> >to a retryable error.
> >
> >Most of those errors boil down to the fact that the layout state has
> >changed in some fashion. Thus, what we really want to do is to re-search
> >for a layout when it fails with a retryable error, so that we can avoid
> >reissuing the RPC at all if possible.
> >
> >While the LAYOUTGET RPC is async, the initiating thread always waits for
> >it to complete, so it's effectively synchronous anyway. Currently, when
> >we need to retry a LAYOUTGET because of an error, we drive that retry
> >via the rpc state machine.
> >
> >This means that once the call has been submitted, it runs until it
> >completes. So, we must move the error handling for this RPC out of the
> >rpc_call_done operation and into the caller.
> >
> >In order to handle errors like NFS4ERR_DELAY properly, we must also
> >pass a pointer to the sliding timeout, which is now moved to the stack
> >in pnfs_update_layout.
> >
> >The complicating errors are -NFS4ERR_RECALLCONFLICT and
> >-NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
> >up and return NULL back to the caller. So, there is some special
> >handling for those errors to ensure that the layers driving the retries
> >can handle that appropriately.
> >
> >Signed-off-by: Jeff Layton <[email protected]>
> >---
> > fs/nfs/nfs4proc.c | 111 +++++++++++++++----------------------
> > fs/nfs/nfs4trace.h | 10 +++-
> > fs/nfs/pnfs.c | 142 +++++++++++++++++++++++++-----------------------
> > fs/nfs/pnfs.h | 6 +-
> > include/linux/nfs4.h | 2 +
> > include/linux/nfs_xdr.h | 2 -
> > 6 files changed, 131 insertions(+), 142 deletions(-)
> >
> >diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> >index c0d75be8cb69..1254ed84c760 100644
> >--- a/fs/nfs/nfs4proc.c
> >+++ b/fs/nfs/nfs4proc.c
> >@@ -416,6 +416,7 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
> > case -NFS4ERR_DELAY:
> > nfs_inc_server_stats(server, NFSIOS_DELAY);
> > case -NFS4ERR_GRACE:
> >+  case -NFS4ERR_RECALLCONFLICT:
> > exception->delay = 1;
> > return 0;
> > 
> >@@ -7824,40 +7825,34 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
> > struct nfs4_layoutget *lgp = calldata;
> > struct nfs_server *server = NFS_SERVER(lgp->args.inode);
> > struct nfs4_session *session = nfs4_get_session(server);
> >- int ret;
> > 
> > dprintk("--> %s\n", __func__);
> >- /* Note the is a race here, where a CB_LAYOUTRECALL can come in
> >-  * right now covering the LAYOUTGET we are about to send.
> >-  * However, that is not so catastrophic, and there seems
> >-  * to be no way to prevent it completely.
> >-  */
> >- if (nfs41_setup_sequence(session, &lgp->args.seq_args,
> >-  &lgp->res.seq_res, task))
> >-  return;
> >- ret = pnfs_choose_layoutget_stateid(&lgp->args.stateid,
> >-  NFS_I(lgp->args.inode)->layout,
> >-  &lgp->args.range,
> >-  lgp->args.ctx->state);
> >- if (ret < 0)
> >-  rpc_exit(task, ret);
> >+ nfs41_setup_sequence(session, &lgp->args.seq_args,
> >+  &lgp->res.seq_res, task);
> >+ dprintk("<-- %s\n", __func__);
> > }
> > 
> > static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> > {
> > struct nfs4_layoutget *lgp = calldata;
> >+
> >+ dprintk("--> %s\n", __func__);
> >+ nfs41_sequence_done(task, &lgp->res.seq_res);
> >+ dprintk("<-- %s\n", __func__);
> >+}
> >+
> >+static int
> >+nfs4_layoutget_handle_exception(struct rpc_task *task,
> >+  struct nfs4_layoutget *lgp, struct nfs4_exception *exception)
> >+{
> > struct inode *inode = lgp->args.inode;
> > struct nfs_server *server = NFS_SERVER(inode);
> > struct pnfs_layout_hdr *lo;
> >- struct nfs4_state *state = NULL;
> >- unsigned long timeo, now, giveup;
> >+ int status = task->tk_status;
> > 
> > dprintk("--> %s tk_status => %d\n", __func__, -task->tk_status);
> > 
> >- if (!nfs41_sequence_done(task, &lgp->res.seq_res))
> >-  goto out;
> >-
> >- switch (task->tk_status) {
> >+ switch (status) {
> > case 0:
> > goto out;
> > 
> >@@ -7867,57 +7862,39 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> > * retry go inband.
> > */
> > case -NFS4ERR_LAYOUTUNAVAILABLE:
> >-  task->tk_status = -ENODATA;
> >+  status = -ENODATA;
> > goto out;
> > /*
> > * NFS4ERR_BADLAYOUT means the MDS cannot return a layout of
> > * length lgp->args.minlength != 0 (see RFC5661 section 18.43.3).
> > */
> > case -NFS4ERR_BADLAYOUT:
> >-  goto out_overflow;
> >+  status = -EOVERFLOW;
> >+  goto out;
> > /*
> > * NFS4ERR_LAYOUTTRYLATER is a conflict with another client
> > * (or clients) writing to the same RAID stripe except when
> > * the minlength argument is 0 (see RFC5661 section 18.43.3).
> >+  *
> >+  * Treat it like we would RECALLCONFLICT -- we retry for a little
> >+  * while, and then eventually give up.
> > */
> > case -NFS4ERR_LAYOUTTRYLATER:
> >-  if (lgp->args.minlength == 0)
> >-  goto out_overflow;
> >- /*
> >-  * NFS4ERR_RECALLCONFLICT is when conflict with self (must recall
> >-  * existing layout before getting a new one).
> >-  */
> >- case -NFS4ERR_RECALLCONFLICT:
> >-  timeo = rpc_get_timeout(task->tk_client);
> >-  giveup = lgp->args.timestamp + timeo;
> >-  now = jiffies;
> >-  if (time_after(giveup, now)) {
> >-  unsigned long delay;
> >-
> >-  /* Delay for:
> >-  * - Not less then NFS4_POLL_RETRY_MIN.
> >-  * - One last time a jiffie before we give up
> >-  * - exponential backoff (time_now minus start_attempt)
> >-  */
> >-  delay = max_t(unsigned long, NFS4_POLL_RETRY_MIN,
> >-  min((giveup - now - 1),
> >-  now - lgp->args.timestamp));
> >-
> >-  dprintk("%s: NFS4ERR_RECALLCONFLICT waiting %lu\n",
> >-  __func__, delay);
> >-  rpc_delay(task, delay);
> >-  /* Do not call nfs4_async_handle_error() */
> >-  goto out_restart;
> >+  if (lgp->args.minlength == 0) {
> >+  status = -EOVERFLOW;
> >+  goto out;
> > }
> >+  status = -NFS4ERR_RECALLCONFLICT;
> > break;
> > case -NFS4ERR_EXPIRED:
> > case -NFS4ERR_BAD_STATEID:
> >+  exception->timeout = 0;
> > spin_lock(&inode->i_lock);
> > if (nfs4_stateid_match(&lgp->args.stateid,
> > &lgp->args.ctx->state->stateid)) {
> > spin_unlock(&inode->i_lock);
> > /* If the open stateid was bad, then recover it. */
> >-  state = lgp->args.ctx->state;
> >+  exception->state = lgp->args.ctx->state;
> > break;
> > }
> > lo = NFS_I(inode)->layout;
> >@@ -7935,20 +7912,18 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> > pnfs_free_lseg_list(&head);
> > } else
> > spin_unlock(&inode->i_lock);
> >-  goto out_restart;
> >+  status = -EAGAIN;
> >+  goto out;
> > }
> >- if (nfs4_async_handle_error(task, server, state, &lgp->timeout) == -EAGAIN)
> >-  goto out_restart;
> >+
> >+ status = nfs4_handle_exception(server, status, exception);
> >+ if (status == 0)
> >+  status = task->tk_status;
> >+ if (exception->retry && status != -NFS4ERR_RECALLCONFLICT)
> >+  status = -EAGAIN;
> > out:
> > dprintk("<-- %s\n", __func__);
> >- return;
> >-out_restart:
> >- task->tk_status = 0;
> >- rpc_restart_call_prepare(task);
> >- return;
> >-out_overflow:
> >- task->tk_status = -EOVERFLOW;
> >- goto out;
> >+ return status;
> > }
> > 
> > static size_t max_response_pages(struct nfs_server *server)
> >@@ -8017,7 +7992,7 @@ static const struct rpc_call_ops nfs4_layoutget_call_ops = {
> > };
> > 
> > struct pnfs_layout_segment *
> >-nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> >+nfs4_proc_layoutget(struct nfs4_layoutget *lgp, long *timeout, gfp_t gfp_flags)
> > {
> > struct inode *inode = lgp->args.inode;
> > struct nfs_server *server = NFS_SERVER(inode);
> >@@ -8037,6 +8012,7 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > .flags = RPC_TASK_ASYNC,
> > };
> > struct pnfs_layout_segment *lseg = NULL;
> >+ struct nfs4_exception exception = { .timeout = *timeout };
> > int status = 0;
> > 
> > dprintk("--> %s\n", __func__);
> >@@ -8050,7 +8026,6 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > return ERR_PTR(-ENOMEM);
> > }
> > lgp->args.layout.pglen = max_pages * PAGE_SIZE;
> >- lgp->args.timestamp = jiffies;
> > 
> > lgp->res.layoutp = &lgp->args.layout;
> > lgp->res.seq_res.sr_slot = NULL;
> >@@ -8060,13 +8035,17 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > if (IS_ERR(task))
> > return ERR_CAST(task);
> > status = nfs4_wait_for_completion_rpc_task(task);
> >- if (status == 0)
> >-  status = task->tk_status;
> >+ if (status == 0) {
> >+  status = nfs4_layoutget_handle_exception(task, lgp, &exception);
>
> This is borked… You’re now returning an NFSv4 status as an ERR_PTR(), which is a big no-no!
> MAX_ERRNO is 4095, while NFS4ERR_RECALLCONFLICT takes the value 10061.
>

Ahh good catch!
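
For anyone skimming the thread, the problem is purely the errno range: IS_ERR() only recognizes values in [-MAX_ERRNO, -1], so ERR_PTR(-NFS4ERR_RECALLCONFLICT) is not seen as an error pointer at all. A quick userspace mock-up of the check (just an illustration that re-implements the kernel's test, not kernel code):

	#include <stdio.h>

	#define MAX_ERRNO		4095
	#define NFS4ERR_RECALLCONFLICT	10061

	/* same range test the kernel's IS_ERR() performs */
	static int is_err(const void *ptr)
	{
		return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
	}

	int main(void)
	{
		void *eagain = (void *)(long)-11;			/* ERR_PTR(-EAGAIN) */
		void *recall = (void *)(long)-NFS4ERR_RECALLCONFLICT;	/* ERR_PTR(-10061) */

		printf("-EAGAIN recognized as error pointer: %d\n", is_err(eagain));	/* prints 1 */
		printf("-10061  recognized as error pointer: %d\n", is_err(recall));	/* prints 0 */
		return 0;
	}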

Ok, I think the right fix there is probably to declare the
nfs4_exception in pnfs_update_layout and pass a pointer to it down to
this function. That way the result can be interpreted at the point
where we decide whether to retry.

I'll respin...
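
Concretely, the shape I have in mind is roughly the following (an untested sketch: threading a struct nfs4_exception * through send_layoutget()/nfs4_proc_layoutget() is a placeholder for whatever the respin ends up doing, not code from this series):

	/* in pnfs_update_layout(): the exception state lives on this stack
	 * frame, so it survives across lookup_again iterations */
	struct nfs4_exception exception = { };

lookup_again:
	/* ... */

	/* hypothetical signature: pass &exception down instead of &timeout */
	lseg = send_layoutget(lo, ctx, &stateid, &arg, &exception, gfp_flags);
	if (IS_ERR(lseg)) {
		/* nfs4proc.c only hands back small negative errnos here; the
		 * decision to re-search for a layout is made in this function,
		 * where lookup_again is visible */
		if (PTR_ERR(lseg) == -EAGAIN && !time_after(jiffies, giveup)) {
			pnfs_put_layout_hdr(lo);
			goto lookup_again;
		}
		if (!nfs_error_is_fatal(PTR_ERR(lseg)))
			lseg = NULL;
	}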

Thanks,
Jeff

> >+  *timeout = exception.timeout;
> >+ }
> >+
> > trace_nfs4_layoutget(lgp->args.ctx,
> > &lgp->args.range,
> > &lgp->res.range,
> > &lgp->res.stateid,
> > status);
> >+
> > /* if layoutp->len is 0, nfs4_layoutget_prepare called rpc_exit */
> > if (status == 0 && lgp->res.layoutp->len)
> > lseg = pnfs_layout_process(lgp);
> >diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
> >index 2c8d05dae5b1..9c150b153782 100644
> >--- a/fs/nfs/nfs4trace.h
> >+++ b/fs/nfs/nfs4trace.h
> >@@ -1520,6 +1520,8 @@ DEFINE_NFS4_INODE_EVENT(nfs4_layoutreturn_on_close);
> > { PNFS_UPDATE_LAYOUT_FOUND_CACHED, "found cached" }, \
> > { PNFS_UPDATE_LAYOUT_RETURN, "layoutreturn" },  \
> > { PNFS_UPDATE_LAYOUT_BLOCKED, "layouts blocked" }, \
> >+  { PNFS_UPDATE_LAYOUT_INVALID_OPEN, "invalid open" }, \
> >+  { PNFS_UPDATE_LAYOUT_RETRY, "retrying" }, \
> > { PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET, "sent layoutget" })
> > 
> > TRACE_EVENT(pnfs_update_layout,
> >@@ -1528,9 +1530,10 @@ TRACE_EVENT(pnfs_update_layout,
> > u64 count,
> > enum pnfs_iomode iomode,
> > struct pnfs_layout_hdr *lo,
> >+  struct pnfs_layout_segment *lseg,
> > enum pnfs_update_layout_reason reason
> > ),
> >-  TP_ARGS(inode, pos, count, iomode, lo, reason),
> >+  TP_ARGS(inode, pos, count, iomode, lo, lseg, reason),
> > TP_STRUCT__entry(
> > __field(dev_t, dev)
> > __field(u64, fileid)
> >@@ -1540,6 +1543,7 @@ TRACE_EVENT(pnfs_update_layout,
> > __field(enum pnfs_iomode, iomode)
> > __field(int, layoutstateid_seq)
> > __field(u32, layoutstateid_hash)
> >+  __field(long, lseg)
> > __field(enum pnfs_update_layout_reason, reason)
> > ),
> > TP_fast_assign(
> >@@ -1559,11 +1563,12 @@ TRACE_EVENT(pnfs_update_layout,
> > __entry->layoutstateid_seq = 0;
> > __entry->layoutstateid_hash = 0;
> > }
> >+  __entry->lseg = (long)lseg;
> > ),
> > TP_printk(
> > "fileid=%02x:%02x:%llu fhandle=0x%08x "
> > "iomode=%s pos=%llu count=%llu "
> >-  "layoutstateid=%d:0x%08x (%s)",
> >+  "layoutstateid=%d:0x%08x lseg=0x%lx (%s)",
> > MAJOR(__entry->dev), MINOR(__entry->dev),
> > (unsigned long long)__entry->fileid,
> > __entry->fhandle,
> >@@ -1571,6 +1576,7 @@ TRACE_EVENT(pnfs_update_layout,
> > (unsigned long long)__entry->pos,
> > (unsigned long long)__entry->count,
> > __entry->layoutstateid_seq, __entry->layoutstateid_hash,
> >+  __entry->lseg,
> > show_pnfs_update_layout_reason(__entry->reason)
> > )
> > );
> >diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> >index 5a8c19c57f16..d0760d30734d 100644
> >--- a/fs/nfs/pnfs.c
> >+++ b/fs/nfs/pnfs.c
> >@@ -796,45 +796,18 @@ pnfs_layoutgets_blocked(const struct pnfs_layout_hdr *lo)
> > test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags);
> > }
> > 
> >-int
> >-pnfs_choose_layoutget_stateid(nfs4_stateid *dst, struct pnfs_layout_hdr *lo,
> >-  const struct pnfs_layout_range *range,
> >-  struct nfs4_state *open_state)
> >-{
> >- int status = 0;
> >-
> >- dprintk("--> %s\n", __func__);
> >- spin_lock(&lo->plh_inode->i_lock);
> >- if (pnfs_layoutgets_blocked(lo)) {
> >-  status = -EAGAIN;
> >- } else if (!nfs4_valid_open_stateid(open_state)) {
> >-  status = -EBADF;
> >- } else if (list_empty(&lo->plh_segs) ||
> >-  test_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags)) {
> >-  int seq;
> >-
> >-  do {
> >-  seq = read_seqbegin(&open_state->seqlock);
> >-  nfs4_stateid_copy(dst, &open_state->stateid);
> >-  } while (read_seqretry(&open_state->seqlock, seq));
> >- } else
> >-  nfs4_stateid_copy(dst, &lo->plh_stateid);
> >- spin_unlock(&lo->plh_inode->i_lock);
> >- dprintk("<-- %s\n", __func__);
> >- return status;
> >-}
> >-
> > /*
> >-* Get layout from server.
> >-* for now, assume that whole file layouts are requested.
> >-* arg->offset: 0
> >-* arg->length: all ones
> >-*/
> >+ * Get layout from server.
> >+ * for now, assume that whole file layouts are requested.
> >+ * arg->offset: 0
> >+ * arg->length: all ones
> >+ */
> > static struct pnfs_layout_segment *
> > send_layoutget(struct pnfs_layout_hdr *lo,
> > struct nfs_open_context *ctx,
> >+  nfs4_stateid *stateid,
> > const struct pnfs_layout_range *range,
> >-  gfp_t gfp_flags)
> >+  long *timeout, gfp_t gfp_flags)
> > {
> > struct inode *ino = lo->plh_inode;
> > struct nfs_server *server = NFS_SERVER(ino);
> >@@ -868,10 +841,11 @@ send_layoutget(struct pnfs_layout_hdr *lo,
> > lgp->args.type = server->pnfs_curr_ld->id;
> > lgp->args.inode = ino;
> > lgp->args.ctx = get_nfs_open_context(ctx);
> >+ nfs4_stateid_copy(&lgp->args.stateid, stateid);
> > lgp->gfp_flags = gfp_flags;
> > lgp->cred = lo->plh_lc_cred;
> > 
> >- return nfs4_proc_layoutget(lgp, gfp_flags);
> >+ return nfs4_proc_layoutget(lgp, timeout, gfp_flags);
> > }
> > 
> > static void pnfs_clear_layoutcommit(struct inode *inode,
> >@@ -1511,27 +1485,30 @@ pnfs_update_layout(struct inode *ino,
> > .offset = pos,
> > .length = count,
> > };
> >- unsigned pg_offset;
> >+ unsigned pg_offset, seq;
> > struct nfs_server *server = NFS_SERVER(ino);
> > struct nfs_client *clp = server->nfs_client;
> >- struct pnfs_layout_hdr *lo;
> >+ struct pnfs_layout_hdr *lo = NULL;
> > struct pnfs_layout_segment *lseg = NULL;
> >+ nfs4_stateid stateid;
> >+ long timeout = 0;
> >+ unsigned long giveup = jiffies + rpc_get_timeout(server->client);
> > bool first;
> > 
> > if (!pnfs_enabled_sb(NFS_SERVER(ino))) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_NO_PNFS);
> > goto out;
> > }
> > 
> > if (iomode == IOMODE_READ && i_size_read(ino) == 0) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_RD_ZEROLEN);
> > goto out;
> > }
> > 
> > if (pnfs_within_mdsthreshold(ctx, ino, iomode)) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_MDSTHRESH);
> > goto out;
> > }
> >@@ -1542,14 +1519,14 @@ lookup_again:
> > lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
> > if (lo == NULL) {
> > spin_unlock(&ino->i_lock);
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, NULL,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_NOMEM);
> > goto out;
> > }
> > 
> > /* Do we even need to bother with this? */
> > if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_BULK_RECALL);
> > dprintk("%s matches recall, use MDS\n", __func__);
> > goto out_unlock;
> >@@ -1557,14 +1534,34 @@ lookup_again:
> > 
> > /* if LAYOUTGET already failed once we don't try again */
> > if (pnfs_layout_io_test_failed(lo, iomode)) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_IO_TEST_FAIL);
> > goto out_unlock;
> > }
> > 
> >- first = list_empty(&lo->plh_segs);
> >- if (first) {
> >-  /* The first layoutget for the file. Need to serialize per
> >+ lseg = pnfs_find_lseg(lo, &arg);
> >+ if (lseg) {
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> >+  PNFS_UPDATE_LAYOUT_FOUND_CACHED);
> >+  goto out_unlock;
> >+ }
> >+
> >+ if (!nfs4_valid_open_stateid(ctx->state)) {
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> >+  PNFS_UPDATE_LAYOUT_INVALID_OPEN);
> >+  goto out_unlock;
> >+ }
> >+
> >+ /*
> >+  * Choose a stateid for the LAYOUTGET. If we don't have a layout
> >+  * stateid, or it has been invalidated, then we must use the open
> >+  * stateid.
> >+  */
> >+ if (lo->plh_stateid.seqid == 0 ||
> >+  test_bit(NFS_LAYOUT_INVALID_STID, &lo->plh_flags)) {
> >+
> >+  /*
> >+  * The first layoutget for the file. Need to serialize per
> > * RFC 5661 Errata 3208.
> > */
> > if (test_and_set_bit(NFS_LAYOUT_FIRST_LAYOUTGET,
> >@@ -1573,18 +1570,17 @@ lookup_again:
> > wait_on_bit(&lo->plh_flags, NFS_LAYOUT_FIRST_LAYOUTGET,
> > TASK_UNINTERRUPTIBLE);
> > pnfs_put_layout_hdr(lo);
> >+  dprintk("%s retrying\n", __func__);
> > goto lookup_again;
> > }
> >+
> >+  first = true;
> >+  do {
> >+  seq = read_seqbegin(&ctx->state->seqlock);
> >+  nfs4_stateid_copy(&stateid, &ctx->state->stateid);
> >+  } while (read_seqretry(&ctx->state->seqlock, seq));
> > } else {
> >-  /* Check to see if the layout for the given range
> >-  * already exists
> >-  */
> >-  lseg = pnfs_find_lseg(lo, &arg);
> >-  if (lseg) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >-  PNFS_UPDATE_LAYOUT_FOUND_CACHED);
> >-  goto out_unlock;
> >-  }
> >+  nfs4_stateid_copy(&stateid, &lo->plh_stateid);
> > }
> > 
> > /*
> >@@ -1599,15 +1595,17 @@ lookup_again:
> > pnfs_clear_first_layoutget(lo);
> > pnfs_put_layout_hdr(lo);
> > dprintk("%s retrying\n", __func__);
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >+  lseg, PNFS_UPDATE_LAYOUT_RETRY);
> > goto lookup_again;
> > }
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_RETURN);
> > goto out_put_layout_hdr;
> > }
> > 
> > if (pnfs_layoutgets_blocked(lo)) {
> >-  trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >+  trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> > PNFS_UPDATE_LAYOUT_BLOCKED);
> > goto out_unlock;
> > }
> >@@ -1632,24 +1630,34 @@ lookup_again:
> > if (arg.length != NFS4_MAX_UINT64)
> > arg.length = PAGE_ALIGN(arg.length);
> > 
> >- lseg = send_layoutget(lo, ctx, &arg, gfp_flags);
> >- if (IS_ERR(lseg)) {
> >-  if (lseg == ERR_PTR(-EAGAIN)) {
> >+ lseg = send_layoutget(lo, ctx, &stateid, &arg, &timeout, gfp_flags);
> >+ trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
> >+  PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET);
> >+ if (IS_ERR_OR_NULL(lseg)) {
> >+  switch(PTR_ERR(lseg)) {
> >+  case -NFS4ERR_RECALLCONFLICT:
>
> See comment above. This cannot work: the result will be that the IS_ERR_OR_NULL() will fall through, since NFS4ERR_RECALLCONFLICT > 4095.
>
> >+  if (time_after(jiffies, giveup))
> >+  lseg = NULL;
> >+  /* Fallthrough */
> >+  case -EAGAIN:
> >+  pnfs_put_layout_hdr(lo);
> > if (first)
> > pnfs_clear_first_layoutget(lo);
> >-  pnfs_put_layout_hdr(lo);
> >-  goto lookup_again;
> >+  if (lseg) {
> >+  trace_pnfs_update_layout(ino, pos, count,
> >+  iomode, lo, lseg, PNFS_UPDATE_LAYOUT_RETRY);
> >+  goto lookup_again;
> >+  }
> >+  break;
> >+  default:
> >+  if (!nfs_error_is_fatal(PTR_ERR(lseg)))
> >+  lseg = NULL;
> > }
> >-
> >-  if (!nfs_error_is_fatal(PTR_ERR(lseg)))
> >-  lseg = NULL;
> > } else {
> > pnfs_layout_clear_fail_bit(lo, pnfs_iomode_to_fail_bit(iomode));
> > }
> > 
> > atomic_dec(&lo->plh_outstanding);
> >- trace_pnfs_update_layout(ino, pos, count, iomode, lo,
> >-  PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET);
> > out_put_layout_hdr:
> > if (first)
> > pnfs_clear_first_layoutget(lo);
> >diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
> >index 971068b58647..f9f3331bef49 100644
> >--- a/fs/nfs/pnfs.h
> >+++ b/fs/nfs/pnfs.h
> >@@ -228,7 +228,7 @@ extern void pnfs_unregister_layoutdriver(struct pnfs_layoutdriver_type *);
> > extern int nfs4_proc_getdeviceinfo(struct nfs_server *server,
> > struct pnfs_device *dev,
> > struct rpc_cred *cred);
> >-extern struct pnfs_layout_segment* nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags);
> >+extern struct pnfs_layout_segment* nfs4_proc_layoutget(struct nfs4_layoutget *lgp, long *timeout, gfp_t gfp_flags);
> > extern int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp, bool sync);
> > 
> > /* pnfs.c */
> >@@ -260,10 +260,6 @@ void pnfs_put_layout_hdr(struct pnfs_layout_hdr *lo);
> > void pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo,
> > const nfs4_stateid *new,
> > bool update_barrier);
> >-int pnfs_choose_layoutget_stateid(nfs4_stateid *dst,
> >-  struct pnfs_layout_hdr *lo,
> >-  const struct pnfs_layout_range *range,
> >-  struct nfs4_state *open_state);
> > int pnfs_mark_matching_lsegs_invalid(struct pnfs_layout_hdr *lo,
> > struct list_head *tmp_list,
> > const struct pnfs_layout_range *recall_range,
> >diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
> >index 011433478a14..f4870a330290 100644
> >--- a/include/linux/nfs4.h
> >+++ b/include/linux/nfs4.h
> >@@ -621,7 +621,9 @@ enum pnfs_update_layout_reason {
> > PNFS_UPDATE_LAYOUT_IO_TEST_FAIL,
> > PNFS_UPDATE_LAYOUT_FOUND_CACHED,
> > PNFS_UPDATE_LAYOUT_RETURN,
> >+ PNFS_UPDATE_LAYOUT_RETRY,
> > PNFS_UPDATE_LAYOUT_BLOCKED,
> >+ PNFS_UPDATE_LAYOUT_INVALID_OPEN,
> > PNFS_UPDATE_LAYOUT_SEND_LAYOUTGET,
> > };
> > 
> >diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> >index cb9982d8f38f..a4cb8a33ae2c 100644
> >--- a/include/linux/nfs_xdr.h
> >+++ b/include/linux/nfs_xdr.h
> >@@ -233,7 +233,6 @@ struct nfs4_layoutget_args {
> > struct inode *inode;
> > struct nfs_open_context *ctx;
> > nfs4_stateid stateid;
> >- unsigned long timestamp;
> > struct nfs4_layoutdriver_data layout;
> > };
> > 
> >@@ -251,7 +250,6 @@ struct nfs4_layoutget {
> > struct nfs4_layoutget_res res;
> > struct rpc_cred *cred;
> > gfp_t gfp_flags;
> >- long timeout;
> > };
> > 
> > struct nfs4_getdeviceinfo_args {
> >-- 
> >2.5.5
> >
>
>

--
Jeff Layton <[email protected]>

2016-05-17 12:19:46

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 12/13] pnfs: rework LAYOUTGET retry handling

On Tue, 2016-05-17 at 06:16 -0400, Jeff Layton wrote:
> On Tue, 2016-05-17 at 02:07 +0000, Trond Myklebust wrote:
> >
> >
> > On 5/14/16, 21:06, "Jeff Layton" <[email protected]> wrote:
> >
> > > There are several problems in the way a stateid is selected for a
> > > LAYOUTGET operation:
> > >
> > > We pick a stateid to use in the RPC prepare op, but that makes
> > > it difficult to serialize LAYOUTGETs that use the open stateid. That
> > > serialization is done in pnfs_update_layout, which occurs well before
> > > the rpc_prepare operation.
> > >
> > > Between those two events, the i_lock is dropped and reacquired.
> > > pnfs_update_layout can find that the list has lsegs in it and not do any
> > > serialization, but then later pnfs_choose_layoutget_stateid ends up
> > > choosing the open stateid.
> > >
> > > This patch changes the client to select the stateid to use in the
> > > LAYOUTGET earlier, when we're searching for a usable layout segment.
> > > This way we can do it all while holding the i_lock the first time, and
> > > ensure that we serialize any LAYOUTGET call that uses a non-layout
> > > stateid.
> > >
> > > This also means a rework of how LAYOUTGET replies are handled, as we
> > > must now get the latest stateid if we want to retransmit in response
> > > to a retryable error.
> > >
> > > Most of those errors boil down to the fact that the layout state has
> > > changed in some fashion. Thus, what we really want to do is to re-search
> > > for a layout when it fails with a retryable error, so that we can avoid
> > > reissuing the RPC at all if possible.
> > >
> > > While the LAYOUTGET RPC is async, the initiating thread always waits for
> > > it to complete, so it's effectively synchronous anyway. Currently, when
> > > we need to retry a LAYOUTGET because of an error, we drive that retry
> > > via the rpc state machine.
> > >
> > > This means that once the call has been submitted, it runs until it
> > > completes. So, we must move the error handling for this RPC out of the
> > > rpc_call_done operation and into the caller.
> > >
> > > In order to handle errors like NFS4ERR_DELAY properly, we must also
> > > pass a pointer to the sliding timeout, which is now moved to the stack
> > > in pnfs_update_layout.
> > >
> > > The complicating errors are -NFS4ERR_RECALLCONFLICT and
> > > -NFS4ERR_LAYOUTTRYLATER, as those involve a timeout after which we give
> > > up and return NULL back to the caller. So, there is some special
> > > handling for those errors to ensure that the layers driving the retries
> > > can handle that appropriately.
> > >
> > > Signed-off-by: Jeff Layton <[email protected]>
> > > ---
> > > fs/nfs/nfs4proc.c | 111 +++++++++++++++----------------------
> > > fs/nfs/nfs4trace.h | 10 +++-
> > > fs/nfs/pnfs.c | 142 +++++++++++++++++++++++++-----------------------
> > > fs/nfs/pnfs.h | 6 +-
> > > include/linux/nfs4.h | 2 +
> > > include/linux/nfs_xdr.h | 2 -
> > > 6 files changed, 131 insertions(+), 142 deletions(-)
> > >
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index c0d75be8cb69..1254ed84c760 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -416,6 +416,7 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
> > > case -NFS4ERR_DELAY:
> > > nfs_inc_server_stats(server, NFSIOS_DELAY);
> > > case -NFS4ERR_GRACE:
> > > +  case -NFS4ERR_RECALLCONFLICT:
> > > exception->delay = 1;
> > > return 0;
> > >  
> > > @@ -7824,40 +7825,34 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
> > > struct nfs4_layoutget *lgp = calldata;
> > > struct nfs_server *server = NFS_SERVER(lgp->args.inode);
> > > struct nfs4_session *session = nfs4_get_session(server);
> > > - int ret;
> > >  
> > > dprintk("--> %s\n", __func__);
> > > - /* Note the is a race here, where a CB_LAYOUTRECALL can come in
> > > -  * right now covering the LAYOUTGET we are about to send.
> > > -  * However, that is not so catastrophic, and there seems
> > > -  * to be no way to prevent it completely.
> > > -  */
> > > - if (nfs41_setup_sequence(session, &lgp->args.seq_args,
> > > -  &lgp->res.seq_res, task))
> > > -  return;
> > > - ret = pnfs_choose_layoutget_stateid(&lgp->args.stateid,
> > > -  NFS_I(lgp->args.inode)->layout,
> > > -  &lgp->args.range,
> > > -  lgp->args.ctx->state);
> > > - if (ret < 0)
> > > -  rpc_exit(task, ret);
> > > + nfs41_setup_sequence(session, &lgp->args.seq_args,
> > > +  &lgp->res.seq_res, task);
> > > + dprintk("<-- %s\n", __func__);
> > > }
> > >  
> > > static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> > > {
> > > struct nfs4_layoutget *lgp = calldata;
> > > +
> > > + dprintk("--> %s\n", __func__);
> > > + nfs41_sequence_done(task, &lgp->res.seq_res);
> > > + dprintk("<-- %s\n", __func__);
> > > +}
> > > +
> > > +static int
> > > +nfs4_layoutget_handle_exception(struct rpc_task *task,
> > > +  struct nfs4_layoutget *lgp, struct nfs4_exception *exception)
> > > +{
> > > struct inode *inode = lgp->args.inode;
> > > struct nfs_server *server = NFS_SERVER(inode);
> > > struct pnfs_layout_hdr *lo;
> > > - struct nfs4_state *state = NULL;
> > > - unsigned long timeo, now, giveup;
> > > + int status = task->tk_status;
> > >  
> > > dprintk("--> %s tk_status => %d\n", __func__, -task->tk_status);
> > >  
> > > - if (!nfs41_sequence_done(task, &lgp->res.seq_res))
> > > -  goto out;
> > > -
> > > - switch (task->tk_status) {
> > > + switch (status) {
> > > case 0:
> > > goto out;
> > >  
> > > @@ -7867,57 +7862,39 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> > > * retry go inband.
> > > */
> > > case -NFS4ERR_LAYOUTUNAVAILABLE:
> > > -  task->tk_status = -ENODATA;
> > > +  status = -ENODATA;
> > > goto out;
> > > /*
> > > * NFS4ERR_BADLAYOUT means the MDS cannot return a layout of
> > > * length lgp->args.minlength != 0 (see RFC5661 section 18.43.3).
> > > */
> > > case -NFS4ERR_BADLAYOUT:
> > > -  goto out_overflow;
> > > +  status = -EOVERFLOW;
> > > +  goto out;
> > > /*
> > > * NFS4ERR_LAYOUTTRYLATER is a conflict with another client
> > > * (or clients) writing to the same RAID stripe except when
> > > * the minlength argument is 0 (see RFC5661 section 18.43.3).
> > > +  *
> > > +  * Treat it like we would RECALLCONFLICT -- we retry for a little
> > > +  * while, and then eventually give up.
> > > */
> > > case -NFS4ERR_LAYOUTTRYLATER:
> > > -  if (lgp->args.minlength == 0)
> > > -  goto out_overflow;
> > > - /*
> > > -  * NFS4ERR_RECALLCONFLICT is when conflict with self (must recall
> > > -  * existing layout before getting a new one).
> > > -  */
> > > - case -NFS4ERR_RECALLCONFLICT:
> > > -  timeo = rpc_get_timeout(task->tk_client);
> > > -  giveup = lgp->args.timestamp + timeo;
> > > -  now = jiffies;
> > > -  if (time_after(giveup, now)) {
> > > -  unsigned long delay;
> > > -
> > > -  /* Delay for:
> > > -  * - Not less then NFS4_POLL_RETRY_MIN.
> > > -  * - One last time a jiffie before we give up
> > > -  * - exponential backoff (time_now minus start_attempt)
> > > -  */
> > > -  delay = max_t(unsigned long, NFS4_POLL_RETRY_MIN,
> > > -  min((giveup - now - 1),
> > > -  now - lgp->args.timestamp));
> > > -
> > > -  dprintk("%s: NFS4ERR_RECALLCONFLICT waiting %lu\n",
> > > -  __func__, delay);
> > > -  rpc_delay(task, delay);
> > > -  /* Do not call nfs4_async_handle_error() */
> > > -  goto out_restart;
> > > +  if (lgp->args.minlength == 0) {
> > > +  status = -EOVERFLOW;
> > > +  goto out;
> > > }
> > > +  status = -NFS4ERR_RECALLCONFLICT;
> > > break;
> > > case -NFS4ERR_EXPIRED:
> > > case -NFS4ERR_BAD_STATEID:
> > > +  exception->timeout = 0;
> > > spin_lock(&inode->i_lock);
> > > if (nfs4_stateid_match(&lgp->args.stateid,
> > > &lgp->args.ctx->state->stateid)) {
> > > spin_unlock(&inode->i_lock);
> > > /* If the open stateid was bad, then recover it. */
> > > -  state = lgp->args.ctx->state;
> > > +  exception->state = lgp->args.ctx->state;
> > > break;
> > > }
> > > lo = NFS_I(inode)->layout;
> > > @@ -7935,20 +7912,18 @@ static void nfs4_layoutget_done(struct rpc_task *task, void *calldata)
> > > pnfs_free_lseg_list(&head);
> > > } else
> > > spin_unlock(&inode->i_lock);
> > > -  goto out_restart;
> > > +  status = -EAGAIN;
> > > +  goto out;
> > > }
> > > - if (nfs4_async_handle_error(task, server, state, &lgp->timeout) == -EAGAIN)
> > > -  goto out_restart;
> > > +
> > > + status = nfs4_handle_exception(server, status, exception);
> > > + if (status == 0)
> > > +  status = task->tk_status;
> > > + if (exception->retry && status != -NFS4ERR_RECALLCONFLICT)
> > > +  status = -EAGAIN;
> > > out:
> > > dprintk("<-- %s\n", __func__);
> > > - return;
> > > -out_restart:
> > > - task->tk_status = 0;
> > > - rpc_restart_call_prepare(task);
> > > - return;
> > > -out_overflow:
> > > - task->tk_status = -EOVERFLOW;
> > > - goto out;
> > > + return status;
> > > }
> > >  
> > > static size_t max_response_pages(struct nfs_server *server)
> > > @@ -8017,7 +7992,7 @@ static const struct rpc_call_ops nfs4_layoutget_call_ops = {
> > > };
> > >  
> > > struct pnfs_layout_segment *
> > > -nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > > +nfs4_proc_layoutget(struct nfs4_layoutget *lgp, long *timeout, gfp_t gfp_flags)
> > > {
> > > struct inode *inode = lgp->args.inode;
> > > struct nfs_server *server = NFS_SERVER(inode);
> > > @@ -8037,6 +8012,7 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > > .flags = RPC_TASK_ASYNC,
> > > };
> > > struct pnfs_layout_segment *lseg = NULL;
> > > + struct nfs4_exception exception = { .timeout = *timeout };
> > > int status = 0;
> > >  
> > > dprintk("--> %s\n", __func__);
> > > @@ -8050,7 +8026,6 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > > return ERR_PTR(-ENOMEM);
> > > }
> > > lgp->args.layout.pglen = max_pages * PAGE_SIZE;
> > > - lgp->args.timestamp = jiffies;
> > >  
> > > lgp->res.layoutp = &lgp->args.layout;
> > > lgp->res.seq_res.sr_slot = NULL;
> > > @@ -8060,13 +8035,17 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
> > > if (IS_ERR(task))
> > > return ERR_CAST(task);
> > > status = nfs4_wait_for_completion_rpc_task(task);
> > > - if (status == 0)
> > > -  status = task->tk_status;
> > > + if (status == 0) {
> > > +  status = nfs4_layoutget_handle_exception(task, lgp, &exception);
> >
> > This is borked… You’re now returning an NFSv4 status as an ERR_PTR(), which is a big no-no!
> > MAX_ERRNO is 4095, while NFS4ERR_RECALLCONFLICT takes the value 10061.
> >
>
> Ahh good catch!
>
> Ok, I think the right fix there is probably to declare the
> nfs4_exception in pnfs_update_layout and then pass a pointer to it
> down to this function. Then it can interpret the result at the point
> where we decide to retry.
>
> I'll respin...
>
> Thanks,
> Jeff
>
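
To make that concrete: NFS4ERR_RECALLCONFLICT is 10061, well above
MAX_ERRNO (4095), so an ERR_PTR() carrying it is invisible to
IS_ERR()/IS_ERR_OR_NULL(). Here's a tiny userspace demo of the problem
(this just mimics the kernel's ERR_PTR()/IS_ERR() macros from
include/linux/err.h; it is not kernel code):

#include <stdio.h>

/* mimic include/linux/err.h */
#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline int IS_ERR(const void *ptr)
{
	return IS_ERR_VALUE((unsigned long)ptr);
}

/* value per RFC5661 / include/linux/nfs4.h */
#define NFS4ERR_RECALLCONFLICT	10061

int main(void)
{
	void *lseg = ERR_PTR(-NFS4ERR_RECALLCONFLICT);

	/* Prints "IS_ERR(lseg) = 0": -10061 is outside the top 4095
	 * addresses reserved for error pointers, so IS_ERR_OR_NULL()
	 * in pnfs_update_layout would treat this as a valid lseg. */
	printf("IS_ERR(lseg) = %d\n", IS_ERR(lseg));
	return 0;
}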

Ok, an incremental patch on top of the series is attached. I ended up
not dealing with the exception at higher layers; instead I declared a
new "ERECALLCONFLICT" error code and use it to indicate the special
handling those errors require. I plan to fold this into the patch that
introduced the problem before I resend.
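
The attachment isn't inlined here, but a rough sketch of the shape
described above might look like the following. This is illustrative
only, not the attached patch: it assumes ERECALLCONFLICT is defined
somewhere client-private (e.g. fs/nfs/pnfs.h) with a value that fits
under MAX_ERRNO, and the surrounding code is abbreviated.

/* nfs4_layoutget_handle_exception(): translate the raw NFSv4 status
 * into an errno-range value so it survives the ERR_PTR()/IS_ERR()
 * round trip back up to the caller: */
	case -NFS4ERR_LAYOUTTRYLATER:
		if (lgp->args.minlength == 0) {
			status = -EOVERFLOW;
			goto out;
		}
		/* Fallthrough: treat like a recall conflict */
	case -NFS4ERR_RECALLCONFLICT:
		status = -ERECALLCONFLICT;
		break;

/* pnfs_update_layout(): the switch on PTR_ERR(lseg) can now actually
 * see the recall-conflict case and apply the retry-until-giveup logic: */
	if (IS_ERR_OR_NULL(lseg)) {
		switch (PTR_ERR(lseg)) {
		case -ERECALLCONFLICT:
			if (time_after(jiffies, giveup))
				lseg = NULL;
			/* Fallthrough */
		case -EAGAIN:
			/* ...retry via lookup_again, as in the v3 patch... */
			break;
		default:
			if (!nfs_error_is_fatal(PTR_ERR(lseg)))
				lseg = NULL;
		}
	}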

--
Jeff Layton <[email protected]>


Attachments:
0001-pnfs-rework-LAYOUTGET-error-handling.patch (2.36 kB)