2022-05-14 14:51:42

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 2/5] NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout

From: Trond Myklebust <[email protected]>

Commit 587f03deb69b caused pnfs_update_layout() to stop returning ENOMEM
when the memory allocation fails, and hence causes it to fall back to
trying to do I/O through the MDS. There is no guarantee that this will
fare any better. If we're failing the pNFS layout allocation, then we
should just redirty the page and retry later.

Reported-by: Olga Kornievskaia <[email protected]>
Fixes: 587f03deb69b ("pnfs: refactor send_layoutget")
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/pnfs.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 856c962273c7..68a87be3e6f9 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2000,6 +2000,7 @@ pnfs_update_layout(struct inode *ino,
lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
if (lo == NULL) {
spin_unlock(&ino->i_lock);
+ lseg = ERR_PTR(-ENOMEM);
trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
PNFS_UPDATE_LAYOUT_NOMEM);
goto out;
@@ -2128,6 +2129,7 @@ pnfs_update_layout(struct inode *ino,

lgp = pnfs_alloc_init_layoutget_args(ino, ctx, &stateid, &arg, gfp_flags);
if (!lgp) {
+ lseg = ERR_PTR(-ENOMEM);
trace_pnfs_update_layout(ino, pos, count, iomode, lo, NULL,
PNFS_UPDATE_LAYOUT_NOMEM);
nfs_layoutget_end(lo);
--
2.36.1



2022-05-14 16:27:39

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 3/5] NFS: Further fixes to the writeback error handling

From: Trond Myklebust <[email protected]>

When we handle an error by redirtying the page, we're not corrupting the
mapping, so we don't want the error to be recorded in the mapping.
If the caller has specified a sync_mode of WB_SYNC_NONE, we can just
return AOP_WRITEPAGE_ACTIVATE. However if we're dealing with
WB_SYNC_ALL, we need to ensure that retries happen when the errors are
non-fatal.

Reported-by: Olga Kornievskaia <[email protected]>
Fixes: 8fc75bed96bb ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()")
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/write.c | 39 ++++++++++++++++++---------------------
1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index f00d45cf80ef..a8eb348947a6 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -603,8 +603,9 @@ static void nfs_write_error(struct nfs_page *req, int error)
* Find an associated nfs write request, and prepare to flush it out
* May return an error if the user signalled nfs_wait_on_request().
*/
-static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
- struct page *page)
+static int nfs_page_async_flush(struct page *page,
+ struct writeback_control *wbc,
+ struct nfs_pageio_descriptor *pgio)
{
struct nfs_page *req;
int ret = 0;
@@ -630,11 +631,11 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
/*
* Remove the problematic req upon fatal errors on the server
*/
- if (nfs_error_is_fatal(ret)) {
- if (nfs_error_is_fatal_on_server(ret))
- goto out_launder;
- } else
- ret = -EAGAIN;
+ if (nfs_error_is_fatal_on_server(ret))
+ goto out_launder;
+ if (wbc->sync_mode == WB_SYNC_NONE)
+ ret = AOP_WRITEPAGE_ACTIVATE;
+ redirty_page_for_writepage(wbc, page);
nfs_redirty_request(req);
pgio->pg_error = 0;
} else
@@ -650,15 +651,8 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
static int nfs_do_writepage(struct page *page, struct writeback_control *wbc,
struct nfs_pageio_descriptor *pgio)
{
- int ret;
-
nfs_pageio_cond_complete(pgio, page_index(page));
- ret = nfs_page_async_flush(pgio, page);
- if (ret == -EAGAIN) {
- redirty_page_for_writepage(wbc, page);
- ret = AOP_WRITEPAGE_ACTIVATE;
- }
- return ret;
+ return nfs_page_async_flush(page, wbc, pgio);
}

/*
@@ -737,12 +731,15 @@ int nfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
priority = wb_priority(wbc);
}

- nfs_pageio_init_write(&pgio, inode, priority, false,
- &nfs_async_write_completion_ops);
- pgio.pg_io_completion = ioc;
- err = write_cache_pages(mapping, wbc, nfs_writepages_callback, &pgio);
- pgio.pg_error = 0;
- nfs_pageio_complete(&pgio);
+ do {
+ nfs_pageio_init_write(&pgio, inode, priority, false,
+ &nfs_async_write_completion_ops);
+ pgio.pg_io_completion = ioc;
+ err = write_cache_pages(mapping, wbc, nfs_writepages_callback,
+ &pgio);
+ pgio.pg_error = 0;
+ nfs_pageio_complete(&pgio);
+ } while (err < 0 && !nfs_error_is_fatal(err));
nfs_io_completion_put(ioc);

if (err < 0)
--
2.36.1


2022-05-14 21:58:30

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 4/5] pNFS/files: Fall back to I/O through the MDS on non-fatal layout errors

From: Trond Myklebust <[email protected]>

Only report the error when the server is returning a fatal error, such
as ESTALE, EIO, etc...

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/filelayout/filelayout.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c
index 76deddab0a8f..2b2661582bbe 100644
--- a/fs/nfs/filelayout/filelayout.c
+++ b/fs/nfs/filelayout/filelayout.c
@@ -839,7 +839,12 @@ fl_pnfs_update_layout(struct inode *ino,

lseg = pnfs_update_layout(ino, ctx, pos, count, iomode, strict_iomode,
gfp_flags);
- if (IS_ERR_OR_NULL(lseg))
+ if (IS_ERR(lseg)) {
+ /* Fall back to MDS on recoverable errors */
+ if (!nfs_error_is_fatal_on_server(PTR_ERR(lseg)))
+ lseg = NULL;
+ goto out;
+ } else if (!lseg)
goto out;

lo = NFS_I(ino)->layout;
--
2.36.1


2022-05-15 10:07:07

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 5/5] NFSv4: Don't hold the layoutget locks across multiple RPC calls

From: Trond Myklebust <[email protected]>

When doing layoutget as part of the open() compound, we have to be
careful to release the layout locks before we can call any further RPC
calls, such as setattr(). The reason is that those calls could trigger
a recall, which could deadlock.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/nfs4proc.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a79f66432bd3..bf3ba541b9fb 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3098,6 +3098,10 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
}

out:
+ if (opendata->lgp) {
+ nfs4_lgopen_release(opendata->lgp);
+ opendata->lgp = NULL;
+ }
if (!opendata->cancelled)
nfs4_sequence_free_slot(&opendata->o_res.seq_res);
return ret;
--
2.36.1