LinuxLists.cc - [PATCH] NFS: Always return the error that truncates a flushing page

2019-01-27 15:19:50

Subject: [PATCH] NFS: Always return the error that truncates a flushing page

We can't have nfs_wb_page() truncate the page from the mapping if there's
an error on the context without returning that error, because we may be in
nfs_updatepage() holding the page and trying to update the request. Not
having any error returned means we'll proceed to create a new request and
dereference the truncated page->mapping.

If we're going to remove the page, always return the error that signaled us
to do so in nfs_page_async_flush().

Fixes: c373fff7bd25 ("NFSv4: Don't special case "launder"")
Cc: [email protected] # v4.11
Signed-off-by: Benjamin Coddington <[email protected]>
---
fs/nfs/write.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 5a0bbf917a32..c274339176cc 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -622,9 +622,11 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
WARN_ON_ONCE(test_bit(PG_CLEAN, &req->wb_flags));

ret = 0;
- /* If there is a fatal error that covers this write, just exit */
- if (nfs_error_is_fatal_on_server(req->wb_context->error))
+ /* If there is a fatal on server error on this context, just exit */
+ if (nfs_error_is_fatal_on_server(req->wb_context->error)) {
+ ret = req->wb_context->error;
goto out_launder;
+ }

if (!nfs_pageio_add_request(pgio, req)) {
ret = pgio->pg_error;
--
2.14.3

2019-01-28 17:59:39

by Trond Myklebust

[permalink] [raw]

Subject: Re: [PATCH] NFS: Always return the error that truncates a flushing page

On Sun, 2019-01-27 at 10:19 -0500, Benjamin Coddington wrote:
> We can't have nfs_wb_page() truncate the page from the mapping if
> there's
> an error on the context without returning that error, because we may
> be in
> nfs_updatepage() holding the page and trying to update the
> request. Not
> having any error returned means we'll proceed to create a new request
> and
> dereference the truncated page->mapping.
>
> If we're going to remove the page, always return the error that
> signaled us
> to do so in nfs_page_async_flush().
>
> Fixes: c373fff7bd25 ("NFSv4: Don't special case "launder"")
> Cc: [email protected] # v4.11
> Signed-off-by: Benjamin Coddington <[email protected]>
> ---
> fs/nfs/write.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 5a0bbf917a32..c274339176cc 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -622,9 +622,11 @@ static int nfs_page_async_flush(struct
> nfs_pageio_descriptor *pgio,
> WARN_ON_ONCE(test_bit(PG_CLEAN, &req->wb_flags));
>
> ret = 0;
> - /* If there is a fatal error that covers this write, just exit
> */
> - if (nfs_error_is_fatal_on_server(req->wb_context->error))
> + /* If there is a fatal on server error on this context, just
> exit */
> + if (nfs_error_is_fatal_on_server(req->wb_context->error)) {
> + ret = req->wb_context->error;
> goto out_launder;
> + }
>
> if (!nfs_pageio_add_request(pgio, req)) {
> ret = pgio->pg_error;

Hi Ben

We were apparently both looking at the same code last week ☺. I have a
similar patch, but that also fixes a similar clobbering issue with the
nfs_error_is_fatal() error just a few lines further down.

Without the extra hunk, we could end up converting an interrupted call
into a 'successful' write.

Cheers
Trond

---
From 676d3340abc26ce9ff2b7609c293454dec6d4897 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Tue, 22 Jan 2019 07:34:45 -0500
Subject: [PATCH] NFS: Fix up return value on fatal errors in
nfs_page_async_flush()

Ensure that we return the fatal error value that caused us to exit
nfs_page_async_flush().

Fixes: a6598813a4c5 ("NFS: Don't write back further requests...")
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected] # v4.12+
---
fs/nfs/write.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 5a0bbf917a32..f12cb31a41e5 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -621,11 +621,12 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
nfs_set_page_writeback(page);
WARN_ON_ONCE(test_bit(PG_CLEAN, &req->wb_flags));

- ret = 0;
+ ret = req->wb_context->error;
/* If there is a fatal error that covers this write, just exit */
- if (nfs_error_is_fatal_on_server(req->wb_context->error))
+ if (nfs_error_is_fatal_on_server(ret))
goto out_launder;

+ ret = 0;
if (!nfs_pageio_add_request(pgio, req)) {
ret = pgio->pg_error;
/*
@@ -635,9 +636,9 @@ static int nfs_page_async_flush(struct nfs_pageio_descriptor *pgio,
nfs_context_set_write_error(req->wb_context, ret);
if (nfs_error_is_fatal_on_server(ret))
goto out_launder;
- }
+ } else
+ ret = -EAGAIN;
nfs_redirty_request(req);
- ret = -EAGAIN;
} else
nfs_add_stats(page_file_mapping(page)->host,
NFSIOS_WRITEPAGES, 1);
--
2.20.1

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2019-01-28 18:55:38

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH] NFS: Always return the error that truncates a flushing page

On 28 Jan 2019, at 12:59, Trond Myklebust wrote:
> Hi Ben
>
> We were apparently both looking at the same code last week ☺. I have a
> similar patch, but that also fixes a similar clobbering issue with the
> nfs_error_is_fatal() error just a few lines further down.
>
> Without the extra hunk, we could end up converting an interrupted call
> into a 'successful' write.
>
> Cheers
> Trond

Looks right to me. I hope you'll take the Fixes and stable version from
my patch, which will cover the problem I'm seeing.

Reviewed-by: Benjamin Coddington <[email protected]>

Ben

2019-01-29 20:49:31

by Trond Myklebust

[permalink] [raw]

Subject: Re: [PATCH] NFS: Always return the error that truncates a flushing page

On Mon, 2019-01-28 at 13:55 -0500, Benjamin Coddington wrote:
> On 28 Jan 2019, at 12:59, Trond Myklebust wrote:
> > Hi Ben
> >
> > We were apparently both looking at the same code last week ☺. I
> > have a
> > similar patch, but that also fixes a similar clobbering issue with
> > the
> > nfs_error_is_fatal() error just a few lines further down.
> >
> > Without the extra hunk, we could end up converting an interrupted
> > call
> > into a 'successful' write.
> >
> > Cheers
> > Trond
>
> Looks right to me. I hope you'll take the Fixes and stable version
> from
> my patch, which will cover the problem I'm seeing.
>
> Reviewed-by: Benjamin Coddington <[email protected]>
>

At first, I thought I'd have to split this patch in 2 so that it could
apply to 4.11 and 4.12, but it looks to me as if the commit from the
Fixes line you were showing is actually only present in 4.12 and later.
Am I wrong?

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]

2019-01-29 22:35:03

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH] NFS: Always return the error that truncates a flushing page

On 29 Jan 2019, at 20:49, Trond Myklebust wrote:

> On Mon, 2019-01-28 at 13:55 -0500, Benjamin Coddington wrote:
>> On 28 Jan 2019, at 12:59, Trond Myklebust wrote:
>>> Hi Ben
>>>
>>> We were apparently both looking at the same code last week ☺. I
>>> have a
>>> similar patch, but that also fixes a similar clobbering issue with
>>> the
>>> nfs_error_is_fatal() error just a few lines further down.
>>>
>>> Without the extra hunk, we could end up converting an interrupted
>>> call
>>> into a 'successful' write.
>>>
>>> Cheers
>>> Trond
>>
>> Looks right to me. I hope you'll take the Fixes and stable version
>> from
>> my patch, which will cover the problem I'm seeing.
>>
>> Reviewed-by: Benjamin Coddington <[email protected]>
>>
>
> At first, I thought I'd have to split this patch in 2 so that it could
> apply to 4.11 and 4.12, but it looks to me as if the commit from the
> Fixes line you were showing is actually only present in 4.12 and
> later.
> Am I wrong?

You're right. I did

# git describe c373fff7bd25
v4.11-rc7-70-gc373fff7bd25

.. and assumed it ended up in v4.11. I wonder what actually happened to
it, I'll see if I can find out.

Ben

2019-01-30 09:03:30

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH] NFS: Always return the error that truncates a flushing page

On 29 Jan 2019, at 22:34, Benjamin Coddington wrote:
> You're right. I did
>
> # git describe c373fff7bd25
> v4.11-rc7-70-gc373fff7bd25
>
> .. and assumed it ended up in v4.11. I wonder what actually happened to
> it, I'll see if I can find out.

git describe is giving me the closest ancestor of the commit, not (what I
expected) the closest ancestor after the merge. I should use git describe
--contains instead.

Ben