Hi Trond,
I'm trying to figure existing code in order to fix a problem. Can you
please help?
Say there was an IO to the DS and it timeouts. Pnfs code in the
filelayout_async_handle_error() marks the device unavailable and marks
the layout to be returned. Timed out IO is retried against the MDS.
The new IO tries to do the pnfs and because the device is marked
unavailable (returning EINVAL) it propagates back all the way to the
pg_init setup and fails the IO with EINVAL. But IO shouldn't fail.
My question is why are we returning the status of the
filelayout_check_deviceid(), I propose that instead we at that point
set lseg=NULL and then the IO would be redirected to the MDS.
Or maybe I'm missing something else?
My proposed fix would be:
diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c
index 61f46facb39c..b3e8ba3bd654 100644
--- a/fs/nfs/filelayout/filelayout.c
+++ b/fs/nfs/filelayout/filelayout.c
@@ -904,7 +904,7 @@ fl_pnfs_update_layout(struct inode *ino,
status = filelayout_check_deviceid(lo, fl, gfp_flags);
if (status) {
pnfs_put_lseg(lseg);
- lseg = ERR_PTR(status);
+ lseg = NULL;
}
out:
return lseg;
Hi Olga,
On Tue, 2019-04-30 at 17:25 -0400, Olga Kornievskaia wrote:
> Hi Trond,
>
> I'm trying to figure existing code in order to fix a problem. Can you
> please help?
>
> Say there was an IO to the DS and it timeouts. Pnfs code in the
> filelayout_async_handle_error() marks the device unavailable and
> marks
> the layout to be returned. Timed out IO is retried against the MDS.
> The new IO tries to do the pnfs and because the device is marked
> unavailable (returning EINVAL) it propagates back all the way to the
> pg_init setup and fails the IO with EINVAL. But IO shouldn't fail.
>
> My question is why are we returning the status of the
> filelayout_check_deviceid(), I propose that instead we at that point
> set lseg=NULL and then the IO would be redirected to the MDS.
>
> Or maybe I'm missing something else?
>
> My proposed fix would be:
> diff --git a/fs/nfs/filelayout/filelayout.c
> b/fs/nfs/filelayout/filelayout.c
> index 61f46facb39c..b3e8ba3bd654 100644
> --- a/fs/nfs/filelayout/filelayout.c
> +++ b/fs/nfs/filelayout/filelayout.c
> @@ -904,7 +904,7 @@ fl_pnfs_update_layout(struct inode *ino,
> status = filelayout_check_deviceid(lo, fl, gfp_flags);
> if (status) {
> pnfs_put_lseg(lseg);
> - lseg = ERR_PTR(status);
> + lseg = NULL;
> }
> out:
> return lseg;
I fully agree with your assessment. The objective of the code there
should indeed be to fall back to doing I/O through the MDS rather than
failing it altogether.
Given that the 5.2 merge window is likely to open next week, and that
we haven't seen any urgent reports of this bug before now, I suggest
that you add the signed-off-by line to the above patch and send it to
Anna for inclusion in that tree.
Please also add a 'Fixes:' line so that the patch can be automatically
backported to older kernel.
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On Wed, May 1, 2019 at 7:10 AM Trond Myklebust <[email protected]> wrote:
>
> Hi Olga,
>
> On Tue, 2019-04-30 at 17:25 -0400, Olga Kornievskaia wrote:
> > Hi Trond,
> >
> > I'm trying to figure existing code in order to fix a problem. Can you
> > please help?
> >
> > Say there was an IO to the DS and it timeouts. Pnfs code in the
> > filelayout_async_handle_error() marks the device unavailable and
> > marks
> > the layout to be returned. Timed out IO is retried against the MDS.
> > The new IO tries to do the pnfs and because the device is marked
> > unavailable (returning EINVAL) it propagates back all the way to the
> > pg_init setup and fails the IO with EINVAL. But IO shouldn't fail.
> >
> > My question is why are we returning the status of the
> > filelayout_check_deviceid(), I propose that instead we at that point
> > set lseg=NULL and then the IO would be redirected to the MDS.
> >
> > Or maybe I'm missing something else?
> >
> > My proposed fix would be:
> > diff --git a/fs/nfs/filelayout/filelayout.c
> > b/fs/nfs/filelayout/filelayout.c
> > index 61f46facb39c..b3e8ba3bd654 100644
> > --- a/fs/nfs/filelayout/filelayout.c
> > +++ b/fs/nfs/filelayout/filelayout.c
> > @@ -904,7 +904,7 @@ fl_pnfs_update_layout(struct inode *ino,
> > status = filelayout_check_deviceid(lo, fl, gfp_flags);
> > if (status) {
> > pnfs_put_lseg(lseg);
> > - lseg = ERR_PTR(status);
> > + lseg = NULL;
> > }
> > out:
> > return lseg;
>
> I fully agree with your assessment. The objective of the code there
> should indeed be to fall back to doing I/O through the MDS rather than
> failing it altogether.
>
> Given that the 5.2 merge window is likely to open next week, and that
> we haven't seen any urgent reports of this bug before now, I suggest
> that you add the signed-off-by line to the above patch and send it to
> Anna for inclusion in that tree.
> Please also add a 'Fixes:' line so that the patch can be automatically
> backported to older kernel.
Thanks Trond for the confirmation. I'll do this as soon as I'll chase
down one more element that I can't explain in this failing scenario.
>
> Cheers
> Trond
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>