2024-02-22 11:05:09

by Jeff Layton

Subject: Re: PROBLEM: NFS client IO fails with ERESTARTSYS when another mount point with the same export is unmounted with force [NFS] [SUNRPC]

On Wed, 2024-02-21 at 13:48 +0000, Trond Myklebust wrote:
> On Wed, 2024-02-21 at 16:20 +0800, Zhitao Li wrote:
> >
> > Hi, everyone,
> >
> > - Facts:
> > I have a remote NFS export and I mount the same export on two
> > different directories in my OS with the same options. There is an
> > inflight IO under one mounted directory. And then I unmount another
> > mounted directory with force. The inflight IO ends up with "Unknown
> > error 512", which is ERESTARTSYS.
> >
>
> All of the above is well known. That's because forced umount affects
> the entire filesystem. Why are you using it here in the first place? It
> is not intended for casual use.
>

While I agree with Trond's statement above, the kernel is not supposed to
leak error codes that high into userland. Are you seeing ERESTARTSYS
being returned to system calls? If so, which ones?
--
Jeff Layton <[email protected]>


2024-02-22 15:20:20

by Trond Myklebust

Subject: Re: PROBLEM: NFS client IO fails with ERESTARTSYS when another mount point with the same export is unmounted with force [NFS] [SUNRPC]

On Thu, 2024-02-22 at 06:05 -0500, Jeff Layton wrote:
> On Wed, 2024-02-21 at 13:48 +0000, Trond Myklebust wrote:
> > On Wed, 2024-02-21 at 16:20 +0800, Zhitao Li wrote:
> > >
> > > Hi, everyone,
> > >
> > > - Facts:
> > > I have a remote NFS export and I mount the same export on two
> > > different directories in my OS with the same options. There is an
> > > inflight IO under one mounted directory. And then I unmount
> > > another
> > > mounted directory with force. The inflight IO ends up with
> > > "Unknown
> > > error 512", which is ERESTARTSYS.
> > >
> >
> > All of the above is well known. That's because forced umount
> > affects
> > the entire filesystem. Why are you using it here in the first
> > place? It
> > is not intended for casual use.
> >
>
> While I agree with Trond's statement above, the kernel is not supposed to
> leak error codes that high into userland. Are you seeing ERESTARTSYS
> being returned to system calls? If so, which ones?

The point of forced umount is to kill all RPC calls associated with the
filesystem in order to unblock the umount. Basically, it triggers this
code before the unmount starts:

void nfs_umount_begin(struct super_block *sb)
{
	struct nfs_server *server;
	struct rpc_clnt *rpc;

	server = NFS_SB(sb);
	/* -EIO all pending I/O */
	rpc = server->client_acl;
	if (!IS_ERR(rpc))
		rpc_killall_tasks(rpc);
	rpc = server->client;
	if (!IS_ERR(rpc))
		rpc_killall_tasks(rpc);
}

So yes, that does signal all the way up to the application level, and
it is very much intended to do so.
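
For reference, a minimal userland sketch of how a forced unmount is requested, assuming Linux with glibc. `umount -f` corresponds to umount2() with MNT_FORCE, and that flag is what makes the VFS call the filesystem's umount_begin hook (nfs_umount_begin() for NFS). The helper name force_unmount() is hypothetical and used only for illustration.

```c
#include <errno.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: request a forced unmount of `mountpoint`.
 * Returns 0 on success, or the errno value reported by umount2().
 * The caller needs CAP_SYS_ADMIN for the call to succeed.
 */
static int force_unmount(const char *mountpoint)
{
	if (umount2(mountpoint, MNT_FORCE) == 0)
		return 0;
	return errno;
}
```

Going by the explanation above, calling this on one mount point of an export that is mounted twice with the same options can also abort RPC calls issued through the other mount point, because both mounts share the same filesystem.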
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2024-02-23 10:32:25

by Jeff Layton

Subject: Re: PROBLEM: NFS client IO fails with ERESTARTSYS when another mount point with the same export is unmounted with force [NFS] [SUNRPC]

On Thu, 2024-02-22 at 15:20 +0000, Trond Myklebust wrote:
> On Thu, 2024-02-22 at 06:05 -0500, Jeff Layton wrote:
> > On Wed, 2024-02-21 at 13:48 +0000, Trond Myklebust wrote:
> > > On Wed, 2024-02-21 at 16:20 +0800, Zhitao Li wrote:
> > > >
> > > > Hi, everyone,
> > > >
> > > > - Facts:
> > > > I have a remote NFS export and I mount the same export on two
> > > > different directories in my OS with the same options. There is an
> > > > inflight IO under one mounted directory. And then I unmount
> > > > another
> > > > mounted directory with force. The inflight IO ends up with
> > > > "Unknown
> > > > error 512", which is ERESTARTSYS.
> > > >
> > >
> > > All of the above is well known. That's because forced umount
> > > affects
> > > the entire filesystem. Why are you using it here in the first
> > > place? It
> > > is not intended for casual use.
> > >
> >
> > While I agree with Trond's statement above, the kernel is not supposed to
> > leak error codes that high into userland. Are you seeing ERESTARTSYS
> > being returned to system calls? If so, which ones?
>
> The point of forced umount is to kill all RPC calls associated with the
> filesystem in order to unblock the umount. Basically, it triggers this
> code before the unmount starts:
>
> void nfs_umount_begin(struct super_block *sb)
> {
> 	struct nfs_server *server;
> 	struct rpc_clnt *rpc;
>
> 	server = NFS_SB(sb);
> 	/* -EIO all pending I/O */
> 	rpc = server->client_acl;
> 	if (!IS_ERR(rpc))
> 		rpc_killall_tasks(rpc);
> 	rpc = server->client;
> 	if (!IS_ERR(rpc))
> 		rpc_killall_tasks(rpc);
> }
>
> So yes, that does signal all the way up to the application level, and
> it is very much intended to do so.

Returning an error to userland in this situation is fine, but userland
programs aren't really equipped to deal with error numbers in this
range.
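
A minimal sketch of why this is awkward for applications, assuming Linux with glibc: the userland <errno.h> has no ERESTARTSYS name to match against, so a program can only compare against the raw number, and strerror() has no message for it (glibc prints "Unknown error 512", as in the original report). The macro and function names below are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Raw value of the kernel-internal ERESTARTSYS; userland headers lack it. */
#define KERNEL_ERESTARTSYS 512

/* True if `err` is the kernel-internal restart code seen in the report. */
static bool is_leaked_erestartsys(int err)
{
	return err == KERNEL_ERESTARTSYS;
}

/* Report a failed call, flagging the leaked kernel-internal code. */
static void report_failure(const char *what, int err)
{
	if (is_leaked_erestartsys(err))
		fprintf(stderr, "%s: kernel-internal ERESTARTSYS (%d) leaked\n",
			what, err);
	else
		fprintf(stderr, "%s: %s\n", what, strerror(err));
}
```

An application would call report_failure("write", errno) right after a failing system call. Anything more than logging needs a policy decision, since there is no documented meaning for this value in userland.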

Emphasis on the first sentence in the comment in include/linux/errno.h:

-------------------8<-----------------------
/*
 * These should never be seen by user programs. To return one of ERESTART*
 * codes, signal_pending() MUST be set. Note that ptrace can observe these
 * at syscall exit tracing, but they will never be left for the debugged user
 * process to see.
 */
#define ERESTARTSYS 512
#define ERESTARTNOINTR 513
#define ERESTARTNOHAND 514 /* restart if no handler.. */
#define ENOIOCTLCMD 515 /* No ioctl command */
#define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */
#define EPROBE_DEFER 517 /* Driver requests probe retry */
#define EOPENSTALE 518 /* open found a stale dentry */
#define ENOPARAM 519 /* Parameter not supported */
-------------------8<-----------------------

If these values are leaking into userland, then that seems like a bug.
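
To make the requirement in that comment concrete, here is a heavily simplified userland model of how an ERESTART* return value is normally disposed of at syscall exit. It is a sketch, not kernel code: the names model_syscall_exit(), exit_action and exit_result are hypothetical, and the assumption is that the translation only happens on the signal-delivery path, which runs only when a signal is pending.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal values, copied from the errno.h excerpt above. */
#define K_ERESTARTSYS           512
#define K_ERESTARTNOINTR        513
#define K_ERESTARTNOHAND        514
#define K_ERESTART_RESTARTBLOCK 516

enum exit_action { RETURN_TO_USER, RESTART_SYSCALL };

struct exit_result {
	enum exit_action action;
	int user_errno;		/* meaningful only for RETURN_TO_USER */
};

static bool is_restart_code(int err)
{
	return err == K_ERESTARTSYS || err == K_ERESTARTNOINTR ||
	       err == K_ERESTARTNOHAND || err == K_ERESTART_RESTARTBLOCK;
}

/* Simplified model: what userland observes for a syscall failing with err. */
static struct exit_result model_syscall_exit(int err, bool signal_pending,
					     bool handler_runs, bool sa_restart)
{
	struct exit_result res = { RETURN_TO_USER, err };

	if (!is_restart_code(err))
		return res;	/* ordinary errno, passed through unchanged */
	if (!signal_pending)
		return res;	/* signal path never runs: raw code leaks */

	if (!handler_runs) {	/* signal ignored or no handler: restart */
		res.action = RESTART_SYSCALL;
		res.user_errno = 0;
		return res;
	}

	if (err == K_ERESTARTNOINTR ||
	    (err == K_ERESTARTSYS && sa_restart)) {
		res.action = RESTART_SYSCALL;
		res.user_errno = 0;
	} else {
		res.user_errno = EINTR;	/* handler ran, call not restarted */
	}
	return res;
}
```

In this model the only way for 512 to reach the application is the second branch: a restart code returned while no signal is pending, which matches the situation where RPC tasks are killed from another context rather than by a signal sent to the caller.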
--
Jeff Layton <[email protected]>

2024-02-27 05:40:20

by Zhitao Li

Subject: Re: PROBLEM: NFS client IO fails with ERESTARTSYS when another mount point with the same export is unmounted with force [NFS] [SUNRPC]

Is there any plan for this ERESTARTSYS leak issue?

--
Zhitao Li, at SmartX

On Fri, Feb 23, 2024 at 6:31 PM Jeff Layton <[email protected]> wrote:
>
> On Thu, 2024-02-22 at 15:20 +0000, Trond Myklebust wrote:
> > On Thu, 2024-02-22 at 06:05 -0500, Jeff Layton wrote:
> > > On Wed, 2024-02-21 at 13:48 +0000, Trond Myklebust wrote:
> > > > On Wed, 2024-02-21 at 16:20 +0800, Zhitao Li wrote:
> > > > >
> > > > > Hi, everyone,
> > > > >
> > > > > - Facts:
> > > > > I have a remote NFS export and I mount the same export on two
> > > > > different directories in my OS with the same options. There is an
> > > > > inflight IO under one mounted directory. And then I unmount
> > > > > another
> > > > > mounted directory with force. The inflight IO ends up with
> > > > > "Unknown
> > > > > error 512", which is ERESTARTSYS.
> > > > >
> > > >
> > > > All of the above is well known. That's because forced umount
> > > > affects
> > > > the entire filesystem. Why are you using it here in the first
> > > > place? It
> > > > is not intended for casual use.
> > > >
> > >
> > > While I agree with Trond's statement above, the kernel is not supposed to
> > > leak error codes that high into userland. Are you seeing ERESTARTSYS
> > > being returned to system calls? If so, which ones?
> >
> > The point of forced umount is to kill all RPC calls associated with the
> > filesystem in order to unblock the umount. Basically, it triggers this
> > code before the unmount starts:
> >
> > void nfs_umount_begin(struct super_block *sb)
> > {
> > 	struct nfs_server *server;
> > 	struct rpc_clnt *rpc;
> >
> > 	server = NFS_SB(sb);
> > 	/* -EIO all pending I/O */
> > 	rpc = server->client_acl;
> > 	if (!IS_ERR(rpc))
> > 		rpc_killall_tasks(rpc);
> > 	rpc = server->client;
> > 	if (!IS_ERR(rpc))
> > 		rpc_killall_tasks(rpc);
> > }
> >
> > So yes, that does signal all the way up to the application level, and
> > it is very much intended to do so.
>
> Returning an error to userland in this situation is fine, but userland
> programs aren't really equipped to deal with error numbers in this
> range.
>
> Emphasis on the first sentence in the comment in include/linux/errno.h:
>
> -------------------8<-----------------------
> /*
>  * These should never be seen by user programs. To return one of ERESTART*
>  * codes, signal_pending() MUST be set. Note that ptrace can observe these
>  * at syscall exit tracing, but they will never be left for the debugged user
>  * process to see.
>  */
> #define ERESTARTSYS 512
> #define ERESTARTNOINTR 513
> #define ERESTARTNOHAND 514 /* restart if no handler.. */
> #define ENOIOCTLCMD 515 /* No ioctl command */
> #define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */
> #define EPROBE_DEFER 517 /* Driver requests probe retry */
> #define EOPENSTALE 518 /* open found a stale dentry */
> #define ENOPARAM 519 /* Parameter not supported */
> -------------------8<-----------------------
>
> If these values are leaking into userland, then that seems like a bug.
> --
> Jeff Layton <[email protected]>