2014-09-02 17:59:07

by Trond Myklebust

Subject: [PATCH 1/2] lockd: Do not start the lockd thread before we've set nlmsvc_rqst->rq_task

This fixes an Oopsable race when starting lockd.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/lockd/svc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 673668a9eec1..c35cd43a06e6 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -306,7 +306,7 @@ static int lockd_start_svc(struct svc_serv *serv)
svc_sock_update_bufs(serv);
serv->sv_maxconn = nlm_max_connections;

- nlmsvc_task = kthread_run(lockd, nlmsvc_rqst, "%s", serv->sv_name);
+ nlmsvc_task = kthread_create(lockd, nlmsvc_rqst, "%s", serv->sv_name);
if (IS_ERR(nlmsvc_task)) {
error = PTR_ERR(nlmsvc_task);
printk(KERN_WARNING
@@ -314,6 +314,7 @@ static int lockd_start_svc(struct svc_serv *serv)
goto out_task;
}
nlmsvc_rqst->rq_task = nlmsvc_task;
+ wake_up_process(nlmsvc_task);

dprintk("lockd_up: service started\n");
return 0;
--
1.9.3



2014-09-02 19:46:21

by J. Bruce Fields

Subject: Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

On Tue, Sep 02, 2014 at 03:32:55PM -0400, Trond Myklebust wrote:
> On Tue, Sep 2, 2014 at 3:23 PM, Christoph Hellwig <[email protected]> wrote:
> > On Tue, Sep 02, 2014 at 01:58:58PM -0400, Trond Myklebust wrote:
> >> This fixes an Oopsable race when starting up the callback server.
> >>
> >> Signed-off-by: Trond Myklebust <[email protected]>
> >> ---
> >> fs/nfs/callback.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> >> index e3dd1cd175d9..b8fb3a4ef649 100644
> >> --- a/fs/nfs/callback.c
> >> +++ b/fs/nfs/callback.c
> >> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> >>
> >> cb_info->serv = serv;
> >> cb_info->rqst = rqstp;
> >> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> >> + cb_info->task = kthread_create(callback_svc, cb_info->rqst,
> >> "nfsv4.%u-svc", minorversion);
> >> if (IS_ERR(cb_info->task)) {
> >> ret = PTR_ERR(cb_info->task);
> >> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> >> return ret;
> >> }
> >> rqstp->rq_task = cb_info->task;
> >> + wake_up_process(cb_info->task);
> >
> > Wouldn't it be cleaner to do something like:
> >
> > - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> > + cb_info->task = rqstp->rq_task =
> > + kthread_create(callback_svc, cb_info->rqst,
> >
> > or am I missing something subtle that the changelog didn't mention?
>
> The above is fine if you call kthread_create(), but if you stick with
> kthread_run(), then there is still the same atomicity issue that the
> thread can be started before we've initialised cb_info->task and
> rqstp->rq_task.
>
> Internal testing has shown that this can lead to an oops when starting
> lockd.

The oopses seen in practice probably came after applying 983c684466e0
"SUNRPC: get rid of the request wait queue"?

Though it was a bug before then too, of course.

--b.

> I'm therefore assuming that the same thing can happen with the
> NFS client callback channel.
>
> --
> Trond Myklebust
>
> Linux NFS client maintainer, PrimaryData
>
> [email protected]

2014-09-02 18:13:37

by Jeff Layton

Subject: Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

On Tue, 2 Sep 2014 13:58:58 -0400
Trond Myklebust <[email protected]> wrote:

> This fixes an Oopsable race when starting up the callback server.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/callback.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> index e3dd1cd175d9..b8fb3a4ef649 100644
> --- a/fs/nfs/callback.c
> +++ b/fs/nfs/callback.c
> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
>
> cb_info->serv = serv;
> cb_info->rqst = rqstp;
> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> + cb_info->task = kthread_create(callback_svc, cb_info->rqst,
> "nfsv4.%u-svc", minorversion);
> if (IS_ERR(cb_info->task)) {
> ret = PTR_ERR(cb_info->task);
> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> return ret;
> }
> rqstp->rq_task = cb_info->task;
> + wake_up_process(cb_info->task);
> dprintk("nfs_callback_up: service started\n");
> return 0;
> }


Reviewed-by: Jeff Layton <[email protected]>

2014-09-02 19:49:22

by Trond Myklebust

Subject: Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

On Tue, Sep 2, 2014 at 3:45 PM, J. Bruce Fields <[email protected]> wrote:
> On Tue, Sep 02, 2014 at 03:32:55PM -0400, Trond Myklebust wrote:
>> On Tue, Sep 2, 2014 at 3:23 PM, Christoph Hellwig <[email protected]> wrote:
>> > On Tue, Sep 02, 2014 at 01:58:58PM -0400, Trond Myklebust wrote:
>> >> This fixes an Oopsable race when starting up the callback server.
>> >>
>> >> Signed-off-by: Trond Myklebust <[email protected]>
>> >> ---
>> >> fs/nfs/callback.c | 3 ++-
>> >> 1 file changed, 2 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
>> >> index e3dd1cd175d9..b8fb3a4ef649 100644
>> >> --- a/fs/nfs/callback.c
>> >> +++ b/fs/nfs/callback.c
>> >> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
>> >>
>> >> cb_info->serv = serv;
>> >> cb_info->rqst = rqstp;
>> >> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
>> >> + cb_info->task = kthread_create(callback_svc, cb_info->rqst,
>> >> "nfsv4.%u-svc", minorversion);
>> >> if (IS_ERR(cb_info->task)) {
>> >> ret = PTR_ERR(cb_info->task);
>> >> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
>> >> return ret;
>> >> }
>> >> rqstp->rq_task = cb_info->task;
>> >> + wake_up_process(cb_info->task);
>> >
>> > Wouldn't it be cleaner to do something like:
>> >
>> > - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
>> > + cb_info->task = rqstp->rq_task =
>> > + kthread_create(callback_svc, cb_info->rqst,
>> >
>> > or am I missing something subtle that the changelog didn't mention?
>>
>> The above is fine if you call kthread_create(), but if you stick with
>> kthread_run(), then there is still the same atomicity issue that the
>> thread can be started before we've initialised cb_info->task and
>> rqstp->rq_task.
>>
>> Internal testing has shown that this can lead to an oops when starting
>> lockd.
>
> The oopses seen in practice probably came after applying 983c684466e0
> "SUNRPC: get rid of the request wait queue"?
>
> Though it was a bug before then too, of course.
>

Right. This is not needed until you merge the new sunrpc server
scalability stuff (which I'm assuming will be 3.18).

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-09-02 18:13:14

by Jeff Layton

Subject: Re: [PATCH 1/2] lockd: Do not start the lockd thread before we've set nlmsvc_rqst->rq_task

On Tue, 2 Sep 2014 13:58:57 -0400
Trond Myklebust <[email protected]> wrote:

> This fixes an Oopsable race when starting lockd.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/lockd/svc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
> index 673668a9eec1..c35cd43a06e6 100644
> --- a/fs/lockd/svc.c
> +++ b/fs/lockd/svc.c
> @@ -306,7 +306,7 @@ static int lockd_start_svc(struct svc_serv *serv)
> svc_sock_update_bufs(serv);
> serv->sv_maxconn = nlm_max_connections;
>
> - nlmsvc_task = kthread_run(lockd, nlmsvc_rqst, "%s", serv->sv_name);
> + nlmsvc_task = kthread_create(lockd, nlmsvc_rqst, "%s", serv->sv_name);
> if (IS_ERR(nlmsvc_task)) {
> error = PTR_ERR(nlmsvc_task);
> printk(KERN_WARNING
> @@ -314,6 +314,7 @@ static int lockd_start_svc(struct svc_serv *serv)
> goto out_task;
> }
> nlmsvc_rqst->rq_task = nlmsvc_task;
> + wake_up_process(nlmsvc_task);
>
> dprintk("lockd_up: service started\n");
> return 0;


Reviewed-by: Jeff Layton <[email protected]>

2014-09-02 19:23:30

by Christoph Hellwig

Subject: Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

On Tue, Sep 02, 2014 at 01:58:58PM -0400, Trond Myklebust wrote:
> This fixes an Oopsable race when starting up the callback server.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/callback.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> index e3dd1cd175d9..b8fb3a4ef649 100644
> --- a/fs/nfs/callback.c
> +++ b/fs/nfs/callback.c
> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
>
> cb_info->serv = serv;
> cb_info->rqst = rqstp;
> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> + cb_info->task = kthread_create(callback_svc, cb_info->rqst,
> "nfsv4.%u-svc", minorversion);
> if (IS_ERR(cb_info->task)) {
> ret = PTR_ERR(cb_info->task);
> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> return ret;
> }
> rqstp->rq_task = cb_info->task;
> + wake_up_process(cb_info->task);

Wouldn't it be cleaner to do something like:

- cb_info->task = kthread_run(callback_svc, cb_info->rqst,
+ cb_info->task = rqstp->rq_task =
+ kthread_create(callback_svc, cb_info->rqst,

or am I missing something subtle that the changelog didn't mention?

2014-09-02 21:53:12

by J. Bruce Fields

Subject: Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

On Tue, Sep 02, 2014 at 03:49:21PM -0400, Trond Myklebust wrote:
> On Tue, Sep 2, 2014 at 3:45 PM, J. Bruce Fields <[email protected]> wrote:
> > On Tue, Sep 02, 2014 at 03:32:55PM -0400, Trond Myklebust wrote:
> >> On Tue, Sep 2, 2014 at 3:23 PM, Christoph Hellwig <[email protected]> wrote:
> >> > On Tue, Sep 02, 2014 at 01:58:58PM -0400, Trond Myklebust wrote:
> >> >> This fixes an Oopsable race when starting up the callback server.
> >> >>
> >> >> Signed-off-by: Trond Myklebust <[email protected]>
> >> >> ---
> >> >> fs/nfs/callback.c | 3 ++-
> >> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> >> >> index e3dd1cd175d9..b8fb3a4ef649 100644
> >> >> --- a/fs/nfs/callback.c
> >> >> +++ b/fs/nfs/callback.c
> >> >> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> >> >>
> >> >> cb_info->serv = serv;
> >> >> cb_info->rqst = rqstp;
> >> >> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> >> >> + cb_info->task = kthread_create(callback_svc, cb_info->rqst,
> >> >> "nfsv4.%u-svc", minorversion);
> >> >> if (IS_ERR(cb_info->task)) {
> >> >> ret = PTR_ERR(cb_info->task);
> >> >> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> >> >> return ret;
> >> >> }
> >> >> rqstp->rq_task = cb_info->task;
> >> >> + wake_up_process(cb_info->task);
> >> >
> >> > Wouldn't it be cleaner to do something like:
> >> >
> >> > - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> >> > + cb_info->task = rqstp->rq_task =
> >> > + kthread_create(callback_svc, cb_info->rqst,
> >> >
> >> > or am I missing something subtle that the changelog didn't mention?
> >>
> >> The above is fine if you call kthread_create(), but if you stick with
> >> kthread_run(), then there is still the same atomicity issue that the
> >> thread can be started before we've initialised cb_info->task and
> >> rqstp->rq_task.
> >>
> >> Internal testing has shown that this can lead to an oops when starting
> >> lockd.
> >
> > The oopses seen in practice probably came after applying 983c684466e0
> > "SUNRPC: get rid of the request wait queue"?
> >
> > Though it was a bug before then too, of course.
> >
>
> Right. This is not needed until you merge the new sunrpc server
> scalability stuff (which I'm assuming will be 3.18).

Got it. Well there's also an rq_task use in
net/sunrpc/svc.c:choose_victim(), but I'm guessing it would take a
pretty strange case to hit, so I'll plan to take these for 3.18 without
a stable cc.

--b.

2014-09-02 19:32:55

by Trond Myklebust

Subject: Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

On Tue, Sep 2, 2014 at 3:23 PM, Christoph Hellwig <[email protected]> wrote:
> On Tue, Sep 02, 2014 at 01:58:58PM -0400, Trond Myklebust wrote:
>> This fixes an Oopsable race when starting up the callback server.
>>
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> fs/nfs/callback.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
>> index e3dd1cd175d9..b8fb3a4ef649 100644
>> --- a/fs/nfs/callback.c
>> +++ b/fs/nfs/callback.c
>> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
>>
>> cb_info->serv = serv;
>> cb_info->rqst = rqstp;
>> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
>> + cb_info->task = kthread_create(callback_svc, cb_info->rqst,
>> "nfsv4.%u-svc", minorversion);
>> if (IS_ERR(cb_info->task)) {
>> ret = PTR_ERR(cb_info->task);
>> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
>> return ret;
>> }
>> rqstp->rq_task = cb_info->task;
>> + wake_up_process(cb_info->task);
>
> Wouldn't it be cleaner to do something like:
>
> - cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> + cb_info->task = rqstp->rq_task =
> + kthread_create(callback_svc, cb_info->rqst,
>
> or am I missing something subtle that the changelog didn't mention?

The above is fine if you call kthread_create(), but if you stick with
kthread_run(), then there is still the same atomicity issue that the
thread can be started before we've initialised cb_info->task and
rqstp->rq_task.

Internal testing has shown that this can lead to an oops when starting
lockd. I'm therefore assuming that the same thing can happen with the
NFS client callback channel.
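
To illustrate the ordering argument outside the kernel, here is a minimal
userspace sketch using pthreads. It is only an analogy, not the kernel code:
struct demo_rqst, demo_worker() and demo_start_gated() are hypothetical names,
and the condition variable stands in for the "created but not yet woken"
state that kthread_create() gives you. The worker is not allowed to look at
the task field until the creator has filled it in and then woken it, which
mirrors kthread_create() + assignment + wake_up_process().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct svc_rqst; only the relevant state. */
struct demo_rqst {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	bool            woken;     /* set by the wake_up_process() analogue */
	bool            task_set;  /* analogue of rq_task being initialised */
	pthread_t       task;      /* analogue of rq_task */
	bool            saw_task;  /* what the worker observed at startup */
};

static void *demo_worker(void *arg)
{
	struct demo_rqst *rq = arg;

	/* Like a thread from kthread_create(): park until explicitly woken. */
	pthread_mutex_lock(&rq->lock);
	while (!rq->woken)
		pthread_cond_wait(&rq->wake, &rq->lock);
	/* Only now is it safe to rely on the task field. */
	rq->saw_task = rq->task_set;
	pthread_mutex_unlock(&rq->lock);
	return NULL;
}

/* Returns true if the worker saw an initialised task field. */
bool demo_start_gated(void)
{
	struct demo_rqst rq = { .woken = false, .task_set = false };
	pthread_t tid;
	bool seen;
	int err;

	err = pthread_mutex_init(&rq.lock, NULL);
	assert(err == 0);
	err = pthread_cond_init(&rq.wake, NULL);
	assert(err == 0);

	/* Analogue of kthread_create(): the thread exists but stays parked. */
	err = pthread_create(&tid, NULL, demo_worker, &rq);
	assert(err == 0);

	pthread_mutex_lock(&rq.lock);
	rq.task = tid;          /* analogue of rqstp->rq_task = task */
	rq.task_set = true;
	rq.woken = true;        /* analogue of wake_up_process(task) */
	pthread_cond_signal(&rq.wake);
	pthread_mutex_unlock(&rq.lock);

	err = pthread_join(tid, NULL);
	assert(err == 0);
	seen = rq.saw_task;

	pthread_cond_destroy(&rq.wake);
	pthread_mutex_destroy(&rq.lock);
	return seen;
}
```

Calling demo_start_gated() from a test program built with -pthread should
always return true, because the worker cannot proceed past the wait until the
field has been set. Without the gate (the kthread_run() analogue), the worker
could read the field before the creator's assignment.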

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-09-02 17:59:08

by Trond Myklebust

Subject: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

This fixes an Oopsable race when starting up the callback server.

Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/callback.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index e3dd1cd175d9..b8fb3a4ef649 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,

cb_info->serv = serv;
cb_info->rqst = rqstp;
- cb_info->task = kthread_run(callback_svc, cb_info->rqst,
+ cb_info->task = kthread_create(callback_svc, cb_info->rqst,
"nfsv4.%u-svc", minorversion);
if (IS_ERR(cb_info->task)) {
ret = PTR_ERR(cb_info->task);
@@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
return ret;
}
rqstp->rq_task = cb_info->task;
+ wake_up_process(cb_info->task);
dprintk("nfs_callback_up: service started\n");
return 0;
}
--
1.9.3