2008-02-06 18:21:46

by Jeff Layton

Subject: [PATCH 0/2] convert lockd to kthread API (try #10)

This is the tenth iteration of the patchset to convert lockd to use the
kthread API. This patchset is smaller than the earlier ones since some
of the patches in those sets have already been taken into Bruce's tree.
This set only changes lockd to use the kthread API.

The only real difference between this patchset and the one posted
yesterday is some added comments to clarify the possible race involved
when signaling and calling kthread_stop.

Bruce, would you be willing to take this into your git tree once 2.6.25
development settles down? I'd like to have this considered for 2.6.26.

Thanks,

Signed-off-by: Jeff Layton <[email protected]>



2008-02-06 18:21:47

by Jeff Layton

Subject: [PATCH 2/2] NLM: Convert lockd to use kthreads

Have lockd_up start lockd using kthread_run. With this change,
lockd_down now blocks until lockd actually exits, so there's no longer
need for the waitqueue code at the end of lockd_down. This also means
that only one lockd can be running at a time which simplifies the code
within lockd's main loop.

This also adds a check for kthread_should_stop in the main loop of
nlmsvc_retry_blocked and after that function returns. There's no sense
continuing to retry blocks if lockd is coming down anyway.

The main difference between this patch and earlier ones is that it
changes lockd_down to again send SIGKILL to lockd when it's coming
down. svc_recv() uses schedule_timeout, so we can end up blocking there
for a long time if we end up calling into it after kthread_stop wakes
up lockd. Sending a SIGKILL should help ensure that svc_recv returns
quickly if this occurs.
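
For context, the sleep inside svc_recv() follows the usual
interruptible-wait pattern, roughly like this (a condensed sketch, not
the verbatim net/sunrpc/svcsock.c source):

```c
/*
 * Sketch of how svc_recv() waits for work.  A pending SIGKILL makes
 * schedule_timeout() return immediately, and svc_recv() then bails
 * out with -EINTR rather than sleeping out the (possibly very long)
 * timeout.
 */
set_current_state(TASK_INTERRUPTIBLE);
add_wait_queue(&rqstp->rq_wait, &wait);
schedule_timeout(timeout);		/* wakes early if a signal is pending */
remove_wait_queue(&rqstp->rq_wait, &wait);
if (signalled())
	return -EINTR;
```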

Because kthread_stop blocks until the kthread actually goes down,
we have to send the signal before calling it. This means that there
is a very small race window like this where lockd_down could block
for a long time:

lockd_down signals lockd
lockd invalidates locks
lockd flushes signals
lockd checks kthread_should_stop
lockd_down calls kthread_stop
lockd calls svc_recv

...and lockd blocks in svc_recv() until something wakes it, leaving
lockd_down stuck in kthread_stop() for the duration. I think this is a
pretty unlikely scenario, though, and it doesn't appear to be fixable
without changing the kthread_stop machinery to send a signal.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/lockd/svc.c | 144 +++++++++++++++++++++++++---------------------------
fs/lockd/svclock.c | 3 +-
2 files changed, 72 insertions(+), 75 deletions(-)

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 0822646..35e5ae2 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -25,6 +25,7 @@
#include <linux/smp.h>
#include <linux/smp_lock.h>
#include <linux/mutex.h>
+#include <linux/kthread.h>
#include <linux/freezer.h>

#include <linux/sunrpc/types.h>
@@ -48,14 +49,11 @@ EXPORT_SYMBOL(nlmsvc_ops);

static DEFINE_MUTEX(nlmsvc_mutex);
static unsigned int nlmsvc_users;
-static pid_t nlmsvc_pid;
+static struct task_struct *nlmsvc_task;
static struct svc_serv *nlmsvc_serv;
int nlmsvc_grace_period;
unsigned long nlmsvc_timeout;

-static DECLARE_COMPLETION(lockd_start_done);
-static DECLARE_WAIT_QUEUE_HEAD(lockd_exit);
-
/*
* These can be set at insmod time (useful for NFS as root filesystem),
* and also changed through the sysctl interface. -- Jamie Lokier, Aug 2003
@@ -111,35 +109,30 @@ static inline void clear_grace_period(void)
/*
* This is the lockd kernel thread
*/
-static void
-lockd(struct svc_rqst *rqstp)
+static int
+lockd(void *vrqstp)
{
int err = 0;
+ struct svc_rqst *rqstp = vrqstp;
unsigned long grace_period_expire;

- /* Lock module and set up kernel thread */
- /* lockd_up is waiting for us to startup, so will
- * be holding a reference to this module, so it
- * is safe to just claim another reference
- */
- __module_get(THIS_MODULE);
- lock_kernel();
-
- /*
- * Let our maker know we're running.
- */
- nlmsvc_pid = current->pid;
- nlmsvc_serv = rqstp->rq_server;
- complete(&lockd_start_done);
-
- daemonize("lockd");
+ /* try_to_freeze() is called from svc_recv() */
set_freezable();

- /* Process request with signals blocked, but allow SIGKILL. */
+ /* Allow SIGKILL to tell lockd to drop all of its locks */
allow_signal(SIGKILL);

dprintk("NFS locking service started (ver " LOCKD_VERSION ").\n");

+ /*
+ * FIXME: it would be nice if lockd didn't spend its entire life
+ * running under the BKL. At the very least, it would be good to
+ * have someone clarify what it's intended to protect here. I've
+ * seen some handwavy posts about posix locking needing to be
+ * done under the BKL, but it's far from clear.
+ */
+ lock_kernel();
+
if (!nlm_timeout)
nlm_timeout = LOCKD_DFLT_TIMEO;
nlmsvc_timeout = nlm_timeout * HZ;
@@ -148,10 +141,9 @@ lockd(struct svc_rqst *rqstp)

/*
* The main request loop. We don't terminate until the last
- * NFS mount or NFS daemon has gone away, and we've been sent a
- * signal, or else another process has taken over our job.
+ * NFS mount or NFS daemon has gone away.
*/
- while ((nlmsvc_users || !signalled()) && nlmsvc_pid == current->pid) {
+ while (!kthread_should_stop()) {
long timeout = MAX_SCHEDULE_TIMEOUT;
char buf[RPC_MAX_ADDRBUFLEN];

@@ -161,6 +153,7 @@ lockd(struct svc_rqst *rqstp)
nlmsvc_invalidate_all();
grace_period_expire = set_grace_period();
}
+ continue;
}

/*
@@ -174,6 +167,10 @@ lockd(struct svc_rqst *rqstp)
} else if (time_before(grace_period_expire, jiffies))
clear_grace_period();

+ /* nlmsvc_retry_blocked can block, so check for kthread_stop */
+ if (kthread_should_stop())
+ break;
+
/*
* Find a socket with data available and call its
* recvfrom routine.
@@ -195,28 +192,19 @@ lockd(struct svc_rqst *rqstp)
}

flush_signals(current);
+ if (nlmsvc_ops)
+ nlmsvc_invalidate_all();
+ nlm_shutdown_hosts();

- /*
- * Check whether there's a new lockd process before
- * shutting down the hosts and clearing the slot.
- */
- if (!nlmsvc_pid || current->pid == nlmsvc_pid) {
- if (nlmsvc_ops)
- nlmsvc_invalidate_all();
- nlm_shutdown_hosts();
- nlmsvc_pid = 0;
- nlmsvc_serv = NULL;
- } else
- printk(KERN_DEBUG
- "lockd: new process, skipping host shutdown\n");
- wake_up(&lockd_exit);
+ unlock_kernel();
+
+ nlmsvc_task = NULL;
+ nlmsvc_serv = NULL;

/* Exit the RPC thread */
svc_exit_thread(rqstp);

- /* Release module */
- unlock_kernel();
- module_put_and_exit(0);
+ return 0;
}

/*
@@ -261,14 +249,15 @@ static int make_socks(struct svc_serv *serv, int proto)
int
lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
{
- struct svc_serv * serv;
- int error = 0;
+ struct svc_serv *serv;
+ struct svc_rqst *rqstp;
+ int error = 0;

mutex_lock(&nlmsvc_mutex);
/*
* Check whether we're already up and running.
*/
- if (nlmsvc_pid) {
+ if (nlmsvc_serv) {
if (proto)
error = make_socks(nlmsvc_serv, proto);
goto out;
@@ -295,13 +284,28 @@ lockd_up(int proto) /* Maybe add a 'family' option when IPv6 is supported ?? */
/*
* Create the kernel thread and wait for it to start.
*/
- error = svc_create_thread(lockd, serv);
- if (error) {
+ rqstp = svc_prepare_thread(serv, &serv->sv_pools[0]);
+ if (IS_ERR(rqstp)) {
+ error = PTR_ERR(rqstp);
+ printk(KERN_WARNING
+ "lockd_up: svc_rqst allocation failed, error=%d\n",
+ error);
+ goto destroy_and_out;
+ }
+
+ svc_sock_update_bufs(serv);
+ nlmsvc_serv = rqstp->rq_server;
+
+ nlmsvc_task = kthread_run(lockd, rqstp, serv->sv_name);
+ if (IS_ERR(nlmsvc_task)) {
+ error = PTR_ERR(nlmsvc_task);
+ nlmsvc_task = NULL;
+ nlmsvc_serv = NULL;
printk(KERN_WARNING
- "lockd_up: create thread failed, error=%d\n", error);
+ "lockd_up: kthread_run failed, error=%d\n", error);
+ svc_exit_thread(rqstp);
goto destroy_and_out;
}
- wait_for_completion(&lockd_start_done);

/*
* Note: svc_serv structures have an initial use count of 1,
@@ -323,37 +327,29 @@ EXPORT_SYMBOL(lockd_up);
void
lockd_down(void)
{
- static int warned;
-
mutex_lock(&nlmsvc_mutex);
if (nlmsvc_users) {
if (--nlmsvc_users)
goto out;
- } else
- printk(KERN_WARNING "lockd_down: no users! pid=%d\n", nlmsvc_pid);
-
- if (!nlmsvc_pid) {
- if (warned++ == 0)
- printk(KERN_WARNING "lockd_down: no lockd running.\n");
- goto out;
+ } else {
+ printk(KERN_ERR "lockd_down: no users! task=%p\n",
+ nlmsvc_task);
+ BUG();
}
- warned = 0;

- kill_proc(nlmsvc_pid, SIGKILL, 1);
- /*
- * Wait for the lockd process to exit, but since we're holding
- * the lockd semaphore, we can't wait around forever ...
- */
- clear_thread_flag(TIF_SIGPENDING);
- interruptible_sleep_on_timeout(&lockd_exit, HZ);
- if (nlmsvc_pid) {
- printk(KERN_WARNING
- "lockd_down: lockd failed to exit, clearing pid\n");
- nlmsvc_pid = 0;
+ if (!nlmsvc_task) {
+ printk(KERN_ERR "lockd_down: no lockd running.\n");
+ BUG();
}
- spin_lock_irq(&current->sighand->siglock);
- recalc_sigpending();
- spin_unlock_irq(&current->sighand->siglock);
+ /*
+ * send a signal to prevent lockd blocking in svc_recv(). This is
+ * "best effort". It's possible (though unlikely) that lockd will
+ * flush the signal and be woken up by kthread stop just before it
+ * calls into svc_recv. If that happens lockd can block for a long
+ * time -- possibly until a packet hits one of its sockets.
+ */
+ send_sig(SIGKILL, nlmsvc_task, 1);
+ kthread_stop(nlmsvc_task);
out:
mutex_unlock(&nlmsvc_mutex);
}
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 2f4d8fa..d8324b6 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -29,6 +29,7 @@
#include <linux/sunrpc/svc.h>
#include <linux/lockd/nlm.h>
#include <linux/lockd/lockd.h>
+#include <linux/kthread.h>

#define NLMDBG_FACILITY NLMDBG_SVCLOCK

@@ -867,7 +868,7 @@ nlmsvc_retry_blocked(void)
unsigned long timeout = MAX_SCHEDULE_TIMEOUT;
struct nlm_block *block;

- while (!list_empty(&nlm_blocked)) {
+ while (!list_empty(&nlm_blocked) && !kthread_should_stop()) {
block = list_entry(nlm_blocked.next, struct nlm_block, b_list);

if (block->b_when == NLM_NEVER)
--
1.5.3.8


2008-02-06 18:21:46

by Jeff Layton

Subject: [PATCH 1/2] SUNRPC: export svc_sock_update_bufs

Needed, since the plan is to remove the svc_create_thread helper and
have its current users call kthread_run directly.

Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: NeilBrown <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
---
net/sunrpc/svcsock.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 1d3e5fc..b73a92a 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1101,6 +1101,7 @@ void svc_sock_update_bufs(struct svc_serv *serv)
}
spin_unlock_bh(&serv->sv_lock);
}
+EXPORT_SYMBOL(svc_sock_update_bufs);

/*
* Initialize socket for RPC use and create svc_sock struct
--
1.5.3.8


2008-02-06 18:36:45

by Trond Myklebust

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads


On Wed, 2008-02-06 at 13:21 -0500, Jeff Layton wrote:
> Have lockd_up start lockd using kthread_run. With this change,
> lockd_down now blocks until lockd actually exits, so there's no longer
> need for the waitqueue code at the end of lockd_down. This also means
> that only one lockd can be running at a time which simplifies the code
> within lockd's main loop.
>
> This also adds a check for kthread_should_stop in the main loop of
> nlmsvc_retry_blocked and after that function returns. There's no sense
> continuing to retry blocks if lockd is coming down anyway.
>
> The main difference between this patch and earlier ones is that it
> changes lockd_down to again send SIGKILL to lockd when it's coming
> down. svc_recv() uses schedule_timeout, so we can end up blocking there
> for a long time if we end up calling into it after kthread_stop wakes
> up lockd. Sending a SIGKILL should help ensure that svc_recv returns
> quickly if this occurs.
>
> Because kthread_stop blocks until the kthread actually goes down,
> we have to send the signal before calling it. This means that there
> is a very small race window like this where lockd_down could block
> for a long time:

Having looked again at the code, could you please remind me _why_ we
need to signal the process?

AFAICS, kthread_stop() should normally wake the process up if it is in
the schedule_timeout() state in svc_recv() since it uses
wake_up_process(). Shouldn't the only difference be that svc_recv() will
return -EAGAIN instead of -EINTR?

If so, why can't we just forgo the signal?

Trond


2008-02-06 18:47:26

by Jeff Layton

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads

On Wed, 06 Feb 2008 13:36:31 -0500
Trond Myklebust <[email protected]> wrote:

>
> On Wed, 2008-02-06 at 13:21 -0500, Jeff Layton wrote:
> > Have lockd_up start lockd using kthread_run. With this change,
> > lockd_down now blocks until lockd actually exits, so there's no
> > longer need for the waitqueue code at the end of lockd_down. This
> > also means that only one lockd can be running at a time which
> > simplifies the code within lockd's main loop.
> >
> > This also adds a check for kthread_should_stop in the main loop of
> > nlmsvc_retry_blocked and after that function returns. There's no
> > sense continuing to retry blocks if lockd is coming down anyway.
> >
> > The main difference between this patch and earlier ones is that it
> > changes lockd_down to again send SIGKILL to lockd when it's coming
> > down. svc_recv() uses schedule_timeout, so we can end up blocking
> > there for a long time if we end up calling into it after
> > kthread_stop wakes up lockd. Sending a SIGKILL should help ensure
> > that svc_recv returns quickly if this occurs.
> >
> > Because kthread_stop blocks until the kthread actually goes down,
> > we have to send the signal before calling it. This means that there
> > is a very small race window like this where lockd_down could block
> > for a long time:
>
> Having looked again at the code, could you please remind me _why_ we
> need to signal the process?
>
> AFAICS, kthread_stop() should normally wake the process up if it is in
> the schedule_timeout() state in svc_recv() since it uses
> wake_up_process(). Shouldn't the only difference be that svc_recv()
> will return -EAGAIN instead of -EINTR?
>
> If so, why can't we just forgo the signal?
>

There's no guarantee that kthread_stop() won't wake up lockd before
schedule_timeout() gets called, but after the last check for
kthread_should_stop().

--
Jeff Layton <[email protected]>

2008-02-06 18:52:47

by Trond Myklebust

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads


On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
> There's no guarantee that kthread_stop() won't wake up lockd before
> schedule_timeout() gets called, but after the last check for
> kthread_should_stop().

Doesn't the BKL pretty much eliminate this race? (assuming you transform
that call to 'if (!kthread_should_stop()) schedule_timeout();')

Trond


2008-02-06 19:09:59

by Jeff Layton

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads

On Wed, 06 Feb 2008 13:52:34 -0500
Trond Myklebust <[email protected]> wrote:

>
> On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
> > There's no guarantee that kthread_stop() won't wake up lockd before
> > schedule_timeout() gets called, but after the last check for
> > kthread_should_stop().
>
> Doesn't the BKL pretty much eliminate this race? (assuming you
> transform that call to 'if (!kthread_should_stop())
> schedule_timeout();')
>
> Trond
>

I don't think so. That would require that lockd_down is always called
with the BKL held, and I don't think it is, is it?
--
Jeff Layton <[email protected]>

2008-02-06 19:52:10

by Jeff Layton

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads

On Wed, 6 Feb 2008 13:47:02 -0500
Jeff Layton <[email protected]> wrote:

> On Wed, 06 Feb 2008 13:36:31 -0500
> Trond Myklebust <[email protected]> wrote:
>
> >
> > On Wed, 2008-02-06 at 13:21 -0500, Jeff Layton wrote:
> > > Have lockd_up start lockd using kthread_run. With this change,
> > > lockd_down now blocks until lockd actually exits, so there's no
> > > longer need for the waitqueue code at the end of lockd_down. This
> > > also means that only one lockd can be running at a time which
> > > simplifies the code within lockd's main loop.
> > >
> > > This also adds a check for kthread_should_stop in the main loop of
> > > nlmsvc_retry_blocked and after that function returns. There's no
> > > sense continuing to retry blocks if lockd is coming down anyway.
> > >
> > > The main difference between this patch and earlier ones is that it
> > > changes lockd_down to again send SIGKILL to lockd when it's coming
> > > down. svc_recv() uses schedule_timeout, so we can end up blocking
> > > there for a long time if we end up calling into it after
> > > kthread_stop wakes up lockd. Sending a SIGKILL should help ensure
> > > that svc_recv returns quickly if this occurs.
> > >
> > > Because kthread_stop blocks until the kthread actually goes down,
> > > we have to send the signal before calling it. This means that
> > > there is a very small race window like this where lockd_down
> > > could block for a long time:
> >
> > Having looked again at the code, could you please remind me _why_ we
> > need to signal the process?
> >
> > AFAICS, kthread_stop() should normally wake the process up if it is
> > in the schedule_timeout() state in svc_recv() since it uses
> > wake_up_process(). Shouldn't the only difference be that svc_recv()
> > will return -EAGAIN instead of -EINTR?
> >
> > If so, why can't we just forgo the signal?
> >
>
> There's no guarantee that kthread_stop() won't wake up lockd before
> schedule_timeout() gets called, but after the last check for
> kthread_should_stop().
>

Sorry, I hit send too quickly...

I'm certainly open to alternatives to signaling, but having a pending
signal seems to be the best way to ensure that we don't end up blocking
in schedule_timeout() here.

As a side note, I've rolled up a patch to add a kthread_stop_sig()
variant that will use "force_sig" to wake up a kthread instead of just
waking it up. I've not tested it yet, but once I do and if we can get
it in then we should be able to close the race I'm talking about in
this patch description as well...
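
In case it helps the discussion, the variant I have in mind might look
something like this (hypothetical, untested sketch -- the name
kthread_stop_sig() is just the one proposed above, and the
kthread_stop() internals are paraphrased, not quoted):

```c
/*
 * Hypothetical kthread_stop_sig(): like kthread_stop(), but wake the
 * thread with a forced signal so that it also breaks out of
 * interruptible sleeps in places like svc_recv().
 */
int kthread_stop_sig(struct task_struct *k, int signo)
{
	/* ... same setup as kthread_stop(): mark the thread as stopping,
	 * so that kthread_should_stop() returns true ... */

	force_sig(signo, k);	/* wake via signal, not a bare wake_up_process() */

	/* ... then wait_for_completion() until the thread exits,
	 * as kthread_stop() does today ... */
}
```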

--
Jeff Layton <[email protected]>

2008-02-06 23:01:29

by Trond Myklebust

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads


On Wed, 2008-02-06 at 14:09 -0500, Jeff Layton wrote:
> On Wed, 06 Feb 2008 13:52:34 -0500
> Trond Myklebust <[email protected]> wrote:
>
> >
> > On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
> > > There's no guarantee that kthread_stop() won't wake up lockd before
> > > schedule_timeout() gets called, but after the last check for
> > > kthread_should_stop().
> >
> > Doesn't the BKL pretty much eliminate this race? (assuming you
> > transform that call to 'if (!kthread_should_stop())
> > schedule_timeout();')
> >
> > Trond
> >
>
> I don't think so. That would require that lockd_down is always called
> with the BKL held, and I don't think it is, is it?

Nothing stops you from grabbing the BKL inside lockd_down, though :-)


2008-02-07 08:50:12

by Christoph Hellwig

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads

On Wed, Feb 06, 2008 at 07:34:28AM -0500, Jeff Layton wrote:
> Yes. Perhaps we should consider a kthread_stop_with_signal() function
> that does a kthread_stop and sends a signal before waiting for
> completion? Most users of kthread_stop won't need it, but it would be
> nice here. CIFS could also probably use something like that.

Yes, it'd be useful for everything that calls socket functions from
a kernel thread.


2008-02-07 11:37:39

by Jeff Layton

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads

On Wed, 06 Feb 2008 18:01:16 -0500
Trond Myklebust <[email protected]> wrote:

>
> On Wed, 2008-02-06 at 14:09 -0500, Jeff Layton wrote:
> > On Wed, 06 Feb 2008 13:52:34 -0500
> > Trond Myklebust <[email protected]> wrote:
> >
> > >
> > > On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
> > > > There's no guarantee that kthread_stop() won't wake up lockd before
> > > > schedule_timeout() gets called, but after the last check for
> > > > kthread_should_stop().
> > >
> > > Doesn't the BKL pretty much eliminate this race? (assuming you
> > > transform that call to 'if (!kthread_should_stop())
> > > schedule_timeout();')
> > >
> > > Trond
> > >
> >
> > I don't think so. That would require that lockd_down is always called
> > with the BKL held, and I don't think it is, is it?
>
> Nothing stops you from grabbing the BKL inside lockd_down, though :-)
>

True, but what's worse -- keeping signaling in the codepath or
increasing reliance on the BKL? :-)

I think in the near term, you're probably right that taking the BKL in
lockd_down is the best scheme to prevent these races. I think the best
long term solution though is to add a way for kthread_stop to signal
the task before calling wait_for_completion.

I'll keep plugging away on this for now. If I can get a patch
upstream to add a kthread_stop_sig() call then we might be able to
avoid relying on the BKL here...

--
Jeff Layton <[email protected]>

2008-02-07 15:17:10

by Trond Myklebust

Subject: Re: [PATCH 2/2] NLM: Convert lockd to use kthreads


On Thu, 2008-02-07 at 06:37 -0500, Jeff Layton wrote:
> On Wed, 06 Feb 2008 18:01:16 -0500
> Trond Myklebust <[email protected]> wrote:
>
> >
> > On Wed, 2008-02-06 at 14:09 -0500, Jeff Layton wrote:
> > > On Wed, 06 Feb 2008 13:52:34 -0500
> > > Trond Myklebust <[email protected]> wrote:
> > >
> > > >
> > > > On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
> > > > > There's no guarantee that kthread_stop() won't wake up lockd before
> > > > > schedule_timeout() gets called, but after the last check for
> > > > > kthread_should_stop().
> > > >
> > > > Doesn't the BKL pretty much eliminate this race? (assuming you
> > > > transform that call to 'if (!kthread_should_stop())
> > > > schedule_timeout();')
> > > >
> > > > Trond
> > > >
> > >
> > > I don't think so. That would require that lockd_down is always called
> > > with the BKL held, and I don't think it is, is it?
> >
> > Nothing stops you from grabbing the BKL inside lockd_down, though :-)
> >
>
> True, but what's worse -- keeping signaling in the codepath or
> increasing reliance on the BKL? :-)
>
> I think in the near term, you're probably right that taking the BKL in
> lockd_down is the best scheme to prevent these races. I think the best
> long term solution though is to add a way for kthread_stop to signal
> the task before calling wait_for_completion.

Doh! Where are my brains?

You don't need the BKL either here. Just make sure that the test for
'kthread_should_stop()' occurs after the call to set_current_state().
wake_up_process() will reset the state to TASK_RUNNING, and so the
process is kept on the run queue when schedule_timeout() is called.
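
Concretely, that's the standard lost-wakeup-free sleep idiom for
kthreads, i.e. something along these lines (a sketch of the idiom, not
the actual patched svc_recv() code):

```c
/*
 * Publish the intent to sleep first, then test the stop flag.  If
 * kthread_stop() runs anywhere in between, its wake_up_process() puts
 * this task back into TASK_RUNNING, so schedule_timeout() returns
 * immediately instead of sleeping out the timeout.
 */
set_current_state(TASK_INTERRUPTIBLE);
if (!kthread_should_stop())
	schedule_timeout(timeout);
__set_current_state(TASK_RUNNING);
```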

> I'll keep plugging away on this for now. If I can get a patch
> upstream to add a kthread_stop_sig() call then we might be able to
> avoid relying on the BKL here...

No need for signals. We can get rid of them altogether (and good
riddance!).

Cheers
Trond