2020-05-01 16:02:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 0/4] allow multiple kthreadd's

From: "J. Bruce Fields" <[email protected]>

These patches allow a caller to create its own kthreadd.

The motivation is file delegations: currently any write operation from a
client breaks all delegations, even delegations held by the same client.

To fix that, we need to know which client is performing a given
operation.

So, we let nfsd put all the nfsd threads into the same thread group (by
spawning them from its own private kthreadd), then patch the delegation
code to treat delegation breaks from the same thread group as not
conflicting, and then leave it to nfsd to sort out conflicts among its
own clients. Those patches are in:

git://linux-nfs.org/~bfields/linux.git deleg-fix-self-conflicts

This was an idea from Trond. Part of his motivation was that it could
work for userspace servers (like Ganesha and Samba) as well. (We don't
currently let them request delegations, but probably will some day--it
shouldn't be difficult.)

Previously I considered instead adding a new field somewhere in the
struct task. That might require a new system call to expose to user
space. Or we might be able to put this in a keyring, if David Howells
thought that would work.

Before that I tried passing the identity of the breaker explicitly, but
that looks like it would require passing the new argument around to huge
swaths of the VFS.

Anyway, does this multiple kthreadd approach look reasonable?

(If so, who should handle the patches?)

--b.

J. Bruce Fields (4):
kthreads: minor kthreadd refactoring
kthreads: Simplify tsk_fork_get_node
kthreads: allow multiple kthreadd's
kthreads: allow cloning threads with different flags

include/linux/kthread.h | 21 +++++-
init/init_task.c | 3 +
init/main.c | 4 +-
kernel/fork.c | 4 ++
kernel/kthread.c | 140 +++++++++++++++++++++++++++++-----------
5 files changed, 132 insertions(+), 40 deletions(-)

--
2.26.2


2020-05-01 16:02:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 1/4] kthreads: minor kthreadd refactoring

From: "J. Bruce Fields" <[email protected]>

Trivial refactoring, no change in behavior.

Not really necessary, a separate function for the inner loop just seems
a little nicer to me.

Signed-off-by: J. Bruce Fields <[email protected]>
---
kernel/kthread.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfbfa481be3a..4217fded891a 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -578,6 +578,24 @@ int kthread_stop(struct task_struct *k)
}
EXPORT_SYMBOL(kthread_stop);

+void kthread_do_work(void)
+{
+ spin_lock(&kthread_create_lock);
+ while (!list_empty(&kthread_create_list)) {
+ struct kthread_create_info *create;
+
+ create = list_entry(kthread_create_list.next,
+ struct kthread_create_info, list);
+ list_del_init(&create->list);
+ spin_unlock(&kthread_create_lock);
+
+ create_kthread(create);
+
+ spin_lock(&kthread_create_lock);
+ }
+ spin_unlock(&kthread_create_lock);
+}
+
int kthreadd(void *unused)
{
struct task_struct *tsk = current;
@@ -597,20 +615,7 @@ int kthreadd(void *unused)
schedule();
__set_current_state(TASK_RUNNING);

- spin_lock(&kthread_create_lock);
- while (!list_empty(&kthread_create_list)) {
- struct kthread_create_info *create;
-
- create = list_entry(kthread_create_list.next,
- struct kthread_create_info, list);
- list_del_init(&create->list);
- spin_unlock(&kthread_create_lock);
-
- create_kthread(create);
-
- spin_lock(&kthread_create_lock);
- }
- spin_unlock(&kthread_create_lock);
+ kthread_do_work();
}

return 0;
--
2.26.2

2020-05-01 16:02:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 2/4] kthreads: Simplify tsk_fork_get_node

From: "J. Bruce Fields" <[email protected]>

This will also simplify a following patch that allows multiple
kthreadd's.

Signed-off-by: J. Bruce Fields <[email protected]>
---
init/init_task.c | 3 +++
kernel/fork.c | 4 ++++
kernel/kthread.c | 3 +--
3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/init/init_task.c b/init/init_task.c
index bd403ed3e418..fdd760393760 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -154,6 +154,9 @@ struct task_struct init_task
.vtime.starttime = 0,
.vtime.state = VTIME_SYS,
#endif
+#ifdef CONFIG_NUMA
+ .pref_node_fork = NUMA_NO_NODE,
+#endif
#ifdef CONFIG_NUMA_BALANCING
.numa_preferred_nid = NUMA_NO_NODE,
.numa_group = NULL,
diff --git a/kernel/fork.c b/kernel/fork.c
index 8c700f881d92..fa35890534d5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -942,6 +942,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
tsk->fail_nth = 0;
#endif

+#ifdef CONFIG_NUMA
+ tsk->pref_node_fork = NUMA_NO_NODE;
+#endif
+
#ifdef CONFIG_BLK_CGROUP
tsk->throttle_queue = NULL;
tsk->use_memdelay = 0;
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 4217fded891a..483bee57a9c8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -274,8 +274,7 @@ static int kthread(void *_create)
int tsk_fork_get_node(struct task_struct *tsk)
{
#ifdef CONFIG_NUMA
- if (tsk == kthreadd_task)
- return tsk->pref_node_fork;
+ return tsk->pref_node_fork;
#endif
return NUMA_NO_NODE;
}
--
2.26.2

2020-05-01 16:02:32

by J. Bruce Fields

[permalink] [raw]
Subject: [PATCH 4/4] kthreads: allow cloning threads with different flags

From: "J. Bruce Fields" <[email protected]>

This is so knfsd can add CLONE_THREAD.

Signed-off-by: J. Bruce Fields <[email protected]>
---
include/linux/kthread.h | 3 ++-
kernel/kthread.c | 11 +++++++----
2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index a7ffdf96a3b2..7069feb6da65 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -10,11 +10,12 @@ struct kthread_group {
spinlock_t create_lock;
struct list_head create_list;
struct task_struct *task;
+ unsigned long flags;
};

extern struct kthread_group kthreadd_default;

-struct kthread_group *kthread_start_group(char *);
+struct kthread_group *kthread_start_group(unsigned long, char *);
void kthread_stop_group(struct kthread_group *);

struct task_struct *kthread_group_create_on_node(struct kthread_group *,
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 5232f6f597b5..57f6687ecec7 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -29,6 +29,7 @@ struct kthread_group kthreadd_default = {
.name = "kthreadd",
.create_lock = __SPIN_LOCK_UNLOCKED(kthreadd_default.create_lock),
.create_list = LIST_HEAD_INIT(kthreadd_default.create_list),
+ .flags = CLONE_FS | CLONE_FILES | SIGCHLD,
};

void wake_kthreadd(struct kthread_group *kg)
@@ -36,7 +37,7 @@ void wake_kthreadd(struct kthread_group *kg)
wake_up_process(kg->task);
}

-struct kthread_group *kthread_start_group(char *name)
+struct kthread_group *kthread_start_group(unsigned long flags, char *name)
{
struct kthread_group *new;
struct task_struct *task;
@@ -47,6 +48,7 @@ struct kthread_group *kthread_start_group(char *name)
spin_lock_init(&new->create_lock);
INIT_LIST_HEAD(&new->create_list);
new->name = name;
+ new->flags = flags;
task = kthread_run(kthreadd, new, name);
if (IS_ERR(task)) {
kfree(new);
@@ -314,7 +316,8 @@ int tsk_fork_get_node(struct task_struct *tsk)
return NUMA_NO_NODE;
}

-static void create_kthread(struct kthread_create_info *create)
+static void create_kthread(struct kthread_create_info *create,
+ unsigned long flags)
{
int pid;

@@ -322,7 +325,7 @@ static void create_kthread(struct kthread_create_info *create)
current->pref_node_fork = create->node;
#endif
/* We want our own signal handler (we take no signals by default). */
- pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
+ pid = kernel_thread(kthread, create, flags);
if (pid < 0) {
/* If user was SIGKILLed, I release the structure. */
struct completion *done = xchg(&create->done, NULL);
@@ -645,7 +648,7 @@ void kthread_do_work(struct kthread_group *kg)
list_del_init(&create->list);
spin_unlock(&kg->create_lock);

- create_kthread(create);
+ create_kthread(create, kg->flags);

spin_lock(&kg->create_lock);
}
--
2.26.2

2020-05-01 18:00:23

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Fri, May 1, 2020 at 9:02 AM J. Bruce Fields <[email protected]> wrote:
>
> Anyway, does this multiple kthreadd approach look reasonable?

I don't see anything that looks alarming.

My main reaction was that I don't like the "kthreadd" name, but that's
because for some reason I always read it as "kthre add". That may be
just me. It normally doesn't bother me (this code doesn't get all that
much work on it, it's been very stable), but it was very obvious when
reading your patches.

In fact, I liked _your_ naming better, to the point where I was going
"'kthread_group' is a much better name than 'kthreadd', and that
'kthreadd()' function would read better as 'kthread_group_run()' or
something".

But that may just be a personal quirk of mine, and isn't a big deal.
On the whole the patches looked all sane to me.

> (If so, who should handle the patches?)

We have had _very_ little work in this area, probably because most of
the kthread work has been subsumed by workqueues.

Which kind of makes me want to point a finger at Tejun. But it's been
mostly PeterZ touching this file lately..

Linus

2020-05-01 18:22:49

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

Hello,

On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> Which kind of makes me want to point a finger at Tejun. But it's been
> mostly PeterZ touching this file lately..

Looks fine to me too. I don't quite understand the usecase tho. It looks
like all it's being used for is to tag some kthreads as belonging to the
same group. Can't that be done with kthread_data()?

Thanks.

--
tejun

2020-05-01 18:32:01

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Fri, May 1, 2020 at 11:22 AM Tejun Heo <[email protected]> wrote:
>
> Looks fine to me too. I don't quite understand the usecase tho. It looks
> like all it's being used for is to tag some kthreads as belonging to the
> same group. Can't that be done with kthread_data()?

I _think_ Bruce wants the signal handling unification too, because
nfsd wants to react to being shut down with signals.

But you're right, more explanation of why nfsd wants/needs a separate
grouping is a good idea.

Linus

2020-05-01 18:50:14

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> Hello,
>
> On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > Which kind of makes me want to point a finger at Tejun. But it's been
> > mostly PeterZ touching this file lately..
>
> Looks fine to me too. I don't quite understand the usecase tho. It looks
> like all it's being used for is to tag some kthreads as belonging to the
> same group.

Pretty much.

> Can't that be done with kthread_data()?

Huh, maybe so, thanks.

I need to check this from generic file locking code that could be run by
any task--but I assume there's an easy way I can check if I'm a kthread
before calling kthread_data(current).

I do expect to expose a delegation interface for userspace servers
eventually too. But we could do the tgid check for them and still use
kthread_data() for nfsd. That might work.

--b.

2020-05-01 19:06:20

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Fri, 2020-05-01 at 14:49 -0400, J. Bruce Fields wrote:
> On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> > Hello,
> >
> > On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > > Which kind of makes me want to point a finger at Tejun. But it's
> > > been
> > > mostly PeterZ touching this file lately..
> >
> > Looks fine to me too. I don't quite understand the usecase tho. It
> > looks
> > like all it's being used for is to tag some kthreads as belonging
> > to the
> > same group.
>
> Pretty much.
>

Wen running an instance of knfsd from inside a container, you want to
be able to have the knfsd kthreads be parented to the container init
process so that they get killed off when the container is killed.

Right now, we can easily leak those kernel threads simply by killing
the container.

> > Can't that be done with kthread_data()?
>
> Huh, maybe so, thanks.
>
> I need to check this from generic file locking code that could be run
> by
> any task--but I assume there's an easy way I can check if I'm a
> kthread
> before calling kthread_data(current).
>
> I do expect to expose a delegation interface for userspace servers
> eventually too. But we could do the tgid check for them and still
> use
> kthread_data() for nfsd. That might work.
>
> --b.
>
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2020-05-01 19:21:33

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

Hello,

On Fri, May 01, 2020 at 07:05:46PM +0000, Trond Myklebust wrote:
> Wen running an instance of knfsd from inside a container, you want to
> be able to have the knfsd kthreads be parented to the container init
> process so that they get killed off when the container is killed.
>
> Right now, we can easily leak those kernel threads simply by killing
> the container.

Hmm... I'm not sure that'd be a lot easier because they're in their own
thread group. It's not like you can queue signal to the group leader cause
the other kthreads to automatically die. Each would have to handle the exit
explicitly. As long as there is a way to iterate the member kthreads (e.g.
list going through kthread_data or sth else hanging off there), just using
existing kthread interface might not be much different, or maybe even easier
given how hairy signal handling in kthreads can get.

Thanks.

--
tejun

2020-05-01 19:23:31

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Fri, May 01, 2020 at 07:05:46PM +0000, Trond Myklebust wrote:
> On Fri, 2020-05-01 at 14:49 -0400, J. Bruce Fields wrote:
> > On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> > > Hello,
> > >
> > > On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > > > Which kind of makes me want to point a finger at Tejun. But it's
> > > > been
> > > > mostly PeterZ touching this file lately..
> > >
> > > Looks fine to me too. I don't quite understand the usecase tho. It
> > > looks
> > > like all it's being used for is to tag some kthreads as belonging
> > > to the
> > > same group.
> >
> > Pretty much.
>
> Wen running an instance of knfsd from inside a container, you want to
> be able to have the knfsd kthreads be parented to the container init
> process so that they get killed off when the container is killed.
>
> Right now, we can easily leak those kernel threads simply by killing
> the container.

Oh, got it.

Currently knfsd supports nfs service in containers, but it uses a single
set of threads to serve requests from any container. It should shut the
server threads down when the last container using them goes away.

--b.

2020-05-05 02:16:10

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Fri, May 01, 2020 at 02:21:54PM -0400, Tejun Heo wrote:
> Hello,
>
> On Fri, May 01, 2020 at 10:59:24AM -0700, Linus Torvalds wrote:
> > Which kind of makes me want to point a finger at Tejun. But it's been
> > mostly PeterZ touching this file lately..
>
> Looks fine to me too. I don't quite understand the usecase tho. It looks
> like all it's being used for is to tag some kthreads as belonging to the
> same group. Can't that be done with kthread_data()?

Yeah, so I'd forgotten about kthread->data.

We're currently using it to pass the struct svc_rqst that a new nfsd
thread needs. But once the new thread has gotten that, I guess it could
set kthread->data to some global value that it uses to say "I'm a knfsd
thread"?

I suppose that would work.

Though now I'm feeling greedy: it would be nice to have both some kind
of global flag, *and* keep kthread->data pointing to svc_rqst (as that
would give me a simpler and quicker way to figure out which client is
conflicting). Could I take a flag bit in kthread->flags, maybe?

--b.

2020-05-05 15:54:47

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

Hello, Bruce.

On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> We're currently using it to pass the struct svc_rqst that a new nfsd
> thread needs. But once the new thread has gotten that, I guess it could
> set kthread->data to some global value that it uses to say "I'm a knfsd
> thread"?
>
> I suppose that would work.
>
> Though now I'm feeling greedy: it would be nice to have both some kind
> of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> would give me a simpler and quicker way to figure out which client is
> conflicting). Could I take a flag bit in kthread->flags, maybe?

Hmm... that'd be solvable if kthread->data can point to a struct which does
both things, right? Because it doesn't have free() callback, it's a bit
awkward but the threadfn itself can unlink and RCU-free it before returning.

Thanks.

--
tejun

2020-05-05 16:24:32

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Tue, May 05, 2020 at 11:54:05AM -0400, Tejun Heo wrote:
> Hello, Bruce.
>
> On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> > We're currently using it to pass the struct svc_rqst that a new nfsd
> > thread needs. But once the new thread has gotten that, I guess it could
> > set kthread->data to some global value that it uses to say "I'm a knfsd
> > thread"?
> >
> > I suppose that would work.
> >
> > Though now I'm feeling greedy: it would be nice to have both some kind
> > of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> > would give me a simpler and quicker way to figure out which client is
> > conflicting). Could I take a flag bit in kthread->flags, maybe?
>
> Hmm... that'd be solvable if kthread->data can point to a struct which does
> both things, right?

Isn't this some sort of chicken-and-egg problem?

If you don't know whether a given kthread is an nfsd thread or not, then
it's not safe to assume that kthread->data points to some nfsd-specific
structure that might tell you whether it's an nfsd thread.

> Because it doesn't have free() callback, it's a bit
> awkward but the threadfn itself can unlink and RCU-free it before returning.

It's only ever going to be referenced from the thread itself. This is
just a way to ask "am I running as an nfsd thread?" when we're deep
inside generic filesystem code somewhere. So I don't think there's any
complicated lifetime issues here.

--b.

2020-05-05 21:01:58

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> Though now I'm feeling greedy: it would be nice to have both some kind
> of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> would give me a simpler and quicker way to figure out which client is
> conflicting). Could I take a flag bit in kthread->flags, maybe?

Would something like this be too hacky?:

--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -58,6 +58,7 @@ enum KTHREAD_BITS {
KTHREAD_IS_PER_CPU = 0,
KTHREAD_SHOULD_STOP,
KTHREAD_SHOULD_PARK,
+ KTHREAD_IS_NFSD,
};

static inline void set_kthread_struct(void *kthread)
@@ -164,6 +165,25 @@ void *kthread_data(struct task_struct *task)
return to_kthread(task)->data;
}

+void kthread_set_nfsd()
+{
+ set_bit(KTHREAD_IS_NFSD, &to_kthread(current)->flags);
+}
+EXPORT_SYMBOL_GPL(kthread_set_nfsd);
+
+void *kthread_nfsd_data()
+{
+ struct kthread *k;
+
+ if (!(current->flags & PF_KTHREAD))
+ return NULL;
+ k = to_kthread(current);
+ if (test_bit(KTHREAD_IS_NFSD, &k->flags))
+ return k->data;
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(kthread_nfsd_data);
+
/**
* kthread_probe_data - speculative version of kthread_data()
* @task: possible kthread task in question

It feels weird to make such a special case for nfsd, but it makes this
all very easy for me; complete patch below.

--b.

commit 8b0a2e86dafa
Author: J. Bruce Fields <[email protected]>
Date: Fri Jul 28 16:35:15 2017 -0400

nfsd: clients don't need to break their own delegations

We currently revoke read delegations on any write open or any operation
that modifies file data or metadata (including rename, link, and
unlink). But if the delegation in question is the only read delegation
and is held by the client performing the operation, that's not really
necessary.

It's not always possible to prevent this in the NFSv4.0 case, because
there's not always a way to determine which client an NFSv4.0 delegation
came from. (In theory we could try to guess this from the transport
layer, e.g., by assuming all traffic on a given TCP connection comes
from the same client. But that's not really correct.)

In the NFSv4.1 case the session layer always tells us the client.

This patch should remove such self-conflicts in all cases where we can
reliably determine the client from the compound.

To do that we need to track "who" is performing a given (possibly
lease-breaking) file operation. We're doing that by storing the
information in the svc_rqst and using kthread_data() to map the current
task back to a svc_rqst.

Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 5057e4d9dcd1..9fdcec416614 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -425,6 +425,7 @@ prototypes::
int (*lm_grant)(struct file_lock *, struct file_lock *, int);
void (*lm_break)(struct file_lock *); /* break_lease callback */
int (*lm_change)(struct file_lock **, int);
+ bool (*lm_breaker_owns_lease)(struct file_lock *);

locking rules:

@@ -435,6 +436,7 @@ lm_notify: yes yes no
lm_grant: no no no
lm_break: yes no no
lm_change yes no no
+lm_breaker_owns_lease: no no no
========== ============= ================= =========

buffer_head
diff --git a/fs/locks.c b/fs/locks.c
index b8a31c1c4fff..a3f186846e93 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1557,6 +1557,9 @@ static bool leases_conflict(struct file_lock *lease, struct file_lock *breaker)
{
bool rc;

+ if (lease->fl_lmops->lm_breaker_owns_lease
+ && lease->fl_lmops->lm_breaker_owns_lease(lease))
+ return false;
if ((breaker->fl_flags & FL_LAYOUT) != (lease->fl_flags & FL_LAYOUT)) {
rc = false;
goto trace;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 0e75f7fb5fec..a6d73aa51ce4 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2302,6 +2302,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
}
check_if_stalefh_allowed(args);

+ rqstp->rq_lease_breaker = (void **)&cstate->clp;
+
trace_nfsd_compound(rqstp, args->opcnt);
while (!status && resp->opcnt < args->opcnt) {
op = &args->ops[resp->opcnt++];
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e32ecedece0f..2368051bbef3 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4520,6 +4520,19 @@ nfsd_break_deleg_cb(struct file_lock *fl)
return ret;
}

+static bool nfsd_breaker_owns_lease(struct file_lock *fl)
+{
+ struct nfs4_delegation *dl = fl->fl_owner;
+ struct svc_rqst *rqst;
+ struct nfs4_client *clp;
+
+ rqst = kthread_nfsd_data();
+ if (!rqst)
+ return false;
+ clp = *(rqst->rq_lease_breaker);
+ return dl->dl_stid.sc_client == clp;
+}
+
static int
nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
struct list_head *dispose)
@@ -4531,6 +4544,7 @@ nfsd_change_deleg_cb(struct file_lock *onlist, int arg,
}

static const struct lock_manager_operations nfsd_lease_mng_ops = {
+ .lm_breaker_owns_lease = nfsd_breaker_owns_lease,
.lm_break = nfsd_break_deleg_cb,
.lm_change = nfsd_change_deleg_cb,
};
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index ca9fd348548b..9c15b77a726f 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -888,6 +888,8 @@ nfsd(void *vrqstp)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
int err;

+ kthread_set_nfsd();
+
/* Lock module and set up kernel thread */
mutex_lock(&nfsd_mutex);

@@ -1011,6 +1013,7 @@ nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
*statp = rpc_garbage_args;
return 1;
}
+ rqstp->rq_lease_breaker = NULL;
/*
* Give the xdr decoder a chance to change this if it wants
* (necessary in the NFSv4.0 compound case)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4f6f59b4f22a..4b784560ffaf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1045,6 +1045,7 @@ struct lock_manager_operations {
bool (*lm_break)(struct file_lock *);
int (*lm_change)(struct file_lock *, int, struct list_head *);
void (*lm_setup)(struct file_lock *, void **);
+ bool (*lm_breaker_owns_lease)(struct file_lock *);
};

struct lock_manager {
diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8bbcaad7ef0f..d374cad65931 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -58,6 +58,8 @@ bool kthread_should_park(void);
bool __kthread_should_park(struct task_struct *k);
bool kthread_freezable_should_stop(bool *was_frozen);
void *kthread_data(struct task_struct *k);
+void kthread_set_nfsd(void);
+void *kthread_nfsd_data(void);
void *kthread_probe_data(struct task_struct *k);
int kthread_park(struct task_struct *k);
void kthread_unpark(struct task_struct *k);
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index fd390894a584..abf4a57ce4a7 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -299,6 +299,7 @@ struct svc_rqst {
struct net *rq_bc_net; /* pointer to backchannel's
* net namespace
*/
+ void ** rq_lease_breaker; /* The v4 client breaking a lease */
};

#define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net : rqst->rq_bc_net)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 09b103b92c5a..54d83f329b85 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -58,6 +58,7 @@ enum KTHREAD_BITS {
KTHREAD_IS_PER_CPU = 0,
KTHREAD_SHOULD_STOP,
KTHREAD_SHOULD_PARK,
+ KTHREAD_IS_NFSD,
};

static inline void set_kthread_struct(void *kthread)
@@ -164,6 +165,25 @@ void *kthread_data(struct task_struct *task)
return to_kthread(task)->data;
}

+void kthread_set_nfsd()
+{
+ set_bit(KTHREAD_IS_NFSD, &to_kthread(current)->flags);
+}
+EXPORT_SYMBOL_GPL(kthread_set_nfsd);
+
+void *kthread_nfsd_data()
+{
+ struct kthread *k;
+
+ if (!(current->flags & PF_KTHREAD))
+ return NULL;
+ k = to_kthread(current);
+ if (test_bit(KTHREAD_IS_NFSD, &k->flags))
+ return k->data;
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(kthread_nfsd_data);
+
/**
* kthread_probe_data - speculative version of kthread_data()
* @task: possible kthread task in question

2020-05-05 21:10:34

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

Hello,

On Tue, May 05, 2020 at 05:01:18PM -0400, J. Bruce Fields wrote:
> On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> > Though now I'm feeling greedy: it would be nice to have both some kind
> > of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> > would give me a simpler and quicker way to figure out which client is
> > conflicting). Could I take a flag bit in kthread->flags, maybe?
>
> Would something like this be too hacky?:

It's not the end of the world but a bit hacky. I wonder whether something
like the following would work better for identifying worker type so that you
can do sth like

if (kthread_fn(current) == nfsd)
return kthread_data(current);
else
return NULL;

Thanks.

diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfbfa481be3a..4f3ab9f2c994 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -46,6 +46,7 @@ struct kthread_create_info
struct kthread {
unsigned long flags;
unsigned int cpu;
+ int (*threadfn)(void *);
void *data;
struct completion parked;
struct completion exited;
@@ -152,6 +153,13 @@ bool kthread_freezable_should_stop(bool *was_frozen)
}
EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);

+void *kthread_fn(struct task_struct *task)
+{
+ if (task->flags & PF_KTHREAD)
+ return to_kthread(task)->threadfn;
+ return NULL;
+}
+
/**
* kthread_data - return data value specified on kthread creation
* @task: kthread task in question
@@ -244,6 +252,7 @@ static int kthread(void *_create)
do_exit(-ENOMEM);
}

+ self->threadfn = threadfn;
self->data = data;
init_completion(&self->exited);
init_completion(&self->parked);

--
tejun

2020-05-05 21:26:06

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> Hello,
>
> On Tue, May 05, 2020 at 05:01:18PM -0400, J. Bruce Fields wrote:
> > On Mon, May 04, 2020 at 10:15:14PM -0400, J. Bruce Fields wrote:
> > > Though now I'm feeling greedy: it would be nice to have both some kind
> > > of global flag, *and* keep kthread->data pointing to svc_rqst (as that
> > > would give me a simpler and quicker way to figure out which client is
> > > conflicting). Could I take a flag bit in kthread->flags, maybe?
> >
> > Would something like this be too hacky?:
>
> It's not the end of the world but a bit hacky. I wonder whether something
> like the following would work better for identifying worker type so that you
> can do sth like
>
> if (kthread_fn(current) == nfsd)
> return kthread_data(current);
> else
> return NULL;

Yes, definitely more generic, looks good to me.

--b.

>
> Thanks.
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index bfbfa481be3a..4f3ab9f2c994 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -46,6 +46,7 @@ struct kthread_create_info
> struct kthread {
> unsigned long flags;
> unsigned int cpu;
> + int (*threadfn)(void *);
> void *data;
> struct completion parked;
> struct completion exited;
> @@ -152,6 +153,13 @@ bool kthread_freezable_should_stop(bool *was_frozen)
> }
> EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);
>
> +void *kthread_fn(struct task_struct *task)
> +{
> + if (task->flags & PF_KTHREAD)
> + return to_kthread(task)->threadfn;
> + return NULL;
> +}
> +
> /**
> * kthread_data - return data value specified on kthread creation
> * @task: kthread task in question
> @@ -244,6 +252,7 @@ static int kthread(void *_create)
> do_exit(-ENOMEM);
> }
>
> + self->threadfn = threadfn;
> self->data = data;
> init_completion(&self->exited);
> init_completion(&self->parked);
>
> --
> tejun

2020-05-06 15:39:44

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Tue, May 05, 2020 at 05:25:27PM -0400, J. Bruce Fields wrote:
> On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> > It's not the end of the world but a bit hacky. I wonder whether something
> > like the following would work better for identifying worker type so that you
> > can do sth like
> >
> > if (kthread_fn(current) == nfsd)
> > return kthread_data(current);
> > else
> > return NULL;
>
> Yes, definitely more generic, looks good to me.

This is what I'm testing with.

If it's OK with you, could I add your Signed-off-by and take it through
the nfsd tree? I'll have some other patches that will depend on it.

--b.


commit 379bfe5257b6
Author: Tejun Heo <[email protected]>
Date: Tue May 5 21:26:07 2020 -0400

kthread: save thread function

It's handy to keep the kthread_fn just as a unique cookie to identify
classes of kthreads. E.g. if you can verify that a given task is
running your thread_fn, then you may know what sort of type kthread_data
points to.

We'll use this in nfsd to pass some information into the vfs. Note it
will need kthread_data() exported too.

Signed-off-by: J. Bruce Fields <[email protected]>

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 8bbcaad7ef0f..c00ee443833f 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -57,6 +57,7 @@ bool kthread_should_stop(void);
bool kthread_should_park(void);
bool __kthread_should_park(struct task_struct *k);
bool kthread_freezable_should_stop(bool *was_frozen);
+void *kthread_fn(struct task_struct *k);
void *kthread_data(struct task_struct *k);
void *kthread_probe_data(struct task_struct *k);
int kthread_park(struct task_struct *k);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index bfbfa481be3a..b87c4a9ba91d 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -46,6 +46,7 @@ struct kthread_create_info
struct kthread {
unsigned long flags;
unsigned int cpu;
+ int (*threadfn)(void *);
void *data;
struct completion parked;
struct completion exited;
@@ -152,6 +153,20 @@ bool kthread_freezable_should_stop(bool *was_frozen)
}
EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);

+/**
+ * kthread_fn - return the function specified on kthread creation
+ * @task: kthread task in question
+ *
+ * Returns NULL if the task is not a kthread.
+ */
+void *kthread_fn(struct task_struct *task)
+{
+ if (task->flags & PF_KTHREAD)
+ return to_kthread(task)->threadfn;
+ return NULL;
+}
+EXPORT_SYMBOL_GPL(kthread_fn);
+
/**
* kthread_data - return data value specified on kthread creation
* @task: kthread task in question
@@ -164,6 +179,7 @@ void *kthread_data(struct task_struct *task)
{
return to_kthread(task)->data;
}
+EXPORT_SYMBOL_GPL(kthread_data);

/**
* kthread_probe_data - speculative version of kthread_data()
@@ -244,6 +260,7 @@ static int kthread(void *_create)
do_exit(-ENOMEM);
}

+ self->threadfn = threadfn;
self->data = data;
init_completion(&self->exited);
init_completion(&self->parked);

2020-05-06 15:40:48

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

Hello, Bruce.

On Wed, May 06, 2020 at 11:36:58AM -0400, J. Bruce Fields wrote:
> On Tue, May 05, 2020 at 05:25:27PM -0400, J. Bruce Fields wrote:
> > On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> > > It's not the end of the world but a bit hacky. I wonder whether something
> > > like the following would work better for identifying worker type so that you
> > > can do sth like
> > >
> > > if (kthread_fn(current) == nfsd)
> > > return kthread_data(current);
> > > else
> > > return NULL;
> >
> > Yes, definitely more generic, looks good to me.
>
> This is what I'm testing with.
>
> If it's OK with you, could I add your Signed-off-by and take it through
> the nfsd tree? I'll have some other patches that will depend on it.

Please feel free to use the code however you see fit. Given that it'll be
originating from you, my signed-off-by might not be the right tag. Something
like Original-patch-by should be good (nothing is fine too).

Thanks.

--
tejun

2020-05-06 15:55:34

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 0/4] allow multiple kthreadd's

On Wed, May 06, 2020 at 11:39:20AM -0400, Tejun Heo wrote:
> Hello, Bruce.
>
> On Wed, May 06, 2020 at 11:36:58AM -0400, J. Bruce Fields wrote:
> > On Tue, May 05, 2020 at 05:25:27PM -0400, J. Bruce Fields wrote:
> > > On Tue, May 05, 2020 at 05:09:56PM -0400, Tejun Heo wrote:
> > > > It's not the end of the world but a bit hacky. I wonder whether something
> > > > like the following would work better for identifying worker type so that you
> > > > can do sth like
> > > >
> > > > if (kthread_fn(current) == nfsd)
> > > > return kthread_data(current);
> > > > else
> > > > return NULL;
> > >
> > > Yes, definitely more generic, looks good to me.
> >
> > This is what I'm testing with.
> >
> > If it's OK with you, could I add your Signed-off-by and take it through
> > the nfsd tree? I'll have some other patches that will depend on it.
>
> Please feel free to use the code however you see fit. Given that it'll be
> originating from you, my signed-off-by might not be the right tag. Something
> like Original-patch-by should be good (nothing is fine too).

OK, I'll do that, thanks!

--b.