2021-11-29 22:30:49

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH v2] NFSD: Fix RCU-related sparse splat

To address this error:

CC [M] fs/nfsd/filecache.o
CHECK /home/cel/src/linux/linux/fs/nfsd/filecache.c
/home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: error: incompatible types in comparison expression (different address spaces):
/home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net [noderef] __rcu *
/home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net *

The "net" field in struct nfsd_fcache_disposal is not annotated as
requiring an RCU assignment, so replace the macro that includes an
invocation of rcu_check_sparse() with an equivalent that does not.

Signed-off-by: Chuck Lever <[email protected]>
---
fs/nfsd/filecache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index fdf89fcf1a0c..3b172eda0e9a 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -772,7 +772,7 @@ nfsd_alloc_fcache_disposal(struct net *net)
static void
nfsd_free_fcache_disposal(struct nfsd_fcache_disposal *l)
{
- rcu_assign_pointer(l->net, NULL);
+ WRITE_ONCE(l->net, NULL);
cancel_work_sync(&l->work);
nfsd_file_dispose_list(&l->freeme);
kfree_rcu(l, rcu);



2021-11-30 22:52:55

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v2] NFSD: Fix RCU-related sparse splat

On Mon, Nov 29, 2021 at 01:46:07PM -0500, Chuck Lever wrote:
> To address this error:
>
> CC [M] fs/nfsd/filecache.o
> CHECK /home/cel/src/linux/linux/fs/nfsd/filecache.c
> /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: error: incompatible types in comparison expression (different address spaces):
> /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net [noderef] __rcu *
> /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net *
>
> The "net" field in struct nfsd_fcache_disposal is not annotated as
> requiring an RCU assignment, so replace the macro that includes an
> invocation of rcu_check_sparse() with an equivalent that does not.
>
> Signed-off-by: Chuck Lever <[email protected]>

From an RCU perspective:

Acked-by: Paul E. McKenney <[email protected]>

But it would be good to get someone more familiar with the code to
look at this.

Thanx, Paul

> ---
> fs/nfsd/filecache.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> index fdf89fcf1a0c..3b172eda0e9a 100644
> --- a/fs/nfsd/filecache.c
> +++ b/fs/nfsd/filecache.c
> @@ -772,7 +772,7 @@ nfsd_alloc_fcache_disposal(struct net *net)
> static void
> nfsd_free_fcache_disposal(struct nfsd_fcache_disposal *l)
> {
> - rcu_assign_pointer(l->net, NULL);
> + WRITE_ONCE(l->net, NULL);
> cancel_work_sync(&l->work);
> nfsd_file_dispose_list(&l->freeme);
> kfree_rcu(l, rcu);
>

2021-11-30 23:36:25

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v2] NFSD: Fix RCU-related sparse splat

On Wed, 01 Dec 2021, [email protected] wrote:
> On Mon, Nov 29, 2021 at 01:46:07PM -0500, Chuck Lever wrote:
> > To address this error:
> >
> > CC [M] fs/nfsd/filecache.o
> > CHECK /home/cel/src/linux/linux/fs/nfsd/filecache.c
> > /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: error: incompatible types in comparison expression (different address spaces):
> > /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net [noderef] __rcu *
> > /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net *
> >
> > The "net" field in struct nfsd_fcache_disposal is not annotated as
> > requiring an RCU assignment, so replace the macro that includes an
> > invocation of rcu_check_sparse() with an equivalent that does not.
> >
> > Signed-off-by: Chuck Lever <[email protected]>
>
> From an RCU perspective:
>
> Acked-by: Paul E. McKenney <[email protected]>
>
> But it would be good to get someone more familiar with the code to
> look at this.

There is a global list of 'struct nfsd_fcache_disposal', potentially one
for each network namespace. Each contains a '*net' which is effectively
an opaque lookup key. It is never dereferenced - only used to find the
nfsd_fcache_disposal which matches a given 'struct net *'.

The lookup happens under rcu_read_lock(). A 'struct net' is freed after
a grace period (see cleanup_net() in net_namespace.c) so to ensure the
lookup doesn't find a false positive - caused by a 'struct net' being
deallocated and reallocated, it is sufficient to clear the ->net pointer
while the the net is known to still be active. At most, a write-barrier
might be justified. i.e. smp_store_release(l->net, NULL);

However ... the list of nfsd_fcache_disposal seems unnecessary.
nfsd has a 'struct nfsd_net' which parallels any interesting 'struct
net' using the net_generic() framework.
If the nfs_fcache_disposal were linked from the nfsd_net, there would be
no need for the list, no need to record the ->net, and not need for the
code in question here.
The nfsd_fcache_disposal is destroyed (nfsd_file_cache_shutdown_new())
before the nfsd_net is destroyed, so there are no lifetime issues with
keeping the separate list.

So I think this patch is not the best fix for the problem highlighted by
the warning... it's not wrong though.

NeilBrown

2021-11-30 23:58:22

by NeilBrown

[permalink] [raw]
Subject: [PATCH/rfc] NFSD: simplify per-net file cache management


We currently have a 'laundrette' for closing cached files - a different
work-item for each network-namespace.

These 'laundrettes' (aka struct nfsd_fcache_disposal) are currently on a
list, and are freed using rcu.

The list is not necessary as we have a per-namespace structure (struct
nfsd_net) which can hold a link to the nfsd_fcache_disposal.
The use of kfree_rcu is also unnecessary as the cache is cleaned of all
files associated with a given namespace, and no new files can be added,
before the nfsd_fcache_disposal is freed.

So add a '->fcache_disposal' link to nfsd_net, and discard the list
management and rcu usage.

Signed-off-by: NeilBrown <[email protected]>
---

This patch compiles, but I haven't tested it. It is a suggestion.

NeilBrown


fs/nfsd/filecache.c | 76 +++++++++------------------------------------
fs/nfsd/netns.h | 2 ++
2 files changed, 17 insertions(+), 61 deletions(-)

diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index fdf89fcf1a0c..aa5dca498b27 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -44,12 +44,9 @@ struct nfsd_fcache_bucket {
static DEFINE_PER_CPU(unsigned long, nfsd_file_cache_hits);

struct nfsd_fcache_disposal {
- struct list_head list;
struct work_struct work;
- struct net *net;
spinlock_t lock;
struct list_head freeme;
- struct rcu_head rcu;
};

static struct workqueue_struct *nfsd_filecache_wq __read_mostly;
@@ -62,8 +59,6 @@ static long nfsd_file_lru_flags;
static struct fsnotify_group *nfsd_file_fsnotify_group;
static atomic_long_t nfsd_filecache_count;
static struct delayed_work nfsd_filecache_laundrette;
-static DEFINE_SPINLOCK(laundrette_lock);
-static LIST_HEAD(laundrettes);

static void nfsd_file_gc(void);

@@ -367,19 +362,13 @@ nfsd_file_list_remove_disposal(struct list_head *dst,
static void
nfsd_file_list_add_disposal(struct list_head *files, struct net *net)
{
- struct nfsd_fcache_disposal *l;
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct nfsd_fcache_disposal *l = nn->fcache_disposal;

- rcu_read_lock();
- list_for_each_entry_rcu(l, &laundrettes, list) {
- if (l->net == net) {
- spin_lock(&l->lock);
- list_splice_tail_init(files, &l->freeme);
- spin_unlock(&l->lock);
- queue_work(nfsd_filecache_wq, &l->work);
- break;
- }
- }
- rcu_read_unlock();
+ spin_lock(&l->lock);
+ list_splice_tail_init(files, &l->freeme);
+ spin_unlock(&l->lock);
+ queue_work(nfsd_filecache_wq, &l->work);
}

static void
@@ -755,7 +744,7 @@ nfsd_file_cache_purge(struct net *net)
}

static struct nfsd_fcache_disposal *
-nfsd_alloc_fcache_disposal(struct net *net)
+nfsd_alloc_fcache_disposal(void)
{
struct nfsd_fcache_disposal *l;

@@ -763,7 +752,6 @@ nfsd_alloc_fcache_disposal(struct net *net)
if (!l)
return NULL;
INIT_WORK(&l->work, nfsd_file_delayed_close);
- l->net = net;
spin_lock_init(&l->lock);
INIT_LIST_HEAD(&l->freeme);
return l;
@@ -772,61 +760,27 @@ nfsd_alloc_fcache_disposal(struct net *net)
static void
nfsd_free_fcache_disposal(struct nfsd_fcache_disposal *l)
{
- rcu_assign_pointer(l->net, NULL);
cancel_work_sync(&l->work);
nfsd_file_dispose_list(&l->freeme);
- kfree_rcu(l, rcu);
-}
-
-static void
-nfsd_add_fcache_disposal(struct nfsd_fcache_disposal *l)
-{
- spin_lock(&laundrette_lock);
- list_add_tail_rcu(&l->list, &laundrettes);
- spin_unlock(&laundrette_lock);
-}
-
-static void
-nfsd_del_fcache_disposal(struct nfsd_fcache_disposal *l)
-{
- spin_lock(&laundrette_lock);
- list_del_rcu(&l->list);
- spin_unlock(&laundrette_lock);
-}
-
-static int
-nfsd_alloc_fcache_disposal_net(struct net *net)
-{
- struct nfsd_fcache_disposal *l;
-
- l = nfsd_alloc_fcache_disposal(net);
- if (!l)
- return -ENOMEM;
- nfsd_add_fcache_disposal(l);
- return 0;
+ kfree(l);
}

static void
nfsd_free_fcache_disposal_net(struct net *net)
{
- struct nfsd_fcache_disposal *l;
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+ struct nfsd_fcache_disposal *l = nn->fcache_disposal;

- rcu_read_lock();
- list_for_each_entry_rcu(l, &laundrettes, list) {
- if (l->net != net)
- continue;
- nfsd_del_fcache_disposal(l);
- rcu_read_unlock();
- nfsd_free_fcache_disposal(l);
- return;
- }
- rcu_read_unlock();
+ nfsd_free_fcache_disposal(l);
}

int
nfsd_file_cache_start_net(struct net *net)
{
- return nfsd_alloc_fcache_disposal_net(net);
+ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+ nn->fcache_disposal = nfsd_alloc_fcache_disposal();
+ return nn->fcache_disposal ? 0 : -ENOMEM;
}

void
diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 021acdc0d03b..9e8b77d2a3a4 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -185,6 +185,8 @@ struct nfsd_net {

/* utsname taken from the process that starts the server */
char nfsd_name[UNX_MAXNODENAME+1];
+
+ struct nfsd_fcache_disposal *fcache_disposal;
};

/* Simple check to find out if a given net was properly initialized */
--
2.34.1


2021-12-01 00:22:18

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v2] NFSD: Fix RCU-related sparse splat



> On Nov 30, 2021, at 6:36 PM, NeilBrown <[email protected]> wrote:
>
> On Wed, 01 Dec 2021, [email protected] wrote:
>> On Mon, Nov 29, 2021 at 01:46:07PM -0500, Chuck Lever wrote:
>>> To address this error:
>>>
>>> CC [M] fs/nfsd/filecache.o
>>> CHECK /home/cel/src/linux/linux/fs/nfsd/filecache.c
>>> /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: error: incompatible types in comparison expression (different address spaces):
>>> /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net [noderef] __rcu *
>>> /home/cel/src/linux/linux/fs/nfsd/filecache.c:772:9: struct net *
>>>
>>> The "net" field in struct nfsd_fcache_disposal is not annotated as
>>> requiring an RCU assignment, so replace the macro that includes an
>>> invocation of rcu_check_sparse() with an equivalent that does not.
>>>
>>> Signed-off-by: Chuck Lever <[email protected]>
>>
>> From an RCU perspective:
>>
>> Acked-by: Paul E. McKenney <[email protected]>
>>
>> But it would be good to get someone more familiar with the code to
>> look at this.
>
> There is a global list of 'struct nfsd_fcache_disposal', potentially one
> for each network namespace. Each contains a '*net' which is effectively
> an opaque lookup key. It is never dereferenced - only used to find the
> nfsd_fcache_disposal which matches a given 'struct net *'.
>
> The lookup happens under rcu_read_lock(). A 'struct net' is freed after
> a grace period (see cleanup_net() in net_namespace.c) so to ensure the
> lookup doesn't find a false positive - caused by a 'struct net' being
> deallocated and reallocated, it is sufficient to clear the ->net pointer
> while the the net is known to still be active. At most, a write-barrier
> might be justified. i.e. smp_store_release(l->net, NULL);

By way of further explanation:

The Documentation/ for RCU (ie, "RCU for Dummies") suggests that
rcu_assign_pointer() is adequate for preventing undesirable
compiler optimizations or CPU load/store reordering.

rcu_assign_pointer() uses WRITE_ONCE for constants like NULL and
smp_store_release() when assigning variable values. I copied the
use of WRITE_ONCE() from rcu_assign_pointer(). So I expect exactly
zero change in behavior or compiled code... (but I should have
checked the generated object to verify that is the case).


> However ... the list of nfsd_fcache_disposal seems unnecessary.
> nfsd has a 'struct nfsd_net' which parallels any interesting 'struct
> net' using the net_generic() framework.
> If the nfs_fcache_disposal were linked from the nfsd_net, there would be
> no need for the list, no need to record the ->net, and not need for the
> code in question here.
> The nfsd_fcache_disposal is destroyed (nfsd_file_cache_shutdown_new())
> before the nfsd_net is destroyed, so there are no lifetime issues with
> keeping the separate list.

Sure. If it makes sense to use nfsd_net instead of a separate data
structure, then that can be made to happen. I would like to hear
from Trond regarding why he felt a separate data structure was
necessary for fcache_disposal.


--
Chuck Lever




2021-12-01 01:38:57

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH v2] NFSD: Fix RCU-related sparse splat

On Wed, 01 Dec 2021, Chuck Lever III wrote:
>
> By way of further explanation:
>
> The Documentation/ for RCU (ie, "RCU for Dummies") suggests that
> rcu_assign_pointer() is adequate for preventing undesirable
> compiler optimizations or CPU load/store reordering.
>
> rcu_assign_pointer() uses WRITE_ONCE for constants like NULL and
> smp_store_release() when assigning variable values. I copied the
> use of WRITE_ONCE() from rcu_assign_pointer(). So I expect exactly
> zero change in behavior or compiled code... (but I should have
> checked the generated object to verify that is the case).

True, there would be no change in behaviour - but we should at least ask
if that behaviour is correct, and why.

If any barriers are needed here, they would have to be between the
assignment of NULL here, and the tests of l->net in
nfsd_file_list_add_disposal()
nfsd_free_fcache_disposal_net()
(because they are the only places ->net is used)
The only conceivable race is that they will see a value in ->net "after"
NULL has been assigned.
If there were a race there, it would be between different cpus, so
smp_store_relase() and smp_load_acquire()
would be the correct tools to avoid that race.

If, on the other hand, there is no chance of a race, then there is no
need to assign NULL to ->net at all. I believe this is actually the
case.
As the 'net' is freed using kfree_rcu, there is no possibility for a
search that started before something was removed from the list to get a
false-positive when testing if l->net == net.

So while your change is safe from a behavioural perspective, I don't
think it is safe from a maintenance perspective because it leaves in
place code that doesn't really make sense, but removes the warning that
helps us to find that nonsense.

>
> Sure. If it makes sense to use nfsd_net instead of a separate data
> structure, then that can be made to happen. I would like to hear
> from Trond regarding why he felt a separate data structure was
> necessary for fcache_disposal.

That would be best, yes.

Thanks,
NeilBrown

2021-12-02 15:33:38

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH v2] NFSD: Fix RCU-related sparse splat



> On Nov 30, 2021, at 8:38 PM, NeilBrown <[email protected]> wrote:
>
> On Wed, 01 Dec 2021, Chuck Lever III wrote:
>>
>> By way of further explanation:
>>
>> The Documentation/ for RCU (ie, "RCU for Dummies") suggests that
>> rcu_assign_pointer() is adequate for preventing undesirable
>> compiler optimizations or CPU load/store reordering.
>>
>> rcu_assign_pointer() uses WRITE_ONCE for constants like NULL and
>> smp_store_release() when assigning variable values. I copied the
>> use of WRITE_ONCE() from rcu_assign_pointer(). So I expect exactly
>> zero change in behavior or compiled code... (but I should have
>> checked the generated object to verify that is the case).
>
> True, there would be no change in behaviour - but we should at least ask
> if that behaviour is correct, and why.
>
> If any barriers are needed here, they would have to be between the
> assignment of NULL here, and the tests of l->net in
> nfsd_file_list_add_disposal()
> nfsd_free_fcache_disposal_net()
> (because they are the only places ->net is used)
> The only conceivable race is that they will see a value in ->net "after"
> NULL has been assigned.
> If there were a race there, it would be between different cpus, so
> smp_store_relase() and smp_load_acquire()
> would be the correct tools to avoid that race.

smp_load_acquire() is not used in those functions, so one might
draw the conclusion that either the RCU annotations are incorrect,
or that degree of concurrence paranoia (including using
rcu_assign_pointer() to assign the NULL) is unnecessary.


> If, on the other hand, there is no chance of a race, then there is no
> need to assign NULL to ->net at all. I believe this is actually the
> case.
> As the 'net' is freed using kfree_rcu, there is no possibility for a
> search that started before something was removed from the list to get a
> false-positive when testing if l->net == net.
>
> So while your change is safe from a behavioural perspective, I don't
> think it is safe from a maintenance perspective because it leaves in
> place code that doesn't really make sense, but removes the warning that
> helps us to find that nonsense.

Well I agree that the code is either hard to follow, incomplete,
or incorrect, and therefore needs to be cleaned up so it does not
accrue further technical debt.


>> Sure. If it makes sense to use nfsd_net instead of a separate data
>> structure, then that can be made to happen. I would like to hear
>> from Trond regarding why he felt a separate data structure was
>> necessary for fcache_disposal.
>
> That would be best, yes.

Since Trond was active recently, I think we can conclude from his
silence he does not care and we are free to act as we see fit.

I'm comfortable starting with https://lore.kernel.org/linux-nfs/[email protected]/raw

--
Chuck Lever