LinuxLists.cc - [PATCH 0/2] nfs_complete_rename() calls d_move() without i

2017-06-15 17:31:47

Subject: [PATCH 0/2] nfs_complete_rename() calls d_move() without i_mutex

Once commit 920b4530fb80430ff30ef83efe21ba1fa5623731 "NFS: nfs_rename()
handle -ERESTARTSYS dentry left behind" moved the local d_move() into the
RPC asyncronous context, d_move() could be called without holding the
directories' i_mutex.

Let's revert that commit, and a follow-up fix for it in 1/2, and then fix
the original problem once more by forcing a revalidation of the old and new
directories if we notice that the rename was interrupted in 2/2.

Benjamin Coddington (2):
Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"
NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

fs/nfs/dir.c | 47 ++++++++++++++++++++---------------------------
fs/nfs/unlink.c | 7 +++++++
include/linux/nfs_xdr.h | 1 +
3 files changed, 28 insertions(+), 27 deletions(-)

--
2.9.3

2017-06-15 17:31:48

by Benjamin Coddington

[permalink] [raw]

Subject: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

An interrupted rename will leave the old dentry behind if the rename
succeeds. Fix this by forcing a lookup the next time through
->d_revalidate.

A previous attempt at solving this problem took the approach to complete
the work of the rename asynchronously, however that approach was wrong
since it would allow the d_move() to occur after the directory's i_mutex
had been dropped by the original process.

Signed-off-by: Benjamin Coddington <[email protected]>
---
fs/nfs/dir.c | 2 ++
fs/nfs/unlink.c | 7 +++++++
include/linux/nfs_xdr.h | 1 +
3 files changed, 10 insertions(+)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 1faf337b316f..bb697e5c3f6c 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
error = rpc_wait_for_completion_task(task);
if (error == 0)
error = task->tk_status;
+ else
+ ((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
rpc_put_task(task);
nfs_mark_for_revalidate(old_inode);
out:
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index 191aa577dd1f..b47158a34879 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -288,6 +288,13 @@ static void nfs_async_rename_release(void *calldata)
if (d_really_is_positive(data->old_dentry))
nfs_mark_for_revalidate(d_inode(data->old_dentry));

+ /* The result of the rename is unknown. Play it safe by
+ * forcing a new lookup */
+ if (data->cancelled) {
+ nfs_force_lookup_revalidate(data->old_dir);
+ nfs_force_lookup_revalidate(data->new_dir);
+ }
+
dput(data->old_dentry);
dput(data->new_dentry);
iput(data->old_dir);
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index b28c83475ee8..2a8406b4b353 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1533,6 +1533,7 @@ struct nfs_renamedata {
struct nfs_fattr new_fattr;
void (*complete)(struct rpc_task *, struct nfs_renamedata *);
long timeout;
+ int cancelled;
};

struct nfs_access_entry;
--
2.9.3

2017-06-15 17:31:52

by Benjamin Coddington

[permalink] [raw]

Subject: [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"

This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could
call d_move() without holding the directory's i_mutex, and reverts commit
d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after
move", which was a follow-up fix.

Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind")
Cc: [email protected] # v4.10+
Signed-off-by: Benjamin Coddington <[email protected]>
---
fs/nfs/dir.c | 45 ++++++++++++++++++---------------------------
1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 32ccd7754f8a..1faf337b316f 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1946,29 +1946,6 @@ nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
}
EXPORT_SYMBOL_GPL(nfs_link);

-static void
-nfs_complete_rename(struct rpc_task *task, struct nfs_renamedata *data)
-{
- struct dentry *old_dentry = data->old_dentry;
- struct dentry *new_dentry = data->new_dentry;
- struct inode *old_inode = d_inode(old_dentry);
- struct inode *new_inode = d_inode(new_dentry);
-
- nfs_mark_for_revalidate(old_inode);
-
- switch (task->tk_status) {
- case 0:
- if (new_inode != NULL)
- nfs_drop_nlink(new_inode);
- d_move(old_dentry, new_dentry);
- nfs_set_verifier(new_dentry,
- nfs_save_change_attribute(data->new_dir));
- break;
- case -ENOENT:
- nfs_dentry_handle_enoent(old_dentry);
- }
-}
-
/*
* RENAME
* FIXME: Some nfsds, like the Linux user space nfsd, may generate a
@@ -1999,7 +1976,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
{
struct inode *old_inode = d_inode(old_dentry);
struct inode *new_inode = d_inode(new_dentry);
- struct dentry *dentry = NULL;
+ struct dentry *dentry = NULL, *rehash = NULL;
struct rpc_task *task;
int error = -EBUSY;

@@ -2022,8 +1999,10 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
* To prevent any new references to the target during the
* rename, we unhash the dentry in advance.
*/
- if (!d_unhashed(new_dentry))
+ if (!d_unhashed(new_dentry)) {
d_drop(new_dentry);
+ rehash = new_dentry;
+ }

if (d_count(new_dentry) > 2) {
int err;
@@ -2040,6 +2019,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
goto out;

new_dentry = dentry;
+ rehash = NULL;
new_inode = NULL;
}
}
@@ -2048,8 +2028,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (new_inode != NULL)
NFS_PROTO(new_inode)->return_delegation(new_inode);

- task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry,
- nfs_complete_rename);
+ task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, NULL);
if (IS_ERR(task)) {
error = PTR_ERR(task);
goto out;
@@ -2059,9 +2038,21 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (error == 0)
error = task->tk_status;
rpc_put_task(task);
+ nfs_mark_for_revalidate(old_inode);
out:
+ if (rehash)
+ d_rehash(rehash);
trace_nfs_rename_exit(old_dir, old_dentry,
new_dir, new_dentry, error);
+ if (!error) {
+ if (new_inode != NULL)
+ nfs_drop_nlink(new_inode);
+ d_move(old_dentry, new_dentry);
+ nfs_set_verifier(new_dentry,
+ nfs_save_change_attribute(new_dir));
+ } else if (error == -ENOENT)
+ nfs_dentry_handle_enoent(old_dentry);
+
/* new dentry created? */
if (dentry)
dput(dentry);
--
2.9.3

2017-06-15 18:16:24

by Jeff Layton

[permalink] [raw]

Subject: Re: [PATCH 1/2] Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind"

On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> This reverts commit 920b4530fb80430ff30ef83efe21ba1fa5623731 which could
> call d_move() without holding the directory's i_mutex, and reverts commit
> d4ea7e3c5c0e341c15b073016dbf3ab6c65f12f3 "NFS: Fix old dentry rehash after
> move", which was a follow-up fix.
>
> Fixes: 920b4530fb80 ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind")
> Cc: [email protected] # v4.10+
> Signed-off-by: Benjamin Coddington <[email protected]>
> ---
> fs/nfs/dir.c | 45 ++++++++++++++++++---------------------------
> 1 file changed, 18 insertions(+), 27 deletions(-)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 32ccd7754f8a..1faf337b316f 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1946,29 +1946,6 @@ nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
> }
> EXPORT_SYMBOL_GPL(nfs_link);
>
> -static void
> -nfs_complete_rename(struct rpc_task *task, struct nfs_renamedata *data)
> -{
> - struct dentry *old_dentry = data->old_dentry;
> - struct dentry *new_dentry = data->new_dentry;
> - struct inode *old_inode = d_inode(old_dentry);
> - struct inode *new_inode = d_inode(new_dentry);
> -
> - nfs_mark_for_revalidate(old_inode);
> -
> - switch (task->tk_status) {
> - case 0:
> - if (new_inode != NULL)
> - nfs_drop_nlink(new_inode);
> - d_move(old_dentry, new_dentry);
> - nfs_set_verifier(new_dentry,
> - nfs_save_change_attribute(data->new_dir));
> - break;
> - case -ENOENT:
> - nfs_dentry_handle_enoent(old_dentry);
> - }
> -}
> -
> /*
> * RENAME
> * FIXME: Some nfsds, like the Linux user space nfsd, may generate a
> @@ -1999,7 +1976,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> {
> struct inode *old_inode = d_inode(old_dentry);
> struct inode *new_inode = d_inode(new_dentry);
> - struct dentry *dentry = NULL;
> + struct dentry *dentry = NULL, *rehash = NULL;
> struct rpc_task *task;
> int error = -EBUSY;
>
> @@ -2022,8 +1999,10 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> * To prevent any new references to the target during the
> * rename, we unhash the dentry in advance.
> */
> - if (!d_unhashed(new_dentry))
> + if (!d_unhashed(new_dentry)) {
> d_drop(new_dentry);
> + rehash = new_dentry;
> + }
>
> if (d_count(new_dentry) > 2) {
> int err;
> @@ -2040,6 +2019,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> goto out;
>
> new_dentry = dentry;
> + rehash = NULL;
> new_inode = NULL;
> }
> }
> @@ -2048,8 +2028,7 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> if (new_inode != NULL)
> NFS_PROTO(new_inode)->return_delegation(new_inode);
>
> - task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry,
> - nfs_complete_rename);
> + task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry, NULL);
> if (IS_ERR(task)) {
> error = PTR_ERR(task);
> goto out;
> @@ -2059,9 +2038,21 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> if (error == 0)
> error = task->tk_status;
> rpc_put_task(task);
> + nfs_mark_for_revalidate(old_inode);
> out:
> + if (rehash)
> + d_rehash(rehash);
> trace_nfs_rename_exit(old_dir, old_dentry,
> new_dir, new_dentry, error);
> + if (!error) {
> + if (new_inode != NULL)
> + nfs_drop_nlink(new_inode);
> + d_move(old_dentry, new_dentry);
> + nfs_set_verifier(new_dentry,
> + nfs_save_change_attribute(new_dir));
> + } else if (error == -ENOENT)
> + nfs_dentry_handle_enoent(old_dentry);
> +
> /* new dentry created? */
> if (dentry)
> dput(dentry);

Reviewed-by: Jeff Layton <[email protected]>

2017-06-15 18:18:14

by Jeff Layton

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> An interrupted rename will leave the old dentry behind if the rename
> succeeds. Fix this by forcing a lookup the next time through
> ->d_revalidate.
>
> A previous attempt at solving this problem took the approach to complete
> the work of the rename asynchronously, however that approach was wrong
> since it would allow the d_move() to occur after the directory's i_mutex
> had been dropped by the original process.
>
> Signed-off-by: Benjamin Coddington <[email protected]>
> ---
> fs/nfs/dir.c | 2 ++
> fs/nfs/unlink.c | 7 +++++++
> include/linux/nfs_xdr.h | 1 +
> 3 files changed, 10 insertions(+)
>
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 1faf337b316f..bb697e5c3f6c 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> error = rpc_wait_for_completion_task(task);
> if (error == 0)
> error = task->tk_status;
> + else
> + ((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;

This looks a bit racy. You could end up setting cancelled just after the
reply comes in and the completion callback checks it. I think that you
probably either want to do this with an atomic operation or under a lock
of some sort.

You could probably do it with an xchg() operation?

> rpc_put_task(task);
> nfs_mark_for_revalidate(old_inode);
> out:
> diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> index 191aa577dd1f..b47158a34879 100644
> --- a/fs/nfs/unlink.c
> +++ b/fs/nfs/unlink.c
> @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void *calldata)
> if (d_really_is_positive(data->old_dentry))
> nfs_mark_for_revalidate(d_inode(data->old_dentry));
>
> + /* The result of the rename is unknown. Play it safe by
> + * forcing a new lookup */
> + if (data->cancelled) {
> + nfs_force_lookup_revalidate(data->old_dir);
> + nfs_force_lookup_revalidate(data->new_dir);
> + }
> +
> dput(data->old_dentry);
> dput(data->new_dentry);
> iput(data->old_dir);
> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> index b28c83475ee8..2a8406b4b353 100644
> --- a/include/linux/nfs_xdr.h
> +++ b/include/linux/nfs_xdr.h
> @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
> struct nfs_fattr new_fattr;
> void (*complete)(struct rpc_task *, struct nfs_renamedata *);
> long timeout;
> + int cancelled;

No need for a whole int for a flag and these do get allocated. Make it a
bool?

> };
>
> struct nfs_access_entry;

--
Jeff Layton <[email protected]>

2017-06-15 19:06:47

by Jeff Layton

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

On Thu, 2017-06-15 at 14:18 -0400, Jeff Layton wrote:
> On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> > An interrupted rename will leave the old dentry behind if the rename
> > succeeds. Fix this by forcing a lookup the next time through
> > ->d_revalidate.
> >
> > A previous attempt at solving this problem took the approach to complete
> > the work of the rename asynchronously, however that approach was wrong
> > since it would allow the d_move() to occur after the directory's i_mutex
> > had been dropped by the original process.
> >
> > Signed-off-by: Benjamin Coddington <[email protected]>
> > ---
> > fs/nfs/dir.c | 2 ++
> > fs/nfs/unlink.c | 7 +++++++
> > include/linux/nfs_xdr.h | 1 +
> > 3 files changed, 10 insertions(+)
> >
> > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > index 1faf337b316f..bb697e5c3f6c 100644
> > --- a/fs/nfs/dir.c
> > +++ b/fs/nfs/dir.c
> > @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct dentry *old_dentry,
> > error = rpc_wait_for_completion_task(task);
> > if (error == 0)
> > error = task->tk_status;
> > + else
> > + ((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
>
> This looks a bit racy. You could end up setting cancelled just after the
> reply comes in and the completion callback checks it. I think that you
> probably either want to do this with an atomic operation or under a lock
> of some sort.
>
> You could probably do it with an xchg() operation?
>

As Ben pointed out on IRC, that flag is checked in rpc_release, so as
long as we ensure that it's set while we hold a task reference we should
be fine here without any locking.

That said, do we need a barrier here? We do need to ensure that
cancelled is set before the rpc_put_task occurs.

> > rpc_put_task(task);
> > nfs_mark_for_revalidate(old_inode);
> > out:
> > diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> > index 191aa577dd1f..b47158a34879 100644
> > --- a/fs/nfs/unlink.c
> > +++ b/fs/nfs/unlink.c
> > @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void *calldata)
> > if (d_really_is_positive(data->old_dentry))
> > nfs_mark_for_revalidate(d_inode(data->old_dentry));
> >
> > + /* The result of the rename is unknown. Play it safe by
> > + * forcing a new lookup */
> > + if (data->cancelled) {
> > + nfs_force_lookup_revalidate(data->old_dir);
> > + nfs_force_lookup_revalidate(data->new_dir);
> > + }
> > +
> > dput(data->old_dentry);
> > dput(data->new_dentry);
> > iput(data->old_dir);
> > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > index b28c83475ee8..2a8406b4b353 100644
> > --- a/include/linux/nfs_xdr.h
> > +++ b/include/linux/nfs_xdr.h
> > @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
> > struct nfs_fattr new_fattr;
> > void (*complete)(struct rpc_task *, struct nfs_renamedata *);
> > long timeout;
> > + int cancelled;
>
> No need for a whole int for a flag and these do get allocated. Make it a
> bool?
>
> > };
> >
> > struct nfs_access_entry;
>
>

--
Jeff Layton <[email protected]>

2017-06-15 20:19:39

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

On 15 Jun 2017, at 15:06, Jeff Layton wrote:

> On Thu, 2017-06-15 at 14:18 -0400, Jeff Layton wrote:
>> On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
>>> An interrupted rename will leave the old dentry behind if the rename
>>> succeeds. Fix this by forcing a lookup the next time through
>>> ->d_revalidate.
>>>
>>> A previous attempt at solving this problem took the approach to
>>> complete
>>> the work of the rename asynchronously, however that approach was
>>> wrong
>>> since it would allow the d_move() to occur after the directory's
>>> i_mutex
>>> had been dropped by the original process.
>>>
>>> Signed-off-by: Benjamin Coddington <[email protected]>
>>> ---
>>> fs/nfs/dir.c | 2 ++
>>> fs/nfs/unlink.c | 7 +++++++
>>> include/linux/nfs_xdr.h | 1 +
>>> 3 files changed, 10 insertions(+)
>>>
>>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>>> index 1faf337b316f..bb697e5c3f6c 100644
>>> --- a/fs/nfs/dir.c
>>> +++ b/fs/nfs/dir.c
>>> @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct
>>> dentry *old_dentry,
>>> error = rpc_wait_for_completion_task(task);
>>> if (error == 0)
>>> error = task->tk_status;
>>> + else
>>> + ((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
>>
>> This looks a bit racy. You could end up setting cancelled just after
>> the
>> reply comes in and the completion callback checks it. I think that
>> you
>> probably either want to do this with an atomic operation or under a
>> lock
>> of some sort.
>>
>> You could probably do it with an xchg() operation?
>>
>
> As Ben pointed out on IRC, that flag is checked in rpc_release, so as
> long as we ensure that it's set while we hold a task reference we
> should
> be fine here without any locking.
>
> That said, do we need a barrier here? We do need to ensure that
> cancelled is set before the rpc_put_task occurs.

Yes, I think we probably do.

>>> rpc_put_task(task);
>>> nfs_mark_for_revalidate(old_inode);
>>> out:
>>> diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
>>> index 191aa577dd1f..b47158a34879 100644
>>> --- a/fs/nfs/unlink.c
>>> +++ b/fs/nfs/unlink.c
>>> @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void
>>> *calldata)
>>> if (d_really_is_positive(data->old_dentry))
>>> nfs_mark_for_revalidate(d_inode(data->old_dentry));
>>>
>>> + /* The result of the rename is unknown. Play it safe by
>>> + * forcing a new lookup */
>>> + if (data->cancelled) {
>>> + nfs_force_lookup_revalidate(data->old_dir);
>>> + nfs_force_lookup_revalidate(data->new_dir);

Jeff's pointed out on IRC that we should hold the i_lock to call
nfs_force_lookup_revalidate(), so I'll add that.

>>> + }
>>> +
>>> dput(data->old_dentry);
>>> dput(data->new_dentry);
>>> iput(data->old_dir);
>>> diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
>>> index b28c83475ee8..2a8406b4b353 100644
>>> --- a/include/linux/nfs_xdr.h
>>> +++ b/include/linux/nfs_xdr.h
>>> @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
>>> struct nfs_fattr new_fattr;
>>> void (*complete)(struct rpc_task *, struct nfs_renamedata *);
>>> long timeout;
>>> + int cancelled;
>>
>> No need for a whole int for a flag and these do get allocated. Make
>> it a
>> bool?

or

unsigned int : 1

which seems to be often used -- see nfs4_opendata. The cancelled flag
could
be changed there as well I suppose.

Ben

2017-06-15 20:34:29

by Jeff Layton

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

On Thu, 2017-06-15 at 16:19 -0400, Benjamin Coddington wrote:
> On 15 Jun 2017, at 15:06, Jeff Layton wrote:
>
> > On Thu, 2017-06-15 at 14:18 -0400, Jeff Layton wrote:
> > > On Thu, 2017-06-15 at 12:13 -0400, Benjamin Coddington wrote:
> > > > An interrupted rename will leave the old dentry behind if the rename
> > > > succeeds. Fix this by forcing a lookup the next time through
> > > > ->d_revalidate.
> > > >
> > > > A previous attempt at solving this problem took the approach to
> > > > complete
> > > > the work of the rename asynchronously, however that approach was
> > > > wrong
> > > > since it would allow the d_move() to occur after the directory's
> > > > i_mutex
> > > > had been dropped by the original process.
> > > >
> > > > Signed-off-by: Benjamin Coddington <[email protected]>
> > > > ---
> > > > fs/nfs/dir.c | 2 ++
> > > > fs/nfs/unlink.c | 7 +++++++
> > > > include/linux/nfs_xdr.h | 1 +
> > > > 3 files changed, 10 insertions(+)
> > > >
> > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > index 1faf337b316f..bb697e5c3f6c 100644
> > > > --- a/fs/nfs/dir.c
> > > > +++ b/fs/nfs/dir.c
> > > > @@ -2037,6 +2037,8 @@ int nfs_rename(struct inode *old_dir, struct
> > > > dentry *old_dentry,
> > > > error = rpc_wait_for_completion_task(task);
> > > > if (error == 0)
> > > > error = task->tk_status;
> > > > + else
> > > > + ((struct nfs_renamedata *)task->tk_calldata)->cancelled = 1;
> > >
> > > This looks a bit racy. You could end up setting cancelled just after
> > > the
> > > reply comes in and the completion callback checks it. I think that
> > > you
> > > probably either want to do this with an atomic operation or under a
> > > lock
> > > of some sort.
> > >
> > > You could probably do it with an xchg() operation?
> > >
> >
> > As Ben pointed out on IRC, that flag is checked in rpc_release, so as
> > long as we ensure that it's set while we hold a task reference we
> > should
> > be fine here without any locking.
> >
> > That said, do we need a barrier here? We do need to ensure that
> > cancelled is set before the rpc_put_task occurs.
>
> Yes, I think we probably do.
>

Yeah, I think a smp_wmb() there, paired with the implied barrier in the
atomic_dec_and_test in rpc_put_task?

>
> > > > rpc_put_task(task);
> > > > nfs_mark_for_revalidate(old_inode);
> > > > out:
> > > > diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
> > > > index 191aa577dd1f..b47158a34879 100644
> > > > --- a/fs/nfs/unlink.c
> > > > +++ b/fs/nfs/unlink.c
> > > > @@ -288,6 +288,13 @@ static void nfs_async_rename_release(void
> > > > *calldata)
> > > > if (d_really_is_positive(data->old_dentry))
> > > > nfs_mark_for_revalidate(d_inode(data->old_dentry));
> > > >
> > > > + /* The result of the rename is unknown. Play it safe by
> > > > + * forcing a new lookup */
> > > > + if (data->cancelled) {
> > > > + nfs_force_lookup_revalidate(data->old_dir);
> > > > + nfs_force_lookup_revalidate(data->new_dir);
>
> Jeff's pointed out on IRC that we should hold the i_lock to call
> nfs_force_lookup_revalidate(), so I'll add that.
>
>
> > > > + }
> > > > +
> > > > dput(data->old_dentry);
> > > > dput(data->new_dentry);
> > > > iput(data->old_dir);
> > > > diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
> > > > index b28c83475ee8..2a8406b4b353 100644
> > > > --- a/include/linux/nfs_xdr.h
> > > > +++ b/include/linux/nfs_xdr.h
> > > > @@ -1533,6 +1533,7 @@ struct nfs_renamedata {
> > > > struct nfs_fattr new_fattr;
> > > > void (*complete)(struct rpc_task *, struct nfs_renamedata *);
> > > > long timeout;
> > > > + int cancelled;
> > >
> > > No need for a whole int for a flag and these do get allocated. Make
> > > it a
> > > bool?
>
> or
>
> unsigned int : 1
>
> which seems to be often used -- see nfs4_opendata. The cancelled flag
> could
> be changed there as well I suppose.

I'd prefer a bool, but it's really up to Trond and Anna, I suppose.

--
Jeff Layton <[email protected]>

2017-06-15 20:57:26

by Benjamin Coddington

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

On 15 Jun 2017, at 16:34, Jeff Layton wrote:

> Yeah, I think a smp_wmb() there, paired with the implied barrier in the
> atomic_dec_and_test in rpc_put_task?

Yes, that should do it.

>>>> No need for a whole int for a flag and these do get allocated. Make
>>>> it a
>>>> bool?
>>
>> or
>>
>> unsigned int : 1
>>
>> which seems to be often used -- see nfs4_opendata. The cancelled flag
>> could
>> be changed there as well I suppose.
>
> I'd prefer a bool, but it's really up to Trond and Anna, I suppose.

If Anna or Trond will tell us how they'd like it, I can follow up with a
patch to make them all consistent.

Ben

2017-06-15 21:06:48

by Anna Schumaker

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: nfs_rename() - revalidate directories on -ERESTARTSYS

On 06/15/2017 04:57 PM, Benjamin Coddington wrote:
> On 15 Jun 2017, at 16:34, Jeff Layton wrote:
>
>> Yeah, I think a smp_wmb() there, paired with the implied barrier in the
>> atomic_dec_and_test in rpc_put_task?
>
> Yes, that should do it.
>
>>>>> No need for a whole int for a flag and these do get allocated. Make
>>>>> it a
>>>>> bool?
>>>
>>> or
>>>
>>> unsigned int : 1
>>>
>>> which seems to be often used -- see nfs4_opendata. The cancelled flag
>>> could
>>> be changed there as well I suppose.
>>
>> I'd prefer a bool, but it's really up to Trond and Anna, I suppose.
>
> If Anna or Trond will tell us how they'd like it, I can follow up with a
> patch to make them all consistent.

My preference is for a bool

Anna

>
> Ben
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>