2008-07-03 12:22:23

by Benny Halevy

Subject: writeable file with no mnt_want_write()

Bruce,

I'm seeing this warning on the open_downgrade path
when running the newpynfs tests:

Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write()
Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------
Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e()
Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc
Jul 3 07:32:50 buml kernel: Call Trace:
Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e
Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793
Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea
Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd]
Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e
Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd]
Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd]
Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd]
Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc]
Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100
Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd]
Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a
Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd]
Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a
Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c
Jul 3 07:32:50 buml kernel:
Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]---

I'm not sure what would be the right fix for that...

The following could be unrelated, and maybe I'm just confused
but there seems to be something funky about the way we convert
from bmap to access bits.
(unrelated note: for nfsv4.1 we'll need to mask the share access with
~OPEN4_SHARE_ACCESS_WANT_DELEG_MASK)
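
As a rough illustration of that masking (a user-space sketch, not nfsd code): the constants below are the values NFSv4.1 defines for these names, while the helper `share_access_mode` is a hypothetical name introduced only for this example. The point is that the "want delegation" bits have to be stripped before the value can be used as a bit index into the bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined by NFSv4.1 for the OPEN share_access field. */
#define OPEN4_SHARE_ACCESS_READ             0x00000001
#define OPEN4_SHARE_ACCESS_WRITE            0x00000002
#define OPEN4_SHARE_ACCESS_BOTH             0x00000003
#define OPEN4_SHARE_ACCESS_WANT_DELEG_MASK  0xFF00
#define OPEN4_SHARE_ACCESS_WANT_READ_DELEG  0x0100

/*
 * Hypothetical helper: drop the delegation "want" bits so that only
 * the access mode (1, 2 or 3) is left to index the bitmap with.
 */
static uint32_t share_access_mode(uint32_t share_access)
{
    return share_access & ~(uint32_t)OPEN4_SHARE_ACCESS_WANT_DELEG_MASK;
}

/* Small self-check showing the intended use. */
static void share_access_mode_demo(void)
{
    uint32_t wire = OPEN4_SHARE_ACCESS_BOTH | OPEN4_SHARE_ACCESS_WANT_READ_DELEG;

    assert(share_access_mode(wire) == OPEN4_SHARE_ACCESS_BOTH);
}
```

Without the mask, a value such as 0x0103 would be passed to test_bit()/__set_bit() as a bit number far outside the range the bitmap is meant to cover.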

If I understand this correctly, we set a bit in
st_access_bmap corresponding to the share_access,
so for read-only bit #1 will be set, for write-only bit #2,
and for read-write (only) bits 1-3 should be set.
Otherwise this wouldn't work:
	if (!test_bit(od->od_share_access, &stp->st_access_bmap)) {
		dprintk("NFSD:access not a subset current bitmap: 0x%lx, input access=%08x\n",
			stp->st_access_bmap, od->od_share_access);
		goto out;
	}

but init_stateid sets only one bit:
__set_bit(open->op_share_access, &stp->st_access_bmap);

only if we went through nfs4_upgrade_open another bit may be set.

Benny


2008-07-03 20:36:02

by Bruce Fields

Subject: Re: writeable file with no mnt_want_write()

On Thu, Jul 03, 2008 at 03:21:58PM +0300, Benny Halevy wrote:
> Bruce,
>
> I'm seeing this warning on the open_downgrade path
> when running the newpynfs tests:
>
> Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write()
> Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------
> Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e()
> Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc
> Jul 3 07:32:50 buml kernel: Call Trace:
> Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e
> Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793
> Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea
> Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e
> Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc]
> Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100
> Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a
> Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a
> Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c
> Jul 3 07:32:50 buml kernel:
> Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]---
>
> I'm not sure what would be the right fix for that...

Yes. I'm a bit confused about that. Hm, maybe we need to be doing a
mnt_want_write on open_upgrade and mnt_put_write on downgrade?

> The following could be unrelated, and maybe I'm just confused
> but there seems to be something funky about the way we convert
> from bmap to access bits.
> (unrelated note: for nfsv4.1 we'll need to mask the share access with
> ~OPEN4_SHARE_ACCESS_WANT_DELEG_MASK)
>
> If I understand this correctly, we set a bit in
> st_access_bmap corresponding to the share_access,
> so for read-only bit #1 will be set, for write-only bit #2,
> and for read-write (only) bits 1-3 should be set.

No, in that case only bit 3 should be set.

We're just trying to enforce rfc 3530 14.2.19:

"The share_access and share_deny bits specified must be exactly
equal to the union of the share_access and share_deny bits
specified for some subset of the OPENs in effect for current
openowner on the current file. If that constraint is not
respected, the error NFS4ERR_INVAL should be returned."

Note that this paragraph would allow

OPEN for read
OPEN for write
OPEN_DOWNGRADE to read


but have us return NFS4ERR_INVAL if we got a sequence of opens like:

OPEN for read and write
OPEN_DOWNGRADE to read

because in this case there was only a single open (for read and write),
and no open just for read.
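
The two sequences above can be played through a small user-space model. This is only a sketch of the idea, assuming one bit per share_access value (1 = read, 2 = write, 3 = both); `model_stateid`, `model_open` and `model_downgrade_ok` are hypothetical names, and the real nfsd code does more than this (it also updates the bitmap on a successful downgrade).

```c
#include <assert.h>
#include <stdbool.h>

enum { SHARE_READ = 1, SHARE_WRITE = 2, SHARE_BOTH = 3 };

/* Hypothetical stand-in for the per-stateid record. */
struct model_stateid {
    unsigned long access_bmap;  /* bit N set: some OPEN used share_access == N */
};

/* Record the exact share_access value used by an OPEN. */
static void model_open(struct model_stateid *st, unsigned int access)
{
    st->access_bmap |= 1UL << access;
}

/* A downgrade target is accepted only if an earlier OPEN used that exact value. */
static bool model_downgrade_ok(const struct model_stateid *st, unsigned int access)
{
    return (st->access_bmap & (1UL << access)) != 0;
}

/* Plays both sequences from the text. */
static void model_downgrade_demo(void)
{
    struct model_stateid two_opens = { 0 };
    struct model_stateid one_open = { 0 };

    model_open(&two_opens, SHARE_READ);
    model_open(&two_opens, SHARE_WRITE);
    assert(model_downgrade_ok(&two_opens, SHARE_READ));   /* allowed */

    model_open(&one_open, SHARE_BOTH);
    assert(!model_downgrade_ok(&one_open, SHARE_READ));   /* NFS4ERR_INVAL case */
}
```

Storing one bit per value, rather than OR-ing the access flags together, is what lets the two cases be told apart: in both of them the flags currently in force are read and write.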

--b.

> otherwise this wouldn't work:
> if (!test_bit(od->od_share_access, &stp->st_access_bmap)) {
> dprintk("NFSD:access not a subset current bitmap: 0x%lx, input access=%08x\n",
> stp->st_access_bmap, od->od_share_access);
> goto out;
> }
>
> but init_stateid sets only one bit:
> __set_bit(open->op_share_access, &stp->st_access_bmap);
>
> only if we went through nfs4_upgrade_open another bit may be set.
>
> Benny

2008-07-04 12:34:39

by Benny Halevy

Subject: Re: writeable file with no mnt_want_write()

On Jul. 03, 2008, 23:36 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Thu, Jul 03, 2008 at 03:21:58PM +0300, Benny Halevy wrote:
>> Bruce,
>>
>> I'm seeing this warning on the open_downgrade path
>> when running the newpynfs tests:
>>
>> Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write()
>> Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------
>> Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e()
>> Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc
>> Jul 3 07:32:50 buml kernel: Call Trace:
>> Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e
>> Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793
>> Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea
>> Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e
>> Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc]
>> Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100
>> Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a
>> Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a
>> Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c
>> Jul 3 07:32:50 buml kernel:
>> Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]---
>>
>> I'm not sure what would be the right fix for that...
>
> Yes. I'm a bit confused about that. Hm, maybe we need to be doing a
> mnt_want_write on open_upgrade and mnt_put_write on downgrade?

Right on (at least for the first part of your suggestion :)
See patch in reply to this message that makes this warning go away.

>
>> The following could be unrelated, and maybe I'm just confused
>> but there seems to be something funky about the way we convert
>> from bmap to access bits.
>> (unrelated note: for nfsv4.1 we'll need to mask the share access with
>> ~OPEN4_SHARE_ACCESS_WANT_DELEG_MASK)
>>
>> If I understand this correctly, we set a bit in
>> st_access_bmap corresponding to the share_access,
>> so for read-only bit #1 will be set, for write-only bit #2,
>> and for read-write (only) bits 1-3 should be set.
>
> No, in that case only bit 3 should be set.
>
> We're just trying to enforce rfc 3530 14.2.19:

OK. I see.
Thanks for explaining!
It is a bit mind boggling, maybe adding some comments explaining
why the bitmap is needed would help...

Benny

>
> "The share_access and share_deny bits specified must be exactly
> equal to the union of the share_access and share_deny bits
> specified for some subset of the OPENs in effect for current
> openowner on the current file. If that constraint is not
> respected, the error NFS4ERR_INVAL should be returned."
>
> Note that this paragraph would allow
>
> OPEN for read
> OPEN for write
> OPEN_DOWNGRADE to read
>
>
> but have us return NFS4ERR_INVAL if we got a sequence of opens like:
>
> OPEN for read and write
> OPEN_DOWNGRADE to read
>
> because in this case there was only a single open (for read and write),
> and no open just for read.
>
> --b.
>
>> otherwise this wouldn't work:
>> if (!test_bit(od->od_share_access, &stp->st_access_bmap)) {
>> dprintk("NFSD:access not a subset current bitmap: 0x%lx, input access=%08x\n",
>> stp->st_access_bmap, od->od_share_access);
>> goto out;
>> }
>>
>> but init_stateid sets only one bit:
>> __set_bit(open->op_share_access, &stp->st_access_bmap);
>>
>> only if we went through nfs4_upgrade_open another bit may be set.
>>
>> Benny


2008-07-04 12:38:54

by Benny Halevy

Subject: [PATCH] nfsd: take file and mnt write in nfs4_upgrade_open

testing with newpynfs revealed this warning:
Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write()
Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------
Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e()
Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc
Jul 3 07:32:50 buml kernel: Call Trace:
Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e
Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793
Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea
Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd]
Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e
Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd]
Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd]
Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd]
Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc]
Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100
Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd]
Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a
Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd]
Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a
Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c
Jul 3 07:32:50 buml kernel:
Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]---

Bruce Fields suggested this (Thanks!):
maybe we need to be doing a mnt_want_write on open_upgrade and mnt_put_write on downgrade?

This patch adds a call to mnt_want_write and file_take_write (which is
doing the actual work).

The counter-calls mnt_drop_write and file_release_write are now being properly
called by drop_file_write_access in the exact path printed by the warning
above.

Signed-off-by: Benny Halevy <[email protected]>
---
fs/nfsd/nfs4state.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 0d1760f..4263445 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1570,6 +1570,10 @@ nfs4_upgrade_open(struct svc_rqst *rqstp, struct svc_fh *cur_fh, struct nfs4_sta
 		int err = get_write_access(inode);
 		if (err)
 			return nfserrno(err);
+		err = mnt_want_write(cur_fh->fh_export->ex_path.mnt);
+		if (err)
+			return nfserrno(err);
+		file_take_write(filp);
 	}
 	status = nfsd4_truncate(rqstp, cur_fh, open);
 	if (status) {
--
1.5.6.GIT


2008-07-07 19:05:19

by Bruce Fields

Subject: Re: writeable file with no mnt_want_write()

On Fri, Jul 04, 2008 at 03:34:11PM +0300, Benny Halevy wrote:
> On Jul. 03, 2008, 23:36 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > We're just trying to enforce rfc 3530 14.2.19:
>
> OK. I see.
> Thanks for explaining!
> It is a bit mind boggling, maybe adding some comments explaining
> why the bitmap is needed would help...

Where do you think you would have looked for a comment? I figured just
before the helper functions here was one obvious place.

--b.

commit 4f83aa302f8f8b42397c6d3703d670f0588c03ec
Author: J. Bruce Fields <[email protected]>
Date: Mon Jul 7 15:02:02 2008 -0400

nfsd: document open share bit tracking

It's not immediately obvious from the code why we're doing this.

Signed-off-by: J. Bruce Fields <[email protected]>
Cc: Benny Halevy <[email protected]>

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index eca8aaa..c29b6ed 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1173,6 +1173,24 @@ static inline int deny_valid(u32 x)
 	return x <= NFS4_SHARE_DENY_BOTH;
 }

+/*
+ * We store the NONE, READ, WRITE, and BOTH bits separately in the
+ * st_{access,deny}_bmap field of the stateid, in order to track not
+ * only what share bits are currently in force, but also what
+ * combinations of share bits previous opens have used. This allows us
+ * to enforce the recommendation of rfc 3530 14.2.19 that the server
+ * return an error if the client attempt to downgrade to a combination
+ * of share bits not explicable by closing some of its previous opens.
+ *
+ * XXX: This enforcement is actually incomplete, since we don't keep
+ * track of access/deny bit combinations; so, e.g., we allow:
+ *
+ *	OPEN allow read, deny write
+ *	OPEN allow both, deny none
+ *	DOWNGRADE allow read, deny none
+ *
+ * which we should reject.
+ */
 static void
 set_access(unsigned int *access, unsigned long bmap) {
 	int i;
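
The gap described in the XXX note can be reproduced outside the kernel. The sketch below is a user-space model under stated assumptions: access values are 1 to 3 and deny values 0 to 3, and all names (`share`, `loose_state`, `loose_downgrade_ok`, `strict_downgrade_ok`) are hypothetical. The "strict" check is a brute-force reading of the RFC wording (some subset of the opens whose union matches exactly), shown only for comparison, not as a proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct share {
    unsigned int access;  /* 1 = read, 2 = write, 3 = both */
    unsigned int deny;    /* 0 = none, 1 = read, 2 = write, 3 = both */
};

/* Loose model: access and deny values are remembered independently. */
struct loose_state {
    unsigned long access_bmap;
    unsigned long deny_bmap;
};

static void loose_open(struct loose_state *st, struct share s)
{
    st->access_bmap |= 1UL << s.access;
    st->deny_bmap |= 1UL << s.deny;
}

static bool loose_downgrade_ok(const struct loose_state *st, struct share want)
{
    return (st->access_bmap & (1UL << want.access)) != 0 &&
           (st->deny_bmap & (1UL << want.deny)) != 0;
}

/*
 * Strict model: accept only if some non-empty subset of the recorded
 * opens has exactly the requested access and deny union.
 * Brute force over subsets, so n must stay small (well below the bit
 * width of unsigned long).
 */
static bool strict_downgrade_ok(const struct share *opens, size_t n, struct share want)
{
    for (unsigned long mask = 1; mask < (1UL << n); mask++) {
        unsigned int access = 0, deny = 0;

        for (size_t i = 0; i < n; i++) {
            if (mask & (1UL << i)) {
                access |= opens[i].access;
                deny |= opens[i].deny;
            }
        }
        if (access == want.access && deny == want.deny)
            return true;
    }
    return false;
}

/* The sequence from the comment: accepted by the loose check only. */
static void share_combo_demo(void)
{
    struct share opens[2] = { { 1, 2 }, { 3, 0 } };
    struct share want = { 1, 0 };
    struct loose_state st = { 0, 0 };

    loose_open(&st, opens[0]);
    loose_open(&st, opens[1]);
    assert(loose_downgrade_ok(&st, want));
    assert(!strict_downgrade_ok(opens, 2, want));
}
```

The loose check passes because "read" was seen as an access value and "none" was seen as a deny value, even though no single open, and no combination of opens, ever had that pair.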

2008-07-07 19:25:45

by Bruce Fields

Subject: Re: [PATCH] nfsd: take file and mnt write in nfs4_upgrade_open

Yes, this looks correct, at least for the immediate fix; thanks!

Eventually I think we should move to opening a new file descriptor when
upgrading, and keeping two file descriptors with the stateid; I think
this business of trying to use a file descriptor for write when we
opened it for read is probably wrong.
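
For readers following along, the imbalance behind the warning can be pictured with a toy model. This is a user-space sketch only: `model_mount`, `model_file`, `model_upgrade` and `model_drop_write` are hypothetical names, and the real bookkeeping lives in the VFS and is not reproduced here. The comments mark which kernel call each step stands in for, as described in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a mount and an open file; not kernel types. */
struct model_mount {
    int writers;        /* how many writers the mount knows about */
};

struct model_file {
    struct model_mount *mnt;
    bool write_taken;   /* set once the file has registered as a writer */
};

/*
 * Models the upgrade-to-write path. With take_mnt_write == false this
 * is the pre-patch behaviour: the file becomes writable without the
 * mount being told.
 */
static void model_upgrade(struct model_file *f, bool take_mnt_write)
{
    if (take_mnt_write) {
        f->mnt->writers++;      /* stands in for mnt_want_write() */
        f->write_taken = true;  /* stands in for file_take_write() */
    }
}

/* Models the downgrade path. Returns false where the real code warns. */
static bool model_drop_write(struct model_file *f)
{
    if (!f->write_taken)
        return false;           /* "writeable file with no mnt_want_write()" */
    f->mnt->writers--;          /* stands in for mnt_drop_write() */
    f->write_taken = false;     /* stands in for file_release_write() */
    return true;
}

/* Pre-patch sequence warns; patched sequence stays balanced. */
static void write_count_demo(void)
{
    struct model_mount m = { 0 };
    struct model_file f = { &m, false };

    model_upgrade(&f, false);
    assert(!model_drop_write(&f));

    model_upgrade(&f, true);
    assert(m.writers == 1);
    assert(model_drop_write(&f));
    assert(m.writers == 0);
}
```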

--b.

On Fri, Jul 04, 2008 at 03:38:41PM +0300, Benny Halevy wrote:
> testing with newpynfs revealed this warning:
> Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write()
> Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------
> Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e()
> Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc
> Jul 3 07:32:50 buml kernel: Call Trace:
> Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e
> Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793
> Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea
> Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e
> Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc]
> Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100
> Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a
> Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd]
> Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a
> Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c
> Jul 3 07:32:50 buml kernel:
> Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]---
>
> Bruce Fields suggested this (Thanks!):
> maybe we need to be doing a mnt_want_write on open_upgrade and mnt_put_write on downgrade?
>
> This patch adds a call to mnt_want_write and file_take_write (which is
> doing the actual work).
>
> The counter-calls mnt_drop_write and file_release_write are now being properly
> called by drop_file_write_access in the exact path printed by the warning
> above.
>
> Signed-off-by: Benny Halevy <[email protected]>
> ---
> fs/nfsd/nfs4state.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 0d1760f..4263445 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1570,6 +1570,10 @@ nfs4_upgrade_open(struct svc_rqst *rqstp, struct svc_fh *cur_fh, struct nfs4_sta
> int err = get_write_access(inode);
> if (err)
> return nfserrno(err);
> + err = mnt_want_write(cur_fh->fh_export->ex_path.mnt);
> + if (err)
> + return nfserrno(err);
> + file_take_write(filp);
> }
> status = nfsd4_truncate(rqstp, cur_fh, open);
> if (status) {
> --
> 1.5.6.GIT
>

2008-07-08 08:17:16

by Benny Halevy

Subject: Re: [PATCH] nfsd: take file and mnt write in nfs4_upgrade_open

On Jul. 07, 2008, 22:25 +0300, "J. Bruce Fields" <[email protected]> wrote:
> Yes, this looks correct, at least for the immediate fix; thanks!
>
> Eventually I think we should move to opening a new file descriptor when
> upgrading, and keeping two file descriptors with the stateid; I think
> this business of trying to use a file descriptor for write when we
> opened it for read is probably wrong.

That could work, though they need to be associated in some way
for open_downgrade/close processing.

Benny

>
> --b.
>
> On Fri, Jul 04, 2008 at 03:38:41PM +0300, Benny Halevy wrote:
>> testing with newpynfs revealed this warning:
>> Jul 3 07:32:50 buml kernel: writeable file with no mnt_want_write()
>> Jul 3 07:32:50 buml kernel: ------------[ cut here ]------------
>> Jul 3 07:32:50 buml kernel: WARNING: at /usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/include/linux/fs.h:855 drop_file_write_access+0x6b/0x7e()
>> Jul 3 07:32:50 buml kernel: Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc
>> Jul 3 07:32:50 buml kernel: Call Trace:
>> Jul 3 07:32:50 buml kernel: 6eaadc88: [<6002f471>] warn_on_slowpath+0x54/0x8e
>> Jul 3 07:32:50 buml kernel: 6eaadcc8: [<601b790d>] printk+0xa0/0x793
>> Jul 3 07:32:50 buml kernel: 6eaadd38: [<601b6205>] __mutex_lock_slowpath+0x1db/0x1ea
>> Jul 3 07:32:50 buml kernel: 6eaadd68: [<7107d4d5>] nfs4_preprocess_seqid_op+0x2a6/0x31c [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaadda8: [<60078dc9>] drop_file_write_access+0x6b/0x7e
>> Jul 3 07:32:50 buml kernel: 6eaaddc8: [<710804e4>] nfsd4_open_downgrade+0x114/0x1de [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaade08: [<71076215>] nfsd4_proc_compound+0x1ba/0x2dc [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaade48: [<71068221>] nfsd_dispatch+0xe5/0x1c2 [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaade88: [<71312f81>] svc_process+0x3fd/0x714 [sunrpc]
>> Jul 3 07:32:50 buml kernel: 6eaadea8: [<60039a81>] kernel_sigprocmask+0xf3/0x100
>> Jul 3 07:32:50 buml kernel: 6eaadee8: [<7106874b>] nfsd+0x182/0x29b [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaadf48: [<60021cc9>] run_kernel_thread+0x41/0x4a
>> Jul 3 07:32:50 buml kernel: 6eaadf58: [<710685c9>] nfsd+0x0/0x29b [nfsd]
>> Jul 3 07:32:50 buml kernel: 6eaadf98: [<60021cb0>] run_kernel_thread+0x28/0x4a
>> Jul 3 07:32:50 buml kernel: 6eaadfc8: [<60013829>] new_thread_handler+0x72/0x9c
>> Jul 3 07:32:50 buml kernel:
>> Jul 3 07:32:50 buml kernel: ---[ end trace 2426dd7cb2fba3bf ]---
>>
>> Bruce Fields suggested this (Thanks!):
>> maybe we need to be doing a mnt_want_write on open_upgrade and mnt_put_write on downgrade?
>>
>> This patch adds a call to mnt_want_write and file_take_write (which is
>> doing the actual work).
>>
>> The counter-calls mnt_drop_write and file_release_write are now being properly
>> called by drop_file_write_access in the exact path printed by the warning
>> above.
>>
>> Signed-off-by: Benny Halevy <[email protected]>
>> ---
>> fs/nfsd/nfs4state.c | 4 ++++
>> 1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 0d1760f..4263445 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -1570,6 +1570,10 @@ nfs4_upgrade_open(struct svc_rqst *rqstp, struct svc_fh *cur_fh, struct nfs4_sta
>> int err = get_write_access(inode);
>> if (err)
>> return nfserrno(err);
>> + err = mnt_want_write(cur_fh->fh_export->ex_path.mnt);
>> + if (err)
>> + return nfserrno(err);
>> + file_take_write(filp);
>> }
>> status = nfsd4_truncate(rqstp, cur_fh, open);
>> if (status) {
>> --
>> 1.5.6.GIT
>>


2008-07-08 08:41:16

by Benny Halevy

Subject: Re: writeable file with no mnt_want_write()

On Jul. 07, 2008, 22:05 +0300, "J. Bruce Fields" <[email protected]> wrote:
> On Fri, Jul 04, 2008 at 03:34:11PM +0300, Benny Halevy wrote:
>> On Jul. 03, 2008, 23:36 +0300, "J. Bruce Fields" <[email protected]> wrote:
>>> We're just trying to enforce rfc 3530 14.2.19:
>> OK. I see.
>> Thanks for explaining!
>> It is a bit mind boggling, maybe adding some comments explaining
>> why the bitmap is needed would help...
>
> Where do you think you would have looked for a comment? I figured just
> before the helper functions here was one obvious place.

Yup. Either here, or closer to (nfsd's) struct nfs4_stateid's definition
where st_{access,deny}_bmap are defined.

Benny

>
> --b.
>
> commit 4f83aa302f8f8b42397c6d3703d670f0588c03ec
> Author: J. Bruce Fields <[email protected]>
> Date: Mon Jul 7 15:02:02 2008 -0400
>
> nfsd: document open share bit tracking
>
> It's not immediately obvious from the code why we're doing this.
>
> Signed-off-by: J. Bruce Fields <[email protected]>
> Cc: Benny Halevy <[email protected]>
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index eca8aaa..c29b6ed 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1173,6 +1173,24 @@ static inline int deny_valid(u32 x)
> return x <= NFS4_SHARE_DENY_BOTH;
> }
>
> +/*
> + * We store the NONE, READ, WRITE, and BOTH bits separately in the
> + * st_{access,deny}_bmap field of the stateid, in order to track not
> + * only what share bits are currently in force, but also what
> + * combinations of share bits previous opens have used. This allows us
> + * to enforce the recommendation of rfc 3530 14.2.19 that the server
> + * return an error if the client attempt to downgrade to a combination
> + * of share bits not explicable by closing some of its previous opens.
> + *
> + * XXX: This enforcement is actually incomplete, since we don't keep
> + * track of access/deny bit combinations; so, e.g., we allow:
> + *
> + * OPEN allow read, deny write
> + * OPEN allow both, deny none
> + * DOWNGRADE allow read, deny none
> + *
> + * which we should reject.
> + */
> static void
> set_access(unsigned int *access, unsigned long bmap) {
> int i;


2008-07-08 14:35:38

by Bruce Fields

Subject: Re: [PATCH] nfsd: take file and mnt write in nfs4_upgrade_open

On Tue, Jul 08, 2008 at 11:16:49AM +0300, Benny Halevy wrote:
> On Jul. 07, 2008, 22:25 +0300, "J. Bruce Fields" <[email protected]> wrote:
> > Yes, this looks correct, at least for the immediate fix; thanks!
> >
> > Eventually I think we should move to opening a new file descriptor when
> > upgrading, and keeping two file descriptors with the stateid; I think
> > this business of trying to use a file descriptor for write when we
> > opened it for read is probably wrong.
>
> That could work, though they need to be associated in some way
> for open_downgrade/close processing.

Yes. I think it would work just to keep up to one read-only open and
one write-only open and then close one or the other as necessary on the
downgrade.
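
A minimal sketch of that bookkeeping, as a user-space model rather than nfsd code: the stateid keeps at most one read-only and one write-only open, an upgrade opens whichever is missing, and a downgrade closes whichever is no longer covered. The booleans stand in for real file handles, and `two_open_state`, `model_upgrade` and `model_downgrade` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum { ACCESS_READ = 1, ACCESS_WRITE = 2, ACCESS_BOTH = 3 };

/* Hypothetical per-stateid record: at most one open of each kind. */
struct two_open_state {
    bool rd_open;   /* a read-only open is held */
    bool wr_open;   /* a write-only open is held */
};

/* Open whichever handle the new access needs and lacks; returns opens done. */
static int model_upgrade(struct two_open_state *st, unsigned int access)
{
    int opened = 0;

    if ((access & ACCESS_READ) && !st->rd_open) {
        st->rd_open = true;
        opened++;
    }
    if ((access & ACCESS_WRITE) && !st->wr_open) {
        st->wr_open = true;
        opened++;
    }
    return opened;
}

/* Close whichever handle the new access no longer covers; returns closes done. */
static int model_downgrade(struct two_open_state *st, unsigned int access)
{
    int closed = 0;

    if (!(access & ACCESS_READ) && st->rd_open) {
        st->rd_open = false;
        closed++;
    }
    if (!(access & ACCESS_WRITE) && st->wr_open) {
        st->wr_open = false;
        closed++;
    }
    return closed;
}

/* read -> both -> read: one extra open on upgrade, one close on downgrade. */
static void two_open_demo(void)
{
    struct two_open_state st = { false, false };

    assert(model_upgrade(&st, ACCESS_READ) == 1);
    assert(model_upgrade(&st, ACCESS_BOTH) == 1);
    assert(model_downgrade(&st, ACCESS_READ) == 1);
    assert(st.rd_open && !st.wr_open);
}
```

In this model the read handle is never reused for writing, which is the property the proposal is after; how the two handles are tied to one stateid for close processing is left open here, as it is in the discussion.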

--b.