After writing data at NFS client, file's access mode is inconsistent
with server.
Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
but client don't get it.
#touch hello; chmod 06777 hello; stat hello;
File: ??hello??
Size: 0 Blocks: 0 IO Block: 262144 regular
empty file
Device: 24h/36d Inode: 786434 Links: 1
Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:nfs_t:s0
Access: 2014-04-13 21:00:44.996908708 +0800
Modify: 2014-04-13 21:00:44.996908708 +0800
Change: 2014-04-13 21:00:45.033908705 +0800
Birth: -
#echo 12324 > hello; stat hello; stat /nfstest/hello
File: ??hello??
Size: 6 Blocks: 0 IO Block: 262144 regular file
Device: 24h/36d Inode: 786434 Links: 1
Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
^^^^^ it should be 0777
Context: system_u:object_r:nfs_t:s0
Access: 2014-04-13 21:00:44.996908708 +0800
Modify: 2014-04-13 21:00:45.061908703 +0800
Change: 2014-04-13 21:00:45.061908703 +0800
Birth: -
File: ??/nfstest/hello??
Size: 6 Blocks: 8 IO Block: 4096 regular file
Device: 803h/2051d Inode: 786434 Links: 1
Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
^^^^^ bits on the server
Context: system_u:object_r:default_t:s0
Access: 2014-04-13 21:00:44.996908708 +0800
Modify: 2014-04-13 21:00:45.061908703 +0800
Change: 2014-04-13 21:00:45.061908703 +0800
Birth: -
Signed-off-by: Kinglong Mee <[email protected]>
---
fs/nfs/nfs4proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 397be39..f234af7 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2819,7 +2819,7 @@ static int _nfs4_server_capabilities(struct
nfs_server *server, struct nfs_fh *f
memcpy(server->cache_consistency_bitmask, res.attr_bitmask,
sizeof(server->cache_consistency_bitmask));
server->cache_consistency_bitmask[0] &=
FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
- server->cache_consistency_bitmask[1] &=
FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
+ server->cache_consistency_bitmask[1] &=
FATTR4_WORD1_MODE|FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
server->cache_consistency_bitmask[2] = 0;
server->acl_bitmask = res.acl_bitmask;
server->fh_expire_type = res.fh_expire_type;
--
1.9.0
On Mon, 2014-04-14 at 21:31 +0800, Kinglong Mee wrote:
>
> 于 2014/4/14 21:12, Trond Myklebust 写道:
> >
> > On Apr 14, 2014, at 8:59, Kinglong Mee <[email protected]> wrote:
> >
> >>
> >>
> >> 于 2014/4/13 23:24, Trond Myklebust 写道:
> >>> On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
> >>>>
> >>>> 于 2014/4/13 22:28, Trond Myklebust 写道:
> >>>>>
> >>>>> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
> >>>>>
> >>>>>> After writing data at NFS client, file's access mode is inconsistent
> >>>>>> with server.
> >>>>>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
> >>>>>> but client don't get it.
> >>>>>>
> >>>>>> #touch hello; chmod 06777 hello; stat hello;
> >>>>>> File: ‘hello’
> >>>>>> Size: 0 Blocks: 0 IO Block: 262144 regular
> >>>>>> empty file
> >>>>>> Device: 24h/36d Inode: 786434 Links: 1
> >>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> >>>>>> Context: system_u:object_r:nfs_t:s0
> >>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
> >>>>>> Modify: 2014-04-13 21:00:44.996908708 +0800
> >>>>>> Change: 2014-04-13 21:00:45.033908705 +0800
> >>>>>> Birth: -
> >>>>>>
> >>>>>> #echo 12324 > hello; stat hello; stat /nfstest/hello
> >>>>>> File: ‘hello’
> >>>>>> Size: 6 Blocks: 0 IO Block: 262144 regular file
> >>>>>> Device: 24h/36d Inode: 786434 Links: 1
> >>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> >>>>>> ^^^^^ it should be 0777
> >>>>>> Context: system_u:object_r:nfs_t:s0
> >>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
> >>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
> >>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
> >>>>>> Birth: -
> >>>>>> File: ‘/nfstest/hello’
> >>>>>> Size: 6 Blocks: 8 IO Block: 4096 regular file
> >>>>>> Device: 803h/2051d Inode: 786434 Links: 1
> >>>>>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> >>>>>> ^^^^^ bits on the server
> >>>>>> Context: system_u:object_r:default_t:s0
> >>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
> >>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
> >>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
> >>>>>> Birth: -
> >>>>>>
> >>> <snip>
> >>>>>
> >>>>> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
> >>>>
> >>> <snip>
> >>>> IMO, client shoulds get all metadatas from server, so, adds the flag.
> >>>> I think should_remove_suid() should be called by nfsd, not NFS client
> >>>
> >>> I agree with 50% of that statement. Please see below.
> >>>
> >>> 8<---------------------------------------------------------------------
> >>>> From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
> >>> From: Trond Myklebust <[email protected]>
> >>> Date: Sun, 13 Apr 2014 11:11:31 -0400
> >>> Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
> >>> write
> >>>
> >>> If we suspect that the server may have cleared the suid/sgid bit,
> >>> then mark the inode for revalidation.
> >>
> >> When testing with this patch, should_remove_suid() always return false
> >> at client, but return true at NFS server.
> >>
> >> So that, NFS server clears the suid/sgid bit, but client also remains.
> >
> > Are you running the test as root? The only explanation I can see for should_remove_suid() failing is if the ‘CAP_FSETID’ capability is set.
>
> I test it with non-root user, should_remove_suid() also return 0.
OK. Let's make a version that ignores the capabilities, and just tests
the SUID/SGID bits.
8<--------------------------------------------------------------------
>From 2e068b62316b2fa5738a8b730bcb5f2f8e7cbdb1 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Sun, 13 Apr 2014 11:11:31 -0400
Subject: [PATCH v2] NFS: Don't ignore suid/sgid bit changes after a successful
write
If we suspect that the server may have cleared the suid/sgid bit,
then mark the inode for revalidation.
Reported-by: Kinglong Mee <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/write.c | 35 +++++++++++++++++++++++++++++++++--
1 file changed, 33 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 9a3b6a4cd6b9..cd7c651f9b84 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1353,6 +1353,30 @@ static const struct rpc_call_ops nfs_write_common_ops = {
.rpc_release = nfs_writeback_release_common,
};
+/*
+ * Special version of should_remove_suid() that ignores capabilities.
+ */
+static int nfs_should_remove_suid(const struct inode *inode)
+{
+ umode_t mode = inode->i_mode;
+ int kill = 0;
+
+ /* suid always must be killed */
+ if (unlikely(mode & S_ISUID))
+ kill = ATTR_KILL_SUID;
+
+ /*
+ * sgid without any exec bits is just a mandatory locking mark; leave
+ * it alone. If some exec bits are set, it's a real sgid; kill it.
+ */
+ if (unlikely((mode & S_ISGID) && (mode & S_IXGRP)))
+ kill |= ATTR_KILL_SGID;
+
+ if (unlikely(kill && S_ISREG(mode)))
+ return kill;
+
+ return 0;
+}
/*
* This function is called when the WRITE call is complete.
@@ -1401,9 +1425,16 @@ void nfs_writeback_done(struct rpc_task *task, struct nfs_write_data *data)
}
}
#endif
- if (task->tk_status < 0)
+ if (task->tk_status < 0) {
nfs_set_pgio_error(data->header, task->tk_status, argp->offset);
- else if (resp->count < argp->count) {
+ return;
+ }
+
+ /* Deal with the suid/sgid bit corner case */
+ if (nfs_should_remove_suid(inode))
+ nfs_mark_for_revalidate(inode);
+
+ if (resp->count < argp->count) {
static unsigned long complain;
/* This a short write! */
--
1.9.0
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
于 2014/4/13 23:24, Trond Myklebust 写道:
> On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
>>
>> 于 2014/4/13 22:28, Trond Myklebust 写道:
>>>
>>> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
>>>
>>>> After writing data at NFS client, file's access mode is inconsistent
>>>> with server.
>>>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
>>>> but client don't get it.
>>>>
>>>> #touch hello; chmod 06777 hello; stat hello;
>>>> File: ‘hello’
>>>> Size: 0 Blocks: 0 IO Block: 262144 regular
>>>> empty file
>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>> Context: system_u:object_r:nfs_t:s0
>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>> Modify: 2014-04-13 21:00:44.996908708 +0800
>>>> Change: 2014-04-13 21:00:45.033908705 +0800
>>>> Birth: -
>>>>
>>>> #echo 12324 > hello; stat hello; stat /nfstest/hello
>>>> File: ‘hello’
>>>> Size: 6 Blocks: 0 IO Block: 262144 regular file
>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>> ^^^^^ it should be 0777
>>>> Context: system_u:object_r:nfs_t:s0
>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>> Birth: -
>>>> File: ‘/nfstest/hello’
>>>> Size: 6 Blocks: 8 IO Block: 4096 regular file
>>>> Device: 803h/2051d Inode: 786434 Links: 1
>>>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>> ^^^^^ bits on the server
>>>> Context: system_u:object_r:default_t:s0
>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>> Birth: -
>>>>
> <snip>
>>>
>>> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
>>
> <snip>
>> IMO, client shoulds get all metadatas from server, so, adds the flag.
>> I think should_remove_suid() should be called by nfsd, not NFS client
>
> I agree with 50% of that statement. Please see below.
>
> 8<---------------------------------------------------------------------
>>From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Sun, 13 Apr 2014 11:11:31 -0400
> Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
> write
>
> If we suspect that the server may have cleared the suid/sgid bit,
> then mark the inode for revalidation.
When testing with this patch, should_remove_suid() always return false
at client, but return true at NFS server.
So that, NFS server clears the suid/sgid bit, but client also remains.
thanks,
Kinglong Mee
>
> Reported-by: Kinglong Mee <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/write.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 9a3b6a4cd6b9..80b03e064a09 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1401,9 +1401,16 @@ void nfs_writeback_done(struct rpc_task *task, struct nfs_write_data *data)
> }
> }
> #endif
> - if (task->tk_status < 0)
> + if (task->tk_status < 0) {
> nfs_set_pgio_error(data->header, task->tk_status, argp->offset);
> - else if (resp->count < argp->count) {
> + return;
> + }
> +
> + /* Deal with the suid/sgid bit corner case */
> + if (should_remove_suid(argp->context->dentry))
> + nfs_mark_for_revalidate(inode);
> +
> + if (resp->count < argp->count) {
> static unsigned long complain;
>
> /* This a short write! */
>
于 2014/4/14 21:12, Trond Myklebust 写道:
>
> On Apr 14, 2014, at 8:59, Kinglong Mee <[email protected]> wrote:
>
>>
>>
>> 于 2014/4/13 23:24, Trond Myklebust 写道:
>>> On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
>>>>
>>>> 于 2014/4/13 22:28, Trond Myklebust 写道:
>>>>>
>>>>> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
>>>>>
>>>>>> After writing data at NFS client, file's access mode is inconsistent
>>>>>> with server.
>>>>>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
>>>>>> but client don't get it.
>>>>>>
>>>>>> #touch hello; chmod 06777 hello; stat hello;
>>>>>> File: ‘hello’
>>>>>> Size: 0 Blocks: 0 IO Block: 262144 regular
>>>>>> empty file
>>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>> Context: system_u:object_r:nfs_t:s0
>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>> Modify: 2014-04-13 21:00:44.996908708 +0800
>>>>>> Change: 2014-04-13 21:00:45.033908705 +0800
>>>>>> Birth: -
>>>>>>
>>>>>> #echo 12324 > hello; stat hello; stat /nfstest/hello
>>>>>> File: ‘hello’
>>>>>> Size: 6 Blocks: 0 IO Block: 262144 regular file
>>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>> ^^^^^ it should be 0777
>>>>>> Context: system_u:object_r:nfs_t:s0
>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>>> Birth: -
>>>>>> File: ‘/nfstest/hello’
>>>>>> Size: 6 Blocks: 8 IO Block: 4096 regular file
>>>>>> Device: 803h/2051d Inode: 786434 Links: 1
>>>>>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>> ^^^^^ bits on the server
>>>>>> Context: system_u:object_r:default_t:s0
>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>>> Birth: -
>>>>>>
>>> <snip>
>>>>>
>>>>> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
>>>>
>>> <snip>
>>>> IMO, client shoulds get all metadatas from server, so, adds the flag.
>>>> I think should_remove_suid() should be called by nfsd, not NFS client
>>>
>>> I agree with 50% of that statement. Please see below.
>>>
>>> 8<---------------------------------------------------------------------
>>>> From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
>>> From: Trond Myklebust <[email protected]>
>>> Date: Sun, 13 Apr 2014 11:11:31 -0400
>>> Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
>>> write
>>>
>>> If we suspect that the server may have cleared the suid/sgid bit,
>>> then mark the inode for revalidation.
>>
>> When testing with this patch, should_remove_suid() always return false
>> at client, but return true at NFS server.
>>
>> So that, NFS server clears the suid/sgid bit, but client also remains.
>
> Are you running the test as root? The only explanation I can see for should_remove_suid() failing is if the ‘CAP_FSETID’ capability is set.
I test it with non-root user, should_remove_suid() also return 0.
thanks,
Kinglong Mee
On Apr 14, 2014, at 8:59, Kinglong Mee <[email protected]> wrote:
>
>
> 于 2014/4/13 23:24, Trond Myklebust 写道:
>> On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
>>>
>>> 于 2014/4/13 22:28, Trond Myklebust 写道:
>>>>
>>>> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
>>>>
>>>>> After writing data at NFS client, file's access mode is inconsistent
>>>>> with server.
>>>>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
>>>>> but client don't get it.
>>>>>
>>>>> #touch hello; chmod 06777 hello; stat hello;
>>>>> File: ‘hello’
>>>>> Size: 0 Blocks: 0 IO Block: 262144 regular
>>>>> empty file
>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> Context: system_u:object_r:nfs_t:s0
>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>> Modify: 2014-04-13 21:00:44.996908708 +0800
>>>>> Change: 2014-04-13 21:00:45.033908705 +0800
>>>>> Birth: -
>>>>>
>>>>> #echo 12324 > hello; stat hello; stat /nfstest/hello
>>>>> File: ‘hello’
>>>>> Size: 6 Blocks: 0 IO Block: 262144 regular file
>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> ^^^^^ it should be 0777
>>>>> Context: system_u:object_r:nfs_t:s0
>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>> Birth: -
>>>>> File: ‘/nfstest/hello’
>>>>> Size: 6 Blocks: 8 IO Block: 4096 regular file
>>>>> Device: 803h/2051d Inode: 786434 Links: 1
>>>>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>> ^^^^^ bits on the server
>>>>> Context: system_u:object_r:default_t:s0
>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>> Birth: -
>>>>>
>> <snip>
>>>>
>>>> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
>>>
>> <snip>
>>> IMO, client shoulds get all metadatas from server, so, adds the flag.
>>> I think should_remove_suid() should be called by nfsd, not NFS client
>>
>> I agree with 50% of that statement. Please see below.
>>
>> 8<---------------------------------------------------------------------
>>> From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
>> From: Trond Myklebust <[email protected]>
>> Date: Sun, 13 Apr 2014 11:11:31 -0400
>> Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
>> write
>>
>> If we suspect that the server may have cleared the suid/sgid bit,
>> then mark the inode for revalidation.
>
> When testing with this patch, should_remove_suid() always return false
> at client, but return true at NFS server.
>
> So that, NFS server clears the suid/sgid bit, but client also remains.
Are you running the test as root? The only explanation I can see for should_remove_suid() failing is if the ‘CAP_FSETID’ capability is set.
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
>
> 于 2014/4/13 22:28, Trond Myklebust 写道:
> >
> > On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
> >
> >> After writing data at NFS client, file's access mode is inconsistent
> >> with server.
> >> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
> >> but client don't get it.
> >>
> >> #touch hello; chmod 06777 hello; stat hello;
> >> File: ‘hello’
> >> Size: 0 Blocks: 0 IO Block: 262144 regular
> >> empty file
> >> Device: 24h/36d Inode: 786434 Links: 1
> >> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> >> Context: system_u:object_r:nfs_t:s0
> >> Access: 2014-04-13 21:00:44.996908708 +0800
> >> Modify: 2014-04-13 21:00:44.996908708 +0800
> >> Change: 2014-04-13 21:00:45.033908705 +0800
> >> Birth: -
> >>
> >> #echo 12324 > hello; stat hello; stat /nfstest/hello
> >> File: ‘hello’
> >> Size: 6 Blocks: 0 IO Block: 262144 regular file
> >> Device: 24h/36d Inode: 786434 Links: 1
> >> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> >> ^^^^^ it should be 0777
> >> Context: system_u:object_r:nfs_t:s0
> >> Access: 2014-04-13 21:00:44.996908708 +0800
> >> Modify: 2014-04-13 21:00:45.061908703 +0800
> >> Change: 2014-04-13 21:00:45.061908703 +0800
> >> Birth: -
> >> File: ‘/nfstest/hello’
> >> Size: 6 Blocks: 8 IO Block: 4096 regular file
> >> Device: 803h/2051d Inode: 786434 Links: 1
> >> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> >> ^^^^^ bits on the server
> >> Context: system_u:object_r:default_t:s0
> >> Access: 2014-04-13 21:00:44.996908708 +0800
> >> Modify: 2014-04-13 21:00:45.061908703 +0800
> >> Change: 2014-04-13 21:00:45.061908703 +0800
> >> Birth: -
> >>
<snip>
> >
> > Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
>
<snip>
> IMO, client shoulds get all metadatas from server, so, adds the flag.
> I think should_remove_suid() should be called by nfsd, not NFS client
I agree with 50% of that statement. Please see below.
8<---------------------------------------------------------------------
>From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Sun, 13 Apr 2014 11:11:31 -0400
Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
write
If we suspect that the server may have cleared the suid/sgid bit,
then mark the inode for revalidation.
Reported-by: Kinglong Mee <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/write.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 9a3b6a4cd6b9..80b03e064a09 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1401,9 +1401,16 @@ void nfs_writeback_done(struct rpc_task *task, struct nfs_write_data *data)
}
}
#endif
- if (task->tk_status < 0)
+ if (task->tk_status < 0) {
nfs_set_pgio_error(data->header, task->tk_status, argp->offset);
- else if (resp->count < argp->count) {
+ return;
+ }
+
+ /* Deal with the suid/sgid bit corner case */
+ if (should_remove_suid(argp->context->dentry))
+ nfs_mark_for_revalidate(inode);
+
+ if (resp->count < argp->count) {
static unsigned long complain;
/* This a short write! */
--
1.9.0
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On Tue, 2014-04-15 at 09:22 -0400, Trond Myklebust wrote:
> On Apr 15, 2014, at 1:02, Kinglong Mee <[email protected]> wrote:
> > Next, the "stat testfile" gets data from cache,
> > because NFS_INO_INVALID_ATTR flags is cleared below.
> >
> > Calltrace,
> > [ 4883.997254] nfs4_proc_write_setup
> > [ 4884.006885] NFS: 1365 nfs_writeback_done (status 11)
> > [ 4884.008215] nfs4_write_done
> > [ 4884.009273] nfs4_write_done_cb
> > [ 4884.010013] nfs_post_op_update_inode_force_wcc
> > [ 4884.011221] nfs_update_inode
> > [ 4884.012001] nfs_update_inode
> > [ 4884.012952] nfs_writeback_done: before nfs_should_remove_suid
> > [ 4884.014722] nfs_writeback_done: in nfs_should_remove_suid
> > [ 4884.016549] nfs4_close_done
> > [ 4884.017614] nfs_refresh_inode
> > [ 4884.018645] nfs_update_inode
> > [ 4884.019693] nfs_update_inode
> >
> > But, if getting status before close, the mode can be update to latest.
>
> Argh. That is a bug in nfs_update_inode(). It is not supposed to clear NFS_INO_INVALID_ATTR if nfs_fattr does not contain a complete set of attributes.
>
> Thanks for testing, Kinglong. This is extremely helpful...
Can you please see if the following patch fixes the above issue?
8<--------------------------------------------------------------
>From 12c9c63a005889d70fbcbdb746cd7e7d127b0f01 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Tue, 15 Apr 2014 10:07:57 -0400
Subject: [PATCH] NFS: Don't declare inode uptodate unless all attributes were
checked
Fix a bug, whereby nfs_update_inode() was declaring the inode to be
up to date despite not having checked all the attributes.
The bug occurs because the temporary variable in which we cache
the validity information is 'sanitised' before reapplying to
nfsi->cache_validity.
Reported-by: Kinglong Mee <[email protected]>
Cc: [email protected] # 3.5+
Signed-off-by: Trond Myklebust <[email protected]>
---
fs/nfs/inode.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index c0f7b1d0b814..a6f9c8581228 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1581,18 +1581,20 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
inode->i_version = fattr->change_attr;
}
} else if (server->caps & NFS_CAP_CHANGE_ATTR)
- invalid |= save_cache_validity;
+ nfsi->cache_validity |= save_cache_validity;
if (fattr->valid & NFS_ATTR_FATTR_MTIME) {
memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime));
} else if (server->caps & NFS_CAP_MTIME)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_REVAL_FORCED);
if (fattr->valid & NFS_ATTR_FATTR_CTIME) {
memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime));
} else if (server->caps & NFS_CAP_CTIME)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_REVAL_FORCED);
/* Check if our cached file size is stale */
@@ -1614,7 +1616,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
(long long)new_isize);
}
} else
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_REVAL_PAGECACHE
| NFS_INO_REVAL_FORCED);
@@ -1622,7 +1625,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
if (fattr->valid & NFS_ATTR_FATTR_ATIME)
memcpy(&inode->i_atime, &fattr->atime, sizeof(inode->i_atime));
else if (server->caps & NFS_CAP_ATIME)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATIME
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATIME
| NFS_INO_REVAL_FORCED);
if (fattr->valid & NFS_ATTR_FATTR_MODE) {
@@ -1633,7 +1637,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_ACCESS|NFS_INO_INVALID_ACL;
}
} else if (server->caps & NFS_CAP_MODE)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_INVALID_ACCESS
| NFS_INO_INVALID_ACL
| NFS_INO_REVAL_FORCED);
@@ -1644,7 +1649,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
inode->i_uid = fattr->uid;
}
} else if (server->caps & NFS_CAP_OWNER)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_INVALID_ACCESS
| NFS_INO_INVALID_ACL
| NFS_INO_REVAL_FORCED);
@@ -1655,7 +1661,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
inode->i_gid = fattr->gid;
}
} else if (server->caps & NFS_CAP_OWNER_GROUP)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_INVALID_ACCESS
| NFS_INO_INVALID_ACL
| NFS_INO_REVAL_FORCED);
@@ -1668,7 +1675,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
set_nlink(inode, fattr->nlink);
}
} else if (server->caps & NFS_CAP_NLINK)
- invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
+ nfsi->cache_validity |= save_cache_validity &
+ (NFS_INO_INVALID_ATTR
| NFS_INO_REVAL_FORCED);
if (fattr->valid & NFS_ATTR_FATTR_SPACE_USED) {
--
1.9.0
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On 2014/4/15 22:24, Trond Myklebust wrote:
> On Tue, 2014-04-15 at 09:22 -0400, Trond Myklebust wrote:
>> On Apr 15, 2014, at 1:02, Kinglong Mee <[email protected]> wrote:
>>> Next, the "stat testfile" gets data from cache,
>>> because NFS_INO_INVALID_ATTR flags is cleared below.
>>>
>>> Calltrace,
>>> [ 4883.997254] nfs4_proc_write_setup
>>> [ 4884.006885] NFS: 1365 nfs_writeback_done (status 11)
>>> [ 4884.008215] nfs4_write_done
>>> [ 4884.009273] nfs4_write_done_cb
>>> [ 4884.010013] nfs_post_op_update_inode_force_wcc
>>> [ 4884.011221] nfs_update_inode
>>> [ 4884.012001] nfs_update_inode
>>> [ 4884.012952] nfs_writeback_done: before nfs_should_remove_suid
>>> [ 4884.014722] nfs_writeback_done: in nfs_should_remove_suid
>>> [ 4884.016549] nfs4_close_done
>>> [ 4884.017614] nfs_refresh_inode
>>> [ 4884.018645] nfs_update_inode
>>> [ 4884.019693] nfs_update_inode
>>>
>>> But, if getting status before close, the mode can be update to latest.
>>
>> Argh. That is a bug in nfs_update_inode(). It is not supposed to clear NFS_INO_INVALID_ATTR if nfs_fattr does not contain a complete set of attributes.
>>
>> Thanks for testing, Kinglong. This is extremely helpful...
>
> Can you please see if the following patch fixes the above issue?
Thanks Trond,
Yes, the following patch fixes the above issue.
With the two patches you have send, suid/sgid can be updated correctly now.
thanks,
Kinglong Mee
> 8<--------------------------------------------------------------
>>From 12c9c63a005889d70fbcbdb746cd7e7d127b0f01 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Tue, 15 Apr 2014 10:07:57 -0400
> Subject: [PATCH] NFS: Don't declare inode uptodate unless all attributes were
> checked
>
> Fix a bug, whereby nfs_update_inode() was declaring the inode to be
> up to date despite not having checked all the attributes.
> The bug occurs because the temporary variable in which we cache
> the validity information is 'sanitised' before reapplying to
> nfsi->cache_validity.
>
> Reported-by: Kinglong Mee <[email protected]>
> Cc: [email protected] # 3.5+
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/inode.c | 26 +++++++++++++++++---------
> 1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index c0f7b1d0b814..a6f9c8581228 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1581,18 +1581,20 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> inode->i_version = fattr->change_attr;
> }
> } else if (server->caps & NFS_CAP_CHANGE_ATTR)
> - invalid |= save_cache_validity;
> + nfsi->cache_validity |= save_cache_validity;
>
> if (fattr->valid & NFS_ATTR_FATTR_MTIME) {
> memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime));
> } else if (server->caps & NFS_CAP_MTIME)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_REVAL_FORCED);
>
> if (fattr->valid & NFS_ATTR_FATTR_CTIME) {
> memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime));
> } else if (server->caps & NFS_CAP_CTIME)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_REVAL_FORCED);
>
> /* Check if our cached file size is stale */
> @@ -1614,7 +1616,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> (long long)new_isize);
> }
> } else
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_REVAL_PAGECACHE
> | NFS_INO_REVAL_FORCED);
>
> @@ -1622,7 +1625,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> if (fattr->valid & NFS_ATTR_FATTR_ATIME)
> memcpy(&inode->i_atime, &fattr->atime, sizeof(inode->i_atime));
> else if (server->caps & NFS_CAP_ATIME)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATIME
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATIME
> | NFS_INO_REVAL_FORCED);
>
> if (fattr->valid & NFS_ATTR_FATTR_MODE) {
> @@ -1633,7 +1637,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> invalid |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_ACCESS|NFS_INO_INVALID_ACL;
> }
> } else if (server->caps & NFS_CAP_MODE)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_INVALID_ACCESS
> | NFS_INO_INVALID_ACL
> | NFS_INO_REVAL_FORCED);
> @@ -1644,7 +1649,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> inode->i_uid = fattr->uid;
> }
> } else if (server->caps & NFS_CAP_OWNER)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_INVALID_ACCESS
> | NFS_INO_INVALID_ACL
> | NFS_INO_REVAL_FORCED);
> @@ -1655,7 +1661,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> inode->i_gid = fattr->gid;
> }
> } else if (server->caps & NFS_CAP_OWNER_GROUP)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_INVALID_ACCESS
> | NFS_INO_INVALID_ACL
> | NFS_INO_REVAL_FORCED);
> @@ -1668,7 +1675,8 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
> set_nlink(inode, fattr->nlink);
> }
> } else if (server->caps & NFS_CAP_NLINK)
> - invalid |= save_cache_validity & (NFS_INO_INVALID_ATTR
> + nfsi->cache_validity |= save_cache_validity &
> + (NFS_INO_INVALID_ATTR
> | NFS_INO_REVAL_FORCED);
>
> if (fattr->valid & NFS_ATTR_FATTR_SPACE_USED) {
>
On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
> After writing data at NFS client, file's access mode is inconsistent
> with server.
> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
> but client don't get it.
>
> #touch hello; chmod 06777 hello; stat hello;
> File: ??hello??
> Size: 0 Blocks: 0 IO Block: 262144 regular
> empty file
> Device: 24h/36d Inode: 786434 Links: 1
> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> Context: system_u:object_r:nfs_t:s0
> Access: 2014-04-13 21:00:44.996908708 +0800
> Modify: 2014-04-13 21:00:44.996908708 +0800
> Change: 2014-04-13 21:00:45.033908705 +0800
> Birth: -
>
> #echo 12324 > hello; stat hello; stat /nfstest/hello
> File: ??hello??
> Size: 6 Blocks: 0 IO Block: 262144 regular file
> Device: 24h/36d Inode: 786434 Links: 1
> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> ^^^^^ it should be 0777
> Context: system_u:object_r:nfs_t:s0
> Access: 2014-04-13 21:00:44.996908708 +0800
> Modify: 2014-04-13 21:00:45.061908703 +0800
> Change: 2014-04-13 21:00:45.061908703 +0800
> Birth: -
> File: ??/nfstest/hello??
> Size: 6 Blocks: 8 IO Block: 4096 regular file
> Device: 803h/2051d Inode: 786434 Links: 1
> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
> ^^^^^ bits on the server
> Context: system_u:object_r:default_t:s0
> Access: 2014-04-13 21:00:44.996908708 +0800
> Modify: 2014-04-13 21:00:45.061908703 +0800
> Change: 2014-04-13 21:00:45.061908703 +0800
> Birth: -
>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 397be39..f234af7 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2819,7 +2819,7 @@ static int _nfs4_server_capabilities(struct
> nfs_server *server, struct nfs_fh *f
>
> memcpy(server->cache_consistency_bitmask, res.attr_bitmask,
> sizeof(server->cache_consistency_bitmask));
> server->cache_consistency_bitmask[0] &=
> FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
> - server->cache_consistency_bitmask[1] &=
> FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
> + server->cache_consistency_bitmask[1] &=
> FATTR4_WORD1_MODE|FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
> server->cache_consistency_bitmask[2] = 0;
> server->acl_bitmask = res.acl_bitmask;
> server->fh_expire_type = res.fh_expire_type;
> --
> 1.9.0
>
Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can??t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
于 2014/4/14 23:00, Trond Myklebust 写道:
> On Mon, 2014-04-14 at 21:31 +0800, Kinglong Mee wrote:
>>
>> 于 2014/4/14 21:12, Trond Myklebust 写道:
>>>
>>> On Apr 14, 2014, at 8:59, Kinglong Mee <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> 于 2014/4/13 23:24, Trond Myklebust 写道:
>>>>> On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
>>>>>>
>>>>>> 于 2014/4/13 22:28, Trond Myklebust 写道:
>>>>>>>
>>>>>>> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
>>>>>>>
>>>>>>>> After writing data at NFS client, file's access mode is inconsistent
>>>>>>>> with server.
>>>>>>>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
>>>>>>>> but client don't get it.
>>>>>>>>
>>>>>>>> #touch hello; chmod 06777 hello; stat hello;
>>>>>>>> File: ‘hello’
>>>>>>>> Size: 0 Blocks: 0 IO Block: 262144 regular
>>>>>>>> empty file
>>>>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>>> Context: system_u:object_r:nfs_t:s0
>>>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>> Modify: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>> Change: 2014-04-13 21:00:45.033908705 +0800
>>>>>>>> Birth: -
>>>>>>>>
>>>>>>>> #echo 12324 > hello; stat hello; stat /nfstest/hello
>>>>>>>> File: ‘hello’
>>>>>>>> Size: 6 Blocks: 0 IO Block: 262144 regular file
>>>>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>>> ^^^^^ it should be 0777
>>>>>>>> Context: system_u:object_r:nfs_t:s0
>>>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>> Birth: -
>>>>>>>> File: ‘/nfstest/hello’
>>>>>>>> Size: 6 Blocks: 8 IO Block: 4096 regular file
>>>>>>>> Device: 803h/2051d Inode: 786434 Links: 1
>>>>>>>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>>> ^^^^^ bits on the server
>>>>>>>> Context: system_u:object_r:default_t:s0
>>>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>> Birth: -
>>>>>>>>
>>>>> <snip>
>>>>>>>
>>>>>>> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
>>>>>>
>>>>> <snip>
>>>>>> IMO, client shoulds get all metadatas from server, so, adds the flag.
>>>>>> I think should_remove_suid() should be called by nfsd, not NFS client
>>>>>
>>>>> I agree with 50% of that statement. Please see below.
>>>>>
>>>>> 8<---------------------------------------------------------------------
>>>>>> From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
>>>>> From: Trond Myklebust <[email protected]>
>>>>> Date: Sun, 13 Apr 2014 11:11:31 -0400
>>>>> Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
>>>>> write
>>>>>
>>>>> If we suspect that the server may have cleared the suid/sgid bit,
>>>>> then mark the inode for revalidation.
>>>>
>>>> When testing with this patch, should_remove_suid() always return false
>>>> at client, but return true at NFS server.
>>>>
>>>> So that, NFS server clears the suid/sgid bit, but client also remains.
>>>
>>> Are you running the test as root? The only explanation I can see for should_remove_suid() failing is if the ‘CAP_FSETID’ capability is set.
>>
>> I test it with non-root user, should_remove_suid() also return 0.
>
> OK. Let's make a version that ignores the capabilities, and just tests
> the SUID/SGID bits.
Due to another problem, test failed again using commands
"echo xxxdsf > testfile; stat testfile".
In nfs_writeback_done(), nfs_mark_for_revalidate() set cache_validity's
NFS_INO_INVALID_ATTR flag, but nfs4_close_done() will refresh
inode from cache (old mode, not update from server ) and clear
NFS_INO_INVALID_ATTR flags.
Next, the "stat testfile" gets data from cache,
because NFS_INO_INVALID_ATTR flags is cleared below.
Calltrace,
[ 4883.997254] nfs4_proc_write_setup
[ 4884.006885] NFS: 1365 nfs_writeback_done (status 11)
[ 4884.008215] nfs4_write_done
[ 4884.009273] nfs4_write_done_cb
[ 4884.010013] nfs_post_op_update_inode_force_wcc
[ 4884.011221] nfs_update_inode
[ 4884.012001] nfs_update_inode
[ 4884.012952] nfs_writeback_done: before nfs_should_remove_suid
[ 4884.014722] nfs_writeback_done: in nfs_should_remove_suid
[ 4884.016549] nfs4_close_done
[ 4884.017614] nfs_refresh_inode
[ 4884.018645] nfs_update_inode
[ 4884.019693] nfs_update_inode
But, if getting status before close, the mode can be update to latest.
thanks,
Kinglong Mee
> 8<--------------------------------------------------------------------
>>From 2e068b62316b2fa5738a8b730bcb5f2f8e7cbdb1 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Sun, 13 Apr 2014 11:11:31 -0400
> Subject: [PATCH v2] NFS: Don't ignore suid/sgid bit changes after a successful
> write
>
> If we suspect that the server may have cleared the suid/sgid bit,
> then mark the inode for revalidation.
>
> Reported-by: Kinglong Mee <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/write.c | 35 +++++++++++++++++++++++++++++++++--
> 1 file changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index 9a3b6a4cd6b9..cd7c651f9b84 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1353,6 +1353,30 @@ static const struct rpc_call_ops nfs_write_common_ops = {
> .rpc_release = nfs_writeback_release_common,
> };
>
> +/*
> + * Special version of should_remove_suid() that ignores capabilities.
> + */
> +static int nfs_should_remove_suid(const struct inode *inode)
> +{
> + umode_t mode = inode->i_mode;
> + int kill = 0;
> +
> + /* suid always must be killed */
> + if (unlikely(mode & S_ISUID))
> + kill = ATTR_KILL_SUID;
> +
> + /*
> + * sgid without any exec bits is just a mandatory locking mark; leave
> + * it alone. If some exec bits are set, it's a real sgid; kill it.
> + */
> + if (unlikely((mode & S_ISGID) && (mode & S_IXGRP)))
> + kill |= ATTR_KILL_SGID;
> +
> + if (unlikely(kill && S_ISREG(mode)))
> + return kill;
> +
> + return 0;
> +}
>
> /*
> * This function is called when the WRITE call is complete.
> @@ -1401,9 +1425,16 @@ void nfs_writeback_done(struct rpc_task *task, struct nfs_write_data *data)
> }
> }
> #endif
> - if (task->tk_status < 0)
> + if (task->tk_status < 0) {
> nfs_set_pgio_error(data->header, task->tk_status, argp->offset);
> - else if (resp->count < argp->count) {
> + return;
> + }
> +
> + /* Deal with the suid/sgid bit corner case */
> + if (nfs_should_remove_suid(inode))
> + nfs_mark_for_revalidate(inode);
> +
> + if (resp->count < argp->count) {
> static unsigned long complain;
>
> /* This a short write! */
>
On Apr 15, 2014, at 1:02, Kinglong Mee <[email protected]> wrote:
>
>
> 于 2014/4/14 23:00, Trond Myklebust 写道:
>> On Mon, 2014-04-14 at 21:31 +0800, Kinglong Mee wrote:
>>>
>>> 于 2014/4/14 21:12, Trond Myklebust 写道:
>>>>
>>>> On Apr 14, 2014, at 8:59, Kinglong Mee <[email protected]> wrote:
>>>>
>>>>>
>>>>>
>>>>> 于 2014/4/13 23:24, Trond Myklebust 写道:
>>>>>> On Sun, 2014-04-13 at 22:53 +0800, Kinglong Mee wrote:
>>>>>>>
>>>>>>> 于 2014/4/13 22:28, Trond Myklebust 写道:
>>>>>>>>
>>>>>>>> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> After writing data at NFS client, file's access mode is inconsistent
>>>>>>>>> with server.
>>>>>>>>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
>>>>>>>>> but client don't get it.
>>>>>>>>>
>>>>>>>>> #touch hello; chmod 06777 hello; stat hello;
>>>>>>>>> File: ‘hello’
>>>>>>>>> Size: 0 Blocks: 0 IO Block: 262144 regular
>>>>>>>>> empty file
>>>>>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>>>> Context: system_u:object_r:nfs_t:s0
>>>>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>>> Modify: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>>> Change: 2014-04-13 21:00:45.033908705 +0800
>>>>>>>>> Birth: -
>>>>>>>>>
>>>>>>>>> #echo 12324 > hello; stat hello; stat /nfstest/hello
>>>>>>>>> File: ‘hello’
>>>>>>>>> Size: 6 Blocks: 0 IO Block: 262144 regular file
>>>>>>>>> Device: 24h/36d Inode: 786434 Links: 1
>>>>>>>>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>>>> ^^^^^ it should be 0777
>>>>>>>>> Context: system_u:object_r:nfs_t:s0
>>>>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>>> Birth: -
>>>>>>>>> File: ‘/nfstest/hello’
>>>>>>>>> Size: 6 Blocks: 8 IO Block: 4096 regular file
>>>>>>>>> Device: 803h/2051d Inode: 786434 Links: 1
>>>>>>>>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>>>>>>>>> ^^^^^ bits on the server
>>>>>>>>> Context: system_u:object_r:default_t:s0
>>>>>>>>> Access: 2014-04-13 21:00:44.996908708 +0800
>>>>>>>>> Modify: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>>> Change: 2014-04-13 21:00:45.061908703 +0800
>>>>>>>>> Birth: -
>>>>>>>>>
>>>>>> <snip>
>>>>>>>>
>>>>>>>> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can’t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
>>>>>>>
>>>>>> <snip>
>>>>>>> IMO, client shoulds get all metadatas from server, so, adds the flag.
>>>>>>> I think should_remove_suid() should be called by nfsd, not NFS client
>>>>>>
>>>>>> I agree with 50% of that statement. Please see below.
>>>>>>
>>>>>> 8<---------------------------------------------------------------------
>>>>>>> From a7b05fc5fcb433e8cfca577c9275f2012b523ee8 Mon Sep 17 00:00:00 2001
>>>>>> From: Trond Myklebust <[email protected]>
>>>>>> Date: Sun, 13 Apr 2014 11:11:31 -0400
>>>>>> Subject: [PATCH] NFS: Don't ignore suid/sgid bit changes after a successful
>>>>>> write
>>>>>>
>>>>>> If we suspect that the server may have cleared the suid/sgid bit,
>>>>>> then mark the inode for revalidation.
>>>>>
>>>>> When testing with this patch, should_remove_suid() always return false
>>>>> at client, but return true at NFS server.
>>>>>
>>>>> So that, NFS server clears the suid/sgid bit, but client also remains.
>>>>
>>>> Are you running the test as root? The only explanation I can see for should_remove_suid() failing is if the ‘CAP_FSETID’ capability is set.
>>>
>>> I test it with non-root user, should_remove_suid() also return 0.
>>
>> OK. Let's make a version that ignores the capabilities, and just tests
>> the SUID/SGID bits.
>
> Due to another problem, test failed again using commands
> "echo xxxdsf > testfile; stat testfile".
>
> In nfs_writeback_done(), nfs_mark_for_revalidate() set cache_validity's
> NFS_INO_INVALID_ATTR flag, but nfs4_close_done() will refresh
> inode from cache (old mode, not update from server ) and clear
> NFS_INO_INVALID_ATTR flags.
>
> Next, the "stat testfile" gets data from cache,
> because NFS_INO_INVALID_ATTR flags is cleared below.
>
> Calltrace,
> [ 4883.997254] nfs4_proc_write_setup
> [ 4884.006885] NFS: 1365 nfs_writeback_done (status 11)
> [ 4884.008215] nfs4_write_done
> [ 4884.009273] nfs4_write_done_cb
> [ 4884.010013] nfs_post_op_update_inode_force_wcc
> [ 4884.011221] nfs_update_inode
> [ 4884.012001] nfs_update_inode
> [ 4884.012952] nfs_writeback_done: before nfs_should_remove_suid
> [ 4884.014722] nfs_writeback_done: in nfs_should_remove_suid
> [ 4884.016549] nfs4_close_done
> [ 4884.017614] nfs_refresh_inode
> [ 4884.018645] nfs_update_inode
> [ 4884.019693] nfs_update_inode
>
> But, if getting status before close, the mode can be update to latest.
Argh. That is a bug in nfs_update_inode(). It is not supposed to clear NFS_INO_INVALID_ATTR if nfs_fattr does not contain a complete set of attributes.
Thanks for testing, Kinglong. This is extremely helpful...
_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
?? 2014/4/13 22:28, Trond Myklebust д??:
>
> On Apr 13, 2014, at 9:11, Kinglong Mee <[email protected]> wrote:
>
>> After writing data at NFS client, file's access mode is inconsistent
>> with server.
>> Because WRITE proceduce changes the S_ISUID and S_ISGID bits,
>> but client don't get it.
>>
>> #touch hello; chmod 06777 hello; stat hello;
>> File: ??hello??
>> Size: 0 Blocks: 0 IO Block: 262144 regular
>> empty file
>> Device: 24h/36d Inode: 786434 Links: 1
>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>> Context: system_u:object_r:nfs_t:s0
>> Access: 2014-04-13 21:00:44.996908708 +0800
>> Modify: 2014-04-13 21:00:44.996908708 +0800
>> Change: 2014-04-13 21:00:45.033908705 +0800
>> Birth: -
>>
>> #echo 12324 > hello; stat hello; stat /nfstest/hello
>> File: ??hello??
>> Size: 6 Blocks: 0 IO Block: 262144 regular file
>> Device: 24h/36d Inode: 786434 Links: 1
>> Access: (6777/-rwsrwsrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>> ^^^^^ it should be 0777
>> Context: system_u:object_r:nfs_t:s0
>> Access: 2014-04-13 21:00:44.996908708 +0800
>> Modify: 2014-04-13 21:00:45.061908703 +0800
>> Change: 2014-04-13 21:00:45.061908703 +0800
>> Birth: -
>> File: ??/nfstest/hello??
>> Size: 6 Blocks: 8 IO Block: 4096 regular file
>> Device: 803h/2051d Inode: 786434 Links: 1
>> Access: (0777/-rwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
>> ^^^^^ bits on the server
>> Context: system_u:object_r:default_t:s0
>> Access: 2014-04-13 21:00:44.996908708 +0800
>> Modify: 2014-04-13 21:00:45.061908703 +0800
>> Change: 2014-04-13 21:00:45.061908703 +0800
>> Birth: -
>>
>> Signed-off-by: Kinglong Mee <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 397be39..f234af7 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2819,7 +2819,7 @@ static int _nfs4_server_capabilities(struct
>> nfs_server *server, struct nfs_fh *f
>>
>> memcpy(server->cache_consistency_bitmask, res.attr_bitmask,
>> sizeof(server->cache_consistency_bitmask));
>> server->cache_consistency_bitmask[0] &=
>> FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
>> - server->cache_consistency_bitmask[1] &=
>> FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
>> + server->cache_consistency_bitmask[1] &=
>> FATTR4_WORD1_MODE|FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
>> server->cache_consistency_bitmask[2] = 0;
>> server->acl_bitmask = res.acl_bitmask;
>> server->fh_expire_type = res.fh_expire_type;
>> --
>> 1.9.0
>>
>
> Instead of requesting a new attribute on each and every operation just in order to deal with an extremely rare corner case, is there any reason why we can??t just do this by checking should_remove_suid(), clearing the mode bits ourselves, and then marking the attributes for revalidation?
Right now, the suid/sgid should be cleared in nfs_setattr,
485 int
486 nfs_setattr(struct dentry *dentry, struct iattr *attr)
487 {
... ...
494 /* skip mode change if it's just for clearing setuid/setgid */
495 if (attr->ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID))
496 attr->ia_valid &= ~ATTR_MODE;
497
but, NFS client can't pass ATTR_KILL_SUID/SGID to NFS server,
NFS server just kill those bits in nfsd_vfs_write,
860 static void kill_suid(struct dentry *dentry)
861 {
862 struct iattr ia;
863 ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
864
865 mutex_lock(&dentry->d_inode->i_mutex);
866 /*
867 * Note we call this on write, so notify_change will not
868 * encounter any conflicting delegations:
869 */
870 notify_change(dentry, &ia, NULL);
871 mutex_unlock(&dentry->d_inode->i_mutex);
872 }
... ...
911 static __be32
912 nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct
file *file,
913 loff_t offset, struct kvec *vec,
int vlen,
914 unsigned long *cnt, int *stablep)
915 {
... ...
945 /* clear setuid/setgid flag after write */
946 if (inode->i_mode & (S_ISUID | S_ISGID))
947 kill_suid(dentry);
IMO, client shoulds get all metadatas from server, so, adds the flag.
I think should_remove_suid() should be called by nfsd, not NFS client.
thanks,
Kinglong Mee