2007-08-06 13:56:26

by Jeff Layton

[permalink] [raw]
Subject: [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

Separate the handling of the local ia_valid bitmask from the one in
attr->ia_valid. This allows us to hand off the actual handling of the
ATTR_KILL_* flags to the .setattr i_op when one is defined.

notify_change still needs to process those flags for the local ia_valid
variable, since it uses that to decide whether to return early, and to pass
a (hopefully) appropriate bitmask to fsnotify_change.

Signed-off-by: Jeff Layton <[email protected]>
---
fs/attr.c | 54 +++++++++++++++++++++++++++++++++------------------
include/linux/fs.h | 1 +
2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index f8dfc22..47015e0 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -100,15 +100,39 @@ int inode_setattr(struct inode * inode, struct iattr * attr)
}
EXPORT_SYMBOL(inode_setattr);

+void attr_kill_to_mode(struct inode *inode, struct iattr *attr)
+{
+ if (attr->ia_valid & ATTR_KILL_SUID) {
+ attr->ia_valid &= ~ATTR_KILL_SUID;
+ if (inode->i_mode & S_ISUID) {
+ if (!(attr->ia_valid & ATTR_MODE)) {
+ attr->ia_valid |= ATTR_MODE;
+ attr->ia_mode = inode->i_mode;
+ }
+ attr->ia_mode &= ~S_ISUID;
+ }
+ }
+ if (attr->ia_valid & ATTR_KILL_SGID) {
+ attr->ia_valid &= ~ATTR_KILL_SGID;
+ if ((inode->i_mode & (S_ISGID | S_IXGRP)) ==
+ (S_ISGID | S_IXGRP)) {
+ if (!(attr->ia_valid & ATTR_MODE)) {
+ attr->ia_valid |= ATTR_MODE;
+ attr->ia_mode = inode->i_mode;
+ }
+ attr->ia_mode &= ~S_ISGID;
+ }
+ }
+}
+EXPORT_SYMBOL(attr_kill_to_mode);
+
int notify_change(struct dentry * dentry, struct iattr * attr)
{
struct inode *inode = dentry->d_inode;
- mode_t mode;
int error;
struct timespec now;
unsigned int ia_valid = attr->ia_valid;

- mode = inode->i_mode;
now = current_fs_time(inode->i_sb);

attr->ia_ctime = now;
@@ -117,26 +141,17 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
if (!(ia_valid & ATTR_MTIME_SET))
attr->ia_mtime = now;
if (ia_valid & ATTR_KILL_SUID) {
- attr->ia_valid &= ~ATTR_KILL_SUID;
- if (mode & S_ISUID) {
- if (!(ia_valid & ATTR_MODE)) {
- ia_valid = attr->ia_valid |= ATTR_MODE;
- attr->ia_mode = inode->i_mode;
- }
- attr->ia_mode &= ~S_ISUID;
- }
+ ia_valid &= ~ATTR_KILL_SUID;
+ if (inode->i_mode & S_ISUID)
+ ia_valid |= ATTR_MODE;
}
if (ia_valid & ATTR_KILL_SGID) {
- attr->ia_valid &= ~ ATTR_KILL_SGID;
- if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
- if (!(ia_valid & ATTR_MODE)) {
- ia_valid = attr->ia_valid |= ATTR_MODE;
- attr->ia_mode = inode->i_mode;
- }
- attr->ia_mode &= ~S_ISGID;
- }
+ ia_valid &= ~ATTR_KILL_SGID;
+ if ((inode->i_mode & (S_ISGID | S_IXGRP)) ==
+ (S_ISGID | S_IXGRP))
+ ia_valid |= ATTR_MODE;
}
- if (!attr->ia_valid)
+ if (!ia_valid)
return 0;

if (ia_valid & ATTR_SIZE)
@@ -147,6 +162,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr)
if (!error)
error = inode->i_op->setattr(dentry, attr);
} else {
+ attr_kill_to_mode(inode, attr);
error = inode_change_ok(inode, attr);
if (!error)
error = security_inode_setattr(dentry, attr);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d33bead..f617b23 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1561,6 +1561,7 @@ extern int do_remount_sb(struct super_block *sb, int flags,
#ifdef CONFIG_BLOCK
extern sector_t bmap(struct inode *, sector_t);
#endif
+extern void attr_kill_to_mode(struct inode *inode, struct iattr *attr);
extern int notify_change(struct dentry *, struct iattr *);
extern int permission(struct inode *, int, struct nameidata *);
extern int generic_permission(struct inode *, int,
--
1.5.2.2


2007-08-06 17:50:56

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

> Separate the handling of the local ia_valid bitmask from the one in
> attr->ia_valid. This allows us to hand off the actual handling of the
> ATTR_KILL_* flags to the .setattr i_op when one is defined.
>
> notify_change still needs to process those flags for the local ia_valid
> variable, since it uses that to decide whether to return early, and to pass
> a (hopefully) appropriate bitmask to fsnotify_change.

I agree with this change and fuse will make use of it as well.

Maybe instead of unconditionally moving attr_kill_to_mode() inside
->setattr() it could be made conditional based on an inode flag
similarly to S_NOCMTIME. Advantages:

- no need to modify a lot of in-tree filesystems
- no silent breakage of out-of-tree fs

Actually I think the new flag would be used by exacly the same
filesystems as S_NOCMTIME, so maybe it would make sense to rename
S_NOCMTIME to something more generic (S_NOATTRUPDATE or whatever) and
use that.

But that could still break out-of-tree fs, so a separate flag is
probably better.

Miklos

2007-08-06 18:14:38

by Jeff Layton

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

On Mon, 06 Aug 2007 19:43:46 +0200
Miklos Szeredi <[email protected]> wrote:

> > Separate the handling of the local ia_valid bitmask from the one in
> > attr->ia_valid. This allows us to hand off the actual handling of the
> > ATTR_KILL_* flags to the .setattr i_op when one is defined.
> >
> > notify_change still needs to process those flags for the local ia_valid
> > variable, since it uses that to decide whether to return early, and to pass
> > a (hopefully) appropriate bitmask to fsnotify_change.
>
> I agree with this change and fuse will make use of it as well.
>
> Maybe instead of unconditionally moving attr_kill_to_mode() inside
> ->setattr() it could be made conditional based on an inode flag
> similarly to S_NOCMTIME. Advantages:
>
> - no need to modify a lot of in-tree filesystems
> - no silent breakage of out-of-tree fs
>
> Actually I think the new flag would be used by exacly the same
> filesystems as S_NOCMTIME, so maybe it would make sense to rename
> S_NOCMTIME to something more generic (S_NOATTRUPDATE or whatever) and
> use that.
>
> But that could still break out-of-tree fs, so a separate flag is
> probably better.
>

In the past I've been told that adding new flags is something of a
"last resort". Since it's not strictly necessary to fix this then
it may be best to avoid that.

That said, if the concensus is that we need a transition mechanism,
then I'd be open to such a suggestion.

--
Jeff Layton <[email protected]>

2007-08-06 18:31:52

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

> > I agree with this change and fuse will make use of it as well.
> >
> > Maybe instead of unconditionally moving attr_kill_to_mode() inside
> > ->setattr() it could be made conditional based on an inode flag
> > similarly to S_NOCMTIME. Advantages:
> >
> > - no need to modify a lot of in-tree filesystems
> > - no silent breakage of out-of-tree fs
> >
> > Actually I think the new flag would be used by exacly the same
> > filesystems as S_NOCMTIME, so maybe it would make sense to rename
> > S_NOCMTIME to something more generic (S_NOATTRUPDATE or whatever) and
> > use that.
> >
> > But that could still break out-of-tree fs, so a separate flag is
> > probably better.
> >
>
> In the past I've been told that adding new flags is something of a
> "last resort". Since it's not strictly necessary to fix this then
> it may be best to avoid that.
>
> That said, if the concensus is that we need a transition mechanism,
> then I'd be open to such a suggestion.

I think there's really no other choice here.

Your patch is changing the API in a very unsafe way, since there will
be no error or warning on an unconverted fs. And that could lead to
security holes.

If we would rename the setattr method to setattr_new as well as
changing it's behavior, that would be fine. But I guess we do not
want to do that.

Miklos

2007-08-06 19:05:26

by Trond Myklebust

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

On Mon, 2007-08-06 at 20:28 +0200, Miklos Szeredi wrote:

> Your patch is changing the API in a very unsafe way, since there will
> be no error or warning on an unconverted fs. And that could lead to
> security holes.
>
> If we would rename the setattr method to setattr_new as well as
> changing it's behavior, that would be fine. But I guess we do not
> want to do that.

Which "unconverted fses"? If we're talking out of tree stuff, then too
bad: it is _their_ responsibility to keep up with kernel changes.

Trond

2007-08-06 19:38:46

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

> > Your patch is changing the API in a very unsafe way, since there will
> > be no error or warning on an unconverted fs. And that could lead to
> > security holes.
> >
> > If we would rename the setattr method to setattr_new as well as
> > changing it's behavior, that would be fine. But I guess we do not
> > want to do that.
>
> Which "unconverted fses"? If we're talking out of tree stuff, then too
> bad: it is _their_ responsibility to keep up with kernel changes.

It is usually a good idea to not change the semantics of an API in a
backward incompatible way without changing the syntax as well.

This is true regardless of whether we care about out-of-tree code or
not (and we should care to some degree). And especially true if the
change in question is security sensitive.

Miklos

2007-08-06 21:24:01

by Trond Myklebust

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

On Mon, 2007-08-06 at 21:37 +0200, Miklos Szeredi wrote:
> > > Your patch is changing the API in a very unsafe way, since there will
> > > be no error or warning on an unconverted fs. And that could lead to
> > > security holes.
> > >
> > > If we would rename the setattr method to setattr_new as well as
> > > changing it's behavior, that would be fine. But I guess we do not
> > > want to do that.
> >
> > Which "unconverted fses"? If we're talking out of tree stuff, then too
> > bad: it is _their_ responsibility to keep up with kernel changes.
>
> It is usually a good idea to not change the semantics of an API in a
> backward incompatible way without changing the syntax as well.

We're taking two setattr flags ATTR_KILL_SGID, and ATTR_KILL_SUID which
have existed for several years in the VFS, and making them visible to
the filesystems. Out-of-tree filesystems that care can check for them in
a completely backward compatible way: you don't even need to add a
#define.

> This is true regardless of whether we care about out-of-tree code or
> not (and we should care to some degree). And especially true if the
> change in question is security sensitive.

It is not true "regardless": the in-tree code is being converted.
Out-of-tree code is the only "problem" here, and their only problem is
that they may have to add support for the new flags if they also support
suid/sgid mode bits.

Are you advocating reserving a new filesystem bit every time we need to
add an attribute flag?

Trond

2007-08-07 06:01:46

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

(cutting out lists from CC)

> > > > Your patch is changing the API in a very unsafe way, since there will
> > > > be no error or warning on an unconverted fs. And that could lead to
> > > > security holes.
> > > >
> > > > If we would rename the setattr method to setattr_new as well as
> > > > changing it's behavior, that would be fine. But I guess we do not
> > > > want to do that.
> > >
> > > Which "unconverted fses"? If we're talking out of tree stuff, then too
> > > bad: it is _their_ responsibility to keep up with kernel changes.
> >
> > It is usually a good idea to not change the semantics of an API in a
> > backward incompatible way without changing the syntax as well.
>
> We're taking two setattr flags ATTR_KILL_SGID, and ATTR_KILL_SUID which
> have existed for several years in the VFS, and making them visible to
> the filesystems. Out-of-tree filesystems that care can check for them in
> a completely backward compatible way: you don't even need to add a
> #define.

Making flags visible is not the problem.

Making another flag invisible (ATTR_MODE) at the same time _is_.

> > This is true regardless of whether we care about out-of-tree code or
> > not (and we should care to some degree). And especially true if the
> > change in question is security sensitive.
>
> It is not true "regardless": the in-tree code is being converted.
> Out-of-tree code is the only "problem" here, and their only problem is
> that they may have to add support for the new flags if they also support
> suid/sgid mode bits.

Yes. And there would be no problem with that, as long as it would be
breaking the API in a visible way. It does not do that and that is
unsafe.

The other thing with this change is, that it's generally a good idea
to let the VFS do as much as possible, because then filesystem writers
won't get it wrong.

In this case the suid/sgid check needs to be omitted for _very_ few
filesystems (NFS and fuse). So it makes sense to leave the check
outside filesystem code, and move it inside only when necessary.

> Are you advocating reserving a new filesystem bit every time we need to
> add an attribute flag?

No, just adding a flag does not constitute a backward incompatible
change.

Miklos

2007-08-07 10:13:35

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

There's another way to deal with this in NFS and fuse, without having
to change the API:

- remove suid/sgid bits from i_mode, when refreshing the inode attributes
- store the removed bits (or the original mode) in the fs' inode strucure
- in ->getattr() restore the original mode into the returned stat

This way the VFS believes, that the inode does not in fact have the
suid/sgid bits and so won't call ->setattr() unnecessarily.

I've tested a similar change in fuse for working around unneeded check
for the directory sticky bit.

Yes, this is cheating the interface, but there's a deeper level where
it makes sense: the VFS should not be checking _any_ inode attribute
besides the file type, since they can change at any moment, and the
VFS might be using stale data without having first properly
revalidated it.

Miklos

2007-08-07 10:32:41

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

> There's another way to deal with this in NFS and fuse, without having
> to change the API:
>
> - remove suid/sgid bits from i_mode, when refreshing the inode attributes
> - store the removed bits (or the original mode) in the fs' inode strucure
> - in ->getattr() restore the original mode into the returned stat
>
> This way the VFS believes, that the inode does not in fact have the
> suid/sgid bits and so won't call ->setattr() unnecessarily.

Of course this would break exec(). Oh, well.

Miklos

2007-08-07 11:30:37

by Jeff Layton

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

On Tue, 07 Aug 2007 08:00:40 +0200
Miklos Szeredi <[email protected]> wrote:

> (cutting out lists from CC)
>
> > > > > Your patch is changing the API in a very unsafe way, since there will
> > > > > be no error or warning on an unconverted fs. And that could lead to
> > > > > security holes.
> > > > >
> > > > > If we would rename the setattr method to setattr_new as well as
> > > > > changing it's behavior, that would be fine. But I guess we do not
> > > > > want to do that.
> > > >
> > > > Which "unconverted fses"? If we're talking out of tree stuff, then too
> > > > bad: it is _their_ responsibility to keep up with kernel changes.
> > >
> > > It is usually a good idea to not change the semantics of an API in a
> > > backward incompatible way without changing the syntax as well.
> >
> > We're taking two setattr flags ATTR_KILL_SGID, and ATTR_KILL_SUID which
> > have existed for several years in the VFS, and making them visible to
> > the filesystems. Out-of-tree filesystems that care can check for them in
> > a completely backward compatible way: you don't even need to add a
> > #define.
>
> Making flags visible is not the problem.
>
> Making another flag invisible (ATTR_MODE) at the same time _is_.
>
> > > This is true regardless of whether we care about out-of-tree code or
> > > not (and we should care to some degree). And especially true if the
> > > change in question is security sensitive.
> >
> > It is not true "regardless": the in-tree code is being converted.
> > Out-of-tree code is the only "problem" here, and their only problem is
> > that they may have to add support for the new flags if they also support
> > suid/sgid mode bits.
>
> Yes. And there would be no problem with that, as long as it would be
> breaking the API in a visible way. It does not do that and that is
> unsafe.
>
> The other thing with this change is, that it's generally a good idea
> to let the VFS do as much as possible, because then filesystem writers
> won't get it wrong.
>
> In this case the suid/sgid check needs to be omitted for _very_ few
> filesystems (NFS and fuse). So it makes sense to leave the check
> outside filesystem code, and move it inside only when necessary.
>

On the other hand, the filesystem writers here are declaring their own
setattr operation. Is it unreasonable for them to take responsibility
for handling this too?

There are other in-tree filesystems that likely need to have this check
omitted (other networked filesystems in particular), but this
patchset tries to make this change transparent for now. Those authors
can go back and fix this up later.

As Trond said, in-tree filesystems will be converted so they won't
be an issue. The only danger is someone who is running unconverted
out-of-tree filesystem code on a kernel with this patch. Is that enough
of an issue to warrant us taking extra steps to deal with it?

Another alternative might be to rename notify_change(). I don't really
like gratuitously breaking the API, though, and that function is
referenced in a lot of documentation. Changing it might be confusing...

--
Jeff Layton <[email protected]>

2007-08-07 11:58:50

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

> On the other hand, the filesystem writers here are declaring their own
> setattr operation. Is it unreasonable for them to take responsibility
> for handling this too?

We have about half of all the in-tree filesystems declaring
->setattr(), and out of those only two that we know would use this.
Others haven't commented, which proably means, they just don't care
about this issue.

And even if most of them would make use of this feature, a inode flag
or filesystem flag for a smooth and backward compatible migration is
just so much better than risking to break filesystems.

And yes, I'm thinking about in-tree filesystems as well. I'm sure you
did a thorough audit of all filesystems, but we are all human and can
make mistakes. It is _always_ safer to not change an API which could
cause silent breakage. And IMO, that's far more important than the
beauty of an interface (and ->setattr() is not beautiful with or
without an inode flag).

> Another alternative might be to rename notify_change(). I don't really
> like gratuitously breaking the API, though, and that function is
> referenced in a lot of documentation. Changing it might be confusing...

Oh no. I'd like less breakage not more.

Miklos

2007-08-07 21:03:36

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

> +void attr_kill_to_mode(struct inode *inode, struct iattr *attr)

This function badly needs a kerneldoc description. Also I can't say
I like the name a lot, but without a clearly better idea I should
probably not complain :)

We should at least add a generic_ prefix to indicate it's a generic
helper valid for most filesystem (and the kerneldoc comment can explain
the details)

2007-08-07 22:23:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH 01/25] VFS: move attr_kill logic from notify_change into helper function

On Tue, 7 Aug 2007 21:51:49 +0100
Christoph Hellwig <[email protected]> wrote:

> > +void attr_kill_to_mode(struct inode *inode, struct iattr *attr)
>
> This function badly needs a kerneldoc description. Also I can't say
> I like the name a lot, but without a clearly better idea I should
> probably not complain :)
>

Thanks for the comments.

I'm not thrilled with the name either, but kill_suid and *remove_suid
were already taken, and I really didn't want to name this something too
similar since there are already so many similarly named functions that
don't do the same thing. I'm definitely open to suggestions for
something different.

> We should at least add a generic_ prefix to indicate it's a generic
> helper valid for most filesystem (and the kerneldoc comment can explain
> the details)
>

Both good suggestions. I'll plan to incorporate them in the next
respin of the set.

Thanks,
--
Jeff Layton <[email protected]>