Hello!
It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
linux-2.4.x. Both capabilities affects file system and can be
considered file system capabilities.
Let's look at linux-2.6.x.
In include/linux/capability.h CAP_FS_SET is defined to contain
following capabilities:
CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER,
CAP_FSETID and CAP_MAC_OVERRIDE.
And CAP_NFSD_SET is defined to be the same plus CAP_SYS_RESOURCE.
So, both CAP_FS_SET and CAP_NFSD_SET doesn't include CAP_MKNOD and
CAP_LINUX_IMMUTABLE.
Also include/linux/capability.h there are cap_drop_fs_set(...),
cap_raise_fs_set(...),
cap_drop_nfsd_set(...) and cap_raise_nfsd_set(...) inline functions that return
corresponding capabilities sets.
Let's look how these functions are used.
In file fs/nfsd/auth.c function nfsd_setuser(...) calls
cap_raise_nfsd_set(...) and
cap_drop_nfsd_set(...) to add/exclude corresponding capabilities to/from
effective set of current nfsd process.
And in file security/commoncap.c function cap_task_post_setuid(...) calls
cap_drop_fs_set(...) and cap_raise_fs_set(...) to change effective set
of current task
when (current->fsuid != old_ruid).
In linux-2.4.x the story is the same.
In file include/linux/capability.h CAP_FS_MASK is defined to contain
CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, CAP_FSETID
capabilities.
And in file fs/nfsd/auth.c CAP_NFSD_MASK is defined to be same as CAP_FS_MASK
plus CAP_SYS_RESOURCE.
In file fs/nfsd/auth.c function nfsd_setuser(...) uses CAP_NFSD_MASK
to add/exclude corresponding capabilities to/from effective set of current
nfsd process.
And CAP_FS_MASK used in file kernel/sys.c in function sys_setfsuid(...)
to add/exclude corresponding capabilities to/from effective set of current task.
This can be exploited (and I have succesfully tried it).
Suppose you have NFS-share exported even with root_squash option.
If one client was compromised, local root can set CAP_MKNOD to some
local user's process. Then that user can execute mknod to create a device
that will be owned by that user, e.g. block device file for /dev/hda hard drive.
And he can create that device file on NFS-share (even exported with root_squash
option). After that he can someway (ssh, cgi) execute code on another nfs client
or the server to modify it's filesystem. It will be possible because
he owns that
device file on nfs share.
The problem is because CAP_MKNOD allows that user to successfully execute
vfs_mknod(...) function on local host, and that function will call corresponding
function in nfs module which sends request to NFS server. And nfsd will not
drop CAP_MKNOD in nfsd_setuser(...) function when impersonating to that user.
Of course, NFS-shares can be mounted with nodev option, but they should be
placed on separate partition on NFS-server, so even on server that partition
is mounted with nodev option too.
So I suggest to add CAP_MKNOD and CAP_LINUX_IMMUTABLE to CAP_FS_MASK
in linux-2.4.x and to CAP_FS_MASK_B0 in linux-2.6.x.
On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
> Hello!
>
> It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
> added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
> linux-2.4.x. Both capabilities affects file system and can be
> considered file system capabilities.
Sounds right to me--I'd expect rootsquash to guarantee that new device
nodes can't be created from the network. Cc'ing random people from the
git log for include/linux/capability.h in hopes they can help.
--b.
(Also: my copy of mknod(2) claims "Linux... does not have the CAP_MKNOD
capability". I assume the manpage is out of date?)
>
> Let's look at linux-2.6.x.
>
> In include/linux/capability.h CAP_FS_SET is defined to contain
> following capabilities:
> CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER,
> CAP_FSETID and CAP_MAC_OVERRIDE.
>
> And CAP_NFSD_SET is defined to be the same plus CAP_SYS_RESOURCE.
>
> So, both CAP_FS_SET and CAP_NFSD_SET doesn't include CAP_MKNOD and
> CAP_LINUX_IMMUTABLE.
>
> Also include/linux/capability.h there are cap_drop_fs_set(...),
> cap_raise_fs_set(...),
> cap_drop_nfsd_set(...) and cap_raise_nfsd_set(...) inline functions that return
> corresponding capabilities sets.
>
> Let's look how these functions are used.
>
> In file fs/nfsd/auth.c function nfsd_setuser(...) calls
> cap_raise_nfsd_set(...) and
> cap_drop_nfsd_set(...) to add/exclude corresponding capabilities to/from
> effective set of current nfsd process.
>
> And in file security/commoncap.c function cap_task_post_setuid(...) calls
> cap_drop_fs_set(...) and cap_raise_fs_set(...) to change effective set
> of current task
> when (current->fsuid != old_ruid).
>
> In linux-2.4.x the story is the same.
>
> In file include/linux/capability.h CAP_FS_MASK is defined to contain
> CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, CAP_FSETID
> capabilities.
>
> And in file fs/nfsd/auth.c CAP_NFSD_MASK is defined to be same as CAP_FS_MASK
> plus CAP_SYS_RESOURCE.
>
> In file fs/nfsd/auth.c function nfsd_setuser(...) uses CAP_NFSD_MASK
> to add/exclude corresponding capabilities to/from effective set of current
> nfsd process.
>
> And CAP_FS_MASK used in file kernel/sys.c in function sys_setfsuid(...)
> to add/exclude corresponding capabilities to/from effective set of current task.
>
> This can be exploited (and I have succesfully tried it).
>
> Suppose you have NFS-share exported even with root_squash option.
> If one client was compromised, local root can set CAP_MKNOD to some
> local user's process. Then that user can execute mknod to create a device
> that will be owned by that user, e.g. block device file for /dev/hda hard drive.
>
> And he can create that device file on NFS-share (even exported with root_squash
> option). After that he can someway (ssh, cgi) execute code on another nfs client
> or the server to modify it's filesystem. It will be possible because
> he owns that
> device file on nfs share.
>
> The problem is because CAP_MKNOD allows that user to successfully execute
> vfs_mknod(...) function on local host, and that function will call corresponding
> function in nfs module which sends request to NFS server. And nfsd will not
> drop CAP_MKNOD in nfsd_setuser(...) function when impersonating to that user.
>
> Of course, NFS-shares can be mounted with nodev option, but they should be
> placed on separate partition on NFS-server, so even on server that partition
> is mounted with nodev option too.
>
> So I suggest to add CAP_MKNOD and CAP_LINUX_IMMUTABLE to CAP_FS_MASK
> in linux-2.4.x and to CAP_FS_MASK_B0 in linux-2.6.x.
Quoting J. Bruce Fields ([email protected]):
> On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
> > Hello!
> >
> > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
(Still looking into this, but meanwhile...)
> > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
> > linux-2.4.x. Both capabilities affects file system and can be
> > considered file system capabilities.
>
> Sounds right to me--I'd expect rootsquash to guarantee that new device
> nodes can't be created from the network. Cc'ing random people from the
> git log for include/linux/capability.h in hopes they can help.
>
> --b.
>
> (Also: my copy of mknod(2) claims "Linux... does not have the CAP_MKNOD
> capability". I assume the manpage is out of date?)
No, the whole paragraph is:
EPERM mode requested creation of something other than a regular file, FIFO
(named pipe), or Unix domain socket, and the caller is not privileged
(Linux: does not have the CAP_MKNOD capability);
So it's saying that 'caller is not privileged', in linux, can be
interpreted to mean 'the caller does not have CAP_MKNOD'.
>
> >
> > Let's look at linux-2.6.x.
> >
> > In include/linux/capability.h CAP_FS_SET is defined to contain
> > following capabilities:
> > CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER,
> > CAP_FSETID and CAP_MAC_OVERRIDE.
> >
> > And CAP_NFSD_SET is defined to be the same plus CAP_SYS_RESOURCE.
> >
> > So, both CAP_FS_SET and CAP_NFSD_SET doesn't include CAP_MKNOD and
> > CAP_LINUX_IMMUTABLE.
> >
> > Also include/linux/capability.h there are cap_drop_fs_set(...),
> > cap_raise_fs_set(...),
> > cap_drop_nfsd_set(...) and cap_raise_nfsd_set(...) inline functions that return
> > corresponding capabilities sets.
> >
> > Let's look how these functions are used.
> >
> > In file fs/nfsd/auth.c function nfsd_setuser(...) calls
> > cap_raise_nfsd_set(...) and
> > cap_drop_nfsd_set(...) to add/exclude corresponding capabilities to/from
> > effective set of current nfsd process.
> >
> > And in file security/commoncap.c function cap_task_post_setuid(...) calls
> > cap_drop_fs_set(...) and cap_raise_fs_set(...) to change effective set
> > of current task
> > when (current->fsuid != old_ruid).
> >
> > In linux-2.4.x the story is the same.
> >
> > In file include/linux/capability.h CAP_FS_MASK is defined to contain
> > CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, CAP_FSETID
> > capabilities.
> >
> > And in file fs/nfsd/auth.c CAP_NFSD_MASK is defined to be same as CAP_FS_MASK
> > plus CAP_SYS_RESOURCE.
> >
> > In file fs/nfsd/auth.c function nfsd_setuser(...) uses CAP_NFSD_MASK
> > to add/exclude corresponding capabilities to/from effective set of current
> > nfsd process.
> >
> > And CAP_FS_MASK used in file kernel/sys.c in function sys_setfsuid(...)
> > to add/exclude corresponding capabilities to/from effective set of current task.
> >
> > This can be exploited (and I have succesfully tried it).
> >
> > Suppose you have NFS-share exported even with root_squash option.
> > If one client was compromised, local root can set CAP_MKNOD to some
> > local user's process. Then that user can execute mknod to create a device
> > that will be owned by that user, e.g. block device file for /dev/hda hard drive.
> >
> > And he can create that device file on NFS-share (even exported with root_squash
> > option). After that he can someway (ssh, cgi) execute code on another nfs client
> > or the server to modify it's filesystem. It will be possible because
> > he owns that
> > device file on nfs share.
> >
> > The problem is because CAP_MKNOD allows that user to successfully execute
> > vfs_mknod(...) function on local host, and that function will call corresponding
> > function in nfs module which sends request to NFS server. And nfsd will not
> > drop CAP_MKNOD in nfsd_setuser(...) function when impersonating to that user.
> >
> > Of course, NFS-shares can be mounted with nodev option, but they should be
> > placed on separate partition on NFS-server, so even on server that partition
> > is mounted with nodev option too.
> >
> > So I suggest to add CAP_MKNOD and CAP_LINUX_IMMUTABLE to CAP_FS_MASK
> > in linux-2.4.x and to CAP_FS_MASK_B0 in linux-2.6.x.
Quoting J. Bruce Fields ([email protected]):
> On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
> > Hello!
> >
> > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
> > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
> > linux-2.4.x. Both capabilities affects file system and can be
> > considered file system capabilities.
>
> Sounds right to me--I'd expect rootsquash to guarantee that new device
> nodes can't be created from the network. Cc'ing random people from the
> git log for include/linux/capability.h in hopes they can help.
Yeah it seems reasonable. If it is, then does that mean that we
also need CAP_SYS_ADMIN (to write selinux labels) and CAP_SETFCAP
(to set file capabilities) as well?
-serge
On Thu, Mar 12, 2009 at 11:03:00AM -0500, Serge E. Hallyn wrote:
> Quoting J. Bruce Fields ([email protected]):
> > On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
> > > Hello!
> > >
> > > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
>
> (Still looking into this, but meanwhile...)
>
> > > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
> > > linux-2.4.x. Both capabilities affects file system and can be
> > > considered file system capabilities.
> >
> > Sounds right to me--I'd expect rootsquash to guarantee that new device
> > nodes can't be created from the network. Cc'ing random people from the
> > git log for include/linux/capability.h in hopes they can help.
> >
> > --b.
> >
> > (Also: my copy of mknod(2) claims "Linux... does not have the CAP_MKNOD
> > capability". I assume the manpage is out of date?)
>
> No, the whole paragraph is:
>
> EPERM mode requested creation of something other than a regular file, FIFO
> (named pipe), or Unix domain socket, and the caller is not privileged
> (Linux: does not have the CAP_MKNOD capability);
>
> So it's saying that 'caller is not privileged', in linux, can be
> interpreted to mean 'the caller does not have CAP_MKNOD'.
Ah! Foiled by punctuation. Apologies!--b.
On Thu, Mar 12, 2009 at 11:10:47AM -0500, Serge E. Hallyn wrote:
> Quoting J. Bruce Fields ([email protected]):
> > On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
> > > Hello!
> > >
> > > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
> > > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
> > > linux-2.4.x. Both capabilities affects file system and can be
> > > considered file system capabilities.
> >
> > Sounds right to me--I'd expect rootsquash to guarantee that new device
> > nodes can't be created from the network. Cc'ing random people from the
> > git log for include/linux/capability.h in hopes they can help.
>
> Yeah it seems reasonable. If it is, then does that mean that we
> also need CAP_SYS_ADMIN (to write selinux labels) and CAP_SETFCAP
> (to set file capabilities) as well?
For nfsd at least we should be droppping anything that concerns the
filesystem and would normally require root privileges.
We need to trace up through the users of CAP_FS_SET and figure out what
other users need.
--b.
On Fri, Mar 13, 2009 at 5:10 AM, Serge E. Hallyn <[email protected]> wrote:
> Quoting J. Bruce Fields ([email protected]):
>> On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
>> > Hello!
>> >
>> > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
>> > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
>> > linux-2.4.x. Both capabilities affects file system and can be
>> > considered file system capabilities.
>>
>> Sounds right to me--I'd expect rootsquash to guarantee that new device
>> nodes can't be created from the network. ?Cc'ing random people from the
>> git log for include/linux/capability.h in hopes they can help.
>
> Yeah it seems reasonable. ?If it is, then does that mean that we
> also need CAP_SYS_ADMIN (to write selinux labels) and CAP_SETFCAP
> (to set file capabilities) as well?
If a change is made to CAP_FS_MASK, please do remember to CC
[email protected], and linux-api@.
Cheers,
Michael
--
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/ Found a documentation bug?
http://www.kernel.org/doc/man-pages/reporting_bugs.html
2009/3/12 J. Bruce Fields <[email protected]>:
>
> For nfsd at least we should be droppping anything that concerns the
> filesystem and would normally require root privileges.
>
> We need to trace up through the users of CAP_FS_SET and figure out what
> other users need.
2009/3/12 J. Bruce Fields <[email protected]>:
> For nfsd at least we should be droppping anything that concerns the
> filesystem and would normally require root privileges.
>
> We need to trace up through the users of CAP_FS_SET and figure out what
> other users need.
CAP_FS_SET used unly in functions cap_is_fs_cap(...),
cap_drop_fs_set(...), cap_raise_fs_set(...).
And it is used as a base for CAP_NFSD_SET.
CAP_NFSD_SET is used in cap_drop_nfsd_set(...) and
cap_raise_nfsd_set(...) functions.
All above you can find in include/linux/capability.h
cap_is_fs_cap(...) is not used in mainstream kernel.
As for cap_drop_fs_set(...) and cap_raise_fs_set(...), they are used in function
cap_task_post_setuid(...) in file security/commoncap.c:
------------------------------------------------------------------------------
int cap_task_post_setuid (uid_t old_ruid, uid_t old_euid, uid_t old_suid,
int flags)
{
switch (flags) {
...
case LSM_SETID_FS:
{
uid_t old_fsuid = old_ruid;
/* Copied from kernel/sys.c:setfsuid. */
/*
* FIXME - is fsuser used for all CAP_FS_MASK capabilities?
* if not, we might be a bit too harsh here.
*/
if (!issecure (SECURE_NO_SETUID_FIXUP)) {
if (old_fsuid == 0 && current->fsuid != 0) {
current->cap_effective =
cap_drop_fs_set(
current->cap_effective);
}
if (old_fsuid != 0 && current->fsuid == 0) {
current->cap_effective =
cap_raise_fs_set(
current->cap_effective,
current->cap_permitted);
}
}
break;
}
default:
return -EINVAL;
}
return 0;
}
------------------------------------------------------------------------------
So, it raises or drops filesystem capabilities when (old_ruid !=
current->fsuid), i.e. when fsuid changes.
And cap_task_post_setuid(...) indirectly called from setfsuid(...)
syscall function in file kernel/sys.c:
------------------------------------------------------------------------------
/*
* "setfsuid()" sets the fsuid - the uid used for filesystem checks. This
* is used for "access()" and for the NFS daemon (letting nfsd stay at
* whatever uid it wants to). It normally shadows "euid", except when
* explicitly set by setfsuid() or for access..
*/
SYSCALL_DEFINE1(setfsuid, uid_t, uid)
{
int old_fsuid;
old_fsuid = current->fsuid;
if (security_task_setuid(uid, (uid_t)-1, (uid_t)-1, LSM_SETID_FS))
return old_fsuid;
if (uid == current->uid || uid == current->euid ||
uid == current->suid || uid == current->fsuid ||
capable(CAP_SETUID)) {
if (uid != old_fsuid) {
set_dumpable(current->mm, suid_dumpable);
smp_wmb();
}
current->fsuid = uid;
}
key_fsuid_changed(current);
proc_id_connector(current, PROC_EVENT_UID);
/* !!! HERE !!! */
security_task_post_setuid(old_fsuid, (uid_t)-1, (uid_t)-1, LSM_SETID_FS);
return old_fsuid;
}
------------------------------------------------------------------------------
And here is example of using setfsuid(...) syscall. I have found it in
linux-PAM package
in pam_xauth module in file modules/pam_xauth.c. This module reads
file in home directory
of authenticating user. Look at this:
------------------------------------------------------------------------------
euid = geteuid();
setfsuid(pwd->pw_uid);
fp = fopen(path, "r");
setfsuid(euid);
------------------------------------------------------------------------------
Module runs as root, but needs to temporarily drop filesystem
capabilities, so module
will not read files that user couldn't read according to permissions.
And I think that when some process running as root decides to temporarily drop
filesystem capabilities, CAP_MKNOD and CAP_LINUX_IMMUTABLE
(and probably some selinux related capablities) must be dropped too,
as ordinary users couldn't create devices, etc.
On Fri, Mar 13, 2009 at 09:21:23AM +1300, Michael Kerrisk wrote:
> On Fri, Mar 13, 2009 at 5:10 AM, Serge E. Hallyn <[email protected]> wrote:
> > Quoting J. Bruce Fields ([email protected]):
> >> On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
> >> > Hello!
> >> >
> >> > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
> >> > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
> >> > linux-2.4.x. Both capabilities affects file system and can be
> >> > considered file system capabilities.
> >>
> >> Sounds right to me--I'd expect rootsquash to guarantee that new device
> >> nodes can't be created from the network. Cc'ing random people from the
> >> git log for include/linux/capability.h in hopes they can help.
> >
> > Yeah it seems reasonable. If it is, then does that mean that we
> > also need CAP_SYS_ADMIN (to write selinux labels) and CAP_SETFCAP
> > (to set file capabilities) as well?
>
> If a change is made to CAP_FS_MASK, please do remember to CC
> [email protected], and linux-api@.
OK, that's because the exact set of capabilities that is dropped on
setfsuid is documented in capabilities(7)? (Anywhere else?)
--b.
>
> Cheers,
>
> Michael
>
>
> --
> Michael Kerrisk Linux man-pages maintainer;
> http://www.kernel.org/doc/man-pages/ Found a documentation bug?
> http://www.kernel.org/doc/man-pages/reporting_bugs.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
Quoting Igor Zhbanov ([email protected]):
> But ordinary users can't create devices. It seems to me that in time
> of implementation of capabilities in kernel 2.4, capabilities related
> to filesystem was added first. And mark for them contains all above in
> header file. And when CAP_MKNOD was added later, author just forget to
> update mask.
>
> If mask was designed to drop all filesystem related capabilities, then
> it must be expanded, because ordinary users cannot create devices etc.
I think you thought Bruce was saying we shouldn't change the set of
capabilities, but he was just asking exactly what changes Michael was
interested in.
Igor, thanks for finding this. I never got your original message. Do
you have a patdch to add the two capabilities? Do you think the
other two I mentioned (CAP_SYS_ADMIN and CAP_SETFCAP) need to be
added too?
I've added Andrew Morgan, LSM and SELinux mailing lists to get another
opinion about adding those two. In particular, we'd be adding them
to the fs_masks becuase CAP_SYS_ADMIN lets you change the selinux
label, and CAP_SETFCAP lets you change the file capabilities.
thanks,
-serge
On Sat, Mar 14, 2009 at 6:58 AM, J. Bruce Fields <[email protected]> wrote:
> On Fri, Mar 13, 2009 at 09:21:23AM +1300, Michael Kerrisk wrote:
>> On Fri, Mar 13, 2009 at 5:10 AM, Serge E. Hallyn <[email protected]> wrote:
>> > Quoting J. Bruce Fields ([email protected]):
>> >> On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
>> >> > Hello!
>> >> >
>> >> > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
>> >> > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
>> >> > linux-2.4.x. Both capabilities affects file system and can be
>> >> > considered file system capabilities.
>> >>
>> >> Sounds right to me--I'd expect rootsquash to guarantee that new device
>> >> nodes can't be created from the network. ?Cc'ing random people from the
>> >> git log for include/linux/capability.h in hopes they can help.
>> >
>> > Yeah it seems reasonable. ?If it is, then does that mean that we
>> > also need CAP_SYS_ADMIN (to write selinux labels) and CAP_SETFCAP
>> > (to set file capabilities) as well?
>>
>> If a change is made to CAP_FS_MASK, please do remember to CC
>> [email protected], and linux-api@.
>
> OK, that's because the exact set of capabilities that is dropped on
> setfsuid is documented in capabilities(7)? ?(Anywhere else?)
Not that I know of.
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html
Hello, everybody!
I look again at kernel sources and will tell you what I think of.
CAP_FS_MASK is used currently in two places: in setfsuid(...) system call
and as a base for CAP_NFSD_MASK which used in nfsd_setuser(...) function.
First, setfsuid(...). As I understand, this system call is a subset
of seteuid(...). And it is used when some privileged process want to do
some filesystem operations as some ordinary user could do. That privileged
process don't want to completely drop all privileges and become that ordinary
user. But it wants temporarily lower it's privileges and set fsuid, so process
don't bother itself with checking permissions on files and can rely on kernel.
Here is typical usage from linux-PAM package, file modules/pam_xauth.c:
euid = geteuid();
setfsuid(pwd->pw_uid);
fp = fopen(path, "r");
setfsuid(euid);
if (fp != NULL) {...}
So, privileged PAM authentication process sets fsuid and tries to read file
from user's home, and after attempt to open file, it sets fsuid back.
So, if user cannot read that file, PAM module cannot read it too.
And I think that CAP_MKNOD and CAP_LINUX_IMMUTABLE privileges should
be included in CAP_FS_MASK. As for CAP_SYS_ADMIN and CAP_SETFCAP,
they could be also included in CAP_FS_MASK. Perhaps with CAP_MAC_ADMIN,
so mandatory access control labels couldn't be changed too.
Although it is strange to me if someone write a code that drops filesystem
capabilities and later tries e.g. to set SElinux label. Tries just to fail? ;-)
IMHO setfsuid(...) in ordinary privileged processes should be used only
for short times just around filesystem operations, that should be checked
against another user's permission.
As for NFSD, the story is quite different. Before attempting to access
file system NFSD calls nfsd_setuser(...) function. But NFSD is unaware
of capabilities of client's process. It knows just fsuid, fsgid and additional
groups of calling process (as it is done in AUTH_UNIX authorisation method).
So decision of NFSD is simple: if uid is zero, then raises filesystem
capabilities, else drop them.
And the problem is that not all filesystem operations that client can ask NFSD
to perform are covered by CAP_NFSD_SET. So if some ordinary user (because
of broken client or by given capability) will ask NFSD to create a device,
NFSD will do it because nfsd_setuser(...) doesn't drop that capability.
As for SElinux and extended attributes, it seems that extended attributes
other that ACL are not supported by NFS by bow. So, NFSD is unaffected
with CAP_SYS_ADMIN and CAP_SETFCAP not included in CAP_NFSD_SET - you just
can't ask NFSD to set SElinux label. It's not implemented. ;-)
And my conclusion is that CAP_MKNOD, CAP_LINUX_IMMUTABLE, CAP_SYS_ADMIN,
CAP_SETFCAP and CAP_MAC_ADMIN should be included
in CAP_FS_MASK.
I'm sure about CAP_MKNOD and CAP_LINUX_IMMUTABLE, and not so sure
of CAP_SYS_ADMIN, CAP_SETFCAP and CAP_MAC_ADMIN.
(NFS doesn't support SElinux, as I know. And dropping filesystem capabilities
before manipulating SElinux labels seems to be useless. And if someone exploits
vulnerability in process with dropped filesystem capabilities, it's
easy to bring them back.)
Please tell what you think.
And there are patches:
For linux-2.6:
------------------------------------------------------------------------------
diff -purN linux-2.6.28.7/include/linux/capability.h
linux/include/linux/capability.h
--- linux-2.6.28.7/include/linux/capability.h 2009-02-21
01:41:27.000000000 +0300
+++ linux/include/linux/capability.h 2009-03-16 17:09:23.588420300 +0300
@@ -370,9 +370,14 @@ typedef struct kernel_cap_struct {
| CAP_TO_MASK(CAP_DAC_OVERRIDE) \
| CAP_TO_MASK(CAP_DAC_READ_SEARCH) \
| CAP_TO_MASK(CAP_FOWNER) \
+ | CAP_TO_MASK(CAP_MKNOD) \
+ | CAP_TO_MASK(CAP_LINUX_IMMUTABLE) \
+ | CAP_TO_MASK(CAP_SYS_ADMIN) \
+ | CAP_TO_MASK(CAP_SETFCAP) \
| CAP_TO_MASK(CAP_FSETID))
-# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE))
+# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE) \
+ | CAP_TO_MASK(CAP_MAC_ADMIN))
#if _KERNEL_CAPABILITY_U32S != 2
# error Fix up hand-coded capability macro initializers
------------------------------------------------------------------------------
And for linux-2.4:
------------------------------------------------------------------------------
diff -purN linux-2.4.37/include/linux/capability.h
linux/include/linux/capability.h
--- linux-2.4.37/include/linux/capability.h 2008-12-02 11:01:34.000000000 +0300
+++ linux/include/linux/capability.h 2009-03-16 17:14:16.308635400 +0300
@@ -101,7 +101,7 @@ typedef __u32 kernel_cap_t;
/* Used to decide between falling back on the old suser() or fsuser(). */
-#define CAP_FS_MASK 0x1f
+#define CAP_FS_MASK 0x0820021f
/* Overrides the restriction that the real or effective user ID of a
process sending a signal must match the real or effective user ID
------------------------------------------------------------------------------
Anyway, I haven't write access to git repository, so if you agree,
please commit.
P.S. CAP_SYS_ADMIN is bad - too many actions are bounded to this capability.
Perhaps it should be broken down to a set of independent capabilities.
Especially, SElinux related could be separated.
On Mon, Mar 16, 2009 at 05:16:34PM +0300, Igor Zhbanov wrote:
> I look again at kernel sources and will tell you what I think of.
>
> CAP_FS_MASK is used currently in two places: in setfsuid(...) system call
> and as a base for CAP_NFSD_MASK which used in nfsd_setuser(...) function.
>
> First, setfsuid(...). As I understand, this system call is a subset
> of seteuid(...). And it is used when some privileged process want to do
> some filesystem operations as some ordinary user could do. That privileged
> process don't want to completely drop all privileges and become that ordinary
> user. But it wants temporarily lower it's privileges and set fsuid, so process
> don't bother itself with checking permissions on files and can rely on kernel.
> Here is typical usage from linux-PAM package, file modules/pam_xauth.c:
>
> euid = geteuid();
> setfsuid(pwd->pw_uid);
> fp = fopen(path, "r");
> setfsuid(euid);
> if (fp != NULL) {...}
>
> So, privileged PAM authentication process sets fsuid and tries to read file
> from user's home, and after attempt to open file, it sets fsuid back.
> So, if user cannot read that file, PAM module cannot read it too.
>
> And I think that CAP_MKNOD and CAP_LINUX_IMMUTABLE privileges should
> be included in CAP_FS_MASK. As for CAP_SYS_ADMIN and CAP_SETFCAP,
> they could be also included in CAP_FS_MASK. Perhaps with CAP_MAC_ADMIN,
> so mandatory access control labels couldn't be changed too.
>
> Although it is strange to me if someone write a code that drops filesystem
> capabilities and later tries e.g. to set SElinux label. Tries just to fail? ;-)
> IMHO setfsuid(...) in ordinary privileged processes should be used only
> for short times just around filesystem operations, that should be checked
> against another user's permission.
>
> As for NFSD, the story is quite different. Before attempting to access
> file system NFSD calls nfsd_setuser(...) function. But NFSD is unaware
> of capabilities of client's process. It knows just fsuid, fsgid and additional
> groups of calling process (as it is done in AUTH_UNIX authorisation method).
> So decision of NFSD is simple: if uid is zero, then raises filesystem
> capabilities, else drop them.
>
> And the problem is that not all filesystem operations that client can ask NFSD
> to perform are covered by CAP_NFSD_SET. So if some ordinary user (because
> of broken client or by given capability) will ask NFSD to create a device,
> NFSD will do it because nfsd_setuser(...) doesn't drop that capability.
>
> As for SElinux and extended attributes, it seems that extended attributes
> other that ACL are not supported by NFS by bow. So, NFSD is unaffected
> with CAP_SYS_ADMIN and CAP_SETFCAP not included in CAP_NFSD_SET - you just
> can't ask NFSD to set SElinux label. It's not implemented. ;-)
That's true, though the labelled nfs people may change this some day.
> And my conclusion is that CAP_MKNOD, CAP_LINUX_IMMUTABLE, CAP_SYS_ADMIN,
> CAP_SETFCAP and CAP_MAC_ADMIN should be included
> in CAP_FS_MASK.
That may be reasonable, but I'd like to see clearer criteria for
choosing those. Some considerations:
1. As capabilities(7) says, we must "preserve the traditional
semantics for transitions between 0 and non-zero user IDs".
The setfsuid() interface predates capabilities, so the
introduction of capabilities shouldn't have changed the
behavior of a program written in ignorance of capabilities.
2. Users of the interface (like nfsd!) would be less likely to
make mistakes if we had a simpler, more conceptual
description of CAP_FS_MASK than just "the following list of
capabilities".
3. If there's a possibility new capabilities will be added again
in the future, then we should document CAP_FS_MASK in a way
that makes it clear how those new bits will be treated.
4. We must fix nfsd in any case, either by changing the nfsd
code or CAP_FS_MASK, but we should err on the side of not
changing CAP_FS_MASK, for obvious backwards-compatibility
reasons.
So ideally we'd have a clear, simple description of CAP_FS_MASK that
matches historical behavior of setfsuid(), without changing CAP_FS_MASK
if not required.
setfsuid(2) says "The system call setfsuid() sets the user ID that the
Linux kernel uses to check for all accesses to the file system." So,
"the set of capabilities that allow bypassing filesystem permission
checks" might be one candidate description of CAP_FS_MASK.
Based on that, I think I'd not include CAP_SYS_ADMIN: it covers a bunch
of operations, most of which have nothing to do with filesystems--I
think mount and umount is the only exception, and they always require
special privilege, so don't consult filesystem permissions (do I have
that right? What happened to the attempt to allow ordinary users to
mount?).
If filesystem permissions similarly never affected the ability to create
device nodes, that might also be an argument against including
CAP_MKNOD, but it would be interesting to know the pre-capabilities
behavior of a uid 0 process with fsuid non-0.
--b.
>
> I'm sure about CAP_MKNOD and CAP_LINUX_IMMUTABLE, and not so sure
> of CAP_SYS_ADMIN, CAP_SETFCAP and CAP_MAC_ADMIN.
> (NFS doesn't support SElinux, as I know. And dropping filesystem capabilities
> before manipulating SElinux labels seems to be useless. And if someone exploits
> vulnerability in process with dropped filesystem capabilities, it's
> easy to bring them back.)
>
> Please tell what you think.
>
> And there are patches:
>
> For linux-2.6:
> ------------------------------------------------------------------------------
> diff -purN linux-2.6.28.7/include/linux/capability.h
> linux/include/linux/capability.h
> --- linux-2.6.28.7/include/linux/capability.h 2009-02-21
> 01:41:27.000000000 +0300
> +++ linux/include/linux/capability.h 2009-03-16 17:09:23.588420300 +0300
> @@ -370,9 +370,14 @@ typedef struct kernel_cap_struct {
> | CAP_TO_MASK(CAP_DAC_OVERRIDE) \
> | CAP_TO_MASK(CAP_DAC_READ_SEARCH) \
> | CAP_TO_MASK(CAP_FOWNER) \
> + | CAP_TO_MASK(CAP_MKNOD) \
> + | CAP_TO_MASK(CAP_LINUX_IMMUTABLE) \
> + | CAP_TO_MASK(CAP_SYS_ADMIN) \
> + | CAP_TO_MASK(CAP_SETFCAP) \
> | CAP_TO_MASK(CAP_FSETID))
>
> -# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE))
> +# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE) \
> + | CAP_TO_MASK(CAP_MAC_ADMIN))
>
> #if _KERNEL_CAPABILITY_U32S != 2
> # error Fix up hand-coded capability macro initializers
> ------------------------------------------------------------------------------
>
> And for linux-2.4:
> ------------------------------------------------------------------------------
> diff -purN linux-2.4.37/include/linux/capability.h
> linux/include/linux/capability.h
> --- linux-2.4.37/include/linux/capability.h 2008-12-02 11:01:34.000000000 +0300
> +++ linux/include/linux/capability.h 2009-03-16 17:14:16.308635400 +0300
> @@ -101,7 +101,7 @@ typedef __u32 kernel_cap_t;
>
> /* Used to decide between falling back on the old suser() or fsuser(). */
>
> -#define CAP_FS_MASK 0x1f
> +#define CAP_FS_MASK 0x0820021f
>
> /* Overrides the restriction that the real or effective user ID of a
> process sending a signal must match the real or effective user ID
> ------------------------------------------------------------------------------
>
> Anyway, I haven't write access to git repository, so if you agree,
> please commit.
>
> P.S. CAP_SYS_ADMIN is bad - too many actions are bounded to this capability.
> Perhaps it should be broken down to a set of independent capabilities.
> Especially, SElinux related could be separated.
On Mon, Mar 16, 2009 at 12:36:11PM -0400, bfields wrote:
> That may be reasonable, but I'd like to see clearer criteria for
> choosing those. Some considerations:
>
> 1. As capabilities(7) says, we must "preserve the traditional
> semantics for transitions between 0 and non-zero user IDs".
> The setfsuid() interface predates capabilities, so the
> introduction of capabilities shouldn't have changed the
> behavior of a program written in ignorance of capabilities.
> 2. Users of the interface (like nfsd!) would be less likely to
> make mistakes if we had a simpler, more conceptual
> description of CAP_FS_MASK than just "the following list of
> capabilities".
> 3. If there's a possibility new capabilities will be added again
> in the future, then we should document CAP_FS_MASK in a way
> that makes it clear how those new bits will be treated.
> 4. We must fix nfsd in any case, either by changing the nfsd
> code or CAP_FS_MASK, but we should err on the side of not
> changing CAP_FS_MASK, for obvious backwards-compatibility
> reasons.
Also, thinking of the nfsd case: it violates the principal of least
surprise if dropping CAP_FS_MASK still leaves it possible to make a
change to the filesystem that would normally require special
privileges....
--b.
>
> So ideally we'd have a clear, simple description of CAP_FS_MASK that
> matches historical behavior of setfsuid(), without changing CAP_FS_MASK
> if not required.
>
> setfsuid(2) says "The system call setfsuid() sets the user ID that the
> Linux kernel uses to check for all accesses to the file system." So,
> "the set of capabilities that allow bypassing filesystem permission
> checks" might be one candidate description of CAP_FS_MASK.
>
> Based on that, I think I'd not include CAP_SYS_ADMIN: it covers a bunch
> of operations, most of which have nothing to do with filesystems--I
> think mount and umount is the only exception, and they always require
> special privilege, so don't consult filesystem permissions (do I have
> that right? What happened to the attempt to allow ordinary users to
> mount?).
>
> If filesystem permissions similarly never affected the ability to create
> device nodes, that might also be an argument against including
> CAP_MKNOD, but it would be interesting to know the pre-capabilities
> behavior of a uid 0 process with fsuid non-0.
>
> --b.
>
> >
> > I'm sure about CAP_MKNOD and CAP_LINUX_IMMUTABLE, and not so sure
> > of CAP_SYS_ADMIN, CAP_SETFCAP and CAP_MAC_ADMIN.
> > (NFS doesn't support SElinux, as I know. And dropping filesystem capabilities
> > before manipulating SElinux labels seems to be useless. And if someone exploits
> > vulnerability in process with dropped filesystem capabilities, it's
> > easy to bring them back.)
> >
> > Please tell what you think.
> >
> > And there are patches:
> >
> > For linux-2.6:
> > ------------------------------------------------------------------------------
> > diff -purN linux-2.6.28.7/include/linux/capability.h
> > linux/include/linux/capability.h
> > --- linux-2.6.28.7/include/linux/capability.h 2009-02-21
> > 01:41:27.000000000 +0300
> > +++ linux/include/linux/capability.h 2009-03-16 17:09:23.588420300 +0300
> > @@ -370,9 +370,14 @@ typedef struct kernel_cap_struct {
> > | CAP_TO_MASK(CAP_DAC_OVERRIDE) \
> > | CAP_TO_MASK(CAP_DAC_READ_SEARCH) \
> > | CAP_TO_MASK(CAP_FOWNER) \
> > + | CAP_TO_MASK(CAP_MKNOD) \
> > + | CAP_TO_MASK(CAP_LINUX_IMMUTABLE) \
> > + | CAP_TO_MASK(CAP_SYS_ADMIN) \
> > + | CAP_TO_MASK(CAP_SETFCAP) \
> > | CAP_TO_MASK(CAP_FSETID))
> >
> > -# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE))
> > +# define CAP_FS_MASK_B1 (CAP_TO_MASK(CAP_MAC_OVERRIDE) \
> > + | CAP_TO_MASK(CAP_MAC_ADMIN))
> >
> > #if _KERNEL_CAPABILITY_U32S != 2
> > # error Fix up hand-coded capability macro initializers
> > ------------------------------------------------------------------------------
> >
> > And for linux-2.4:
> > ------------------------------------------------------------------------------
> > diff -purN linux-2.4.37/include/linux/capability.h
> > linux/include/linux/capability.h
> > --- linux-2.4.37/include/linux/capability.h 2008-12-02 11:01:34.000000000 +0300
> > +++ linux/include/linux/capability.h 2009-03-16 17:14:16.308635400 +0300
> > @@ -101,7 +101,7 @@ typedef __u32 kernel_cap_t;
> >
> > /* Used to decide between falling back on the old suser() or fsuser(). */
> >
> > -#define CAP_FS_MASK 0x1f
> > +#define CAP_FS_MASK 0x0820021f
> >
> > /* Overrides the restriction that the real or effective user ID of a
> > process sending a signal must match the real or effective user ID
> > ------------------------------------------------------------------------------
> >
> > Anyway, I haven't write access to git repository, so if you agree,
> > please commit.
> >
> > P.S. CAP_SYS_ADMIN is bad - too many actions are bounded to this capability.
> > Perhaps it should be broken down to a set of independent capabilities.
> > Especially, SElinux related could be separated.
Quoting J. Bruce Fields ([email protected]):
> On Mon, Mar 16, 2009 at 05:16:34PM +0300, Igor Zhbanov wrote:
> > I look again at kernel sources and will tell you what I think of.
> >
> > CAP_FS_MASK is used currently in two places: in setfsuid(...) system call
> > and as a base for CAP_NFSD_MASK which used in nfsd_setuser(...) function.
> >
> > First, setfsuid(...). As I understand, this system call is a subset
> > of seteuid(...). And it is used when some privileged process want to do
> > some filesystem operations as some ordinary user could do. That privileged
> > process don't want to completely drop all privileges and become that ordinary
> > user. But it wants temporarily lower it's privileges and set fsuid, so process
> > don't bother itself with checking permissions on files and can rely on kernel.
> > Here is typical usage from linux-PAM package, file modules/pam_xauth.c:
> >
> > euid = geteuid();
> > setfsuid(pwd->pw_uid);
> > fp = fopen(path, "r");
> > setfsuid(euid);
> > if (fp != NULL) {...}
> >
> > So, privileged PAM authentication process sets fsuid and tries to read file
> > from user's home, and after attempt to open file, it sets fsuid back.
> > So, if user cannot read that file, PAM module cannot read it too.
> >
> > And I think that CAP_MKNOD and CAP_LINUX_IMMUTABLE privileges should
> > be included in CAP_FS_MASK. As for CAP_SYS_ADMIN and CAP_SETFCAP,
> > they could be also included in CAP_FS_MASK. Perhaps with CAP_MAC_ADMIN,
> > so mandatory access control labels couldn't be changed too.
> >
> > Although it is strange to me if someone write a code that drops filesystem
> > capabilities and later tries e.g. to set SElinux label. Tries just to fail? ;-)
> > IMHO setfsuid(...) in ordinary privileged processes should be used only
> > for short times just around filesystem operations, that should be checked
> > against another user's permission.
> >
> > As for NFSD, the story is quite different. Before attempting to access
> > file system NFSD calls nfsd_setuser(...) function. But NFSD is unaware
> > of capabilities of client's process. It knows just fsuid, fsgid and additional
> > groups of calling process (as it is done in AUTH_UNIX authorisation method).
> > So decision of NFSD is simple: if uid is zero, then raises filesystem
> > capabilities, else drop them.
> >
> > And the problem is that not all filesystem operations that client can ask NFSD
> > to perform are covered by CAP_NFSD_SET. So if some ordinary user (because
> > of broken client or by given capability) will ask NFSD to create a device,
> > NFSD will do it because nfsd_setuser(...) doesn't drop that capability.
> >
> > As for SElinux and extended attributes, it seems that extended attributes
> > other that ACL are not supported by NFS by bow. So, NFSD is unaffected
> > with CAP_SYS_ADMIN and CAP_SETFCAP not included in CAP_NFSD_SET - you just
> > can't ask NFSD to set SElinux label. It's not implemented. ;-)
>
> That's true, though the labelled nfs people may change this some day.
>
> > And my conclusion is that CAP_MKNOD, CAP_LINUX_IMMUTABLE, CAP_SYS_ADMIN,
> > CAP_SETFCAP and CAP_MAC_ADMIN should be included
> > in CAP_FS_MASK.
>
> That may be reasonable, but I'd like to see clearer criteria for
> choosing those. Some considerations:
>
> 1. As capabilities(7) says, we must "preserve the traditional
> semantics for transitions between 0 and non-zero user IDs".
> The setfsuid() interface predates capabilities, so the
> introduction of capabilities shouldn't have changed the
> behavior of a program written in ignorance of capabilities.
> 2. Users of the interface (like nfsd!) would be less likely to
> make mistakes if we had a simpler, more conceptual
> description of CAP_FS_MASK than just "the following list of
> capabilities".
> 3. If there's a possibility new capabilities will be added again
> in the future, then we should document CAP_FS_MASK in a way
> that makes it clear how those new bits will be treated.
> 4. We must fix nfsd in any case, either by changing the nfsd
> code or CAP_FS_MASK, but we should err on the side of not
> changing CAP_FS_MASK, for obvious backwards-compatibility
> reasons.
>
> So ideally we'd have a clear, simple description of CAP_FS_MASK that
> matches historical behavior of setfsuid(), without changing CAP_FS_MASK
> if not required.
>
> setfsuid(2) says "The system call setfsuid() sets the user ID that the
> Linux kernel uses to check for all accesses to the file system." So,
> "the set of capabilities that allow bypassing filesystem permission
> checks" might be one candidate description of CAP_FS_MASK.
>
> Based on that, I think I'd not include CAP_SYS_ADMIN: it covers a bunch
> of operations, most of which have nothing to do with filesystems--I
> think mount and umount is the only exception, and they always require
Yes, but actually the thing to keep in mind is that this is all
operating in the (supposedly temporary) whacky-land where we have
capabilities, yet we have a privileged root. Note that when
issecure(NOSETUIDFIXUP) then these things aren't done. So we are
enabling these capabilities precisely because we are emulating the
concept of root being privileged (and non-root being unprivileged).
Now with *that* in mind, I think it suddenly becomes clear that the
right thing to do is add every capability which could be related
to the uid wrt fs to the cap_fs_mask.
Note that even if euid=0 code does
setresuid(500,500,0)
...
setfsuid(0);
then only the capabilities which are both in fsmask *and* in the
process' original permitted set (before the setresuid call) will
be replaced into the new effective set. So if the task had taken
CAP_SYS_ADMIN out of its permitted set before the setresuid, then
it won't have that after the setfsuid(0).
> special privilege, so don't consult filesystem permissions (do I have
> that right? What happened to the attempt to allow ordinary users to
> mount?).
Well, they keep getting stalled because we don't have a good answer for
what to do about the fact that an unprivileged user can make trees
undeletable by pinning them with mounts. (Miklos and Eric cc'd in case
I didn't explain that well enough).
> If filesystem permissions similarly never affected the ability to create
> device nodes, that might also be an argument against including
> CAP_MKNOD, but it would be interesting to know the pre-capabilities
> behavior of a uid 0 process with fsuid non-0.
The sentiment rings true, but again since before capabilities, privilege
was fully tied to the userid, the question doesn't make sense. Either
you had uid 0 and could mknod, or you didn't and couldn't. And that is
the behavior which we unfortunately have to emulate when
!issecure(SECURE_NOROOT|SECURE_NOSUIDFIXUP).
-serge
Quoting J. Bruce Fields ([email protected]):
> On Mon, Mar 16, 2009 at 12:36:11PM -0400, bfields wrote:
> > That may be reasonable, but I'd like to see clearer criteria for
> > choosing those. Some considerations:
> >
> > 1. As capabilities(7) says, we must "preserve the traditional
> > semantics for transitions between 0 and non-zero user IDs".
> > The setfsuid() interface predates capabilities, so the
> > introduction of capabilities shouldn't have changed the
> > behavior of a program written in ignorance of capabilities.
> > 2. Users of the interface (like nfsd!) would be less likely to
> > make mistakes if we had a simpler, more conceptual
> > description of CAP_FS_MASK than just "the following list of
> > capabilities".
> > 3. If there's a possibility new capabilities will be added again
> > in the future, then we should document CAP_FS_MASK in a way
> > that makes it clear how those new bits will be treated.
> > 4. We must fix nfsd in any case, either by changing the nfsd
> > code or CAP_FS_MASK, but we should err on the side of not
> > changing CAP_FS_MASK, for obvious backwards-compatibility
> > reasons.
>
> Also, thinking of the nfsd case: it violates the principal of least
> surprise if dropping CAP_FS_MASK still leaves it possible to make a
> change to the filesystem that would normally require special
> privileges....
Agreed, and so between that and the labeled nfs work, I think we
should add all 4 capabilities to both the CAP_FS_MASK and CAP_NFSD_MASK.
-serge
On Fri, 2009-03-13 at 14:00 -0500, Serge E. Hallyn wrote:
> Quoting Igor Zhbanov ([email protected]):
> > But ordinary users can't create devices. It seems to me that in time
> > of implementation of capabilities in kernel 2.4, capabilities related
> > to filesystem was added first. And mark for them contains all above in
> > header file. And when CAP_MKNOD was added later, author just forget to
> > update mask.
> >
> > If mask was designed to drop all filesystem related capabilities, then
> > it must be expanded, because ordinary users cannot create devices etc.
>
> I think you thought Bruce was saying we shouldn't change the set of
> capabilities, but he was just asking exactly what changes Michael was
> interested in.
>
> Igor, thanks for finding this. I never got your original message. Do
> you have a patdch to add the two capabilities? Do you think the
> other two I mentioned (CAP_SYS_ADMIN and CAP_SETFCAP) need to be
> added too?
>
> I've added Andrew Morgan, LSM and SELinux mailing lists to get another
> opinion about adding those two. In particular, we'd be adding them
> to the fs_masks becuase CAP_SYS_ADMIN lets you change the selinux
> label, and CAP_SETFCAP lets you change the file capabilities.
I'd be inclined against adding CAP_SYS_ADMIN to the mask; note that it
is only checked for setting SELinux security contexts (or more broadly
any attributes in the security namespace) when SELinux is disabled. In
the SELinux-enabled case, we are checking SELinux-specific permissions
when setting the SELinux attributes, whether on the client or the
server.
--
Stephen Smalley
National Security Agency
Quoting Stephen Smalley ([email protected]):
> On Fri, 2009-03-13 at 14:00 -0500, Serge E. Hallyn wrote:
> > Quoting Igor Zhbanov ([email protected]):
> > > But ordinary users can't create devices. It seems to me that in time
> > > of implementation of capabilities in kernel 2.4, capabilities related
> > > to filesystem was added first. And mark for them contains all above in
> > > header file. And when CAP_MKNOD was added later, author just forget to
> > > update mask.
> > >
> > > If mask was designed to drop all filesystem related capabilities, then
> > > it must be expanded, because ordinary users cannot create devices etc.
> >
> > I think you thought Bruce was saying we shouldn't change the set of
> > capabilities, but he was just asking exactly what changes Michael was
> > interested in.
> >
> > Igor, thanks for finding this. I never got your original message. Do
> > you have a patdch to add the two capabilities? Do you think the
> > other two I mentioned (CAP_SYS_ADMIN and CAP_SETFCAP) need to be
> > added too?
> >
> > I've added Andrew Morgan, LSM and SELinux mailing lists to get another
> > opinion about adding those two. In particular, we'd be adding them
> > to the fs_masks becuase CAP_SYS_ADMIN lets you change the selinux
> > label, and CAP_SETFCAP lets you change the file capabilities.
>
> I'd be inclined against adding CAP_SYS_ADMIN to the mask; note that it
> is only checked for setting SELinux security contexts (or more broadly
> any attributes in the security namespace) when SELinux is disabled. In
> the SELinux-enabled case, we are checking SELinux-specific permissions
> when setting the SELinux attributes, whether on the client or the
> server.
But that's exactly why it seemed like it ought to be in there. If
SELinux is enabled, then SELinux will continue to perform it's own
checks based on security context and ignoring privileged root. But
outside of that, since we are in a root-is-privileged mode, should it
not be the case that having fsuid=0 means that you can set extended
attributes in the security namespace?
Conversely, if setting fsuid to non-zero, shouldn't all of the
privileged ways of setting file attributes be lost? Or, will we run
into a problem where software wanted to set its fsuid to non-0 but
still be able to call sethostname(2), for instance? In which case
we simply cannot put CAP_SYS_ADMIN in CAP_FS_MASK.
I guess it comes back down to whether those xattrs are considered a
security attribute or a simple file property.
-serge
On Mon, 2009-03-16 at 13:49 -0500, Serge E. Hallyn wrote:
> Quoting Stephen Smalley ([email protected]):
> > On Fri, 2009-03-13 at 14:00 -0500, Serge E. Hallyn wrote:
> > > Quoting Igor Zhbanov ([email protected]):
> > > > But ordinary users can't create devices. It seems to me that in time
> > > > of implementation of capabilities in kernel 2.4, capabilities related
> > > > to filesystem was added first. And mark for them contains all above in
> > > > header file. And when CAP_MKNOD was added later, author just forget to
> > > > update mask.
> > > >
> > > > If mask was designed to drop all filesystem related capabilities, then
> > > > it must be expanded, because ordinary users cannot create devices etc.
> > >
> > > I think you thought Bruce was saying we shouldn't change the set of
> > > capabilities, but he was just asking exactly what changes Michael was
> > > interested in.
> > >
> > > Igor, thanks for finding this. I never got your original message. Do
> > > you have a patdch to add the two capabilities? Do you think the
> > > other two I mentioned (CAP_SYS_ADMIN and CAP_SETFCAP) need to be
> > > added too?
> > >
> > > I've added Andrew Morgan, LSM and SELinux mailing lists to get another
> > > opinion about adding those two. In particular, we'd be adding them
> > > to the fs_masks becuase CAP_SYS_ADMIN lets you change the selinux
> > > label, and CAP_SETFCAP lets you change the file capabilities.
> >
> > I'd be inclined against adding CAP_SYS_ADMIN to the mask; note that it
> > is only checked for setting SELinux security contexts (or more broadly
> > any attributes in the security namespace) when SELinux is disabled. In
> > the SELinux-enabled case, we are checking SELinux-specific permissions
> > when setting the SELinux attributes, whether on the client or the
> > server.
>
> But that's exactly why it seemed like it ought to be in there. If
> SELinux is enabled, then SELinux will continue to perform it's own
> checks based on security context and ignoring privileged root. But
> outside of that, since we are in a root-is-privileged mode, should it
> not be the case that having fsuid=0 means that you can set extended
> attributes in the security namespace?
>
> Conversely, if setting fsuid to non-zero, shouldn't all of the
> privileged ways of setting file attributes be lost? Or, will we run
> into a problem where software wanted to set its fsuid to non-0 but
> still be able to call sethostname(2), for instance? In which case
> we simply cannot put CAP_SYS_ADMIN in CAP_FS_MASK.
>
> I guess it comes back down to whether those xattrs are considered a
> security attribute or a simple file property.
Well, they are a security attribute, but CAP_SYS_ADMIN was never
supposed to cover them. In fact, none of the upstream security modules
uses CAP_SYS_ADMIN to control setting of its own attributes:
- SELinux applies a DAC check and its own set of MAC file permission
checks,
- Smack applies CAP_MAC_ADMIN,
- Capabilities applies CAP_SETFCAP.
Checking CAP_SYS_ADMIN was really just a fallback to prevent unchecked
setting of attributes in the no-LSM case. It might make more sense to
return EOPNOTSUPP for any attributes unknown to the enabled security
module and require you to enable the desired module before setting the
attributes these days.
http://marc.info/?t=107428809400002&r=1&w=2
I don't think this will make any difference for labeled NFS at present,
as the current labeled NFS patches only export the MAC label attribute
if the server has the MAC model enabled. So CAP_SYS_ADMIN won't get
checked regardless.
Trusted namespace is another case where CAP_SYS_ADMIN check is applied
on file operations.
--
Stephen Smalley
National Security Agency
Seems this isn't entirely obvious in the general case. In the specific
case of nfsd, however, this is pretty obvious. So I'm inclined to
submit the following now (and leave it to be reverted by a later patch
if CAP_FS_MASK ends up including CAP_MKNOD, as seems likely).
--b.
commit 2ec8f8f0c0005ffe3cf93bbf3d9976de76cf4652
Author: J. Bruce Fields <[email protected]>
Date: Mon Mar 16 18:34:20 2009 -0400
nfsd: nfsd should drop CAP_MKNOD for non-root
Since creating a device node is normally an operation requiring special
privilege, Igor Zhbanov points out that it is surprising (to say the
least) that a client can, for example, create a device node on a
filesystem exported with root_squash.
So, make sure CAP_MKNOD is among the capabilities dropped when an nfsd
thread handles a request from a non-root user.
Reported-by: Igor Zhbanov <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
diff --git a/include/linux/capability.h b/include/linux/capability.h
index 02bdb76..7824483 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -393,8 +393,10 @@ struct cpu_vfs_cap_data {
# define CAP_FULL_SET ((kernel_cap_t){{ ~0, ~0 }})
# define CAP_INIT_EFF_SET ((kernel_cap_t){{ ~CAP_TO_MASK(CAP_SETPCAP), ~0 }})
# define CAP_FS_SET ((kernel_cap_t){{ CAP_FS_MASK_B0, CAP_FS_MASK_B1 } })
-# define CAP_NFSD_SET ((kernel_cap_t){{ CAP_FS_MASK_B0|CAP_TO_MASK(CAP_SYS_RESOURCE), \
- CAP_FS_MASK_B1 } })
+# define CAP_NFSD_SET ((kernel_cap_t){{ CAP_FS_MASK_B0 \
+ | CAP_TO_MASK(CAP_SYS_RESOURCE) \
+ | CAP_TO_MASK(CAP_MKNOD), \
+ CAP_FS_MASK_B1 } })
#endif /* _KERNEL_CAPABILITY_U32S != 2 */
On Mon, Mar 16, 2009 at 12:04:33PM -0500, Serge E. Hallyn wrote:
> Quoting J. Bruce Fields ([email protected]):
> > If filesystem permissions similarly never affected the ability to create
> > device nodes, that might also be an argument against including
> > CAP_MKNOD, but it would be interesting to know the pre-capabilities
> > behavior of a uid 0 process with fsuid non-0.
>
> The sentiment rings true, but again since before capabilities, privilege
> was fully tied to the userid, the question doesn't make sense. Either
> you had uid 0 and could mknod, or you didn't and couldn't. And that is
> the behavior which we unfortunately have to emulate when
> !issecure(SECURE_NOROOT|SECURE_NOSUIDFIXUP).
The historical behavior of setfsuid() is still interesting, though.
>From a quick glance at Debian's code for the (long-neglected) userspace
nfsd server, it looks like it depends on setfsuid() and the kernel to
enforce permissions for operations (including mknod). Might be
interesting to confirm whether it has the same problem, and if so,
whether that was a problem introduced with some capability changes or
whether it always existed.
--b.
Quoting J. Bruce Fields ([email protected]):
> On Mon, Mar 16, 2009 at 12:04:33PM -0500, Serge E. Hallyn wrote:
> > Quoting J. Bruce Fields ([email protected]):
> > > If filesystem permissions similarly never affected the ability to create
> > > device nodes, that might also be an argument against including
> > > CAP_MKNOD, but it would be interesting to know the pre-capabilities
> > > behavior of a uid 0 process with fsuid non-0.
> >
> > The sentiment rings true, but again since before capabilities, privilege
> > was fully tied to the userid, the question doesn't make sense. Either
> > you had uid 0 and could mknod, or you didn't and couldn't. And that is
> > the behavior which we unfortunately have to emulate when
> > !issecure(SECURE_NOROOT|SECURE_NOSUIDFIXUP).
>
> The historical behavior of setfsuid() is still interesting, though.
> >From a quick glance at Debian's code for the (long-neglected) userspace
> nfsd server, it looks like it depends on setfsuid() and the kernel to
> enforce permissions for operations (including mknod). Might be
Sorry, do you mean that it would expect setfsuid(0) to allow a task to
do mknod, and setfsuid(500) to disable it?
Actually I guess for mknod, that is the question we can answer with the
a 2.1.x tree: which uid did mknod check?
Ah, answer is... fsuid!
> interesting to confirm whether it has the same problem, and if so,
> whether that was a problem introduced with some capability changes or
> whether it always existed.
>
> --b.
Quoting J. Bruce Fields ([email protected]):
> Seems this isn't entirely obvious in the general case. In the specific
I think the MKNOD part is (based on historical behavior as you
suggested) obvious: both masks should include CAP_MKNOD and
CAP_LINUX_IMMUTABLE.
(references: http://lxr.linux.no/linux-old+v2.0.21/fs/ext2/ioctl.c#L60
for immutable and http://lxr.linux.no/linux-old+v2.0.21/fs/namei.c#L503
for mknod)
-serge
> case of nfsd, however, this is pretty obvious. So I'm inclined to
> submit the following now (and leave it to be reverted by a later patch
> if CAP_FS_MASK ends up including CAP_MKNOD, as seems likely).
>
> --b.
>
> commit 2ec8f8f0c0005ffe3cf93bbf3d9976de76cf4652
> Author: J. Bruce Fields <[email protected]>
> Date: Mon Mar 16 18:34:20 2009 -0400
>
> nfsd: nfsd should drop CAP_MKNOD for non-root
>
> Since creating a device node is normally an operation requiring special
> privilege, Igor Zhbanov points out that it is surprising (to say the
> least) that a client can, for example, create a device node on a
> filesystem exported with root_squash.
>
> So, make sure CAP_MKNOD is among the capabilities dropped when an nfsd
> thread handles a request from a non-root user.
>
> Reported-by: Igor Zhbanov <[email protected]>
> Signed-off-by: J. Bruce Fields <[email protected]>
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 02bdb76..7824483 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -393,8 +393,10 @@ struct cpu_vfs_cap_data {
> # define CAP_FULL_SET ((kernel_cap_t){{ ~0, ~0 }})
> # define CAP_INIT_EFF_SET ((kernel_cap_t){{ ~CAP_TO_MASK(CAP_SETPCAP), ~0 }})
> # define CAP_FS_SET ((kernel_cap_t){{ CAP_FS_MASK_B0, CAP_FS_MASK_B1 } })
> -# define CAP_NFSD_SET ((kernel_cap_t){{ CAP_FS_MASK_B0|CAP_TO_MASK(CAP_SYS_RESOURCE), \
> - CAP_FS_MASK_B1 } })
> +# define CAP_NFSD_SET ((kernel_cap_t){{ CAP_FS_MASK_B0 \
> + | CAP_TO_MASK(CAP_SYS_RESOURCE) \
> + | CAP_TO_MASK(CAP_MKNOD), \
> + CAP_FS_MASK_B1 } })
>
> #endif /* _KERNEL_CAPABILITY_U32S != 2 */
Quoting Stephen Smalley ([email protected]):
> On Mon, 2009-03-16 at 13:49 -0500, Serge E. Hallyn wrote:
> > Quoting Stephen Smalley ([email protected]):
> > > On Fri, 2009-03-13 at 14:00 -0500, Serge E. Hallyn wrote:
> > > > Quoting Igor Zhbanov ([email protected]):
> > > > > But ordinary users can't create devices. It seems to me that in time
> > > > > of implementation of capabilities in kernel 2.4, capabilities related
> > > > > to filesystem was added first. And mark for them contains all above in
> > > > > header file. And when CAP_MKNOD was added later, author just forget to
> > > > > update mask.
> > > > >
> > > > > If mask was designed to drop all filesystem related capabilities, then
> > > > > it must be expanded, because ordinary users cannot create devices etc.
> > > >
> > > > I think you thought Bruce was saying we shouldn't change the set of
> > > > capabilities, but he was just asking exactly what changes Michael was
> > > > interested in.
> > > >
> > > > Igor, thanks for finding this. I never got your original message. Do
> > > > you have a patdch to add the two capabilities? Do you think the
> > > > other two I mentioned (CAP_SYS_ADMIN and CAP_SETFCAP) need to be
> > > > added too?
> > > >
> > > > I've added Andrew Morgan, LSM and SELinux mailing lists to get another
> > > > opinion about adding those two. In particular, we'd be adding them
> > > > to the fs_masks becuase CAP_SYS_ADMIN lets you change the selinux
> > > > label, and CAP_SETFCAP lets you change the file capabilities.
> > >
> > > I'd be inclined against adding CAP_SYS_ADMIN to the mask; note that it
> > > is only checked for setting SELinux security contexts (or more broadly
> > > any attributes in the security namespace) when SELinux is disabled. In
> > > the SELinux-enabled case, we are checking SELinux-specific permissions
> > > when setting the SELinux attributes, whether on the client or the
> > > server.
> >
> > But that's exactly why it seemed like it ought to be in there. If
> > SELinux is enabled, then SELinux will continue to perform it's own
> > checks based on security context and ignoring privileged root. But
> > outside of that, since we are in a root-is-privileged mode, should it
> > not be the case that having fsuid=0 means that you can set extended
> > attributes in the security namespace?
> >
> > Conversely, if setting fsuid to non-zero, shouldn't all of the
> > privileged ways of setting file attributes be lost? Or, will we run
> > into a problem where software wanted to set its fsuid to non-0 but
> > still be able to call sethostname(2), for instance? In which case
> > we simply cannot put CAP_SYS_ADMIN in CAP_FS_MASK.
> >
> > I guess it comes back down to whether those xattrs are considered a
> > security attribute or a simple file property.
>
> Well, they are a security attribute, but CAP_SYS_ADMIN was never
> supposed to cover them. In fact, none of the upstream security modules
So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
because of all of its other implications, or because you disagree
that labels for security modules should be treated as mere fs data
here?
> uses CAP_SYS_ADMIN to control setting of its own attributes:
> - SELinux applies a DAC check and its own set of MAC file permission
> checks,
> - Smack applies CAP_MAC_ADMIN,
> - Capabilities applies CAP_SETFCAP.
>
> Checking CAP_SYS_ADMIN was really just a fallback to prevent unchecked
> setting of attributes in the no-LSM case. It might make more sense to
> return EOPNOTSUPP for any attributes unknown to the enabled security
I suspect that would create a LOT of bug reports. Would requiring
CAP_MAC_ADMIN seem reasonable?
> module and require you to enable the desired module before setting the
> attributes these days.
> http://marc.info/?t=107428809400002&r=1&w=2
>
> I don't think this will make any difference for labeled NFS at present,
> as the current labeled NFS patches only export the MAC label attribute
> if the server has the MAC model enabled. So CAP_SYS_ADMIN won't get
> checked regardless.
>
> Trusted namespace is another case where CAP_SYS_ADMIN check is applied
> on file operations.
Which seems like all the more reason why CAP_SYS_ADMIN would need to
be added to the CAP_FS_MASK. Or do you mean that check should also be
changed for something else? (CAP_MAC_ADMIN, or some new CAP_FS_XATTR?)
-serge
On Mon, 2009-03-16 at 18:13 -0500, Serge E. Hallyn wrote:
> Quoting Stephen Smalley ([email protected]):
> > On Mon, 2009-03-16 at 13:49 -0500, Serge E. Hallyn wrote:
> > > Quoting Stephen Smalley ([email protected]):
> > > > On Fri, 2009-03-13 at 14:00 -0500, Serge E. Hallyn wrote:
> > > > > Quoting Igor Zhbanov ([email protected]):
> > > > > > But ordinary users can't create devices. It seems to me that in time
> > > > > > of implementation of capabilities in kernel 2.4, capabilities related
> > > > > > to filesystem was added first. And mark for them contains all above in
> > > > > > header file. And when CAP_MKNOD was added later, author just forget to
> > > > > > update mask.
> > > > > >
> > > > > > If mask was designed to drop all filesystem related capabilities, then
> > > > > > it must be expanded, because ordinary users cannot create devices etc.
> > > > >
> > > > > I think you thought Bruce was saying we shouldn't change the set of
> > > > > capabilities, but he was just asking exactly what changes Michael was
> > > > > interested in.
> > > > >
> > > > > Igor, thanks for finding this. I never got your original message. Do
> > > > > you have a patdch to add the two capabilities? Do you think the
> > > > > other two I mentioned (CAP_SYS_ADMIN and CAP_SETFCAP) need to be
> > > > > added too?
> > > > >
> > > > > I've added Andrew Morgan, LSM and SELinux mailing lists to get another
> > > > > opinion about adding those two. In particular, we'd be adding them
> > > > > to the fs_masks becuase CAP_SYS_ADMIN lets you change the selinux
> > > > > label, and CAP_SETFCAP lets you change the file capabilities.
> > > >
> > > > I'd be inclined against adding CAP_SYS_ADMIN to the mask; note that it
> > > > is only checked for setting SELinux security contexts (or more broadly
> > > > any attributes in the security namespace) when SELinux is disabled. In
> > > > the SELinux-enabled case, we are checking SELinux-specific permissions
> > > > when setting the SELinux attributes, whether on the client or the
> > > > server.
> > >
> > > But that's exactly why it seemed like it ought to be in there. If
> > > SELinux is enabled, then SELinux will continue to perform it's own
> > > checks based on security context and ignoring privileged root. But
> > > outside of that, since we are in a root-is-privileged mode, should it
> > > not be the case that having fsuid=0 means that you can set extended
> > > attributes in the security namespace?
> > >
> > > Conversely, if setting fsuid to non-zero, shouldn't all of the
> > > privileged ways of setting file attributes be lost? Or, will we run
> > > into a problem where software wanted to set its fsuid to non-0 but
> > > still be able to call sethostname(2), for instance? In which case
> > > we simply cannot put CAP_SYS_ADMIN in CAP_FS_MASK.
> > >
> > > I guess it comes back down to whether those xattrs are considered a
> > > security attribute or a simple file property.
> >
> > Well, they are a security attribute, but CAP_SYS_ADMIN was never
> > supposed to cover them. In fact, none of the upstream security modules
>
> So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
> in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
> because of all of its other implications, or because you disagree
> that labels for security modules should be treated as mere fs data
> here?
For CAP_FOWNER, yes (and it is already there). CAP_MAC_ADMIN is less
ideal as it isn't clearly tied to filesystem accesses, and likewise for
CAP_MAC_OVERRIDE (but that one is included in CAP_FS_MASK already).
Ideally the capability space would be partitioned into capabilities that
affect filesystem accesses and the rest so that setfsuid() would yield
the expected behavior of only affecting filesystem access.
CAP_SYS_ADMIN is even less suitable due to its pervasive use outside of
the filesystem. So that's the first concern.
The second one is that we don't want CAP_SYS_ADMIN (or CAP_MAC_ADMIN) to
be required when setting SELinux labels. Only the SELinux permission
checks should govern setting those labels (aside from the usual DAC
ownership || CAP_FOWNER check).
> > uses CAP_SYS_ADMIN to control setting of its own attributes:
> > - SELinux applies a DAC check and its own set of MAC file permission
> > checks,
> > - Smack applies CAP_MAC_ADMIN,
> > - Capabilities applies CAP_SETFCAP.
> >
> > Checking CAP_SYS_ADMIN was really just a fallback to prevent unchecked
> > setting of attributes in the no-LSM case. It might make more sense to
> > return EOPNOTSUPP for any attributes unknown to the enabled security
>
> I suspect that would create a LOT of bug reports. Would requiring
> CAP_MAC_ADMIN seem reasonable?
It would narrow the scope a bit more, but it still isn't ideal.
> > module and require you to enable the desired module before setting the
> > attributes these days.
> > http://marc.info/?t=107428809400002&r=1&w=2
> >
> > I don't think this will make any difference for labeled NFS at present,
> > as the current labeled NFS patches only export the MAC label attribute
> > if the server has the MAC model enabled. So CAP_SYS_ADMIN won't get
> > checked regardless.
> >
> > Trusted namespace is another case where CAP_SYS_ADMIN check is applied
> > on file operations.
>
> Which seems like all the more reason why CAP_SYS_ADMIN would need to
> be added to the CAP_FS_MASK. Or do you mean that check should also be
> changed for something else? (CAP_MAC_ADMIN, or some new CAP_FS_XATTR?)
I'd favor changing it to a new capability. We have CAP_SETFCAP for
setting file capabilities; why not have CAP_SETTRUSTED for setting
attributes in the trusted namespace? Then adding it to CAP_FS_MASK has
no further side effects beyond controlling filesystem accesses.
--
Stephen Smalley
National Security Agency
Quoting Stephen Smalley ([email protected]):
> > So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
> > in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
> > because of all of its other implications, or because you disagree
> > that labels for security modules should be treated as mere fs data
> > here?
>
> For CAP_FOWNER, yes (and it is already there). CAP_MAC_ADMIN is less
Sorry, I meant CAP_SETFCAP. Should it be added?
> ideal as it isn't clearly tied to filesystem accesses, and likewise for
> CAP_MAC_OVERRIDE (but that one is included in CAP_FS_MASK already).
So it is. I didn't realize that.
> Ideally the capability space would be partitioned into capabilities that
> affect filesystem accesses and the rest so that setfsuid() would yield
> the expected behavior of only affecting filesystem access.
> CAP_SYS_ADMIN is even less suitable due to its pervasive use outside of
> the filesystem. So that's the first concern.
>
> The second one is that we don't want CAP_SYS_ADMIN (or CAP_MAC_ADMIN) to
> be required when setting SELinux labels. Only the SELinux permission
> checks should govern setting those labels (aside from the usual DAC
> ownership || CAP_FOWNER check).
So if a non-selinux kernel is booted, then you think only the usual
DAC checks should be required to set selinux labels?
Or is this assuming EOPNOTSUPP were returned for setting
security.selinux xattrs in such a kernel?
> > > uses CAP_SYS_ADMIN to control setting of its own attributes:
> > > - SELinux applies a DAC check and its own set of MAC file permission
> > > checks,
> > > - Smack applies CAP_MAC_ADMIN,
> > > - Capabilities applies CAP_SETFCAP.
> > >
> > > Checking CAP_SYS_ADMIN was really just a fallback to prevent unchecked
> > > setting of attributes in the no-LSM case. It might make more sense to
> > > return EOPNOTSUPP for any attributes unknown to the enabled security
> >
> > I suspect that would create a LOT of bug reports. Would requiring
> > CAP_MAC_ADMIN seem reasonable?
>
> It would narrow the scope a bit more, but it still isn't ideal.
>
> > > module and require you to enable the desired module before setting the
> > > attributes these days.
> > > http://marc.info/?t=107428809400002&r=1&w=2
> > >
> > > I don't think this will make any difference for labeled NFS at present,
> > > as the current labeled NFS patches only export the MAC label attribute
> > > if the server has the MAC model enabled. So CAP_SYS_ADMIN won't get
> > > checked regardless.
> > >
> > > Trusted namespace is another case where CAP_SYS_ADMIN check is applied
> > > on file operations.
> >
> > Which seems like all the more reason why CAP_SYS_ADMIN would need to
> > be added to the CAP_FS_MASK. Or do you mean that check should also be
> > changed for something else? (CAP_MAC_ADMIN, or some new CAP_FS_XATTR?)
>
> I'd favor changing it to a new capability. We have CAP_SETFCAP for
> setting file capabilities; why not have CAP_SETTRUSTED for setting
> attributes in the trusted namespace? Then adding it to CAP_FS_MASK has
> no further side effects beyond controlling filesystem accesses.
Does anyone know what the trusted xattrs are used for?
-serge
On Tue, 2009-03-17 at 12:39 -0500, Serge E. Hallyn wrote:
> Quoting Stephen Smalley ([email protected]):
> > > So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
> > > in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
> > > because of all of its other implications, or because you disagree
> > > that labels for security modules should be treated as mere fs data
> > > here?
> >
> > For CAP_FOWNER, yes (and it is already there). CAP_MAC_ADMIN is less
>
> Sorry, I meant CAP_SETFCAP. Should it be added?
Sure - it is only used for filesystem operations.
> > ideal as it isn't clearly tied to filesystem accesses, and likewise for
> > CAP_MAC_OVERRIDE (but that one is included in CAP_FS_MASK already).
>
> So it is. I didn't realize that.
>
> > Ideally the capability space would be partitioned into capabilities that
> > affect filesystem accesses and the rest so that setfsuid() would yield
> > the expected behavior of only affecting filesystem access.
> > CAP_SYS_ADMIN is even less suitable due to its pervasive use outside of
> > the filesystem. So that's the first concern.
> >
> > The second one is that we don't want CAP_SYS_ADMIN (or CAP_MAC_ADMIN) to
> > be required when setting SELinux labels. Only the SELinux permission
> > checks should govern setting those labels (aside from the usual DAC
> > ownership || CAP_FOWNER check).
>
> So if a non-selinux kernel is booted, then you think only the usual
> DAC checks should be required to set selinux labels?
I'm talking about the dumb NFS server case (non-SELinux NFS server
providing label and data storage to SELinux clients, MAC enforcement
handled client-side). But we aren't there yet, so I don't think we have
to worry about it right now.
> Or is this assuming EOPNOTSUPP were returned for setting
> security.selinux xattrs in such a kernel?
>
> > > > uses CAP_SYS_ADMIN to control setting of its own attributes:
> > > > - SELinux applies a DAC check and its own set of MAC file permission
> > > > checks,
> > > > - Smack applies CAP_MAC_ADMIN,
> > > > - Capabilities applies CAP_SETFCAP.
> > > >
> > > > Checking CAP_SYS_ADMIN was really just a fallback to prevent unchecked
> > > > setting of attributes in the no-LSM case. It might make more sense to
> > > > return EOPNOTSUPP for any attributes unknown to the enabled security
> > >
> > > I suspect that would create a LOT of bug reports. Would requiring
> > > CAP_MAC_ADMIN seem reasonable?
> >
> > It would narrow the scope a bit more, but it still isn't ideal.
> >
> > > > module and require you to enable the desired module before setting the
> > > > attributes these days.
> > > > http://marc.info/?t=107428809400002&r=1&w=2
> > > >
> > > > I don't think this will make any difference for labeled NFS at present,
> > > > as the current labeled NFS patches only export the MAC label attribute
> > > > if the server has the MAC model enabled. So CAP_SYS_ADMIN won't get
> > > > checked regardless.
> > > >
> > > > Trusted namespace is another case where CAP_SYS_ADMIN check is applied
> > > > on file operations.
> > >
> > > Which seems like all the more reason why CAP_SYS_ADMIN would need to
> > > be added to the CAP_FS_MASK. Or do you mean that check should also be
> > > changed for something else? (CAP_MAC_ADMIN, or some new CAP_FS_XATTR?)
> >
> > I'd favor changing it to a new capability. We have CAP_SETFCAP for
> > setting file capabilities; why not have CAP_SETTRUSTED for setting
> > attributes in the trusted namespace? Then adding it to CAP_FS_MASK has
> > no further side effects beyond controlling filesystem accesses.
>
> Does anyone know what the trusted xattrs are used for?
Not beyond what attr(5) says about them.
--
Stephen Smalley
National Security Agency
Quoting Stephen Smalley ([email protected]):
> On Tue, 2009-03-17 at 12:39 -0500, Serge E. Hallyn wrote:
> > Quoting Stephen Smalley ([email protected]):
> > > > So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
> > > > in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
> > > > because of all of its other implications, or because you disagree
> > > > that labels for security modules should be treated as mere fs data
> > > > here?
> > >
> > > For CAP_FOWNER, yes (and it is already there). CAP_MAC_ADMIN is less
> >
> > Sorry, I meant CAP_SETFCAP. Should it be added?
>
> Sure - it is only used for filesystem operations.
Ok, so then:
>
> > > ideal as it isn't clearly tied to filesystem accesses, and likewise for
> > > CAP_MAC_OVERRIDE (but that one is included in CAP_FS_MASK already).
> >
> > So it is. I didn't realize that.
> >
> > > Ideally the capability space would be partitioned into capabilities that
> > > affect filesystem accesses and the rest so that setfsuid() would yield
> > > the expected behavior of only affecting filesystem access.
> > > CAP_SYS_ADMIN is even less suitable due to its pervasive use outside of
> > > the filesystem. So that's the first concern.
> > >
> > > The second one is that we don't want CAP_SYS_ADMIN (or CAP_MAC_ADMIN) to
> > > be required when setting SELinux labels. Only the SELinux permission
> > > checks should govern setting those labels (aside from the usual DAC
> > > ownership || CAP_FOWNER check).
> >
> > So if a non-selinux kernel is booted, then you think only the usual
> > DAC checks should be required to set selinux labels?
>
> I'm talking about the dumb NFS server case (non-SELinux NFS server
> providing label and data storage to SELinux clients, MAC enforcement
> handled client-side). But we aren't there yet, so I don't think we have
> to worry about it right now.
But in cap_inode_setxattr, any security.* xattrs are controlled by
CAP_SYS_ADMIN. So do you think that this should be changed to a
CAP_XATTR_SECURITY capability which can be added to CAP_FS_MASK?
Or do you think it's ok that fsuid=0 does not allow you to set
security.selinux (or security.SMACK64, etc) xattrs when selinux is
not compiled in?
(You may have already answered this with your EOPNOTSUPP comment, but
I want to make sure I understand right)
> > Does anyone know what the trusted xattrs are used for?
>
> Not beyond what attr(5) says about them.
Well, if attr(5) says CAP_SYS_ADMIN being needed is the very
thing that defines these xattrs, then changing that seems a
bigger deal. That really does seem akin to changing kernel-user
API.
thanks,
-serge
Serge E. Hallyn wrote:
> Quoting Stephen Smalley ([email protected]):
>
>> On Tue, 2009-03-17 at 12:39 -0500, Serge E. Hallyn wrote:
>>
>>> Quoting Stephen Smalley ([email protected]):
>>>
>>>>> So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
>>>>> in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
>>>>> because of all of its other implications, or because you disagree
>>>>> that labels for security modules should be treated as mere fs data
>>>>> here?
>>>>>
>>>> For CAP_FOWNER, yes (and it is already there). CAP_MAC_ADMIN is less
>>>>
>>> Sorry, I meant CAP_SETFCAP. Should it be added?
>>>
>> Sure - it is only used for filesystem operations.
>>
>
> Ok, so then:
>
>
>>>> ideal as it isn't clearly tied to filesystem accesses, and likewise for
>>>> CAP_MAC_OVERRIDE (but that one is included in CAP_FS_MASK already).
>>>>
>>> So it is. I didn't realize that.
>>>
>>>
>>>> Ideally the capability space would be partitioned into capabilities that
>>>> affect filesystem accesses and the rest so that setfsuid() would yield
>>>> the expected behavior of only affecting filesystem access.
>>>> CAP_SYS_ADMIN is even less suitable due to its pervasive use outside of
>>>> the filesystem. So that's the first concern.
>>>>
>>>> The second one is that we don't want CAP_SYS_ADMIN (or CAP_MAC_ADMIN) to
>>>> be required when setting SELinux labels. Only the SELinux permission
>>>> checks should govern setting those labels (aside from the usual DAC
>>>> ownership || CAP_FOWNER check).
>>>>
>>> So if a non-selinux kernel is booted, then you think only the usual
>>> DAC checks should be required to set selinux labels?
>>>
>> I'm talking about the dumb NFS server case (non-SELinux NFS server
>> providing label and data storage to SELinux clients, MAC enforcement
>> handled client-side). But we aren't there yet, so I don't think we have
>> to worry about it right now.
>>
>
> But in cap_inode_setxattr, any security.* xattrs are controlled by
> CAP_SYS_ADMIN. So do you think that this should be changed to a
> CAP_XATTR_SECURITY capability which can be added to CAP_FS_MASK?
>
Hum. The intention of CAP_MAC_ADMIN was that it control the explicit
setting of the access control attributes used by the Smack LSM. I
personally prefer a single capability for the action over multiple
capabilities based on the objects involved. If you introduce
CAP_XATTR_SECURITY I would think that CAP_PROC_XATTR,
CAP_SVIPC_XATTR, CAP_NETWORK_XATTR, ... would follow in short order
and I hate the idea of having hundreds of capabilities. If you
must decouple the capability from MAC, how about a new name?
> Or do you think it's ok that fsuid=0 does not allow you to set
> security.selinux (or security.SMACK64, etc) xattrs when selinux is
> not compiled in?
>
> (You may have already answered this with your EOPNOTSUPP comment, but
> I want to make sure I understand right)
>
>
>>> Does anyone know what the trusted xattrs are used for?
>>>
>> Not beyond what attr(5) says about them.
>>
>
> Well, if attr(5) says CAP_SYS_ADMIN being needed is the very
> thing that defines these xattrs, then changing that seems a
> bigger deal. That really does seem akin to changing kernel-user
> API.
>
> thanks,
> -serge
> --
> To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
On Tue, 2009-03-17 at 13:23 -0500, Serge E. Hallyn wrote:
> But in cap_inode_setxattr, any security.* xattrs are controlled by
> CAP_SYS_ADMIN. So do you think that this should be changed to a
> CAP_XATTR_SECURITY capability which can be added to CAP_FS_MASK?
I think that would be preferable to CAP_SYS_ADMIN, yes.
> Or do you think it's ok that fsuid=0 does not allow you to set
> security.selinux (or security.SMACK64, etc) xattrs when selinux is
> not compiled in?
Just to be clear, at present fsuid is irrelevant to setting the
security.* xattrs since it doesn't affect the CAP_SYS_ADMIN capability
at all, so it all depends on the initial capability state prior to using
setfsuid(), typically the full capability set.
> (You may have already answered this with your EOPNOTSUPP comment, but
> I want to make sure I understand right)
>
> > > Does anyone know what the trusted xattrs are used for?
> >
> > Not beyond what attr(5) says about them.
>
> Well, if attr(5) says CAP_SYS_ADMIN being needed is the very
> thing that defines these xattrs, then changing that seems a
> bigger deal. That really does seem akin to changing kernel-user
> API.
Perhaps, although it isn't clear that this API is in use by anyone or in
use in a way that would actually distinguish based on individual
capability.
But if you were to add CAP_SYS_ADMIN to CAP_FS_MASK in order to ensure
that setfsuid() does in fact affect all filesystem accesses, how much
meaningful difference remains between fsuid==0 and euid==0? It
obviously takes you far afield of only affecting filesystem accesses.
--
Stephen Smalley
National Security Agency
Quoting Casey Schaufler ([email protected]):
> Serge E. Hallyn wrote:
> > Quoting Stephen Smalley ([email protected]):
> >
> >> On Tue, 2009-03-17 at 12:39 -0500, Serge E. Hallyn wrote:
> >>
> >>> Quoting Stephen Smalley ([email protected]):
> >>>
> >>>>> So do you think it makes sense to have CAP_MAC_ADMIN and CAP_FOWNER
> >>>>> in CAP_FS_MASK? In other words are you objecting to CAP_SYS_ADMIN
> >>>>> because of all of its other implications, or because you disagree
> >>>>> that labels for security modules should be treated as mere fs data
> >>>>> here?
> >>>>>
> >>>> For CAP_FOWNER, yes (and it is already there). CAP_MAC_ADMIN is less
> >>>>
> >>> Sorry, I meant CAP_SETFCAP. Should it be added?
> >>>
> >> Sure - it is only used for filesystem operations.
> >>
> >
> > Ok, so then:
> >
> >
> >>>> ideal as it isn't clearly tied to filesystem accesses, and likewise for
> >>>> CAP_MAC_OVERRIDE (but that one is included in CAP_FS_MASK already).
> >>>>
> >>> So it is. I didn't realize that.
> >>>
> >>>
> >>>> Ideally the capability space would be partitioned into capabilities that
> >>>> affect filesystem accesses and the rest so that setfsuid() would yield
> >>>> the expected behavior of only affecting filesystem access.
> >>>> CAP_SYS_ADMIN is even less suitable due to its pervasive use outside of
> >>>> the filesystem. So that's the first concern.
> >>>>
> >>>> The second one is that we don't want CAP_SYS_ADMIN (or CAP_MAC_ADMIN) to
> >>>> be required when setting SELinux labels. Only the SELinux permission
> >>>> checks should govern setting those labels (aside from the usual DAC
> >>>> ownership || CAP_FOWNER check).
> >>>>
> >>> So if a non-selinux kernel is booted, then you think only the usual
> >>> DAC checks should be required to set selinux labels?
> >>>
> >> I'm talking about the dumb NFS server case (non-SELinux NFS server
> >> providing label and data storage to SELinux clients, MAC enforcement
> >> handled client-side). But we aren't there yet, so I don't think we have
> >> to worry about it right now.
> >>
> >
> > But in cap_inode_setxattr, any security.* xattrs are controlled by
> > CAP_SYS_ADMIN. So do you think that this should be changed to a
> > CAP_XATTR_SECURITY capability which can be added to CAP_FS_MASK?
> >
>
> Hum. The intention of CAP_MAC_ADMIN was that it control the explicit
> setting of the access control attributes used by the Smack LSM. I
> personally prefer a single capability for the action over multiple
> capabilities based on the objects involved. If you introduce
> CAP_XATTR_SECURITY I would think that CAP_PROC_XATTR,
> CAP_SVIPC_XATTR, CAP_NETWORK_XATTR, ... would follow in short order
> and I hate the idea of having hundreds of capabilities. If you
> must decouple the capability from MAC, how about a new name?
Oh I didn't say that we must, I'm just trying to figure out what we want
to do in the case that a security.foo xattr is being set, and the foo
LSM is not compiled in.
What is being done right now is that CAP_SYS_ADMIN is required to do
the setting, and so doing
setresuid(500,500,0);
setfsuid(0);
setxattr(somefilename, "security.SMACK64", LABEL, sizeof(LABEL), 0);
will fail the setxattr.
-serge
Quoting Stephen Smalley ([email protected]):
> On Tue, 2009-03-17 at 13:23 -0500, Serge E. Hallyn wrote:
> > But in cap_inode_setxattr, any security.* xattrs are controlled by
> > CAP_SYS_ADMIN. So do you think that this should be changed to a
> > CAP_XATTR_SECURITY capability which can be added to CAP_FS_MASK?
>
> I think that would be preferable to CAP_SYS_ADMIN, yes.
>
> > Or do you think it's ok that fsuid=0 does not allow you to set
> > security.selinux (or security.SMACK64, etc) xattrs when selinux is
> > not compiled in?
>
> Just to be clear, at present fsuid is irrelevant to setting the
> security.* xattrs since it doesn't affect the CAP_SYS_ADMIN capability
> at all, so it all depends on the initial capability state prior to using
> setfsuid(), typically the full capability set.
Right.
> > (You may have already answered this with your EOPNOTSUPP comment, but
> > I want to make sure I understand right)
> >
> > > > Does anyone know what the trusted xattrs are used for?
> > >
> > > Not beyond what attr(5) says about them.
> >
> > Well, if attr(5) says CAP_SYS_ADMIN being needed is the very
> > thing that defines these xattrs, then changing that seems a
> > bigger deal. That really does seem akin to changing kernel-user
> > API.
>
> Perhaps, although it isn't clear that this API is in use by anyone or in
> use in a way that would actually distinguish based on individual
> capability.
>
> But if you were to add CAP_SYS_ADMIN to CAP_FS_MASK in order to ensure
> that setfsuid() does in fact affect all filesystem accesses, how much
> meaningful difference remains between fsuid==0 and euid==0? It
> obviously takes you far afield of only affecting filesystem accesses.
Ok, thanks for time. It's all pretty clear to me now:
CAP_MKNOD and CAP_LINUX_IMMUTABLE need to be added to the CAP_FS_MASK
because, in 2.0 timeframe, fsuid==0 gave you those privileges.
xattrs didn't exist back then, so the setting of security.* and
trusted.* xattrs doesn't need to be enabled by fsuid==0. So really
CAP_SETFCAP also doesn't need to be added to CAP_FS_MASK either.
I'll send out a patch later today for 2.6, unless Igor wants to
do it (since he's the one who found this originally).
thanks,
-serge
On Wed, Mar 18, 2009 at 11:47:12AM -0500, Serge E. Hallyn wrote:
> Ok, thanks for time. It's all pretty clear to me now:
>
> CAP_MKNOD and CAP_LINUX_IMMUTABLE need to be added to the CAP_FS_MASK
> because, in 2.0 timeframe, fsuid==0 gave you those privileges.
>
> xattrs didn't exist back then, so the setting of security.* and
> trusted.* xattrs doesn't need to be enabled by fsuid==0. So really
> CAP_SETFCAP also doesn't need to be added to CAP_FS_MASK either.
Are we left with any simple one-sentence description of what CAP_FS_MASK
means? (Other than just a particular list of bits?) I'm just wondering
how additional bits will get decided in the future.
> I'll send out a patch later today for 2.6, unless Igor wants to
> do it (since he's the one who found this originally).
OK, apologies if I jumped the gun on the nfsd-specific patch--I lost
track of this discussion, thought it might take longer, and wanted to
get the one patch into 2.6.30. Feel free to revert that, or ignore it
and leave it to me to revert it afterwards, as convenient....
--b.
Quoting J. Bruce Fields ([email protected]):
> On Wed, Mar 18, 2009 at 11:47:12AM -0500, Serge E. Hallyn wrote:
> > Ok, thanks for time. It's all pretty clear to me now:
> >
> > CAP_MKNOD and CAP_LINUX_IMMUTABLE need to be added to the CAP_FS_MASK
> > because, in 2.0 timeframe, fsuid==0 gave you those privileges.
> >
> > xattrs didn't exist back then, so the setting of security.* and
> > trusted.* xattrs doesn't need to be enabled by fsuid==0. So really
> > CAP_SETFCAP also doesn't need to be added to CAP_FS_MASK either.
>
> Are we left with any simple one-sentence description of what CAP_FS_MASK
> means? (Other than just a particular list of bits?) I'm just wondering
> how additional bits will get decided in the future.
It means all the privileges that fsuid==0 historically meant.
At one time in the past, that meant CAP_MKNOD and CAP_LINUX_IMMUTABLE.
It has never meant setting security.* and trusted.* xattrs.
We could also define it as follows:
CAP_FS_MASK is the privilege to bypass all fs-related DAC permissions
security.* and trusted.* xattrs are MAC fs permissions
or something like that.
I guess one or both of those should go as a comment into capability.h
> > I'll send out a patch later today for 2.6, unless Igor wants to
> > do it (since he's the one who found this originally).
>
> OK, apologies if I jumped the gun on the nfsd-specific patch--I lost
Nonsense, I appreciate it... And there's certainly still a chance that
there will be objections to my interpretation, whereas the NFSD bit
seems straightened out.
> track of this discussion, thought it might take longer, and wanted to
> get the one patch into 2.6.30. Feel free to revert that, or ignore it
> and leave it to me to revert it afterwards, as convenient....
I'm not sure what the best path forward is, so I'll go ahead and
incorporate your patch into mine for now.
thanks,
-serge
[CCs trimmed]
On Mon, 16 Mar 2009, Serge E. Hallyn wrote:
> Quoting J. Bruce Fields ([email protected]):
> > special privilege, so don't consult filesystem permissions (do I have
> > that right? What happened to the attempt to allow ordinary users to
> > mount?).
>
> Well, they keep getting stalled because we don't have a good answer for
> what to do about the fact that an unprivileged user can make trees
> undeletable by pinning them with mounts. (Miklos and Eric cc'd in case
> I didn't explain that well enough).
That's correct.
The best answer I can come up with is to allow rmdir/unlink to
automatically umount trees from their respective dentries. Obviously
this can't be done for regular (privileged) mounts, which must keep
returning EBUSY in such situations.
But for unprivileged mounts I can't see any fundamental issue with
such an approach.
Does anyone see a problem with this? Is there a better solution?
Thanks,
Miklos
On Mon 2009-03-23 14:21:30, Miklos Szeredi wrote:
> [CCs trimmed]
>
> On Mon, 16 Mar 2009, Serge E. Hallyn wrote:
> > Quoting J. Bruce Fields ([email protected]):
> > > special privilege, so don't consult filesystem permissions (do I have
> > > that right? What happened to the attempt to allow ordinary users to
> > > mount?).
> >
> > Well, they keep getting stalled because we don't have a good answer for
> > what to do about the fact that an unprivileged user can make trees
> > undeletable by pinning them with mounts. (Miklos and Eric cc'd in case
> > I didn't explain that well enough).
>
> That's correct.
>
> The best answer I can come up with is to allow rmdir/unlink to
> automatically umount trees from their respective dentries. Obviously
> this can't be done for regular (privileged) mounts, which must keep
> returning EBUSY in such situations.
>
> But for unprivileged mounts I can't see any fundamental issue with
> such an approach.
>
> Does anyone see a problem with this? Is there a better solution?
Well... traditionally if you have an open file or cwd inside mounted
tree... that blocks unmount, right?
What will you do with processes that have open (deleted) files inside
the mount? What about cwd?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Thu, Mar 26, 2009 at 01:43:38PM +0100, Pavel Machek wrote:
> Well... traditionally if you have an open file or cwd inside mounted
> tree... that blocks unmount, right?
>
> What will you do with processes that have open (deleted) files inside
> the mount? What about cwd?
umount -l ?
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
Pavel Machek <[email protected]> writes:
> On Mon 2009-03-23 14:21:30, Miklos Szeredi wrote:
>> [CCs trimmed]
>>
>> On Mon, 16 Mar 2009, Serge E. Hallyn wrote:
>> > Quoting J. Bruce Fields ([email protected]):
>> > > special privilege, so don't consult filesystem permissions (do I have
>> > > that right? What happened to the attempt to allow ordinary users to
>> > > mount?).
>> >
>> > Well, they keep getting stalled because we don't have a good answer for
>> > what to do about the fact that an unprivileged user can make trees
>> > undeletable by pinning them with mounts. (Miklos and Eric cc'd in case
>> > I didn't explain that well enough).
>>
>> That's correct.
>>
>> The best answer I can come up with is to allow rmdir/unlink to
>> automatically umount trees from their respective dentries. Obviously
>> this can't be done for regular (privileged) mounts, which must keep
>> returning EBUSY in such situations.
>>
>> But for unprivileged mounts I can't see any fundamental issue with
>> such an approach.
>>
>> Does anyone see a problem with this? Is there a better solution?
>
> Well... traditionally if you have an open file or cwd inside mounted
> tree... that blocks unmount, right?
>
> What will you do with processes that have open (deleted) files inside
> the mount? What about cwd?
That is a backwards understanding, of the problem.
Currently I can not delete my mount point if I have something mounted on it in another
mount namespace.
Generally lazy unmounts solve the deleted inodes problem, your were talking about.
Eric
But ordinary users can't create devices. It seems to me that in time
of implementation of capabilities in kernel 2.4, capabilities related
to filesystem was added first. And mark for them contains all above in
header file. And when CAP_MKNOD was added later, author just forget to
update mask.
If mask was designed to drop all filesystem related capabilities, then
it must be expanded, because ordinary users cannot create devices etc.
2009/3/13, J. Bruce Fields <[email protected]>:
> On Fri, Mar 13, 2009 at 09:21:23AM +1300, Michael Kerrisk wrote:
>> On Fri, Mar 13, 2009 at 5:10 AM, Serge E. Hallyn <[email protected]> wrote:
>> > Quoting J. Bruce Fields ([email protected]):
>> >> On Wed, Mar 11, 2009 at 03:53:34PM +0300, Igor Zhbanov wrote:
>> >> > Hello!
>> >> >
>> >> > It seems that CAP_MKNOD and CAP_LINUX_IMMUTABLE were forgotten to be
>> >> > added to CAP_FS_MASK_B0 in linux-2.6.x and to CAP_FS_MASK in
>> >> > linux-2.4.x. Both capabilities affects file system and can be
>> >> > considered file system capabilities.
>> >>
>> >> Sounds right to me--I'd expect rootsquash to guarantee that new device
>> >> nodes can't be created from the network. ?Cc'ing random people from the
>> >> git log for include/linux/capability.h in hopes they can help.
>> >
>> > Yeah it seems reasonable. ?If it is, then does that mean that we
>> > also need CAP_SYS_ADMIN (to write selinux labels) and CAP_SETFCAP
>> > (to set file capabilities) as well?
>>
>> If a change is made to CAP_FS_MASK, please do remember to CC
>> [email protected], and linux-api@.
>
> OK, that's because the exact set of capabilities that is dropped on
> setfsuid is documented in capabilities(7)? (Anywhere else?)
>
> --b.
>
>>
>> Cheers,
>>
>> Michael
>>
>>
>> --
>> Michael Kerrisk Linux man-pages maintainer;
>> http://www.kernel.org/doc/man-pages/ Found a documentation bug?
>> http://www.kernel.org/doc/man-pages/reporting_bugs.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>
Here you can find who and how use setfsuid(...):
http://www.google.com/codesearch?hl=ru&sa=N&q=%5B%5E_%5Dsetfsuid%5B%5C+%5Ct%5D*%5C(%5Ba-z_%5D%2B+file:%5C.%5Bch%5D%5Bcpx%5D*%24
Anyway, please commit changes to linux-2.4 too.
fs/nfsd/auth.c for CAP_NFSD_MASK:
#define CAP_NFSD_MASK (CAP_FS_MASK|CAP_TO_MASK(CAP_SYS_RESOURCE))
and
include/linux/capability.h for CAP_FS_MASK:
#define CAP_FS_MASK 0x1f