2006-08-14 22:08:03

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Serge E. Hallyn ([email protected]):
> This patch implements file (posix) capabilities. This allows
> a binary to gain a subset of root's capabilities without having
> the file actually be setuid root.
>
> There are some other implementations out there taking various
> approaches. This patch keeps all the changes within the
> capability LSM, and stores the file capabilities in xattrs
> named "security.capability". First question is, do we want
> this in the kernel? Second is, is this sort of implementation
> we'd want?
>
> Some userspace tools to manipulate the fscaps are at
> http://www.sr71.net/~hallyn/fscaps/. For instance,
>
> setcap writeroot "cap_dac_read_search,cap_dac_override+eip"
>
> allows the 'writeroot' testcase to write to /root/ab when
> run as a normal user.
>
> This patch doesn't address the need to update
> cap_bprm_secureexec().
>
> Signed-off-by: Serge E. Hallyn <[email protected]>
> ---
> include/linux/security.h | 10 +++-
> security/Kconfig | 11 ++++
> security/capability.c | 5 ++
> security/commoncap.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 152 insertions(+), 1 deletions(-)

Here is a different version which sticks closer to the original
version by Simmonds, in that it simply calls getxattr at
bprm_set_security. Very inadequate perftesting showed no
overhead, and this makes the patch much smaller and compatible
with SELinux. Note that this patch changes nothing if
CONFIG_FS_CAPABILITIES is not enabled, which is the default, and
should therefore be safe to include in -mm.

(Also CC:ing lsm list which I forgot to do last time)

Signed-off-by: Serge E. Hallyn <[email protected]>
---
security/Kconfig | 10 ++++++++++
security/commoncap.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/security/Kconfig b/security/Kconfig
index 67785df..aa3558f 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -80,6 +80,16 @@ config SECURITY_CAPABILITIES
This enables the "default" Linux capabilities functionality.
If you are unsure how to answer this question, answer Y.

+config SECURITY_FS_CAPABILITIES
+ bool "Filesystem Capabilities"
+ depends on SECURITY_CAPABILITIES
+ default n
+ help
+ This enables filesystem capabilities, allowing you to give
+ binaries a subset of root's powers without using setuid 0.
+
+ If in doubt, answer N.
+
config SECURITY_ROOTPLUG
tristate "Root Plug Support"
depends on USB && SECURITY
diff --git a/security/commoncap.c b/security/commoncap.c
index f50fc29..064fbb0 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -109,11 +109,17 @@ void cap_capset_set (struct task_struct
target->cap_permitted = *permitted;
}

+#define XATTR_CAPS_SUFFIX "capability"
+#define XATTR_NAME_CAPS XATTR_SECURITY_PREFIX XATTR_CAPS_SUFFIX
int cap_bprm_set_security (struct linux_binprm *bprm)
{
+ ssize_t (*bprm_getxattr)(struct dentry *,const char *,void *,size_t);
+ struct dentry *bprm_dentry;
+ ssize_t rc;
+ u32 fscaps[3];
+
/* Copied from fs/exec.c:prepare_binprm. */

- /* We don't have VFS support for capabilities yet */
cap_clear (bprm->cap_inheritable);
cap_clear (bprm->cap_permitted);
cap_clear (bprm->cap_effective);
@@ -134,6 +140,46 @@ int cap_bprm_set_security (struct linux_
if (bprm->e_uid == 0)
cap_set_full (bprm->cap_effective);
}
+
+#ifdef CONFIG_SECURITY_FS_CAPABILITIES
+ /* Locate any VFS capabilities: */
+
+ bprm_dentry = dget(bprm->file->f_dentry);
+ if(!(bprm_dentry->d_inode->i_op &&
+ bprm_dentry->d_inode->i_op->getxattr)) {
+ dput(bprm_dentry);
+ return 0;
+ }
+ bprm_getxattr = bprm_dentry->d_inode->i_op->getxattr;
+
+ rc = bprm_getxattr(bprm_dentry, XATTR_NAME_CAPS, &fscaps,
+ sizeof(fscaps));
+ dput(bprm_dentry);
+
+ /*
+ * serge: not sure about the return values...
+ * think about them some more, maybe some should
+ * return rc.
+ */
+ if (rc < 0 && rc != -ENODATA) {
+ printk(KERN_NOTICE "%s: Error (%d) getting xattr\n",
+ __FUNCTION__, rc);
+ return 0;
+ }
+ if (rc < 0)
+ return 0;
+
+ if (rc != sizeof(fscaps)) {
+ printk(KERN_NOTICE "%s: got wrong size for getxattr (%d)\n",
+ __FUNCTION__, rc);
+ return 0;
+ }
+
+ bprm->cap_effective = fscaps[0];
+ bprm->cap_inheritable = fscaps[1];
+ bprm->cap_permitted = fscaps[2];
+
+#endif
return 0;
}

--
1.4.1.1


2006-08-15 00:20:27

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

"Serge E. Hallyn" <[email protected]> writes:

> Quoting Serge E. Hallyn ([email protected]):
>> This patch implements file (posix) capabilities. This allows
>> a binary to gain a subset of root's capabilities without having
>> the file actually be setuid root.
>>
>> There are some other implementations out there taking various
>> approaches. This patch keeps all the changes within the
>> capability LSM, and stores the file capabilities in xattrs
>> named "security.capability". First question is, do we want
>> this in the kernel? Second is, is this sort of implementation
>> we'd want?
>>
>> Some userspace tools to manipulate the fscaps are at
>> http://www.sr71.net/~hallyn/fscaps/. For instance,
>>
>> setcap writeroot "cap_dac_read_search,cap_dac_override+eip"
>>
>> allows the 'writeroot' testcase to write to /root/ab when
>> run as a normal user.
>>
>> This patch doesn't address the need to update
>> cap_bprm_secureexec().

Looking at your ondisk format it doesn't look like you include a
version. There is no reason to believe the current set of kernel
capabilities is fixed for all time. So we need some for of
forward/backward compatibility. Maybe in the cap name?

Eric

2006-08-15 02:07:04

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Eric W. Biederman ([email protected]):
> "Serge E. Hallyn" <[email protected]> writes:
>
> > Quoting Serge E. Hallyn ([email protected]):
> >> This patch implements file (posix) capabilities. This allows
> >> a binary to gain a subset of root's capabilities without having
> >> the file actually be setuid root.
> >>
> >> There are some other implementations out there taking various
> >> approaches. This patch keeps all the changes within the
> >> capability LSM, and stores the file capabilities in xattrs
> >> named "security.capability". First question is, do we want
> >> this in the kernel? Second is, is this sort of implementation
> >> we'd want?
> >>
> >> Some userspace tools to manipulate the fscaps are at
> >> http://www.sr71.net/~hallyn/fscaps/. For instance,
> >>
> >> setcap writeroot "cap_dac_read_search,cap_dac_override+eip"
> >>
> >> allows the 'writeroot' testcase to write to /root/ab when
> >> run as a normal user.
> >>
> >> This patch doesn't address the need to update
> >> cap_bprm_secureexec().
>
> Looking at your ondisk format it doesn't look like you include a
> version. There is no reason to believe the current set of kernel
> capabilities is fixed for all time.

In fact my version knowingly ignores CAP_AUDIT_WRITE and
CAP_AUDIT_CONTROL (because on my little test .iso they didn't exist).
So a version number may make sense.

> So we need some for of
> forward/backward compatibility. Maybe in the cap name?

You mean as in use 'security.capability_v32" for the xattr name?
Or do you really mean add a cap name to the structure?

thanks,
-serge

2006-08-15 03:30:21

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

"Serge E. Hallyn" <[email protected]> writes:

> In fact my version knowingly ignores CAP_AUDIT_WRITE and
> CAP_AUDIT_CONTROL (because on my little test .iso they didn't exist).
> So a version number may make sense.
>
>> So we need some for of
>> forward/backward compatibility. Maybe in the cap name?
>
> You mean as in use 'security.capability_v32" for the xattr name?
> Or do you really mean add a cap name to the structure?

I was thinking the xattr name. But mostly I was looking
for a place where you had possibly stashed a version.

Thinking about it possibly the most portable thing to do
is to assign each cap a well known name. Say
"security.cap.dac_override" and have a value in there like +1
add the cap -1 clear the cap. That at least seems to provide
granularity and some measure of future proofing and some measure of
portability. The space it would take with those names looks ugly
though.

The practical question is what do you do with a program that
was give a set of capabilities you no longer support?
Do you run it without any capabilities at all?
Do you give it as many capabilities of what it asked for
as you can?
Do you complain loudly and refuse to execute it at all?

What is the secure choice that least violates the principle of least surprise?

Eric

2006-08-15 04:22:21

by Nicholas Miell

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Mon, 2006-08-14 at 21:29 -0600, Eric W. Biederman wrote:
> "Serge E. Hallyn" <[email protected]> writes:
>
> > In fact my version knowingly ignores CAP_AUDIT_WRITE and
> > CAP_AUDIT_CONTROL (because on my little test .iso they didn't exist).
> > So a version number may make sense.
> >
> >> So we need some for of
> >> forward/backward compatibility. Maybe in the cap name?
> >
> > You mean as in use 'security.capability_v32" for the xattr name?
> > Or do you really mean add a cap name to the structure?
>
> I was thinking the xattr name. But mostly I was looking
> for a place where you had possibly stashed a version.
>
> Thinking about it possibly the most portable thing to do
> is to assign each cap a well known name. Say
> "security.cap.dac_override" and have a value in there like +1
> add the cap -1 clear the cap. That at least seems to provide
> granularity and some measure of future proofing and some measure of
> portability. The space it would take with those names looks ugly
> though.
>
> The practical question is what do you do with a program that
> was give a set of capabilities you no longer support?
> Do you run it without any capabilities at all?
> Do you give it as many capabilities of what it asked for
> as you can?
> Do you complain loudly and refuse to execute it at all?
>
> What is the secure choice that least violates the principle of least surprise?

Make it an arbitrary length bitfield with a defined byte order (little
endian, probably). Bits at offsets greater than the length of the
bitfield are defined to be zero. If the kernel encounters a set bit that
it doesn't recognizes, fail with EPERM. If userspace attempts to set a
bit that the kernel doesn't recognize, fail with EINVAL.

It's extensible (as new capability bits are added, the length of the
bitfield grows), backward compatible (as long as there are no unknown
bits set, it'll still work) and secure (if an unknown bit is set, the
kernel fails immediately, so there's no chance of a "secure" app running
with less privileges than it expects and opening up a security hole).

OTOH, everybody seems to have moved from capability-based security
models on to TE/RBAC-based security models, so maybe this isn't worth
the effort?

--
Nicholas Miell <[email protected]>

2006-08-15 11:49:50

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Nicholas Miell ([email protected]):
> On Mon, 2006-08-14 at 21:29 -0600, Eric W. Biederman wrote:
> > "Serge E. Hallyn" <[email protected]> writes:
> >
> > > In fact my version knowingly ignores CAP_AUDIT_WRITE and
> > > CAP_AUDIT_CONTROL (because on my little test .iso they didn't exist).
> > > So a version number may make sense.
> > >
> > >> So we need some for of
> > >> forward/backward compatibility. Maybe in the cap name?
> > >
> > > You mean as in use 'security.capability_v32" for the xattr name?
> > > Or do you really mean add a cap name to the structure?
> >
> > I was thinking the xattr name. But mostly I was looking
> > for a place where you had possibly stashed a version.
> >
> > Thinking about it possibly the most portable thing to do
> > is to assign each cap a well known name. Say
> > "security.cap.dac_override" and have a value in there like +1
> > add the cap -1 clear the cap. That at least seems to provide
> > granularity and some measure of future proofing and some measure of
> > portability. The space it would take with those names looks ugly
> > though.

On the one hand, there shouldn't be many executables with capabilities
so even a horrendous abuse of disk space isn't so bad, but on the other
hand, yes, it'd be a horrendous abuse of disk space :)

> > The practical question is what do you do with a program that
> > was give a set of capabilities you no longer support?
> > Do you run it without any capabilities at all?
> > Do you give it as many capabilities of what it asked for
> > as you can?
> > Do you complain loudly and refuse to execute it at all?
> >
> > What is the secure choice that least violates the principle of least surprise?
>
> Make it an arbitrary length bitfield with a defined byte order (little
> endian, probably). Bits at offsets greater than the length of the
> bitfield are defined to be zero. If the kernel encounters a set bit that
> it doesn't recognizes, fail with EPERM. If userspace attempts to set a
> bit that the kernel doesn't recognize, fail with EINVAL.
>
> It's extensible (as new capability bits are added, the length of the
> bitfield grows), backward compatible (as long as there are no unknown
> bits set, it'll still work) and secure (if an unknown bit is set, the
> kernel fails immediately, so there's no chance of a "secure" app running
> with less privileges than it expects and opening up a security hole).

Sounds good.

The version number will imply the bitfield length, or do we feel warm
fuzzies if the length is redundantly encoded in the structure?

> OTOH, everybody seems to have moved from capability-based security
> models on to TE/RBAC-based security models, so maybe this isn't worth
> the effort?

One day perhaps, but that day isn't here yet. People are still using
setuid (see /sbin/passwd), so obviously they're not sufficiently
comfortable using *only* TE/RBAC.

-serge

2006-08-15 12:20:29

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Serge E. Hallyn ([email protected]):
> > Make it an arbitrary length bitfield with a defined byte order (little
> > endian, probably). Bits at offsets greater than the length of the
> > bitfield are defined to be zero. If the kernel encounters a set bit that
> > it doesn't recognizes, fail with EPERM. If userspace attempts to set a
> > bit that the kernel doesn't recognize, fail with EINVAL.
> >
> > It's extensible (as new capability bits are added, the length of the
> > bitfield grows), backward compatible (as long as there are no unknown
> > bits set, it'll still work) and secure (if an unknown bit is set, the
> > kernel fails immediately, so there's no chance of a "secure" app running
> > with less privileges than it expects and opening up a security hole).
>
> Sounds good.
>
> The version number will imply the bitfield length, or do we feel warm
> fuzzies if the length is redundantly encoded in the structure?

nm, 'encoded in the structure' clearly is silly.

-serge

2006-08-15 16:02:48

by Casey Schaufler

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

/*
*
* Copyright 1990 Silicon Graphics, Inc.
* All rights reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*
*/
#ident "$SGIRevision: 1.30 $"

#include "sys/types.h"
#include "sys/systm.h"
#include "sys/debug.h"
#include "sys/cmn_err.h"
#include "ksys/cred.h"
#include "sys/proc.h"
#include "sys/vfs.h"
#include "sys/param.h"
#include "sys/errno.h"
#include "sys/vfs.h"
#include "sys/vnode.h"
#include "sys/fstyp.h"
#include "ksys/vfile.h"
#include "ksys/fdt.h"
#include "sys/uio.h"
#include "sys/pathname.h"
#include "sys/capability.h"
#include "sys/sat.h"
#include "sys/attributes.h"
#include "os/proc/pproc_private.h" /* XXX bogus */
#include "ksys/fdt.h"
#include "sys/uthread.h"

/*
* Make the default be augmented superuser.
*/
void
cap_init()
{
cap_enabled = CAP_SYS_SUPERUSER;
}

/*
* Give the requested credential all privileges.
* This is used to initialize sys_cred and NFS server credentials.
*/
void
cap_empower_cred(cred_t *cr)
{
cr->cr_cap.cap_effective = CAP_ALL_ON;
cr->cr_cap.cap_inheritable = CAP_ALL_ON;
cr->cr_cap.cap_permitted = CAP_ALL_ON;
}

int
cap_vtocap(vnode_t *vp, cap_t capp)
{
int error;
int i = sizeof(cap_set_t);

VOP_ATTR_GET(vp, SGI_CAP_FILE, (char *)capp, &i, ATTR_ROOT,
sys_cred, error);

return error;
}

/*
* Recalculate the process capabilities appropriate after exec(2). Uses the
* POSIX P1003.1eD16 algorythm. For exec()'s of SCAP files:
*
* New-inheritable = Proc-inheritable & File-inheritable
* New-permitted = (New-inheritable & Proc-permitted) | File-permitted
* New-effective = File-effective & New-permitted
*
* For exec()'s of non-SCAP files when the current cap_inheritable has
* CAP_FLAG_PURE_RECALC set we drop all capabilities. Otherwise we leave
* the process' current capabilities alone.
*
* Called with proposed new capabilities from file, "fcap" (which will be
* invalid if the file doesn't have any attached capabilities). Returns with
* recalculated capabilities in *current* process credentials. Also returns
* whether or not the calculation resulted in an increase in capabilities.
* Must be called with pcred_lock() set.
*/
int
cap_recalc(const cap_set_t *fcap)
{
uthread_t *ut = curuthread;
proc_t *p = UT_TO_PROC(ut); /* XXX virtualize this? */
cred_t *cr = p->p_cred;
cap_set_t *pcap = &cr->cr_cap;
int rxev;

/*
* This routine is called only from exec, so there's only one
* uthread, and we don't have to pcred_push() the changes to the
* uthread. Also assert that we have the credentials lock ...
*/
ASSERT(!(ut->ut_flags & UT_PTHREAD));
ASSERT(pcred_islocked(p));

if (fcap->cap_effective & CAP_INVALID) {
/*
* File didn't have any capabilities attached to it. If
* CAP_FLAG_PURE_RECALC is set clear any capabilities.
* Otherwise, leave everything alone and return. Note: We
* oughtn't need to retain the CAP_FLAG_PURE_RECALC flag.
*/
if (pcap->cap_inheritable & CAP_FLAG_PURE_RECALC) {
#pragma mips_frequency_hint NEVER
pcap->cap_effective = CAP_ALL_OFF;
pcap->cap_permitted = CAP_ALL_OFF;
pcap->cap_inheritable = CAP_FLAG_PURE_RECALC;
}
rxev = 0;
} else {
#pragma mips_frequency_hint NEVER
cap_set_t opcap = *pcap;

pcap->cap_inheritable = CAP_ALL_ON &
(pcap->cap_inheritable & fcap->cap_inheritable);
pcap->cap_permitted = CAP_ALL_ON &
((pcap->cap_inheritable & pcap->cap_permitted) |
fcap->cap_permitted);
pcap->cap_effective = CAP_ALL_ON &
(fcap->cap_effective & pcap->cap_permitted);

rxev = ((pcap->cap_effective & ~opcap.cap_effective) ||
(pcap->cap_permitted & ~opcap.cap_permitted) ||
(pcap->cap_inheritable & ~opcap.cap_inheritable));
}

if ((pcap->cap_effective & CAP_ALL_ON) ||
(pcap->cap_permitted & CAP_ALL_ON))
p_flagset(p, SPRPROTECT);
else
p_flagclr(p, SPRPROTECT);

return rxev;
}

int
cap_setpcap(cap_set_t *src, cap_value_t *flags)
{
struct cred *cr;
/* REFERENCED */
proc_t *pp = curprocp;

if (flags != NULL) {
ASSERT(src == NULL);
cr = pcred_lock(pp);
cr = crcopy(pp);
cr->cr_cap.cap_inheritable =
(cr->cr_cap.cap_inheritable & CAP_ALL_ON) |
(*flags & CAP_FLAGS);
pcred_push(pp);
return 0;
}

ASSERT(src != NULL);

if ((src->cap_inheritable & CAP_INVALID) ||
(src->cap_permitted & CAP_INVALID) ||
(src->cap_effective & CAP_INVALID))
return EINVAL;

/*
* Clear any flag bits in the new set.
*/
src->cap_inheritable &= CAP_ALL_ON;
src->cap_permitted &= CAP_ALL_ON;
src->cap_effective &= CAP_ALL_ON;

/*
* Audit the old capability set
*/
_SAT_SAVE_ATTR(SAT_CAP_SET_TOKEN, curuthread);

cr = pcred_lock(pp);

if ((!CAP_ID_ISSET(src->cap_inheritable,
cr->cr_cap.cap_inheritable) &&
!CAP_ID_ISSET(src->cap_inheritable,
cr->cr_cap.cap_permitted)) ||
(!CAP_ID_ISSET(src->cap_permitted, cr->cr_cap.cap_permitted)) ||
(!CAP_ID_ISSET(src->cap_effective, cr->cr_cap.cap_effective) &&
!CAP_ID_ISSET(src->cap_effective, cr->cr_cap.cap_permitted))) {
if (!cap_able(CAP_SETPPRIV)) {
pcred_push(pp);
_SAT_SETPCAP(src, EPERM);
return EPERM;
}
}

/*
* Retain the flags stored in the inheritable set.
*/
src->cap_inheritable |= cr->cr_cap.cap_inheritable & CAP_FLAGS;

/*
* At this point we are going to set the requested capability state
* so, check whether the process will have any permitted or effective
* capabilities after the call is over. If so, taint the process
* so that access through /proc is privileged from now on (until
* the next exec())
*/
if ( (src->cap_permitted & CAP_ALL_ON)
|| (src->cap_effective & CAP_ALL_ON)) {
p_flagset(pp, SPRPROTECT);
}

/*
* Set the requested capability state.
*/
cr = crcopy(pp);
cr->cr_cap = *src;
pcred_push(pp);

_SAT_SETPCAP(src, 0);
return 0;
}

int
cap_request(cap_value_t cid)
{
proc_t *pp = curprocp;
struct cred *cr = pcred_lock(pp);

if (CAP_ID_ISSET(cid, cr->cr_cap.cap_permitted)) {
if (!CAP_ID_ISSET(cid, cr->cr_cap.cap_effective)) {
cr = crcopy(pp);
CAP_ID_SET(cid, cr->cr_cap.cap_effective);
pcred_push(pp);
} else
pcred_unlock(pp);
return 0;
}
pcred_unlock(pp);
return EPERM;
}


int
cap_surrender(cap_value_t cid)
{
proc_t *pp = curprocp;
struct cred *cr = pcred_lock(pp);

if (CAP_ID_ISSET(cid, cr->cr_cap.cap_effective)) {
cr = crcopy(pp);
CAP_ID_CLEAR(cid, cr->cr_cap.cap_effective);
pcred_push(pp);
} else
pcred_unlock(pp);
return 0;
}

/*
* cap_get - get the capability set of a file
*
* Returns 0 on success.
* Returns ENOATTR if no capability set is associated with the file.
*/

int
cap_get(char *fname, int fdes, struct cap_set *cap)
{
struct cap_set kcap;
vfile_t *fp;
vnode_t *vp;
int error = 0;

if (fdes != -1 && fname != NULL)
return (EINVAL);
if (fdes == -1 && fname == NULL)
return (EINVAL);

if (!cap)
return (EINVAL);

_SAT_PN_BOOK(SAT_FILE_ATTR_READ, curuthread);

if (fname) {
if (error=lookupname(fname,UIO_USERSPACE,NO_FOLLOW,NULLVPP,&vp, NULL))
return (error);
}
else {
if (error = getf(fdes, &fp))
return (error);
if (!VF_IS_VNODE(fp))
return (EINVAL);
vp = VF_TO_VNODE(fp);

VN_HOLD(vp);
}

if (!(error = _MAC_VACCESS(vp, curuthread->ut_cred, VREAD)))
error = cap_vtocap(vp, &kcap); /* Returns ENOATTR if no capability set */

if (!error) {
if (copyout((caddr_t)&kcap,(caddr_t)cap,sizeof(struct cap_set)))
error = EFAULT;
}

VN_RELE(vp);

_SAT_ACCESS(SAT_FILE_ATTR_READ, error);
return (error);
}

/*
* cap_set - set the capability set on a file
*
* If cap is a null pointer, the capability set is removed from
* the file.
*
* Returns 0 on success.
*/

int
cap_set(char *fname, int fdes, struct cap_set *cap)
{
struct cap_set kcap;
vnode_t *vp;
vfile_t *fp;
int error;

if (!_CAP_ABLE(CAP_SETFCAP))
return (EPERM);

if (fdes != -1 && fname != NULL)
return (EINVAL);
if (fdes == -1 && fname == NULL)
return (EINVAL);


if (cap && copyin((caddr_t)cap, (caddr_t)&kcap, sizeof(struct cap_set)))
return (EFAULT);

_SAT_PN_BOOK(SAT_FILE_ATTR_WRITE, curuthread);

if (fname) {
if (error=lookupname(fname,UIO_USERSPACE,NO_FOLLOW,NULLVPP,&vp, NULL))
return (error);
}
else {
if (error = getf(fdes, &fp))
return (error);
if (!VF_IS_VNODE(fp))
return (EINVAL);
vp = VF_TO_VNODE(fp);
VN_HOLD(vp);
}

/*
* MAC and ownership checks.
*/
if (!(error = _MAC_VACCESS(vp, curuthread->ut_cred, VWRITE))) {
vattr_t va;

va.va_mask = AT_UID;
VOP_GETATTR(vp, &va, 0, sys_cred, error);

if (!error && va.va_uid != curuthread->ut_cred->cr_uid &&
!_CAP_CRABLE(curuthread->ut_cred, CAP_FOWNER))
error = EACCES;
}

if (!error) {
/*
* Only regular files should have cap sets.
* Better not try to update a read-only file system.
*/
if (vp->v_type != VREG)
error = EINVAL;
else if (vp->v_vfsp->vfs_flag & VFS_RDONLY)
error = EROFS;
else if ( cap )
{
VOP_ATTR_SET(vp, SGI_CAP_FILE, (char *)cap,
sizeof(cap_set_t), ATTR_ROOT, sys_cred, error);
}
else
{
VOP_ATTR_REMOVE(vp, SGI_CAP_FILE, ATTR_ROOT, sys_cred, error);
if ( error == ENOATTR )
/* There already was no capability associated with the file */
error = 0;
}
}

VN_RELE(vp);
_SAT_SETCAP( cap, error );
return (error);
}

int
cap_style(int new)
{

#ifdef DEBUG
if (cap_enabled == CAP_SYS_SUPERUSER)
cmn_err(CE_NOTE,"Superuser and capabilities provide privilege");
else if (cap_enabled == CAP_SYS_NO_SUPERUSER)
cmn_err(CE_NOTE,"Capabilities alone provide privilege");
else if (cap_enabled == CAP_SYS_DISABLED)
cmn_err(CE_NOTE,"Superuser alone provides privilege");
#endif /* DEBUG */

if (!_CAP_ABLE(CAP_SYSINFO_MGT))
return (EPERM);

#ifdef DEBUG
if (new == CAP_SYS_SUPERUSER)
cmn_err(CE_NOTE,
"Now, Superuser and capabilities provide privilege");
else if (new == CAP_SYS_NO_SUPERUSER)
cmn_err(CE_NOTE,"Now, Capabilities alone provide privilege");
else if (new == CAP_SYS_DISABLED)
cmn_err(CE_NOTE,"Now, Superuser alone provides privilege");
#endif /* DEBUG */
cap_enabled = new;
return (0);
}


Attachments:
capability.c (10.50 kB)
1104682580-capability.c

2006-08-15 16:17:45

by Stephen Smalley

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Tue, 2006-08-15 at 06:49 -0500, Serge E. Hallyn wrote:
> Quoting Nicholas Miell ([email protected]):
> > OTOH, everybody seems to have moved from capability-based security
> > models on to TE/RBAC-based security models, so maybe this isn't worth
> > the effort?
>
> One day perhaps, but that day isn't here yet. People are still using
> setuid (see /sbin/passwd), so obviously they're not sufficiently
> comfortable using *only* TE/RBAC.

The hard part of capabilities isn't the kernel mechanism - it is the
proper assignment and management of the capability bits on files, and
teaching userland that uid 0 is no longer magic. Which is all work that
is already well underway for SELinux, but you would have to replicate it
for capabilities. And since there is no notion of equivalence classes
ala SELinux types and the "policy" is completely distributed throughout
the filesystem state, management is going to be even more painful for
the capabilities.

On the kernel side, in addition to updating the bprm_secureexec logic,
you would need to consider whether the capability module needs to
implement capability comparisons for the other hooks, like task_kill.
At present, many operations only involve uid comparisons and SELinux
checks without explicitly comparing capability sets. Properly isolating
and protecting processes with different capability sets but the same uid
is something SELinux already can do (based on domain), whereas the
existing capability module doesn't really provide that.

--
Stephen Smalley
National Security Agency

2006-08-15 16:36:37

by Casey Schaufler

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities



--- Stephen Smalley <[email protected]> wrote:


> The hard part of capabilities isn't the kernel
> mechanism - it is the
> proper assignment and management of the capability
> bits on files, and
> teaching userland that uid 0 is no longer magic.

Stephen is correct here. Getting the capability
settings correct on all setuid programs is a
tough nut. Training people out of the root
model isn't easy, either.

> Which is all work that
> is already well underway for SELinux, but you would
> have to replicate it
> for capabilities.

The work underway for SELinux is different
from that of capabilities. POSIX capabilities
offer substantial benefits without the high
cost of SELinux.

> And since there is no notion of
> equivalence classes
> ala SELinux types and the "policy" is completely
> distributed throughout
> the filesystem state, management is going to be even
> more painful for
> the capabilities.

This is not the experiance of Irix, which
has supported POSIX capabilities for years.

> On the kernel side, in addition to updating the
> bprm_secureexec logic,
> you would need to consider whether the capability
> module needs to
> implement capability comparisons for the other
> hooks, like task_kill.
> At present, many operations only involve uid
> comparisons and SELinux
> checks without explicitly comparing capability sets.

Yes. There is work to be done.

> Properly isolating
> and protecting processes with different capability
> sets but the same uid
> is something SELinux already can do (based on
> domain), whereas the
> existing capability module doesn't really provide
> that.

That's a matter of taste in policy. There
is certainly no Common Criteria requirement
for that.

Sure, SELinux and POSIX capabilities are
different. No arguement. POSIX capabilities
predate SELinux in the community and in the
Linux kernel. To date they have been limited
by the lack of xattr support, but now that
that is available there is every reason to
complete the implementation.


Casey Schaufler
[email protected]

2006-08-15 19:31:08

by Nicholas Miell

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Tue, 2006-08-15 at 07:20 -0500, Serge E. Hallyn wrote:
> Quoting Serge E. Hallyn ([email protected]):
> > > Make it an arbitrary length bitfield with a defined byte order (little
> > > endian, probably). Bits at offsets greater than the length of the
> > > bitfield are defined to be zero. If the kernel encounters a set bit that
> > > it doesn't recognizes, fail with EPERM. If userspace attempts to set a
> > > bit that the kernel doesn't recognize, fail with EINVAL.
> > >
> > > It's extensible (as new capability bits are added, the length of the
> > > bitfield grows), backward compatible (as long as there are no unknown
> > > bits set, it'll still work) and secure (if an unknown bit is set, the
> > > kernel fails immediately, so there's no chance of a "secure" app running
> > > with less privileges than it expects and opening up a security hole).
> >
> > Sounds good.
> >
> > The version number will imply the bitfield length, or do we feel warm
> > fuzzies if the length is redundantly encoded in the structure?
>
> nm, 'encoded in the structure' clearly is silly.
>

There isn't really a version number, just recognized and unrecognized
capability bits. If you wanted, you could use a single byte to give a
binary CAP_DAC_OVERRIDE, with capability bits 8-30 being "stored" in
not-present bytes and therefore assumed to be zero.

--
Nicholas Miell <[email protected]>

2006-08-15 19:41:56

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Nicholas Miell ([email protected]):
> On Tue, 2006-08-15 at 07:20 -0500, Serge E. Hallyn wrote:
> > Quoting Serge E. Hallyn ([email protected]):
> > > > Make it an arbitrary length bitfield with a defined byte order (little
> > > > endian, probably). Bits at offsets greater than the length of the
> > > > bitfield are defined to be zero. If the kernel encounters a set bit that
> > > > it doesn't recognizes, fail with EPERM. If userspace attempts to set a
> > > > bit that the kernel doesn't recognize, fail with EINVAL.
> > > >
> > > > It's extensible (as new capability bits are added, the length of the
> > > > bitfield grows), backward compatible (as long as there are no unknown
> > > > bits set, it'll still work) and secure (if an unknown bit is set, the
> > > > kernel fails immediately, so there's no chance of a "secure" app running
> > > > with less privileges than it expects and opening up a security hole).
> > >
> > > Sounds good.
> > >
> > > The version number will imply the bitfield length, or do we feel warm
> > > fuzzies if the length is redundantly encoded in the structure?
> >
> > nm, 'encoded in the structure' clearly is silly.
> >
>
> There isn't really a version number, just recognized and unrecognized

Well, there is, defined in include/linux/capability.h, but it has never
changed thus far, and tying it to the bitfield length could get ugly as
things change.

In the below patch, the length is simply defined in
include/linux/capability.h, nothing fancy. If anything beyond the known
capability bits is set to 1, we refuse exec permission.

> capability bits. If you wanted, you could use a single byte to give a
> binary CAP_DAC_OVERRIDE, with capability bits 8-30 being "stored" in
> not-present bytes and therefore assumed to be zero.

From: Serge E. Hallyn <[email protected]>
Subject: [PATCH 1/1] security: introduce fs caps

Implement filesystem posix capabilities. This allows programs to
be given a subset of root's powers regardless of who runs them,
without having to use setuid and giving the binary all of root's
powers.

This version works with Kaigai Kohei's userspace tools, found at
http://www.kaigai.gr.jp/pub/fscaps-1.0-kg.src.rpm under
http://www.kaigai.gr.jp/index.php?FrontPage#b556e50d.

Changelog:
Aug 15:
Handle endianness of xattrs.
Enforce capability version match between kernel and disk.
Enforce that no bits beyond the known max capability are
set, else return -EPERM.
With this extra processing, it may be worth reconsidering
doing all the work at bprm_set_security rather than
d_instantiate.

Aug 10:
Always call getxattr at bprm_set_security, rather than
caching it at d_instantiate.

Signed-off-by: Serge E. Hallyn <[email protected]>
---
include/linux/capability.h | 2 +
security/Kconfig | 10 +++++
security/commoncap.c | 85 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 96 insertions(+), 1 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index 6548b35..8b1932c 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -288,6 +288,8 @@ #define CAP_AUDIT_WRITE 29

#define CAP_AUDIT_CONTROL 30

+#define CAP_NUMCAPS 30
+
#ifdef __KERNEL__
/*
* Bounding set
diff --git a/security/Kconfig b/security/Kconfig
index 67785df..ce2bac7 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -80,6 +80,16 @@ config SECURITY_CAPABILITIES
This enables the "default" Linux capabilities functionality.
If you are unsure how to answer this question, answer Y.

+config SECURITY_FS_CAPABILITIES
+ bool "Filesystem Capabilities"
+ depends on SECURITY_CAPABILITIES
+ default n
+ help
+ This enables filesystem capabilities, allowing you to give
+ binaries a subset of root's powers without using setuid 0.
+
+ If in doubt, answer N.
+
config SECURITY_ROOTPLUG
tristate "Root Plug Support"
depends on USB && SECURITY
diff --git a/security/commoncap.c b/security/commoncap.c
index f50fc29..6bf030d 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -109,11 +109,55 @@ void cap_capset_set (struct task_struct
target->cap_permitted = *permitted;
}

+#define XATTR_CAPS_SUFFIX "capability"
+#define XATTR_NAME_CAPS XATTR_SECURITY_PREFIX XATTR_CAPS_SUFFIX
+struct vfs_cap_data_struct {
+ __u32 version;
+ __u32 effective;
+ __u32 permitted;
+ __u32 inheritable;
+};
+
+static inline void convert_to_le(struct vfs_cap_data_struct *cap)
+{
+ cap->version = le32_to_cpu(cap->version);
+ cap->effective = le32_to_cpu(cap->effective);
+ cap->permitted = le32_to_cpu(cap->permitted);
+ cap->inheritable = le32_to_cpu(cap->inheritable);
+}
+
+static int check_cap_sanity(struct vfs_cap_data_struct *cap)
+{
+ int i;
+
+ if (cap->version != _LINUX_CAPABILITY_VERSION)
+ return -EPERM;
+
+ for (i=CAP_NUMCAPS; i<sizeof(cap->effective); i++) {
+ if (cap->effective & CAP_TO_MASK(i))
+ return -EPERM;
+ }
+ for (i=CAP_NUMCAPS; i<sizeof(cap->permitted); i++) {
+ if (cap->permitted & CAP_TO_MASK(i))
+ return -EPERM;
+ }
+ for (i=CAP_NUMCAPS; i<sizeof(cap->inheritable); i++) {
+ if (cap->inheritable & CAP_TO_MASK(i))
+ return -EPERM;
+ }
+
+ return 0;
+}
+
int cap_bprm_set_security (struct linux_binprm *bprm)
{
+ struct dentry *dentry;
+ ssize_t rc;
+ struct vfs_cap_data_struct cap_struct;
+ struct inode *inode;
+
/* Copied from fs/exec.c:prepare_binprm. */

- /* We don't have VFS support for capabilities yet */
cap_clear (bprm->cap_inheritable);
cap_clear (bprm->cap_permitted);
cap_clear (bprm->cap_effective);
@@ -134,6 +178,45 @@ int cap_bprm_set_security (struct linux_
if (bprm->e_uid == 0)
cap_set_full (bprm->cap_effective);
}
+
+#ifdef CONFIG_SECURITY_FS_CAPABILITIES
+ /* Locate any VFS capabilities: */
+
+ dentry = dget(bprm->file->f_dentry);
+ inode = dentry->d_inode;
+ if (!inode->i_op || !inode->i_op->getxattr) {
+ dput(dentry);
+ return 0;
+ }
+
+ rc = inode->i_op->getxattr(dentry, XATTR_NAME_CAPS, &cap_struct,
+ sizeof(cap_struct));
+ dput(dentry);
+
+ if (rc == -ENODATA)
+ return 0;
+
+ if (rc < 0) {
+ printk(KERN_NOTICE "%s: Error (%d) getting xattr\n",
+ __FUNCTION__, rc);
+ return rc;
+ }
+
+ if (rc != sizeof(cap_struct)) {
+ printk(KERN_NOTICE "%s: got wrong size for getxattr (%d)\n",
+ __FUNCTION__, rc);
+ return -EPERM;
+ }
+
+ convert_to_le(&cap_struct);
+ if (check_cap_sanity(&cap_struct))
+ return -EPERM;
+
+ bprm->cap_effective = cap_struct.effective;
+ bprm->cap_permitted = cap_struct.permitted;
+ bprm->cap_inheritable = cap_struct.inheritable;
+
+#endif
return 0;
}

--
1.4.2

2006-08-16 02:25:54

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Casey Schaufler ([email protected]):
>
>
> --- "Serge E. Hallyn" <[email protected]> wrote:
>
>
> > +
> > + bprm->cap_effective = fscaps[0];
> > + bprm->cap_inheritable = fscaps[1];
> > + bprm->cap_permitted = fscaps[2];
> > +
>
> It does not appear that you're attempting
> to maintain the POSIX exec semantics for
> capability sets. (If you're doing it
> elsewhere in the code, nevermind) I don't
> know if this is intentional or not.

It should be getting done correctly at bprm_apply_creds.
The code you quote here is just setting it on the
binprm, which represents the executable itself (and as
pointed out in the comment above it).

Now the cap_bprm_secureexec() function needs to be
updated as I believe I pointed out in the original
submission. But if anything else is not getting done
right please correct me.

> I will have a closer look, but just for
> grins, I've attached code from the SGI
> OB1 offering of some years back that
> includes a function, cap_recalc, that
> implements the correct behavior. I will
> also take a stab at working it in, but

Excellent, thanks.

> I expect someone will beat me to it.

-serge

2006-08-16 02:42:04

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Stephen Smalley ([email protected]):
> On Tue, 2006-08-15 at 06:49 -0500, Serge E. Hallyn wrote:
> > Quoting Nicholas Miell ([email protected]):
> > > OTOH, everybody seems to have moved from capability-based security
> > > models on to TE/RBAC-based security models, so maybe this isn't worth
> > > the effort?
> >
> > One day perhaps, but that day isn't here yet. People are still using
> > setuid (see /sbin/passwd), so obviously they're not sufficiently
> > comfortable using *only* TE/RBAC.
>
> The hard part of capabilities isn't the kernel mechanism - it is the

I didn't claim to be doing the hard part :)

> proper assignment and management of the capability bits on files, and
> teaching userland that uid 0 is no longer magic.

Of course setuid still works, so it doesn't need to be done all at once.

> Which is all work that
> is already well underway for SELinux, but you would have to replicate it
> for capabilities. And since there is no notion of equivalence classes
> ala SELinux types and the "policy" is completely distributed throughout
> the filesystem state, management is going to be even more painful for
> the capabilities.

But since file capabilities cannot survive an exec, analysis with a gui
which walks the fs could be pretty simple.

> On the kernel side, in addition to updating the bprm_secureexec logic,
> you would need to consider whether the capability module needs to
> implement capability comparisons for the other hooks, like task_kill.
> At present, many operations only involve uid comparisons and SELinux
> checks without explicitly comparing capability sets. Properly isolating
> and protecting processes with different capability sets but the same uid
> is something SELinux already can do (based on domain), whereas the
> existing capability module doesn't really provide that.

Very good point. Preventing communication channels i.e. through signals
isn't a concern, but user hallyn ptracing himself running /bin/passwd
certainly is.

Thanks.

-serge

2006-08-16 13:18:18

by Stephen Smalley

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Tue, 2006-08-15 at 21:42 -0500, Serge E. Hallyn wrote:
> Quoting Stephen Smalley ([email protected]):
> > On Tue, 2006-08-15 at 06:49 -0500, Serge E. Hallyn wrote:
> > > Quoting Nicholas Miell ([email protected]):
> > > > OTOH, everybody seems to have moved from capability-based security
> > > > models on to TE/RBAC-based security models, so maybe this isn't worth
> > > > the effort?
> > >
> > > One day perhaps, but that day isn't here yet. People are still using
> > > setuid (see /sbin/passwd), so obviously they're not sufficiently
> > > comfortable using *only* TE/RBAC.
> >
> > The hard part of capabilities isn't the kernel mechanism - it is the
>
> I didn't claim to be doing the hard part :)

Yes, but the question is whether anyone will do the hard part. And
whether it is worth doing. And whether you make the system unsafe along
the way, particularly given the permissive nature of capabilities.

> But since file capabilities cannot survive an exec, analysis with a gui
> which walks the fs could be pretty simple.

Except that people actually want them to be inheritable (under certain
conditions), just in a way that avoids the problems encountered in the
past. If you start on the path of making capabilities useful, you'll
have to tackle that as well.

> > On the kernel side, in addition to updating the bprm_secureexec logic,
> > you would need to consider whether the capability module needs to
> > implement capability comparisons for the other hooks, like task_kill.
> > At present, many operations only involve uid comparisons and SELinux
> > checks without explicitly comparing capability sets. Properly isolating
> > and protecting processes with different capability sets but the same uid
> > is something SELinux already can do (based on domain), whereas the
> > existing capability module doesn't really provide that.
>
> Very good point. Preventing communication channels i.e. through signals
> isn't a concern, but user hallyn ptracing himself running /bin/passwd
> certainly is.

Actually, ptrace already performs a capability comparison (cap_ptrace).
Wrt signals, it wasn't the communication channel that concerned me but
the ability to interfere with the operation of a process running in the
same uid but different capabilities, like stopping it at a critical
point. Likewise with many other task hooks - you wouldn't want to be
able to depress the priority of a process running with greater
capabilities.

One other point to consider is Solaris seems to have diverged from their
own past approaches for privileges/capabilities,
http://blogs.sun.com/casper/20040722
http://www.opensolaris.org/os/community/security/library/howto/privbracket/

Doesn't sound like they are even using file capabilities at all.

Also, think about the real benefits of capabilities, at least as defined
in Linux. The coarse granularity and the lack of any per-object support
is a fairly significant deficiency there that is much better handled via
TE. At least some of the Linux capabilities lend themselves to easy
privilege escalation to gaining other capabilities or effectively
bypassing them.

--
Stephen Smalley
National Security Agency

2006-08-17 12:01:06

by Joshua Brindle

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Stephen Smalley wrote:
> On Tue, 2006-08-15 at 21:42 -0500, Serge E. Hallyn wrote:
>
> <snip>
>> Very good point. Preventing communication channels i.e. through signals
>> isn't a concern, but user hallyn ptracing himself running /bin/passwd
>> certainly is.
>>
>
> Actually, ptrace already performs a capability comparison (cap_ptrace).
> Wrt signals, it wasn't the communication channel that concerned me but
> the ability to interfere with the operation of a process running in the
> same uid but different capabilities, like stopping it at a critical
> point. Likewise with many other task hooks - you wouldn't want to be
> able to depress the priority of a process running with greater
> capabilities.
>
>
On this point, what about environment tampering of processes with caps?
LD_PRELOAD=my_bad_lib.so /usr/bin/passwd. glibc atsecure logic would
have to be updated to do a capability comparison.

> One other point to consider is Solaris seems to have diverged from their
> own past approaches for privileges/capabilities,
> http://blogs.sun.com/casper/20040722
> http://www.opensolaris.org/os/community/security/library/howto/privbracket/
>
> Doesn't sound like they are even using file capabilities at all.
>
> Also, think about the real benefits of capabilities, at least as defined
> in Linux. The coarse granularity and the lack of any per-object support
> is a fairly significant deficiency there that is much better handled via
> TE. At least some of the Linux capabilities lend themselves to easy
> privilege escalation to gaining other capabilities or effectively
> bypassing them

2006-08-17 12:26:20

by Stephen Smalley

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Thu, 2006-08-17 at 08:00 -0400, Joshua Brindle wrote:
> Stephen Smalley wrote:
> > On Tue, 2006-08-15 at 21:42 -0500, Serge E. Hallyn wrote:
> >
> > <snip>
> >> Very good point. Preventing communication channels i.e. through signals
> >> isn't a concern, but user hallyn ptracing himself running /bin/passwd
> >> certainly is.
> >>
> >
> > Actually, ptrace already performs a capability comparison (cap_ptrace).
> > Wrt signals, it wasn't the communication channel that concerned me but
> > the ability to interfere with the operation of a process running in the
> > same uid but different capabilities, like stopping it at a critical
> > point. Likewise with many other task hooks - you wouldn't want to be
> > able to depress the priority of a process running with greater
> > capabilities.
> >
> >
> On this point, what about environment tampering of processes with caps?
> LD_PRELOAD=my_bad_lib.so /usr/bin/passwd. glibc atsecure logic would
> have to be updated to do a capability comparison.

That's the bprm_secureexec logic change that has already been mentioned;
that determines the AT_SECURE value, and glibc then just acts based on
that value provided by the kernel. Just a matter of extending
cap_bprm_secureexec to compare the capability sets. Already on Serge's
todo list, but it is necessary for this to be a safe change, and should
happen before this patch goes anywhere (even -mm), IMHO.

--
Stephen Smalley
National Security Agency

2006-08-19 02:02:35

by Crispin Cowan

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Nicholas Miell wrote:
> OTOH, everybody seems to have moved from capability-based security
> models on to TE/RBAC-based security models, so maybe this isn't worth
> the effort?
>
TE, RBAC, AppArmor, and POSIX.1e Capabilities are all capability-based
systems, in that they all store the security attributes in the principal
(process, program, whatever) rather than the object (the files being
accessed). The difference is in the style of specifying the principals
and objects.

Crispin

--
Crispin Cowan, Ph.D. http://crispincowan.com/~crispin/
Director of Software Engineering, Novell http://novell.com
Hack: adroit engineering solution to an unanticipated problem
Hacker: one who is adroit at pounding round pegs into square holes

2006-08-19 02:02:50

by Crispin Cowan

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Stephen Smalley wrote:
> On Tue, 2006-08-15 at 21:42 -0500, Serge E. Hallyn wrote:
>
>> But since file capabilities cannot survive an exec, analysis with a gui
>> which walks the fs could be pretty simple.
>>
> Except that people actually want them to be inheritable (under certain
> conditions), just in a way that avoids the problems encountered in the
> past. If you start on the path of making capabilities useful, you'll
> have to tackle that as well.
>
Stephen makes a good point here.

Have you looked at how AppArmor handles POSIX Capabilities? It has these
advantages:

* Whether a capability is inheritable or not is specified by policy,
addressing Stephan's' point.
* Does not depend on extended attributes, so becomes filesystem
independent.
* As I understand it, it resembles the Solaris "process rights"
mechanism, and so (as Albert Cahalan suggested) will be less
surprising to Solaris users transitioning to Linux.

Currently it is implemented as part of the AppArmor module, but I could
imagine it being busted out into a separate module. The main thing I
would wonder about is some kind of API so that policy engines like
AppArmor and SELinux could manipulate the POSIX capabilities.

> Actually, ptrace already performs a capability comparison (cap_ptrace).
> Wrt signals, it wasn't the communication channel that concerned me but
> the ability to interfere with the operation of a process running in the
> same uid but different capabilities, like stopping it at a critical
> point. Likewise with many other task hooks - you wouldn't want to be
> able to depress the priority of a process running with greater
> capabilities.
>
Is that a property of Serge's module? Or just a property of the basic
crappyness of the POSIX Capabilities idea in the first place?

> Also, think about the real benefits of capabilities, at least as defined
> in Linux. The coarse granularity and the lack of any per-object support
> is a fairly significant deficiency there that is much better handled via
> TE.
Only if the user wants to buy all the way into TE. Making POSIX
Capabilities, TE, and AppArmor composeable choices seems like a good
goal. The question is whether POSIX Capabilities on their own are worth
while. But consider:

* They are already there on their own, pulling POSIX Capabilities
out seems like a non-option because too much already uses them.
* They are nearly useless without some kind of management interface.
Adding a decent management interface can only make it better.

Serge has proposed a reasonable model. I would like to suggest that
people, especially Serge, consider the AppArmor model as well before
deciding.

To quickly summarize the AppArmor model, you have an external policy
file that says that e.g. /usr/local/foo can have net_bind_service and
ipc_lock. This is a bit mask overlaid on top of whatever capabilities
the process already has, e.g. because it is UID 0 it has all of them. So
if someone runs /usr/local/foo as an unprivileged user, it has no
capabilities, and the bitmask does nothing. If someone runs
/usr/local/foo as root, then instead of all 32 capabilities, they get
only those 2.

> At least some of the Linux capabilities lend themselves to easy
> privilege escalation to gaining other capabilities or effectively
> bypassing them.
>
Certainly; cap_sys_admin effectively gives you ownership of the machine.
But that is fundamental to the POSIX Capabilities model, and not
something that Serge can change.

Crispin

--
Crispin Cowan, Ph.D. http://crispincowan.com/~crispin/
Director of Software Engineering, Novell http://novell.com
Hack: adroit engineering solution to an unanticipated problem
Hacker: one who is adroit at pounding round pegs into square holes

2006-08-19 17:08:36

by Casey Schaufler

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities



--- Crispin Cowan <[email protected]> wrote:

> > At least some of the Linux capabilities lend
> themselves to easy
> > privilege escalation to gaining other capabilities
> or effectively
> > bypassing them.
> >
> Certainly; cap_sys_admin effectively gives you
> ownership of the machine.
> But that is fundamental to the POSIX Capabilities
> model, and not
> something that Serge can change.

In turn it is fundamental to the curious
granularity of privileged operations in Unix.

I maintain that the real value in the POSIX
capability model derives from seperating the
permission to violate policy from the UID.

Granularity of such privilege is a bonus, and
a matter of considerable debate. DGUX ended
up with over 330 distinct capabilities, while
Irix had (last I looked) 24, and Solaris
came in somewhere between. All these systems
work.


Casey Schaufler
[email protected]

2006-08-21 20:37:14

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Stephen Smalley ([email protected]):
> On Thu, 2006-08-17 at 08:00 -0400, Joshua Brindle wrote:
> > Stephen Smalley wrote:
> > > On Tue, 2006-08-15 at 21:42 -0500, Serge E. Hallyn wrote:
> > >
> > > <snip>
> > >> Very good point. Preventing communication channels i.e. through signals
> > >> isn't a concern, but user hallyn ptracing himself running /bin/passwd
> > >> certainly is.
> > >>
> > >
> > > Actually, ptrace already performs a capability comparison (cap_ptrace).
> > > Wrt signals, it wasn't the communication channel that concerned me but
> > > the ability to interfere with the operation of a process running in the
> > > same uid but different capabilities, like stopping it at a critical
> > > point. Likewise with many other task hooks - you wouldn't want to be
> > > able to depress the priority of a process running with greater
> > > capabilities.
> > >
> > >
> > On this point, what about environment tampering of processes with caps?
> > LD_PRELOAD=my_bad_lib.so /usr/bin/passwd. glibc atsecure logic would
> > have to be updated to do a capability comparison.
>
> That's the bprm_secureexec logic change that has already been mentioned;
> that determines the AT_SECURE value, and glibc then just acts based on
> that value provided by the kernel. Just a matter of extending
> cap_bprm_secureexec to compare the capability sets. Already on Serge's
> todo list, but it is necessary for this to be a safe change, and should
> happen before this patch goes anywhere (even -mm), IMHO.

Does the following seem a reasonable way to handle secureexec in the
face of file caps?

Seems like it should work so long as we don't have to worry about the
bprm changing partway through exec. This should avoid the need to store
the old capabilities.

Unless I've got my head screwed on completely backward...

thanks,
-serge

(note - Still need at least a signals patch on top of this before the
file caps could be safe for use)

>From e791e196cda84743e805167c11e424f4132aa2a4 Mon Sep 17 00:00:00 2001
From: Serge E. Hallyn <[email protected]>
Date: Mon, 21 Aug 2006 14:54:17 -0500
Subject: [PATCH 1/1] security: handle secureexec with filesystem capabilities

A secure exec is required if euid!=uid, and, correspondingly,
if the executable's file capability set was not empty and the
process is not owned by (real uid) root.

Signed-off-by: Serge E. Hallyn <[email protected]>
---
security/commoncap.c | 14 +++++++++-----
1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/security/commoncap.c b/security/commoncap.c
index 6bf030d..b1777a9 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -265,11 +265,15 @@ void cap_bprm_apply_creds (struct linux_

int cap_bprm_secureexec (struct linux_binprm *bprm)
{
- /* If/when this module is enhanced to incorporate capability
- bits on files, the test below should be extended to also perform a
- test between the old and new capability sets. For now,
- it simply preserves the legacy decision algorithm used by
- the old userland. */
+ if (current->uid != 0) {
+ if (!cap_isclear(bprm->cap_effective))
+ return 1;
+ if (!cap_isclear(bprm->cap_permitted))
+ return 1;
+ if (!cap_isclear(bprm->cap_inheritable))
+ return 1;
+ }
+
return (current->euid != current->uid ||
current->egid != current->gid);
}
--
1.4.2

2006-08-22 02:51:03

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Crispin Cowan ([email protected]):
> Stephen Smalley wrote:
> > Also, think about the real benefits of capabilities, at least as defined
> > in Linux. The coarse granularity and the lack of any per-object support
> > is a fairly significant deficiency there that is much better handled via
> > TE.
> Only if the user wants to buy all the way into TE. Making POSIX
> Capabilities, TE, and AppArmor composeable choices seems like a good
> goal. The question is whether POSIX Capabilities on their own are worth
> while. But consider:
>
> * They are already there on their own, pulling POSIX Capabilities
> out seems like a non-option because too much already uses them.
> * They are nearly useless without some kind of management interface.
> Adding a decent management interface can only make it better.
>
> Serge has proposed a reasonable model. I would like to suggest that
> people, especially Serge, consider the AppArmor model as well before
> deciding.

So far this is not deciding on anything, just trying to follow the
partially implemented draft to it's specified and logical conclusion.
It may well be that it will turn out to just not be manageable, safe, or
useful, or none of the three.

> To quickly summarize the AppArmor model, you have an external policy

Does this stack with the capability module, or do you use purely your
own logic?

> file that says that e.g. /usr/local/foo can have net_bind_service and
> ipc_lock. This is a bit mask overlaid on top of whatever capabilities
> the process already has, e.g. because it is UID 0 it has all of them. So
> if someone runs /usr/local/foo as an unprivileged user, it has no
> capabilities, and the bitmask does nothing. If someone runs
> /usr/local/foo as root, then instead of all 32 capabilities, they get
> only those 2.

Can't do that with the fs capabilities.

But, the fs caps aren't intended to be an alternative to a policy-basd
system. What I like about them is simply that instead of making a
binary setuid 0, and expecting it to give up the caps it doesn't need,
it can be given just the caps it needs right off the bat.

The apparmor and selinux policies would be complementary and useful as
ever on top of those, just as they currently are on top of setuid.

> > At least some of the Linux capabilities lend themselves to easy
> > privilege escalation to gaining other capabilities or effectively
> > bypassing them.
> >
> Certainly; cap_sys_admin effectively gives you ownership of the machine.
> But that is fundamental to the POSIX Capabilities model, and not
> something that Serge can change.

Yup, sigh...

A better split of the caps might be more useful than fs caps
themselves...

-serge

2006-08-22 03:19:14

by Seth Arnold

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Mon, Aug 21, 2006 at 09:50:36PM -0500, Serge E. Hallyn wrote:
> > To quickly summarize the AppArmor model, you have an external policy
>
> Does this stack with the capability module, or do you use purely your
> own logic?

We link against the commoncap facility introduced by Bert Hubert, to
provide 'standard' capabilities support; we simply add another check at
capable() time to _also_ check the capability against the list allowed
in the current profile.

> But, the fs caps aren't intended to be an alternative to a policy-basd
> system. What I like about them is simply that instead of making a
> binary setuid 0, and expecting it to give up the caps it doesn't need,
> it can be given just the caps it needs right off the bat.
>
> The apparmor and selinux policies would be complementary and useful as
> ever on top of those, just as they currently are on top of setuid.

Seems like a great idea for e.g. binding to low ports, chroot, and
changing users for e.g. password changing. The other 24-26 capabilities
may be less useful. :) Still, I agree, complementary, and hopefully a
mechanism such as this proposed mechanism would help drag capabilities
out of the dark ages.

Thanks Serge


Attachments:
(No filename) (1.17 kB)
(No filename) (189.00 B)
Download all attachments

2006-08-28 21:39:16

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

The following patch addresses remaining LSM hooks which need to be
handled in the face of file capabilities. This is on top of the
cap_bprm_secureexec patch in the email I had replied to.

-serge

>From 7ee4dafddb85d6ccf34ddc50e2a44b244bb4b6b3 Mon Sep 17 00:00:00 2001
From: Serge E. Hallyn <[email protected]>
Date: Mon, 28 Aug 2006 14:39:57 -0500
Subject: [PATCH 3/3] fscaps: define task_kill and other new capchecks

task_setrlimit, task_prctl, and task_movememory are not a problem
with fscaps. This patch defines task_setscheduler, task_setioprio,
cap_task_kill, and task_setnice to make sure a user cannot affect a
process in which they called a program with some fscaps.

One remaining question is the note under task_setscheduler: are we
ok with CAP_SYS_NICE being sufficient to confine a process to a
cpuset?

It is a semantic change, as without fsccaps, attach_task doesn't
allow CAP_SYS_NICE to override the uid equivalence check. But since
it uses security_task_setscheduler, which elsewhere is used where
CAP_SYS_NICE can be used to override the uid equivalence check,
fixing it might be tough.

task_setscheduler
need to check for !capable(CAP_SYS_NICE) && current->euid==p->euid
&& !cap_subset(current->caps, p->caps) where p->caps are not
subset of current->caps
note: this also controls cpuset:attach_task. Are we ok with
CAP_SYS_NICE being used to confine to a cpuset?
task_setioprio
Same check as task_setscheduler
task_setnice
sys_setpriority uses this (through set_one_prio) for another
process. Need same checks as setrlimit

no need:
task_setrlimit
affect only calling task
task_prctl (set_dumpable, set_keepcaps)
affect only calling task
task_movememory
CAP_SYS_NICE is already checked

Signed-off-by: Serge E. Hallyn <[email protected]>
---
include/linux/security.h | 10 +++++--
security/capability.c | 4 +++
security/commoncap.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index f753038..7a99930 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -51,6 +51,10 @@ extern int cap_inode_setxattr(struct den
extern int cap_inode_removexattr(struct dentry *dentry, char *name);
extern int cap_task_post_setuid (uid_t old_ruid, uid_t old_euid, uid_t old_suid, int flags);
extern void cap_task_reparent_to_init (struct task_struct *p);
+extern int cap_task_kill(struct task_struct *p, struct siginfo *info, int sig, u32 secid);
+extern int cap_task_setscheduler (struct task_struct *p, int policy, struct sched_param *lp);
+extern int cap_task_setioprio (struct task_struct *p, int ioprio);
+extern int cap_task_setnice (struct task_struct *p, int nice);
extern int cap_syslog (int type);
extern int cap_vm_enough_memory (long pages);

@@ -2522,12 +2526,12 @@ static inline int security_task_setgroup

static inline int security_task_setnice (struct task_struct *p, int nice)
{
- return 0;
+ return cap_task_setnice(p, nice);
}

static inline int security_task_setioprio (struct task_struct *p, int ioprio)
{
- return 0;
+ return cap_task_setioprio(p, ioprio);
}

static inline int security_task_getioprio (struct task_struct *p)
@@ -2562,7 +2566,7 @@ static inline int security_task_kill (st
struct siginfo *info, int sig,
u32 secid)
{
- return 0;
+ return cap_task_kill(p, info, sig, secid);
}

static inline int security_task_wait (struct task_struct *p)
diff --git a/security/capability.c b/security/capability.c
index b868e7e..14cb592 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -40,6 +40,10 @@ static struct security_operations capabi
.inode_setxattr = cap_inode_setxattr,
.inode_removexattr = cap_inode_removexattr,

+ .task_kill = cap_task_kill,
+ .task_setscheduler = cap_task_setscheduler,
+ .task_setioprio = cap_task_setioprio,
+ .task_setnice = cap_task_setnice,
.task_post_setuid = cap_task_post_setuid,
.task_reparent_to_init = cap_task_reparent_to_init,

diff --git a/security/commoncap.c b/security/commoncap.c
index b1777a9..82bd59f 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -387,6 +387,64 @@ int cap_task_post_setuid (uid_t old_ruid
return 0;
}

+/*
+ * Rationale: code calling task_setscheduler, task_setioprio, and
+ * task_setnice, assumes that
+ * . if capable(cap_sys_nice), then those actions should be allowed
+ * . if not capable(cap_sys_nice), but acting on your own processes,
+ * then those actions should be allowed
+ * This is insufficient now since you can call code without suid, but
+ * yet with increased caps.
+ * So we check for increased caps on the target process.
+ */
+static inline int cap_safe_nice(struct task_struct *p)
+{
+ if (!cap_issubset(p->cap_permitted, current->cap_permitted) &&
+ !__capable(current, CAP_SYS_NICE))
+ return -EPERM;
+ return 0;
+}
+
+int cap_task_setscheduler (struct task_struct *p, int policy,
+ struct sched_param *lp)
+{
+ return cap_safe_nice(p);
+}
+
+int cap_task_setioprio (struct task_struct *p, int ioprio)
+{
+ return cap_safe_nice(p);
+}
+
+int cap_task_setnice (struct task_struct *p, int nice)
+{
+ return cap_safe_nice(p);
+}
+
+int cap_task_kill(struct task_struct *p, struct siginfo *info,
+ int sig, u32 secid)
+{
+ if (info != SEND_SIG_NOINFO && (is_si_special(info) || SI_FROMKERNEL(info)))
+ return 0;
+
+ if (secid)
+ /*
+ * Signal sent as a particular user.
+ * Capabilities are ignored. May be wrong, but it's the
+ * only thing we can do at the moment.
+ * Used only by usb drivers?
+ */
+ return 0;
+ if (current->uid == 0 || current->euid == 0)
+ return 0;
+ if (capable(CAP_KILL))
+ return 0;
+ if (cap_issubset(p->cap_permitted, current->cap_permitted))
+ return 0;
+
+ return -EPERM;
+}
+
void cap_task_reparent_to_init (struct task_struct *p)
{
p->cap_effective = CAP_INIT_EFF_SET;
@@ -424,6 +482,10 @@ EXPORT_SYMBOL(cap_bprm_secureexec);
EXPORT_SYMBOL(cap_inode_setxattr);
EXPORT_SYMBOL(cap_inode_removexattr);
EXPORT_SYMBOL(cap_task_post_setuid);
+EXPORT_SYMBOL(cap_task_kill);
+EXPORT_SYMBOL(cap_task_setscheduler);
+EXPORT_SYMBOL(cap_task_setioprio);
+EXPORT_SYMBOL(cap_task_setnice);
EXPORT_SYMBOL(cap_task_reparent_to_init);
EXPORT_SYMBOL(cap_syslog);
EXPORT_SYMBOL(cap_vm_enough_memory);
--
1.4.2

2006-08-29 18:37:15

by Seth Arnold

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

On Mon, Aug 28, 2006 at 04:39:12PM -0500, Serge E. Hallyn wrote:
> +int cap_task_kill(struct task_struct *p, struct siginfo *info,
> + int sig, u32 secid)
> +{
> + if (info != SEND_SIG_NOINFO && (is_si_special(info) || SI_FROMKERNEL(info)))
> + return 0;
> +
> + if (secid)
> + /*
> + * Signal sent as a particular user.
> + * Capabilities are ignored. May be wrong, but it's the
> + * only thing we can do at the moment.
> + * Used only by usb drivers?
> + */
> + return 0;
> + if (current->uid == 0 || current->euid == 0)
> + return 0;

uid/euid checks feel out of place in the capabilities code.

> + if (capable(CAP_KILL))
> + return 0;
> + if (cap_issubset(p->cap_permitted, current->cap_permitted))
> + return 0;
> +
> + return -EPERM;
> +}

Thanks Serge


Attachments:
(No filename) (781.00 B)
(No filename) (189.00 B)
Download all attachments

2006-08-29 19:58:18

by Serge E. Hallyn

[permalink] [raw]
Subject: Re: [RFC] [PATCH] file posix capabilities

Quoting Seth Arnold ([email protected]):
> On Mon, Aug 28, 2006 at 04:39:12PM -0500, Serge E. Hallyn wrote:
> > +int cap_task_kill(struct task_struct *p, struct siginfo *info,
> > + int sig, u32 secid)
> > +{
> > + if (info != SEND_SIG_NOINFO && (is_si_special(info) || SI_FROMKERNEL(info)))
> > + return 0;
> > +
> > + if (secid)
> > + /*
> > + * Signal sent as a particular user.
> > + * Capabilities are ignored. May be wrong, but it's the
> > + * only thing we can do at the moment.
> > + * Used only by usb drivers?
> > + */
> > + return 0;
> > + if (current->uid == 0 || current->euid == 0)
> > + return 0;
>
> uid/euid checks feel out of place in the capabilities code.

Wow, good point.

Will fix.

thanks,
-serge