Message-ID: <54CFB9B8.8020701@schaufler-ca.com>
Date: Mon, 02 Feb 2015 09:54:00 -0800
From: Casey Schaufler
To: Christoph Lameter, Serge Hallyn
CC: Andy Lutomirski, Jonathan Corbet, Aaron Jones, "Ted Ts'o",
    linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org,
    akpm@linuxfoundation.org, Casey Schaufler
Subject: Re: [capabilities] Allow normal inheritance for a configurable set of capabilities

On 2/2/2015 8:21 AM, Christoph Lameter wrote:
> Linux capabilities suffer from the problem that they are not inheritable
> like regular process characteristics under Unix. This is behavior that
> is counterintuitive to the expected behavior of processes in Unix.

http://wt.tuxomania.net/publications/posix.1e/download.html

The POSIX* capability scheme is the simplest mechanism we could
come up with that allows existing setuid programs to work unmodified
and still make it possible to constrain specific capabilities.

Is it complicated? Yes. Why is it complicated? Because you need the
option of using the file capabilities to raise and lower the privilege
of a program. Had we the option of requiring the programs to do that
themselves, the whole thing would have been easier. You also need the
option of having a capability aware program manipulate its own
capabilities.

All the UNIX systems that implemented capabilities did so using one
variant or another of the POSIX scheme. One, Trusted IRIX, successfully
eliminated root privilege.

---
Disclaimer: The DRAFT is withdrawn. There is no standard.
You can't claim conformance.

> In particular there has recently been software that controls NICs from user
> space and provides IP stack like behavior also in user space (DPDK and RDMA
> kernel API based implementations). Those typically need either capabilities
> to allow raw network access or have to be run setuid. There is scripting and
> LD_PRELOAD etc involved, arbitrary binaries may be run from those scripts.

You're getting into pretty sketchy territory using that kind of
a programming model in a security enforcing environment. Kids!

> That does not go well with having file capabilities set that would enable
> the capabilities. Maybe it would work if one would set up capabilities on
> all executables but that would also defeat a secure design since these
> binaries may only need those caps for certain situations. Ok setting the
> inheritable flags on everything may also get one there (if there would not
> be the issues with LD_PRELOAD, debugging etc etc).

It *can* be done with the combination of inheritable, permitted and
effective capabilities. I admit it's not convenient. If you're willing
to modify your code to handle dropping capabilities you can simplify
the file based configuration significantly.

As for debugging, that's always a security nightmare.

> The easy solution is that capabilities need to be inherited like setuid
> is. We really prefer to use capabilities instead of setuid (we want to
> limit what damage someone can do after all!). Therefore we have been
> running a patch like this in production for the last 6 years. At some
> point it becomes tedious to run your own custom kernel so we would like
> to have this functionality upstream.
>
> See some of the earlier related discussions on the problems with capability
> inheritance:
>
> 0. Recent surprise:
> https://lkml.org/lkml/2014/1/21/175
>
> 1. Attempt to revise caps
> http://www.madore.org/~david/linux/newcaps/
>
> 2. Problems of passing caps through exec
> http://unix.stackexchange.com/questions/128394/passing-capabilities-through-exec
>
> 3. Problems of binding to privileged ports
> http://stackoverflow.com/questions/413807/is-there-a-way-for-non-root-processes-to-bind-to-privileged-ports-1024-on-l
>
> 4. Reviving capabilities
> http://lwn.net/Articles/199004/
>
>
>
> There does not seem to be an alternative on the horizon. Some involved
> in security development under Linux have even stated that they want to
> rip out the whole thing and replace it. It's been a couple of years now
> and we are still suffering from the capabilities mess. Let us just
> fix it.

I'm game to participate in such an effort. The POSIX scheme
is workable, but given that it's 20 years old and hasn't
developed real traction it's hard to call it successful.

> This patch does not change the default behavior but it allows setting up
> a list of capabilities in the proc filesystem that will enable regular
> unix inheritance only for the selected group of capabilities.
>
> With that it is then possible to do something trivial like setting
> CAP_NET_RAW on an executable that can then allow that capability to
> be inherited by others.
>
> e.g
>
> echo 12,13,23 >/proc/sys/kernel/cap_inheritable
>
> Allows the inheritance of CAP_SYS_NICE, CAP_NET_RAW and CAP_NET_ADMIN.
> With that device raw access is possible and also real time priorities
> can be set from user space. This is a frequently needed set of
> privileged operations in HPC and HFT applications. User space
> processes need to be able to directly access devices as well as
> have full control over scheduling.
>
> Setting capabilities on an executable is not always possible if
> for example LD_PRELOAD or other things also have to be used. In that
> case it is possible to build a classic wrapper after applying this
> patch that sets up the proper privileges for running processes
> that need these.
>
> I usually do not dabble in security and I am not sure if this is
> done correctly. If someone has a better solution then please tell
> me but so far we have not seen anything else that actually works.
> This keeps on coming up in various contexts and we need the issue
> fixed!
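
For readers checking the numbers in the quoted example: 12, 13 and 23
are CAP_NET_ADMIN, CAP_NET_RAW and CAP_SYS_NICE in <linux/capability.h>.
The following is a minimal userspace sketch of how such a list maps to a
bitmask. It is not the kernel's proc_do_large_bitmap handler; the helper
names cap_list_to_mask and mask_has_cap are made up for illustration, and
the sketch assumes a plain comma-separated list with no trailing newline.

```c
#include <stdint.h>
#include <stdlib.h>

/* Capability numbers as defined in <linux/capability.h>. */
#define CAP_NET_ADMIN 12
#define CAP_NET_RAW   13
#define CAP_SYS_NICE  23

/*
 * Hypothetical helper (not part of the patch): turn a comma-separated
 * list of capability numbers, such as "12,13,23", into a 64-bit mask
 * with one bit per capability. Returns 0 on success, -1 on a malformed
 * or out-of-range entry.
 */
int cap_list_to_mask(const char *list, uint64_t *mask)
{
	uint64_t m = 0;
	const char *p = list;

	for (;;) {
		char *end;
		unsigned long n = strtoul(p, &end, 10);

		if (end == p || n > 63)
			return -1;	/* no digits, or bit does not fit */
		m |= UINT64_C(1) << n;
		if (*end == '\0')
			break;
		if (*end != ',')
			return -1;	/* unexpected separator */
		p = end + 1;
	}
	*mask = m;
	return 0;
}

/* Hypothetical helper: test whether capability number cap is in mask. */
int mask_has_cap(uint64_t mask, int cap)
{
	return cap >= 0 && cap < 64 && ((mask >> cap) & 1);
}
```

With the list from the example, "12,13,23", the resulting mask is
0x803000 (bits 12, 13 and 23 set).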
>
> Signed-off-by: Christoph Lameter
>
> Index: linux/include/linux/capability.h
> ===================================================================
> --- linux.orig/include/linux/capability.h
> +++ linux/include/linux/capability.h
> @@ -44,6 +44,7 @@ struct user_namespace *current_user_ns(v
>
>  extern const kernel_cap_t __cap_empty_set;
>  extern const kernel_cap_t __cap_init_eff_set;
> +extern const unsigned long *sysctl_cap_inheritable;
>
>  /*
>   * Internal kernel functions only
> Index: linux/kernel/capability.c
> ===================================================================
> --- linux.orig/kernel/capability.c
> +++ linux/kernel/capability.c
> @@ -26,6 +26,16 @@
>  const kernel_cap_t __cap_empty_set = CAP_EMPTY_SET;
>  EXPORT_SYMBOL(__cap_empty_set);
>
> +/*
> + * Allow inheritance with typical unix semantics for capabilities.
> + * This means that the inheritable flag can be omitted on the file
> + * that inherits the capabilities. Capabilities will be passed down
> + * via exec like other process characteristics. This is the behavior
> + * sysadmins expect.
> + */
> +static unsigned long cap_inheritable[BITS_TO_LONGS(CAP_LAST_CAP)];
> +const unsigned long *sysctl_cap_inheritable = cap_inheritable;
> +
>  int file_caps_enabled = 1;
>
>  static int __init file_caps_disable(char *str)
> Index: linux/kernel/sysctl.c
> ===================================================================
> --- linux.orig/kernel/sysctl.c
> +++ linux/kernel/sysctl.c
> @@ -840,6 +840,14 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0444,
>  		.proc_handler	= proc_dointvec,
>  	},
> +	{
> +		.procname	= "cap_inheritable",
> +		.data		= &sysctl_cap_inheritable,
> +		.maxlen		= CAP_LAST_CAP,
> +		.mode		= 0644,
> +		.proc_handler	= proc_do_large_bitmap,
> +	},
> +
>  #if defined(CONFIG_LOCKUP_DETECTOR)
>  	{
>  		.procname	= "watchdog",
> Index: linux/security/commoncap.c
> ===================================================================
> --- linux.orig/security/commoncap.c
> +++ linux/security/commoncap.c
> @@ -437,6 +437,9 @@ static int get_file_caps(struct linux_bi
>  	struct dentry *dentry;
>  	int rc = 0;
>  	struct cpu_vfs_cap_data vcaps;
> +	kernel_cap_t inherit = CAP_EMPTY_SET;
> +	bool does_inherit = false;
> +	int i;
>
>  	bprm_clear_caps(bprm);
>
> @@ -446,6 +449,17 @@ static int get_file_caps(struct linux_bi
>  	if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)
>  		return 0;
>
> +	/*
> +	 * Figure out if any capabilities are inheritable without
> +	 * setting bits in the target file.
> +	 */
> +	for_each_set_bit(i, sysctl_cap_inheritable, CAP_LAST_CAP)
> +		if (capable(i)) {
> +			cap_raise(inherit, i);
> +			does_inherit = true;
> +		}
> +
> +
>  	dentry = dget(bprm->file->f_path.dentry);
>
>  	rc = get_vfs_caps_from_disk(dentry, &vcaps);
> @@ -455,7 +469,8 @@ static int get_file_caps(struct linux_bi
>  				__func__, rc, bprm->filename);
>  		else if (rc == -ENODATA)
>  			rc = 0;
> -		goto out;
> +		if (!does_inherit)
> +			goto out;
>  	}
>
>  	rc = bprm_caps_from_vfs_caps(&vcaps, bprm, effective, has_cap);
> @@ -463,6 +478,15 @@ static int get_file_caps(struct linux_bi
>  		printk(KERN_NOTICE "%s: cap_from_disk returned %d for %s\n",
>  		       __func__, rc, bprm->filename);
>
> +	if (does_inherit) {
> +		struct cred *new = bprm->cred;
> +		/* Add new capabilies from inheritance mask */
> +		new->cap_inheritable = cap_combine(inherit, new->cap_inheritable);
> +		new->cap_permitted = cap_combine(inherit, new->cap_permitted);
> +		*effective = true;
> +		*has_cap = true;
> +	}
> +
>  out:
>  	dput(dentry);
>  	if (rc)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
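
To make the effect of the commoncap.c hunks easier to follow, here is a
rough userspace model of the capability computation at exec time. It is
a sketch under stated assumptions, not kernel code: the types cap_sets
and file_caps and the function model_exec are hypothetical, capability
sets are plain 64-bit masks, capable(i) is approximated as "bit i is in
the caller's effective set", a file without capability xattrs is
modelled as all-empty file sets, and root, setuid and ptrace handling is
left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the capability sets held in struct cred. */
struct cap_sets {
	uint64_t inheritable;
	uint64_t permitted;
	uint64_t effective;
};

/* Simplified stand-in for the capability xattr on the executed file. */
struct file_caps {
	uint64_t inheritable;
	uint64_t permitted;
	bool effective;		/* the file "effective" flag */
};

/*
 * With sysctl_mask == 0 this is the existing rule:
 *   P'(permitted) = (P(inheritable) & F(inheritable)) |
 *                   (F(permitted) & bset)
 * A nonzero sysctl_mask models the quoted patch: every capability that
 * is in the mask and currently effective is added to the new
 * inheritable and permitted sets, and the effective flag is forced on.
 */
struct cap_sets model_exec(struct cap_sets old, struct file_caps f,
			   uint64_t bset, uint64_t sysctl_mask)
{
	struct cap_sets new;
	uint64_t inherit = sysctl_mask & old.effective;	/* ~ capable(i) */
	bool effective = f.effective;

	new.inheritable = old.inheritable;
	new.permitted = (old.inheritable & f.inheritable) |
			(f.permitted & bset);
	if (inherit) {
		new.inheritable |= inherit;
		new.permitted |= inherit;
		effective = true;
	}
	/* Effective is either all of permitted or nothing. */
	new.effective = effective ? new.permitted : 0;
	return new;
}
```

In this model a process holding CAP_NET_RAW as permitted and effective
that executes a file with no file capabilities ends up with empty sets
when the mask is 0, and keeps CAP_NET_RAW in all three sets when bit 13
is in the mask. Forcing the effective flag also means that anything the
file itself puts into the permitted set becomes effective.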