Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752510AbbGVSKw (ORCPT ); Wed, 22 Jul 2015 14:10:52 -0400 Received: from smtp105.biz.mail.bf1.yahoo.com ([98.139.221.43]:26410 "EHLO smtp105.biz.mail.bf1.yahoo.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752394AbbGVSKh (ORCPT ); Wed, 22 Jul 2015 14:10:37 -0400 X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: _sd_pJ8VM1nHriJhAV2HHDjExQBnCCWokYxsJQXa5whzxDn etE1sSpuTJ_KU2xu1NcZ8LMZqeIHKtvbVP07tAC9FRMsBwu.d.hy4LLq_l24 izb5sFsQC.sQ0BqD_ChMTXfksS_M2EOUQYJqGc34GM5rnt5eKr8hzjOTO_cn DCkAfFJKAMMKsMbrtlNQwcnyIWdgKwNxTZxGT4N8L32PznaD7GUsad_zh4PQ oPJ26Woo69fq0v0AlLJPajGi.DoeQ7QgrwZQ2RlMKA_6At32SqJXbYGQKBkd INjCfrd7dJ3oyJ0P7Mwl6RKVct9_Nev0Gv5eQT7l.9ZKvBlB0zLDPYrcsa0D xWwDvKQDpM5WzHRishFbz_8p8jyTJ1xZPnxyoNiqjeyIkigkXEUHj9XcS9nq 9xszVTPI3Olb8gpbL_Nfi1CJa4BvUJe6u7sHfS.rZt4TkwlpBqydRgkyQYJk Bi5u0dxdr7ZkFlo9uiilFSAW5B.AB3crp9NvVZ1mHtyTkHP4Nqkgso6552zT SOI5AqzMmcTnvGO5fkLC7qIyzcXHfvBDCiA-- X-Yahoo-SMTP: OIJXglSswBDfgLtXluJ6wiAYv6_cnw-- Subject: Re: [PATCH 0/7] Initial support for user namespace owned mounts To: Seth Forshee References: <55A7C920.7090206@schaufler-ca.com> <20150716185750.GB51751@ubuntu-hedt> <55A8253E.3000407@schaufler-ca.com> <55A8398A.3000802@schaufler-ca.com> <55A85041.2070301@schaufler-ca.com> <20150721203550.GA80838@ubuntu-hedt> <55AEF75F.9010703@schaufler-ca.com> <20150722155634.GB124342@ubuntu-hedt> Cc: Andy Lutomirski , "Eric W. Biederman" , Alexander Viro , Linux FS Devel , LSM List , SELinux-NSA , Serge Hallyn , "linux-kernel@vger.kernel.org" , Casey Schaufler From: Casey Schaufler Message-ID: <55AFDCA6.10201@schaufler-ca.com> Date: Wed, 22 Jul 2015 11:10:46 -0700 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <20150722155634.GB124342@ubuntu-hedt> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9909 Lines: 245 On 7/22/2015 8:56 AM, Seth Forshee wrote: > On Tue, Jul 21, 2015 at 06:52:31PM -0700, Casey Schaufler wrote: >> On 7/21/2015 1:35 PM, Seth Forshee wrote: >>> On Thu, Jul 16, 2015 at 05:59:22PM -0700, Andy Lutomirski wrote: >>>> On Thu, Jul 16, 2015 at 5:45 PM, Casey Schaufler wrote: >>>>> On 7/16/2015 4:29 PM, Andy Lutomirski wrote: >>>>>> I really don't see the benefit of making up extra rules that apply to >>>>>> users outside a userns who try to access specifically a filesystem >>>>>> with backing store. They wouldn't make sense for filesystems without >>>>>> backing store. >>>>> Sure it would. For Smack, it would be the label a file would be >>>>> created with, which would be the label of the process creating >>>>> the memory based filesystem. For SELinux the rules are more a >>>>> touch more sophisticated, but I'm sure that Paul or Stephen could >>>>> come up with how to determine it. >>>>> >>>>> The point, looping all the way back to the beginning, where we >>>>> were talking about just ignoring the labels on the filesystem, >>>>> is that if you use the same Smack label on the files in the >>>>> filesystem as the backing store file has, we'll all be happy. >>>>> If that label isn't something user can write to, he won't be >>>>> able to write to the mounted objects, either. If there is no >>>>> backing store then use the label of the process creating the >>>>> filesystem, which will be the user, which will mean everything >>>>> will work hunky dory. >>>>> >>>>> Yes, there's work involved, but I doubt there's a lot. Getting >>>>> the label from the backing store or the creating process is >>>>> simple enough. >>>>> >>> So something like the diff below (untested)? >> I think that this is close, and quite good for someone >> who isn't very familiar with Smack. It's definitely headed >> in the right direction. >> >>> All I'm really doing is setting smk_default as you describe above and >>> then using it instead of smk_of_current() in >>> smack_inode_alloc_security() and instead of the label from the disk in >>> smack_d_instantiate(). >> Let's say your backing store is a file labeled Rubble. >> >> mount -o smackfsroot=Rubble,smackfsdef=Rubble ... >> >> It is completely reasonable for a process labeled Flintstone to >> have rwxa access to a file labeled Rubble. >> >> Smack rule: Flintstone Rubble rwxa >> >> In the case of writing to an existing Rubble file, what you >> have looks fine. What's not so great is that if the Flintstone >> process creates a file, it should be labeled Flintstone. Your >> use of the smk_default, which is going to violate the principle >> of least astonishment, and break the Smack policy as well. >> >> Let's make a minor change. Instead of using smackfsroot let's >> use smackfstransmute and a slightly different access rule: >> >> mount -o smackfstransmute=Rubble,smackfsdef=Rubble ... >> >> Smack rule: Flintstone Rubble rwxat >> >> Now the only change we have to make to the Smack code is >> that we don't want to create any files unless either the >> process is labeled Rubble or the rule allowing the creation >> has the "t" for transmute access. That should ensure that >> everything is labeled Rubble. If it isn't, someone has mucked >> with the metadata in a detectable way. > All right, that kind of makes sense, but I'm still missing some pieces. > Questions follow. > >>> diff --git a/include/linux/fs.h b/include/linux/fs.h >>> index 32f598db0b0d..4597420ab933 100644 >>> --- a/include/linux/fs.h >>> +++ b/include/linux/fs.h >>> @@ -1486,6 +1486,10 @@ static inline void sb_start_intwrite(struct super_block *sb) >>> __sb_start_write(sb, SB_FREEZE_FS, true); >>> } >>> >>> +static inline bool sb_in_userns(struct super_block *sb) >>> +{ >>> + return sb->s_user_ns != &init_user_ns; >>> +} >>> >>> extern bool inode_owner_or_capable(const struct inode *inode); >>> >>> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c >>> index a143328f75eb..591fd19294e7 100644 >>> --- a/security/smack/smack_lsm.c >>> +++ b/security/smack/smack_lsm.c >>> @@ -255,6 +255,10 @@ static struct smack_known *smk_fetch(const char *name, struct inode *ip, >>> char *buffer; >>> struct smack_known *skp = NULL; >>> >>> + /* Should never fetch xattrs from untrusted mounts */ >>> + if (WARN_ON(sb_in_userns(ip->i_sb))) >>> + return ERR_PTR(-EPERM); >>> + >> Go ahead and fetch it, we'll check to make sure it's viable later. >> >>> if (ip->i_op->getxattr == NULL) >>> return ERR_PTR(-EOPNOTSUPP); >>> >>> @@ -656,10 +660,14 @@ static int smack_sb_kern_mount(struct super_block *sb, int flags, void *data) >>> */ >>> if (specified) >>> return -EPERM; >>> + >>> /* >>> - * Unprivileged mounts get root and default from the caller. >>> + * User namespace mounts get root and default from the backing >>> + * store, if there is one. Other unprivileged mounts get them >>> + * from the caller. >>> */ >>> - skp = smk_of_current(); >>> + skp = (sb_in_userns(sb) && sb->s_bdev) ? >>> + smk_of_inode(sb->s_bdev->bd_inode) : smk_of_current(); >>> sp->smk_root = skp; >>> sp->smk_default = skp; >> sp->smk_flags |= SMK_INODE_TRANSMUTE; > I assume that you meant skp and not sp here. Actually, neither is correct. You want to set SMK_INODE_TRANSMUTE in the smk_flags field of the root inode. That's easy: transmute = 1; and the code after "Initialize the root inode" will take care of it. >>> } >>> @@ -792,7 +800,12 @@ static int smack_bprm_secureexec(struct linux_binprm *bprm) >>> */ >>> static int smack_inode_alloc_security(struct inode *inode) >>> { >>> - struct smack_known *skp = smk_of_current(); >>> + struct smack_known *skp; >>> + >>> + if (sb_in_userns(inode->i_sb)) >>> + skp = ((struct superblock_smack *)(inode->i_sb->s_security))->smk_default; >>> + else >>> + skp = smk_of_current(); >> This should be left alone. >> smack_inode_init_security is where you could disallow access that doesn't >> legitimately result in a Rubble label on the file. It's something like >> >> ... after the call may = smk_access_entry(...) >> if (sb_in_userns(inode->i_sb)) >> if (skp != dsp && (may & MAY_TRANSMUTE) == 0) >> return -EACCES; > I'm not getting how this covers all cases. > > So we've set the transmute flag on the root inode. Files and directories > created in the root directory get the same label, and directories also > get the transmute attribute. That's all fine. > > What about an existing directory in the filesystem that already has a > Slate label? I'm not getting what happens with this directory, or for > new files created in this directory, which also relates to my other > questions below. > > Also an aside - smk_access_entry looks weird. may is initialized to > -ENOENT, and then rule_list is searched for a rule which matches the > object and subject labels. Presumably it's possible that no rule could > be found, otherwise the prior initialization of may is pointless. If > this happens the following code treats it as though it always contains > access flags even though it might contain -ENOENT. Nothing bad actually > happens with a two's compliement representation of -ENOENT since it will > just set a bit that's already set, but it still seems like it should > have a may > 0 condition, for clarity if for no other reason. My suggested code is just wrong. I wasn't looking at the whole code, only the patch, and got myself confused. Apologies. If we want to go straight for the jugular how about this? I'm assuming that inode->i_sb->s_bdev->bd_inode is the inode of the backing store. static int smack_inode_permission(struct inode *inode, int mask) { struct smk_audit_info ad; int no_block = mask & MAY_NOT_BLOCK; int rc; mask &= (MAY_READ|MAY_WRITE|MAY_EXEC|MAY_APPEND); /* * No permission to check. Existence test. Yup, it's there. */ if (mask == 0) return 0; + if (sb_in_userns(inode->i_sb)) && + smk_of_inode(inode) != smk_of_inode(inode->i_sb->s_bdev->bd_inode)) + return -EACCES; + /* May be droppable after audit */ if (no_block) return -ECHILD; smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_INODE); smk_ad_setfield_u_fs_inode(&ad, inode); rc = smk_curacc(smk_of_inode(inode), mask, &ad); rc = smk_bu_inode(inode, mask, rc); return rc; } > >>> inode->i_security = new_inode_smack(skp); >>> if (inode->i_security == NULL) >>> @@ -3175,6 +3188,11 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode) >>> break; >>> } >>> /* >>> + * Don't use labels from xattrs for unprivileged mounts. >>> + */ >>> + if (sb_in_userns(inode->i_sb)) >>> + break; >>> + /* >> Again, use the label. Just check to make sure it's what you expect. > What happens if it's not what I expect? smack_d_instantiate cannot fail > ... so just use the default label? In that case why bother reading it at > all? Or would we actually want to change the on-disk label if it didn't > match? > >>> * No xattr support means, alas, no SMACK label. >>> * Use the aforeapplied default. >>> * It would be curious if the label of the task >> Also untested. >> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> Please read the FAQ at http://www.tux.org/lkml/ >>> > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/