From: ebiederm@xmission.com (Eric W. Biederman)
To: Amir Goldstein
Cc: Seth Forshee, Alexander Viro, Jeff Layton, "J. Bruce Fields",
    Serge Hallyn, Andy Lutomirski, linux-fsdevel, LSM List, SELinux-NSA,
    linux-kernel, "Theodore Ts'o"
Date: Fri, 31 Jul 2015 09:34:25 -0500
In-Reply-To: (Amir Goldstein's message of "Fri, 31 Jul 2015 11:36:07 +0300")
Message-ID: <87fv44iefi.fsf@x220.int.ebiederm.org>
Subject: Re: [PATCH 1/7] fs: Add user namesapace member to struct super_block

Amir Goldstein writes:

> On Thu, Jul 16, 2015 at 5:47 AM, Eric W. Biederman wrote:
>> Seth Forshee writes:
>>
>>> Initially this will be used to eliminate the implicit MNT_NODEV
>>> flag for mounts from user namespaces. In the future it will also
>>> be used for translating ids and checking capabilities for
>>> filesystems mounted from user namespaces.
>>>
>>> s_user_ns is initialized in alloc_super() and is generally set to
>>> current_user_ns(). To avoid security and corruption issues, two
>>> additional mount checks are also added:
>>>
>>> - do_new_mount() gains a check that the user has CAP_SYS_ADMIN
>>>   in current_user_ns().
>>>
>>> - sget() will fail with EBUSY when the filesystem it's looking
>>>   for is already mounted from another user namespace.
>>>
>>> proc needs some special handling here. The user namespace of
>>> current isn't appropriate when forking as a result of clone(2)
>>> with CLONE_NEWPID|CLONE_NEWUSER, as it will make proc unmountable
>>> from within the new user namespace. Instead, the user namespace
>>> which owns the new pid namespace should be used. sget_userns() is
>>> added to allow passing of a user namespace other than that of
>>> current, and this is used by proc_mount(). sget() becomes a
>>> wrapper around sget_userns() which passes current_user_ns().
>>
>> From bits of the previous conversation.
>>
>> We need sget_userns(..., &init_user_ns) for sysfs. The sysfs
>> xattrs can travel from one mount of sysfs to another via the sysfs
>> backing store.
>>
>> For tmpfs and any other filesystems that we support mounting without
>> privilege and that support xattrs, we need to identify them and
>> see if userspace is taking advantage of the ability to set
>> xattrs and file caps (unlikely). If it is, we need to call
>> sget_userns(..., &init_user_ns) on those filesystems as well.
>>
>> Possibly/Probably we should just do that for all of the interesting
>> filesystems to start with and then change back to an ordinary old sget
>> after we have done the testing and confirmed we will not be introducing
>> userspace regressions.
>
> Eric,
>
> Perhaps it is too soon to discuss here, but how do you envision
> handling of file system private mount options in a user ns?
>
> For example, suppose that we get to a point where we can trust
> an ext4 loopback mount to be non-vulnerable to exploits.
> That loopback-mounted fs could very well have errors, and so the
> errors=panic option would be very much undesired from an unprivileged
> user mount.
>
> Do you think this would require extra flags/callbacks from VFS to
> file system code or would s_user_ns be sufficient?

This case is easy.
In mount or remount we just need to check capable(CAP_SYS_ADMIN) if
someone sets errors=panic, and if the capable call fails, don't allow the
mount or the remount.

But this corner case is another good reminder that we have to be very
deliberate and very careful before we enable mounting a filesystem this
way.

Eric
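
For illustration only, the check described above could look roughly like
the following userspace C sketch. It is not kernel code and not taken from
any patch in this series: struct mount_ctx, has_option() and
check_mount_options() are hypothetical names, and the has_cap_sys_admin
flag merely stands in for the result of capable(CAP_SYS_ADMIN). The option
string is the panic-on-error option discussed above (ext4 spells it
errors=panic).

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the mounter's credentials. In the kernel the
 * equivalent test would be the capable(CAP_SYS_ADMIN) call. */
struct mount_ctx {
    bool has_cap_sys_admin;
};

/* Return true if the comma-separated option string contains exactly opt. */
bool has_option(const char *opts, const char *opt)
{
    size_t len = strlen(opt);
    const char *p = opts;

    while (*p) {
        const char *end = strchr(p, ',');
        size_t n = end ? (size_t)(end - p) : strlen(p);

        if (n == len && strncmp(p, opt, len) == 0)
            return true;
        if (!end)
            break;
        p = end + 1;
    }
    return false;
}

/* Intended for both mount and remount (assumption of this sketch):
 * returns 0 to proceed, -EPERM to refuse the request. */
int check_mount_options(const struct mount_ctx *ctx, const char *opts)
{
    if (has_option(opts, "errors=panic") && !ctx->has_cap_sys_admin)
        return -EPERM;
    return 0;
}
```

With this shape, check_mount_options() refuses "rw,errors=panic" with
-EPERM for an unprivileged context and returns 0 for a privileged one;
options such as errors=remount-ro pass either way.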