Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752335AbbE0Dgn (ORCPT ); Tue, 26 May 2015 23:36:43 -0400 Received: from out02.mta.xmission.com ([166.70.13.232]:49007 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751787AbbE0Dgj (ORCPT ); Tue, 26 May 2015 23:36:39 -0400 From: ebiederm@xmission.com (Eric W. Biederman) To: Seth Forshee Cc: Miklos Szeredi , alexey@kurnosov.spb.ru, Andy Lutomirski , Serge Hallyn , fuse-devel , Linux-Fsdevel , Kernel Mailing List References: <20150331011423.GC13083@unsen.q53.spb.ru> <20150401155515.GA2994@unsen.q53.spb.ru> <20150502155623.GD13083@unsen.q53.spb.ru> <20150522144702.GA126334@ubuntu-hedt> <87iobk4id8.fsf@x220.int.ebiederm.org> <20150522185932.GC126334@ubuntu-hedt> <20150526152138.GB4531@tucsk.suse.de> <20150526161451.GB10248@ubuntu-hedt> Date: Tue, 26 May 2015 22:31:40 -0500 In-Reply-To: <20150526161451.GB10248@ubuntu-hedt> (Seth Forshee's message of "Tue, 26 May 2015 11:14:51 -0500") Message-ID: <87k2vuvgpv.fsf@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-AID: U2FsdGVkX18V85NmMURBxtBY62XsJ2PXLt8Q0Q3roXY= X-SA-Exim-Connect-IP: 67.3.205.90 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.0 TVD_RCVD_IP Message was received from an IP address * 0.0 T_TM2_M_HEADER_IN_MSG BODY: No description available. * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.5000] * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa07 1397; Body=1 Fuz1=1 Fuz2=1] * 0.0 T_TooManySym_03 6+ unique symbols in subject * 1.0 T_XMDrugObfuBody_04 obfuscated drug references * 0.0 T_TooManySym_02 5+ unique symbols in subject * 0.0 T_TooManySym_01 4+ unique symbols in subject X-Spam-DCC: XMission; sa07 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: ;Seth Forshee X-Spam-Relay-Country: X-Spam-Timing: total 820 ms - load_scoreonly_sql: 0.04 (0.0%), signal_user_changed: 4.2 (0.5%), b_tie_ro: 3.0 (0.4%), parse: 1.23 (0.2%), extract_message_metadata: 29 (3.5%), get_uri_detail_list: 7 (0.9%), tests_pri_-1000: 10 (1.2%), tests_pri_-950: 1.34 (0.2%), tests_pri_-900: 1.13 (0.1%), tests_pri_-400: 41 (5.0%), check_bayes: 40 (4.8%), b_tokenize: 12 (1.5%), b_tok_get_all: 14 (1.7%), b_comp_prob: 6 (0.7%), b_tok_touch_all: 4.9 (0.6%), b_finish: 0.80 (0.1%), tests_pri_0: 722 (88.1%), tests_pri_500: 6 (0.7%), rewrite_mail: 0.00 (0.0%) Subject: Re: [fuse-devel] fuse_get_context() and namespaces X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Wed, 24 Sep 2014 11:00:52 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6933 Lines: 167 Seth Forshee writes: > On Tue, May 26, 2015 at 05:21:38PM +0200, Miklos Szeredi wrote: >> On Fri, May 22, 2015 at 01:59:32PM -0500, Seth Forshee wrote: >> > On Fri, May 22, 2015 at 12:44:35PM -0500, Eric W. Biederman wrote: >> > > Seth Forshee writes: >> > > >> > > > On Fri, May 22, 2015 at 04:23:55PM +0200, Miklos Szeredi wrote: >> > > >> On Sat, May 2, 2015 at 5:56 PM, wrote: >> > > >> > >> > > >> > 3.10.0-229 form Scientific Linux and native 4.0.1-1 (from elrepo). >> > > >> > SL 7.1 on the host and SL 6.6 on the LXC guest. At least in 3.10 >> > > >> > the 499dcf2024092e5cce41d05599a5b51d1f92031a is present. >> > > >> > Steps to reproduce: >> > > >> > >> > > >> > On first console: >> > > >> > [root@sl7test ~]# lxc-start -n test-2 /bin/su - >> > > >> > [root@test-2 ~]# diff -u hello.py /usr/share/doc/fuse-python-0.2.1/example/hello.py >> > > >> > --- hello.py 2015-05-02 11:12:13.963093580 -0400 >> > > >> > +++ /usr/share/doc/fuse-python-0.2.1/example/hello.py 2010-04-14 18:29:21.000000000 -0400 >> > > >> > @@ -41,8 +41,6 @@ >> > > >> > class HelloFS(Fuse): >> > > >> > >> > > >> > def getattr(self, path): >> > > >> > - dic = Fuse.GetContext(self) >> > > >> > - print dic >> > > >> > st = MyStat() >> > > >> > if path == '/': >> > > >> > st.st_mode = stat.S_IFDIR | 0755 >> > > >> > [root@test-2 ~]# python hello.py -f /mnt/ >> > > >> > >> > > >> > On second console: >> > > >> > [root@test-2 ~]# echo $$ >> > > >> > 41 >> > > >> > [root@test-2 ~]# ls /mnt/ >> > > >> > hello >> > > >> > >> > > >> > Output of first console: >> > > >> > {'gid': 0, 'pid': 12083, 'uid': 0} >> > > >> >> > > >> Thanks. >> > > >> >> > > >> Digging in mailbox... There was a thread last year about adding >> > > >> support for running fuse daemon in a container: >> > > >> >> > > >> http://thread.gmane.org/gmane.linux.kernel/1811658 >> > > >> >> > > >> Not sure what happened, but no updated patches have been posted or >> > > >> maybe I just missed them. >> > > > >> > > > I haven't sent updated patches in a while. I still intend to, but I >> > > > shifted focus to first getting general support for mounts from user >> > > > namespaces into the vfs (which will give a clearer direction for some of >> > > > the concerns raised about the fuse patches). >> > > > >> > > > All of this code is available in the userns-mounts branch of >> > > > git://kernel.ubuntu.com/sforshee/linux.git, and I don't think the fuse >> > > > patches actually depend on any of the stuff that precedes them. I'm >> > > > planning to start submitting some of the earlier patches from that >> > > > branch soon, and eventually get back to resubmitting the fuse patches. >> > > > >> > > > This is about pid namespaces though, and the fuse pid namespace patch >> > > > from that series (see below) should be more or less independent of the >> > > > rest of the patches. Potentially that could be merged separately from >> > > > the user namespae stuff. >> > > >> > > [snip] >> > > >> > > > @@ -2076,7 +2077,15 @@ static int convert_fuse_file_lock(const struct fuse_file_lock *ffl, >> > > > >> > > > fl->fl_start = ffl->start; >> > > > fl->fl_end = ffl->end; >> > > > - fl->fl_pid = ffl->pid; >> > > > + >> > > > + /* >> > > > + * Convert pid into the connection's pid namespace. If the >> > > > + * pid does not map into the namespace fl_pid will get set >> > > > + * to 0. >> > > > + */ >> > > > + rcu_read_lock(); >> > > > + fl->fl_pid = pid_vnr(find_pid_ns(ffl->pid, fc->pid_ns)); >> > > > + rcu_read_unlock(); >> > > >> > > Scratches my head. This looks wrong. >> > > >> > > I would have expected pid_nr_ns. Am I missing something reading this >> > > patch quickly? >> > >> > Here we're in the context of a F_GETLK operation. We've requested the >> > lock information from the fuse process, and ffl->pid is the pid number >> > in that process's pid namespace so it needs to be translated into >> > current's namespace. First we have to look up the struct pid, then >> > pid_vnr is just a wrapper for pid_nr_ns in the current pid namespace: >> > >> > pid_t pid_vnr(struct pid *pid) >> > { >> > return pid_nr_ns(pid, task_active_pid_ns(current)); >> > } >> > >> > Oh, but the comment is wrong, so maybe that's what confused you. >> > s/connection/caller/ there and it should make more sense. >> >> Attaching updated patch against fuse.git for-next. Check namespace in both >> device read and write. Check them at the start (doesn't matter if requests are >> stuck in the queue, if server isn't playing by the rules, then all is lost >> anyway). >> >> One thing: we return error if current tgid isn't valid in server's namespace. >> That's looks good. However we silently succeed and set in.h.pid to zero if >> current pid isn't representable in the server's namespace. That doesn't sound >> quite right. >> >> Again the question is, does it make sense to allow access by tasks whose pid >> isn't representable in the server. If not, then they should be rejected instead >> of succeeding with an invalid PID, right? > > All of the fuse filesystems I looked at didn't pay any attention at all > to in.h.pid, and I don't see any reason to make them unusable by > processes outside the pid namespace. Filesystems which do care about > pids can reject requests when in.h.pid is 0 if they wish. Well where we came in at was there is a fuse filesystem paying attention to pids and is having problems running in a container because the pids are not translated. > Of course it's a different matter if there are existing filesystems > which could be broken by in.h.pid == 0. It seems there is at least one filesystem being written that has that limitation. Given the way fuse works it does seem reasonable to deny access outside of a pid namespace that the fuse filesystem is in. Set I believe you implemented or were at least considering limiting the clients to the same user namespace as well. Unfortunately the test Miklos proposed is wrong. It tested for pid namespace identity which fails to account for nested pid namespaces. Instead of saying: + if (task_active_pid_ns(current) != fc->pid_ns) + return -EIO; The code should probably say: if (!pid_nr_ns(fc->pid_ns, task_pid(current))) return -EIO; Or possibly: static bool in_pid_ns(struct pid_namespace *ns, struct pid *pid) { return ns->level <= pid->level && pid->numbers[ns->level].ns == ns; } if (!in_pid_ns(fc->pid_ns, task_pid(current))) return -EIO; We don't have the in_pid_ns function but it would be easy enough to add. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/