From: ebiederm@xmission.com (Eric W. Biederman)
To: David Howells
Cc: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com,
	sfrench@samba.org, linux-security-module@vger.kernel.org,
	linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, rgb@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 05/27] containers: Open a socket inside a container
Date: Tue, 19 Feb 2019 10:41:56 -0600
Message-ID: <87k1hvwx0r.fsf@xmission.com>
In-Reply-To: <155024687804.21651.13220990774688382294.stgit@warthog.procyon.org.uk>
	(David Howells's message of "Fri, 15 Feb 2019 16:07:58 +0000")
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
	<155024687804.21651.13220990774688382294.stgit@warthog.procyon.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

David Howells writes:

> Provide a system call to open a socket inside of a container, using that
> container's network namespace.  This allows netlink to be used to manage
> the container.
>
> 	fd = container_socket(int container_fd,
> 			      int domain, int type, int protocol);
>

Nacked-by: "Eric W. Biederman"

Use a namespace file descriptor if you need this.  So far we have not
added this system call as it is just a performance optimization.  And it
has been too niche to matter.

If that has changed we can add this separately from everything else
you are doing here.

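For illustration only, here is a minimal userspace sketch of the
namespace file descriptor approach, using the existing setns(2) call
instead of a new system call.  The helper name socket_in_netns() and
its error handling are hypothetical and not part of the patch.  It
assumes a single-threaded caller that holds CAP_SYS_ADMIN and a path
such as /proc/<pid>/ns/net that names the target network namespace.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for setns() and CLONE_NEWNET */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): create a socket inside the
 * network namespace named by ns_path, e.g. "/proc/<pid>/ns/net".
 * Returns the new socket fd, or -1 with errno set.
 */
int socket_in_netns(const char *ns_path, int domain, int type, int protocol)
{
	int target_ns, orig_ns;
	int saved_errno = 0;
	int sock = -1;

	target_ns = open(ns_path, O_RDONLY | O_CLOEXEC);
	if (target_ns < 0)
		return -1;

	/*
	 * Remember the current namespace so we can switch back.
	 * Assumption: single-threaded caller; a multi-threaded program
	 * would want /proc/thread-self/ns/net instead.
	 */
	orig_ns = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
	if (orig_ns < 0) {
		saved_errno = errno;
		close(target_ns);
		errno = saved_errno;
		return -1;
	}

	/* setns() only changes the namespace of the calling thread. */
	if (setns(target_ns, CLONE_NEWNET) < 0) {
		saved_errno = errno;
	} else {
		/* The socket stays bound to the namespace it is created in. */
		sock = socket(domain, type, protocol);
		saved_errno = errno;

		if (setns(orig_ns, CLONE_NEWNET) < 0) {
			/* Could not switch back: report this as a failure. */
			saved_errno = errno;
			if (sock >= 0)
				close(sock);
			sock = -1;
		}
	}

	close(orig_ns);
	close(target_ns);
	if (sock < 0)
		errno = saved_errno;
	return sock;
}
```

A caller that wants a netlink socket in another namespace could then
use something like socket_in_netns(path, AF_NETLINK, SOCK_RAW,
NETLINK_ROUTE).  The cost compared with a dedicated system call is the
two extra setns() calls, which is the performance difference mentioned
above.
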
> Signed-off-by: David Howells
> ---
>
>  arch/x86/entry/syscalls/syscall_32.tbl |    1 +
>  arch/x86/entry/syscalls/syscall_64.tbl |    1 +
>  include/linux/socket.h                 |    3 ++-
>  include/linux/syscalls.h               |    2 ++
>  kernel/sys_ni.c                        |    1 +
>  net/compat.c                           |    2 +-
>  net/socket.c                           |   34 +++++++++++++++++++++++++++-----
>  7 files changed, 37 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 8666693510f9..f4c9beff77a6 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -409,3 +409,4 @@
>  395	i386	sb_notify		sys_sb_notify			__ia32_sys_sb_notify
>  396	i386	container_create	sys_container_create		__ia32_sys_container_create
>  397	i386	fork_into_container	sys_fork_into_container		__ia32_sys_fork_into_container
> +398	i386	container_socket	sys_container_socket		__ia32_sys_container_socket
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index d40d4790fcb2..e20cdf7b5527 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -354,6 +354,7 @@
>  343	common	sb_notify		__x64_sys_sb_notify
>  344	common	container_create	__x64_sys_container_create
>  345	common	fork_into_container	__x64_sys_fork_into_container
> +346	common	container_socket	__x64_sys_container_socket
>
>  #
>  # x32-specific system call numbers start at 512 to avoid cache impact
> diff --git a/include/linux/socket.h b/include/linux/socket.h
> index ab2041a00e01..154ac900a8a5 100644
> --- a/include/linux/socket.h
> +++ b/include/linux/socket.h
> @@ -10,6 +10,7 @@
>  #include /* __user */
>  #include
>
> +struct net;
>  struct pid;
>  struct cred;
>
> @@ -376,7 +377,7 @@ extern int __sys_sendto(int fd, void __user *buff, size_t len,
>  			int addr_len);
>  extern int __sys_accept4(int fd, struct sockaddr __user *upeer_sockaddr,
>  			 int __user *upeer_addrlen, int flags);
> -extern int __sys_socket(int family, int type, int protocol);
> +extern int __sys_socket(struct net *net, int family, int type, int protocol);
>  extern int __sys_bind(int fd, struct sockaddr __user *umyaddr, int addrlen);
>  extern int __sys_connect(int fd, struct sockaddr __user *uservaddr,
>  			 int addrlen);
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 15e5cc704df3..547334c6ffc2 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -947,6 +947,8 @@ asmlinkage long sys_container_create(const char __user *name, unsigned int flags
>  				     unsigned long spare3, unsigned long spare4,
>  				     unsigned long spare5);
>  asmlinkage long sys_fork_into_container(int containerfd);
> +asmlinkage long sys_container_socket(int containerfd,
> +				     int domain, int type, int protocol);
>
>  /*
>   * Architecture-specific system calls
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index a23ad529d548..ce9c5bb30e7f 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -236,6 +236,7 @@ COND_SYSCALL(shmdt);
>  /* net/socket.c */
>  COND_SYSCALL(socket);
>  COND_SYSCALL(socketpair);
> +COND_SYSCALL(container_socket);
>  COND_SYSCALL(bind);
>  COND_SYSCALL(listen);
>  COND_SYSCALL(accept);
> diff --git a/net/compat.c b/net/compat.c
> index 959d1c51826d..1b2db740fd33 100644
> --- a/net/compat.c
> +++ b/net/compat.c
> @@ -856,7 +856,7 @@ COMPAT_SYSCALL_DEFINE2(socketcall, int, call, u32 __user *, args)
>
>  	switch (call) {
>  	case SYS_SOCKET:
> -		ret = __sys_socket(a0, a1, a[2]);
> +		ret = __sys_socket(current->nsproxy->net_ns, a0, a1, a[2]);
>  		break;
>  	case SYS_BIND:
>  		ret = __sys_bind(a0, compat_ptr(a1), a[2]);
> diff --git a/net/socket.c b/net/socket.c
> index 7d271a1d0c7e..7406580598b9 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -80,6 +80,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -1326,9 +1327,9 @@ int sock_create_kern(struct net *net, int family, int type, int protocol, struct
>  }
>  EXPORT_SYMBOL(sock_create_kern);
>
> -int __sys_socket(int family, int type, int protocol)
> +int __sys_socket(struct net *net, int family, int type, int protocol)
>  {
> -	int retval;
> +	long retval;
>  	struct socket *sock;
>  	int flags;
>
> @@ -1346,7 +1347,7 @@ int __sys_socket(int family, int type, int protocol)
>  	if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
>  		flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
>
> -	retval = sock_create(family, type, protocol, &sock);
> +	retval = __sock_create(net, family, type, protocol, &sock, 0);
>  	if (retval < 0)
>  		return retval;
>
> @@ -1355,9 +1356,32 @@ int __sys_socket(int family, int type, int protocol)
>
>  SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol)
>  {
> -	return __sys_socket(family, type, protocol);
> +	return __sys_socket(current->nsproxy->net_ns, family, type, protocol);
>  }
>
> +/*
> + * Create a socket inside a container.
> + */
> +#ifdef CONFIG_CONTAINERS
> +SYSCALL_DEFINE4(container_socket,
> +		int, containerfd, int, family, int, type, int, protocol)
> +{
> +	struct fd f = fdget(containerfd);
> +	long ret;
> +
> +	if (!f.file)
> +		return -EBADF;
> +	ret = -EINVAL;
> +	if (is_container_file(f.file)) {
> +		struct container *c = f.file->private_data;
> +
> +		ret = __sys_socket(c->ns->net_ns, family, type, protocol);
> +	}
> +	fdput(f);
> +	return ret;
> +}
> +#endif
> +
>  /*
>   * Create a pair of connected sockets.
>   */
> @@ -2555,7 +2579,7 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
>
>  	switch (call) {
>  	case SYS_SOCKET:
> -		err = __sys_socket(a0, a1, a[2]);
> +		err = __sys_socket(current->nsproxy->net_ns, a0, a1, a[2]);
>  		break;
>  	case SYS_BIND:
>  		err = __sys_bind(a0, (struct sockaddr __user *)a1, a[2]);