From: ebiederm@xmission.com (Eric W. Biederman)
To: Christian Brauner
Cc: davem@davemloft.net, gregkh@linuxfoundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, avagin@virtuozzo.com, ktkhai@virtuozzo.com, serge@hallyn.com
Subject: Re: [PATCH net] netns: filter uevents correctly
Date: Wed, 04 Apr 2018 17:38:02 -0500
Message-ID: <871sfuha2d.fsf@xmission.com>
In-Reply-To: <20180404203048.GA21118@gmail.com> (Christian Brauner's message of "Wed, 4 Apr 2018 22:30:49 +0200")
References: <20180404194857.29375-1-christian.brauner@ubuntu.com> <20180404203048.GA21118@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Christian Brauner writes:

> On Wed, Apr 04, 2018 at 09:48:57PM +0200, Christian Brauner wrote:
>> commit 07e98962fa77 ("kobject: Send hotplug events in all network namespaces")
>>
>> enabled sending hotplug events into all network namespaces back in 2010.
>> Over time the set of uevents that get sent into all network namespaces has
>> shrunk. We have now reached the point where hotplug events for all devices
>> that carry a namespace tag are filtered according to that namespace.
>>
>> Specifically, they are filtered whenever the namespace tag of the kobject
>> does not match the namespace tag of the netlink socket. One example are
>> network devices. Uevents for network devices only show up in the network
>> namespaces these devices are moved to or created in.
>>
>> However, any uevent for a kobject that does not have a namespace tag
>> associated with it will not be filtered and we will *try* to broadcast it
>> into all network namespaces.
>>
>> The original patchset was written in 2010 before user namespaces were a
>> thing. With the introduction of user namespaces sending out uevents became
>> partially isolated as they were filtered by user namespaces:
>>
>> net/netlink/af_netlink.c:do_one_broadcast()
>>
>> if (!net_eq(sock_net(sk), p->net)) {
>>         if (!(nlk->flags & NETLINK_F_LISTEN_ALL_NSID))
>>                 return;
>>
>>         if (!peernet_has_id(sock_net(sk), p->net))
>>                 return;
>>
>>         if (!file_ns_capable(sk->sk_socket->file, p->net->user_ns,
>>                              CAP_NET_BROADCAST))
>>                 return;
>> }
>>
>> The file_ns_capable() check will check whether the caller had
>> CAP_NET_BROADCAST at the time of opening the netlink socket in the user
>> namespace of interest. This check is fine in general but seems insufficient
>> to me when paired with uevents. The reason is that devices always belong to
>> the initial user namespace so uevents for kobjects that do not carry a
>> namespace tag should never be sent into another user namespace. This has
>> been the intention all along. But there's one case where this breaks,
>> namely if a new user namespace is created by root on the host and an
>> identity mapping is established between root on the host and root in the
>> new user namespace. Here's a reproducer:
>>
>>  sudo unshare -U --map-root
>>  udevadm monitor -k
>>  # Now change to initial user namespace and e.g. do
>>  modprobe kvm
>>  # or
>>  rmmod kvm
>>
>> will allow the non-initial user namespace to retrieve all uevents from the
>> host. This seems very anecdotal given that in the general case user
>> namespaces do not see any uevents and also can't really do anything useful
>> with them.
>>
>> Additionally, it is now possible to send uevents from userspace.
>> As such we can let a sufficiently privileged (CAP_SYS_ADMIN in the owning
>> user namespace of the network namespace of the netlink socket) userspace
>> process make a decision what uevents should be sent.
>>
>> This makes me think that we should simply ensure that uevents for kobjects
>> that do not carry a namespace tag are *always* filtered by user namespace
>> in kobj_bcast_filter(). Specifically:
>> - If the owning user namespace of the uevent socket is not init_user_ns the
>>   event will always be filtered.
>> - If the network namespace the uevent socket belongs to was created in the
>>   initial user namespace but was opened from a non-initial user namespace
>>   the event will be filtered as well.
>> Put another way, uevents for kobjects not carrying a namespace tag are now
>> always only sent to the initial user namespace. The regression potential
>> for this is near to non-existent since user namespaces can't really do
>> anything with interesting devices.
>>
>> Signed-off-by: Christian Brauner
>
> That was supposed to be [PATCH net] not [PATCH net-next] which is
> obviously closed. Sorry about that.

This does not appear to be a fix. This looks like feature work. The
motivation appears to be "that looks wrong, let's change it."

So let's please leave this for when net-next opens again so we can have
time to fully consider a change in semantics.

Thank you,
Eric
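
For reference, the two filtering rules proposed in the quoted patch description can be modelled in plain userspace C. This is only a sketch of the decision logic as described above, not the patch itself and not kernel code: the struct types, their fields, and the function name untagged_uevent_is_filtered() are hypothetical stand-ins, and only the name init_user_ns is taken from the description. It also assumes that "owning user namespace of the uevent socket" means the user namespace that owns the socket's network namespace, and that "opened from" refers to the user namespace of the opener.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's structures. */
struct user_ns { const char *name; };

struct net_ns {
	const struct user_ns *owner;   /* user namespace that owns this net namespace */
};

struct uevent_sock {
	const struct net_ns *net;      /* network namespace the socket belongs to */
	const struct user_ns *opener;  /* user namespace the socket was opened from */
};

/* Models the initial user namespace named in the patch description. */
const struct user_ns init_user_ns = { "init_user_ns" };

/*
 * Returns true when a uevent for a kobject WITHOUT a namespace tag would be
 * dropped for this socket under the two proposed rules.
 */
bool untagged_uevent_is_filtered(const struct uevent_sock *sk)
{
	assert(sk != NULL && sk->net != NULL);

	/* Rule 1: the socket's network namespace is not owned by init_user_ns. */
	if (sk->net->owner != &init_user_ns)
		return true;

	/* Rule 2: owned by init_user_ns, but opened from another user namespace. */
	if (sk->opener != &init_user_ns)
		return true;

	/* Only sockets fully inside the initial user namespace get the event. */
	return false;
}
```

Under this model, the reproducer's socket (opened after "unshare -U --map-root", with the network namespace still owned by the initial user namespace) is caught by the second rule.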