From: Christian Brauner
Date: Thu, 5 Apr 2018 03:35:20 +0200
To: "Eric W. Biederman", davem@davemloft.net
Cc: Christian Brauner, gregkh@linuxfoundation.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, avagin@virtuozzo.com, ktkhai@virtuozzo.com,
    serge@hallyn.com
Subject: Re: [PATCH net] netns: filter uevents correctly
Message-ID: <20180405013519.GA15319@gmail.com>
References: <20180404194857.29375-1-christian.brauner@ubuntu.com>
    <20180404203048.GA21118@gmail.com> <871sfuha2d.fsf@xmission.com>
In-Reply-To: <871sfuha2d.fsf@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 04, 2018 at 05:38:02PM -0500, Eric W. Biederman wrote:
> Christian Brauner writes:
>
> > On Wed, Apr 04, 2018 at 09:48:57PM +0200, Christian Brauner wrote:
> >> commit 07e98962fa77 ("kobject: Send hotplug events in all network namespaces")
> >>
> >> enabled sending hotplug events into all network namespaces back in 2010.
> >> Over time the set of uevents that get sent into all network namespaces has
> >> shrunk. We have now reached the point where hotplug events for all devices
> >> that carry a namespace tag are filtered according to that namespace.
> >>
> >> Specifically, they are filtered whenever the namespace tag of the kobject
> >> does not match the namespace tag of the netlink socket. One example are
> >> network devices. Uevents for network devices only show up in the network
> >> namespaces these devices are moved to or created in.
> >>
> >> However, any uevent for a kobject that does not have a namespace tag
> >> associated with it will not be filtered and we will *try* to broadcast it
> >> into all network namespaces.
> >>
> >> The original patchset was written in 2010 before user namespaces were a
> >> thing. With the introduction of user namespaces sending out uevents became
> >> partially isolated as they were filtered by user namespaces:
> >>
> >> net/netlink/af_netlink.c:do_one_broadcast()
> >>
> >>     if (!net_eq(sock_net(sk), p->net)) {
> >>             if (!(nlk->flags & NETLINK_F_LISTEN_ALL_NSID))
> >>                     return;
> >>
> >>             if (!peernet_has_id(sock_net(sk), p->net))
> >>                     return;
> >>
> >>             if (!file_ns_capable(sk->sk_socket->file, p->net->user_ns,
> >>                                  CAP_NET_BROADCAST))
> >>                     return;
> >>     }
> >>
> >> The file_ns_capable() check will check whether the caller had
> >> CAP_NET_BROADCAST at the time of opening the netlink socket in the user
> >> namespace of interest. This check is fine in general but seems insufficient
> >> to me when paired with uevents. The reason is that devices always belong to
> >> the initial user namespace so uevents for kobjects that do not carry a
> >> namespace tag should never be sent into another user namespace. This has
> >> been the intention all along. But there's one case where this breaks,
> >> namely if a new user namespace is created by root on the host and an
> >> identity mapping is established between root on the host and root in the
> >> new user namespace. Here's a reproducer:
> >>
> >>     sudo unshare -U --map-root
> >>     udevadm monitor -k
> >>     # Now change to initial user namespace and e.g. do
> >>     modprobe kvm
> >>     # or
> >>     rmmod kvm
> >>
> >> will allow the non-initial user namespace to retrieve all uevents from the
> >> host. This seems very anecdotal given that in the general case user
> >> namespaces do not see any uevents and also can't really do anything useful
> >> with them.
> >>
> >> Additionally, it is now possible to send uevents from userspace. As such we
> >> can let a sufficiently privileged (CAP_SYS_ADMIN in the owning user
> >> namespace of the network namespace of the netlink socket) userspace process
> >> make a decision what uevents should be sent.
> >>
> >> This makes me think that we should simply ensure that uevents for kobjects
> >> that do not carry a namespace tag are *always* filtered by user namespace
> >> in kobj_bcast_filter(). Specifically:
> >> - If the owning user namespace of the uevent socket is not init_user_ns the
> >>   event will always be filtered.
> >> - If the network namespace the uevent socket belongs to was created in the
> >>   initial user namespace but was opened from a non-initial user namespace
> >>   the event will be filtered as well.
> >> Put another way, uevents for kobjects not carrying a namespace tag are now
> >> always only sent to the initial user namespace. The regression potential
> >> for this is near to non-existent since user namespaces can't really do
> >> anything with interesting devices.
> >>
> >> Signed-off-by: Christian Brauner
> >
> > That was supposed to be [PATCH net] not [PATCH net-next] which is
> > obviously closed. Sorry about that.
>
> This does not appear to be a fix.
> This looks like feature work.
> The motivation appears to be that looks wrong let's change it.

Hm, it looked like an oversight and therefore seems like a bug, which is
why I thought it would be a good candidate for net. Recent patches to the
semantics here just make it more obvious and provide a better argument to
fix it in the current release rather than defer it to the next one.
But I'm happy to leave this for net-next. I don't want to rush things if
this change in semantics is not trivial enough. For the record, I'm merely
fixing/expanding on the current status quo.

David, is it ok to queue this or would you prefer I resend when net-next
reopens?

>
> So let's please leave this for when net-next opens again so we can
> have time to fully consider a change in semantics.

Sure, if we agree that this is the way to go I'm happy to. Is your issue
just with when we merge it, or do you disagree from a technical
perspective? That wasn't entirely obvious from your previous mail. :)

Thanks!
Christian

>
> Thank you,
> Eric
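
For readers following the thread, the two filtering rules listed in the quoted
patch description can be modelled in plain userspace C. This is only a sketch
of the described decision, not the patch itself: the struct types, their
fields, and the function name filter_untagged_uevent() are hypothetical
stand-ins and do not correspond to the kernel's real data structures.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel objects; not the kernel's real types. */
struct user_ns {
    int level;                     /* 0 models the initial user namespace */
};

struct net_ns {
    const struct user_ns *owner;   /* user namespace that owns this netns */
};

struct uevent_sock {
    const struct net_ns *net;      /* network namespace the socket belongs to */
    const struct user_ns *opener;  /* user namespace the socket was opened from */
};

/* Models the initial user namespace. */
static const struct user_ns init_user_ns = { 0 };

/*
 * Returns true if a uevent for a kobject WITHOUT a namespace tag should be
 * filtered (not delivered) for this socket, following the two rules in the
 * patch description.
 */
static bool filter_untagged_uevent(const struct uevent_sock *sk)
{
    /* Rule 1: the network namespace is owned by a non-initial user namespace. */
    if (sk->net->owner != &init_user_ns)
        return true;

    /* Rule 2: the netns is owned by init_user_ns, but the socket was opened
     * from a non-initial user namespace. */
    if (sk->opener != &init_user_ns)
        return true;

    /* Only sockets fully inside the initial user namespace get the event. */
    return false;
}
```

Under this model, the reproducer from the patch description (a new user
namespace with root mapped, still in the host's network namespace) falls under
the second rule: the network namespace is owned by the initial user namespace,
but the socket is opened from the new one, so the event is filtered.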