From: David Laight
To: "'Eric W. Biederman'", Linus Torvalds
CC: Al Viro, "linux-kernel@vger.kernel.org", "netdev@vger.kernel.org", "Serge E. Hallyn"
Subject: RE: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace
Date: Fri, 30 Sep 2022 09:30:41 +0000
References: <871qrt4ymg.fsf@email.froward.int.ebiederm.org> <87ill53igy.fsf_-_@email.froward.int.ebiederm.org>
In-Reply-To: <87ill53igy.fsf_-_@email.froward.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Eric W. Biederman
> Sent: 29 September 2022 23:48
>
> Since common apparmor policies don't allow access /proc/tgid/task/tid/net
> point the code at /proc/tid/net instead.
>
> Link: https://lkml.kernel.org/r/dacfc18d6667421d97127451eafe4f29@AcuMS.aculab.com
> Signed-off-by: "Eric W. Biederman"
> ---
>
> I have only compile tested this. All of the boiler plate is a copy of
> /proc/self and /proc/thread-self, so it should work.
>
> Can David or someone who cares and has access to the limited apparmor
> configurations could test this to make certain this works?

It works with a minor 'cut & paste' fixup.
(Not nested inside a program that changes namespaces.)

Although if it is reasonable for /proc/net -> /proc/tid/net,
why not just make /proc/thread-self -> /proc/tid?
Then /proc/net can just be thread-self/net.

I have wondered if the namespace lookup could be done as a 'special'
directory lookup for "net" rather than changing everything when the
namespace is changed.
I can imagine scenarios where a thread needs to keep changing
between two namespaces; at the moment I suspect that is rather
more expensive than a lookup and changing the reference counts.

Notwithstanding the apparmor issues, /proc/net could actually be
a symlink to (say) /proc/net_namespaces/namespace_name with
readlink returning the name based on the thread's actual namespace.

I've also had problems with accessing /sys/class/net for multiple
namespaces within the same thread (think of a system monitor process).
The simplest solution is to start the program with:
	ip netns exec namespace program 3

>  fs/proc/base.c          | 12 ++++++--
>  fs/proc/internal.h      |  2 ++
>  fs/proc/proc_net.c      | 68 ++++++++++++++++++++++++++++++++++++++++-
>  fs/proc/root.c          |  7 ++++-
>  include/linux/proc_fs.h |  1 +
>  5 files changed, 85 insertions(+), 5 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 93f7e3d971e4..c205234f3822 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -3479,7 +3479,7 @@ static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter ite
>  	return iter;
>  }
>
> -#define TGID_OFFSET (FIRST_PROCESS_ENTRY + 2)
> +#define TGID_OFFSET (FIRST_PROCESS_ENTRY + 3)
>
>  /* for the /proc/ directory itself, after non-process stuff has been done */
>  int proc_pid_readdir(struct file *file, struct dir_context *ctx)
> @@ -3492,18 +3492,24 @@ int proc_pid_readdir(struct file *file, struct dir_context *ctx)
>  	if (pos >= PID_MAX_LIMIT + TGID_OFFSET)
>  		return 0;
>
> -	if (pos == TGID_OFFSET - 2) {
> +	if (pos == TGID_OFFSET - 3) {
>  		struct inode *inode = d_inode(fs_info->proc_self);
>  		if (!dir_emit(ctx, "self", 4, inode->i_ino, DT_LNK))
>  			return 0;
>  		ctx->pos = pos = pos + 1;
>  	}
> -	if (pos == TGID_OFFSET - 1) {
> +	if (pos == TGID_OFFSET - 2) {
>  		struct inode *inode = d_inode(fs_info->proc_thread_self);
>  		if (!dir_emit(ctx, "thread-self", 11, inode->i_ino, DT_LNK))
>  			return 0;
>  		ctx->pos = pos = pos + 1;
>  	}
> +	if (pos == TGID_OFFSET - 1) {
> +		struct inode *inode = d_inode(fs_info->proc_net);
> +		if (!dir_emit(ctx, "net", 11, inode->i_ino, DT_LNK))

The 11 is the name length, so for "net" it needs to be 3.
This block can also be put first - to reduce churn.

	David

> +			return 0;
> +		ctx->pos = pos = pos + 1;
> +	}
>  	iter.tgid = pos - TGID_OFFSET;
>  	iter.task = NULL;
>  	for (iter = next_tgid(ns, iter);
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index 06a80f78433d..9d13c24b80c8 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -232,8 +232,10 @@ extern const struct inode_operations proc_net_inode_operations;
>
>  #ifdef CONFIG_NET
>  extern int proc_net_init(void);
> +extern int proc_setup_net_symlink(struct super_block *s);
>  #else
>  static inline int proc_net_init(void) { return 0; }
> +static inline int proc_setup_net_symlink(struct super_block *s) { return 0; }
>  #endif
>
>  /*
> diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
> index 856839b8ae8b..99335e800c1c 100644
> --- a/fs/proc/proc_net.c
> +++ b/fs/proc/proc_net.c
> @@ -408,9 +408,75 @@ static struct pernet_operations __net_initdata proc_net_ns_ops = {
>  	.exit = proc_net_ns_exit,
>  };
>
> +/*
> + * /proc/net:
> + */
> +static const char *proc_net_symlink_get_link(struct dentry *dentry,
> +					     struct inode *inode,
> +					     struct delayed_call *done)
> +{
> +	struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
> +	pid_t tid = task_pid_nr_ns(current, ns);
> +	char *name;
> +
> +	if (!tid)
> +		return ERR_PTR(-ENOENT);
> +	name = kmalloc(10 + 4 + 1, dentry ? GFP_KERNEL : GFP_ATOMIC);
> +	if (unlikely(!name))
> +		return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
> +	sprintf(name, "%u/net", tid);
> +	set_delayed_call(done, kfree_link, name);
> +	return name;
> +}
> +
> +static const struct inode_operations proc_net_symlink_inode_operations = {
> +	.get_link = proc_net_symlink_get_link,
> +};
> +
> +static unsigned net_symlink_inum __ro_after_init;
> +
> +int proc_setup_net_symlink(struct super_block *s)
> +{
> +	struct inode *root_inode = d_inode(s->s_root);
> +	struct proc_fs_info *fs_info = proc_sb_info(s);
> +	struct dentry *net_symlink;
> +	int ret = -ENOMEM;
> +
> +	inode_lock(root_inode);
> +	net_symlink = d_alloc_name(s->s_root, "net");
> +	if (net_symlink) {
> +		struct inode *inode = new_inode(s);
> +		if (inode) {
> +			inode->i_ino = net_symlink_inum;
> +			inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode);
> +			inode->i_mode = S_IFLNK | S_IRWXUGO;
> +			inode->i_uid = GLOBAL_ROOT_UID;
> +			inode->i_gid = GLOBAL_ROOT_GID;
> +			inode->i_op = &proc_net_symlink_inode_operations;
> +			d_add(net_symlink, inode);
> +			ret = 0;
> +		} else {
> +			dput(net_symlink);
> +		}
> +	}
> +	inode_unlock(root_inode);
> +
> +	if (ret)
> +		pr_err("proc_fill_super: can't allocate /proc/net\n");
> +	else
> +		fs_info->proc_net = net_symlink;
> +
> +	return ret;
> +}
> +
> +void __init proc_net_symlink_init(void)
> +{
> +	proc_alloc_inum(&net_symlink_inum);
> +}
> +
>  int __init proc_net_init(void)
>  {
> -	proc_symlink("net", NULL, "self/net");
> +	proc_net_symlink_init();
>
>  	return register_pernet_subsys(&proc_net_ns_ops);
>  }
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index 3c2ee3eb1138..6e57e9a4acf9 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -207,7 +207,11 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
>  	if (ret) {
>  		return ret;
>  	}
> -	return proc_setup_thread_self(s);
> +	ret = proc_setup_thread_self(s);
> +	if (ret) {
> +		return ret;
> +	}
> +	return proc_setup_net_symlink(s);
>  }
>
>  static int proc_reconfigure(struct fs_context *fc)
> @@ -268,6 +272,7 @@ static void proc_kill_sb(struct super_block *sb)
>
>  	dput(fs_info->proc_self);
>  	dput(fs_info->proc_thread_self);
> +	dput(fs_info->proc_net);
>
>  	kill_anon_super(sb);
>  	put_pid_ns(fs_info->pid_ns);
> diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
> index 81d6e4ec2294..65f4ef15c8bf 100644
> --- a/include/linux/proc_fs.h
> +++ b/include/linux/proc_fs.h
> @@ -62,6 +62,7 @@ struct proc_fs_info {
>  	struct pid_namespace *pid_ns;
>  	struct dentry *proc_self;		/* For /proc/self */
>  	struct dentry *proc_thread_self;	/* For /proc/thread-self */
> +	struct dentry *proc_net;		/* For /proc/net */
>  	kgid_t pid_gid;
>  	enum proc_hidepid hide_pid;
>  	enum proc_pidonly pidonly;
> --
> 2.35.3

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)