2022-09-29 21:58:20

by Linus Torvalds

Subject: Re: [PATCH 3/4] proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net

On Thu, Sep 29, 2022 at 2:15 PM Al Viro <[email protected]> wrote:
>
> FWIW, what e.g. debian profile for dhclient has is
> @{PROC}/@{pid}/net/dev r,
>
> Note that it's not
> @{PROC}/net/dev r,

Argh. Yeah, then a bind mount or a hardlink won't work either, you're
right. I was assuming that any Apparmor rules allowed for just
/proc/net.

Oh well. I guess we're screwed any which way we turn.

Linus


2022-09-29 22:29:09

by Eric W. Biederman

Subject: Re: [PATCH 3/4] proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net

Linus Torvalds <[email protected]> writes:

> On Thu, Sep 29, 2022 at 2:15 PM Al Viro <[email protected]> wrote:
>>
>> FWIW, what e.g. debian profile for dhclient has is
>> @{PROC}/@{pid}/net/dev r,
>>
>> Note that it's not
>> @{PROC}/net/dev r,
>
> Argh. Yeah, then a bind mount or a hardlink won't work either, you're
> right. I was assuming that any Apparmor rules allowed for just
> /proc/net.
>
> Oh well. I guess we're screwed any which way we turn.

I actually think there is a solution.

Instead of going to /proc/self/net -> /proc/tgid/net
or /proc/thread-self/net -> /proc/tgid/task/tid/net

We should be able to go to: /proc/tid/net

That directory does not show up in readdir, but the tid directories were
put in /proc because of how our pthread support evolved, and gdb came
to expect them to be there.

That should continue to work with the incomplete apparmor rules that
don't allow accessing /proc/tgid/task/tid/net for some reason.

Eric
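
The three path shapes under discussion can be compared side by side in a small userspace sketch. The helper names below are hypothetical, and has_pid_net_shape() is only an illustrative stand-in for a rule of the form @{PROC}/@{pid}/net/..., not AppArmor's real matcher:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: "/proc/<tgid>/task/<tid>/net", the thread-self form. */
int fmt_task_net(char *buf, size_t len, pid_t tgid, pid_t tid)
{
	assert(buf != NULL);
	return snprintf(buf, len, "/proc/%d/task/%d/net", (int)tgid, (int)tid);
}

/* Hypothetical helper: "/proc/<id>/net"; id is a tgid (self) or a tid (proposal). */
int fmt_pid_net(char *buf, size_t len, pid_t id)
{
	assert(buf != NULL);
	return snprintf(buf, len, "/proc/%d/net", (int)id);
}

/*
 * Returns 1 if path is exactly "/proc/<number>/net", else 0.
 * Illustrative check only; it is not how AppArmor evaluates rules.
 */
int has_pid_net_shape(const char *path)
{
	int id, consumed = 0;

	assert(path != NULL);
	/* %n is only stored if the literal "/net" after the number matched. */
	if (sscanf(path, "/proc/%d/net%n", &id, &consumed) != 1)
		return 0;
	return consumed > 0 && path[consumed] == '\0';
}
```

Under this simplified check, /proc/<tid>/net has the same shape as /proc/<tgid>/net, while the task/<tid> form does not, which is the property the proposal relies on.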

2022-09-29 23:06:13

by Eric W. Biederman

Subject: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace


Since common apparmor policies don't allow access to /proc/tgid/task/tid/net,
point the code at /proc/tid/net instead.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: "Eric W. Biederman" <[email protected]>
---

I have only compile tested this. All of the boilerplate is a copy of
/proc/self and /proc/thread-self, so it should work.

Could David, or someone who cares and has access to the limited apparmor
configurations, test this to make certain it works?

fs/proc/base.c | 12 ++++++--
fs/proc/internal.h | 2 ++
fs/proc/proc_net.c | 68 ++++++++++++++++++++++++++++++++++++++++-
fs/proc/root.c | 7 ++++-
include/linux/proc_fs.h | 1 +
5 files changed, 85 insertions(+), 5 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 93f7e3d971e4..c205234f3822 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3479,7 +3479,7 @@ static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter ite
return iter;
}

-#define TGID_OFFSET (FIRST_PROCESS_ENTRY + 2)
+#define TGID_OFFSET (FIRST_PROCESS_ENTRY + 3)

/* for the /proc/ directory itself, after non-process stuff has been done */
int proc_pid_readdir(struct file *file, struct dir_context *ctx)
@@ -3492,18 +3492,24 @@ int proc_pid_readdir(struct file *file, struct dir_context *ctx)
if (pos >= PID_MAX_LIMIT + TGID_OFFSET)
return 0;

- if (pos == TGID_OFFSET - 2) {
+ if (pos == TGID_OFFSET - 3) {
struct inode *inode = d_inode(fs_info->proc_self);
if (!dir_emit(ctx, "self", 4, inode->i_ino, DT_LNK))
return 0;
ctx->pos = pos = pos + 1;
}
- if (pos == TGID_OFFSET - 1) {
+ if (pos == TGID_OFFSET - 2) {
struct inode *inode = d_inode(fs_info->proc_thread_self);
if (!dir_emit(ctx, "thread-self", 11, inode->i_ino, DT_LNK))
return 0;
ctx->pos = pos = pos + 1;
}
+ if (pos == TGID_OFFSET - 1) {
+ struct inode *inode = d_inode(fs_info->proc_net);
+ if (!dir_emit(ctx, "net", 11, inode->i_ino, DT_LNK))
+ return 0;
+ ctx->pos = pos = pos + 1;
+ }
iter.tgid = pos - TGID_OFFSET;
iter.task = NULL;
for (iter = next_tgid(ns, iter);
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 06a80f78433d..9d13c24b80c8 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -232,8 +232,10 @@ extern const struct inode_operations proc_net_inode_operations;

#ifdef CONFIG_NET
extern int proc_net_init(void);
+extern int proc_setup_net_symlink(struct super_block *s);
#else
static inline int proc_net_init(void) { return 0; }
+static inline int proc_setup_net_symlink(struct super_block *s) { return 0; }
#endif

/*
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 856839b8ae8b..99335e800c1c 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -408,9 +408,75 @@ static struct pernet_operations __net_initdata proc_net_ns_ops = {
.exit = proc_net_ns_exit,
};

+/*
+ * /proc/net:
+ */
+static const char *proc_net_symlink_get_link(struct dentry *dentry,
+ struct inode *inode,
+ struct delayed_call *done)
+{
+ struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
+ pid_t tid = task_pid_nr_ns(current, ns);
+ char *name;
+
+ if (!tid)
+ return ERR_PTR(-ENOENT);
+ name = kmalloc(10 + 4 + 1, dentry ? GFP_KERNEL : GFP_ATOMIC);
+ if (unlikely(!name))
+ return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
+ sprintf(name, "%u/net", tid);
+ set_delayed_call(done, kfree_link, name);
+ return name;
+}
+
+static const struct inode_operations proc_net_symlink_inode_operations = {
+ .get_link = proc_net_symlink_get_link,
+};
+
+static unsigned net_symlink_inum __ro_after_init;
+
+int proc_setup_net_symlink(struct super_block *s)
+{
+ struct inode *root_inode = d_inode(s->s_root);
+ struct proc_fs_info *fs_info = proc_sb_info(s);
+ struct dentry *net_symlink;
+ int ret = -ENOMEM;
+
+ inode_lock(root_inode);
+ net_symlink = d_alloc_name(s->s_root, "net");
+ if (net_symlink) {
+ struct inode *inode = new_inode(s);
+ if (inode) {
+ inode->i_ino = net_symlink_inum;
+ inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode);
+ inode->i_mode = S_IFLNK | S_IRWXUGO;
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
+ inode->i_op = &proc_net_symlink_inode_operations;
+ d_add(net_symlink, inode);
+ ret = 0;
+ } else {
+ dput(net_symlink);
+ }
+ }
+ inode_unlock(root_inode);
+
+ if (ret)
+ pr_err("proc_fill_super: can't allocate /proc/net\n");
+ else
+ fs_info->proc_net = net_symlink;
+
+ return ret;
+}
+
+void __init proc_net_symlink_init(void)
+{
+ proc_alloc_inum(&net_symlink_inum);
+}
+
int __init proc_net_init(void)
{
- proc_symlink("net", NULL, "self/net");
+ proc_net_symlink_init();

return register_pernet_subsys(&proc_net_ns_ops);
}
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 3c2ee3eb1138..6e57e9a4acf9 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -207,7 +207,11 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
if (ret) {
return ret;
}
- return proc_setup_thread_self(s);
+ ret = proc_setup_thread_self(s);
+ if (ret) {
+ return ret;
+ }
+ return proc_setup_net_symlink(s);
}

static int proc_reconfigure(struct fs_context *fc)
@@ -268,6 +272,7 @@ static void proc_kill_sb(struct super_block *sb)

dput(fs_info->proc_self);
dput(fs_info->proc_thread_self);
+ dput(fs_info->proc_net);

kill_anon_super(sb);
put_pid_ns(fs_info->pid_ns);
diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
index 81d6e4ec2294..65f4ef15c8bf 100644
--- a/include/linux/proc_fs.h
+++ b/include/linux/proc_fs.h
@@ -62,6 +62,7 @@ struct proc_fs_info {
struct pid_namespace *pid_ns;
struct dentry *proc_self; /* For /proc/self */
struct dentry *proc_thread_self; /* For /proc/thread-self */
+ struct dentry *proc_net; /* For /proc/net */
kgid_t pid_gid;
enum proc_hidepid hide_pid;
enum proc_pidonly pidonly;
--
2.35.3
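
The kmalloc(10 + 4 + 1, ...) size in proc_net_symlink_get_link() can be read as 10 decimal digits for the id, 4 bytes for "/net", and 1 for the terminating NUL. A minimal userspace sketch of that arithmetic follows; it assumes a 32-bit unsigned int, and format_net_link() and LINK_BUF_SIZE are hypothetical names, not kernel symbols:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* 10 digits for a 32-bit unsigned value + strlen("/net") + NUL terminator. */
enum { LINK_BUF_SIZE = 10 + 4 + 1 };

/*
 * Format "<tid>/net" into buf.
 * Returns the string length on success, or -1 if buf is too small.
 */
int format_net_link(char *buf, size_t len, unsigned int tid)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%u/net", tid);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

With the largest 32-bit value the result is 14 characters plus the NUL, so the 15-byte buffer is exactly large enough and one byte less would truncate.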

2022-09-30 00:28:16

by Al Viro

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

On Thu, Sep 29, 2022 at 05:48:29PM -0500, Eric W. Biederman wrote:

> +static const char *proc_net_symlink_get_link(struct dentry *dentry,
> + struct inode *inode,
> + struct delayed_call *done)
> +{
> + struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
> + pid_t tid = task_pid_nr_ns(current, ns);
> + char *name;
> +
> + if (!tid)
> + return ERR_PTR(-ENOENT);
> + name = kmalloc(10 + 4 + 1, dentry ? GFP_KERNEL : GFP_ATOMIC);
> + if (unlikely(!name))
> + return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
> + sprintf(name, "%u/net", tid);
> + set_delayed_call(done, kfree_link, name);
> + return name;
> +}

Just to troll adobriyan a bit:

static const char *dynamic_get_link(struct delayed_call *done,
				    bool is_rcu,
				    const char *fmt, ...)
{
	va_list args;
	char *body;

	va_start(args, fmt);
	body = kvasprintf(is_rcu ? GFP_ATOMIC : GFP_KERNEL, fmt, args);
	va_end(args);

	if (unlikely(!body))
		return is_rcu ? ERR_PTR(-ECHILD) : ERR_PTR(-ENOMEM);
	set_delayed_call(done, kfree_link, body);
	return body;
}

static const char *proc_net_symlink_get_link(struct dentry *dentry,
					      struct inode *inode,
					      struct delayed_call *done)
{
	struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
	pid_t tid = task_pid_nr_ns(current, ns);

	if (!tid)
		return ERR_PTR(-ENOENT);
	return dynamic_get_link(done, !dentry, "%u/net", tid);
}

static const char *proc_self_get_link(struct dentry *dentry,
				      struct inode *inode,
				      struct delayed_call *done)
{
	struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
	pid_t tgid = task_tgid_nr_ns(current, ns);

	if (!tgid)
		return ERR_PTR(-ENOENT);
	return dynamic_get_link(done, !dentry, "%u", tgid);
}

static const char *proc_thread_self_get_link(struct dentry *dentry,
					      struct inode *inode,
					      struct delayed_call *done)
{
	struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
	pid_t tgid = task_tgid_nr_ns(current, ns);
	pid_t pid = task_pid_nr_ns(current, ns);

	if (!pid)
		return ERR_PTR(-ENOENT);
	return dynamic_get_link(done, !dentry, "%u/task/%u", tgid, pid);
}

2022-09-30 04:29:15

by kernel test robot

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

Hi Eric,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v6.0-rc7 next-20220929]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Eric-W-Biederman/proc-Update-proc-net-to-point-at-the-accessing-threads-network-namespace/20220930-065017
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 987a926c1d8a40e4256953b04771fbdb63bc7938
config: m68k-allyesconfig
compiler: m68k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/5336f1902b4ba8a646f082f32fbb183850a13080
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Eric-W-Biederman/proc-Update-proc-net-to-point-at-the-accessing-threads-network-namespace/20220930-065017
git checkout 5336f1902b4ba8a646f082f32fbb183850a13080
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash fs/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> fs/proc/proc_net.c:472:13: warning: no previous prototype for 'proc_net_symlink_init' [-Wmissing-prototypes]
472 | void __init proc_net_symlink_init(void)
| ^~~~~~~~~~~~~~~~~~~~~


vim +/proc_net_symlink_init +472 fs/proc/proc_net.c

471
> 472 void __init proc_net_symlink_init(void)
473 {
474 proc_alloc_inum(&net_symlink_inum);
475 }
476

--
0-DAY CI Kernel Test Service
https://01.org/lkp



2022-09-30 06:15:31

by kernel test robot

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

Hi Eric,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v6.0-rc7 next-20220929]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Eric-W-Biederman/proc-Update-proc-net-to-point-at-the-accessing-threads-network-namespace/20220930-065017
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 987a926c1d8a40e4256953b04771fbdb63bc7938
config: i386-randconfig-s001
compiler: gcc-11 (Debian 11.3.0-5) 11.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.4-39-gce1a6720-dirty
# https://github.com/intel-lab-lkp/linux/commit/5336f1902b4ba8a646f082f32fbb183850a13080
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Eric-W-Biederman/proc-Update-proc-net-to-point-at-the-accessing-threads-network-namespace/20220930-065017
git checkout 5336f1902b4ba8a646f082f32fbb183850a13080
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=i386 SHELL=/bin/bash fs/proc/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

sparse warnings: (new ones prefixed by >>)
>> fs/proc/proc_net.c:472:13: sparse: sparse: symbol 'proc_net_symlink_init' was not declared. Should it be static?

--
0-DAY CI Kernel Test Service
https://01.org/lkp



2022-09-30 09:39:54

by David Laight

Subject: RE: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

From: Eric W. Biederman
> Sent: 29 September 2022 23:48
>
> Since common apparmor policies don't allow access /proc/tgid/task/tid/net
> point the code at /proc/tid/net instead.
>
> Link: https://lkml.kernel.org/r/[email protected]
> Signed-off-by: "Eric W. Biederman" <[email protected]>
> ---
>
> I have only compile tested this. All of the boiler plate is a copy of
> /proc/self and /proc/thread-self, so it should work.
>
> Can David or someone who cares and has access to the limited apparmor
> configurations could test this to make certain this works?

It works with a minor 'cut & paste' fixup.
(Not nested inside a program that changes namespaces.)

Although if it is reasonable for /proc/net -> /proc/tid/net
why not just make /proc/thread-self -> /proc/tid
Then /proc/net can just be thread-self/net

I have wondered if the namespace lookup could be done as a 'special'
directory lookup for "net" rather than changing everything when the
namespace is changed.
I can imagine scenarios where a thread needs to keep changing
between two namespaces, at the moment I suspect that is rather
more expensive than a lookup and changing the reference counts.

Notwithstanding the apparmor issues, /proc/net could actually be
a symlink to (say) /proc/net_namespaces/namespace_name with
readlink returning the name based on the thread's actual namespace.

I've also had problems with accessing /sys/class/net for multiple
namespaces within the same thread (think of a system monitor process).
The simplest solution is to start the program with:
ip netns exec namespace program 3</sys/class/net
and then use openat(3, ...) to read items in the 'init' namespace.

FWIW I'm pretty sure there is a sequence involving unshare() that
can get you out of a chroot - but I've not found it yet.

David

>
> fs/proc/base.c | 12 ++++++--
> fs/proc/internal.h | 2 ++
> fs/proc/proc_net.c | 68 ++++++++++++++++++++++++++++++++++++++++-
> fs/proc/root.c | 7 ++++-
> include/linux/proc_fs.h | 1 +
> 5 files changed, 85 insertions(+), 5 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 93f7e3d971e4..c205234f3822 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -3479,7 +3479,7 @@ static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter ite
> return iter;
> }
>
> -#define TGID_OFFSET (FIRST_PROCESS_ENTRY + 2)
> +#define TGID_OFFSET (FIRST_PROCESS_ENTRY + 3)
>
> /* for the /proc/ directory itself, after non-process stuff has been done */
> int proc_pid_readdir(struct file *file, struct dir_context *ctx)
> @@ -3492,18 +3492,24 @@ int proc_pid_readdir(struct file *file, struct dir_context *ctx)
> if (pos >= PID_MAX_LIMIT + TGID_OFFSET)
> return 0;
>
> - if (pos == TGID_OFFSET - 2) {
> + if (pos == TGID_OFFSET - 3) {
> struct inode *inode = d_inode(fs_info->proc_self);
> if (!dir_emit(ctx, "self", 4, inode->i_ino, DT_LNK))
> return 0;
> ctx->pos = pos = pos + 1;
> }
> - if (pos == TGID_OFFSET - 1) {
> + if (pos == TGID_OFFSET - 2) {
> struct inode *inode = d_inode(fs_info->proc_thread_self);
> if (!dir_emit(ctx, "thread-self", 11, inode->i_ino, DT_LNK))
> return 0;
> ctx->pos = pos = pos + 1;
> }
> + if (pos == TGID_OFFSET - 1) {
> + struct inode *inode = d_inode(fs_info->proc_net);
> + if (!dir_emit(ctx, "net", 11, inode->i_ino, DT_LNK))

The 11 is the name length, so it needs to be 3 for "net".
This block can also be put first - to reduce churn.

David

> + return 0;
> + ctx->pos = pos = pos + 1;
> + }
> iter.tgid = pos - TGID_OFFSET;
> iter.task = NULL;
> for (iter = next_tgid(ns, iter);
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index 06a80f78433d..9d13c24b80c8 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -232,8 +232,10 @@ extern const struct inode_operations proc_net_inode_operations;
>
> #ifdef CONFIG_NET
> extern int proc_net_init(void);
> +extern int proc_setup_net_symlink(struct super_block *s);
> #else
> static inline int proc_net_init(void) { return 0; }
> +static inline int proc_setup_net_symlink(struct super_block *s) { return 0; }
> #endif
>
> /*
> diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
> index 856839b8ae8b..99335e800c1c 100644
> --- a/fs/proc/proc_net.c
> +++ b/fs/proc/proc_net.c
> @@ -408,9 +408,75 @@ static struct pernet_operations __net_initdata proc_net_ns_ops = {
> .exit = proc_net_ns_exit,
> };
>
> +/*
> + * /proc/net:
> + */
> +static const char *proc_net_symlink_get_link(struct dentry *dentry,
> + struct inode *inode,
> + struct delayed_call *done)
> +{
> + struct pid_namespace *ns = proc_pid_ns(inode->i_sb);
> + pid_t tid = task_pid_nr_ns(current, ns);
> + char *name;
> +
> + if (!tid)
> + return ERR_PTR(-ENOENT);
> + name = kmalloc(10 + 4 + 1, dentry ? GFP_KERNEL : GFP_ATOMIC);
> + if (unlikely(!name))
> + return dentry ? ERR_PTR(-ENOMEM) : ERR_PTR(-ECHILD);
> + sprintf(name, "%u/net", tid);
> + set_delayed_call(done, kfree_link, name);
> + return name;
> +}
> +
> +static const struct inode_operations proc_net_symlink_inode_operations = {
> + .get_link = proc_net_symlink_get_link,
> +};
> +
> +static unsigned net_symlink_inum __ro_after_init;
> +
> +int proc_setup_net_symlink(struct super_block *s)
> +{
> + struct inode *root_inode = d_inode(s->s_root);
> + struct proc_fs_info *fs_info = proc_sb_info(s);
> + struct dentry *net_symlink;
> + int ret = -ENOMEM;
> +
> + inode_lock(root_inode);
> + net_symlink = d_alloc_name(s->s_root, "net");
> + if (net_symlink) {
> + struct inode *inode = new_inode(s);
> + if (inode) {
> + inode->i_ino = net_symlink_inum;
> + inode->i_mtime = inode->i_atime = inode->i_ctime = current_time(inode);
> + inode->i_mode = S_IFLNK | S_IRWXUGO;
> + inode->i_uid = GLOBAL_ROOT_UID;
> + inode->i_gid = GLOBAL_ROOT_GID;
> + inode->i_op = &proc_net_symlink_inode_operations;
> + d_add(net_symlink, inode);
> + ret = 0;
> + } else {
> + dput(net_symlink);
> + }
> + }
> + inode_unlock(root_inode);
> +
> + if (ret)
> + pr_err("proc_fill_super: can't allocate /proc/net\n");
> + else
> + fs_info->proc_net = net_symlink;
> +
> + return ret;
> +}
> +
> +void __init proc_net_symlink_init(void)
> +{
> + proc_alloc_inum(&net_symlink_inum);
> +}
> +
> int __init proc_net_init(void)
> {
> - proc_symlink("net", NULL, "self/net");
> + proc_net_symlink_init();
>
> return register_pernet_subsys(&proc_net_ns_ops);
> }
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index 3c2ee3eb1138..6e57e9a4acf9 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -207,7 +207,11 @@ static int proc_fill_super(struct super_block *s, struct fs_context *fc)
> if (ret) {
> return ret;
> }
> - return proc_setup_thread_self(s);
> + ret = proc_setup_thread_self(s);
> + if (ret) {
> + return ret;
> + }
> + return proc_setup_net_symlink(s);
> }
>
> static int proc_reconfigure(struct fs_context *fc)
> @@ -268,6 +272,7 @@ static void proc_kill_sb(struct super_block *sb)
>
> dput(fs_info->proc_self);
> dput(fs_info->proc_thread_self);
> + dput(fs_info->proc_net);
>
> kill_anon_super(sb);
> put_pid_ns(fs_info->pid_ns);
> diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h
> index 81d6e4ec2294..65f4ef15c8bf 100644
> --- a/include/linux/proc_fs.h
> +++ b/include/linux/proc_fs.h
> @@ -62,6 +62,7 @@ struct proc_fs_info {
> struct pid_namespace *pid_ns;
> struct dentry *proc_self; /* For /proc/self */
> struct dentry *proc_thread_self; /* For /proc/thread-self */
> + struct dentry *proc_net; /* For /proc/net */
> kgid_t pid_gid;
> enum proc_hidepid hide_pid;
> enum proc_pidonly pidonly;
> --
> 2.35.3
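
The third argument of dir_emit() in the readdir hunk is the length of the name: 4 for "self", 11 for "thread-self", and 3 for "net". One way to keep a copied block from carrying a stale length is to derive it from the literal. The sketch below is a userspace illustration; NAME_LEN and emit() are hypothetical stand-ins, not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: expands a string literal to "name, length" arguments. */
#define NAME_LEN(s) (s), (sizeof(s) - 1)

struct emitted {
	const char *name;
	size_t len;
};

/* Stand-in for dir_emit(): records the name and length it was given. */
int emit(struct emitted *out, const char *name, size_t len)
{
	assert(out != NULL && name != NULL);
	out->name = name;
	out->len = len;
	return 1;
}
```

A call then reads emit(&e, NAME_LEN("net")), and the length follows the literal automatically when the block is copied and renamed.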

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-09-30 14:34:38

by Alexey Dobriyan

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

Al wrote:

> Just to troll adobriyan a bit:
>
> static const char *dynamic_get_link(struct delayed_call *done,
> bool is_rcu,
> const char *fmt, ...)
> {
> va_list args;
> char *body;
>
> va_start(args, fmt);
> body = kvasprintf(is_rcu ? GFP_ATOMIC : GFP_KERNEL, fmt, args);
> va_end(args);

Ouch... Double pass over data. Who wrote this?

>
> if (unlikely(!body))
> return is_rcu ? ERR_PTR(-ECHILD) : ERR_PTR(-ENOMEM);
> set_delayed_call(done, kfree_link, body);
> return body;
> }
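
The "double pass" refers to formatting once to measure the output and a second time to fill the buffer. A userspace sketch of that pattern is below; alloc_printf() is a hypothetical name, and malloc() stands in for the kernel allocator:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format into a freshly allocated buffer sized to fit.
 * Returns NULL on failure; the caller frees the result.
 */
char *alloc_printf(const char *fmt, ...)
{
	va_list args, copy;
	char *body;
	int len;

	assert(fmt != NULL);
	va_start(args, fmt);
	va_copy(copy, args);
	len = vsnprintf(NULL, 0, fmt, copy);	/* pass 1: measure only */
	va_end(copy);
	if (len < 0) {
		va_end(args);
		return NULL;
	}
	body = malloc((size_t)len + 1);
	if (body)
		vsnprintf(body, (size_t)len + 1, fmt, args);	/* pass 2: format */
	va_end(args);
	return body;
}
```

The fixed-size kmalloc() plus sprintf() in the posted patch avoids the measuring pass, at the cost of a hand-computed buffer size.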

2022-09-30 16:29:06

by Eric W. Biederman

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

David Laight <[email protected]> writes:

> From: Eric W. Biederman
>> Sent: 29 September 2022 23:48
>>
>> Since common apparmor policies don't allow access /proc/tgid/task/tid/net
>> point the code at /proc/tid/net instead.
>>
>> Link: https://lkml.kernel.org/r/[email protected]
>> Signed-off-by: "Eric W. Biederman" <[email protected]>
>> ---
>>
>> I have only compile tested this. All of the boiler plate is a copy of
>> /proc/self and /proc/thread-self, so it should work.
>>
>> Can David or someone who cares and has access to the limited apparmor
>> configurations could test this to make certain this works?
>
> It works with a minor 'cut & paste' fixup.
> (Not nested inside a program that changes namespaces.)

Were there any apparmor problems? I just want to confirm that is what
you tested.

Assuming not, this patch looks like it reveals a solution to this
issue.

> Although if it is reasonable for /proc/net -> /proc/tid/net
> why not just make /proc/thread-self -> /proc/tid
> Then /proc/net can just be thread-self/net

There are minor differences between the process directories that
tend to report process wide information and task directories that
only report some of the same information per-task. So in general
thread-self makes much more sense pointing to a per-task directory.

The hidden /proc/tid/ directories use the per process code to generate
themselves. The difference is that they assume the tid is the leading
thread instead of the other process. Those directories are all a bit of
a scrambled mess. I was suspecting the other day we might be able to
fix gdb and make them go away entirely in a decade or so.

So I don't think it makes sense in general to point /proc/thread-self at
the hidden per /proc/tid/ directories.

> I have wondered if the namespace lookup could be done as a 'special'
> directory lookup for "net" rather than changing everything when the
> namespace is changed.
> I can imagine scenarios where a thread needs to keep changing
> between two namespaces, at the moment I suspect that is rather
> more expensive than a lookup and changing the reference counts.

You can always open the net directories once and then change
namespaces, as an open directory will not change between namespaces.

> Notwithstanding the apparmor issues, /proc/net could actually be
> a symlink to (say) /proc/net_namespaces/namespace_name with
> readlink returning the name based on the thread's actual namespace.

There really aren't good names for namespaces at the kernel level, as
one of their use cases is to make process migration possible between
machines. So any kernel level name would need to be migrated as well.
So those kernel level names would need a name in another namespace,
or an extra namespace would have to be created for those names.

> I've also had problems with accessing /sys/class/net for multiple
> namespaces within the same thread (think of a system monitor process).
> The simplest solution is to start the program with:
> ip netns exec namespace program 3</sys/class/net
> and then use openat(3, ...) to read items in the 'init' namespace.
>
> FWIW I'm pretty sure there is a sequence involving unshare() that
> can get you out of a chroot - but I've not found it yet.

Out of a chroot is essentially just:
chdir("/");
chroot("/somedir");
chdir("../../../../../../../../../../../../../../../..");
Getting out of most namespaces, except the pid and user namespaces, is
just a setns().

You can't get out of the pid namespace as you can't change your pid.

Not being able to escape a user namespace is what makes it impossible to
confuse a process and gain privileges through a privilege gaining exec.

Eric

2022-09-30 21:31:00

by David Laight

Subject: RE: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

From: Eric W. Biederman
> Sent: 30 September 2022 17:17
>
> David Laight <[email protected]> writes:
>
> > From: Eric W. Biederman
> >> Sent: 29 September 2022 23:48
> >>
> >> Since common apparmor policies don't allow access /proc/tgid/task/tid/net
> >> point the code at /proc/tid/net instead.
> >>
> >> Link: https://lkml.kernel.org/r/[email protected]
> >> Signed-off-by: "Eric W. Biederman" <[email protected]>
> >> ---
> >>
> >> I have only compile tested this. All of the boiler plate is a copy of
> >> /proc/self and /proc/thread-self, so it should work.
> >>
> >> Can David or someone who cares and has access to the limited apparmor
> >> configurations could test this to make certain this works?
> >
> > It works with a minor 'cut & paste' fixup.
> > (Not nested inside a program that changes namespaces.)
>
> Were there any apparmor problems? I just want to confirm that is what
> you tested.

I know nothing about apparmor - I just tested that /proc/net
pointed to somewhere that looked right.

> Assuming not this patch looks like it reveals a solution to this
> issue.
>
> > Although if it is reasonable for /proc/net -> /proc/tid/net
> > why not just make /proc/thread-self -> /proc/tid
> > Then /proc/net can just be thread-self/net
>
> There are minor differences between the process directories that
> tend to report process wide information and task directories that
> only report some of the same information per-task. So in general
> thread-self makes much more sense pointing to a per-task directory.
>
> The hidden /proc/tid/ directories use the per process code to generate
> themselves. The difference is that they assume the tid is the leading
> thread instead of the other process. Those directories are all a bit of
> a scrambled mess. I was suspecting the other day we might be able to
> fix gdb and make them go away entirely in a decade or so.
>
> So I don't think it makes sense in general to point /proc/thread-self at
> the hidden per /proc/tid/ directories.

Ok - I hadn't actually looked in them.
But if you have a long-term plan to remove them, directing /proc/net
through them might not be such a good idea.

> > I have wondered if the namespace lookup could be done as a 'special'
> > directory lookup for "net" rather than changing everything when the
> > namespace is changed.
> > I can imagine scenarios where a thread needs to keep changing
> > between two namespaces, at the moment I suspect that is rather
> > more expensive than a lookup and changing the reference counts.
>
> You can always open the net directories once, and then change as
> an open directory will not change between namespaces.

Part of the problem is that changing the net namespace isn't
enough, you also have to remount /sys - which isn't entirely
trivial.
It might be possible to mount a network namespace version
of /sys on a different mountpoint - I've not tried very
hard to do that.

> > Notwithstanding the apparmor issues, /proc/net could actually be
> > a symlink to (say) /proc/net_namespaces/namespace_name with
> > readlink returning the name based on the thread's actual namespace.
>
> There really aren't good names for namespaces at the kernel level. As
> one of their use cases is to make process migration possible between
> machines. So any kernel level name would need to be migrated as well.
> So those kernel level names would need a name in another namespace,
> or an extra namespace would have to be created for those names.

Network namespaces do seem to have names.
Although I gave up working out how to change to a named network
namespace from within the kernel (especially in a non-GPL module).

...
> > FWIW I'm pretty sure there is a sequence involving unshare() that
> > can get you out of a chroot - but I've not found it yet.
>
> Out of a chroot is essentially just:
> chdir("/");
> chroot("/somedir");
> chdir("../../../../../../../../../../../../../../../..");

A chdir() inside a chroot anchors at the base of the chroot.
fchdir() will get you out if you have an open fd to a directory
outside the chroot.
The 'usual' way out requires a process outside the chroot to
just use mvdir().
But there isn't supposed to be a way to get out.

I can certainly get the /proc symlinks (for a copy of /proc
mounted inside a chroot) to report the full paths for files
that exist inside the chroot.
These should (and do normally) truncate at the chroot base.
(This all happened because a pivot_root() was failing.)

David


2022-10-02 00:38:57

by Al Viro

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

On Fri, Sep 30, 2022 at 09:28:31PM +0000, David Laight wrote:
> > > FWIW I'm pretty sure there a sequence involving unshare() that
> > > can get you out of a chroot - but I've not found it yet.
> >
> > Out of a chroot is essentially just:
> > chdir("/");
> > chroot("/somedir");
> > chdir("../../../../../../../../../../../../../../../..");
>
> A chdir() inside a chroot anchors at the base of the chroot.
> fchdir() will get you out if you have an open fd to a directory
> outside the chroot.
> The 'usual' way out requires a process outside the chroot to
> just use mvdir().
> But there isn't supposed to be a way to get out.

In order of original claims:

* chdir inside a chroot does *NOT* "anchor at the base of the chroot".
What it does is (a) start at the base if the pathname is absolute and
(b) treats .. in the base as ., same as any other syscall.

* correct.

* WTF is "mvdir()"? Some Unices used to have mvdir(1), but it had never
been a function... And mv(1) (or rename(2)) is far from being the only
way for assistant outside of jail to let the chrooted process out.

* ability to chroot(2) had always been equivalent to ability to undo
chroot(2). If you want to prevent getting out of there, you need
(among other things) to prevent the processes to be confined from
further chroot(2).

2022-10-03 10:41:13

by David Laight

Subject: RE: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

...
> * ability to chroot(2) had always been equivalent to ability to undo
> chroot(2). If you want to prevent getting out of there, you need
> (among other things) to prevent the processes to be confined from
> further chroot(2).

Not always, certainly not historically.
chroot() inside a chroot() just constrained you further.
If fchdir() and openat() have broken that it is a serious
problem.

NetBSD certainly has checks to detect (log and fix)
programs that have escaped (or might escape) from chroots.

unshare() seems to create a 'shadow' inode structure
for the chroot's "/" so at least some of the tests
when following ".." fail to detect it.

I also thought containers relied on the same scheme?
(But I'm too old fashioned to have looked into them!)

David


2022-10-03 14:29:10

by Al Viro

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

On Mon, Oct 03, 2022 at 09:36:46AM +0000, David Laight wrote:
> ...
> > * ability to chroot(2) had always been equivalent to ability to undo
> > chroot(2). If you want to prevent getting out of there, you need
> > (among other things) to prevent the processes to be confined from
> > further chroot(2).
>
> Not always, certainly not historically.

Factually incorrect.

> chroot() inside a chroot() just constrained you further.

What it did was change your root directory. Yes, deeper.
And leave your current directory where it had been.

Now, recall that chroot does *NOT* affect the
interpretation of .. other than in the current root.

Which means that an attacker doing
chdir("/");
chroot(some_existing_directory);
chdir("..");
will end up outside of the original chroot environment.

This is POSIX-mandated behaviour. Moreover, that is behaviour of
historical Unices. Any Unix programmer who tries to use chroot(2)
should be aware of that. The ability to make chroot(2) calls
means the ability to break out of any chroot you are currently in.

> If fchdir() and openat() have broken that it is a serious
> problem.

Have you even read the mail you'd been replying to? Where did anything
in the example given (OK, sketched out) to you upthread involve fchdir()
or openat()?

2022-10-03 17:27:13

by Eric W. Biederman

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

David Laight <[email protected]> writes:

> From: Eric W. Biederman
>> Sent: 30 September 2022 17:17
>>
>> David Laight <[email protected]> writes:
>>
>> > From: Eric W. Biederman
>> >> Sent: 29 September 2022 23:48
>> >>
>> >> Since common apparmor policies don't allow access /proc/tgid/task/tid/net
>> >> point the code at /proc/tid/net instead.
>> >>
>> >> Link: https://lkml.kernel.org/r/[email protected]
>> >> Signed-off-by: "Eric W. Biederman" <[email protected]>
>> >> ---
>> >>
>> >> I have only compile tested this. All of the boiler plate is a copy of
>> >> /proc/self and /proc/thread-self, so it should work.
>> >>
>> >> Can David or someone who cares and has access to the limited apparmor
>> >> configurations could test this to make certain this works?
>> >
>> > It works with a minor 'cut & paste' fixup.
>> > (Not nested inside a program that changes namespaces.)
>>
>> Were there any apparmor problems? I just want to confirm that is what
>> you tested.
>
> I know nothing about apparmor - I just tested that /proc/net
> pointed to somewhere that looked right.

Fair enough. We should attempt to verify with an apparmor configuration
before merging this just in case there is a detail someone overlooked.
It doesn't help much if there is a fix that has to be reverted right
away.
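
A minimal sketch of the "does /proc/net point somewhere that looks right" check mentioned above, assuming procfs is mounted at /proc. The helper name proc_net_target is hypothetical and exists only for illustration. It reports where the symlink points; it says nothing about whether an apparmor profile would permit the access.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Read the target of the /proc/net symlink into buf (NUL-terminated).
 * Returns the target length, or -1 if the buffer is empty, /proc/net is
 * missing, or it is not a symlink.
 * proc_net_target is a hypothetical helper, not a kernel or libc API.
 */
ssize_t proc_net_target(char *buf, size_t size)
{
	ssize_t n;

	if (size == 0)
		return -1;
	/* Leave room for the terminator; readlink() does not add one. */
	n = readlink("/proc/net", buf, size - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}
```

Calling it before and after applying a patch like this one shows whether the target changed. A real verification would still need to run the access under the restrictive apparmor profile itself.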


>> Assuming not this patch looks like it reveals a solution to this
>> issue.
>>
>> > Although if it is reasonable for /proc/net -> /proc/tid/net
>> > why not just make /proc/thread-self -> /proc/tid
>> > Then /proc/net can just be thread-self/net
>>
>> There are minor differences between the process directories that
>> tend to report process wide information and task directories that
>> only report some of the same information per-task. So in general
>> thread-self makes much more sense pointing to a per-task directory.
>>
>> The hidden /proc/tid/ directories use the per process code to generate
>> themselves. The difference is that they assume the tid is the leading
>> thread instead of the other process. Those directories are all a bit of
>> a scrambled mess. I was suspecting the other day we might be able to
>> fix gdb and make them go away entirely in a decade or so.
>>
>> So I don't think it makes sense in general to point /proc/thread-self at
>> the hidden per /proc/tid/ directories.
>
> Ok - I hadn't actually looked in them.
> But if you have a long-term plan to remove them directing /proc/net
> thought them might not be such a good idea.

Nah. I just want to grouse about them and encourage people not to
use them in general. They are a weird special case. They aren't
painful enough to maintain to make me want to do something else.

It would actually be less work to fix the apparmor security policies,
and then to verify over the course of several years that the broken
security policies are no longer shipped.


>> > I have wondered if the namespace lookup could be done as a 'special'
>> > directory lookup for "net" rather that changing everything when the
>> > namespace is changed.
>> > I can imagine scenarios where a thread needs to keep changing
>> > between two namespaces, at the moment I suspect that is rather
>> > more expensive than a lookup and changing the reference counts.
>>
>> You can always open the net directories once, and then change as
>> an open directory will not change between namespaces.
>
> Part of the problem is that changing the net namespace isn't
> enough, you also have to remount /sys - which isn't entirely
> trivial.

Yes. That is actually a much more maintainable model. But it is still
imperfect. I was thinking about the proc/net directories when
I made my comment. Unlike proc where we have task ids there is nothing
in /proc that can do anything.

> It might be possibly to mount a network namespace version
> of /sys on a different mountpoint - I've not tried very
> hard to do that.

It is a bug if that doesn't work.

>> > Notwithstanding the apparmor issues, /proc/net could actuall be
>> > a symlink to (say) /proc/net_namespaces/namespace_name with
>> > readlink returning the name based on the threads actual namespace.
>>
>> There really aren't good names for namespaces at the kernel level. As
>> one of their use cases is to make process migration possible between
>> machines. So any kernel level name would need to be migrated as well.
>> So those kernel level names would need a name in another namespace,
>> or an extra namespace would have to be created for those names.
>
> Network namespaces do seem to have names.
> Although I gave up working out how to change to a named network
> namespace from within the kernel (especially in a non-GPL module).

Network namespaces have mount points. The mount points have names.

It is just a matter of finding the right filesystem and calling
sys_rename().

There are some network-namespace-local names for other network
namespaces. For those I don't see how it would make any sense
to change the name. If you need to you can always create a
new network namespace and ensure you get the name you want there.
Which is good enough for process migration. I don't know why else
anyone would want to change names.

> ...
>> > FWIW I'm pretty sure there a sequence involving unshare() that
>> > can get you out of a chroot - but I've not found it yet.
>>
>> Out of a chroot is essentially just:
>> chdir("/");
>> chroot("/somedir");
>> chdir("../../../../../../../../../../../../../../../..");
>
> A chdir() inside a chroot anchors at the base of the chroot.

But the check is very simple.
If (working_directory == root_directory) make chdir("..") a noop.

Once the working directory is outside the root directory (as
chroot("/somedir") achieves), the chroot checks are no longer usable.
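
The rule described here can be modelled in a few lines. This is a toy model, not kernel code: struct dir, struct proc and the toy_* functions are all hypothetical names, and real pathname resolution handles mounts and much else. It only shows why comparing the working directory against the root stops helping once chroot() has moved the root deeper without moving the working directory.

```c
#include <stddef.h>

/* Toy model only: a directory knows its parent; a process has a root and a cwd. */
struct dir {
	struct dir *parent;	/* NULL for the real filesystem root */
};

struct proc {
	struct dir *root;
	struct dir *cwd;
};

/* chdir(".."): a no-op only when cwd is exactly the process root. */
void toy_chdir_dotdot(struct proc *p)
{
	if (p->cwd == p->root)
		return;
	if (p->cwd->parent)
		p->cwd = p->cwd->parent;
}

/* chdir("/"): absolute lookups start at the process root. */
void toy_chdir_root(struct proc *p)
{
	p->cwd = p->root;
}

/* chroot(d): changes the root and deliberately leaves cwd alone. */
void toy_chroot(struct proc *p, struct dir *d)
{
	p->root = d;
}
```

With a process jailed in some directory, toy_chdir_root() followed by toy_chroot() into a subdirectory leaves cwd above the new root, so the equality test in toy_chdir_dotdot() never fires and ".." keeps climbing.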

> fchdir() will get you out if you have an open fd to a directory
> outside the chroot.
> The 'usual' way out requires a process outside the chroot to
> just use mvdir().
> But there isn't supposed to be a way to get out.

As I recall the history, chroot was a quick hack to allow
building against a different version of the binaries than were currently
installed. It was not built as a security feature.

Eric

2022-10-03 19:03:59

by Al Viro

Subject: Re: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

On Mon, Oct 03, 2022 at 12:07:27PM -0500, Eric W. Biederman wrote:

> > fchdir() will get you out if you have an open fd to a directory
> > outside the chroot.
> > The 'usual' way out requires a process outside the chroot to
> > just use mvdir().
> > But there isn't supposed to be a way to get out.
>
> As I recall the history chroot was a quick hack to allow building a
> building against a different version of the binaries than were currently
> installed. It was not built as a security feature.

A last-moment prerelease hack in v7, by the look of it; at that point it
hadn't even tried to modify ".." behaviour in the directory you'd been
chrooted into - just modified the starting point for resolving absolute pathnames.

Not even token attempts at confinement until a 1982 commit by Bill Joy,
during one of the namei rewrites. No idea how or when non-BSD branches
had picked that up.

At no point did chroot(2) switch the current directory. fchdir(2) doesn't
add anything to the situation when
chdir("/");
chroot("some_directory");
chdir("../../../../../../../..");
chroot(".");
will break you out of it nicely.
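
Written out as a C function with error checking, the sequence above might look like the following sketch. It assumes the caller holds CAP_SYS_CHROOT and that subdir names an existing directory inside the current root; the name chroot_walk_out and the fixed count of 64 ".." steps are arbitrary choices for illustration.

```c
#define _DEFAULT_SOURCE		/* for the chroot() declaration on glibc */
#include <unistd.h>

/*
 * Sketch of the sequence quoted above. Returns 0 on success, or -1
 * (with errno set) at the first failing call.
 * chroot_walk_out is a hypothetical name used for illustration.
 */
int chroot_walk_out(const char *subdir)
{
	int i;

	if (chdir("/") < 0)		/* cwd = current root */
		return -1;
	if (chroot(subdir) < 0)		/* root moves deeper, cwd stays put */
		return -1;
	for (i = 0; i < 64; i++)	/* cwd is outside the root, so ".." climbs */
		if (chdir("..") < 0)
			return -1;
	return chroot(".");		/* adopt wherever we ended up as the root */
}
```

Without the privilege to call chroot(2) the second step fails and nothing else happens, which is the point of dropping privileges right after chrooting.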

Again, chroot(2) had never been intended to be root-resistant; there's
a reason why "drop elevated privileges right after chrooting" is
in all kinds of UNIX FAQs (very likely in Stevens et al. as well -
I don't have the relevant volume in front of me, but it's certainly
something covered in textbooks).

chroot(2) can be useful in confining processes, but you need to be
really careful about the ways you use it.

2022-10-04 09:21:14

by David Laight

Subject: RE: [CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace

From: Eric W. Biederman
> Sent: 03 October 2022 18:07
>
> David Laight <[email protected]> writes:
>
> > From: Eric W. Biederman
...
> > Part of the problem is that changing the net namespace isn't
> > enough, you also have to remount /sys - which isn't entirely
> > trivial.
>
> Yes. That is actually a much more maintainable model. But it is still
> imperfect. I was thinking about the proc/net directories when
> I made my comment. Unlike proc where we have task ids there is nothing
> in /proc that can do anything.
>
> > It might be possibly to mount a network namespace version
> > of /sys on a different mountpoint - I've not tried very
> > hard to do that.
>
> It is a bug if that doesn't work.

The difficulty is picking the 'spell'.
I think you need to run mount after switching to the namespace.
But you don't want the unshare() that 'ip netns exec' does.
So I think it needs a silly wrapper program.

> >> > Notwithstanding the apparmor issues, /proc/net could actuall be
> >> > a symlink to (say) /proc/net_namespaces/namespace_name with
> >> > readlink returning the name based on the threads actual namespace.
> >>
> >> There really aren't good names for namespaces at the kernel level. As
> >> one of their use cases is to make process migration possible between
> >> machines. So any kernel level name would need to be migrated as well.
> >> So those kernel level names would need a name in another namespace,
> >> or an extra namespace would have to be created for those names.
> >
> > Network namespaces do seem to have names.
> > Although I gave up working out how to change to a named network
> > namespace from within the kernel (especially in a non-GPL module).
>
> Network namespaces have mount points. The mount points have names.
>
> It is just a matter of finding the right filesystem and calling
> sys_rename().

I wanted to look up a net namespace by name - so I could create
a kernel socket in a namespace specified in configuration data.
Not change the name of a namespace.

I ended up only giving a few options - basically saving the
namespace of code that called into the driver.
(Harder in a non-gpl driver since you can't directly hold/release
the namespace itself - fortunately you can create a socket!)

David


2022-10-05 13:16:54

by kernel test robot

Subject: [proc] 5336f1902b: BUG:KASAN:global-out-of-bounds_in_memchr


Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 5336f1902b4ba8a646f082f32fbb183850a13080 ("[CFT][PATCH] proc: Update /proc/net to point at the accessing threads network namespace")
url: https://github.com/intel-lab-lkp/linux/commits/Eric-W-Biederman/proc-Update-proc-net-to-point-at-the-accessing-threads-network-namespace/20220930-065017
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 987a926c1d8a40e4256953b04771fbdb63bc7938
patch link: https://lore.kernel.org/lkml/[email protected]

in testcase: xfstests
version: xfstests-x86_64-5a5e419-1_20220927
with following parameters:

disk: 6HDD
fs: btrfs
test: btrfs-group-21

test-description: xfstests is a regression test suite for xfs and other filesystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+------------------------------------------+------------+------------+
|                                          | 987a926c1d | 5336f1902b |
+------------------------------------------+------------+------------+
| BUG:KASAN:global-out-of-bounds_in_memchr | 0          | 78         |
+------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/r/[email protected]
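
The trace in this report reaches memchr() through verify_dirent_name(), called from filldir64() while proc_pid_readdir() emits an entry. Below is a user-space sketch of that kind of name check, to show why the length argument matters. The function name toy_verify_dirent_name and the TOY_PATH_MAX constant are assumptions for illustration. The report does not say which name or length was passed, so a length larger than the storage behind the name is only one plausible explanation.

```c
#include <string.h>

#define TOY_PATH_MAX 4096	/* assumption: stands in for Linux's PATH_MAX */

/*
 * User-space rendition of a directory-entry name check: reject empty or
 * over-long names and names containing '/'.
 * memchr() may read up to len bytes, so len must not exceed the storage
 * behind name; a too-large len reads past the end of the object.
 * Returns 0 if the name is acceptable, -1 otherwise.
 */
int toy_verify_dirent_name(const char *name, int len)
{
	if (len <= 0 || len >= TOY_PATH_MAX)
		return -1;
	if (memchr(name, '/', (size_t)len))
		return -1;
	return 0;
}
```

Passing "net" with a length of 3 is fine. Passing the same string with a length copied from a longer name would ask memchr() to scan beyond the literal, which is the class of access KASAN flags as global-out-of-bounds.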


[ 71.510417][ T7965] BUG: KASAN: global-out-of-bounds in memchr (lib/string.c:883)
[ 71.517984][ T7965] Read of size 1 at addr ffffffff83b51604 by task killall/7965
[ 71.526948][ T7965]
[ 71.530801][ T7965] CPU: 1 PID: 7965 Comm: killall Tainted: G S 6.0.0-rc7-00133-g5336f1902b4b #1
[ 71.541870][ T7965] Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05 12/05/2013
[ 71.550663][ T7965] Call Trace:
[ 71.554659][ T7965] <TASK>
[ 71.558263][ T7965] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 71.563451][ T7965] print_address_description+0x1f/0x200
[ 71.570717][ T7965] print_report.cold (mm/kasan/report.c:434)
[ 71.576191][ T7965] ? _raw_spin_lock_irqsave (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[ 71.582283][ T7965] ? memchr (lib/string.c:883)
[ 71.586903][ T7965] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:497)
[ 71.591946][ T7965] ? memchr (lib/string.c:883)
[ 71.596524][ T7965] memchr (lib/string.c:883)
[ 71.600908][ T7965] verify_dirent_name (include/linux/fortify-string.h:432 fs/readdir.c:114)
[ 71.606310][ T7965] filldir64 (fs/readdir.c:320)
[ 71.611049][ T7965] ? folio_add_lru (arch/x86/include/asm/preempt.h:85 mm/swap.c:491)
[ 71.616173][ T7965] proc_pid_readdir (fs/proc/base.c:3509)
[ 71.621607][ T7965] ? proc_pid_lookup (fs/proc/base.c:3486)
[ 71.627090][ T7965] ? proc_readdir_de (arch/x86/include/asm/atomic.h:165 arch/x86/include/asm/atomic.h:178 include/linux/atomic/atomic-instrumented.h:147 include/asm-generic/qrwlock.h:113 include/linux/rwlock_api_smp.h:232 fs/proc/generic.c:321 fs/proc/generic.c:284)
[ 71.632617][ T7965] iterate_dir (fs/readdir.c:65)
[ 71.637581][ T7965] __x64_sys_getdents64 (fs/readdir.c:370 fs/readdir.c:354 fs/readdir.c:354)
[ 71.643287][ T7965] ? __ia32_sys_getdents (fs/readdir.c:354)
[ 71.649053][ T7965] ? handle_mm_fault (mm/memory.c:5157)
[ 71.654497][ T7965] ? __x64_sys_getdents (fs/readdir.c:312)
[ 71.660169][ T7965] ? do_user_addr_fault (arch/x86/mm/fault.c:1426)
[ 71.665863][ T7965] ? exit_to_user_mode_loop (include/linux/sched.h:2305 include/linux/resume_user_mode.h:61 kernel/entry/common.c:169)
[ 71.671798][ T7965] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 71.676678][ T7965] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 71.682981][ T7965] RIP: 0033:0x7f7f3cbe5387
[ 71.687826][ T7965] Code: 0f 1f 00 48 8b 47 20 c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 81 fa ff ff ff 7f b8 ff ff ff 7f 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 d9 aa 10 00 f7 d8 64 89 02 48
All code
========
0: 0f 1f 00 nopl (%rax)
3: 48 8b 47 20 mov 0x20(%rdi),%rax
7: c3 retq
8: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
f: 00 00 00
12: 90 nop
13: 48 81 fa ff ff ff 7f cmp $0x7fffffff,%rdx
1a: b8 ff ff ff 7f mov $0x7fffffff,%eax
1f: 48 0f 47 d0 cmova %rax,%rdx
23: b8 d9 00 00 00 mov $0xd9,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 01 ja 0x33
32: c3 retq
33: 48 8b 15 d9 aa 10 00 mov 0x10aad9(%rip),%rdx # 0x10ab13
3a: f7 d8 neg %eax
3c: 64 89 02 mov %eax,%fs:(%rdx)
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 01 ja 0x9
8: c3 retq
9: 48 8b 15 d9 aa 10 00 mov 0x10aad9(%rip),%rdx # 0x10aae9
10: f7 d8 neg %eax
12: 64 89 02 mov %eax,%fs:(%rdx)
15: 48 rex.W
[ 71.708650][ T7965] RSP: 002b:00007ffe0194edd8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
[ 71.717564][ T7965] RAX: ffffffffffffffda RBX: 00005566e677e3a0 RCX: 00007f7f3cbe5387
[ 71.726016][ T7965] RDX: 0000000000008000 RSI: 00005566e677e3d0 RDI: 0000000000000004
[ 71.734485][ T7965] RBP: 00005566e677e3d0 R08: 0000000000000030 R09: 00007f7f3ccf0be0
[ 71.742941][ T7965] R10: fffffffffffffd18 R11: 0000000000000293 R12: ffffffffffffff80
[ 71.751511][ T7965] R13: 00005566e677e3a4 R14: 0000000000000000 R15: 00005566e67863e0
[ 71.759971][ T7965] </TASK>
[ 71.763468][ T7965]
[ 71.766290][ T7965] The buggy address belongs to the variable:
[ 71.772827][ T7965] proc_fs_parameters+0xcc4/0xd60
[ 71.778272][ T7965]
[ 71.780980][ T7965] Memory state around the buggy address:
[ 71.787003][ T7965] ffffffff83b51500: 03 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9
[ 71.795510][ T7965] ffffffff83b51580: 05 f9 f9 f9 f9 f9 f9 f9 00 04 f9 f9 f9 f9 f9 f9
[ 71.803973][ T7965] >ffffffff83b51600: 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9
[ 71.812459][ T7965] ^
[ 71.816899][ T7965] ffffffff83b51680: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 07 f9 f9 f9
[ 71.825394][ T7965] ffffffff83b51700: f9 f9 f9 f9 05 f9 f9 f9 f9 f9 f9 f9 07 f9 f9 f9
[ 71.833858][ T7965] ==================================================================
[ 71.842353][ T7965] Disabling lock debugging due to kernel taint
[ 73.113893][ T7993] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 73.122970][ T7993] BTRFS info (device sdb2): disk space caching is enabled
[ 73.249734][ T353] btrfs/212 _check_dmesg: something found in dmesg (see /lkp/benchmarks/xfstests/results//btrfs/212.dmesg)
[ 73.249753][ T353]
[ 73.265812][ T353]
[ 73.265821][ T353]
[ 73.290165][ T1650] run fstests btrfs/213 at 2022-10-02 03:09:04
[ 73.975400][ T8186] BTRFS info (device sdb1): using crc32c (crc32c-intel) checksum algorithm
[ 73.984538][ T8186] BTRFS info (device sdb1): disk space caching is enabled
[ 74.305018][ T8250] BTRFS: device fsid 7b1643d2-a0ef-4a60-a3a1-7dfaa39dabb2 devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (8250)
[ 74.338350][ T8261] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 74.347523][ T8261] BTRFS info (device sdb2): disk space caching is enabled
[ 74.359821][ T8261] BTRFS info (device sdb2): checking UUID tree
[ 78.669924][ T8302] BTRFS info (device sdb2): balance: start -d
[ 78.676960][ T8302] BTRFS info (device sdb2): relocating block group 2177892352 flags data
[ 80.486450][ T8302] BTRFS info (device sdb2): balance: canceled
[ 80.569587][ T8311] BTRFS info (device sdb2): balance: start -m -s
[ 80.578570][ T8311] BTRFS info (device sdb2): relocating block group 30408704 flags metadata|dup
[ 80.728132][ T8311] BTRFS info (device sdb2): found 74 extents, stage: move data extents
[ 80.845157][ T8311] BTRFS info (device sdb2): relocating block group 22020096 flags system|dup
[ 81.002861][ T8311] BTRFS info (device sdb2): found 1 extents, stage: move data extents
[ 81.161533][ T8311] BTRFS info (device sdb2): balance: ended with status: 0
[ 83.031379][ T8342] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 83.040869][ T8342] BTRFS info (device sdb2): disk space caching is enabled
[ 83.140497][ T353] btrfs/213 10s
[ 83.140507][ T353]
[ 83.177038][ T1650] run fstests btrfs/214 at 2022-10-02 03:09:14
[ 83.520920][ T8539] BTRFS info (device sdb1): using crc32c (crc32c-intel) checksum algorithm
[ 83.530034][ T8539] BTRFS info (device sdb1): disk space caching is enabled
[ 83.795491][ T8591] BTRFS: device fsid 047db66e-9c5a-43b9-a0ed-a59a5a54c01a devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (8591)
[ 83.828333][ T8602] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 83.837501][ T8602] BTRFS info (device sdb2): disk space caching is enabled
[ 83.849771][ T8602] BTRFS info (device sdb2): checking UUID tree
[ 84.553095][ T8643] BTRFS: device fsid 2b3475c1-7363-46bb-b5be-3b7de42f32b5 devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (8643)
[ 84.587062][ T8657] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 84.596436][ T8657] BTRFS info (device sdb2): disk space caching is enabled
[ 84.609434][ T8657] BTRFS info (device sdb2): checking UUID tree
[ 85.519592][ T8718] BTRFS: device fsid 1489ff22-2c49-4e57-9c0b-ec0e45befbb1 devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (8718)
[ 85.552817][ T8732] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 85.562090][ T8732] BTRFS info (device sdb2): disk space caching is enabled
[ 85.574899][ T8732] BTRFS info (device sdb2): checking UUID tree
[ 86.469477][ T8795] BTRFS: device fsid e3efb665-e701-41d2-a531-ffb2423afdc9 devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (8795)
[ 86.503705][ T8809] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 86.513013][ T8809] BTRFS info (device sdb2): disk space caching is enabled
[ 86.525302][ T8809] BTRFS info (device sdb2): checking UUID tree
[ 87.419276][ T8872] BTRFS: device fsid 2e2cffbe-0236-4737-889d-f93e6a2de77b devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (8872)
[ 87.452360][ T8886] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 87.461728][ T8886] BTRFS info (device sdb2): disk space caching is enabled
[ 87.474447][ T8886] BTRFS info (device sdb2): checking UUID tree
[ 88.348072][ T353] btrfs/214 5s
[ 88.348082][ T353]
[ 88.384248][ T1650] run fstests btrfs/215 at 2022-10-02 03:09:19
[ 88.728498][ T9138] BTRFS info (device sdb1): using crc32c (crc32c-intel) checksum algorithm
[ 88.737813][ T9138] BTRFS info (device sdb1): disk space caching is enabled
[ 88.994029][ T9188] BTRFS: device fsid c723a7a3-6fd5-4421-bee8-2f5ae485be5d devid 1 transid 5 /dev/sdb2 scanned by mkfs.btrfs (9188)
[ 89.035307][ T9202] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 89.044653][ T9202] BTRFS info (device sdb2): disabling disk space caching
[ 89.057213][ T9202] BTRFS info (device sdb2): cleaning free space cache v1
[ 89.093093][ T9202] BTRFS info (device sdb2): checking UUID tree
[ 89.231074][ T9248] BTRFS info (device sdb2): using crc32c (crc32c-intel) checksum algorithm
[ 89.251683][ T59] BTRFS warning (device sdb2): csum failed root 5 ino 257 off 0 csum 0x656bd64e expected csum 0x4ef41b07 mirror 1
[ 89.265224][ T59] BTRFS error (device sdb2): bdev /dev/sdb2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 89.275939][ T59] BTRFS warning (device sdb2): csum failed root 5 ino 257 off 0 csum 0x656bd64e expected csum 0x4ef41b07 mirror 1
[ 89.289527][ T59] BTRFS error (device sdb2): bdev /dev/sdb2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[ 89.312524][ T59] BTRFS warning (device sdb2): csum failed root 5 ino 257 off 0 csum 0x656bd64e expected csum 0x4ef41b07 mirror 1
[ 89.326147][ T59] BTRFS error (device sdb2): bdev /dev/sdb2 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
[ 89.336731][ T59] BTRFS warning (device sdb2): csum failed root 5 ino 257 off 4096 csum 0x656bd64e expected csum 0x4ef41b07 mirror 1
[ 89.350710][ T59] BTRFS error (device sdb2): bdev /dev/sdb2 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
[ 89.361254][ T59] BTRFS warning (device sdb2): csum failed root 5 ino 257 off 8192 csum 0x656bd64e expected csum 0x4ef41b07 mirror 1


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (13.58 kB)
config-6.0.0-rc7-00133-g5336f1902b4b (170.84 kB)
job-script (6.02 kB)
dmesg.xz (29.23 kB)
xfstests (2.50 kB)
job.yaml (4.79 kB)
reproduce (1.02 kB)