2022-08-19 07:25:11

by Christian Brauner

Subject: Re: data-race in cgroup_get_tree / proc_cgroup_show

On Thu, Aug 18, 2022 at 07:24:00PM -0400, Abhishek Shah wrote:
> Hi all,
>
> We found the following data race involving the cgrp_dfl_visible variable.
> We think it has security implications as the racing variable controls the
> contents used in /proc/<pid>/cgroup which has been used in prior work
> <https://www.cyberark.com/resources/threat-research-blog/the-strange-case-of-how-we-escaped-the-docker-default-container>
> in container escapes. Please let us know what you think. Thanks!

One straightforward fix might be to use
cmpxchg(&cgrp_dfl_visible, false, true) in cgroup_get_tree()
and READ_ONCE(cgrp_dfl_visible) in proc_cgroup_show(), or something like that.
I'm not sure this is actually an issue, but it might still be nice to fix it.
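
As a rough illustration of the suggestion above, here is a userspace C11
analogue (not kernel code). The names dfl_visible, mark_visible_cmpxchg()
and is_visible_atomic() are hypothetical. atomic_compare_exchange_strong()
stands in for the kernel's cmpxchg() and a relaxed atomic load stands in
for READ_ONCE(). Treat it as a sketch under those assumptions, not as a
proposed patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's cgrp_dfl_visible flag. */
static atomic_bool dfl_visible = false;

/*
 * Writer side, loosely analogous to the mount path: try to flip the flag
 * from false to true. Returns true only for the call that performed the
 * transition; later calls find the flag already set and return false.
 */
static bool mark_visible_cmpxchg(void)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&dfl_visible, &expected, true);
}

/*
 * Reader side, loosely analogous to the /proc show path: a single marked
 * load, so the compiler cannot tear or repeat the access.
 */
static bool is_visible_atomic(void)
{
	return atomic_load_explicit(&dfl_visible, memory_order_relaxed);
}
```

The compare-and-exchange only matters if the caller needs to know who set
the flag first. If nobody consumes that return value, a plain marked store
gives the same visible behaviour.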

>
> *-----------------------------Report--------------------------------------*
> *write* to 0xffffffff881d0344 of 1 bytes by task 6542 on cpu 0:
> cgroup_get_tree+0x30/0x1c0 kernel/cgroup/cgroup.c:2153
> vfs_get_tree+0x53/0x1b0 fs/super.c:1497
> do_new_mount+0x208/0x6a0 fs/namespace.c:3040
> path_mount+0x4a0/0xbd0 fs/namespace.c:3370
> do_mount fs/namespace.c:3383 [inline]
> __do_sys_mount fs/namespace.c:3591 [inline]
> __se_sys_mount+0x215/0x2d0 fs/namespace.c:3568
> __x64_sys_mount+0x67/0x80 fs/namespace.c:3568
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> *read* to 0xffffffff881d0344 of 1 bytes by task 6541 on cpu 1:
> proc_cgroup_show+0x1ec/0x4e0 kernel/cgroup/cgroup.c:6017
> proc_single_show+0x96/0x120 fs/proc/base.c:777
> seq_read_iter+0x2d2/0x8e0 fs/seq_file.c:230
> seq_read+0x1c9/0x210 fs/seq_file.c:162
> vfs_read+0x1b5/0x6e0 fs/read_write.c:480
> ksys_read+0xde/0x190 fs/read_write.c:620
> __do_sys_read fs/read_write.c:630 [inline]
> __se_sys_read fs/read_write.c:628 [inline]
> __x64_sys_read+0x43/0x50 fs/read_write.c:628
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 6541 Comm: syz-executor2-n Not tainted 5.18.0-rc5+ #107
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1
> 04/01/2014
>
>
> *Reproducing Inputs*
> Input CPU 0:
> r0 = fsopen(&(0x7f0000000000)='cgroup2\x00', 0x0)
> fsconfig$FSCONFIG_CMD_CREATE(r0, 0x6, 0x0, 0x0, 0x0)
> fsmount(r0, 0x0, 0x83)
>
> Input CPU 1:
> r0 = syz_open_procfs(0x0, &(0x7f0000000040)='cgroup\x00')
> read$eventfd(r0, &(0x7f0000000080), 0x8)


2022-08-22 17:30:12

by Gabriel Ryan

Subject: Re: data-race in cgroup_get_tree / proc_cgroup_show

Hi Christian,

We ran a quick test and confirm your suggestion would eliminate the
data race alert we observed. If the data race is benign (and it
appears to be), using WRITE_ONCE(cgrp_dfl_visible, true) instead of
cmpxchg in cgroup_get_tree() would probably also be ok.
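
To make the WRITE_ONCE() alternative concrete, here is a minimal userspace
sketch. The two macros below are simplified approximations written for
this example (they rely on the GCC/Clang __typeof__ extension), not the
kernel's actual definitions, and dfl_visible, mark_visible_once() and
is_visible_once() are hypothetical names. The point it tries to show: the
flag only ever goes from false to true, so repeating the store is
idempotent and no compare-and-exchange is required.

```c
#include <stdbool.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: GCC or Clang).
 * The volatile access forces exactly one load or store at each use.
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical userspace stand-in for cgrp_dfl_visible. */
static bool dfl_visible;

/* Writer: unconditionally set the flag; safe to call any number of times. */
static void mark_visible_once(void)
{
	WRITE_ONCE(dfl_visible, true);
}

/* Reader: sees either the old value (false) or the new value (true). */
static bool is_visible_once(void)
{
	return READ_ONCE(dfl_visible);
}
```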

Best,

Gabe

On Fri, Aug 19, 2022 at 3:23 AM Christian Brauner <[email protected]> wrote:
>
> On Thu, Aug 18, 2022 at 07:24:00PM -0400, Abhishek Shah wrote:
> > Hi all,
> >
> > We found the following data race involving the cgrp_dfl_visible variable.
> > We think it has security implications as the racing variable controls the
> > contents used in /proc/<pid>/cgroup which has been used in prior work
> > <https://www.cyberark.com/resources/threat-research-blog/the-strange-case-of-how-we-escaped-the-docker-default-container>
> > in container escapes. Please let us know what you think. Thanks!
>
> One straightforward fix might be to use
> cmpxchg(&cgrp_dfl_visible, false, true) in cgroup_get_tree()
> and READ_ONCE(cgrp_dfl_visible) in proc_cgroup_show(), or something like that.
> I'm not sure this is actually an issue, but it might still be nice to fix it.
>
> >
> > *-----------------------------Report--------------------------------------*
> > *write* to 0xffffffff881d0344 of 1 bytes by task 6542 on cpu 0:
> > cgroup_get_tree+0x30/0x1c0 kernel/cgroup/cgroup.c:2153
> > vfs_get_tree+0x53/0x1b0 fs/super.c:1497
> > do_new_mount+0x208/0x6a0 fs/namespace.c:3040
> > path_mount+0x4a0/0xbd0 fs/namespace.c:3370
> > do_mount fs/namespace.c:3383 [inline]
> > __do_sys_mount fs/namespace.c:3591 [inline]
> > __se_sys_mount+0x215/0x2d0 fs/namespace.c:3568
> > __x64_sys_mount+0x67/0x80 fs/namespace.c:3568
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > *read* to 0xffffffff881d0344 of 1 bytes by task 6541 on cpu 1:
> > proc_cgroup_show+0x1ec/0x4e0 kernel/cgroup/cgroup.c:6017
> > proc_single_show+0x96/0x120 fs/proc/base.c:777
> > seq_read_iter+0x2d2/0x8e0 fs/seq_file.c:230
> > seq_read+0x1c9/0x210 fs/seq_file.c:162
> > vfs_read+0x1b5/0x6e0 fs/read_write.c:480
> > ksys_read+0xde/0x190 fs/read_write.c:620
> > __do_sys_read fs/read_write.c:630 [inline]
> > __se_sys_read fs/read_write.c:628 [inline]
> > __x64_sys_read+0x43/0x50 fs/read_write.c:628
> > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 PID: 6541 Comm: syz-executor2-n Not tainted 5.18.0-rc5+ #107
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1
> > 04/01/2014
> >
> >
> > *Reproducing Inputs*
> > Input CPU 0:
> > r0 = fsopen(&(0x7f0000000000)='cgroup2\x00', 0x0)
> > fsconfig$FSCONFIG_CMD_CREATE(r0, 0x6, 0x0, 0x0, 0x0)
> > fsmount(r0, 0x0, 0x83)
> >
> > Input CPU 1:
> > r0 = syz_open_procfs(0x0, &(0x7f0000000040)='cgroup\x00')
> > read$eventfd(r0, &(0x7f0000000080), 0x8)

--
Gabriel Ryan
PhD Candidate at Columbia University

2022-08-28 18:42:21

by Tejun Heo

Subject: Re: data-race in cgroup_get_tree / proc_cgroup_show

On Mon, Aug 22, 2022 at 01:04:58PM -0400, Gabriel Ryan wrote:
> Hi Christian,
>
> We ran a quick test and confirm your suggestion would eliminate the
> data race alert we observed. If the data race is benign (and it
> appears to be), using WRITE_ONCE(cgrp_dfl_visible, true) instead of
> cmpxchg in cgroup_get_tree() would probably also be ok.

I don't see how the data race can lead to anything, but would the following
work?

Thanks.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index ffaccd6373f1e..a90fdba881bdb 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2172,7 +2172,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	int ret;

-	cgrp_dfl_visible = true;
+	WRITE_ONCE(cgrp_dfl_visible, true);
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;

@@ -6056,7 +6056,7 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 		struct cgroup *cgrp;
 		int ssid, count = 0;

-		if (root == &cgrp_dfl_root && !cgrp_dfl_visible)
+		if (root == &cgrp_dfl_root && !READ_ONCE(cgrp_dfl_visible))
 			continue;

 		seq_printf(m, "%d:", root->hierarchy_id);


--
tejun

2022-08-29 07:41:56

by Christian Brauner

Subject: Re: data-race in cgroup_get_tree / proc_cgroup_show

On Sun, Aug 28, 2022 at 08:22:02AM -1000, Tejun Heo wrote:
> On Mon, Aug 22, 2022 at 01:04:58PM -0400, Gabriel Ryan wrote:
> > Hi Christian,
> >
> > We ran a quick test and confirm your suggestion would eliminate the
> > data race alert we observed. If the data race is benign (and it
> > appears to be), using WRITE_ONCE(cgrp_dfl_visible, true) instead of
> > cmpxchg in cgroup_get_tree() would probably also be ok.
>
> I don't see how the data race can lead to anything but would the following
> work?

Yep. You can take my
Reviewed-by: Christian Brauner (Microsoft) <[email protected]>
when you turn it into a patch.

2022-09-04 19:42:17

by Tejun Heo

Subject: [PATCH cgroup/for-6.1] cgroup: Remove data-race around cgrp_dfl_visible

From dc79ec1b232ad2c165d381d3dd2626df4ef9b5a4 Mon Sep 17 00:00:00 2001
From: Tejun Heo <[email protected]>
Date: Sun, 4 Sep 2022 09:16:19 -1000

There's a seemingly harmless data-race around cgrp_dfl_visible detected by
kernel concurrency sanitizer. Let's remove it by throwing WRITE/READ_ONCE at
it.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Abhishek Shah <[email protected]>
Cc: Gabriel Ryan <[email protected]>
Reviewed-by: Christian Brauner (Microsoft) <[email protected]>
Link: https://lore.kernel.org/netdev/20220819072256.fn7ctciefy4fc4cu@wittgenstein/
---
Applied to cgroup/for-6.1.

Thanks.

kernel/cgroup/cgroup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0005de2e2ed9..e0b72eb5d283 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2173,7 +2173,7 @@ static int cgroup_get_tree(struct fs_context *fc)
 	struct cgroup_fs_context *ctx = cgroup_fc2context(fc);
 	int ret;

-	cgrp_dfl_visible = true;
+	WRITE_ONCE(cgrp_dfl_visible, true);
 	cgroup_get_live(&cgrp_dfl_root.cgrp);
 	ctx->root = &cgrp_dfl_root;

@@ -6098,7 +6098,7 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 		struct cgroup *cgrp;
 		int ssid, count = 0;

-		if (root == &cgrp_dfl_root && !cgrp_dfl_visible)
+		if (root == &cgrp_dfl_root && !READ_ONCE(cgrp_dfl_visible))
 			continue;

 		seq_printf(m, "%d:", root->hierarchy_id);
--
2.37.3