2023-07-13 07:31:20

by Xuewen Yan

[permalink] [raw]
Subject: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

There is no the judgment of whether namspace is NULL in find_pid_ns.
But there is a corner case when ns is null, for example: if user
call find_get_pid when current is in exiting, the following stack would
set thread_id be null:
release_task
__exit_signal(p);
__unhash_process(tsk, group_dead);
detach_pid(p, PIDTYPE_PID);
__change_pid(task, type, NULL);

If user call find_get_pid at now, in find_vpid function, the
task_active_pid_ns would return NULL. As a result, it would be
error when access the ns in find_pid_ns.

So add the judgment of whether ns is NULL in the find_pid_ns to
prevent this case happen.

Signed-off-by: Xuewen Yan <[email protected]>
---
kernel/pid.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/kernel/pid.c b/kernel/pid.c
index 6a1d23a11026..d4a9cb6f3eb9 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -308,6 +308,9 @@ void disable_pid_allocation(struct pid_namespace *ns)

struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
{
+ if (!ns)
+ return NULL;
+
return idr_find(&ns->idr, nr);
}
EXPORT_SYMBOL_GPL(find_pid_ns);
--
2.25.1



2023-07-19 12:40:50

by Xuewen Yan

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

Dear all

Is there any comment about this patch?

Thanks!

On Thu, Jul 13, 2023 at 3:58 PM Xuewen Yan <[email protected]> wrote:
>
> There is no the judgment of whether namspace is NULL in find_pid_ns.
> But there is a corner case when ns is null, for example: if user
> call find_get_pid when current is in exiting, the following stack would
> set thread_id be null:
> release_task
> __exit_signal(p);
> __unhash_process(tsk, group_dead);
> detach_pid(p, PIDTYPE_PID);
> __change_pid(task, type, NULL);
>
> If user call find_get_pid at now, in find_vpid function, the
> task_active_pid_ns would return NULL. As a result, it would be
> error when access the ns in find_pid_ns.
>
> So add the judgment of whether ns is NULL in the find_pid_ns to
> prevent this case happen.
>
> Signed-off-by: Xuewen Yan <[email protected]>
> ---
> kernel/pid.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 6a1d23a11026..d4a9cb6f3eb9 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -308,6 +308,9 @@ void disable_pid_allocation(struct pid_namespace *ns)
>
> struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
> {
> + if (!ns)
> + return NULL;
> +
> return idr_find(&ns->idr, nr);
> }
> EXPORT_SYMBOL_GPL(find_pid_ns);
> --
> 2.25.1
>

2023-07-25 08:48:00

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> There is no the judgment of whether namspace is NULL in find_pid_ns.
> But there is a corner case when ns is null, for example: if user
> call find_get_pid when current is in exiting, the following stack would
> set thread_id be null:
> release_task
> __exit_signal(p);
> __unhash_process(tsk, group_dead);
> detach_pid(p, PIDTYPE_PID);
> __change_pid(task, type, NULL);
>
> If user call find_get_pid at now, in find_vpid function, the

I fail to see how this can happen. The code you're referencing is in
release_task(). If current has gone through that then current obviously
can't call find_vpid() on itself anymore or anything else for that
matter.

2023-07-25 12:30:46

by Xuewen Yan

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <[email protected]> wrote:
>
> On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > But there is a corner case when ns is null, for example: if user
> > call find_get_pid when current is in exiting, the following stack would
> > set thread_id be null:
> > release_task
> > __exit_signal(p);
> > __unhash_process(tsk, group_dead);
> > detach_pid(p, PIDTYPE_PID);
> > __change_pid(task, type, NULL);
> >
> > If user call find_get_pid at now, in find_vpid function, the
>
> I fail to see how this can happen. The code you're referencing is in
> release_task(). If current has gone through that then current obviously
> can't call find_vpid() on itself anymore or anything else for that
> matter.

This happened when user calls find_vpid() in irq.

[72117.635162] Call trace:
[72117.635595] idr_find+0xc/0x24
[72117.636103] find_get_pid+0x40/0x68
[72117.636812] send_event+0x88/0x180 [demux]
[72117.637593] vbvop_copy_data+0x150/0x344 [demux]
[72117.638434] dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
[72117.639393] dmisr_video_parsing_switch+0x68/0xec [demux]
[72117.640332] dmisr_handle_video_pes+0x10c/0x26c [demux]
[72117.641108] tasklet_action_common+0x130/0x224
[72117.641784] tasklet_action+0x28/0x34
[72117.642366] __do_softirq+0x128/0x4dc
[72117.642944] irq_exit+0xf8/0xfc
[72117.643459] __handle_domain_irq+0xb0/0x108
[72117.644102] gic_handle_irq+0x6c/0x124
[72117.644691] el1_irq+0x108/0x200
[72117.645217] _raw_write_unlock_irq+0x2c/0x5c
[72117.645870] release_task+0x144/0x1ac <<<<<<
[72117.646447] do_exit+0x524/0x94c
[72117.646970] __do_sys_exit_group+0x0/0x14
[72117.647591] do_group_exit+0x0/0xa0
[72117.648146] __se_sys_exit+0x0/0x20
[72117.648704] el0_svc_common+0xcc/0x1bc
[72117.649292] el0_svc_handler+0x2c/0x3c
[72117.649881] el0_svc+0x8/0xc

In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
this time, if user calls find_get_pid() in irq, because
current->thread_id is NULL,
it will handle the NULL pointer.

BR
---
xuewen

2023-07-25 13:10:56

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <[email protected]> wrote:
> >
> > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > But there is a corner case when ns is null, for example: if user
> > > call find_get_pid when current is in exiting, the following stack would
> > > set thread_id be null:
> > > release_task
> > > __exit_signal(p);
> > > __unhash_process(tsk, group_dead);
> > > detach_pid(p, PIDTYPE_PID);
> > > __change_pid(task, type, NULL);
> > >
> > > If user call find_get_pid at now, in find_vpid function, the
> >
> > I fail to see how this can happen. The code you're referencing is in
> > release_task(). If current has gone through that then current obviously
> > can't call find_vpid() on itself anymore or anything else for that
> > matter.
>
> This happened when user calls find_vpid() in irq.
>
> [72117.635162] Call trace:
> [72117.635595] idr_find+0xc/0x24
> [72117.636103] find_get_pid+0x40/0x68
> [72117.636812] send_event+0x88/0x180 [demux]
> [72117.637593] vbvop_copy_data+0x150/0x344 [demux]
> [72117.638434] dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> [72117.639393] dmisr_video_parsing_switch+0x68/0xec [demux]
> [72117.640332] dmisr_handle_video_pes+0x10c/0x26c [demux]
> [72117.641108] tasklet_action_common+0x130/0x224
> [72117.641784] tasklet_action+0x28/0x34
> [72117.642366] __do_softirq+0x128/0x4dc
> [72117.642944] irq_exit+0xf8/0xfc
> [72117.643459] __handle_domain_irq+0xb0/0x108
> [72117.644102] gic_handle_irq+0x6c/0x124
> [72117.644691] el1_irq+0x108/0x200
> [72117.645217] _raw_write_unlock_irq+0x2c/0x5c
> [72117.645870] release_task+0x144/0x1ac <<<<<<
> [72117.646447] do_exit+0x524/0x94c
> [72117.646970] __do_sys_exit_group+0x0/0x14
> [72117.647591] do_group_exit+0x0/0xa0
> [72117.648146] __se_sys_exit+0x0/0x20
> [72117.648704] el0_svc_common+0xcc/0x1bc
> [72117.649292] el0_svc_handler+0x2c/0x3c
> [72117.649881] el0_svc+0x8/0xc
>
> In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> this time, if user calls find_get_pid() in irq, because
> current->thread_id is NULL,
> it will handle the NULL pointer.

Uhm, where is that code from? This doesn't seem to be upstream.

2023-07-26 01:57:06

by Xuewen Yan

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <[email protected]> wrote:
>
> On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <[email protected]> wrote:
> > >
> > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > But there is a corner case when ns is null, for example: if user
> > > > call find_get_pid when current is in exiting, the following stack would
> > > > set thread_id be null:
> > > > release_task
> > > > __exit_signal(p);
> > > > __unhash_process(tsk, group_dead);
> > > > detach_pid(p, PIDTYPE_PID);
> > > > __change_pid(task, type, NULL);
> > > >
> > > > If user call find_get_pid at now, in find_vpid function, the
> > >
> > > I fail to see how this can happen. The code you're referencing is in
> > > release_task(). If current has gone through that then current obviously
> > > can't call find_vpid() on itself anymore or anything else for that
> > > matter.
> >
> > This happened when user calls find_vpid() in irq.
> >
> > [72117.635162] Call trace:
> > [72117.635595] idr_find+0xc/0x24
> > [72117.636103] find_get_pid+0x40/0x68
> > [72117.636812] send_event+0x88/0x180 [demux]
> > [72117.637593] vbvop_copy_data+0x150/0x344 [demux]
> > [72117.638434] dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > [72117.639393] dmisr_video_parsing_switch+0x68/0xec [demux]
> > [72117.640332] dmisr_handle_video_pes+0x10c/0x26c [demux]
> > [72117.641108] tasklet_action_common+0x130/0x224
> > [72117.641784] tasklet_action+0x28/0x34
> > [72117.642366] __do_softirq+0x128/0x4dc
> > [72117.642944] irq_exit+0xf8/0xfc
> > [72117.643459] __handle_domain_irq+0xb0/0x108
> > [72117.644102] gic_handle_irq+0x6c/0x124
> > [72117.644691] el1_irq+0x108/0x200
> > [72117.645217] _raw_write_unlock_irq+0x2c/0x5c
> > [72117.645870] release_task+0x144/0x1ac <<<<<<
> > [72117.646447] do_exit+0x524/0x94c
> > [72117.646970] __do_sys_exit_group+0x0/0x14
> > [72117.647591] do_group_exit+0x0/0xa0
> > [72117.648146] __se_sys_exit+0x0/0x20
> > [72117.648704] el0_svc_common+0xcc/0x1bc
> > [72117.649292] el0_svc_handler+0x2c/0x3c
> > [72117.649881] el0_svc+0x8/0xc
> >
> > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > this time, if user calls find_get_pid() in irq, because
> > current->thread_id is NULL,
> > it will handle the NULL pointer.
>
> Uhm, where is that code from? This doesn't seem to be upstream.

It's from our own platform, we found someone called find_get_pid() in
the module, and caused the crash.

BR

2023-07-26 10:02:52

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

On Wed, Jul 26, 2023 at 09:23:13AM +0800, Xuewen Yan wrote:
> On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <[email protected]> wrote:
> >
> > On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <[email protected]> wrote:
> > > >
> > > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > > But there is a corner case when ns is null, for example: if user
> > > > > call find_get_pid when current is in exiting, the following stack would
> > > > > set thread_id be null:
> > > > > release_task
> > > > > __exit_signal(p);
> > > > > __unhash_process(tsk, group_dead);
> > > > > detach_pid(p, PIDTYPE_PID);
> > > > > __change_pid(task, type, NULL);
> > > > >
> > > > > If user call find_get_pid at now, in find_vpid function, the
> > > >
> > > > I fail to see how this can happen. The code you're referencing is in
> > > > release_task(). If current has gone through that then current obviously
> > > > can't call find_vpid() on itself anymore or anything else for that
> > > > matter.
> > >
> > > This happened when user calls find_vpid() in irq.
> > >
> > > [72117.635162] Call trace:
> > > [72117.635595] idr_find+0xc/0x24
> > > [72117.636103] find_get_pid+0x40/0x68
> > > [72117.636812] send_event+0x88/0x180 [demux]
> > > [72117.637593] vbvop_copy_data+0x150/0x344 [demux]
> > > [72117.638434] dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > > [72117.639393] dmisr_video_parsing_switch+0x68/0xec [demux]
> > > [72117.640332] dmisr_handle_video_pes+0x10c/0x26c [demux]
> > > [72117.641108] tasklet_action_common+0x130/0x224
> > > [72117.641784] tasklet_action+0x28/0x34
> > > [72117.642366] __do_softirq+0x128/0x4dc
> > > [72117.642944] irq_exit+0xf8/0xfc
> > > [72117.643459] __handle_domain_irq+0xb0/0x108
> > > [72117.644102] gic_handle_irq+0x6c/0x124
> > > [72117.644691] el1_irq+0x108/0x200
> > > [72117.645217] _raw_write_unlock_irq+0x2c/0x5c
> > > [72117.645870] release_task+0x144/0x1ac <<<<<<
> > > [72117.646447] do_exit+0x524/0x94c
> > > [72117.646970] __do_sys_exit_group+0x0/0x14
> > > [72117.647591] do_group_exit+0x0/0xa0
> > > [72117.648146] __se_sys_exit+0x0/0x20
> > > [72117.648704] el0_svc_common+0xcc/0x1bc
> > > [72117.649292] el0_svc_handler+0x2c/0x3c
> > > [72117.649881] el0_svc+0x8/0xc
> > >
> > > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > > this time, if user calls find_get_pid() in irq, because
> > > current->thread_id is NULL,
> > > it will handle the NULL pointer.
> >
> > Uhm, where is that code from? This doesn't seem to be upstream.
>
> It's from our own platform, we found someone called find_get_pid() in
> the module, and caused the crash.

So this is a bug report for an out of tree driver which I'm sure you're
aware we consider mostly irrelevant unless this is an upstream issue.

Please work around or better fix this in your out of tree driver or
please show a reproducer how this can happen on upstream kernels.

Otherwise I don't see why we'd care.

2023-07-27 01:58:35

by Xuewen Yan

[permalink] [raw]
Subject: Re: [PATCH] pid: Add the judgment of whether ns is NULL in the find_pid_ns

On Wed, Jul 26, 2023 at 5:25 PM Christian Brauner <[email protected]> wrote:
>
> On Wed, Jul 26, 2023 at 09:23:13AM +0800, Xuewen Yan wrote:
> > On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <[email protected]> wrote:
> > >
> > > On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > > > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <[email protected]> wrote:
> > > > >
> > > > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > > > But there is a corner case when ns is null, for example: if user
> > > > > > call find_get_pid when current is in exiting, the following stack would
> > > > > > set thread_id be null:
> > > > > > release_task
> > > > > > __exit_signal(p);
> > > > > > __unhash_process(tsk, group_dead);
> > > > > > detach_pid(p, PIDTYPE_PID);
> > > > > > __change_pid(task, type, NULL);
> > > > > >
> > > > > > If user call find_get_pid at now, in find_vpid function, the
> > > > >
> > > > > I fail to see how this can happen. The code you're referencing is in
> > > > > release_task(). If current has gone through that then current obviously
> > > > > can't call find_vpid() on itself anymore or anything else for that
> > > > > matter.
> > > >
> > > > This happened when user calls find_vpid() in irq.
> > > >
> > > > [72117.635162] Call trace:
> > > > [72117.635595] idr_find+0xc/0x24
> > > > [72117.636103] find_get_pid+0x40/0x68
> > > > [72117.636812] send_event+0x88/0x180 [demux]
> > > > [72117.637593] vbvop_copy_data+0x150/0x344 [demux]
> > > > [72117.638434] dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > > > [72117.639393] dmisr_video_parsing_switch+0x68/0xec [demux]
> > > > [72117.640332] dmisr_handle_video_pes+0x10c/0x26c [demux]
> > > > [72117.641108] tasklet_action_common+0x130/0x224
> > > > [72117.641784] tasklet_action+0x28/0x34
> > > > [72117.642366] __do_softirq+0x128/0x4dc
> > > > [72117.642944] irq_exit+0xf8/0xfc
> > > > [72117.643459] __handle_domain_irq+0xb0/0x108
> > > > [72117.644102] gic_handle_irq+0x6c/0x124
> > > > [72117.644691] el1_irq+0x108/0x200
> > > > [72117.645217] _raw_write_unlock_irq+0x2c/0x5c
> > > > [72117.645870] release_task+0x144/0x1ac <<<<<<
> > > > [72117.646447] do_exit+0x524/0x94c
> > > > [72117.646970] __do_sys_exit_group+0x0/0x14
> > > > [72117.647591] do_group_exit+0x0/0xa0
> > > > [72117.648146] __se_sys_exit+0x0/0x20
> > > > [72117.648704] el0_svc_common+0xcc/0x1bc
> > > > [72117.649292] el0_svc_handler+0x2c/0x3c
> > > > [72117.649881] el0_svc+0x8/0xc
> > > >
> > > > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > > > this time, if user calls find_get_pid() in irq, because
> > > > current->thread_id is NULL,
> > > > it will handle the NULL pointer.
> > >
> > > Uhm, where is that code from? This doesn't seem to be upstream.
> >
> > It's from our own platform, we found someone called find_get_pid() in
> > the module, and caused the crash.
>
> So this is a bug report for an out of tree driver which I'm sure you're
> aware we consider mostly irrelevant unless this is an upstream issue.
>
> Please work around or better fix this in your out of tree driver or
> please show a reproducer how this can happen on upstream kernels.
>
> Otherwise I don't see why we'd care.

Okay, Thanks a lot!

BR