2019-07-19 14:01:11

by Sasha Levin

[permalink] [raw]
Subject: NULL ptr deref in wq_worker_sleeping on 4.19

Hi folks,

We're seeing a rare panic on boot in wq_worker_sleeping() on boot in
4.19 kernels. I wasn't able to reproduce this with 5.2, but I'm not sure
whether it's because the issue is fixed, or I was just unlucky.

The panic looks like this:

[ 0.852791] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 0.853260] PGD 0 P4D 0
[ 0.853260] Oops: 0000 [#1] SMP PTI
[ 0.853260] CPU: 7 PID: 49 Comm: Not tainted 4.19.52-9858d02fd940 #1
[ 0.853260] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
[ 0.853260] RIP: 0010:kthread_data+0x12/0x30
[ 0.853260] Code: 83 7f 58 00 74 02 0f 0b e9 bb 2d 19 00 0f 0b eb e2 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f6 47 26 20 74 0c 48 8b 87 98 05 00 00 <48> 8b 40 10 c3 0f 0b 48 8b 87 98 05 00 00 48 8b 40 10 c3 90 66 2e
[ 0.853260] RSP: 0000:ffffc900036abe38 EFLAGS: 00010002
[ 0.853260] RAX: 0000000000000000 RBX: ffff8887bfbe17c0 RCX: 0000000000000000
[ 0.853260] RDX: 0000000000000001 RSI: 000000000000000a RDI: ffff8887bbb4bb00
[ 0.853260] RBP: ffffc900036abea0 R08: 0000000000000000 R09: 0000000000000000
[ 0.853260] R10: ffffc9000368bd90 R11: 0000000000000000 R12: ffff8887bbb4bb00
[ 0.853260] R13: 0000000000000000 R14: ffffc900036abe60 R15: 0000000000000000
[ 0.853260] FS: 0000000000000000(0000) GS:ffff8887bfbc0000(0000) knlGS:0000000000000000
[ 0.853260] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.853260] CR2: 0000000000000068 CR3: 00000007df40a000 CR4: 00000000001406e0
[ 0.853260] Call Trace:
[ 0.853260] wq_worker_sleeping+0xa/0x60
[ 0.853260] __schedule+0x571/0x8c0
[ 0.853260] schedule+0x32/0x80
[ 0.853260] worker_thread+0xc7/0x440
[ 0.853260] kthread+0xf8/0x130
[ 0.853260] ret_from_fork+0x35/0x40
[ 0.853260] Modules linked in:
[ 0.853260] CR2: 0000000000000010
[ 0.853260] ---[ end trace 160fda44361ab977 ]---

I see that this area was recently touched by 6d25be5782e4 ("sched/core,
workqueues: Distangle worker accounting from rq lock") but I'm not sure
if it's related.

--
Thanks,
Sasha


Subject: Re: NULL ptr deref in wq_worker_sleeping on 4.19

On 2019-07-19 09:53:52 [-0400], Sasha Levin wrote:
> Hi folks,
Hi,

> We're seeing a rare panic on boot in wq_worker_sleeping() on boot in
> 4.19 kernels. I wasn't able to reproduce this with 5.2, but I'm not sure
> whether it's because the issue is fixed, or I was just unlucky.
>
> The panic looks like this:
>
> [ 0.852791] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [ 0.853260] PGD 0 P4D 0 [ 0.853260] Oops: 0000 [#1] SMP PTI
> [ 0.853260] CPU: 7 PID: 49 Comm: Not tainted 4.19.52-9858d02fd940 #1
> [ 0.853260] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> [ 0.853260] RIP: 0010:kthread_data+0x12/0x30
> [ 0.853260] Code: 83 7f 58 00 74 02 0f 0b e9 bb 2d 19 00 0f 0b eb e2 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f6 47 26 20 74 0c 48 8b 87 98 05 00 00 <48> 8b 40 10 c3 0f 0b 48 8b 87 98 05 00 00 48 8b 40 10 c3 90 66 2e
> [ 0.853260] RSP: 0000:ffffc900036abe38 EFLAGS: 00010002
> [ 0.853260] RAX: 0000000000000000 RBX: ffff8887bfbe17c0 RCX: 0000000000000000
> [ 0.853260] RDX: 0000000000000001 RSI: 000000000000000a RDI: ffff8887bbb4bb00
> [ 0.853260] RBP: ffffc900036abea0 R08: 0000000000000000 R09: 0000000000000000
> [ 0.853260] R10: ffffc9000368bd90 R11: 0000000000000000 R12: ffff8887bbb4bb00
> [ 0.853260] R13: 0000000000000000 R14: ffffc900036abe60 R15: 0000000000000000
> [ 0.853260] FS: 0000000000000000(0000) GS:ffff8887bfbc0000(0000) knlGS:0000000000000000
> [ 0.853260] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.853260] CR2: 0000000000000068 CR3: 00000007df40a000 CR4: 00000000001406e0
> [ 0.853260] Call Trace:
> [ 0.853260] wq_worker_sleeping+0xa/0x60
> [ 0.853260] __schedule+0x571/0x8c0
> [ 0.853260] schedule+0x32/0x80
> [ 0.853260] worker_thread+0xc7/0x440
> [ 0.853260] kthread+0xf8/0x130
> [ 0.853260] ret_from_fork+0x35/0x40
> [ 0.853260] Modules linked in:
> [ 0.853260] CR2: 0000000000000010
> [ 0.853260] ---[ end trace 160fda44361ab977 ]---
>
> I see that this area was recently touched by 6d25be5782e4 ("sched/core,
> workqueues: Distangle worker accounting from rq lock") but I'm not sure
> if it's related.

The change should just move code outside of the scheduler and not lead
to any changed behaviour (except the small detail mentioned in the
changelog, nothing explaining what you have here).

The way the call chain looks is, kthread() allocated the struct kthread
(self) and saved it in current->set_child_tid and this pointer is not
NULL. Everything works out and `threadfn' is invoked which is
worker_thread().
The first thing it does, is to set the special kworker flag via
set_pf_worker() which enables the additional code in the scheduler. Then
it has nothing to do and invokes schedule() which then gets us to
wq_worker_sleeping(). Here it invokes wq_worker_sleeping() which is what
explodes.
Based on the register dump and code dump, RAX is NULL which is
current->set_child_tid (from the begin of ktread()). It adds 0x10 for
the ->date pointer and OOPSes while reading from 0x10.

So everything looks fine, except that `set_child_tid' seems to be zeroed
out. Also, task_struct has a few lines after `set_child_tid' the `comm'
member which seems to contain also 0x00 because the trace contains no
task name. At this point I would have expected "kworker/…".

Based on this two hints it looks like something zeroed that memory area
shortly after it was occupied by the task (aka use after free).

Sebastian

2019-08-07 12:25:50

by Austin Kim

[permalink] [raw]
Subject: Re: NULL ptr deref in wq_worker_sleeping on 4.19

I wonder what kinds of workqueue is used in case of this panic.

If system workqueue(system_wq) is used for this case, it would be a
help to replace it with high priority workqueue(system_highpri_wq). If
panic disappers with high priority workqueue(system_highpri_wq), we
would think about another scenario.

BR,
Guillermo Austin Kim

2019년 8월 7일 (수) 오후 8:48, Sebastian Andrzej Siewior <[email protected]>님이 작성:
>
> On 2019-07-19 09:53:52 [-0400], Sasha Levin wrote:
> > Hi folks,
> Hi,
>
> > We're seeing a rare panic on boot in wq_worker_sleeping() on boot in
> > 4.19 kernels. I wasn't able to reproduce this with 5.2, but I'm not sure
> > whether it's because the issue is fixed, or I was just unlucky.
> >
> > The panic looks like this:
> >
> > [ 0.852791] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> > [ 0.853260] PGD 0 P4D 0 [ 0.853260] Oops: 0000 [#1] SMP PTI
> > [ 0.853260] CPU: 7 PID: 49 Comm: Not tainted 4.19.52-9858d02fd940 #1
> > [ 0.853260] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> > [ 0.853260] RIP: 0010:kthread_data+0x12/0x30
> > [ 0.853260] Code: 83 7f 58 00 74 02 0f 0b e9 bb 2d 19 00 0f 0b eb e2 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f6 47 26 20 74 0c 48 8b 87 98 05 00 00 <48> 8b 40 10 c3 0f 0b 48 8b 87 98 05 00 00 48 8b 40 10 c3 90 66 2e
> > [ 0.853260] RSP: 0000:ffffc900036abe38 EFLAGS: 00010002
> > [ 0.853260] RAX: 0000000000000000 RBX: ffff8887bfbe17c0 RCX: 0000000000000000
> > [ 0.853260] RDX: 0000000000000001 RSI: 000000000000000a RDI: ffff8887bbb4bb00
> > [ 0.853260] RBP: ffffc900036abea0 R08: 0000000000000000 R09: 0000000000000000
> > [ 0.853260] R10: ffffc9000368bd90 R11: 0000000000000000 R12: ffff8887bbb4bb00
> > [ 0.853260] R13: 0000000000000000 R14: ffffc900036abe60 R15: 0000000000000000
> > [ 0.853260] FS: 0000000000000000(0000) GS:ffff8887bfbc0000(0000) knlGS:0000000000000000
> > [ 0.853260] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.853260] CR2: 0000000000000068 CR3: 00000007df40a000 CR4: 00000000001406e0
> > [ 0.853260] Call Trace:
> > [ 0.853260] wq_worker_sleeping+0xa/0x60
> > [ 0.853260] __schedule+0x571/0x8c0
> > [ 0.853260] schedule+0x32/0x80
> > [ 0.853260] worker_thread+0xc7/0x440
> > [ 0.853260] kthread+0xf8/0x130
> > [ 0.853260] ret_from_fork+0x35/0x40
> > [ 0.853260] Modules linked in:
> > [ 0.853260] CR2: 0000000000000010
> > [ 0.853260] ---[ end trace 160fda44361ab977 ]---
> >
> > I see that this area was recently touched by 6d25be5782e4 ("sched/core,
> > workqueues: Distangle worker accounting from rq lock") but I'm not sure
> > if it's related.
>
> The change should just move code outside of the scheduler and not lead
> to any changed behaviour (except the small detail mentioned in the
> changelog, nothing explaining what you have here).
>
> The way the call chain looks is, kthread() allocated the struct kthread
> (self) and saved it in current->set_child_tid and this pointer is not
> NULL. Everything works out and `threadfn' is invoked which is
> worker_thread().
> The first thing it does, is to set the special kworker flag via
> set_pf_worker() which enables the additional code in the scheduler. Then
> it has nothing to do and invokes schedule() which then gets us to
> wq_worker_sleeping(). Here it invokes wq_worker_sleeping() which is what
> explodes.
> Based on the register dump and code dump, RAX is NULL which is
> current->set_child_tid (from the begin of ktread()). It adds 0x10 for
> the ->date pointer and OOPSes while reading from 0x10.
>
> So everything looks fine, except that `set_child_tid' seems to be zeroed
> out. Also, task_struct has a few lines after `set_child_tid' the `comm'
> member which seems to contain also 0x00 because the trace contains no
> task name. At this point I would have expected "kworker/…".
>
> Based on this two hints it looks like something zeroed that memory area
> shortly after it was occupied by the task (aka use after free).
>
> Sebastian

2019-08-07 13:36:28

by Austin Kim

[permalink] [raw]
Subject: Re: NULL ptr deref in wq_worker_sleeping on 4.19

Is it possible for you to generate kernel patch only with **kernel
log** for 'use-after-free' case?
With kernel log, we only have limited information, so it is hard to
debug 'use-after-free' case with this signature.

For this matter, kernel should be compiled with below config.
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y

2019년 8월 7일 (수) 오후 9:56, Thomas Gleixner <[email protected]>님이 작성:
>
> On Wed, 7 Aug 2019, Austin Kim wrote:
>
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
>
> A: No.
> Q: Should I include quotations after my reply?
>
> http://daringfireball.net/2007/07/on_top
>
> > I wonder what kinds of workqueue is used in case of this panic.
> >
> > If system workqueue(system_wq) is used for this case, it would be a
> > help to replace it with high priority workqueue(system_highpri_wq). If
> > panic disappers with high priority workqueue(system_highpri_wq), we
> > would think about another scenario.
>
> How would that help? As Sebastian explained, something overwrote memory or
> it is a Use After Free. How would a high priority workqueue 'fix' that?
>
> You need to find the root cause, which is either memory corruption or a use
> after free.
>
> Thanks,
>
> tglx

2019-08-07 13:54:53

by Thomas Gleixner

[permalink] [raw]
Subject: Re: NULL ptr deref in wq_worker_sleeping on 4.19

On Wed, 7 Aug 2019, Austin Kim wrote:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

> I wonder what kinds of workqueue is used in case of this panic.
>
> If system workqueue(system_wq) is used for this case, it would be a
> help to replace it with high priority workqueue(system_highpri_wq). If
> panic disappers with high priority workqueue(system_highpri_wq), we
> would think about another scenario.

How would that help? As Sebastian explained, something overwrote memory or
it is a Use After Free. How would a high priority workqueue 'fix' that?

You need to find the root cause, which is either memory corruption or a use
after free.

Thanks,

tglx

2019-08-07 20:59:27

by Thomas Gleixner

[permalink] [raw]
Subject: Re: NULL ptr deref in wq_worker_sleeping on 4.19

On Wed, 7 Aug 2019, Austin Kim wrote:

Again:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

> Is it possible for you to generate kernel patch only with **kernel
> log** for 'use-after-free' case?

What? I have no idea how to reproduce that issue so how should I create a
kernel patch for something which can happen at some random place in several
million lines of code?

> With kernel log, we only have limited information, so it is hard to
> debug 'use-after-free' case with this signature.

My information is even more limited...

Thanks,

tglx