2023-01-16 09:07:11

by Pengfei Xu

[permalink] [raw]
Subject: [Syzkaller & bisect] There is "register_kprobe" WARNING in v6.2-rc3 mainline kernel

Hi Masami Hiramatsu,

There is "register_kprobe" WARNING in v6.2-rc3 in guest.
[ 120.736243] ------------[ cut here ]------------
[ 120.736691] WARNING: CPU: 0 PID: 366 at kernel/kprobes.c:589 kprobe_optimizer+0x26b/0x400
[ 120.737161] Modules linked in:
[ 120.737339] CPU: 0 PID: 366 Comm: kworker/0:7 Not tainted 6.2.0-rc3-b7bfaa761d76 #1
[ 120.737831] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 120.738416] Workqueue: events kprobe_optimizer
[ 120.738683] RIP: 0010:kprobe_optimizer+0x26b/0x400
[ 120.739026] Code: c3 ff ff 4c 8b 25 25 28 b5 02 4d 8b 2c 24 49 8d 5c 24 80 4d 8d 75 80 49 81 fc f0 7e e1 83 75 2f e9 a9 00 00 00 e8 75 c3 ff ff <0f> 0b e8 6e c
[ 120.740035] RSP: 0018:ffffc90000ebfe10 EFLAGS: 00010246
[ 120.740394] RAX: 0000000000000000 RBX: ffff88800d094180 RCX: ffffffff812c5752
[ 120.740763] RDX: 0000000000000000 RSI: ffff88800eed8000 RDI: 0000000000000002
[ 120.741129] RBP: ffffc90000ebfe38 R08: 0000000000000001 R09: 0000000000000001
[ 120.741552] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 120.741953] R13: ffffffff83e17ef0 R14: ffffffff83e17e70 R15: ffff88807dc3a400
[ 120.742319] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 120.742778] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 120.743125] CR2: 00007ffff951d3f8 CR3: 00000000131bc006 CR4: 0000000000770ef0
[ 120.743503] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 120.743868] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 120.744339] PKRU: 55555554
[ 120.744495] Call Trace:
[ 120.744633] <TASK>
[ 120.744764] process_one_work+0x3b1/0x960
[ 120.745002] worker_thread+0x52/0x660
[ 120.745228] ? __pfx_worker_thread+0x10/0x10
[ 120.745536] kthread+0x161/0x1a0
[ 120.745748] ? __pfx_kthread+0x10/0x10
[ 120.745968] ret_from_fork+0x29/0x50
[ 120.746192] </TASK>
[ 120.746321] irq event stamp: 71069
[ 120.746523] hardirqs last enabled at (71077): [<ffffffff811d4891>] __up_console_sem+0x91/0xb0
[ 120.747078] hardirqs last disabled at (71084): [<ffffffff811d4876>] __up_console_sem+0x76/0xb0
[ 120.747529] softirqs last enabled at (70986): [<ffffffff82f9d213>] __do_softirq+0x323/0x48a
[ 120.748051] softirqs last disabled at (71101): [<ffffffff81123142>] irq_exit_rcu+0xd2/0x100
[ 120.748562] ---[ end trace 0000000000000000 ]---
[ 149.825325] ------------[ cut here ]------------
[ 149.825894] WARNING: CPU: 0 PID: 2872 at kernel/kprobes.c:777 register_kprobe+0x9a2/0x9b0
[ 149.826370] Modules linked in:
[ 149.826548] CPU: 0 PID: 2872 Comm: repro Tainted: G W 6.2.0-rc3-b7bfaa761d76 #1
[ 149.827010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 149.827602] RIP: 0010:register_kprobe+0x9a2/0x9b0
[ 149.827874] Code: 4d 89 74 24 08 e8 de 91 ff ff 45 89 ec 48 b8 22 01 00 00 00 00 ad de 48 89 43 08 e8 48 e4 f3 ff e9 d2 fd ff ff e8 be 91 ff ff <0f> 0b e9 4e 0
[ 149.828872] RSP: 0018:ffffc90001aafc40 EFLAGS: 00010246
[ 149.829162] RAX: 0000000000000000 RBX: ffff88800cefeb18 RCX: ffffffff812c86ba
[ 149.829541] RDX: 0000000000000000 RSI: ffff88801324a340 RDI: 0000000000000002
[ 149.829924] RBP: ffffc90001aafc88 R08: 0000000000000001 R09: 0000000000000001
[ 149.830311] R10: ffffc90001aafc30 R11: 0000000000000001 R12: 0000000000000000
[ 149.830697] R13: ffff88800d0949c0 R14: ffff88800cefeb28 R15: 0000000000000002
[ 149.831089] FS: 00007fdc0d3c9740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
[ 149.831525] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 149.831844] CR2: 00007fdc0d5841d3 CR3: 000000000d70c006 CR4: 0000000000770ef0
[ 149.832260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 149.832637] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 149.833021] PKRU: 55555554
[ 149.833179] Call Trace:
[ 149.833323] <TASK>
[ 149.833458] ? knl_uncore_pci_init+0x9/0x90
[ 149.833712] ? knl_uncore_pci_init+0x9/0x90
[ 149.833959] __register_trace_kprobe+0x183/0x190
[ 149.834241] create_local_trace_kprobe+0x107/0x190
[ 149.834519] perf_kprobe_init+0xa7/0x160
[ 149.834753] perf_kprobe_event_init+0x98/0x120
[ 149.835024] perf_try_init_event+0x88/0x280
[ 149.835274] perf_event_alloc+0xf0a/0x1c50
[ 149.835514] ? __do_sys_perf_event_open+0x9ca/0x1870
[ 149.835805] ? __this_cpu_preempt_check+0x20/0x30
[ 149.836102] __do_sys_perf_event_open+0x3b6/0x1870
[ 149.836388] ? knl_uncore_pci_init+0x9/0x90
[ 149.836636] __x64_sys_perf_event_open+0x2f/0x40
[ 149.836905] do_syscall_64+0x3b/0x90
[ 149.837115] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 149.837417] RIP: 0033:0x7fdc0d4ee59d
[ 149.837630] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 8
[ 149.838602] RSP: 002b:00007ffff951d3f8 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
[ 149.839036] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fdc0d4ee59d
[ 149.839420] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000200002c0
[ 149.839800] RBP: 00007ffff951d410 R08: 0000000000000000 R09: 00007ffff951d410
[ 149.840203] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000004011a0
[ 149.840593] R13: 00007ffff951d530 R14: 0000000000000000 R15: 0000000000000000
[ 149.840984] </TASK>
[ 149.841120] irq event stamp: 1631
[ 149.841303] hardirqs last enabled at (1639): [<ffffffff811d4891>] __up_console_sem+0x91/0xb0
[ 149.841825] hardirqs last disabled at (1646): [<ffffffff811d4876>] __up_console_sem+0x76/0xb0
[ 149.842279] softirqs last enabled at (1304): [<ffffffff82f9d213>] __do_softirq+0x323/0x48a
[ 149.842723] softirqs last disabled at (1295): [<ffffffff81123142>] irq_exit_rcu+0xd2/0x100
[ 149.843179] ---[ end trace 0000000000000000 ]---

Bisect and found that bad commit is:
63dc6325ff41ee9e570bde705ac34a39c5dbeb44
x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK

After reverted the above commit on top of v6.2-rc3 kernel, this issue was gone.

The reproduced code, kconfig, bisect info, and v6.2-rc3 reproduced log are in
attached email.

And all the detailed logs are in link:
https://github.com/xupengfe/syzkaller_logs/tree/main/230113_100005_register_kprobe

I hope it's helpful.

Thanks!
BR.


Attachments:
(No filename) (6.38 kB)
repro.c (10.75 kB)
bisect_info.log (5.50 kB)
kconfig (283.80 kB)
b7bfaa761d760e72a969d116517eaa12e404c262_dmesg.log (46.34 kB)
Download all attachments

2023-01-20 00:33:12

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [Syzkaller & bisect] There is "register_kprobe" WARNING in v6.2-rc3 mainline kernel

On Mon, 16 Jan 2023 16:57:24 +0800
Pengfei Xu <[email protected]> wrote:

> Hi Masami Hiramatsu,
>
> There is "register_kprobe" WARNING in v6.2-rc3 in guest.
> [ 120.736243] ------------[ cut here ]------------
> [ 120.736691] WARNING: CPU: 0 PID: 366 at kernel/kprobes.c:589 kprobe_optimizer+0x26b/0x400
> [ 120.737161] Modules linked in:
> [ 120.737339] CPU: 0 PID: 366 Comm: kworker/0:7 Not tainted 6.2.0-rc3-b7bfaa761d76 #1
> [ 120.737831] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [ 120.738416] Workqueue: events kprobe_optimizer
> [ 120.738683] RIP: 0010:kprobe_optimizer+0x26b/0x400
> [ 120.739026] Code: c3 ff ff 4c 8b 25 25 28 b5 02 4d 8b 2c 24 49 8d 5c 24 80 4d 8d 75 80 49 81 fc f0 7e e1 83 75 2f e9 a9 00 00 00 e8 75 c3 ff ff <0f> 0b e8 6e c
> [ 120.740035] RSP: 0018:ffffc90000ebfe10 EFLAGS: 00010246
> [ 120.740394] RAX: 0000000000000000 RBX: ffff88800d094180 RCX: ffffffff812c5752
> [ 120.740763] RDX: 0000000000000000 RSI: ffff88800eed8000 RDI: 0000000000000002
> [ 120.741129] RBP: ffffc90000ebfe38 R08: 0000000000000001 R09: 0000000000000001
> [ 120.741552] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> [ 120.741953] R13: ffffffff83e17ef0 R14: ffffffff83e17e70 R15: ffff88807dc3a400
> [ 120.742319] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> [ 120.742778] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 120.743125] CR2: 00007ffff951d3f8 CR3: 00000000131bc006 CR4: 0000000000770ef0
> [ 120.743503] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 120.743868] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> [ 120.744339] PKRU: 55555554
> [ 120.744495] Call Trace:
> [ 120.744633] <TASK>
> [ 120.744764] process_one_work+0x3b1/0x960
> [ 120.745002] worker_thread+0x52/0x660
> [ 120.745228] ? __pfx_worker_thread+0x10/0x10
> [ 120.745536] kthread+0x161/0x1a0
> [ 120.745748] ? __pfx_kthread+0x10/0x10
> [ 120.745968] ret_from_fork+0x29/0x50
> [ 120.746192] </TASK>
> [ 120.746321] irq event stamp: 71069
> [ 120.746523] hardirqs last enabled at (71077): [<ffffffff811d4891>] __up_console_sem+0x91/0xb0
> [ 120.747078] hardirqs last disabled at (71084): [<ffffffff811d4876>] __up_console_sem+0x76/0xb0
> [ 120.747529] softirqs last enabled at (70986): [<ffffffff82f9d213>] __do_softirq+0x323/0x48a
> [ 120.748051] softirqs last disabled at (71101): [<ffffffff81123142>] irq_exit_rcu+0xd2/0x100
> [ 120.748562] ---[ end trace 0000000000000000 ]---
> [ 149.825325] ------------[ cut here ]------------
> [ 149.825894] WARNING: CPU: 0 PID: 2872 at kernel/kprobes.c:777 register_kprobe+0x9a2/0x9b0
> [ 149.826370] Modules linked in:
> [ 149.826548] CPU: 0 PID: 2872 Comm: repro Tainted: G W 6.2.0-rc3-b7bfaa761d76 #1
> [ 149.827010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [ 149.827602] RIP: 0010:register_kprobe+0x9a2/0x9b0
> [ 149.827874] Code: 4d 89 74 24 08 e8 de 91 ff ff 45 89 ec 48 b8 22 01 00 00 00 00 ad de 48 89 43 08 e8 48 e4 f3 ff e9 d2 fd ff ff e8 be 91 ff ff <0f> 0b e9 4e 0
> [ 149.828872] RSP: 0018:ffffc90001aafc40 EFLAGS: 00010246
> [ 149.829162] RAX: 0000000000000000 RBX: ffff88800cefeb18 RCX: ffffffff812c86ba
> [ 149.829541] RDX: 0000000000000000 RSI: ffff88801324a340 RDI: 0000000000000002
> [ 149.829924] RBP: ffffc90001aafc88 R08: 0000000000000001 R09: 0000000000000001
> [ 149.830311] R10: ffffc90001aafc30 R11: 0000000000000001 R12: 0000000000000000
> [ 149.830697] R13: ffff88800d0949c0 R14: ffff88800cefeb28 R15: 0000000000000002
> [ 149.831089] FS: 00007fdc0d3c9740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> [ 149.831525] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 149.831844] CR2: 00007fdc0d5841d3 CR3: 000000000d70c006 CR4: 0000000000770ef0
> [ 149.832260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 149.832637] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> [ 149.833021] PKRU: 55555554
> [ 149.833179] Call Trace:
> [ 149.833323] <TASK>
> [ 149.833458] ? knl_uncore_pci_init+0x9/0x90
> [ 149.833712] ? knl_uncore_pci_init+0x9/0x90
> [ 149.833959] __register_trace_kprobe+0x183/0x190
> [ 149.834241] create_local_trace_kprobe+0x107/0x190
> [ 149.834519] perf_kprobe_init+0xa7/0x160
> [ 149.834753] perf_kprobe_event_init+0x98/0x120
> [ 149.835024] perf_try_init_event+0x88/0x280
> [ 149.835274] perf_event_alloc+0xf0a/0x1c50
> [ 149.835514] ? __do_sys_perf_event_open+0x9ca/0x1870
> [ 149.835805] ? __this_cpu_preempt_check+0x20/0x30
> [ 149.836102] __do_sys_perf_event_open+0x3b6/0x1870
> [ 149.836388] ? knl_uncore_pci_init+0x9/0x90
> [ 149.836636] __x64_sys_perf_event_open+0x2f/0x40
> [ 149.836905] do_syscall_64+0x3b/0x90
> [ 149.837115] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 149.837417] RIP: 0033:0x7fdc0d4ee59d
> [ 149.837630] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 8
> [ 149.838602] RSP: 002b:00007ffff951d3f8 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
> [ 149.839036] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fdc0d4ee59d
> [ 149.839420] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000200002c0
> [ 149.839800] RBP: 00007ffff951d410 R08: 0000000000000000 R09: 00007ffff951d410
> [ 149.840203] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000004011a0
> [ 149.840593] R13: 00007ffff951d530 R14: 0000000000000000 R15: 0000000000000000
> [ 149.840984] </TASK>
> [ 149.841120] irq event stamp: 1631
> [ 149.841303] hardirqs last enabled at (1639): [<ffffffff811d4891>] __up_console_sem+0x91/0xb0
> [ 149.841825] hardirqs last disabled at (1646): [<ffffffff811d4876>] __up_console_sem+0x76/0xb0
> [ 149.842279] softirqs last enabled at (1304): [<ffffffff82f9d213>] __do_softirq+0x323/0x48a
> [ 149.842723] softirqs last disabled at (1295): [<ffffffff81123142>] irq_exit_rcu+0xd2/0x100
> [ 149.843179] ---[ end trace 0000000000000000 ]---
>
> Bisect and found that bad commit is:
> 63dc6325ff41ee9e570bde705ac34a39c5dbeb44
> x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK

Oops, thanks for reporting. Let me check why this happened.


>
> After reverted the above commit on top of v6.2-rc3 kernel, this issue was gone.
>
> The reproduced code, kconfig, bisect info, and v6.2-rc3 reproduced log are in
> attached email.
>
> And all the detailed logs are in link:
> https://github.com/xupengfe/syzkaller_logs/tree/main/230113_100005_register_kprobe

Great, thanks!

>
> I hope it's helpful.
>
> Thanks!
> BR.


--
Masami Hiramatsu (Google) <[email protected]>

2023-01-23 13:24:13

by Masami Hiramatsu

[permalink] [raw]
Subject: [PATCH] kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list

From: Masami Hiramatsu (Google) <[email protected]>

Sinec forcibly unoptimized kprobes will be put on the freeing_list directly
in the unoptimize_kprobe(), do_unoptimize_kprobes() must continue to check
the freeing_list even if unoptimizing_list is empty.

This bug can be happen if a kprobe is put in an instruction which is in the
middle of the jump-replaced instruction sequence of an optprobe, *and* the
optprobe is recently unregistered and queued on unoptimizing_list.
In this case, the optprobe will be unoptimized forcibly (means immediately)
and put it into the freeing_list, expecting the optprobe will be handled in
do_unoptimize_kprobe().
But if there is no other optprobes on the unoptimizing_list, current code
returns from the do_unoptimize_kprobe() soon and do not handle the optprobe
which is on the freeing_list, and it will hit the WARN_ON_ONCE() in the
do_free_cleaned_kprobes(), because it is not handled in the latter loop of
the do_unoptimize_kprobe().

To solve this issue, do not return from do_unoptimize_kprobes() immediately
even if unoptimizing_list is empty.

Moreover, this change affects another case. kill_optimized_kprobes() expects
kprobe_optimizer() will just free the optprobe on freeing_list.
So I changed it to just do list_move() to freeing_list if optprobes are on
unoptimizing list. And the do_unoptimize_kprobe() will skip
arch_disarm_kprobe() if the probe on freeing_list has gone flag.

Link: https://lore.kernel.org/all/[email protected]/

Fixes: e4add247789e ("kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic")
Reported-by: Pengfei Xu <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
---
kernel/kprobes.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 1c18ecf9f98b..73b150fad936 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -555,17 +555,15 @@ static void do_unoptimize_kprobes(void)
/* See comment in do_optimize_kprobes() */
lockdep_assert_cpus_held();

- /* Unoptimization must be done anytime */
- if (list_empty(&unoptimizing_list))
- return;
+ if (!list_empty(&unoptimizing_list))
+ arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);

- arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
- /* Loop on 'freeing_list' for disarming */
+ /* Loop on 'freeing_list' for disarming and removing from kprobe hash list */
list_for_each_entry_safe(op, tmp, &freeing_list, list) {
/* Switching from detour code to origin */
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
- /* Disarm probes if marked disabled */
- if (kprobe_disabled(&op->kp))
+ /* Disarm probes if marked disabled and not gone */
+ if (kprobe_disabled(&op->kp) && !kprobe_gone(&op->kp))
arch_disarm_kprobe(&op->kp);
if (kprobe_unused(&op->kp)) {
/*
@@ -797,14 +795,13 @@ static void kill_optimized_kprobe(struct kprobe *p)
op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;

if (kprobe_unused(p)) {
- /* Enqueue if it is unused */
- list_add(&op->list, &freeing_list);
/*
- * Remove unused probes from the hash list. After waiting
- * for synchronization, this probe is reclaimed.
- * (reclaiming is done by do_free_cleaned_kprobes().)
+ * Unused kprobe is on unoptimizing or freeing list. We move it
+ * to freeing_list and let the kprobe_optimizer() removes it from
+ * the kprobe hash list and frees it.
*/
- hlist_del_rcu(&op->kp.hlist);
+ if (optprobe_queued_unopt(op))
+ list_move(&op->list, &freeing_list);
}

/* Don't touch the code, because it is already freed. */


2023-01-23 18:39:42

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list

On Mon, 23 Jan 2023 22:24:05 +0900
"Masami Hiramatsu (Google)" <[email protected]> wrote:

> From: Masami Hiramatsu (Google) <[email protected]>
>
> Sinec forcibly unoptimized kprobes will be put on the freeing_list directly

"Since"

> in the unoptimize_kprobe(), do_unoptimize_kprobes() must continue to check
> the freeing_list even if unoptimizing_list is empty.
>
> This bug can be happen if a kprobe is put in an instruction which is in the

"This bug can happen if"

> middle of the jump-replaced instruction sequence of an optprobe, *and* the
> optprobe is recently unregistered and queued on unoptimizing_list.
> In this case, the optprobe will be unoptimized forcibly (means immediately)
> and put it into the freeing_list, expecting the optprobe will be handled in
> do_unoptimize_kprobe().
> But if there is no other optprobes on the unoptimizing_list, current code
> returns from the do_unoptimize_kprobe() soon and do not handle the optprobe

"and does not handle'

> which is on the freeing_list, and it will hit the WARN_ON_ONCE() in the
> do_free_cleaned_kprobes(), because it is not handled in the latter loop of
> the do_unoptimize_kprobe().
>
> To solve this issue, do not return from do_unoptimize_kprobes() immediately
> even if unoptimizing_list is empty.
>
> Moreover, this change affects another case. kill_optimized_kprobes() expects
> kprobe_optimizer() will just free the optprobe on freeing_list.
> So I changed it to just do list_move() to freeing_list if optprobes are on
> unoptimizing list. And the do_unoptimize_kprobe() will skip
> arch_disarm_kprobe() if the probe on freeing_list has gone flag.
>
> Link: https://lore.kernel.org/all/[email protected]/
>
> Fixes: e4add247789e ("kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic")
> Reported-by: Pengfei Xu <[email protected]>
> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
> ---
> kernel/kprobes.c | 23 ++++++++++-------------
> 1 file changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 1c18ecf9f98b..73b150fad936 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -555,17 +555,15 @@ static void do_unoptimize_kprobes(void)
> /* See comment in do_optimize_kprobes() */
> lockdep_assert_cpus_held();
>
> - /* Unoptimization must be done anytime */
> - if (list_empty(&unoptimizing_list))
> - return;
> + if (!list_empty(&unoptimizing_list))
> + arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
>
> - arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
> - /* Loop on 'freeing_list' for disarming */
> + /* Loop on 'freeing_list' for disarming and removing from kprobe hash list */
> list_for_each_entry_safe(op, tmp, &freeing_list, list) {
> /* Switching from detour code to origin */
> op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
> - /* Disarm probes if marked disabled */
> - if (kprobe_disabled(&op->kp))
> + /* Disarm probes if marked disabled and not gone */
> + if (kprobe_disabled(&op->kp) && !kprobe_gone(&op->kp))
> arch_disarm_kprobe(&op->kp);
> if (kprobe_unused(&op->kp)) {
> /*
> @@ -797,14 +795,13 @@ static void kill_optimized_kprobe(struct kprobe *p)
> op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
>
> if (kprobe_unused(p)) {
> - /* Enqueue if it is unused */
> - list_add(&op->list, &freeing_list);
> /*
> - * Remove unused probes from the hash list. After waiting
> - * for synchronization, this probe is reclaimed.
> - * (reclaiming is done by do_free_cleaned_kprobes().)
> + * Unused kprobe is on unoptimizing or freeing list. We move it
> + * to freeing_list and let the kprobe_optimizer() removes it from

"remove it"

> + * the kprobe hash list and frees it.

"and free it."

> */
> - hlist_del_rcu(&op->kp.hlist);
> + if (optprobe_queued_unopt(op))
> + list_move(&op->list, &freeing_list);
> }
>
> /* Don't touch the code, because it is already freed. */

Other than the spelling issues,

Acked-by: Steven Rostedt (Google) <[email protected]>

-- Steve

2023-01-24 00:00:36

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH] kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list

On Mon, 23 Jan 2023 13:39:31 -0500
Steven Rostedt <[email protected]> wrote:

> On Mon, 23 Jan 2023 22:24:05 +0900
> "Masami Hiramatsu (Google)" <[email protected]> wrote:
>
> > From: Masami Hiramatsu (Google) <[email protected]>
> >
> > Sinec forcibly unoptimized kprobes will be put on the freeing_list directly
>
> "Since"
>
> > in the unoptimize_kprobe(), do_unoptimize_kprobes() must continue to check
> > the freeing_list even if unoptimizing_list is empty.
> >
> > This bug can be happen if a kprobe is put in an instruction which is in the
>
> "This bug can happen if"
>
> > middle of the jump-replaced instruction sequence of an optprobe, *and* the
> > optprobe is recently unregistered and queued on unoptimizing_list.
> > In this case, the optprobe will be unoptimized forcibly (means immediately)
> > and put it into the freeing_list, expecting the optprobe will be handled in
> > do_unoptimize_kprobe().
> > But if there is no other optprobes on the unoptimizing_list, current code
> > returns from the do_unoptimize_kprobe() soon and do not handle the optprobe
>
> "and does not handle'
>
> > which is on the freeing_list, and it will hit the WARN_ON_ONCE() in the
> > do_free_cleaned_kprobes(), because it is not handled in the latter loop of
> > the do_unoptimize_kprobe().
> >
> > To solve this issue, do not return from do_unoptimize_kprobes() immediately
> > even if unoptimizing_list is empty.
> >
> > Moreover, this change affects another case. kill_optimized_kprobes() expects
> > kprobe_optimizer() will just free the optprobe on freeing_list.
> > So I changed it to just do list_move() to freeing_list if optprobes are on
> > unoptimizing list. And the do_unoptimize_kprobe() will skip
> > arch_disarm_kprobe() if the probe on freeing_list has gone flag.
> >
> > Link: https://lore.kernel.org/all/[email protected]/
> >
> > Fixes: e4add247789e ("kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic")
> > Reported-by: Pengfei Xu <[email protected]>
> > Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
> > ---
> > kernel/kprobes.c | 23 ++++++++++-------------
> > 1 file changed, 10 insertions(+), 13 deletions(-)
> >
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 1c18ecf9f98b..73b150fad936 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -555,17 +555,15 @@ static void do_unoptimize_kprobes(void)
> > /* See comment in do_optimize_kprobes() */
> > lockdep_assert_cpus_held();
> >
> > - /* Unoptimization must be done anytime */
> > - if (list_empty(&unoptimizing_list))
> > - return;
> > + if (!list_empty(&unoptimizing_list))
> > + arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
> >
> > - arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
> > - /* Loop on 'freeing_list' for disarming */
> > + /* Loop on 'freeing_list' for disarming and removing from kprobe hash list */
> > list_for_each_entry_safe(op, tmp, &freeing_list, list) {
> > /* Switching from detour code to origin */
> > op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
> > - /* Disarm probes if marked disabled */
> > - if (kprobe_disabled(&op->kp))
> > + /* Disarm probes if marked disabled and not gone */
> > + if (kprobe_disabled(&op->kp) && !kprobe_gone(&op->kp))
> > arch_disarm_kprobe(&op->kp);
> > if (kprobe_unused(&op->kp)) {
> > /*
> > @@ -797,14 +795,13 @@ static void kill_optimized_kprobe(struct kprobe *p)
> > op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
> >
> > if (kprobe_unused(p)) {
> > - /* Enqueue if it is unused */
> > - list_add(&op->list, &freeing_list);
> > /*
> > - * Remove unused probes from the hash list. After waiting
> > - * for synchronization, this probe is reclaimed.
> > - * (reclaiming is done by do_free_cleaned_kprobes().)
> > + * Unused kprobe is on unoptimizing or freeing list. We move it
> > + * to freeing_list and let the kprobe_optimizer() removes it from
>
> "remove it"
>
> > + * the kprobe hash list and frees it.
>
> "and free it."
>
> > */
> > - hlist_del_rcu(&op->kp.hlist);
> > + if (optprobe_queued_unopt(op))
> > + list_move(&op->list, &freeing_list);
> > }
> >
> > /* Don't touch the code, because it is already freed. */
>
> Other than the spelling issues,
>
> Acked-by: Steven Rostedt (Google) <[email protected]>

Thanks for review! I'll fix typos and put on probes/urgent.
Also,
Cc: [email protected]

Thank you,

>
> -- Steve


--
Masami Hiramatsu (Google) <[email protected]>