2022-05-06 06:27:37

by Jason A. Donenfeld

Subject: Re: cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64

Hey again,

On Thu, May 05, 2022 at 04:50:16PM +0200, Jason A. Donenfeld wrote:
> Hi Rafael,
>
> I'm seeing the following boot time splat on 5.4.191:

Figured it out. 25928deeb1e4 ("ACPICA: Always create namespace nodes
using acpi_ns_create_node()") needs to be backported to stable.

Jason
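
For readers unfamiliar with the check named in the subject line, here is a minimal userspace sketch of the idea: each object remembers which cache it came from, and freeing it through a different cache is rejected with a message shaped like the one in the subject. Everything prefixed with toy_ is hypothetical and written only for illustration; it is not the kernel's slab implementation, and the object sizes used with it are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for a slab cache; NOT the kernel's struct kmem_cache. */
struct toy_cache {
	const char *name;
	size_t object_size;
};

/*
 * Hidden header stored in front of every object so the owning cache
 * can be recovered at free time. The max_align_t member keeps the
 * object that follows the header suitably aligned.
 */
union toy_header {
	const struct toy_cache *owner;
	max_align_t align;
};

/* Allocate one zeroed object and record which cache owns it. */
static void *toy_cache_alloc(const struct toy_cache *cache)
{
	union toy_header *hdr = calloc(1, sizeof(*hdr) + cache->object_size);

	if (!hdr)
		return NULL;
	hdr->owner = cache;
	return hdr + 1;
}

/*
 * Free an object through a given cache.
 * Returns 0 on success, or -1 (without freeing) when the object was
 * allocated from a different cache.
 */
static int toy_cache_free(const struct toy_cache *cache, void *obj)
{
	union toy_header *hdr;

	assert(obj != NULL);
	hdr = (union toy_header *)obj - 1;

	if (hdr->owner != cache) {
		fprintf(stderr,
			"toy_cache_free: Wrong slab cache. %s but object is from %s\n",
			cache->name, hdr->owner->name);
		return -1;
	}
	free(hdr);
	return 0;
}
```

Allocating from a toy cache named "kmalloc-64" and then freeing through one named "Acpi-Namespace" makes toy_cache_free() return -1; freeing through the cache that actually owns the object succeeds.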


2022-05-09 01:26:00

by Jason A. Donenfeld

Subject: [PATCH 5.4] ACPICA: Always create namespace nodes using acpi_ns_create_node()

From: Vegard Nossum <[email protected]>

commit 25928deeb1e4e2cdae1dccff349320c6841eb5f8 upstream.

ACPICA commit 29da9a2a3f5b2c60420893e5c6309a0586d7a329

ACPI is allocating an object using kmalloc(), but then frees it
using kmem_cache_free(<"Acpi-Namespace" kmem_cache>).

This is wrong and can lead to boot failures manifesting like this:

hpet0: 3 comparators, 64-bit 100.000000 MHz counter
clocksource: Switched to clocksource tsc-early
BUG: unable to handle page fault for address: 000000003ffe0018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0+ #211
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:kmem_cache_alloc+0x70/0x1d0
Code: 00 00 4c 8b 45 00 65 49 8b 50 08 65 4c 03 05 6f cc e7 7e 4d 8b
20 4d 85 e4 0f 84 3d 01 00 00 8b 45 20 48 8b 7d 00 48 8d 4a 01 <49> 8b
1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 c5 8b 45 20
RSP: 0000:ffffc90000013df8 EFLAGS: 00010206
RAX: 0000000000000018 RBX: ffffffff81c49200 RCX: 0000000000000002
RDX: 0000000000000001 RSI: 0000000000000dc0 RDI: 000000000002b300
RBP: ffff88803e403d00 R08: ffff88803ec2b300 R09: 0000000000000001
R10: 0000000000000dc0 R11: 0000000000000006 R12: 000000003ffe0000
R13: ffffffff8110a583 R14: 0000000000000dc0 R15: ffffffff81c49a80
FS: 0000000000000000(0000) GS:ffff88803ec00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000003ffe0018 CR3: 0000000001c0a001 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__trace_define_field+0x33/0xa0
event_trace_init+0xeb/0x2b4
tracer_init_tracefs+0x60/0x195
? register_tracer+0x1e7/0x1e7
do_one_initcall+0x74/0x160
kernel_init_freeable+0x190/0x1f0
? rest_init+0x9a/0x9a
kernel_init+0x5/0xf6
ret_from_fork+0x35/0x40
CR2: 000000003ffe0018
---[ end trace 707efa023f2ee960 ]---
RIP: 0010:kmem_cache_alloc+0x70/0x1d0

Bisection leads to unrelated changes in slab; Vlastimil Babka
suggests an unrelated layout or slab merge change merely exposed
the underlying bug.

Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/r/[email protected]
Link: https://github.com/acpica/acpica/commit/29da9a2a
Fixes: f79c8e4136ea ("ACPICA: Namespace: simplify creation of the initial/default namespace")
Reported-by: Jan Kiszka <[email protected]>
Diagnosed-by: Vlastimil Babka <[email protected]>
Diagnosed-by: Kees Cook <[email protected]>
Signed-off-by: Vegard Nossum <[email protected]>
Signed-off-by: Bob Moore <[email protected]>
Signed-off-by: Erik Kaneda <[email protected]>
Cc: 5.10+ <[email protected]> # 5.10+
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
Greg/Rafael - this was marked as 5.10, but 5.4 crashes without it. So
maybe it was mistagged? Will let you guys decide. -Jason

drivers/acpi/acpica/nsaccess.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
index 3f045b5953b2..a0c1a665dfc1 100644
--- a/drivers/acpi/acpica/nsaccess.c
+++ b/drivers/acpi/acpica/nsaccess.c
@@ -99,13 +99,12 @@ acpi_status acpi_ns_root_initialize(void)
* just create and link the new node(s) here.
*/
new_node =
- ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_namespace_node));
+ acpi_ns_create_node(*ACPI_CAST_PTR(u32, init_val->name));
if (!new_node) {
status = AE_NO_MEMORY;
goto unlock_and_exit;
}

- ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
new_node->type = init_val->type;

--
2.35.1
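
A note on why the ACPI_COPY_NAMESEG() line can go away in the hunk above: the four-character name is handed to the node constructor as a 32-bit integer, and the sketch below assumes the constructor stores that integer in a union that aliases the four ASCII bytes. The toy_ types and functions are hypothetical userspace stand-ins, not the ACPICA definitions, and calloc()/free() stand in for the dedicated cache.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy name union: the same four bytes viewed as an integer or as text. */
union toy_name {
	uint32_t integer;
	char ascii[4];
};

/* Toy namespace node; only the name field matters for this sketch. */
struct toy_node {
	union toy_name name;
	uint8_t type;
};

/* Reinterpret a four-character name segment as a 32-bit integer. */
static uint32_t toy_nameseg_to_u32(const char *name)
{
	uint32_t value;

	assert(name != NULL);
	/* memcpy avoids the alignment and aliasing pitfalls of a cast. */
	memcpy(&value, name, sizeof(value));
	return value;
}

/*
 * Single constructor for nodes. Because every node comes from here,
 * every node can also be released through one matching destructor.
 */
static struct toy_node *toy_create_node(uint32_t name)
{
	struct toy_node *node = calloc(1, sizeof(*node));

	if (!node)
		return NULL;
	/* Storing the integer also fills in the four ASCII bytes. */
	node->name.integer = name;
	return node;
}

/* Matching destructor for toy_create_node(). */
static void toy_delete_node(struct toy_node *node)
{
	free(node);
}
```

Calling toy_create_node(toy_nameseg_to_u32("_GPE")) yields a node whose name.ascii bytes already read "_GPE", so no separate copy of the name is needed after construction.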


2022-05-09 03:07:53

by Vegard Nossum

Subject: Re: [PATCH 5.4] ACPICA: Always create namespace nodes using acpi_ns_create_node()


On 5/5/22 17:01, Jason A. Donenfeld wrote:
> From: Vegard Nossum <[email protected]>
>
> commit 25928deeb1e4e2cdae1dccff349320c6841eb5f8 upstream.
>
> ACPICA commit 29da9a2a3f5b2c60420893e5c6309a0586d7a329
>
> ACPI is allocating an object using kmalloc(), but then frees it
> using kmem_cache_free(<"Acpi-Namespace" kmem_cache>).
>

[...]

> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://lore.kernel.org/r/[email protected]
> Link: https://github.com/acpica/acpica/commit/29da9a2a
> Fixes: f79c8e4136ea ("ACPICA: Namespace: simplify creation of the initial/default namespace")
> Reported-by: Jan Kiszka <[email protected]>
> Diagnosed-by: Vlastimil Babka <[email protected]>
> Diagnosed-by: Kees Cook <[email protected]>
> Signed-off-by: Vegard Nossum <[email protected]>
> Signed-off-by: Bob Moore <[email protected]>
> Signed-off-by: Erik Kaneda <[email protected]>
> Cc: 5.10+ <[email protected]> # 5.10+
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> Greg/Rafael - this was marked as 5.10, but 5.4 crashes without it. So
> maybe it was mistagged? Will let you guys decide. -Jason

If I look up the Fixes: commit I get:

$ git name-rev f79c8e4136eac37255ead8875593ae33a2c16d20
f79c8e4136eac37255ead8875593ae33a2c16d20 tags/linus/v5.3-rc1~166^2~1^2~4

so it looks like the buggy commit actually went into v5.3.

I think the bug has been there since v5.3 but was merely exposed by
some unrelated SLUB change that went in later; maybe that's where the
version number confusion came from. See
<https://lore.kernel.org/lkml/[email protected]/>
as well. The commit I had bisected it to was:

$ git name-rev --refs='v5.*' 67a72420a326b45514deb3f212085fb2cd1595b5
67a72420a326b45514deb3f212085fb2cd1595b5 linus/v5.4-rc1~141^2~2^2~7

But as Vlastimil Babka pointed out, the bug is sensitive to slab merging.

Anyway, thanks for spotting that.


Vegard

2022-05-09 10:40:54

by Greg Kroah-Hartman

Subject: Re: [PATCH 5.4] ACPICA: Always create namespace nodes using acpi_ns_create_node()

On Thu, May 05, 2022 at 05:01:40PM +0200, Jason A. Donenfeld wrote:
> From: Vegard Nossum <[email protected]>
>
> commit 25928deeb1e4e2cdae1dccff349320c6841eb5f8 upstream.
>
> ACPICA commit 29da9a2a3f5b2c60420893e5c6309a0586d7a329
>
> ACPI is allocating an object using kmalloc(), but then frees it
> using kmem_cache_free(<"Acpi-Namespace" kmem_cache>).
>
> This is wrong and can lead to boot failures manifesting like this:
>
> hpet0: 3 comparators, 64-bit 100.000000 MHz counter
> clocksource: Switched to clocksource tsc-early
> BUG: unable to handle page fault for address: 000000003ffe0018
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0+ #211
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:kmem_cache_alloc+0x70/0x1d0
> Code: 00 00 4c 8b 45 00 65 49 8b 50 08 65 4c 03 05 6f cc e7 7e 4d 8b
> 20 4d 85 e4 0f 84 3d 01 00 00 8b 45 20 48 8b 7d 00 48 8d 4a 01 <49> 8b
> 1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 c5 8b 45 20
> RSP: 0000:ffffc90000013df8 EFLAGS: 00010206
> RAX: 0000000000000018 RBX: ffffffff81c49200 RCX: 0000000000000002
> RDX: 0000000000000001 RSI: 0000000000000dc0 RDI: 000000000002b300
> RBP: ffff88803e403d00 R08: ffff88803ec2b300 R09: 0000000000000001
> R10: 0000000000000dc0 R11: 0000000000000006 R12: 000000003ffe0000
> R13: ffffffff8110a583 R14: 0000000000000dc0 R15: ffffffff81c49a80
> FS: 0000000000000000(0000) GS:ffff88803ec00000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000003ffe0018 CR3: 0000000001c0a001 CR4: 00000000003606f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> __trace_define_field+0x33/0xa0
> event_trace_init+0xeb/0x2b4
> tracer_init_tracefs+0x60/0x195
> ? register_tracer+0x1e7/0x1e7
> do_one_initcall+0x74/0x160
> kernel_init_freeable+0x190/0x1f0
> ? rest_init+0x9a/0x9a
> kernel_init+0x5/0xf6
> ret_from_fork+0x35/0x40
> CR2: 000000003ffe0018
> ---[ end trace 707efa023f2ee960 ]---
> RIP: 0010:kmem_cache_alloc+0x70/0x1d0
>
> Bisection leads to unrelated changes in slab; Vlastimil Babka
> suggests an unrelated layout or slab merge change merely exposed
> the underlying bug.
>
> Link: https://lore.kernel.org/lkml/[email protected]/
> Link: https://lore.kernel.org/r/[email protected]
> Link: https://github.com/acpica/acpica/commit/29da9a2a
> Fixes: f79c8e4136ea ("ACPICA: Namespace: simplify creation of the initial/default namespace")
> Reported-by: Jan Kiszka <[email protected]>
> Diagnosed-by: Vlastimil Babka <[email protected]>
> Diagnosed-by: Kees Cook <[email protected]>
> Signed-off-by: Vegard Nossum <[email protected]>
> Signed-off-by: Bob Moore <[email protected]>
> Signed-off-by: Erik Kaneda <[email protected]>
> Cc: 5.10+ <[email protected]> # 5.10+
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> Greg/Rafael - this was marked as 5.10, but 5.4 crashes without it. So
> maybe it was mistagged? Will let you guys decide. -Jason

Makes sense, now queued up, thanks.

greg k-h