2020-06-04 17:19:40

by Vegard Nossum

[permalink] [raw]
Subject: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018


Hi all,

I ran into a boot problem with latest linus/master
(6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:

hpet0: 3 comparators, 64-bit 100.000000 MHz counter
clocksource: Switched to clocksource tsc-early
BUG: unable to handle page fault for address: 000000003ffe0018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0+ #211
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:kmem_cache_alloc+0x70/0x1d0
Code: 00 00 4c 8b 45 00 65 49 8b 50 08 65 4c 03 05 6f cc e7 7e 4d 8b 20
4d 85 e4 0f 84 3d 01 00 00 8b 45 20 48 8b 7d 00 48 8d 4a 01 <49> 8b 1c
04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 c5 8b 45 20
RSP: 0000:ffffc90000013df8 EFLAGS: 00010206
RAX: 0000000000000018 RBX: ffffffff81c49200 RCX: 0000000000000002
RDX: 0000000000000001 RSI: 0000000000000dc0 RDI: 000000000002b300
RBP: ffff88803e403d00 R08: ffff88803ec2b300 R09: 0000000000000001
R10: 0000000000000dc0 R11: 0000000000000006 R12: 000000003ffe0000
R13: ffffffff8110a583 R14: 0000000000000dc0 R15: ffffffff81c49a80
FS: 0000000000000000(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000003ffe0018 CR3: 0000000001c0a001 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__trace_define_field+0x33/0xa0
event_trace_init+0xeb/0x2b4
tracer_init_tracefs+0x60/0x195
? register_tracer+0x1e7/0x1e7
do_one_initcall+0x74/0x160
kernel_init_freeable+0x190/0x1f0
? rest_init+0x9a/0x9a
kernel_init+0x5/0xf6
ret_from_fork+0x35/0x40
CR2: 000000003ffe0018
---[ end trace 707efa023f2ee960 ]---
RIP: 0010:kmem_cache_alloc+0x70/0x1d0

Bisection gives me:

commit 3202fa62fb43087387c65bfa9c100feffac74aa6
Author: Kees Cook <[email protected]>
Date: Wed Apr 1 21:04:27 2020 -0700

slub: relocate freelist pointer to middle of object

Reverting these three commits fixes it:

3202fa62fb43087387c65bfa9c100feffac74aa6 slub: relocate freelist pointer
to middle of object
89b83f282d8ba380cf2124f88106c57df49c538c slub: avoid redzone when
choosing freepointer location
cbfc35a48609ceac978791e3ab9dde0c01f8cb20 mm/slub: fix incorrect
interpretation of s->offset


Vegard


2020-06-04 17:20:55

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/4/20 7:14 PM, Vegard Nossum wrote:
>
> Hi all,
>
> I ran into a boot problem with latest linus/master
> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:

Hi, what's the .config you use?

> hpet0: 3 comparators, 64-bit 100.000000 MHz counter
> clocksource: Switched to clocksource tsc-early
> BUG: unable to handle page fault for address: 000000003ffe0018
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.6.0+ #211
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:kmem_cache_alloc+0x70/0x1d0
> Code: 00 00 4c 8b 45 00 65 49 8b 50 08 65 4c 03 05 6f cc e7 7e 4d 8b 20
> 4d 85 e4 0f 84 3d 01 00 00 8b 45 20 48 8b 7d 00 48 8d 4a 01 <49> 8b 1c
> 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 c5 8b 45 20
> RSP: 0000:ffffc90000013df8 EFLAGS: 00010206
> RAX: 0000000000000018 RBX: ffffffff81c49200 RCX: 0000000000000002
> RDX: 0000000000000001 RSI: 0000000000000dc0 RDI: 000000000002b300
> RBP: ffff88803e403d00 R08: ffff88803ec2b300 R09: 0000000000000001
> R10: 0000000000000dc0 R11: 0000000000000006 R12: 000000003ffe0000
> R13: ffffffff8110a583 R14: 0000000000000dc0 R15: ffffffff81c49a80
> FS: 0000000000000000(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000003ffe0018 CR3: 0000000001c0a001 CR4: 00000000003606f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> __trace_define_field+0x33/0xa0
> event_trace_init+0xeb/0x2b4
> tracer_init_tracefs+0x60/0x195
> ? register_tracer+0x1e7/0x1e7
> do_one_initcall+0x74/0x160
> kernel_init_freeable+0x190/0x1f0
> ? rest_init+0x9a/0x9a
> kernel_init+0x5/0xf6
> ret_from_fork+0x35/0x40
> CR2: 000000003ffe0018
> ---[ end trace 707efa023f2ee960 ]---
> RIP: 0010:kmem_cache_alloc+0x70/0x1d0
>
> Bisection gives me:
>
> commit 3202fa62fb43087387c65bfa9c100feffac74aa6
> Author: Kees Cook <[email protected]>
> Date: Wed Apr 1 21:04:27 2020 -0700
>
> slub: relocate freelist pointer to middle of object
>
> Reverting these three commits fixes it:
>
> 3202fa62fb43087387c65bfa9c100feffac74aa6 slub: relocate freelist pointer
> to middle of object
> 89b83f282d8ba380cf2124f88106c57df49c538c slub: avoid redzone when
> choosing freepointer location
> cbfc35a48609ceac978791e3ab9dde0c01f8cb20 mm/slub: fix incorrect
> interpretation of s->offset
>
>
> Vegard
>

2020-06-04 17:23:50

by Vegard Nossum

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 2020-06-04 19:18, Vlastimil Babka wrote:
> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>
>> Hi all,
>>
>> I ran into a boot problem with latest linus/master
>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>
> Hi, what's the .config you use?

Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)

Attached.


Vegard


Attachments:
.config (72.84 kB)

2020-06-04 17:56:52

by Kees Cook

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> On 2020-06-04 19:18, Vlastimil Babka wrote:
> > On 6/4/20 7:14 PM, Vegard Nossum wrote:
> > >
> > > Hi all,
> > >
> > > I ran into a boot problem with latest linus/master
> > > (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
> >
> > Hi, what's the .config you use?
>
> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)

What are you boot parameters? This must be another corner case...
(I'll try reproducing with your .config)

--
Kees Cook

2020-06-04 18:00:14

by Kees Cook

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> On 2020-06-04 19:18, Vlastimil Babka wrote:
> > On 6/4/20 7:14 PM, Vegard Nossum wrote:
> > >
> > > Hi all,
> > >
> > > I ran into a boot problem with latest linus/master
> > > (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
> >
> > Hi, what's the .config you use?
>
> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)

Oh yes indeed. I immediately crash in the same way with this config. I'll
start digging...

(defconfig finishes boot)

-Kees

--
Kees Cook

2020-06-04 19:13:26

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/4/20 7:57 PM, Kees Cook wrote:
> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>> > On 6/4/20 7:14 PM, Vegard Nossum wrote:
>> > >
>> > > Hi all,
>> > >
>> > > I ran into a boot problem with latest linus/master
>> > > (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>> >
>> > Hi, what's the .config you use?
>>
>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
>
> Oh yes indeed. I immediately crash in the same way with this config. I'll
> start digging...
>
> (defconfig finishes boot)

This is funny, booting with slub_debug=F results in:
I'm not sure if it's ACPI or ftrace wrong here, but looks like the changed
free pointer offset merely exposes a bug in something else.

This would mean acpi_os_release_object() calling kmem_cache_free(ftrace_event_field, x)
where x is actually from kmalloc-64? Both parts of that sounds wrong.

Thread starts here: https://lore.kernel.org/linux-mm/[email protected]/

[ 0.144386] ACPI: Added _OSI(Module Device)
[ 0.144496] ACPI: Added _OSI(Processor Device)
[ 0.144956] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.145432] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.145501] ACPI: Added _OSI(Linux-Dell-Video)
[ 0.145951] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 0.146522] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 0.147070] ACPI Error: AE_BAD_PARAMETER, During Region initialization (20200430/tbxfload-52)
[ 0.147494] ACPI: Unable to load the System Description Tables
[ 0.148104] ACPI Error: Could not remove SCI handler (20200430/evmisc-251)
[ 0.148507] ------------[ cut here ]------------
[ 0.148985] cache_from_obj: Wrong slab cache. ftrace_event_field but object is from kmalloc-64
[ 0.149502] WARNING: CPU: 0 PID: 1 at mm/slab.h:523 kmem_cache_free+0x248/0x260
[ 0.150254] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #43
[ 0.150490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
[ 0.150490] RIP: 0010:kmem_cache_free+0x248/0x260
[ 0.150490] Code: ff 0f 0b e9 9d fe ff ff 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 c1 a4 48 c7 c7 f0 c1 d0 a4 c6 05 9f 05 b1 00 01 e8 bc cc eb ff <0f> 0b 48 8b 15 5f 36 9b 00 4c 89 ed e9 d6 fd ff ff 0f 1f 80 00 00
[ 0.150490] RSP: 0018:ffffb4dac0013dc0 EFLAGS: 00010282
[ 0.150490] RAX: 0000000000000000 RBX: ffffa38a07409e00 RCX: 0000000000000000
[ 0.150490] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffa51dd32c
[ 0.150490] RBP: ffffa38a07403900 R08: ffffb4dac0013c7d R09: 00000000000000eb
[ 0.150490] R10: ffffb4dac0013c78 R11: ffffb4dac0013c7d R12: ffffa38a87409e00
[ 0.150490] R13: ffffa38a07401d00 R14: 0000000000000000 R15: 0000000000000000
[ 0.150490] FS: 0000000000000000(0000) GS:ffffa38a07a00000(0000) knlGS:0000000000000000
[ 0.150490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.150490] CR2: 0000000000000000 CR3: 000000000560a000 CR4: 00000000003406f0
[ 0.150490] Call Trace:
[ 0.150490] acpi_os_release_object+0x5/0x10
[ 0.150490] acpi_ns_delete_children+0x46/0x59
[ 0.150490] acpi_ns_delete_namespace_subtree+0x5c/0x79
[ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
[ 0.150490] acpi_ns_terminate+0xc/0x31
[ 0.150490] acpi_ut_subsystem_shutdown+0x45/0xa3
[ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
[ 0.150490] acpi_terminate+0x5/0xf
[ 0.150490] acpi_init+0x27b/0x308
[ 0.150490] ? video_setup+0x79/0x79
[ 0.150490] do_one_initcall+0x7b/0x160
[ 0.150490] kernel_init_freeable+0x190/0x1f2
[ 0.150490] ? rest_init+0x9a/0x9a
[ 0.150490] kernel_init+0x5/0xf6
[ 0.150490] ret_from_fork+0x22/0x30
[ 0.150490] ---[ end trace 967e9fbc065d7911 ]---


2020-06-05 09:13:36

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> On 6/4/20 7:57 PM, Kees Cook wrote:
>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I ran into a boot problem with latest linus/master
>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>>>>
>>>> Hi, what's the .config you use?
>>>
>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
>>
>> Oh yes indeed. I immediately crash in the same way with this config. I'll
>> start digging...
>>
>> (defconfig finishes boot)
>
> This is funny, booting with slub_debug=F results in:
> I'm not sure if it's ACPI or ftrace wrong here, but looks like the changed
> free pointer offset merely exposes a bug in something else.

So, with Kees' patch reverted, booting with slub_debug=F (or even more
specific slub_debug=F,ftrace_event_field) also hits this bug below. I
wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
further in history. So it's not new at all, and likely very specific to
your config+QEMU? (and related to the ACPI error messages that precede it?).

> This would mean acpi_os_release_object() calling kmem_cache_free(ftrace_event_field, x)
> where x is actually from kmalloc-64? Both parts of that sounds wrong.
>
> Thread starts here: https://lore.kernel.org/linux-mm/[email protected]/
>
> [ 0.144386] ACPI: Added _OSI(Module Device)
> [ 0.144496] ACPI: Added _OSI(Processor Device)
> [ 0.144956] ACPI: Added _OSI(3.0 _SCP Extensions)
> [ 0.145432] ACPI: Added _OSI(Processor Aggregator Device)
> [ 0.145501] ACPI: Added _OSI(Linux-Dell-Video)
> [ 0.145951] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> [ 0.146522] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> [ 0.147070] ACPI Error: AE_BAD_PARAMETER, During Region initialization (20200430/tbxfload-52)
> [ 0.147494] ACPI: Unable to load the System Description Tables
> [ 0.148104] ACPI Error: Could not remove SCI handler (20200430/evmisc-251)
> [ 0.148507] ------------[ cut here ]------------
> [ 0.148985] cache_from_obj: Wrong slab cache. ftrace_event_field but object is from kmalloc-64
> [ 0.149502] WARNING: CPU: 0 PID: 1 at mm/slab.h:523 kmem_cache_free+0x248/0x260
> [ 0.150254] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #43
> [ 0.150490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> [ 0.150490] RIP: 0010:kmem_cache_free+0x248/0x260
> [ 0.150490] Code: ff 0f 0b e9 9d fe ff ff 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 c1 a4 48 c7 c7 f0 c1 d0 a4 c6 05 9f 05 b1 00 01 e8 bc cc eb ff <0f> 0b 48 8b 15 5f 36 9b 00 4c 89 ed e9 d6 fd ff ff 0f 1f 80 00 00
> [ 0.150490] RSP: 0018:ffffb4dac0013dc0 EFLAGS: 00010282
> [ 0.150490] RAX: 0000000000000000 RBX: ffffa38a07409e00 RCX: 0000000000000000
> [ 0.150490] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffa51dd32c
> [ 0.150490] RBP: ffffa38a07403900 R08: ffffb4dac0013c7d R09: 00000000000000eb
> [ 0.150490] R10: ffffb4dac0013c78 R11: ffffb4dac0013c7d R12: ffffa38a87409e00
> [ 0.150490] R13: ffffa38a07401d00 R14: 0000000000000000 R15: 0000000000000000
> [ 0.150490] FS: 0000000000000000(0000) GS:ffffa38a07a00000(0000) knlGS:0000000000000000
> [ 0.150490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.150490] CR2: 0000000000000000 CR3: 000000000560a000 CR4: 00000000003406f0
> [ 0.150490] Call Trace:
> [ 0.150490] acpi_os_release_object+0x5/0x10
> [ 0.150490] acpi_ns_delete_children+0x46/0x59
> [ 0.150490] acpi_ns_delete_namespace_subtree+0x5c/0x79
> [ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.150490] acpi_ns_terminate+0xc/0x31
> [ 0.150490] acpi_ut_subsystem_shutdown+0x45/0xa3
> [ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.150490] acpi_terminate+0x5/0xf
> [ 0.150490] acpi_init+0x27b/0x308
> [ 0.150490] ? video_setup+0x79/0x79
> [ 0.150490] do_one_initcall+0x7b/0x160
> [ 0.150490] kernel_init_freeable+0x190/0x1f2
> [ 0.150490] ? rest_init+0x9a/0x9a
> [ 0.150490] kernel_init+0x5/0xf6
> [ 0.150490] ret_from_fork+0x22/0x30
> [ 0.150490] ---[ end trace 967e9fbc065d7911 ]---
>
>
>

2020-06-05 09:42:19

by Vegard Nossum

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018


On 2020-06-05 11:11, Vlastimil Babka wrote:
> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
>> On 6/4/20 7:57 PM, Kees Cook wrote:
>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I ran into a boot problem with latest linus/master
>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>>>>>
>>>>> Hi, what's the .config you use?
>>>>
>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
>>>
>>> Oh yes indeed. I immediately crash in the same way with this config. I'll
>>> start digging...
>>>
>>> (defconfig finishes boot)
>>
>> This is funny, booting with slub_debug=F results in:
>> I'm not sure if it's ACPI or ftrace wrong here, but looks like the changed
>> free pointer offset merely exposes a bug in something else.
>
> So, with Kees' patch reverted, booting with slub_debug=F (or even more
> specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> further in history. So it's not new at all, and likely very specific to
> your config+QEMU? (and related to the ACPI error messages that precede it?).

I see it too, but not on v5.0. I can bisect it.

Also, panic_on_warn is apparently a core parameter, it should probably
be __setup()...


Vegard

>
>> This would mean acpi_os_release_object() calling kmem_cache_free(ftrace_event_field, x)
>> where x is actually from kmalloc-64? Both parts of that sounds wrong.
>>
>> Thread starts here: https://lore.kernel.org/linux-mm/[email protected]/
>>
>> [ 0.144386] ACPI: Added _OSI(Module Device)
>> [ 0.144496] ACPI: Added _OSI(Processor Device)
>> [ 0.144956] ACPI: Added _OSI(3.0 _SCP Extensions)
>> [ 0.145432] ACPI: Added _OSI(Processor Aggregator Device)
>> [ 0.145501] ACPI: Added _OSI(Linux-Dell-Video)
>> [ 0.145951] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
>> [ 0.146522] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
>> [ 0.147070] ACPI Error: AE_BAD_PARAMETER, During Region initialization (20200430/tbxfload-52)
>> [ 0.147494] ACPI: Unable to load the System Description Tables
>> [ 0.148104] ACPI Error: Could not remove SCI handler (20200430/evmisc-251)
>> [ 0.148507] ------------[ cut here ]------------
>> [ 0.148985] cache_from_obj: Wrong slab cache. ftrace_event_field but object is from kmalloc-64
>> [ 0.149502] WARNING: CPU: 0 PID: 1 at mm/slab.h:523 kmem_cache_free+0x248/0x260
>> [ 0.150254] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #43
>> [ 0.150490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
>> [ 0.150490] RIP: 0010:kmem_cache_free+0x248/0x260
>> [ 0.150490] Code: ff 0f 0b e9 9d fe ff ff 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 c1 a4 48 c7 c7 f0 c1 d0 a4 c6 05 9f 05 b1 00 01 e8 bc cc eb ff <0f> 0b 48 8b 15 5f 36 9b 00 4c 89 ed e9 d6 fd ff ff 0f 1f 80 00 00
>> [ 0.150490] RSP: 0018:ffffb4dac0013dc0 EFLAGS: 00010282
>> [ 0.150490] RAX: 0000000000000000 RBX: ffffa38a07409e00 RCX: 0000000000000000
>> [ 0.150490] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffa51dd32c
>> [ 0.150490] RBP: ffffa38a07403900 R08: ffffb4dac0013c7d R09: 00000000000000eb
>> [ 0.150490] R10: ffffb4dac0013c78 R11: ffffb4dac0013c7d R12: ffffa38a87409e00
>> [ 0.150490] R13: ffffa38a07401d00 R14: 0000000000000000 R15: 0000000000000000
>> [ 0.150490] FS: 0000000000000000(0000) GS:ffffa38a07a00000(0000) knlGS:0000000000000000
>> [ 0.150490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 0.150490] CR2: 0000000000000000 CR3: 000000000560a000 CR4: 00000000003406f0
>> [ 0.150490] Call Trace:
>> [ 0.150490] acpi_os_release_object+0x5/0x10
>> [ 0.150490] acpi_ns_delete_children+0x46/0x59
>> [ 0.150490] acpi_ns_delete_namespace_subtree+0x5c/0x79
>> [ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
>> [ 0.150490] acpi_ns_terminate+0xc/0x31
>> [ 0.150490] acpi_ut_subsystem_shutdown+0x45/0xa3
>> [ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
>> [ 0.150490] acpi_terminate+0x5/0xf
>> [ 0.150490] acpi_init+0x27b/0x308
>> [ 0.150490] ? video_setup+0x79/0x79
>> [ 0.150490] do_one_initcall+0x7b/0x160
>> [ 0.150490] kernel_init_freeable+0x190/0x1f2
>> [ 0.150490] ? rest_init+0x9a/0x9a
>> [ 0.150490] kernel_init+0x5/0xf6
>> [ 0.150490] ret_from_fork+0x22/0x30
>> [ 0.150490] ---[ end trace 967e9fbc065d7911 ]---
>>
>>
>>
>

2020-06-05 12:50:37

by Vegard Nossum

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 2020-06-05 11:36, Vegard Nossum wrote:
>
> On 2020-06-05 11:11, Vlastimil Babka wrote:
>> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
>>> On 6/4/20 7:57 PM, Kees Cook wrote:
>>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I ran into a boot problem with latest linus/master
>>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>>>>>>
>>>>>> Hi, what's the .config you use?
>>>>>
>>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
>>>>
>>>> Oh yes indeed. I immediately crash in the same way with this config.
>>>> I'll
>>>> start digging...
>>>>
>>>> (defconfig finishes boot)
>>>
>>> This is funny, booting with slub_debug=F results in:
>>> I'm not sure if it's ACPI or ftrace wrong here, but looks like the
>>> changed
>>> free pointer offset merely exposes a bug in something else.
>>
>> So, with Kees' patch reverted, booting with slub_debug=F (or even more
>> specific slub_debug=F,ftrace_event_field) also hits this bug below. I
>> wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
>> further in history. So it's not new at all, and likely very specific to
>> your config+QEMU? (and related to the ACPI error messages that precede
>> it?).
>
> I see it too, but not on v5.0. I can bisect it.

commit 67a72420a326b45514deb3f212085fb2cd1595b5
Author: Bob Moore <[email protected]>
Date: Fri Aug 16 14:43:21 2019 -0700

ACPICA: Increase total number of possible Owner IDs

ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324

From 255 to 4095 possible IDs.

Link: https://github.com/acpica/acpica/commit/1f1652da
Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
Signed-off-by: Bob Moore <[email protected]>
Signed-off-by: Erik Schmauss <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>


Vegard

>>> This would mean acpi_os_release_object() calling
>>> kmem_cache_free(ftrace_event_field, x)
>>> where x is actually from kmalloc-64? Both parts of that sounds wrong.
>>>
>>> Thread starts here:
>>> https://lore.kernel.org/linux-mm/[email protected]/
>>>
>>>
>>> [    0.144386] ACPI: Added _OSI(Module Device)
>>> [    0.144496] ACPI: Added _OSI(Processor Device)
>>> [    0.144956] ACPI: Added _OSI(3.0 _SCP Extensions)
>>> [    0.145432] ACPI: Added _OSI(Processor Aggregator Device)
>>> [    0.145501] ACPI: Added _OSI(Linux-Dell-Video)
>>> [    0.145951] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
>>> [    0.146522] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
>>> [    0.147070] ACPI Error: AE_BAD_PARAMETER, During Region
>>> initialization (20200430/tbxfload-52)
>>> [    0.147494] ACPI: Unable to load the System Description Tables
>>> [    0.148104] ACPI Error: Could not remove SCI handler
>>> (20200430/evmisc-251)
>>> [    0.148507] ------------[ cut here ]------------
>>> [    0.148985] cache_from_obj: Wrong slab cache. ftrace_event_field
>>> but object is from kmalloc-64
>>> [    0.149502] WARNING: CPU: 0 PID: 1 at mm/slab.h:523
>>> kmem_cache_free+0x248/0x260
>>> [    0.150254] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #43
>>> [    0.150490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
>>> [    0.150490] RIP: 0010:kmem_cache_free+0x248/0x260
>>> [    0.150490] Code: ff 0f 0b e9 9d fe ff ff 49 8b 4d 58 48 8b 55 58
>>> 48 c7 c6 10 47 c1 a4 48 c7 c7 f0 c1 d0 a4 c6 05 9f 05 b1 00 01 e8 bc
>>> cc eb ff <0f> 0b 48 8b 15 5f 36 9b 00 4c 89 ed e9 d6 fd ff ff 0f 1f
>>> 80 00 00
>>> [    0.150490] RSP: 0018:ffffb4dac0013dc0 EFLAGS: 00010282
>>> [    0.150490] RAX: 0000000000000000 RBX: ffffa38a07409e00 RCX:
>>> 0000000000000000
>>> [    0.150490] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
>>> ffffffffa51dd32c
>>> [    0.150490] RBP: ffffa38a07403900 R08: ffffb4dac0013c7d R09:
>>> 00000000000000eb
>>> [    0.150490] R10: ffffb4dac0013c78 R11: ffffb4dac0013c7d R12:
>>> ffffa38a87409e00
>>> [    0.150490] R13: ffffa38a07401d00 R14: 0000000000000000 R15:
>>> 0000000000000000
>>> [    0.150490] FS:  0000000000000000(0000) GS:ffffa38a07a00000(0000)
>>> knlGS:0000000000000000
>>> [    0.150490] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [    0.150490] CR2: 0000000000000000 CR3: 000000000560a000 CR4:
>>> 00000000003406f0
>>> [    0.150490] Call Trace:
>>> [    0.150490]  acpi_os_release_object+0x5/0x10
>>> [    0.150490]  acpi_ns_delete_children+0x46/0x59
>>> [    0.150490]  acpi_ns_delete_namespace_subtree+0x5c/0x79
>>> [    0.150490]  ? acpi_sleep_proc_init+0x1f/0x1f
>>> [    0.150490]  acpi_ns_terminate+0xc/0x31
>>> [    0.150490]  acpi_ut_subsystem_shutdown+0x45/0xa3
>>> [    0.150490]  ? acpi_sleep_proc_init+0x1f/0x1f
>>> [    0.150490]  acpi_terminate+0x5/0xf
>>> [    0.150490]  acpi_init+0x27b/0x308
>>> [    0.150490]  ? video_setup+0x79/0x79
>>> [    0.150490]  do_one_initcall+0x7b/0x160
>>> [    0.150490]  kernel_init_freeable+0x190/0x1f2
>>> [    0.150490]  ? rest_init+0x9a/0x9a
>>> [    0.150490]  kernel_init+0x5/0xf6
>>> [    0.150490]  ret_from_fork+0x22/0x30
>>> [    0.150490] ---[ end trace 967e9fbc065d7911 ]---
>>>
>>>
>>>
>>
>

2020-06-05 13:18:06

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
>
> On 2020-06-05 11:36, Vegard Nossum wrote:
> >
> > On 2020-06-05 11:11, Vlastimil Babka wrote:
> >> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> >>> On 6/4/20 7:57 PM, Kees Cook wrote:
> >>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> >>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
> >>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
> >>>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I ran into a boot problem with latest linus/master
> >>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
> >>>>>>
> >>>>>> Hi, what's the .config you use?
> >>>>>
> >>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
> >>>>
> >>>> Oh yes indeed. I immediately crash in the same way with this config.
> >>>> I'll
> >>>> start digging...
> >>>>
> >>>> (defconfig finishes boot)
> >>>
> >>> This is funny, booting with slub_debug=F results in:
> >>> I'm not sure if it's ACPI or ftrace wrong here, but looks like the
> >>> changed
> >>> free pointer offset merely exposes a bug in something else.
> >>
> >> So, with Kees' patch reverted, booting with slub_debug=F (or even more
> >> specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> >> wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> >> further in history. So it's not new at all, and likely very specific to
> >> your config+QEMU? (and related to the ACPI error messages that precede
> >> it?).
> >
> > I see it too, but not on v5.0. I can bisect it.
>
> commit 67a72420a326b45514deb3f212085fb2cd1595b5
> Author: Bob Moore <[email protected]>
> Date: Fri Aug 16 14:43:21 2019 -0700
>
> ACPICA: Increase total number of possible Owner IDs
>
> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
>
> From 255 to 4095 possible IDs.
>
> Link: https://github.com/acpica/acpica/commit/1f1652da
> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> Signed-off-by: Bob Moore <[email protected]>
> Signed-off-by: Erik Schmauss <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>

Bob, Erik, did we miss something in that patch?

> >>> This would mean acpi_os_release_object() calling
> >>> kmem_cache_free(ftrace_event_field, x)
> >>> where x is actually from kmalloc-64? Both parts of that sounds wrong.
> >>>
> >>> Thread starts here:
> >>> https://lore.kernel.org/linux-mm/[email protected]/
> >>>
> >>>
> >>> [ 0.144386] ACPI: Added _OSI(Module Device)
> >>> [ 0.144496] ACPI: Added _OSI(Processor Device)
> >>> [ 0.144956] ACPI: Added _OSI(3.0 _SCP Extensions)
> >>> [ 0.145432] ACPI: Added _OSI(Processor Aggregator Device)
> >>> [ 0.145501] ACPI: Added _OSI(Linux-Dell-Video)
> >>> [ 0.145951] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
> >>> [ 0.146522] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
> >>> [ 0.147070] ACPI Error: AE_BAD_PARAMETER, During Region
> >>> initialization (20200430/tbxfload-52)
> >>> [ 0.147494] ACPI: Unable to load the System Description Tables
> >>> [ 0.148104] ACPI Error: Could not remove SCI handler
> >>> (20200430/evmisc-251)
> >>> [ 0.148507] ------------[ cut here ]------------
> >>> [ 0.148985] cache_from_obj: Wrong slab cache. ftrace_event_field
> >>> but object is from kmalloc-64
> >>> [ 0.149502] WARNING: CPU: 0 PID: 1 at mm/slab.h:523
> >>> kmem_cache_free+0x248/0x260
> >>> [ 0.150254] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #43
> >>> [ 0.150490] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >>> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> >>> [ 0.150490] RIP: 0010:kmem_cache_free+0x248/0x260
> >>> [ 0.150490] Code: ff 0f 0b e9 9d fe ff ff 49 8b 4d 58 48 8b 55 58
> >>> 48 c7 c6 10 47 c1 a4 48 c7 c7 f0 c1 d0 a4 c6 05 9f 05 b1 00 01 e8 bc
> >>> cc eb ff <0f> 0b 48 8b 15 5f 36 9b 00 4c 89 ed e9 d6 fd ff ff 0f 1f
> >>> 80 00 00
> >>> [ 0.150490] RSP: 0018:ffffb4dac0013dc0 EFLAGS: 00010282
> >>> [ 0.150490] RAX: 0000000000000000 RBX: ffffa38a07409e00 RCX:
> >>> 0000000000000000
> >>> [ 0.150490] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> >>> ffffffffa51dd32c
> >>> [ 0.150490] RBP: ffffa38a07403900 R08: ffffb4dac0013c7d R09:
> >>> 00000000000000eb
> >>> [ 0.150490] R10: ffffb4dac0013c78 R11: ffffb4dac0013c7d R12:
> >>> ffffa38a87409e00
> >>> [ 0.150490] R13: ffffa38a07401d00 R14: 0000000000000000 R15:
> >>> 0000000000000000
> >>> [ 0.150490] FS: 0000000000000000(0000) GS:ffffa38a07a00000(0000)
> >>> knlGS:0000000000000000
> >>> [ 0.150490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [ 0.150490] CR2: 0000000000000000 CR3: 000000000560a000 CR4:
> >>> 00000000003406f0
> >>> [ 0.150490] Call Trace:
> >>> [ 0.150490] acpi_os_release_object+0x5/0x10
> >>> [ 0.150490] acpi_ns_delete_children+0x46/0x59
> >>> [ 0.150490] acpi_ns_delete_namespace_subtree+0x5c/0x79
> >>> [ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
> >>> [ 0.150490] acpi_ns_terminate+0xc/0x31
> >>> [ 0.150490] acpi_ut_subsystem_shutdown+0x45/0xa3
> >>> [ 0.150490] ? acpi_sleep_proc_init+0x1f/0x1f
> >>> [ 0.150490] acpi_terminate+0x5/0xf
> >>> [ 0.150490] acpi_init+0x27b/0x308
> >>> [ 0.150490] ? video_setup+0x79/0x79
> >>> [ 0.150490] do_one_initcall+0x7b/0x160
> >>> [ 0.150490] kernel_init_freeable+0x190/0x1f2
> >>> [ 0.150490] ? rest_init+0x9a/0x9a
> >>> [ 0.150490] kernel_init+0x5/0xf6
> >>> [ 0.150490] ret_from_fork+0x22/0x30
> >>> [ 0.150490] ---[ end trace 967e9fbc065d7911 ]---
> >>>
> >>>
> >>>
> >>
> >
>

2020-06-05 14:10:56

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
>>
>> On 2020-06-05 11:36, Vegard Nossum wrote:
>> >
>> > On 2020-06-05 11:11, Vlastimil Babka wrote:
>> >> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
>> >>> On 6/4/20 7:57 PM, Kees Cook wrote:
>> >>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>> >>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>> >>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>> >>>>>>>
>> >>>>>>> Hi all,
>> >>>>>>>
>> >>>>>>> I ran into a boot problem with latest linus/master
>> >>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>> >>>>>>
>> >>>>>> Hi, what's the .config you use?
>> >>>>>
>> >>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
>> >>>>
>> >>>> Oh yes indeed. I immediately crash in the same way with this config.
>> >>>> I'll
>> >>>> start digging...
>> >>>>
>> >>>> (defconfig finishes boot)
>> >>>
>> >>> This is funny, booting with slub_debug=F results in:
>> >>> I'm not sure if it's ACPI or ftrace wrong here, but looks like the
>> >>> changed
>> >>> free pointer offset merely exposes a bug in something else.
>> >>
>> >> So, with Kees' patch reverted, booting with slub_debug=F (or even more
>> >> specific slub_debug=F,ftrace_event_field) also hits this bug below. I
>> >> wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
>> >> further in history. So it's not new at all, and likely very specific to
>> >> your config+QEMU? (and related to the ACPI error messages that precede
>> >> it?).
>> >
>> > I see it too, but not on v5.0. I can bisect it.
>>
>> commit 67a72420a326b45514deb3f212085fb2cd1595b5
>> Author: Bob Moore <[email protected]>
>> Date: Fri Aug 16 14:43:21 2019 -0700
>>
>> ACPICA: Increase total number of possible Owner IDs
>>
>> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
>>
>> From 255 to 4095 possible IDs.
>>
>> Link: https://github.com/acpica/acpica/commit/1f1652da
>> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
>> Signed-off-by: Bob Moore <[email protected]>
>> Signed-off-by: Erik Schmauss <[email protected]>
>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>
> Bob, Erik, did we miss something in that patch?

Maybe the patch just changes layout in a way that exposes the bug.

Anyway the "ftrace_event_field" cache is not really involved, this is just
because of slab merging. After adding "slub_nomerge" to "slub_debug=F", it
starts making more sense, as the cache becomes Acpi-Namespace

[ 0.140408] ------------[ cut here ]------------
[ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
[ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
[ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
[ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
[ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
[ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
[ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
[ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX: 0000000000000000
[ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffacfdd32c
[ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09: 00000000000000fc
[ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12: ffff937307409e00
[ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15: 0000000000000000
[ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000) knlGS:0000000000000000
[ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4: 00000000003406f0
[ 0.142393] Call Trace:
[ 0.142393] acpi_os_release_object+0x5/0x10
[ 0.142393] acpi_ns_delete_children+0x46/0x59
[ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
[ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
[ 0.142393] acpi_ns_terminate+0xc/0x31
[ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
[ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
[ 0.142393] acpi_terminate+0x5/0xf
[ 0.142393] acpi_init+0x27b/0x308
[ 0.142393] ? video_setup+0x79/0x79
[ 0.142393] do_one_initcall+0x7b/0x160
[ 0.142393] kernel_init_freeable+0x190/0x1f2
[ 0.142393] ? rest_init+0x9a/0x9a
[ 0.142393] kernel_init+0x5/0xf6
[ 0.142393] ret_from_fork+0x22/0x30
[ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
[ 0.142396] ------------[ cut here ]------------

I've also changed the warning so it's not printed just once, and also prints tracking info
(see the hunk at the end of my mail, I'll turn this to a proper patch later).

With "slub_debug=FU slub_nomerge" there are now multiple warnings, but they all look the same:

[ 0.143815] ------------[ cut here ]------------
[ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
[ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
[ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
[ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
[ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
[ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
[ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
[ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX: 0000000000000000
[ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffff8e1dd32c
[ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09: 00000000000000fc
[ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12: ffff972dc74ada80
[ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15: 0000000000000000
[ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000) knlGS:0000000000000000
[ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4: 00000000003406f0
[ 0.145129] Call Trace:
[ 0.145129] acpi_os_release_object+0x5/0x10
[ 0.145129] acpi_ns_delete_children+0x46/0x59
[ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
[ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
[ 0.145129] acpi_ns_terminate+0xc/0x31
[ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
[ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
[ 0.145129] acpi_terminate+0x5/0xf
[ 0.145129] acpi_init+0x27b/0x308
[ 0.145129] ? video_setup+0x79/0x79
[ 0.145129] do_one_initcall+0x7b/0x160
[ 0.145129] kernel_init_freeable+0x190/0x1f2
[ 0.145129] ? rest_init+0x9a/0x9a
[ 0.145129] kernel_init+0x5/0xf6
[ 0.145129] ret_from_fork+0x22/0x30
[ 0.145129] ---[ end trace 574554fca7bd06bb ]---
[ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58 cpu=0 pid=0
[ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
[ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
[ 0.146578] acpi_initialize_subsystem+0x65/0xa8
[ 0.147024] acpi_early_init+0x5d/0xd1
[ 0.147132] start_kernel+0x45b/0x518
[ 0.147491] secondary_startup_64+0xb6/0xc0
[ 0.147897] ------------[ cut here ]------------

And it seems ACPI is allocating an object via kmalloc() and then freeing it
via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which is wrong.

> ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
acpi_ns_root_initialize+0xb6/0x2d1:
kmalloc at include/linux/slab.h:555
(inlined by) kzalloc at include/linux/slab.h:669
(inlined by) acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
(inlined by) acpi_ns_root_initialize at drivers/acpi/acpica/nsaccess.c:102


----8<----
diff --git a/mm/slab.h b/mm/slab.h
index 207c83ef6e06..282e78964018 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -503,6 +503,7 @@ static __always_inline void uncharge_slab_page(struct page *page, int order,
memcg_uncharge_slab(page, order, s);
}

+extern void print_tracking(struct kmem_cache *s, void *object);
static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
{
struct kmem_cache *cachep;
@@ -520,9 +521,11 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
return s;

cachep = virt_to_cache(x);
- WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
+ if (WARN(cachep && !slab_equal_or_root(cachep, s),
"%s: Wrong slab cache. %s but object is from %s\n",
- __func__, s->name, cachep->name);
+ __func__, s->name, cachep->name)) {
+ print_tracking(cachep, x);
+ }
return cachep;
}

2020-06-05 14:27:31

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

acpi_os_allocate_zeroed(On Fri, Jun 5, 2020 at 4:08 PM Vlastimil Babka
<[email protected]> wrote:
>
> On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
> >>
> >> On 2020-06-05 11:36, Vegard Nossum wrote:
> >> >
> >> > On 2020-06-05 11:11, Vlastimil Babka wrote:
> >> >> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> >> >>> On 6/4/20 7:57 PM, Kees Cook wrote:
> >> >>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> >> >>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
> >> >>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
> >> >>>>>>>
> >> >>>>>>> Hi all,
> >> >>>>>>>
> >> >>>>>>> I ran into a boot problem with latest linus/master
> >> >>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
> >> >>>>>>
> >> >>>>>> Hi, what's the .config you use?
> >> >>>>>
> >> >>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
> >> >>>>
> >> >>>> Oh yes indeed. I immediately crash in the same way with this config.
> >> >>>> I'll
> >> >>>> start digging...
> >> >>>>
> >> >>>> (defconfig finishes boot)
> >> >>>
> >> >>> This is funny, booting with slub_debug=F results in:
> >> >>> I'm not sure if it's ACPI or ftrace wrong here, but looks like the
> >> >>> changed
> >> >>> free pointer offset merely exposes a bug in something else.
> >> >>
> >> >> So, with Kees' patch reverted, booting with slub_debug=F (or even more
> >> >> specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> >> >> wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> >> >> further in history. So it's not new at all, and likely very specific to
> >> >> your config+QEMU? (and related to the ACPI error messages that precede
> >> >> it?).
> >> >
> >> > I see it too, but not on v5.0. I can bisect it.
> >>
> >> commit 67a72420a326b45514deb3f212085fb2cd1595b5
> >> Author: Bob Moore <[email protected]>
> >> Date: Fri Aug 16 14:43:21 2019 -0700
> >>
> >> ACPICA: Increase total number of possible Owner IDs
> >>
> >> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
> >>
> >> From 255 to 4095 possible IDs.
> >>
> >> Link: https://github.com/acpica/acpica/commit/1f1652da
> >> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> >> Signed-off-by: Bob Moore <[email protected]>
> >> Signed-off-by: Erik Schmauss <[email protected]>
> >> Signed-off-by: Rafael J. Wysocki <[email protected]>
> >
> > Bob, Erik, did we miss something in that patch?
>
> Maybe the patch just changes layout in a way that exposes the bug.
>
> Anyway the "ftrace_event_field" cache is not really involved, this is just
> because of slab merging. After adding "slub_nomerge" to "slub_debug=F", it
> starts making more sense, as the cache becomes Acpi-Namespace
>
> [ 0.140408] ------------[ cut here ]------------
> [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX: 0000000000000000
> [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffacfdd32c
> [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09: 00000000000000fc
> [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12: ffff937307409e00
> [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15: 0000000000000000
> [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000) knlGS:0000000000000000
> [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4: 00000000003406f0
> [ 0.142393] Call Trace:
> [ 0.142393] acpi_os_release_object+0x5/0x10
> [ 0.142393] acpi_ns_delete_children+0x46/0x59
> [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.142393] acpi_ns_terminate+0xc/0x31
> [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.142393] acpi_terminate+0x5/0xf
> [ 0.142393] acpi_init+0x27b/0x308
> [ 0.142393] ? video_setup+0x79/0x79
> [ 0.142393] do_one_initcall+0x7b/0x160
> [ 0.142393] kernel_init_freeable+0x190/0x1f2
> [ 0.142393] ? rest_init+0x9a/0x9a
> [ 0.142393] kernel_init+0x5/0xf6
> [ 0.142393] ret_from_fork+0x22/0x30
> [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> [ 0.142396] ------------[ cut here ]------------
>
> I've also changed the warning so it's not printed just once, and also prints tracking info
> (see the hunk at the end of my mail, I'll turn this to a proper patch later).
>
> With "slub_debug=FU slub_nomerge" there are now multiple warnings, but they all look the same:
>
> [ 0.143815] ------------[ cut here ]------------
> [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX: 0000000000000000
> [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffff8e1dd32c
> [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09: 00000000000000fc
> [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12: ffff972dc74ada80
> [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15: 0000000000000000
> [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000) knlGS:0000000000000000
> [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4: 00000000003406f0
> [ 0.145129] Call Trace:
> [ 0.145129] acpi_os_release_object+0x5/0x10
> [ 0.145129] acpi_ns_delete_children+0x46/0x59
> [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.145129] acpi_ns_terminate+0xc/0x31
> [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.145129] acpi_terminate+0x5/0xf
> [ 0.145129] acpi_init+0x27b/0x308
> [ 0.145129] ? video_setup+0x79/0x79
> [ 0.145129] do_one_initcall+0x7b/0x160
> [ 0.145129] kernel_init_freeable+0x190/0x1f2
> [ 0.145129] ? rest_init+0x9a/0x9a
> [ 0.145129] kernel_init+0x5/0xf6
> [ 0.145129] ret_from_fork+0x22/0x30
> [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58 cpu=0 pid=0
> [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> [ 0.147024] acpi_early_init+0x5d/0xd1
> [ 0.147132] start_kernel+0x45b/0x518
> [ 0.147491] secondary_startup_64+0xb6/0xc0
> [ 0.147897] ------------[ cut here ]------------
>
> And it seems ACPI is allocating an object via kmalloc() and then freeing it
> via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which is wrong.
>
> > ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> acpi_ns_root_initialize+0xb6/0x2d1:
> kmalloc at include/linux/slab.h:555
> (inlined by) kzalloc at include/linux/slab.h:669
> (inlined by) acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> (inlined by) acpi_ns_root_initialize at drivers/acpi/acpica/nsaccess.c:102
>
>
> ----8<----
> diff --git a/mm/slab.h b/mm/slab.h
> index 207c83ef6e06..282e78964018 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -503,6 +503,7 @@ static __always_inline void uncharge_slab_page(struct page *page, int order,
> memcg_uncharge_slab(page, order, s);
> }
>
> +extern void print_tracking(struct kmem_cache *s, void *object);
> static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
> {
> struct kmem_cache *cachep;
> @@ -520,9 +521,11 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
> return s;
>
> cachep = virt_to_cache(x);
> - WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
> + if (WARN(cachep && !slab_equal_or_root(cachep, s),
> "%s: Wrong slab cache. %s but object is from %s\n",
> - __func__, s->name, cachep->name);
> + __func__, s->name, cachep->name)) {
> + print_tracking(cachep, x);
> + }
> return cachep;
> }

Well, the bug seems to be that the object in question is allocated via
acpi_os_allocate_zeroed() while it should be allocated with
acpi_os_acquire_object().

2020-06-05 14:52:36

by Vegard Nossum

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 2020-06-05 16:08, Vlastimil Babka wrote:
> On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
>> On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
>>>
>>> On 2020-06-05 11:36, Vegard Nossum wrote:
>>>>
>>>> On 2020-06-05 11:11, Vlastimil Babka wrote:
>>>>> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
>>>>>> On 6/4/20 7:57 PM, Kees Cook wrote:
>>>>>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>>>>>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>>>>>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I ran into a boot problem with latest linus/master
>>>>>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests like this:
>>>>>>>>>
>>>>>>>>> Hi, what's the .config you use?
>>>>>>>>
>>>>>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB, ...)
>>>>>>>
>>>>>>> Oh yes indeed. I immediately crash in the same way with this config.
>>>>>>> I'll
>>>>>>> start digging...
>>>>>>>
>>>>>>> (defconfig finishes boot)
>>>>>>
>>>>>> This is funny, booting with slub_debug=F results in:
>>>>>> I'm not sure if it's ACPI or ftrace wrong here, but looks like the
>>>>>> changed
>>>>>> free pointer offset merely exposes a bug in something else.
>>>>>
>>>>> So, with Kees' patch reverted, booting with slub_debug=F (or even more
>>>>> specific slub_debug=F,ftrace_event_field) also hits this bug below. I
>>>>> wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
>>>>> further in history. So it's not new at all, and likely very specific to
>>>>> your config+QEMU? (and related to the ACPI error messages that precede
>>>>> it?).
>>>>
>>>> I see it too, but not on v5.0. I can bisect it.
>>>
>>> commit 67a72420a326b45514deb3f212085fb2cd1595b5
>>> Author: Bob Moore <[email protected]>
>>> Date: Fri Aug 16 14:43:21 2019 -0700
>>>
>>> ACPICA: Increase total number of possible Owner IDs
>>>
>>> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
>>>
>>> From 255 to 4095 possible IDs.
>>>
>>> Link: https://github.com/acpica/acpica/commit/1f1652da
>>> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
>>> Signed-off-by: Bob Moore <[email protected]>
>>> Signed-off-by: Erik Schmauss <[email protected]>
>>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>>
>> Bob, Erik, did we miss something in that patch?
>
> Maybe the patch just changes layout in a way that exposes the bug.
>
> Anyway the "ftrace_event_field" cache is not really involved, this is just
> because of slab merging. After adding "slub_nomerge" to "slub_debug=F", it
> starts making more sense, as the cache becomes Acpi-Namespace
>
> [ 0.140408] ------------[ cut here ]------------
> [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX: 0000000000000000
> [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffacfdd32c
> [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09: 00000000000000fc
> [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12: ffff937307409e00
> [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15: 0000000000000000
> [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000) knlGS:0000000000000000
> [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4: 00000000003406f0
> [ 0.142393] Call Trace:
> [ 0.142393] acpi_os_release_object+0x5/0x10
> [ 0.142393] acpi_ns_delete_children+0x46/0x59
> [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.142393] acpi_ns_terminate+0xc/0x31
> [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.142393] acpi_terminate+0x5/0xf
> [ 0.142393] acpi_init+0x27b/0x308
> [ 0.142393] ? video_setup+0x79/0x79
> [ 0.142393] do_one_initcall+0x7b/0x160
> [ 0.142393] kernel_init_freeable+0x190/0x1f2
> [ 0.142393] ? rest_init+0x9a/0x9a
> [ 0.142393] kernel_init+0x5/0xf6
> [ 0.142393] ret_from_fork+0x22/0x30
> [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> [ 0.142396] ------------[ cut here ]------------
>
> I've also changed the warning so it's not printed just once, and also prints tracking info
> (see the hunk at the end of my mail, I'll turn this to a proper patch later).
>
> With "slub_debug=FU slub_nomerge" there are now multiple warnings, but they all look the same:
>
> [ 0.143815] ------------[ cut here ]------------
> [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX: 0000000000000000
> [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffff8e1dd32c
> [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09: 00000000000000fc
> [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12: ffff972dc74ada80
> [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15: 0000000000000000
> [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000) knlGS:0000000000000000
> [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4: 00000000003406f0
> [ 0.145129] Call Trace:
> [ 0.145129] acpi_os_release_object+0x5/0x10
> [ 0.145129] acpi_ns_delete_children+0x46/0x59
> [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.145129] acpi_ns_terminate+0xc/0x31
> [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> [ 0.145129] acpi_terminate+0x5/0xf
> [ 0.145129] acpi_init+0x27b/0x308
> [ 0.145129] ? video_setup+0x79/0x79
> [ 0.145129] do_one_initcall+0x7b/0x160
> [ 0.145129] kernel_init_freeable+0x190/0x1f2
> [ 0.145129] ? rest_init+0x9a/0x9a
> [ 0.145129] kernel_init+0x5/0xf6
> [ 0.145129] ret_from_fork+0x22/0x30
> [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58 cpu=0 pid=0
> [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> [ 0.147024] acpi_early_init+0x5d/0xd1
> [ 0.147132] start_kernel+0x45b/0x518
> [ 0.147491] secondary_startup_64+0xb6/0xc0
> [ 0.147897] ------------[ cut here ]------------
>
> And it seems ACPI is allocating an object via kmalloc() and then freeing it
> via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which is wrong.
>
>> ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> acpi_ns_root_initialize+0xb6/0x2d1:
> kmalloc at include/linux/slab.h:555
> (inlined by) kzalloc at include/linux/slab.h:669
> (inlined by) acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> (inlined by) acpi_ns_root_initialize at drivers/acpi/acpica/nsaccess.c:102
>

That's it :-) This fixes it for me:

diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
index 2566e2d4c7803..b76bbab917941 100644
--- a/drivers/acpi/acpica/nsaccess.c
+++ b/drivers/acpi/acpica/nsaccess.c
@@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
* predefined names are at the root level. It is much
easier to
* just create and link the new node(s) here.
*/
- new_node =
- ACPI_ALLOCATE_ZEROED(sizeof(struct
acpi_namespace_node));
+ new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
init_val->name));
if (!new_node) {
status = AE_NO_MEMORY;
goto unlock_and_exit;
}

- ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
new_node->type = init_val->type;


Vegard

2020-06-05 15:46:42

by Kees Cook

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> On 2020-06-05 16:08, Vlastimil Babka wrote:
> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
> > > >
> > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> > > > >
> > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
> > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> > > > > > further in history. So it's not new at all, and likely very specific to
> > > > > > your config+QEMU? (and related to the ACPI error messages that precede
> > > > > > it?).
> > [...]
> > [ 0.140408] ------------[ cut here ]------------
> > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250

Ah yes! Good. I had improved this check recently too, and I was worried
the freelist pointer patch was somehow blocking it, but I see now that
the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
tripping. Whew.

I wonder if that entire test block should just be removed from
cache_from_obj():

if (!memcg_kmem_enabled() &&
!IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
!unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
return s;

and make this test unconditional? It's mostly only called during free(),
and shouldn't be too expensive to be made unconditional. Hmm.

> > And it seems ACPI is allocating an object via kmalloc() and then freeing it
> > via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which is wrong.
> >
> > > ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > acpi_ns_root_initialize+0xb6/0x2d1:
> > kmalloc at include/linux/slab.h:555
> > (inlined by) kzalloc at include/linux/slab.h:669
> > (inlined by) acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > (inlined by) acpi_ns_root_initialize at drivers/acpi/acpica/nsaccess.c:102
> >
>
> That's it :-) This fixes it for me:
>
> diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
> index 2566e2d4c7803..b76bbab917941 100644
> --- a/drivers/acpi/acpica/nsaccess.c
> +++ b/drivers/acpi/acpica/nsaccess.c
> @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
> * predefined names are at the root level. It is much easier
> to
> * just create and link the new node(s) here.
> */
> - new_node =
> - ACPI_ALLOCATE_ZEROED(sizeof(struct
> acpi_namespace_node));
> + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
> init_val->name));
> if (!new_node) {
> status = AE_NO_MEMORY;
> goto unlock_and_exit;
> }
>
> - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
> new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
> new_node->type = init_val->type;

I'm a bit confused by the internals of acpi_ns_create_note(). It can still
end up calling ACPI_ALLOCATE_ZEROED() via acpi_os_acquire_object(). Is
this fix correct?

--
Kees Cook

2020-06-05 16:41:04

by Vegard Nossum

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 2020-06-05 17:44, Kees Cook wrote:
> On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
>> That's it :-) This fixes it for me:
>>
>> diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
>> index 2566e2d4c7803..b76bbab917941 100644
>> --- a/drivers/acpi/acpica/nsaccess.c
>> +++ b/drivers/acpi/acpica/nsaccess.c
>> @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
>> * predefined names are at the root level. It is much easier
>> to
>> * just create and link the new node(s) here.
>> */
>> - new_node =
>> - ACPI_ALLOCATE_ZEROED(sizeof(struct
>> acpi_namespace_node));
>> + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
>> init_val->name));
>> if (!new_node) {
>> status = AE_NO_MEMORY;
>> goto unlock_and_exit;
>> }
>>
>> - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
>> new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
>> new_node->type = init_val->type;
>
> I'm a bit confused by the internals of acpi_ns_create_note(). It can still
> end up calling ACPI_ALLOCATE_ZEROED() via acpi_os_acquire_object(). Is
> this fix correct?
>

include/acpi/platform/aclinuxex.h:static inline void
*acpi_os_acquire_object(acpi_cache_t * cache)
include/acpi/platform/aclinuxex.h-{
include/acpi/platform/aclinuxex.h- return kmem_cache_zalloc(cache,
include/acpi/platform/aclinuxex.h-
irqs_disabled()? GFP_ATOMIC : GFP_KERNEL);
include/acpi/platform/aclinuxex.h-}

No comment.


Vegard

2020-06-05 16:57:53

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018


On 6/5/20 5:44 PM, Kees Cook wrote:
> On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
>> On 2020-06-05 16:08, Vlastimil Babka wrote:
>> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
>> > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
>> > > >
>> > > > On 2020-06-05 11:36, Vegard Nossum wrote:
>> > > > >
>> > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
>> > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
>> > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
>> > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
>> > > > > > further in history. So it's not new at all, and likely very specific to
>> > > > > > your config+QEMU? (and related to the ACPI error messages that precede
>> > > > > > it?).
>> > [...]
>> > [ 0.140408] ------------[ cut here ]------------
>> > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
>> > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
>
> Ah yes! Good. I had improved this check recently too, and I was worried
> the freelist pointer patch was somehow blocking it, but I see now that
> the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
> SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
> tripping. Whew.
>
> I wonder if that entire test block should just be removed from
> cache_from_obj():
>
> if (!memcg_kmem_enabled() &&
> !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
> !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
> return s;
>
> and make this test unconditional? It's mostly only called during free(),
> and shouldn't be too expensive to be made unconditional. Hmm.

Hmm I have a different idea. The whole cache_from_obj() was added because of
kmemcg (commit b9ce5ef49f00d) where per-memcg cache can be different from the
root one. And I just realized this usecase can go away with Roman's series [1].
But cache_from_obj() also kept the original SLUB consistency check case, and you
added the freelist hardening case. If kmemcg use case went away it would be nice
to avoid the virt_to_cache() and check completely again, unless in debugging or
hardened kernel.

Furthermore, the original SLUB debugging case was an unconditional pr_err() plus
WARN_ON_ONCE(1), which was kept by commit b9ce5ef49f00d. With freelist
hardening this all changed to WARN_ONCE. So the second and later cases are not
reported at all for hardening and also not for explicitly enabled debugging like
in this case, which is IMHO not ideal.

So I propose the following - the freelist hardening case keeps the WARN_ONCE,
but also a one-line pr_err() for each case so they are not silent. The SLUB
debugging case is always a full warning, and printing the tracking info if
enabled and available. Pure kmemcg case does virt_to_cache() for now (until
hopefully removed by Roman's series) but no checking at all. Would that work for
everyone?

[1] https://lore.kernel.org/linux-mm/[email protected]/

----8<----
diff --git a/mm/slab.h b/mm/slab.h
index 815e4e9a94cd..1182ca2cb11a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -275,6 +275,14 @@ static inline int cache_vmstat_idx(struct kmem_cache *s)
NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE;
}

+#ifdef CONFIG_SLUB_DEBUG
+void slab_print_tracking(struct kmem_cache *s, void *object);
+#else
+static inline void slab_print_tracking(struct kmem_cache *s, void *object)
+{
+}
+#endif
+
#ifdef CONFIG_MEMCG_KMEM

/* List of all root caches. */
@@ -520,9 +528,18 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
return s;

cachep = virt_to_cache(x);
- WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
- "%s: Wrong slab cache. %s but object is from %s\n",
- __func__, s->name, cachep->name);
+ if (unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) {
+ if (WARN(cachep && !slab_equal_or_root(cachep, s),
+ "%s: Wrong slab cache. %s but object is from %s\n",
+ __func__, s->name, cachep->name))
+ slab_print_tracking(cachep, x);
+ } else if (IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED)) {
+ if (unlikely(cachep && !slab_equal_or_root(cachep, s))) {
+ pr_err("%s: Wrong slab cache. %s but object is from %s\n",
+ __func__, s->name, cachep->name);
+ WARN_ON_ONCE(1);
+ }
+ }
return cachep;
}

diff --git a/mm/slub.c b/mm/slub.c
index d4a9a097da50..ff2d817c5a94 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -634,7 +634,7 @@ static void print_track(const char *s, struct track *t, unsigned long pr_time)
#endif
}

-static void print_tracking(struct kmem_cache *s, void *object)
+void slab_print_tracking(struct kmem_cache *s, void *object)
{
unsigned long pr_time = jiffies;
if (!(s->flags & SLAB_STORE_USER))
@@ -698,7 +698,7 @@ static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
unsigned int off; /* Offset of last byte */
u8 *addr = page_address(page);

- print_tracking(s, p);
+ slab_print_tracking(s, p);

print_page_info(page);

@@ -3858,7 +3858,7 @@ static void list_slab_objects(struct kmem_cache *s, struct page *page,

if (!test_bit(slab_index(p, s, addr), map)) {
pr_err("INFO: Object 0x%p @offset=%tu\n", p, p - addr);
- print_tracking(s, p);
+ slab_print_tracking(s, p);
}
}
slab_unlock(page);

2020-06-05 17:53:54

by Kees Cook

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Fri, Jun 05, 2020 at 06:37:55PM +0200, Vegard Nossum wrote:
> On 2020-06-05 17:44, Kees Cook wrote:
> > On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> > > That's it :-) This fixes it for me:
> > >
> > > diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
> > > index 2566e2d4c7803..b76bbab917941 100644
> > > --- a/drivers/acpi/acpica/nsaccess.c
> > > +++ b/drivers/acpi/acpica/nsaccess.c
> > > @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
> > > * predefined names are at the root level. It is much easier
> > > to
> > > * just create and link the new node(s) here.
> > > */
> > > - new_node =
> > > - ACPI_ALLOCATE_ZEROED(sizeof(struct
> > > acpi_namespace_node));
> > > + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
> > > init_val->name));
> > > if (!new_node) {
> > > status = AE_NO_MEMORY;
> > > goto unlock_and_exit;
> > > }
> > >
> > > - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
> > > new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
> > > new_node->type = init_val->type;
> >
> > I'm a bit confused by the internals of acpi_ns_create_note(). It can still
> > end up calling ACPI_ALLOCATE_ZEROED() via acpi_os_acquire_object(). Is
> > this fix correct?
> >
>
> include/acpi/platform/aclinuxex.h:static inline void
> *acpi_os_acquire_object(acpi_cache_t * cache)
> include/acpi/platform/aclinuxex.h-{
> include/acpi/platform/aclinuxex.h- return kmem_cache_zalloc(cache,
> include/acpi/platform/aclinuxex.h- irqs_disabled()? GFP_ATOMIC :
> GFP_KERNEL);
> include/acpi/platform/aclinuxex.h-}
>
> No comment.

ah! Thanks. I was looking at drivers/acpi/acpica/utcache.c

--
Kees Cook

2020-06-05 18:49:50

by Kees Cook

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Fri, Jun 05, 2020 at 06:55:27PM +0200, Vlastimil Babka wrote:
>
> On 6/5/20 5:44 PM, Kees Cook wrote:
> > On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> >> On 2020-06-05 16:08, Vlastimil Babka wrote:
> >> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> >> > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
> >> > > >
> >> > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> >> > > > >
> >> > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> >> > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
> >> > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> >> > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> >> > > > > > further in history. So it's not new at all, and likely very specific to
> >> > > > > > your config+QEMU? (and related to the ACPI error messages that precede
> >> > > > > > it?).
> >> > [...]
> >> > [ 0.140408] ------------[ cut here ]------------
> >> > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> >> > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> >
> > Ah yes! Good. I had improved this check recently too, and I was worried
> > the freelist pointer patch was somehow blocking it, but I see now that
> > the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
> > SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
> > tripping. Whew.
> >
> > I wonder if that entire test block should just be removed from
> > cache_from_obj():
> >
> > if (!memcg_kmem_enabled() &&
> > !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
> > !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
> > return s;
> >
> > and make this test unconditional? It's mostly only called during free(),
> > and shouldn't be too expensive to be made unconditional. Hmm.
>
> Hmm I have a different idea. The whole cache_from_obj() was added because of
> kmemcg (commit b9ce5ef49f00d) where per-memcg cache can be different from the
> root one. And I just realized this usecase can go away with Roman's series [1].
> But cache_from_obj() also kept the original SLUB consistency check case, and you
> added the freelist hardening case. If kmemcg use case went away it would be nice
> to avoid the virt_to_cache() and check completely again, unless in debugging or
> hardened kernel.

Is it that expensive? (I'm fine with it staying behind debug/hardening,
but if we can make it on by default, that'd be safer.)

> Furthermore, the original SLUB debugging case was an unconditional pr_err() plus
> WARN_ON_ONCE(1), which was kept by commit b9ce5ef49f00d. With freelist
> hardening this all changed to WARN_ONCE. So the second and later cases are not
> reported at all for hardening and also not for explicitly enabled debugging like
> in this case, which is IMHO not ideal.

Oh, I have no problem with WARN vs WARN_ONCE -- there's no reason to
split this. And I'd love the hardening side to gain the tracking call
too, if it's available.

I had just used WARN_ONCE() since sometimes it can be very noisy to keep
warning for some condition that might not be correctable.

> So I propose the following - the freelist hardening case keeps the WARN_ONCE,
> but also a one-line pr_err() for each case so they are not silent. The SLUB
> debugging case is always a full warning, and printing the tracking info if
> enabled and available. Pure kmemcg case does virt_to_cache() for now (until
> hopefully removed by Roman's series) but no checking at all. Would that work for
> everyone?
> [...]
> @@ -520,9 +528,18 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
> return s;
>
> cachep = virt_to_cache(x);
> - WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
> - "%s: Wrong slab cache. %s but object is from %s\n",
> - __func__, s->name, cachep->name);
> + if (unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) {
> + if (WARN(cachep && !slab_equal_or_root(cachep, s),
> + "%s: Wrong slab cache. %s but object is from %s\n",
> + __func__, s->name, cachep->name))
> + slab_print_tracking(cachep, x);
> + } else if (IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED)) {
> + if (unlikely(cachep && !slab_equal_or_root(cachep, s))) {
> + pr_err("%s: Wrong slab cache. %s but object is from %s\n",
> + __func__, s->name, cachep->name);
> + WARN_ON_ONCE(1);
> + }
> + }

How about just this (in addition to your slab_print_tracking() refactor):

diff --git a/mm/slab.h b/mm/slab.h
index 207c83ef6e06..107b7f6db3c3 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -520,9 +520,10 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
return s;

cachep = virt_to_cache(x);
- WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
+ if (WARN(cachep && !slab_equal_or_root(cachep, s),
"%s: Wrong slab cache. %s but object is from %s\n",
- __func__, s->name, cachep->name);
+ __func__, s->name, cachep->name))
+ slab_print_tracking(cachep, x);
return cachep;
}


--
Kees Cook

2020-06-05 21:47:46

by Kaneda, Erik

[permalink] [raw]
Subject: RE: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018



> -----Original Message-----
> From: Vegard Nossum <[email protected]>
> Sent: Friday, June 5, 2020 7:45 AM
> To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
> <[email protected]>; Moore, Robert <[email protected]>; Kaneda,
> Erik <[email protected]>
> Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
> <[email protected]>; Christoph Lameter <[email protected]>; Andrew
> Morton <[email protected]>; Marco Elver <[email protected]>;
> Waiman Long <[email protected]>; LKML <linux-
> [email protected]>; Linux MM <[email protected]>; ACPI Devel
> Maling List <[email protected]>; Len Brown <[email protected]>;
> Steven Rostedt <[email protected]>
> Subject: Re: slub freelist issue / BUG: unable to handle page fault for
> address: 000000003ffe0018
>
> On 2020-06-05 16:08, Vlastimil Babka wrote:
> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> >> On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
> <[email protected]> wrote:
> >>>
> >>> On 2020-06-05 11:36, Vegard Nossum wrote:
> >>>>
> >>>> On 2020-06-05 11:11, Vlastimil Babka wrote:
> >>>>> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> >>>>>> On 6/4/20 7:57 PM, Kees Cook wrote:
> >>>>>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> >>>>>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
> >>>>>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I ran into a boot problem with latest linus/master
> >>>>>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests
> like this:
> >>>>>>>>>
> >>>>>>>>> Hi, what's the .config you use?
> >>>>>>>>
> >>>>>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB,
> >>>>>>>> ...)
> >>>>>>>
> >>>>>>> Oh yes indeed. I immediately crash in the same way with this config.
> >>>>>>> I'll
> >>>>>>> start digging...
> >>>>>>>
> >>>>>>> (defconfig finishes boot)
> >>>>>>
> >>>>>> This is funny, booting with slub_debug=F results in:
> >>>>>> I'm not sure if it's ACPI or ftrace wrong here, but looks like
> >>>>>> the changed free pointer offset merely exposes a bug in something
> >>>>>> else.
> >>>>>
> >>>>> So, with Kees' patch reverted, booting with slub_debug=F (or even
> >>>>> more specific slub_debug=F,ftrace_event_field) also hits this bug
> >>>>> below. I wanted to bisect it, but v5.7 was also bad, and also
> >>>>> v5.6. Didn't try further in history. So it's not new at all, and
> >>>>> likely very specific to your config+QEMU? (and related to the ACPI
> >>>>> error messages that precede it?).
> >>>>
> >>>> I see it too, but not on v5.0. I can bisect it.
> >>>
> >>> commit 67a72420a326b45514deb3f212085fb2cd1595b5
> >>> Author: Bob Moore <[email protected]>
> >>> Date: Fri Aug 16 14:43:21 2019 -0700
> >>>
> >>> ACPICA: Increase total number of possible Owner IDs
> >>>
> >>> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
> >>>
> >>> From 255 to 4095 possible IDs.
> >>>
> >>> Link: https://github.com/acpica/acpica/commit/1f1652da
> >>> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> >>> Signed-off-by: Bob Moore <[email protected]>
> >>> Signed-off-by: Erik Schmauss <[email protected]>
> >>> Signed-off-by: Rafael J. Wysocki <[email protected]>
> >>
> >> Bob, Erik, did we miss something in that patch?
> >
> > Maybe the patch just changes layout in a way that exposes the bug.
> >
> > Anyway the "ftrace_event_field" cache is not really involved, this is
> > just because of slab merging. After adding "slub_nomerge" to
> > "slub_debug=F", it starts making more sense, as the cache becomes
> > Acpi-Namespace
> >
> > [ 0.140408] ------------[ cut here ]------------
> > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> object is from kmalloc-64
> > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> kmem_cache_free+0x1d3/0x250
> > [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de
> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> > [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
> 0000000000000000
> > [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> ffffffffacfdd32c
> > [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
> 00000000000000fc
> > [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
> ffff937307409e00
> > [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
> 0000000000000000
> > [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000)
> knlGS:0000000000000000
> > [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
> 00000000003406f0
> > [ 0.142393] Call Trace:
> > [ 0.142393] acpi_os_release_object+0x5/0x10
> > [ 0.142393] acpi_ns_delete_children+0x46/0x59
> > [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.142393] acpi_ns_terminate+0xc/0x31
> > [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.142393] acpi_terminate+0x5/0xf
> > [ 0.142393] acpi_init+0x27b/0x308
> > [ 0.142393] ? video_setup+0x79/0x79
> > [ 0.142393] do_one_initcall+0x7b/0x160
> > [ 0.142393] kernel_init_freeable+0x190/0x1f2
> > [ 0.142393] ? rest_init+0x9a/0x9a
> > [ 0.142393] kernel_init+0x5/0xf6
> > [ 0.142393] ret_from_fork+0x22/0x30
> > [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> > [ 0.142396] ------------[ cut here ]------------
> >
> > I've also changed the warning so it's not printed just once, and also
> > prints tracking info (see the hunk at the end of my mail, I'll turn this to a
> proper patch later).
> >
> > With "slub_debug=FU slub_nomerge" there are now multiple warnings,
> but they all look the same:
> >
> > [ 0.143815] ------------[ cut here ]------------
> > [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> object is from kmalloc-64
> > [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> kmem_cache_free+0x1d3/0x250
> > [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de
> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> > [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
> 0000000000000000
> > [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> ffffffff8e1dd32c
> > [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
> 00000000000000fc
> > [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
> ffff972dc74ada80
> > [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
> 0000000000000000
> > [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000)
> knlGS:0000000000000000
> > [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
> 00000000003406f0
> > [ 0.145129] Call Trace:
> > [ 0.145129] acpi_os_release_object+0x5/0x10
> > [ 0.145129] acpi_ns_delete_children+0x46/0x59
> > [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.145129] acpi_ns_terminate+0xc/0x31
> > [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.145129] acpi_terminate+0x5/0xf
> > [ 0.145129] acpi_init+0x27b/0x308
> > [ 0.145129] ? video_setup+0x79/0x79
> > [ 0.145129] do_one_initcall+0x7b/0x160
> > [ 0.145129] kernel_init_freeable+0x190/0x1f2
> > [ 0.145129] ? rest_init+0x9a/0x9a
> > [ 0.145129] kernel_init+0x5/0xf6
> > [ 0.145129] ret_from_fork+0x22/0x30
> > [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> > [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58
> cpu=0 pid=0
> > [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> > [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> > [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> > [ 0.147024] acpi_early_init+0x5d/0xd1
> > [ 0.147132] start_kernel+0x45b/0x518
> > [ 0.147491] secondary_startup_64+0xb6/0xc0
> > [ 0.147897] ------------[ cut here ]------------
> >
> > And it seems ACPI is allocating an object via kmalloc() and then
> > freeing it via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which
> is wrong.
> >
> >> ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > acpi_ns_root_initialize+0xb6/0x2d1:
> > kmalloc at include/linux/slab.h:555
> > (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
> > acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > (inlined by) acpi_ns_root_initialize at
> > drivers/acpi/acpica/nsaccess.c:102
> >
>
Hi,

> That's it :-) This fixes it for me:

I've tried this and it's working on my end.

Bob, any thoughts?

Erik

>
> diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
> index 2566e2d4c7803..b76bbab917941 100644
> --- a/drivers/acpi/acpica/nsaccess.c
> +++ b/drivers/acpi/acpica/nsaccess.c
> @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
> * predefined names are at the root level. It is much easier to
> * just create and link the new node(s) here.
> */
> - new_node =
> - ACPI_ALLOCATE_ZEROED(sizeof(struct
> acpi_namespace_node));
> + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
> init_val->name));
> if (!new_node) {
> status = AE_NO_MEMORY;
> goto unlock_and_exit;
> }
>
> - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
> new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
> new_node->type = init_val->type;
>
>
> Vegard

2020-06-06 06:50:01

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Fri, Jun 5, 2020 at 5:44 PM Kees Cook <[email protected]> wrote:
>
> On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> > On 2020-06-05 16:08, Vlastimil Babka wrote:
> > > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> > > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum <[email protected]> wrote:
> > > > >
> > > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> > > > > >
> > > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> > > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
> > > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> > > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> > > > > > > further in history. So it's not new at all, and likely very specific to
> > > > > > > your config+QEMU? (and related to the ACPI error messages that precede
> > > > > > > it?).
> > > [...]
> > > [ 0.140408] ------------[ cut here ]------------
> > > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> > > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
>
> Ah yes! Good. I had improved this check recently too, and I was worried
> the freelist pointer patch was somehow blocking it, but I see now that
> the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
> SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
> tripping. Whew.
>
> I wonder if that entire test block should just be removed from
> cache_from_obj():
>
> if (!memcg_kmem_enabled() &&
> !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
> !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
> return s;
>
> and make this test unconditional? It's mostly only called during free(),
> and shouldn't be too expensive to be made unconditional. Hmm.
>
> > > And it seems ACPI is allocating an object via kmalloc() and then freeing it
> > > via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which is wrong.
> > >
> > > > ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > > acpi_ns_root_initialize+0xb6/0x2d1:
> > > kmalloc at include/linux/slab.h:555
> > > (inlined by) kzalloc at include/linux/slab.h:669
> > > (inlined by) acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > > (inlined by) acpi_ns_root_initialize at drivers/acpi/acpica/nsaccess.c:102
> > >
> >
> > That's it :-) This fixes it for me:
> >
> > diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
> > index 2566e2d4c7803..b76bbab917941 100644
> > --- a/drivers/acpi/acpica/nsaccess.c
> > +++ b/drivers/acpi/acpica/nsaccess.c
> > @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
> > * predefined names are at the root level. It is much easier
> > to
> > * just create and link the new node(s) here.
> > */
> > - new_node =
> > - ACPI_ALLOCATE_ZEROED(sizeof(struct
> > acpi_namespace_node));
> > + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
> > init_val->name));
> > if (!new_node) {
> > status = AE_NO_MEMORY;
> > goto unlock_and_exit;
> > }
> >
> > - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
> > new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
> > new_node->type = init_val->type;
>
> I'm a bit confused by the internals of acpi_ns_create_note(). It can still
> end up calling ACPI_ALLOCATE_ZEROED() via acpi_os_acquire_object().

No, it can't.

Linux uses the one from include/acpi/platform/aclinuxex.h

> Is this fix correct?

Yes, it is, AFAICS.

Cheers!

2020-06-08 10:55:58

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/5/20 8:46 PM, Kees Cook wrote:
>>
>> Hmm I have a different idea. The whole cache_from_obj() was added because of
>> kmemcg (commit b9ce5ef49f00d) where per-memcg cache can be different from the
>> root one. And I just realized this usecase can go away with Roman's series [1].
>> But cache_from_obj() also kept the original SLUB consistency check case, and you
>> added the freelist hardening case. If kmemcg use case went away it would be nice
>> to avoid the virt_to_cache() and check completely again, unless in debugging or
>> hardened kernel.
>
> Is it that expensive? (I'm fine with it staying behind debug/hardening,
> but if we can make it on by default, that'd be safer.)

Well, it's fast path and e.g. networking guys did a lot of work to optimize
SLUB. If we decide to stop trusting the supplied cache pointer completely, we
can deprecate kmem_cache_free() and use kfree() everywhere (SLOB would need some
adjustments to store size with each object like for kmalloc) but it would have
to be a conscious decision.

>> Furthermore, the original SLUB debugging case was an unconditional pr_err() plus
>> WARN_ON_ONCE(1), which was kept by commit b9ce5ef49f00d. With freelist
>> hardening this all changed to WARN_ONCE. So the second and later cases are not
>> reported at all for hardening and also not for explicitly enabled debugging like
>> in this case, which is IMHO not ideal.
>
> Oh, I have no problem with WARN vs WARN_ONCE -- there's no reason to
> split this. And I'd love the hardening side to gain the tracking call
> too, if it's available.
>
> I had just used WARN_ONCE() since sometimes it can be very noisy to keep
> warning for some condition that might not be correctable.

OK.

>> So I propose the following - the freelist hardening case keeps the WARN_ONCE,
>> but also a one-line pr_err() for each case so they are not silent. The SLUB
>> debugging case is always a full warning, and printing the tracking info if
>> enabled and available. Pure kmemcg case does virt_to_cache() for now (until
>> hopefully removed by Roman's series) but no checking at all. Would that work for
>> everyone?
>> [...]
>> @@ -520,9 +528,18 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
>> return s;
>>
>> cachep = virt_to_cache(x);
>> - WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
>> - "%s: Wrong slab cache. %s but object is from %s\n",
>> - __func__, s->name, cachep->name);
>> + if (unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) {
>> + if (WARN(cachep && !slab_equal_or_root(cachep, s),
>> + "%s: Wrong slab cache. %s but object is from %s\n",
>> + __func__, s->name, cachep->name))
>> + slab_print_tracking(cachep, x);
>> + } else if (IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED)) {
>> + if (unlikely(cachep && !slab_equal_or_root(cachep, s))) {
>> + pr_err("%s: Wrong slab cache. %s but object is from %s\n",
>> + __func__, s->name, cachep->name);
>> + WARN_ON_ONCE(1);
>> + }
>> + }
>
> How about just this (in addition to your slab_print_tracking() refactor):

That could work, I will send a proper patch.

> diff --git a/mm/slab.h b/mm/slab.h
> index 207c83ef6e06..107b7f6db3c3 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -520,9 +520,10 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
> return s;
>
> cachep = virt_to_cache(x);
> - WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
> + if (WARN(cachep && !slab_equal_or_root(cachep, s),
> "%s: Wrong slab cache. %s but object is from %s\n",
> - __func__, s->name, cachep->name);
> + __func__, s->name, cachep->name))
> + slab_print_tracking(cachep, x);
> return cachep;
> }
>
>

2020-06-11 01:44:41

by Kaneda, Erik

[permalink] [raw]
Subject: RE: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018



> -----Original Message-----
> From: Vegard Nossum <[email protected]>
> Sent: Friday, June 5, 2020 7:45 AM
> To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
> <[email protected]>; Moore, Robert <[email protected]>; Kaneda,
> Erik <[email protected]>
> Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
> <[email protected]>; Christoph Lameter <[email protected]>; Andrew
> Morton <[email protected]>; Marco Elver <[email protected]>;
> Waiman Long <[email protected]>; LKML <linux-
> [email protected]>; Linux MM <[email protected]>; ACPI Devel
> Maling List <[email protected]>; Len Brown <[email protected]>;
> Steven Rostedt <[email protected]>
> Subject: Re: slub freelist issue / BUG: unable to handle page fault for
> address: 000000003ffe0018
>
> On 2020-06-05 16:08, Vlastimil Babka wrote:
> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> >> On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
> <[email protected]> wrote:
> >>>
> >>> On 2020-06-05 11:36, Vegard Nossum wrote:
> >>>>
> >>>> On 2020-06-05 11:11, Vlastimil Babka wrote:
> >>>>> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> >>>>>> On 6/4/20 7:57 PM, Kees Cook wrote:
> >>>>>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> >>>>>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
> >>>>>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I ran into a boot problem with latest linus/master
> >>>>>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests
> like this:
> >>>>>>>>>
> >>>>>>>>> Hi, what's the .config you use?
> >>>>>>>>
> >>>>>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB,
> >>>>>>>> ...)
> >>>>>>>
> >>>>>>> Oh yes indeed. I immediately crash in the same way with this config.
> >>>>>>> I'll
> >>>>>>> start digging...
> >>>>>>>
> >>>>>>> (defconfig finishes boot)
> >>>>>>
> >>>>>> This is funny, booting with slub_debug=F results in:
> >>>>>> I'm not sure if it's ACPI or ftrace wrong here, but looks like
> >>>>>> the changed free pointer offset merely exposes a bug in something
> >>>>>> else.
> >>>>>
> >>>>> So, with Kees' patch reverted, booting with slub_debug=F (or even
> >>>>> more specific slub_debug=F,ftrace_event_field) also hits this bug
> >>>>> below. I wanted to bisect it, but v5.7 was also bad, and also
> >>>>> v5.6. Didn't try further in history. So it's not new at all, and
> >>>>> likely very specific to your config+QEMU? (and related to the ACPI
> >>>>> error messages that precede it?).
> >>>>
> >>>> I see it too, but not on v5.0. I can bisect it.
> >>>
> >>> commit 67a72420a326b45514deb3f212085fb2cd1595b5
> >>> Author: Bob Moore <[email protected]>
> >>> Date: Fri Aug 16 14:43:21 2019 -0700
> >>>
> >>> ACPICA: Increase total number of possible Owner IDs
> >>>
> >>> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
> >>>
> >>> From 255 to 4095 possible IDs.
> >>>
> >>> Link: https://github.com/acpica/acpica/commit/1f1652da
> >>> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> >>> Signed-off-by: Bob Moore <[email protected]>
> >>> Signed-off-by: Erik Schmauss <[email protected]>
> >>> Signed-off-by: Rafael J. Wysocki <[email protected]>
> >>
> >> Bob, Erik, did we miss something in that patch?
> >
> > Maybe the patch just changes layout in a way that exposes the bug.
> >
> > Anyway the "ftrace_event_field" cache is not really involved, this is
> > just because of slab merging. After adding "slub_nomerge" to
> > "slub_debug=F", it starts making more sense, as the cache becomes
> > Acpi-Namespace
> >
> > [ 0.140408] ------------[ cut here ]------------
> > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> object is from kmalloc-64
> > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> kmem_cache_free+0x1d3/0x250
> > [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de
> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> > [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
> 0000000000000000
> > [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> ffffffffacfdd32c
> > [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
> 00000000000000fc
> > [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
> ffff937307409e00
> > [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
> 0000000000000000
> > [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000)
> knlGS:0000000000000000
> > [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
> 00000000003406f0
> > [ 0.142393] Call Trace:
> > [ 0.142393] acpi_os_release_object+0x5/0x10
> > [ 0.142393] acpi_ns_delete_children+0x46/0x59
> > [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.142393] acpi_ns_terminate+0xc/0x31
> > [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.142393] acpi_terminate+0x5/0xf
> > [ 0.142393] acpi_init+0x27b/0x308
> > [ 0.142393] ? video_setup+0x79/0x79
> > [ 0.142393] do_one_initcall+0x7b/0x160
> > [ 0.142393] kernel_init_freeable+0x190/0x1f2
> > [ 0.142393] ? rest_init+0x9a/0x9a
> > [ 0.142393] kernel_init+0x5/0xf6
> > [ 0.142393] ret_from_fork+0x22/0x30
> > [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> > [ 0.142396] ------------[ cut here ]------------
> >
> > I've also changed the warning so it's not printed just once, and also
> > prints tracking info (see the hunk at the end of my mail, I'll turn this to a
> proper patch later).
> >
> > With "slub_debug=FU slub_nomerge" there are now multiple warnings,
> but they all look the same:
> >
> > [ 0.143815] ------------[ cut here ]------------
> > [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> object is from kmalloc-64
> > [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> kmem_cache_free+0x1d3/0x250
> > [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de
> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> > [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
> 0000000000000000
> > [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> ffffffff8e1dd32c
> > [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
> 00000000000000fc
> > [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
> ffff972dc74ada80
> > [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
> 0000000000000000
> > [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000)
> knlGS:0000000000000000
> > [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
> 00000000003406f0
> > [ 0.145129] Call Trace:
> > [ 0.145129] acpi_os_release_object+0x5/0x10
> > [ 0.145129] acpi_ns_delete_children+0x46/0x59
> > [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.145129] acpi_ns_terminate+0xc/0x31
> > [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > [ 0.145129] acpi_terminate+0x5/0xf
> > [ 0.145129] acpi_init+0x27b/0x308
> > [ 0.145129] ? video_setup+0x79/0x79
> > [ 0.145129] do_one_initcall+0x7b/0x160
> > [ 0.145129] kernel_init_freeable+0x190/0x1f2
> > [ 0.145129] ? rest_init+0x9a/0x9a
> > [ 0.145129] kernel_init+0x5/0xf6
> > [ 0.145129] ret_from_fork+0x22/0x30
> > [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> > [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58
> cpu=0 pid=0
> > [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> > [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> > [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> > [ 0.147024] acpi_early_init+0x5d/0xd1
> > [ 0.147132] start_kernel+0x45b/0x518
> > [ 0.147491] secondary_startup_64+0xb6/0xc0
> > [ 0.147897] ------------[ cut here ]------------
> >
> > And it seems ACPI is allocating an object via kmalloc() and then
> > freeing it via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which
> is wrong.
> >
> >> ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > acpi_ns_root_initialize+0xb6/0x2d1:
> > kmalloc at include/linux/slab.h:555
> > (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
> > acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > (inlined by) acpi_ns_root_initialize at
> > drivers/acpi/acpica/nsaccess.c:102
> >
>
Hi Vegard,

> That's it :-) This fixes it for me:

We'll take this patch for ACPICA and it will be in the next release.

Rafael, do you want to take this as a part of the next rc? Or should we wait for the next merge window?

Thanks,
Erik
>
> diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
> index 2566e2d4c7803..b76bbab917941 100644
> --- a/drivers/acpi/acpica/nsaccess.c
> +++ b/drivers/acpi/acpica/nsaccess.c
> @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
> * predefined names are at the root level. It is much easier to
> * just create and link the new node(s) here.
> */
> - new_node =
> - ACPI_ALLOCATE_ZEROED(sizeof(struct
> acpi_namespace_node));
> + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
> init_val->name));
> if (!new_node) {
> status = AE_NO_MEMORY;
> goto unlock_and_exit;
> }
>
> - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
> new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
> new_node->type = init_val->type;
>
>
> Vegard

2020-06-11 11:33:45

by Vlastimil Babka

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/11/20 3:40 AM, Kaneda, Erik wrote:
> We'll take this patch for ACPICA and it will be in the next release.
>
> Rafael, do you want to take this as a part of the next rc? Or should we wait for the next merge window?

IMHO this should rather be fixed in 5.8 with CC stable, not next merge window.

> Thanks,
> Erik
>>
>> diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
>> index 2566e2d4c7803..b76bbab917941 100644
>> --- a/drivers/acpi/acpica/nsaccess.c
>> +++ b/drivers/acpi/acpica/nsaccess.c
>> @@ -98,14 +98,12 @@ acpi_status acpi_ns_root_initialize(void)
>> * predefined names are at the root level. It is much easier to
>> * just create and link the new node(s) here.
>> */
>> - new_node =
>> - ACPI_ALLOCATE_ZEROED(sizeof(struct
>> acpi_namespace_node));
>> + new_node = acpi_ns_create_node(*ACPI_CAST_PTR (u32,
>> init_val->name));
>> if (!new_node) {
>> status = AE_NO_MEMORY;
>> goto unlock_and_exit;
>> }
>>
>> - ACPI_COPY_NAMESEG(new_node->name.ascii, init_val->name);
>> new_node->descriptor_type = ACPI_DESC_TYPE_NAMED;
>> new_node->type = init_val->type;
>>
>>
>> Vegard
>

2020-06-12 12:31:50

by Wysocki, Rafael J

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On 6/11/2020 3:40 AM, Kaneda, Erik wrote:
>
>> -----Original Message-----
>> From: Vegard Nossum <[email protected]>
>> Sent: Friday, June 5, 2020 7:45 AM
>> To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
>> <[email protected]>; Moore, Robert <[email protected]>; Kaneda,
>> Erik <[email protected]>
>> Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
>> <[email protected]>; Christoph Lameter <[email protected]>; Andrew
>> Morton <[email protected]>; Marco Elver <[email protected]>;
>> Waiman Long <[email protected]>; LKML <linux-
>> [email protected]>; Linux MM <[email protected]>; ACPI Devel
>> Maling List <[email protected]>; Len Brown <[email protected]>;
>> Steven Rostedt <[email protected]>
>> Subject: Re: slub freelist issue / BUG: unable to handle page fault for
>> address: 000000003ffe0018
>>
>> On 2020-06-05 16:08, Vlastimil Babka wrote:
>>> On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
>>>> On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
>> <[email protected]> wrote:
>>>>> On 2020-06-05 11:36, Vegard Nossum wrote:
>>>>>> On 2020-06-05 11:11, Vlastimil Babka wrote:
>>>>>>> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
>>>>>>>> On 6/4/20 7:57 PM, Kees Cook wrote:
>>>>>>>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>>>>>>>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>>>>>>>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I ran into a boot problem with latest linus/master
>>>>>>>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests
>> like this:
>>>>>>>>>>> Hi, what's the .config you use?
>>>>>>>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB,
>>>>>>>>>> ...)
>>>>>>>>> Oh yes indeed. I immediately crash in the same way with this config.
>>>>>>>>> I'll
>>>>>>>>> start digging...
>>>>>>>>>
>>>>>>>>> (defconfig finishes boot)
>>>>>>>> This is funny, booting with slub_debug=F results in:
>>>>>>>> I'm not sure if it's ACPI or ftrace wrong here, but looks like
>>>>>>>> the changed free pointer offset merely exposes a bug in something
>>>>>>>> else.
>>>>>>> So, with Kees' patch reverted, booting with slub_debug=F (or even
>>>>>>> more specific slub_debug=F,ftrace_event_field) also hits this bug
>>>>>>> below. I wanted to bisect it, but v5.7 was also bad, and also
>>>>>>> v5.6. Didn't try further in history. So it's not new at all, and
>>>>>>> likely very specific to your config+QEMU? (and related to the ACPI
>>>>>>> error messages that precede it?).
>>>>>> I see it too, but not on v5.0. I can bisect it.
>>>>> commit 67a72420a326b45514deb3f212085fb2cd1595b5
>>>>> Author: Bob Moore <[email protected]>
>>>>> Date: Fri Aug 16 14:43:21 2019 -0700
>>>>>
>>>>> ACPICA: Increase total number of possible Owner IDs
>>>>>
>>>>> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
>>>>>
>>>>> From 255 to 4095 possible IDs.
>>>>>
>>>>> Link: https://github.com/acpica/acpica/commit/1f1652da
>>>>> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
>>>>> Signed-off-by: Bob Moore <[email protected]>
>>>>> Signed-off-by: Erik Schmauss <[email protected]>
>>>>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>>>> Bob, Erik, did we miss something in that patch?
>>> Maybe the patch just changes layout in a way that exposes the bug.
>>>
>>> Anyway the "ftrace_event_field" cache is not really involved, this is
>>> just because of slab merging. After adding "slub_nomerge" to
>>> "slub_debug=F", it starts making more sense, as the cache becomes
>>> Acpi-Namespace
>>>
>>> [ 0.140408] ------------[ cut here ]------------
>>> [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but
>> object is from kmalloc-64
>>> [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
>> kmem_cache_free+0x1d3/0x250
>>> [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
>>> [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
>>> [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
>>> [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
>> 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de
>> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
>>> [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
>>> [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
>> 0000000000000000
>>> [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
>> ffffffffacfdd32c
>>> [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
>> 00000000000000fc
>>> [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
>> ffff937307409e00
>>> [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
>> 0000000000000000
>>> [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000)
>> knlGS:0000000000000000
>>> [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
>> 00000000003406f0
>>> [ 0.142393] Call Trace:
>>> [ 0.142393] acpi_os_release_object+0x5/0x10
>>> [ 0.142393] acpi_ns_delete_children+0x46/0x59
>>> [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
>>> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
>>> [ 0.142393] acpi_ns_terminate+0xc/0x31
>>> [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
>>> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
>>> [ 0.142393] acpi_terminate+0x5/0xf
>>> [ 0.142393] acpi_init+0x27b/0x308
>>> [ 0.142393] ? video_setup+0x79/0x79
>>> [ 0.142393] do_one_initcall+0x7b/0x160
>>> [ 0.142393] kernel_init_freeable+0x190/0x1f2
>>> [ 0.142393] ? rest_init+0x9a/0x9a
>>> [ 0.142393] kernel_init+0x5/0xf6
>>> [ 0.142393] ret_from_fork+0x22/0x30
>>> [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
>>> [ 0.142396] ------------[ cut here ]------------
>>>
>>> I've also changed the warning so it's not printed just once, and also
>>> prints tracking info (see the hunk at the end of my mail, I'll turn this to a
>> proper patch later).
>>> With "slub_debug=FU slub_nomerge" there are now multiple warnings,
>> but they all look the same:
>>> [ 0.143815] ------------[ cut here ]------------
>>> [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but
>> object is from kmalloc-64
>>> [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
>> kmem_cache_free+0x1d3/0x250
>>> [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
>>> [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
>>> [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
>>> [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
>> 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de
>> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
>>> [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
>>> [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
>> 0000000000000000
>>> [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
>> ffffffff8e1dd32c
>>> [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
>> 00000000000000fc
>>> [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
>> ffff972dc74ada80
>>> [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
>> 0000000000000000
>>> [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000)
>> knlGS:0000000000000000
>>> [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
>> 00000000003406f0
>>> [ 0.145129] Call Trace:
>>> [ 0.145129] acpi_os_release_object+0x5/0x10
>>> [ 0.145129] acpi_ns_delete_children+0x46/0x59
>>> [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
>>> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
>>> [ 0.145129] acpi_ns_terminate+0xc/0x31
>>> [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
>>> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
>>> [ 0.145129] acpi_terminate+0x5/0xf
>>> [ 0.145129] acpi_init+0x27b/0x308
>>> [ 0.145129] ? video_setup+0x79/0x79
>>> [ 0.145129] do_one_initcall+0x7b/0x160
>>> [ 0.145129] kernel_init_freeable+0x190/0x1f2
>>> [ 0.145129] ? rest_init+0x9a/0x9a
>>> [ 0.145129] kernel_init+0x5/0xf6
>>> [ 0.145129] ret_from_fork+0x22/0x30
>>> [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
>>> [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58
>> cpu=0 pid=0
>>> [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
>>> [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
>>> [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
>>> [ 0.147024] acpi_early_init+0x5d/0xd1
>>> [ 0.147132] start_kernel+0x45b/0x518
>>> [ 0.147491] secondary_startup_64+0xb6/0xc0
>>> [ 0.147897] ------------[ cut here ]------------
>>>
>>> And it seems ACPI is allocating an object via kmalloc() and then
>>> freeing it via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which
>> is wrong.
>>>> ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
>>> acpi_ns_root_initialize+0xb6/0x2d1:
>>> kmalloc at include/linux/slab.h:555
>>> (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
>>> acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
>>> (inlined by) acpi_ns_root_initialize at
>>> drivers/acpi/acpica/nsaccess.c:102
>>>
> Hi Vegard,
>
>> That's it :-) This fixes it for me:
> We'll take this patch for ACPICA and it will be in the next release.
>
> Rafael, do you want to take this as a part of the next rc?

Yes, I do.

Cheers!


2021-03-23 18:34:23

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Fri, Jun 12, 2020 at 02:26:58PM +0200, Rafael J. Wysocki wrote:
> On 6/11/2020 3:40 AM, Kaneda, Erik wrote:
> >
> > > -----Original Message-----
> > > From: Vegard Nossum <[email protected]>
> > > Sent: Friday, June 5, 2020 7:45 AM
> > > To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
> > > <[email protected]>; Moore, Robert <[email protected]>; Kaneda,
> > > Erik <[email protected]>
> > > Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
> > > <[email protected]>; Christoph Lameter <[email protected]>; Andrew
> > > Morton <[email protected]>; Marco Elver <[email protected]>;
> > > Waiman Long <[email protected]>; LKML <linux-
> > > [email protected]>; Linux MM <[email protected]>; ACPI Devel
> > > Maling List <[email protected]>; Len Brown <[email protected]>;
> > > Steven Rostedt <[email protected]>
> > > Subject: Re: slub freelist issue / BUG: unable to handle page fault for
> > > address: 000000003ffe0018
> > >
> > > On 2020-06-05 16:08, Vlastimil Babka wrote:
> > > > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> > > > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
> > > <[email protected]> wrote:
> > > > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> > > > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> > > > > > > > On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> > > > > > > > > On 6/4/20 7:57 PM, Kees Cook wrote:
> > > > > > > > > > On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> > > > > > > > > > > On 2020-06-04 19:18, Vlastimil Babka wrote:
> > > > > > > > > > > > On 6/4/20 7:14 PM, Vegard Nossum wrote:
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I ran into a boot problem with latest linus/master
> > > > > > > > > > > > > (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests
> > > like this:
> > > > > > > > > > > > Hi, what's the .config you use?
> > > > > > > > > > > Pretty much x86_64 defconfig minus a few options (PCI, USB,
> > > > > > > > > > > ...)
> > > > > > > > > > Oh yes indeed. I immediately crash in the same way with this config.
> > > > > > > > > > I'll
> > > > > > > > > > start digging...
> > > > > > > > > >
> > > > > > > > > > (defconfig finishes boot)
> > > > > > > > > This is funny, booting with slub_debug=F results in:
> > > > > > > > > I'm not sure if it's ACPI or ftrace wrong here, but looks like
> > > > > > > > > the changed free pointer offset merely exposes a bug in something
> > > > > > > > > else.
> > > > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even
> > > > > > > > more specific slub_debug=F,ftrace_event_field) also hits this bug
> > > > > > > > below. I wanted to bisect it, but v5.7 was also bad, and also
> > > > > > > > v5.6. Didn't try further in history. So it's not new at all, and
> > > > > > > > likely very specific to your config+QEMU? (and related to the ACPI
> > > > > > > > error messages that precede it?).
> > > > > > > I see it too, but not on v5.0. I can bisect it.
> > > > > > commit 67a72420a326b45514deb3f212085fb2cd1595b5
> > > > > > Author: Bob Moore <[email protected]>
> > > > > > Date: Fri Aug 16 14:43:21 2019 -0700
> > > > > >
> > > > > > ACPICA: Increase total number of possible Owner IDs
> > > > > >
> > > > > > ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
> > > > > >
> > > > > > From 255 to 4095 possible IDs.
> > > > > >
> > > > > > Link: https://github.com/acpica/acpica/commit/1f1652da
> > > > > > Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> > > > > > Signed-off-by: Bob Moore <[email protected]>
> > > > > > Signed-off-by: Erik Schmauss <[email protected]>
> > > > > > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > > > > Bob, Erik, did we miss something in that patch?
> > > > Maybe the patch just changes layout in a way that exposes the bug.
> > > >
> > > > Anyway the "ftrace_event_field" cache is not really involved, this is
> > > > just because of slab merging. After adding "slub_nomerge" to
> > > > "slub_debug=F", it starts making more sense, as the cache becomes
> > > > Acpi-Namespace
> > > >
> > > > [ 0.140408] ------------[ cut here ]------------
> > > > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> > > object is from kmalloc-64
> > > > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> > > kmem_cache_free+0x1d3/0x250
> > > > [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > > > [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > > > [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > > > [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> > > 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de
> > > 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > > > [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> > > > [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
> > > 0000000000000000
> > > > [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> > > ffffffffacfdd32c
> > > > [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
> > > 00000000000000fc
> > > > [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
> > > ffff937307409e00
> > > > [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > > [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000)
> > > knlGS:0000000000000000
> > > > [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
> > > 00000000003406f0
> > > > [ 0.142393] Call Trace:
> > > > [ 0.142393] acpi_os_release_object+0x5/0x10
> > > > [ 0.142393] acpi_ns_delete_children+0x46/0x59
> > > > [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > > > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > [ 0.142393] acpi_ns_terminate+0xc/0x31
> > > > [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> > > > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > [ 0.142393] acpi_terminate+0x5/0xf
> > > > [ 0.142393] acpi_init+0x27b/0x308
> > > > [ 0.142393] ? video_setup+0x79/0x79
> > > > [ 0.142393] do_one_initcall+0x7b/0x160
> > > > [ 0.142393] kernel_init_freeable+0x190/0x1f2
> > > > [ 0.142393] ? rest_init+0x9a/0x9a
> > > > [ 0.142393] kernel_init+0x5/0xf6
> > > > [ 0.142393] ret_from_fork+0x22/0x30
> > > > [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> > > > [ 0.142396] ------------[ cut here ]------------
> > > >
> > > > I've also changed the warning so it's not printed just once, and also
> > > > prints tracking info (see the hunk at the end of my mail, I'll turn this to a
> > > proper patch later).
> > > > With "slub_debug=FU slub_nomerge" there are now multiple warnings,
> > > but they all look the same:
> > > > [ 0.143815] ------------[ cut here ]------------
> > > > [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> > > object is from kmalloc-64
> > > > [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> > > kmem_cache_free+0x1d3/0x250
> > > > [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > > > [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > > > [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > > > [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> > > 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de
> > > 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > > > [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> > > > [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
> > > 0000000000000000
> > > > [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> > > ffffffff8e1dd32c
> > > > [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
> > > 00000000000000fc
> > > > [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
> > > ffff972dc74ada80
> > > > [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
> > > 0000000000000000
> > > > [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000)
> > > knlGS:0000000000000000
> > > > [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
> > > 00000000003406f0
> > > > [ 0.145129] Call Trace:
> > > > [ 0.145129] acpi_os_release_object+0x5/0x10
> > > > [ 0.145129] acpi_ns_delete_children+0x46/0x59
> > > > [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > > > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > [ 0.145129] acpi_ns_terminate+0xc/0x31
> > > > [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> > > > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > [ 0.145129] acpi_terminate+0x5/0xf
> > > > [ 0.145129] acpi_init+0x27b/0x308
> > > > [ 0.145129] ? video_setup+0x79/0x79
> > > > [ 0.145129] do_one_initcall+0x7b/0x160
> > > > [ 0.145129] kernel_init_freeable+0x190/0x1f2
> > > > [ 0.145129] ? rest_init+0x9a/0x9a
> > > > [ 0.145129] kernel_init+0x5/0xf6
> > > > [ 0.145129] ret_from_fork+0x22/0x30
> > > > [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> > > > [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58
> > > cpu=0 pid=0
> > > > [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> > > > [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> > > > [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> > > > [ 0.147024] acpi_early_init+0x5d/0xd1
> > > > [ 0.147132] start_kernel+0x45b/0x518
> > > > [ 0.147491] secondary_startup_64+0xb6/0xc0
> > > > [ 0.147897] ------------[ cut here ]------------
> > > >
> > > > And it seems ACPI is allocating an object via kmalloc() and then
> > > > freeing it via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which
> > > is wrong.
> > > > > ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > > > acpi_ns_root_initialize+0xb6/0x2d1:
> > > > kmalloc at include/linux/slab.h:555
> > > > (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
> > > > acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > > > (inlined by) acpi_ns_root_initialize at
> > > > drivers/acpi/acpica/nsaccess.c:102
> > > >
> > Hi Vegard,
> >
> > > That's it :-) This fixes it for me:
> > We'll take this patch for ACPICA and it will be in the next release.
> >
> > Rafael, do you want to take this as a part of the next rc?
>
> Yes, I do.

Folks, what happened to the patch? I don't see it in current upstream.

Looks like it got reported again:

https://lore.kernel.org/r/[email protected]

--
Kirill A. Shutemov

2021-03-23 19:06:05

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

On Tue, Mar 23, 2021 at 7:32 PM Kirill A. Shutemov <[email protected]> wrote:
>
> On Fri, Jun 12, 2020 at 02:26:58PM +0200, Rafael J. Wysocki wrote:
> > On 6/11/2020 3:40 AM, Kaneda, Erik wrote:
> > >
> > > > -----Original Message-----
> > > > From: Vegard Nossum <[email protected]>
> > > > Sent: Friday, June 5, 2020 7:45 AM
> > > > To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
> > > > <[email protected]>; Moore, Robert <[email protected]>; Kaneda,
> > > > Erik <[email protected]>
> > > > Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
> > > > <[email protected]>; Christoph Lameter <[email protected]>; Andrew
> > > > Morton <[email protected]>; Marco Elver <[email protected]>;
> > > > Waiman Long <[email protected]>; LKML <linux-
> > > > [email protected]>; Linux MM <[email protected]>; ACPI Devel
> > > > Maling List <[email protected]>; Len Brown <[email protected]>;
> > > > Steven Rostedt <[email protected]>
> > > > Subject: Re: slub freelist issue / BUG: unable to handle page fault for
> > > > address: 000000003ffe0018
> > > >
> > > > On 2020-06-05 16:08, Vlastimil Babka wrote:
> > > > > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> > > > > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
> > > > <[email protected]> wrote:
> > > > > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> > > > > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> > > > > > > > > On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> > > > > > > > > > On 6/4/20 7:57 PM, Kees Cook wrote:
> > > > > > > > > > > On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
> > > > > > > > > > > > On 2020-06-04 19:18, Vlastimil Babka wrote:
> > > > > > > > > > > > > On 6/4/20 7:14 PM, Vegard Nossum wrote:
> > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I ran into a boot problem with latest linus/master
> > > > > > > > > > > > > > (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests
> > > > like this:
> > > > > > > > > > > > > Hi, what's the .config you use?
> > > > > > > > > > > > Pretty much x86_64 defconfig minus a few options (PCI, USB,
> > > > > > > > > > > > ...)
> > > > > > > > > > > Oh yes indeed. I immediately crash in the same way with this config.
> > > > > > > > > > > I'll
> > > > > > > > > > > start digging...
> > > > > > > > > > >
> > > > > > > > > > > (defconfig finishes boot)
> > > > > > > > > > This is funny, booting with slub_debug=F results in:
> > > > > > > > > > I'm not sure if it's ACPI or ftrace wrong here, but looks like
> > > > > > > > > > the changed free pointer offset merely exposes a bug in something
> > > > > > > > > > else.
> > > > > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even
> > > > > > > > > more specific slub_debug=F,ftrace_event_field) also hits this bug
> > > > > > > > > below. I wanted to bisect it, but v5.7 was also bad, and also
> > > > > > > > > v5.6. Didn't try further in history. So it's not new at all, and
> > > > > > > > > likely very specific to your config+QEMU? (and related to the ACPI
> > > > > > > > > error messages that precede it?).
> > > > > > > > I see it too, but not on v5.0. I can bisect it.
> > > > > > > commit 67a72420a326b45514deb3f212085fb2cd1595b5
> > > > > > > Author: Bob Moore <[email protected]>
> > > > > > > Date: Fri Aug 16 14:43:21 2019 -0700
> > > > > > >
> > > > > > > ACPICA: Increase total number of possible Owner IDs
> > > > > > >
> > > > > > > ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
> > > > > > >
> > > > > > > From 255 to 4095 possible IDs.
> > > > > > >
> > > > > > > Link: https://github.com/acpica/acpica/commit/1f1652da
> > > > > > > Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> > > > > > > Signed-off-by: Bob Moore <[email protected]>
> > > > > > > Signed-off-by: Erik Schmauss <[email protected]>
> > > > > > > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > > > > > Bob, Erik, did we miss something in that patch?
> > > > > Maybe the patch just changes layout in a way that exposes the bug.
> > > > >
> > > > > Anyway the "ftrace_event_field" cache is not really involved, this is
> > > > > just because of slab merging. After adding "slub_nomerge" to
> > > > > "slub_debug=F", it starts making more sense, as the cache becomes
> > > > > Acpi-Namespace
> > > > >
> > > > > [ 0.140408] ------------[ cut here ]------------
> > > > > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> > > > object is from kmalloc-64
> > > > > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> > > > kmem_cache_free+0x1d3/0x250
> > > > > [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > > > > [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > > BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > > > > [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > > > > [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> > > > 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de
> > > > 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > > > > [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> > > > > [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
> > > > 0000000000000000
> > > > > [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> > > > ffffffffacfdd32c
> > > > > [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
> > > > 00000000000000fc
> > > > > [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
> > > > ffff937307409e00
> > > > > [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
> > > > 0000000000000000
> > > > > [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000)
> > > > knlGS:0000000000000000
> > > > > [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
> > > > 00000000003406f0
> > > > > [ 0.142393] Call Trace:
> > > > > [ 0.142393] acpi_os_release_object+0x5/0x10
> > > > > [ 0.142393] acpi_ns_delete_children+0x46/0x59
> > > > > [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > > > > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > [ 0.142393] acpi_ns_terminate+0xc/0x31
> > > > > [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> > > > > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > [ 0.142393] acpi_terminate+0x5/0xf
> > > > > [ 0.142393] acpi_init+0x27b/0x308
> > > > > [ 0.142393] ? video_setup+0x79/0x79
> > > > > [ 0.142393] do_one_initcall+0x7b/0x160
> > > > > [ 0.142393] kernel_init_freeable+0x190/0x1f2
> > > > > [ 0.142393] ? rest_init+0x9a/0x9a
> > > > > [ 0.142393] kernel_init+0x5/0xf6
> > > > > [ 0.142393] ret_from_fork+0x22/0x30
> > > > > [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> > > > > [ 0.142396] ------------[ cut here ]------------
> > > > >
> > > > > I've also changed the warning so it's not printed just once, and also
> > > > > prints tracking info (see the hunk at the end of my mail, I'll turn this to a
> > > > proper patch later).
> > > > > With "slub_debug=FU slub_nomerge" there are now multiple warnings,
> > > > but they all look the same:
> > > > > [ 0.143815] ------------[ cut here ]------------
> > > > > [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but
> > > > object is from kmalloc-64
> > > > > [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> > > > kmem_cache_free+0x1d3/0x250
> > > > > [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > > > > [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > > > BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > > > > [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > > > > [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
> > > > 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de
> > > > 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > > > > [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> > > > > [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
> > > > 0000000000000000
> > > > > [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> > > > ffffffff8e1dd32c
> > > > > [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
> > > > 00000000000000fc
> > > > > [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
> > > > ffff972dc74ada80
> > > > > [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
> > > > 0000000000000000
> > > > > [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000)
> > > > knlGS:0000000000000000
> > > > > [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
> > > > 00000000003406f0
> > > > > [ 0.145129] Call Trace:
> > > > > [ 0.145129] acpi_os_release_object+0x5/0x10
> > > > > [ 0.145129] acpi_ns_delete_children+0x46/0x59
> > > > > [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > > > > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > [ 0.145129] acpi_ns_terminate+0xc/0x31
> > > > > [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> > > > > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > [ 0.145129] acpi_terminate+0x5/0xf
> > > > > [ 0.145129] acpi_init+0x27b/0x308
> > > > > [ 0.145129] ? video_setup+0x79/0x79
> > > > > [ 0.145129] do_one_initcall+0x7b/0x160
> > > > > [ 0.145129] kernel_init_freeable+0x190/0x1f2
> > > > > [ 0.145129] ? rest_init+0x9a/0x9a
> > > > > [ 0.145129] kernel_init+0x5/0xf6
> > > > > [ 0.145129] ret_from_fork+0x22/0x30
> > > > > [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> > > > > [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58
> > > > cpu=0 pid=0
> > > > > [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> > > > > [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> > > > > [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> > > > > [ 0.147024] acpi_early_init+0x5d/0xd1
> > > > > [ 0.147132] start_kernel+0x45b/0x518
> > > > > [ 0.147491] secondary_startup_64+0xb6/0xc0
> > > > > [ 0.147897] ------------[ cut here ]------------
> > > > >
> > > > > And it seems ACPI is allocating an object via kmalloc() and then
> > > > > freeing it via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which
> > > > is wrong.
> > > > > > ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > > > > acpi_ns_root_initialize+0xb6/0x2d1:
> > > > > kmalloc at include/linux/slab.h:555
> > > > > (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
> > > > > acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > > > > (inlined by) acpi_ns_root_initialize at
> > > > > drivers/acpi/acpica/nsaccess.c:102
> > > > >
> > > Hi Vegard,
> > >
> > > > That's it :-) This fixes it for me:
> > > We'll take this patch for ACPICA and it will be in the next release.
> > >
> > > Rafael, do you want to take this as a part of the next rc?
> >
> > Yes, I do.
>
> Folks, what happened to the patch? I don't see it in current upstream.
>
> Looks like it got reported again:
>
> https://lore.kernel.org/r/[email protected]

Well, I'm actually not sure.

Erik?

2021-03-23 21:59:01

by Kaneda, Erik

[permalink] [raw]
Subject: RE: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018



> -----Original Message-----
> From: Rafael J. Wysocki <[email protected]>
> Sent: Tuesday, March 23, 2021 12:03 PM
> To: Kirill A. Shutemov <[email protected]>; Kaneda, Erik
> <[email protected]>
> Cc: Wysocki, Rafael J <[email protected]>; Vegard Nossum
> <[email protected]>; Vlastimil Babka <[email protected]>; Rafael J.
> Wysocki <[email protected]>; Moore, Robert <[email protected]>;
> Kees Cook <[email protected]>; Christoph Lameter <[email protected]>;
> Andrew Morton <[email protected]>; Marco Elver
> <[email protected]>; Waiman Long <[email protected]>; LKML <linux-
> [email protected]>; Linux MM <[email protected]>; ACPI Devel
> Maling List <[email protected]>; Len Brown <[email protected]>;
> Steven Rostedt <[email protected]>; Jan Kiszka
> <[email protected]>
> Subject: Re: slub freelist issue / BUG: unable to handle page fault for
> address: 000000003ffe0018
>
> On Tue, Mar 23, 2021 at 7:32 PM Kirill A. Shutemov <[email protected]>
> wrote:
> >
> > On Fri, Jun 12, 2020 at 02:26:58PM +0200, Rafael J. Wysocki wrote:
> > > On 6/11/2020 3:40 AM, Kaneda, Erik wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Vegard Nossum <[email protected]>
> > > > > Sent: Friday, June 5, 2020 7:45 AM
> > > > > To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
> > > > > <[email protected]>; Moore, Robert <[email protected]>;
> Kaneda,
> > > > > Erik <[email protected]>
> > > > > Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
> > > > > <[email protected]>; Christoph Lameter <[email protected]>;
> Andrew
> > > > > Morton <[email protected]>; Marco Elver
> <[email protected]>;
> > > > > Waiman Long <[email protected]>; LKML <linux-
> > > > > [email protected]>; Linux MM <[email protected]>; ACPI
> Devel
> > > > > Maling List <[email protected]>; Len Brown
> <[email protected]>;
> > > > > Steven Rostedt <[email protected]>
> > > > > Subject: Re: slub freelist issue / BUG: unable to handle page fault for
> > > > > address: 000000003ffe0018
> > > > >
> > > > > On 2020-06-05 16:08, Vlastimil Babka wrote:
> > > > > > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> > > > > > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
> > > > > <[email protected]> wrote:
> > > > > > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> > > > > > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> > > > > > > > > > On 6/4/20 8:46 PM, Vlastimil Babka wrote:
> > > > > > > > > > > On 6/4/20 7:57 PM, Kees Cook wrote:
> > > > > > > > > > > > On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard
> Nossum wrote:
> > > > > > > > > > > > > On 2020-06-04 19:18, Vlastimil Babka wrote:
> > > > > > > > > > > > > > On 6/4/20 7:14 PM, Vegard Nossum wrote:
> > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I ran into a boot problem with latest linus/master
> > > > > > > > > > > > > > > (6929f71e46bdddbf1c4d67c2728648176c67c555)
> that manifests
> > > > > like this:
> > > > > > > > > > > > > > Hi, what's the .config you use?
> > > > > > > > > > > > > Pretty much x86_64 defconfig minus a few options (PCI,
> USB,
> > > > > > > > > > > > > ...)
> > > > > > > > > > > > Oh yes indeed. I immediately crash in the same way with
> this config.
> > > > > > > > > > > > I'll
> > > > > > > > > > > > start digging...
> > > > > > > > > > > >
> > > > > > > > > > > > (defconfig finishes boot)
> > > > > > > > > > > This is funny, booting with slub_debug=F results in:
> > > > > > > > > > > I'm not sure if it's ACPI or ftrace wrong here, but looks like
> > > > > > > > > > > the changed free pointer offset merely exposes a bug in
> something
> > > > > > > > > > > else.
> > > > > > > > > > So, with Kees' patch reverted, booting with slub_debug=F
> (or even
> > > > > > > > > > more specific slub_debug=F,ftrace_event_field) also hits
> this bug
> > > > > > > > > > below. I wanted to bisect it, but v5.7 was also bad, and also
> > > > > > > > > > v5.6. Didn't try further in history. So it's not new at all, and
> > > > > > > > > > likely very specific to your config+QEMU? (and related to the
> ACPI
> > > > > > > > > > error messages that precede it?).
> > > > > > > > > I see it too, but not on v5.0. I can bisect it.
> > > > > > > > commit 67a72420a326b45514deb3f212085fb2cd1595b5
> > > > > > > > Author: Bob Moore <[email protected]>
> > > > > > > > Date: Fri Aug 16 14:43:21 2019 -0700
> > > > > > > >
> > > > > > > > ACPICA: Increase total number of possible Owner IDs
> > > > > > > >
> > > > > > > > ACPICA commit
> 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
> > > > > > > >
> > > > > > > > From 255 to 4095 possible IDs.
> > > > > > > >
> > > > > > > > Link: https://github.com/acpica/acpica/commit/1f1652da
> > > > > > > > Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
> > > > > > > > Signed-off-by: Bob Moore <[email protected]>
> > > > > > > > Signed-off-by: Erik Schmauss <[email protected]>
> > > > > > > > Signed-off-by: Rafael J. Wysocki
> <[email protected]>
> > > > > > > Bob, Erik, did we miss something in that patch?
> > > > > > Maybe the patch just changes layout in a way that exposes the bug.
> > > > > >
> > > > > > Anyway the "ftrace_event_field" cache is not really involved, this is
> > > > > > just because of slab merging. After adding "slub_nomerge" to
> > > > > > "slub_debug=F", it starts making more sense, as the cache becomes
> > > > > > Acpi-Namespace
> > > > > >
> > > > > > [ 0.140408] ------------[ cut here ]------------
> > > > > > [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace
> but
> > > > > object is from kmalloc-64
> > > > > > [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> > > > > kmem_cache_free+0x1d3/0x250
> > > > > > [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > > > > > [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX,
> 1996),
> > > > > BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > > > > > [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > > > > > [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d
> 58 48
> > > > > 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b
> 48 89 de
> > > > > 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > > > > > [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
> > > > > > [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
> > > > > 0000000000000000
> > > > > > [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> > > > > ffffffffacfdd32c
> > > > > > [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
> > > > > 00000000000000fc
> > > > > > [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
> > > > > ffff937307409e00
> > > > > > [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
> > > > > 0000000000000000
> > > > > > [ 0.142393] FS: 0000000000000000(0000)
> GS:ffff937287a00000(0000)
> > > > > knlGS:0000000000000000
> > > > > > [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
> > > > > 00000000003406f0
> > > > > > [ 0.142393] Call Trace:
> > > > > > [ 0.142393] acpi_os_release_object+0x5/0x10
> > > > > > [ 0.142393] acpi_ns_delete_children+0x46/0x59
> > > > > > [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > > > > > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > > [ 0.142393] acpi_ns_terminate+0xc/0x31
> > > > > > [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
> > > > > > [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > > [ 0.142393] acpi_terminate+0x5/0xf
> > > > > > [ 0.142393] acpi_init+0x27b/0x308
> > > > > > [ 0.142393] ? video_setup+0x79/0x79
> > > > > > [ 0.142393] do_one_initcall+0x7b/0x160
> > > > > > [ 0.142393] kernel_init_freeable+0x190/0x1f2
> > > > > > [ 0.142393] ? rest_init+0x9a/0x9a
> > > > > > [ 0.142393] kernel_init+0x5/0xf6
> > > > > > [ 0.142393] ret_from_fork+0x22/0x30
> > > > > > [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
> > > > > > [ 0.142396] ------------[ cut here ]------------
> > > > > >
> > > > > > I've also changed the warning so it's not printed just once, and also
> > > > > > prints tracking info (see the hunk at the end of my mail, I'll turn this
> to a
> > > > > proper patch later).
> > > > > > With "slub_debug=FU slub_nomerge" there are now multiple
> warnings,
> > > > > but they all look the same:
> > > > > > [ 0.143815] ------------[ cut here ]------------
> > > > > > [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace
> but
> > > > > object is from kmalloc-64
> > > > > > [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
> > > > > kmem_cache_free+0x1d3/0x250
> > > > > > [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
> > > > > > [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX,
> 1996),
> > > > > BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > > > > > [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
> > > > > > [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d
> 58 48
> > > > > 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b
> 48 89 de
> > > > > 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
> > > > > > [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
> > > > > > [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
> > > > > 0000000000000000
> > > > > > [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
> > > > > ffffffff8e1dd32c
> > > > > > [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
> > > > > 00000000000000fc
> > > > > > [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
> > > > > ffff972dc74ada80
> > > > > > [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
> > > > > 0000000000000000
> > > > > > [ 0.145129] FS: 0000000000000000(0000)
> GS:ffff972d47a00000(0000)
> > > > > knlGS:0000000000000000
> > > > > > [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
> > > > > 00000000003406f0
> > > > > > [ 0.145129] Call Trace:
> > > > > > [ 0.145129] acpi_os_release_object+0x5/0x10
> > > > > > [ 0.145129] acpi_ns_delete_children+0x46/0x59
> > > > > > [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
> > > > > > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > > [ 0.145129] acpi_ns_terminate+0xc/0x31
> > > > > > [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
> > > > > > [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
> > > > > > [ 0.145129] acpi_terminate+0x5/0xf
> > > > > > [ 0.145129] acpi_init+0x27b/0x308
> > > > > > [ 0.145129] ? video_setup+0x79/0x79
> > > > > > [ 0.145129] do_one_initcall+0x7b/0x160
> > > > > > [ 0.145129] kernel_init_freeable+0x190/0x1f2
> > > > > > [ 0.145129] ? rest_init+0x9a/0x9a
> > > > > > [ 0.145129] kernel_init+0x5/0xf6
> > > > > > [ 0.145129] ret_from_fork+0x22/0x30
> > > > > > [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
> > > > > > [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1
> age=58
> > > > > cpu=0 pid=0
> > > > > > [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
> > > > > > [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
> > > > > > [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
> > > > > > [ 0.147024] acpi_early_init+0x5d/0xd1
> > > > > > [ 0.147132] start_kernel+0x45b/0x518
> > > > > > [ 0.147491] secondary_startup_64+0xb6/0xc0
> > > > > > [ 0.147897] ------------[ cut here ]------------
> > > > > >
> > > > > > And it seems ACPI is allocating an object via kmalloc() and then
> > > > > > freeing it via kmem_cache_free(<"Acpi-Namespace"
> kmem_cache>) which
> > > > > is wrong.
> > > > > > > ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
> > > > > > acpi_ns_root_initialize+0xb6/0x2d1:
> > > > > > kmalloc at include/linux/slab.h:555
> > > > > > (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
> > > > > > acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
> > > > > > (inlined by) acpi_ns_root_initialize at
> > > > > > drivers/acpi/acpica/nsaccess.c:102
> > > > > >
> > > > Hi Vegard,
> > > >
> > > > > That's it :-) This fixes it for me:
> > > > We'll take this patch for ACPICA and it will be in the next release.
> > > >
> > > > Rafael, do you want to take this as a part of the next rc?
> > >
> > > Yes, I do.
> >
Hi,

> > Folks, what happened to the patch? I don't see it in current upstream.
> >
> > Looks like it got reported again:
> >
> > https://lore.kernel.org/r/a1461e21-c744-767d-6dfc-
> [email protected]
>
> Well, I'm actually not sure.
>
> Erik?

Sorry about that. It looks like I missed this. I've submitted a pull request for the ACPICA upstream here: https://github.com/acpica/acpica/pull/680
I've also submitted a Linux-ized version of the patch to the mailing list. Rafael, please take the submitted patch for the next rc.

Thanks and sorry for the mistake!
Erik

2021-03-24 07:17:36

by Vegard Nossum

[permalink] [raw]
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018

(trimmed CCs)

On 2021-03-23 19:32, Kirill A. Shutemov wrote:
> On Fri, Jun 12, 2020 at 02:26:58PM +0200, Rafael J. Wysocki wrote:
>> On 6/11/2020 3:40 AM, Kaneda, Erik wrote:
>>>
>>>> -----Original Message-----
>>>> From: Vegard Nossum <[email protected]>
>>>> Sent: Friday, June 5, 2020 7:45 AM
>>>> To: Vlastimil Babka <[email protected]>; Rafael J. Wysocki
>>>> <[email protected]>; Moore, Robert <[email protected]>; Kaneda,
>>>> Erik <[email protected]>
>>>> Cc: Kees Cook <[email protected]>; Wysocki, Rafael J
>>>> <[email protected]>; Christoph Lameter <[email protected]>; Andrew
>>>> Morton <[email protected]>; Marco Elver <[email protected]>;
>>>> Waiman Long <[email protected]>; LKML <linux-
>>>> [email protected]>; Linux MM <[email protected]>; ACPI Devel
>>>> Maling List <[email protected]>; Len Brown <[email protected]>;
>>>> Steven Rostedt <[email protected]>
>>>> Subject: Re: slub freelist issue / BUG: unable to handle page fault for
>>>> address: 000000003ffe0018
>>>>
>>>> On 2020-06-05 16:08, Vlastimil Babka wrote:
>>>>> On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
>>>>>> On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum
>>>> <[email protected]> wrote:
>>>>>>> On 2020-06-05 11:36, Vegard Nossum wrote:
>>>>>>>> On 2020-06-05 11:11, Vlastimil Babka wrote:
>>>>>>>>> On 6/4/20 8:46 PM, Vlastimil Babka wrote:
>>>>>>>>>> On 6/4/20 7:57 PM, Kees Cook wrote:
>>>>>>>>>>> On Thu, Jun 04, 2020 at 07:20:18PM +0200, Vegard Nossum wrote:
>>>>>>>>>>>> On 2020-06-04 19:18, Vlastimil Babka wrote:
>>>>>>>>>>>>> On 6/4/20 7:14 PM, Vegard Nossum wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I ran into a boot problem with latest linus/master
>>>>>>>>>>>>>> (6929f71e46bdddbf1c4d67c2728648176c67c555) that manifests
>>>> like this:
>>>>>>>>>>>>> Hi, what's the .config you use?
>>>>>>>>>>>> Pretty much x86_64 defconfig minus a few options (PCI, USB,
>>>>>>>>>>>> ...)
>>>>>>>>>>> Oh yes indeed. I immediately crash in the same way with this config.
>>>>>>>>>>> I'll
>>>>>>>>>>> start digging...
>>>>>>>>>>>
>>>>>>>>>>> (defconfig finishes boot)
>>>>>>>>>> This is funny, booting with slub_debug=F results in:
>>>>>>>>>> I'm not sure if it's ACPI or ftrace wrong here, but looks like
>>>>>>>>>> the changed free pointer offset merely exposes a bug in something
>>>>>>>>>> else.
>>>>>>>>> So, with Kees' patch reverted, booting with slub_debug=F (or even
>>>>>>>>> more specific slub_debug=F,ftrace_event_field) also hits this bug
>>>>>>>>> below. I wanted to bisect it, but v5.7 was also bad, and also
>>>>>>>>> v5.6. Didn't try further in history. So it's not new at all, and
>>>>>>>>> likely very specific to your config+QEMU? (and related to the ACPI
>>>>>>>>> error messages that precede it?).
>>>>>>>> I see it too, but not on v5.0. I can bisect it.
>>>>>>> commit 67a72420a326b45514deb3f212085fb2cd1595b5
>>>>>>> Author: Bob Moore <[email protected]>
>>>>>>> Date: Fri Aug 16 14:43:21 2019 -0700
>>>>>>>
>>>>>>> ACPICA: Increase total number of possible Owner IDs
>>>>>>>
>>>>>>> ACPICA commit 1f1652dad88b9d767767bc1f7eb4f7d99e6b5324
>>>>>>>
>>>>>>> From 255 to 4095 possible IDs.
>>>>>>>
>>>>>>> Link: https://github.com/acpica/acpica/commit/1f1652da
>>>>>>> Reported-by: Hedi Berriche <hedi.berriche @hpe.com>
>>>>>>> Signed-off-by: Bob Moore <[email protected]>
>>>>>>> Signed-off-by: Erik Schmauss <[email protected]>
>>>>>>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>>>>>> Bob, Erik, did we miss something in that patch?
>>>>> Maybe the patch just changes layout in a way that exposes the bug.
>>>>>
>>>>> Anyway the "ftrace_event_field" cache is not really involved, this is
>>>>> just because of slab merging. After adding "slub_nomerge" to
>>>>> "slub_debug=F", it starts making more sense, as the cache becomes
>>>>> Acpi-Namespace
>>>>>
>>>>> [ 0.140408] ------------[ cut here ]------------
>>>>> [ 0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but
>>>> object is from kmalloc-64
>>>>> [ 0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
>>>> kmem_cache_free+0x1d3/0x250
>>>>> [ 0.142105] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
>>>>> [ 0.142393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>>> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
>>>>> [ 0.142393] RIP: 0010:kmem_cache_free+0x1d3/0x250
>>>>> [ 0.142393] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
>>>> 8b 55 58 48 c7 c6 10 47 a1 ac 48 c7 c7 00 c2 b0 ac e8 b1 cc eb ff <0f> 0b 48 89 de
>>>> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
>>>>> [ 0.142393] RSP: 0018:ffffb39cc0013dc0 EFLAGS: 00010282
>>>>> [ 0.142393] RAX: 0000000000000000 RBX: ffff937287409e00 RCX:
>>>> 0000000000000000
>>>>> [ 0.142393] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
>>>> ffffffffacfdd32c
>>>>> [ 0.142393] RBP: ffff93728742ef00 R08: ffffb39cc0013c7d R09:
>>>> 00000000000000fc
>>>>> [ 0.142393] R10: ffffb39cc0013c78 R11: ffffb39cc0013c7d R12:
>>>> ffff937307409e00
>>>>> [ 0.142393] R13: ffff937287401d00 R14: 0000000000000000 R15:
>>>> 0000000000000000
>>>>> [ 0.142393] FS: 0000000000000000(0000) GS:ffff937287a00000(0000)
>>>> knlGS:0000000000000000
>>>>> [ 0.142393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 0.142393] CR2: 0000000000000000 CR3: 0000000003a0a000 CR4:
>>>> 00000000003406f0
>>>>> [ 0.142393] Call Trace:
>>>>> [ 0.142393] acpi_os_release_object+0x5/0x10
>>>>> [ 0.142393] acpi_ns_delete_children+0x46/0x59
>>>>> [ 0.142393] acpi_ns_delete_namespace_subtree+0x5c/0x79
>>>>> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
>>>>> [ 0.142393] acpi_ns_terminate+0xc/0x31
>>>>> [ 0.142393] acpi_ut_subsystem_shutdown+0x45/0xa3
>>>>> [ 0.142393] ? acpi_sleep_proc_init+0x1f/0x1f
>>>>> [ 0.142393] acpi_terminate+0x5/0xf
>>>>> [ 0.142393] acpi_init+0x27b/0x308
>>>>> [ 0.142393] ? video_setup+0x79/0x79
>>>>> [ 0.142393] do_one_initcall+0x7b/0x160
>>>>> [ 0.142393] kernel_init_freeable+0x190/0x1f2
>>>>> [ 0.142393] ? rest_init+0x9a/0x9a
>>>>> [ 0.142393] kernel_init+0x5/0xf6
>>>>> [ 0.142393] ret_from_fork+0x22/0x30
>>>>> [ 0.142393] ---[ end trace 3539f236ef812ba1 ]---
>>>>> [ 0.142396] ------------[ cut here ]------------
>>>>>
>>>>> I've also changed the warning so it's not printed just once, and also
>>>>> prints tracking info (see the hunk at the end of my mail, I'll turn this to a
>>>> proper patch later).
>>>>> With "slub_debug=FU slub_nomerge" there are now multiple warnings,
>>>> but they all look the same:
>>>>> [ 0.143815] ------------[ cut here ]------------
>>>>> [ 0.144131] cache_from_obj: Wrong slab cache. Acpi-Namespace but
>>>> object is from kmalloc-64
>>>>> [ 0.144929] WARNING: CPU: 0 PID: 1 at mm/slab.h:524
>>>> kmem_cache_free+0x1d3/0x250
>>>>> [ 0.145129] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0+ #45
>>>>> [ 0.145129] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>>> BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
>>>>> [ 0.145129] RIP: 0010:kmem_cache_free+0x1d3/0x250
>>>>> [ 0.145129] Code: 18 4d 85 ed 0f 84 10 ff ff ff 4c 39 ed 74 2f 49 8b 4d 58 48
>>>> 8b 55 58 48 c7 c6 10 47 c1 8d 48 c7 c7 00 c2 d0 8d e8 b1 cc eb ff <0f> 0b 48 89 de
>>>> 4c 89 ef e8 10 d7 ff ff 48 8b 15 59 36 9b 00 4c 89
>>>>> [ 0.145129] RSP: 0018:ffff990b80013dc0 EFLAGS: 00010282
>>>>> [ 0.145129] RAX: 0000000000000000 RBX: ffff972d474ada80 RCX:
>>>> 0000000000000000
>>>>> [ 0.145129] RDX: 0000000000000001 RSI: 0000000000000092 RDI:
>>>> ffffffff8e1dd32c
>>>>> [ 0.145129] RBP: ffff972d47425680 R08: ffff990b80013c7d R09:
>>>> 00000000000000fc
>>>>> [ 0.145129] R10: ffff990b80013c78 R11: ffff990b80013c7d R12:
>>>> ffff972dc74ada80
>>>>> [ 0.145129] R13: ffff972d474038c0 R14: 0000000000000000 R15:
>>>> 0000000000000000
>>>>> [ 0.145129] FS: 0000000000000000(0000) GS:ffff972d47a00000(0000)
>>>> knlGS:0000000000000000
>>>>> [ 0.145129] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 0.145129] CR2: 0000000000000000 CR3: 000000000660a000 CR4:
>>>> 00000000003406f0
>>>>> [ 0.145129] Call Trace:
>>>>> [ 0.145129] acpi_os_release_object+0x5/0x10
>>>>> [ 0.145129] acpi_ns_delete_children+0x46/0x59
>>>>> [ 0.145129] acpi_ns_delete_namespace_subtree+0x5c/0x79
>>>>> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
>>>>> [ 0.145129] acpi_ns_terminate+0xc/0x31
>>>>> [ 0.145129] acpi_ut_subsystem_shutdown+0x45/0xa3
>>>>> [ 0.145129] ? acpi_sleep_proc_init+0x1f/0x1f
>>>>> [ 0.145129] acpi_terminate+0x5/0xf
>>>>> [ 0.145129] acpi_init+0x27b/0x308
>>>>> [ 0.145129] ? video_setup+0x79/0x79
>>>>> [ 0.145129] do_one_initcall+0x7b/0x160
>>>>> [ 0.145129] kernel_init_freeable+0x190/0x1f2
>>>>> [ 0.145129] ? rest_init+0x9a/0x9a
>>>>> [ 0.145129] kernel_init+0x5/0xf6
>>>>> [ 0.145129] ret_from_fork+0x22/0x30
>>>>> [ 0.145129] ---[ end trace 574554fca7bd06bb ]---
>>>>> [ 0.145133] INFO: Allocated in acpi_ns_root_initialize+0xb6/0x2d1 age=58
>>>> cpu=0 pid=0
>>>>> [ 0.145881] kmem_cache_alloc_trace+0x1a9/0x1c0
>>>>> [ 0.146132] acpi_ns_root_initialize+0xb6/0x2d1
>>>>> [ 0.146578] acpi_initialize_subsystem+0x65/0xa8
>>>>> [ 0.147024] acpi_early_init+0x5d/0xd1
>>>>> [ 0.147132] start_kernel+0x45b/0x518
>>>>> [ 0.147491] secondary_startup_64+0xb6/0xc0
>>>>> [ 0.147897] ------------[ cut here ]------------
>>>>>
>>>>> And it seems ACPI is allocating an object via kmalloc() and then
>>>>> freeing it via kmem_cache_free(<"Acpi-Namespace" kmem_cache>) which
>>>> is wrong.
>>>>>> ./scripts/faddr2line vmlinux 'acpi_ns_root_initialize+0xb6'
>>>>> acpi_ns_root_initialize+0xb6/0x2d1:
>>>>> kmalloc at include/linux/slab.h:555
>>>>> (inlined by) kzalloc at include/linux/slab.h:669 (inlined by)
>>>>> acpi_os_allocate_zeroed at include/acpi/platform/aclinuxex.h:57
>>>>> (inlined by) acpi_ns_root_initialize at
>>>>> drivers/acpi/acpica/nsaccess.c:102
>>>>>
>>> Hi Vegard,
>>>
>>>> That's it :-) This fixes it for me:
>>> We'll take this patch for ACPICA and it will be in the next release.
>>>
>>> Rafael, do you want to take this as a part of the next rc?
>>
>> Yes, I do.
>
> Folks, what happened to the patch? I don't see it in current upstream.
>
> Looks like it got reported again:
>
> https://lore.kernel.org/r/[email protected]
>

I've attached a properly formatted patch.

Thanks,


Vegard


Attachments:
0001-ACPICA-Always-create-namespace-nodes-using-acpi_ns_c.patch (3.81 kB)