2022-06-30 04:44:28

by Colin Foster

Subject: use-after-free warnings in 5.19-rcX kernel

Hi Tony,

I'm running a beaglebone black and doing some dev on the
next-next/master line. I noticed a lot of messages coming by during
boot, and more recently a change that shouldn't have made a difference
seems to stop me from booting.

The commit in question is commit: ec7aa25fa483 ("ARM: dts: Use clock-output-names for am3")
Prior to this commit, the boot seems fine. After this commit, I get
several warnings.

For these tests I'm booting a stock am335x-boneblack.dtb. My .config
file, and a good and bad boot log are attached.

This seems to be related to my current boot halt:

[ 0.000000] Linux version 5.19.0-rc3-00758-g47bcc1c3c288
...
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13c/0x174
[ 0.000000] refcount_t: underflow; use-after-free.
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.19.0-rc3-00758-g47bcc1c3c288 #776
[ 0.000000] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 0.000000] Backtrace:
[ 0.000000] dump_backtrace from show_stack+0x20/0x24
[ 0.000000] r7:00000009 r6:00000080 r5:c1680764 r4:60000093
[ 0.000000] show_stack from dump_stack_lvl+0x60/0x78
[ 0.000000] dump_stack_lvl from dump_stack+0x18/0x1c
[ 0.000000] r7:00000009 r6:c0702634 r5:0000001c r4:c16a61f8
[ 0.000000] dump_stack from __warn+0xe0/0x18c
[ 0.000000] __warn from warn_slowpath_fmt+0xa8/0xd0
[ 0.000000] r7:c0702634 r6:0000001c r5:c16a61f8 r4:c16a6234
[ 0.000000] warn_slowpath_fmt from refcount_warn_saturate+0x13c/0x174
[ 0.000000] r8:ffff0000 r7:c11398a8 r6:ffffffff r5:c1b01db0 r4:df9b2590
[ 0.000000] refcount_warn_saturate from kobject_put+0xf4/0xfc
[ 0.000000] kobject_put from of_node_put+0x24/0x28
[ 0.000000] r7:c11398a8 r6:ffffffff r5:c1b01db0 r4:df9b255c
[ 0.000000] of_node_put from of_fwnode_put+0x44/0x48
[ 0.000000] of_fwnode_put from fwnode_handle_put.part.0+0x30/0x34
[ 0.000000] fwnode_handle_put.part.0 from fwnode_handle_put+0x28/0x2c
[ 0.000000] fwnode_handle_put from fwnode_full_name_string+0x94/0xa8
[ 0.000000] fwnode_full_name_string from device_node_string+0x4e8/0x508
[ 0.000000] r10:df9b255c r9:c1b01d5e r8:c167527c r7:df9b2550 r6:c1670d0d r5:ffffffff
[ 0.000000] r4:c1b01d44
[ 0.000000] device_node_string from pointer+0x374/0x550
[ 0.000000] r10:c1b01cb4 r9:c1b01d44 r8:c17ac710 r7:c1b01e18 r6:00000000 r5:c1b01d44
[ 0.000000] r4:c1b01d5e
[ 0.000000] pointer from vsnprintf+0x224/0x418
[ 0.000000] r7:c1b01e18 r6:00000002 r5:c17ac70e r4:c1b01d5e
[ 0.000000] vsnprintf from vprintk_store+0x150/0x464
[ 0.000000] r10:00000000 r9:c1b01ef3 r8:00000000 r7:00000000 r6:60000093 r5:df9944d9
[ 0.000000] r4:c1b04dc4
[ 0.000000] vprintk_store from vprintk_emit+0x80/0x320
[ 0.000000] r10:c17ac6ec r9:c1b01ef3 r8:00000000 r7:00000000 r6:ffffffff r5:00000000
[ 0.000000] r4:c1b04dc4
[ 0.000000] vprintk_emit from vprintk_default+0x30/0x38
[ 0.000000] r10:c1d515c4 r9:c1b01ef3 r8:c1b01e9c r7:00000000 r6:c1cdbe94 r5:df9b2550
[ 0.000000] r4:00000000
[ 0.000000] vprintk_default from vprintk+0xa8/0x108
[ 0.000000] vprintk from _printk+0x40/0x68
[ 0.000000] r4:df9b2590
[ 0.000000] _printk from of_node_release+0xd4/0xdc
[ 0.000000] r3:00000008 r2:00000000 r1:df9b2550 r0:c17ac6ec
[ 0.000000] of_node_release from kobject_put+0xbc/0xfc
[ 0.000000] r5:00000000 r4:df9b2590
[ 0.000000] kobject_put from of_node_put+0x24/0x28
[ 0.000000] r7:00000002 r6:00000002 r5:c1c7e154 r4:c2152d40
[ 0.000000] of_node_put from ti_dt_clocks_register+0x2b0/0x35c
[ 0.000000] ti_dt_clocks_register from am33xx_dt_clk_init+0x24/0xb4
[ 0.000000] r10:c19dca6c r9:c1d01000 r8:00000000 r7:ffffffff r6:c1d3c374 r5:c1b04d00
[ 0.000000] r4:c1c7e11c
[ 0.000000] am33xx_dt_clk_init from omap_clk_init+0x5c/0x68
[ 0.000000] r5:c1b04d00 r4:c1d0242c
[ 0.000000] omap_clk_init from omap_init_time_of+0x18/0x20
[ 0.000000] r5:c1b04d00 r4:c1d01000
[ 0.000000] omap_init_time_of from time_init+0x30/0x44
[ 0.000000] time_init from start_kernel+0x548/0x710


I definitely don't understand how a subtle change to an MFD driver could
halt the boot this early... It also only happens when I have a chip
powered on and plugged into SPI 0...

I haven't looked too deeply into this yet, just a simple bisect that I
figured I'd report. Let me know if you have any questions.

Colin Foster.


Attachments:
(No filename) (4.55 kB)
5.18.0-rc1-00004-g9bc059f71c0a.dmesg (29.99 kB)
5.18.0-rc1-00005-gec7aa25fa483.dmesg (46.28 kB)
config.5.19.errors (217.28 kB)

2022-06-30 07:01:18

by Tony Lindgren

Subject: Re: use-after-free warnings in 5.19-rcX kernel

Hi,

* Colin Foster <[email protected]> [220630 04:30]:
> Hi Tony,
>
> I'm running a beaglebone black and doing some dev on the
> next-next/master line. I noticed a lot of messages coming by during
> boot, and more recently a change that shouldn't have made a difference
> seems to stop me from booting.
>
> The commit in question is commit: ec7aa25fa483 ("ARM: dts: Use clock-output-names for am3")
> Prior to this commit, the boot seems fine. After this commit, I get
> several warnings.

This should be fixed with:

[PATCH] clk: ti: Fix missing of_node_get() ti_find_clock_provider()
https://lore.kernel.org/linux-clk/[email protected]/

Can you please give it a try?

Regards,

Tony
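
For readers following the backtrace above: the warning fires when of_node_put() is called on a node that has no reference left to drop, and the patch title points at a missing of_node_get() in a lookup helper. The block below is a minimal userspace sketch of that general pattern, not the kernel code. Every name in it (toy_node, toy_node_get(), toy_node_put(), toy_find_buggy(), toy_find_fixed()) is hypothetical and invented for illustration, and it assumes each node starts with one reference held by the tree itself. The details of the real fix are in the linked patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a refcounted node; NOT the kernel's struct device_node. */
struct toy_node {
	const char *name;
	int refcount;     /* starts at 1: the reference held by the tree (assumption) */
	bool underflowed; /* set when a put arrives with no reference left */
};

/* Take a reference and return the node, mirroring the get/put convention. */
struct toy_node *toy_node_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Drop a reference; record an underflow instead of going negative. */
void toy_node_put(struct toy_node *n)
{
	if (!n)
		return;
	if (n->refcount <= 0) {
		n->underflowed = true;
		return;
	}
	n->refcount--;
}

/* Buggy lookup: hands back a match without taking a reference for the caller. */
struct toy_node *toy_find_buggy(struct toy_node *nodes, size_t count,
				const char *name)
{
	for (size_t i = 0; i < count; i++)
		if (!strcmp(nodes[i].name, name))
			return &nodes[i];
	return NULL;
}

/* Fixed lookup: the returned node carries a reference the caller may drop. */
struct toy_node *toy_find_fixed(struct toy_node *nodes, size_t count,
				const char *name)
{
	for (size_t i = 0; i < count; i++)
		if (!strcmp(nodes[i].name, name))
			return toy_node_get(&nodes[i]);
	return NULL;
}
```

With the buggy lookup, a caller that finishes with toy_node_put() drops the tree's own reference, so the count reaches zero while the node is still in use and any later put underflows. With the fixed lookup, the same caller leaves the count where it started.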

2022-06-30 17:12:23

by Colin Foster

Subject: Re: use-after-free warnings in 5.19-rcX kernel

On Thu, Jun 30, 2022 at 09:56:40AM +0300, Tony Lindgren wrote:
> Hi,
>
> * Colin Foster <[email protected]> [220630 04:30]:
> > Hi Tony,
> >
> > I'm running a beaglebone black and doing some dev on the
> > next-next/master line. I noticed a lot of messages coming by during
> > boot, and more recently a change that shouldn't have made a difference
> > seems to stop me from booting.
> >
> > The commit in question is commit: ec7aa25fa483 ("ARM: dts: Use clock-output-names for am3")
> > Prior to this commit, the boot seems fine. After this commit, I get
> > several warnings.
>
> This should be fixed with:
>
> [PATCH] clk: ti: Fix missing of_node_get() ti_find_clock_provider()
> https://lore.kernel.org/linux-clk/[email protected]/
>
> Can you please give it a try?

Yep - seems to fix the issue. Thanks!

>
> Regards,
>
> Tony