2021-10-30 00:00:30

by Frank Rowand

[permalink] [raw]
Subject: Fwd: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36


Reported in bugzilla, forwarding to the mail lists and maintainers.

-Frank


-------- Forwarded Message --------
Subject: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
Date: Fri, 29 Oct 2021 13:59:02 +0000
From: [email protected]


https://bugzilla.kernel.org/show_bug.cgi?id=214867

Bug ID: 214867
Summary: UBSAN: shift-out-of-bounds in
drivers/of/unittest.c:1933:36
Product: Platform Specific/Hardware
Version: 2.5
Kernel Version: 5.15-rc7
Hardware: PPC-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: PPC-64
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected]
Regression: No

Created attachment 299361
--> https://bugzilla.kernel.org/attachment.cgi?id=299361&action=edit
kernel dmesg (kernel 5.15-rc7, Talos II)

UBSAN catches this at boot on my Talos II.

[...]
### dt-test ### EXPECT / : GPIO line <<int>> (line-C-input) hogged as input
================================================================================
UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
shift exponent -1 is negative
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII #1
Call Trace:
[c000000004163700] [c0000000008ffaa8] .dump_stack_lvl+0xa4/0x100 (unreliable)
[c000000004163790] [c0000000008fb46c] .ubsan_epilogue+0x10/0x70
[c000000004163800] [c0000000008fb270]
.__ubsan_handle_shift_out_of_bounds+0x1f0/0x34c
[c000000004163910] [c000000000ad94a0] .of_unittest_untrack_overlay+0x6c/0xe0
[c0000000041639a0] [c000000002098ff8] .of_unittest+0x4c50/0x59f8
[c000000004163b60] [c000000000011b5c] .do_one_initcall+0x7c/0x4f0
[c000000004163c50] [c00000000200300c] .kernel_init_freeable+0x704/0x858
[c000000004163d90] [c000000000012730] .kernel_init+0x20/0x190
[c000000004163e10] [c00000000000ce78] .ret_from_kernel_thread+0x58/0x60
================================================================================
### dt-test ### EXPECT \ : OF: overlay: WARNING: memory leak will occur if
overlay removed, property: /testcase-data-2/substation@100/status
[...]

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are on the CC list for the bug.


2021-10-30 00:09:00

by Erhard Furtner

[permalink] [raw]
Subject: Re: Fwd: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36

(In reply to Arnd Bergmann from comment #2)
> My guess is that 'id' is negative here, which means it fails to tigger the
> WARN_ON() but ends up still being out of range.
>
> Can you try changing it to 'unsigned int id'?
When I change it to static void of_unittest_untrack_overlay(unsigned int id) I get no UBSAN message but this warning:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/of/unittest.c:1931 .of_unittest_untrack_overlay+0x20/0xac
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII+ #2
NIP: c000000000ad9368 LR: c000000002099000 CTR: 0000000000000000
REGS: c000000004163700 TRAP: 0700 Not tainted (5.15.0-rc7-TalosII+)
MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 44004242 XER: 20040000
CFAR: c000000002098ffc IRQMASK: 0
GPR00: c000000002098f74 c0000000041639a0 c0000000023b2f00 ffffffffffffffff
GPR04: c000000002270b90 0000000000000000 c000000004163794 c0000000027744d8
GPR08: 00000007fdaf2000 0000000000000001 0000000000000000 00000000ffffffff
GPR12: 0000000024004242 c0000000031d5000 b086ed6156fceb64 c0000000022718d8
GPR16: d5cb61e74edefbbf 0000000000000000 c0000000010840b0 c000000001084078
GPR20: c000000000e0df58 c0000000021d6120 0000000000000000 c000200004d579f0
GPR24: c0000000020ad0a0 c0000000022713c8 c0000000031c9110 0000000000000000
GPR28: c000000000e0d110 c0000000031c9158 7bba9c880526c811 c0000000041639a0
NIP [c000000000ad9368] .of_unittest_untrack_overlay+0x20/0xac
LR [c000000002099000] .of_unittest+0x4c5c/0x5a04
Call Trace:
[c0000000041639a0] [c000000002098f74] .of_unittest+0x4bd0/0x5a04 (unreliable)
[c000000004163b60] [c000000000011b5c] .do_one_initcall+0x7c/0x4f0
[c000000004163c50] [c00000000200300c] .kernel_init_freeable+0x704/0x858
[c000000004163d90] [c000000000012730] .kernel_init+0x20/0x190
[c000000004163e10] [c00000000000ce78] .ret_from_kernel_thread+0x58/0x60
Instruction dump:
60000000 60000000 38600000 4e800020 3d22ffec e929e4ca 2c090000 4d800020
7c691850 39200001 28030100 7d20481e <0b090000> 7c0802a6 fb81ffe0 fba1ffe8
irq event stamp: 514764
hardirqs last enabled at (514763): [<c0000000004398e4>] .__slab_free+0x394/0x590
hardirqs last disabled at (514764): [<c0000000000312bc>] .interrupt_enter_prepare.constprop.0+0xec/0x150
softirqs last enabled at (509346): [<c000000000d845cc>] .__do_softirq+0x4cc/0x714
softirqs last disabled at (509339): [<c0000000000f6eb8>] .__irq_exit_rcu+0x148/0x1b0
---[ end trace 0c8618d488a1a13d ]---


In unittest.c id generally seems to be derived from a constant and counted downwards in loops.

[...]
#define MAX_UNITTEST_OVERLAYS 256
[...]
static void of_unittest_destroy_tracked_overlays(void)
{
int id, ret, defers, ovcs_id;

if (overlay_first_id < 0)
return;

/* try until no defers */
do {
defers = 0;
/* remove in reverse order */
for (id = MAX_UNITTEST_OVERLAYS - 1; id >= 0; id--) {
if (!of_unittest_overlay_tracked(id))
continue;

ovcs_id = id + overlay_first_id;
ret = of_overlay_remove(&ovcs_id);
if (ret == -ENODEV) {
pr_warn("%s: no overlay to destroy for #%d\n",
__func__, id + overlay_first_id);
continue;
}
if (ret != 0) {
defers++;
pr_warn("%s: overlay destroy failed for #%d\n",
__func__, id + overlay_first_id);
continue;
}

of_unittest_untrack_overlay(id);
}
} while (defers > 0);
}

It should not get negative with the "id >= 0" above in the for-loop. But from it's purpose it should be an unsigned int probably. It would need to be changed in the whole unittest.c though as it is used as int in several other places, e.g.
static long of_unittest_overlay_tracked(int id)
static void of_unittest_track_overlay(int id)

2021-10-30 00:09:58

by Frank Rowand

[permalink] [raw]
Subject: Re: Fwd: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36

On 10/29/21 6:57 PM, Frank Rowand wrote:
>
> Reported in bugzilla, forwarding to the mail lists and maintainers.
>
> -Frank
>
>
> -------- Forwarded Message --------
> Subject: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
> Date: Fri, 29 Oct 2021 13:59:02 +0000
> From: [email protected]
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=214867
>
> Bug ID: 214867
> Summary: UBSAN: shift-out-of-bounds in
> drivers/of/unittest.c:1933:36
> Product: Platform Specific/Hardware
> Version: 2.5
> Kernel Version: 5.15-rc7
> Hardware: PPC-64
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: PPC-64
> Assignee: [email protected]
> Reporter: [email protected]
> CC: [email protected]
> Regression: No
>
> Created attachment 299361
> --> https://bugzilla.kernel.org/attachment.cgi?id=299361&action=edit
> kernel dmesg (kernel 5.15-rc7, Talos II)
>
> UBSAN catches this at boot on my Talos II.
>
> [...]
> ### dt-test ### EXPECT / : GPIO line <<int>> (line-C-input) hogged as input
> ================================================================================
> UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36
> shift exponent -1 is negative
> CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII #1
> Call Trace:
> [c000000004163700] [c0000000008ffaa8] .dump_stack_lvl+0xa4/0x100 (unreliable)
> [c000000004163790] [c0000000008fb46c] .ubsan_epilogue+0x10/0x70
> [c000000004163800] [c0000000008fb270]
> .__ubsan_handle_shift_out_of_bounds+0x1f0/0x34c
> [c000000004163910] [c000000000ad94a0] .of_unittest_untrack_overlay+0x6c/0xe0
> [c0000000041639a0] [c000000002098ff8] .of_unittest+0x4c50/0x59f8
> [c000000004163b60] [c000000000011b5c] .do_one_initcall+0x7c/0x4f0
> [c000000004163c50] [c00000000200300c] .kernel_init_freeable+0x704/0x858
> [c000000004163d90] [c000000000012730] .kernel_init+0x20/0x190
> [c000000004163e10] [c00000000000ce78] .ret_from_kernel_thread+0x58/0x60
> ================================================================================
> ### dt-test ### EXPECT \ : OF: overlay: WARNING: memory leak will occur if
> overlay removed, property: /testcase-data-2/substation@100/status
> [...]
>

Further comment in Bugzilla are:

---------- comment 1:

Erhard F. 2021-10-29 14:00:20 UTC

Created attachment 299363 [details]
kernel .config (kernel 5.15-rc7, Talos II)

# lspci
0000:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Turks XT [Radeon HD 6670/7670]
0000:01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Turks HDMI Audio [Radeon HD 6500/6600 / 6700M Series]
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation Device 5008 (rev 01)
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
0005:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)


---------- comment 2:

[reply] [−] Comment 2 Arnd Bergmann 2021-10-29 14:06:48 UTC

This is the function that triggers it:

static void of_unittest_untrack_overlay(int id)
{
if (overlay_first_id < 0)
return;
id -= overlay_first_id;
if (WARN_ON(id >= MAX_UNITTEST_OVERLAYS))
return;
overlay_id_bits[BIT_WORD(id)] &= ~BIT_MASK(id);
}

My guess is that 'id' is negative here, which means it fails to tigger the
WARN_ON() but ends up still being out of range.

Can you try changing it to 'unsigned int id'?


---------- More info from me, but I did not comment in bugzilla

line 1933 is the final line of of_unittest_untrack_overlay()
(see comment 2 for context):

1933 overlay_id_bits[BIT_WORD(id)] &= ~BIT_MASK(id);

2021-10-30 00:16:06

by Frank Rowand

[permalink] [raw]
Subject: Re: Fwd: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36

On 10/29/21 7:06 PM, Erhard F. wrote:
> (In reply to Arnd Bergmann from comment #2)
>> My guess is that 'id' is negative here, which means it fails to tigger the
>> WARN_ON() but ends up still being out of range.
>>
>> Can you try changing it to 'unsigned int id'?
> When I change it to static void of_unittest_untrack_overlay(unsigned int id) I get no UBSAN message but this warning:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1 at drivers/of/unittest.c:1931 .of_unittest_untrack_overlay+0x20/0xac
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII+ #2
> NIP: c000000000ad9368 LR: c000000002099000 CTR: 0000000000000000
> REGS: c000000004163700 TRAP: 0700 Not tainted (5.15.0-rc7-TalosII+)
> MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 44004242 XER: 20040000
> CFAR: c000000002098ffc IRQMASK: 0
> GPR00: c000000002098f74 c0000000041639a0 c0000000023b2f00 ffffffffffffffff
> GPR04: c000000002270b90 0000000000000000 c000000004163794 c0000000027744d8
> GPR08: 00000007fdaf2000 0000000000000001 0000000000000000 00000000ffffffff
> GPR12: 0000000024004242 c0000000031d5000 b086ed6156fceb64 c0000000022718d8
> GPR16: d5cb61e74edefbbf 0000000000000000 c0000000010840b0 c000000001084078
> GPR20: c000000000e0df58 c0000000021d6120 0000000000000000 c000200004d579f0
> GPR24: c0000000020ad0a0 c0000000022713c8 c0000000031c9110 0000000000000000
> GPR28: c000000000e0d110 c0000000031c9158 7bba9c880526c811 c0000000041639a0
> NIP [c000000000ad9368] .of_unittest_untrack_overlay+0x20/0xac
> LR [c000000002099000] .of_unittest+0x4c5c/0x5a04
> Call Trace:
> [c0000000041639a0] [c000000002098f74] .of_unittest+0x4bd0/0x5a04 (unreliable)
> [c000000004163b60] [c000000000011b5c] .do_one_initcall+0x7c/0x4f0
> [c000000004163c50] [c00000000200300c] .kernel_init_freeable+0x704/0x858
> [c000000004163d90] [c000000000012730] .kernel_init+0x20/0x190
> [c000000004163e10] [c00000000000ce78] .ret_from_kernel_thread+0x58/0x60
> Instruction dump:
> 60000000 60000000 38600000 4e800020 3d22ffec e929e4ca 2c090000 4d800020
> 7c691850 39200001 28030100 7d20481e <0b090000> 7c0802a6 fb81ffe0 fba1ffe8
> irq event stamp: 514764
> hardirqs last enabled at (514763): [<c0000000004398e4>] .__slab_free+0x394/0x590
> hardirqs last disabled at (514764): [<c0000000000312bc>] .interrupt_enter_prepare.constprop.0+0xec/0x150
> softirqs last enabled at (509346): [<c000000000d845cc>] .__do_softirq+0x4cc/0x714
> softirqs last disabled at (509339): [<c0000000000f6eb8>] .__irq_exit_rcu+0x148/0x1b0
> ---[ end trace 0c8618d488a1a13d ]---
>
>
> In unittest.c id generally seems to be derived from a constant and counted downwards in loops.
>
> [...]
> #define MAX_UNITTEST_OVERLAYS 256
> [...]
> static void of_unittest_destroy_tracked_overlays(void)
> {
> int id, ret, defers, ovcs_id;
>
> if (overlay_first_id < 0)
> return;
>
> /* try until no defers */
> do {
> defers = 0;
> /* remove in reverse order */
> for (id = MAX_UNITTEST_OVERLAYS - 1; id >= 0; id--) {
> if (!of_unittest_overlay_tracked(id))
> continue;
>
> ovcs_id = id + overlay_first_id;
> ret = of_overlay_remove(&ovcs_id);
> if (ret == -ENODEV) {
> pr_warn("%s: no overlay to destroy for #%d\n",
> __func__, id + overlay_first_id);
> continue;
> }
> if (ret != 0) {
> defers++;
> pr_warn("%s: overlay destroy failed for #%d\n",
> __func__, id + overlay_first_id);
> continue;
> }
>
> of_unittest_untrack_overlay(id);
> }
> } while (defers > 0);
> }
>
> It should not get negative with the "id >= 0" above in the for-loop. But from it's purpose it should be an unsigned int probably. It would need to be changed in the whole unittest.c though as it is used as int in several other places, e.g.
> static long of_unittest_overlay_tracked(int id)
> static void of_unittest_track_overlay(int id)
>

Thanks for digging deeper into this.

The overlay tracking implementation has annoyed me for a long time, because
it makes assumptions of the possible range of overlay id values (max is
hardcoded as 256) to allow a space efficient bitmap. I would prefer to
rewrite the overlay tracking functions to us a less space efficient data
structure that allows arbitrary overlay values, instead of 0..MAX_UNITTEST_OVERLAYS.

-Frank

2021-10-30 00:21:06

by Frank Rowand

[permalink] [raw]
Subject: Re: Fwd: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36

On 10/29/21 7:14 PM, Frank Rowand wrote:
> On 10/29/21 7:06 PM, Erhard F. wrote:
>> (In reply to Arnd Bergmann from comment #2)
>>> My guess is that 'id' is negative here, which means it fails to tigger the
>>> WARN_ON() but ends up still being out of range.
>>>
>>> Can you try changing it to 'unsigned int id'?
>> When I change it to static void of_unittest_untrack_overlay(unsigned int id) I get no UBSAN message but this warning:
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1 at drivers/of/unittest.c:1931 .of_unittest_untrack_overlay+0x20/0xac
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-rc7-TalosII+ #2
>> NIP: c000000000ad9368 LR: c000000002099000 CTR: 0000000000000000
>> REGS: c000000004163700 TRAP: 0700 Not tainted (5.15.0-rc7-TalosII+)
>> MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 44004242 XER: 20040000
>> CFAR: c000000002098ffc IRQMASK: 0
>> GPR00: c000000002098f74 c0000000041639a0 c0000000023b2f00 ffffffffffffffff
>> GPR04: c000000002270b90 0000000000000000 c000000004163794 c0000000027744d8
>> GPR08: 00000007fdaf2000 0000000000000001 0000000000000000 00000000ffffffff
>> GPR12: 0000000024004242 c0000000031d5000 b086ed6156fceb64 c0000000022718d8
>> GPR16: d5cb61e74edefbbf 0000000000000000 c0000000010840b0 c000000001084078
>> GPR20: c000000000e0df58 c0000000021d6120 0000000000000000 c000200004d579f0
>> GPR24: c0000000020ad0a0 c0000000022713c8 c0000000031c9110 0000000000000000
>> GPR28: c000000000e0d110 c0000000031c9158 7bba9c880526c811 c0000000041639a0
>> NIP [c000000000ad9368] .of_unittest_untrack_overlay+0x20/0xac
>> LR [c000000002099000] .of_unittest+0x4c5c/0x5a04
>> Call Trace:
>> [c0000000041639a0] [c000000002098f74] .of_unittest+0x4bd0/0x5a04 (unreliable)
>> [c000000004163b60] [c000000000011b5c] .do_one_initcall+0x7c/0x4f0
>> [c000000004163c50] [c00000000200300c] .kernel_init_freeable+0x704/0x858
>> [c000000004163d90] [c000000000012730] .kernel_init+0x20/0x190
>> [c000000004163e10] [c00000000000ce78] .ret_from_kernel_thread+0x58/0x60
>> Instruction dump:
>> 60000000 60000000 38600000 4e800020 3d22ffec e929e4ca 2c090000 4d800020
>> 7c691850 39200001 28030100 7d20481e <0b090000> 7c0802a6 fb81ffe0 fba1ffe8
>> irq event stamp: 514764
>> hardirqs last enabled at (514763): [<c0000000004398e4>] .__slab_free+0x394/0x590
>> hardirqs last disabled at (514764): [<c0000000000312bc>] .interrupt_enter_prepare.constprop.0+0xec/0x150
>> softirqs last enabled at (509346): [<c000000000d845cc>] .__do_softirq+0x4cc/0x714
>> softirqs last disabled at (509339): [<c0000000000f6eb8>] .__irq_exit_rcu+0x148/0x1b0
>> ---[ end trace 0c8618d488a1a13d ]---
>>
>>
>> In unittest.c id generally seems to be derived from a constant and counted downwards in loops.
>>
>> [...]
>> #define MAX_UNITTEST_OVERLAYS 256
>> [...]
>> static void of_unittest_destroy_tracked_overlays(void)
>> {
>> int id, ret, defers, ovcs_id;
>>
>> if (overlay_first_id < 0)
>> return;
>>
>> /* try until no defers */
>> do {
>> defers = 0;
>> /* remove in reverse order */
>> for (id = MAX_UNITTEST_OVERLAYS - 1; id >= 0; id--) {
>> if (!of_unittest_overlay_tracked(id))
>> continue;
>>
>> ovcs_id = id + overlay_first_id;
>> ret = of_overlay_remove(&ovcs_id);
>> if (ret == -ENODEV) {
>> pr_warn("%s: no overlay to destroy for #%d\n",
>> __func__, id + overlay_first_id);
>> continue;
>> }
>> if (ret != 0) {
>> defers++;
>> pr_warn("%s: overlay destroy failed for #%d\n",
>> __func__, id + overlay_first_id);
>> continue;
>> }
>>
>> of_unittest_untrack_overlay(id);
>> }
>> } while (defers > 0);
>> }
>>
>> It should not get negative with the "id >= 0" above in the for-loop. But from it's purpose it should be an unsigned int probably. It would need to be changed in the whole unittest.c though as it is used as int in several other places, e.g.
>> static long of_unittest_overlay_tracked(int id)
>> static void of_unittest_track_overlay(int id)
>>
>
> Thanks for digging deeper into this.
>
> The overlay tracking implementation has annoyed me for a long time, because
> it makes assumptions of the possible range of overlay id values (max is
> hardcoded as 256) to allow a space efficient bitmap. I would prefer to
> rewrite the overlay tracking functions to us a less space efficient data
> structure that allows arbitrary overlay values, instead of 0..MAX_UNITTEST_OVERLAYS.

Please be gentle and ignore any off by 1 issues with my characterization of
MAX_UNITTEST_OVERLAYS and maximum overlay id value.

>
> -Frank
>

2021-12-21 11:48:26

by Yin Xiujiang

[permalink] [raw]
Subject: Re: Fwd: [Bug 214867] New: UBSAN: shift-out-of-bounds in drivers/of/unittest.c:1933:36

I think if you use unsigned, then 'id -= overlay_first_id' will not be
less than 0 but will be a very large value and will be harder to judge.