2021-11-13 20:41:52

by Daniel Scally

Subject: [PATCH] device property: Check fwnode->secondary when finding properties

fwnode_property_get_reference_args() searches for named properties
against a fwnode_handle, but these could instead be against the fwnode's
secondary. If the property isn't found against the primary, check the
secondary to see if it's there instead.

Reviewed-by: Andy Shevchenko <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Daniel Scally <[email protected]>
---
drivers/base/property.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/base/property.c b/drivers/base/property.c
index f1f35b48ab8b..7bac12b32fcb 100644
--- a/drivers/base/property.c
+++ b/drivers/base/property.c
@@ -478,8 +478,16 @@ int fwnode_property_get_reference_args(const struct fwnode_handle *fwnode,
 				       unsigned int nargs, unsigned int index,
 				       struct fwnode_reference_args *args)
 {
-	return fwnode_call_int_op(fwnode, get_reference_args, prop, nargs_prop,
-				  nargs, index, args);
+	int ret;
+
+	ret = fwnode_call_int_op(fwnode, get_reference_args, prop, nargs_prop,
+				 nargs, index, args);
+
+	if (ret < 0 && !IS_ERR_OR_NULL(fwnode->secondary))
+		ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
+					 prop, nargs_prop, nargs, index, args);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(fwnode_property_get_reference_args);

--
2.25.1



2021-11-16 07:41:31

by kernel test robot

Subject: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
patch link: https://lore.kernel.org/lkml/[email protected]

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+---------------------------------------------+------------+------------+
|                                             | b5013d084e | 995fe757ec |
+---------------------------------------------+------------+------------+
| boot_successes                              | 23         | 0          |
| boot_failures                               | 0          | 22         |
| BUG:kernel_NULL_pointer_dereference,address | 0          | 22         |
| Oops:#[##]                                  | 0          | 22         |
| EIP:fwnode_property_get_reference_args      | 0          | 22         |
| Kernel_panic-not_syncing:Fatal_exception    | 0          | 22         |
+---------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 17.327851][ T7] BUG: kernel NULL pointer dereference, address: 00000000
[ 17.329758][ T7] #PF: supervisor read access in kernel mode
[ 17.331371][ T7] #PF: error_code(0x0000) - not-present page
[ 17.332992][ T7] *pde = 00000000
[ 17.334107][ T7] Oops: 0000 [#1] PREEMPT
[ 17.335310][ T7] CPU: 0 PID: 7 Comm: kworker/u2:0 Tainted: G S 5.15.0-11191-g995fe757ecae #1
[ 17.338036][ T7] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 17.340544][ T7] Workqueue: events_unbound deferred_probe_work_func
[ 17.342291][ T7] EIP: fwnode_property_get_reference_args (drivers/base/property.c:486 (discriminator 1))
[ 17.344051][ T7] Code: 8b 45 0c 50 8b 45 08 50 89 d8 89 55 f4 ff d6 83 c4 0c 89 c6 85 c0 78 55 8d 65 f8 89 f0 5b 5e 5d c3 8d 74 26 00 be fa ff ff ff <8b> 03 85 c0 74 e8 3d 00 f0 ff ff 77 e1 8b 58 04 85 db 74 37 8b 5b
All code
========
0: 8b 45 0c mov 0xc(%rbp),%eax
3: 50 push %rax
4: 8b 45 08 mov 0x8(%rbp),%eax
7: 50 push %rax
8: 89 d8 mov %ebx,%eax
a: 89 55 f4 mov %edx,-0xc(%rbp)
d: ff d6 callq *%rsi
f: 83 c4 0c add $0xc,%esp
12: 89 c6 mov %eax,%esi
14: 85 c0 test %eax,%eax
16: 78 55 js 0x6d
18: 8d 65 f8 lea -0x8(%rbp),%esp
1b: 89 f0 mov %esi,%eax
1d: 5b pop %rbx
1e: 5e pop %rsi
1f: 5d pop %rbp
20: c3 retq
21: 8d 74 26 00 lea 0x0(%rsi,%riz,1),%esi
25: be fa ff ff ff mov $0xfffffffa,%esi
2a:* 8b 03 mov (%rbx),%eax <-- trapping instruction
2c: 85 c0 test %eax,%eax
2e: 74 e8 je 0x18
30: 3d 00 f0 ff ff cmp $0xfffff000,%eax
35: 77 e1 ja 0x18
37: 8b 58 04 mov 0x4(%rax),%ebx
3a: 85 db test %ebx,%ebx
3c: 74 37 je 0x75
3e: 8b .byte 0x8b
3f: 5b pop %rbx

Code starting with the faulting instruction
===========================================
0: 8b 03 mov (%rbx),%eax
2: 85 c0 test %eax,%eax
4: 74 e8 je 0xffffffffffffffee
6: 3d 00 f0 ff ff cmp $0xfffff000,%eax
b: 77 e1 ja 0xffffffffffffffee
d: 8b 58 04 mov 0x4(%rax),%ebx
10: 85 db test %ebx,%ebx
12: 74 37 je 0x4b
14: 8b .byte 0x8b
15: 5b pop %rbx
[ 17.350847][ T7] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: c37cd6d8
[ 17.352783][ T7] ESI: ffffffea EDI: f5b5a400 EBP: c4cffd24 ESP: c4cffd14
[ 17.354673][ T7] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010246
[ 17.362075][ T7] CR0: 80050033 CR2: 00000000 CR3: 04206000 CR4: 00000690
[ 17.363993][ T7] Call Trace:
[ 17.365018][ T7] fwnode_find_reference (drivers/base/property.c:514)
[ 17.366430][ T7] ? __this_cpu_preempt_check (lib/smp_processor_id.c:67)
[ 17.367825][ T7] ? lockdep_init_map_type (kernel/locking/lockdep.c:4813)
[ 17.369325][ T7] ? phylink_run_resolve+0x20/0x20
[ 17.370897][ T7] ? init_timer_key (kernel/time/timer.c:818)
[ 17.372228][ T7] fwnode_get_phy_node (drivers/net/phy/phy_device.c:2986)
[ 17.373574][ T7] phylink_fwnode_phy_connect (drivers/net/phy/phylink.c:1180 drivers/net/phy/phylink.c:1166)
[ 17.375014][ T7] phylink_of_phy_connect (drivers/net/phy/phylink.c:1152)
[ 17.376373][ T7] dsa_slave_create (net/dsa/slave.c:1889 net/dsa/slave.c:2036)
[ 17.377765][ T7] dsa_tree_setup_switches (net/dsa/dsa2.c:477 net/dsa/dsa2.c:977)
[ 17.379282][ T7] dsa_register_switch (net/dsa/dsa2.c:1065 net/dsa/dsa2.c:1565 net/dsa/dsa2.c:1579)
[ 17.380762][ T7] dsa_loop_drv_probe (drivers/net/dsa/dsa_loop.c:333)
[ 17.382137][ T7] mdio_probe (drivers/net/phy/mdio_device.c:157)
[ 17.383362][ T7] really_probe (drivers/base/dd.c:744)
[ 17.384808][ T7] really_probe (drivers/base/dd.c:678)
[ 17.385999][ T7] __driver_probe_device (drivers/base/dd.c:751)
[ 17.387393][ T7] driver_probe_device (drivers/base/dd.c:781)
[ 17.388787][ T7] __device_attach_driver (drivers/base/dd.c:899)
[ 17.390253][ T7] ? driver_allows_async_probing (drivers/base/dd.c:867)
[ 17.391829][ T7] bus_for_each_drv (drivers/base/bus.c:427)
[ 17.393226][ T7] __device_attach (drivers/base/dd.c:969)
[ 17.394610][ T7] ? driver_allows_async_probing (drivers/base/dd.c:867)
[ 17.396270][ T7] device_initial_probe (drivers/base/dd.c:1017)
[ 17.397637][ T7] bus_probe_device (drivers/base/bus.c:487)
[ 17.398907][ T7] deferred_probe_work_func (drivers/base/dd.c:123)
[ 17.400385][ T7] process_one_work (arch/x86/include/asm/jump_label.h:41 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2303)
[ 17.401834][ T7] worker_thread (include/linux/list.h:282 kernel/workqueue.c:2358 kernel/workqueue.c:2450)
[ 17.403204][ T7] kthread (kernel/kthread.c:327)
[ 17.404417][ T7] ? process_one_work (kernel/workqueue.c:2388)
[ 17.405766][ T7] ? set_kthread_struct (kernel/kthread.c:272)
[ 17.407168][ T7] ret_from_fork (arch/x86/entry/entry_32.S:775)
[ 17.408421][ T7] Modules linked in:
[ 17.409637][ T7] CR2: 0000000000000000
[ 17.410743][ T7] ---[ end trace f8ecb8c3f56e69be ]---
[ 17.412229][ T7] EIP: fwnode_property_get_reference_args (drivers/base/property.c:486 (discriminator 1))
[ 17.414104][ T7] Code: 8b 45 0c 50 8b 45 08 50 89 d8 89 55 f4 ff d6 83 c4 0c 89 c6 85 c0 78 55 8d 65 f8 89 f0 5b 5e 5d c3 8d 74 26 00 be fa ff ff ff <8b> 03 85 c0 74 e8 3d 00 f0 ff ff 77 e1 8b 58 04 85 db 74 37 8b 5b
All code
========
0: 8b 45 0c mov 0xc(%rbp),%eax
3: 50 push %rax
4: 8b 45 08 mov 0x8(%rbp),%eax
7: 50 push %rax
8: 89 d8 mov %ebx,%eax
a: 89 55 f4 mov %edx,-0xc(%rbp)
d: ff d6 callq *%rsi
f: 83 c4 0c add $0xc,%esp
12: 89 c6 mov %eax,%esi
14: 85 c0 test %eax,%eax
16: 78 55 js 0x6d
18: 8d 65 f8 lea -0x8(%rbp),%esp
1b: 89 f0 mov %esi,%eax
1d: 5b pop %rbx
1e: 5e pop %rsi
1f: 5d pop %rbp
20: c3 retq
21: 8d 74 26 00 lea 0x0(%rsi,%riz,1),%esi
25: be fa ff ff ff mov $0xfffffffa,%esi
2a:* 8b 03 mov (%rbx),%eax <-- trapping instruction
2c: 85 c0 test %eax,%eax
2e: 74 e8 je 0x18
30: 3d 00 f0 ff ff cmp $0xfffff000,%eax
35: 77 e1 ja 0x18
37: 8b 58 04 mov 0x4(%rax),%ebx
3a: 85 db test %ebx,%ebx
3c: 74 37 je 0x75
3e: 8b .byte 0x8b
3f: 5b pop %rbx

Code starting with the faulting instruction
===========================================
0: 8b 03 mov (%rbx),%eax
2: 85 c0 test %eax,%eax
4: 74 e8 je 0xffffffffffffffee
6: 3d 00 f0 ff ff cmp $0xfffff000,%eax
b: 77 e1 ja 0xffffffffffffffee
d: 8b 58 04 mov 0x4(%rax),%ebx
10: 85 db test %ebx,%ebx
12: 74 37 je 0x4b
14: 8b .byte 0x8b
15: 5b pop %rbx


To reproduce:

# build kernel
cd linux
cp config-5.15.0-11191-g995fe757ecae .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (9.79 kB)
config-5.15.0-11191-g995fe757ecae (154.76 kB)
job-script (4.77 kB)
dmesg.xz (21.50 kB)

2021-11-16 14:55:07

by Hans de Goede

Subject: Re: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address

Hi,

On 11/16/21 08:41, kernel test robot wrote:
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
> url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
> patch link: https://lore.kernel.org/lkml/[email protected]
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +---------------------------------------------+------------+------------+
> |                                             | b5013d084e | 995fe757ec |
> +---------------------------------------------+------------+------------+
> | boot_successes                              | 23         | 0          |
> | boot_failures                               | 0          | 22         |
> | BUG:kernel_NULL_pointer_dereference,address | 0          | 22         |
> | Oops:#[##]                                  | 0          | 22         |
> | EIP:fwnode_property_get_reference_args      | 0          | 22         |
> | Kernel_panic-not_syncing:Fatal_exception    | 0          | 22         |
> +---------------------------------------------+------------+------------+
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>

Ok, so this patch likely needs a v2 which changes the if to this:

	if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
	    !IS_ERR_OR_NULL(fwnode->secondary))
		ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
					 prop, nargs_prop, nargs, index, args);


This checks fwnode before dereferencing it. Note that it also changes the
(ret < 0) check to (ret == -EINVAL), which makes the secondary node handling
identical to fwnode_property_read_int_array() and
fwnode_property_read_string_array().
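
To make the ordering of the guards concrete, here is a minimal userspace sketch rather than kernel code. struct mock_fwnode and every mock_* name are hypothetical stand-ins, and mock_call_op() only approximates my understanding of fwnode_call_int_op(): it is assumed to return an error for a NULL node instead of dereferencing it, which is why the crash can only come from the later fwnode->secondary access.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct fwnode_handle. */
struct mock_fwnode {
	struct mock_fwnode *secondary;
	int (*get_reference_args)(const struct mock_fwnode *node,
				  const char *prop);
};

/* Example callbacks: one finds the property, one reports it as missing. */
static int mock_found(const struct mock_fwnode *node, const char *prop)
{
	(void)node;
	(void)prop;
	return 0;
}

static int mock_missing(const struct mock_fwnode *node, const char *prop)
{
	(void)node;
	(void)prop;
	return -EINVAL;
}

/*
 * Assumed behaviour of the dispatch helper: -EINVAL for a NULL node,
 * -ENXIO when the node has no such callback, otherwise the callback result.
 */
static int mock_call_op(const struct mock_fwnode *node, const char *prop)
{
	if (!node)
		return -EINVAL;
	if (!node->get_reference_args)
		return -ENXIO;
	return node->get_reference_args(node, prop);
}

/* The proposed shape: test the node itself before touching node->secondary. */
static int mock_get_reference_args(const struct mock_fwnode *node,
				   const char *prop)
{
	int ret = mock_call_op(node, prop);

	if (ret == -EINVAL && node && node->secondary)
		ret = mock_call_op(node->secondary, prop);

	return ret;
}
```

With a NULL node the first call already yields -EINVAL, so without the node check the fallback condition would go on to read node->secondary through a NULL pointer. In this sketch the extra test short-circuits first and the error is simply returned.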

Danny, can you send a v2 with this change please?

Regards,

Hans








2021-11-16 17:04:08

by Andy Shevchenko

Subject: Re: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address

On Tue, Nov 16, 2021 at 03:55:00PM +0100, Hans de Goede wrote:
> On 11/16/21 08:41, kernel test robot wrote:

> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
> > url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
> > base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
> > patch link: https://lore.kernel.org/lkml/[email protected]
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> > +---------------------------------------------+------------+------------+
> > |                                             | b5013d084e | 995fe757ec |
> > +---------------------------------------------+------------+------------+
> > | boot_successes                              | 23         | 0          |
> > | boot_failures                               | 0          | 22         |
> > | BUG:kernel_NULL_pointer_dereference,address | 0          | 22         |
> > | Oops:#[##]                                  | 0          | 22         |
> > | EIP:fwnode_property_get_reference_args      | 0          | 22         |
> > | Kernel_panic-not_syncing:Fatal_exception    | 0          | 22         |
> > +---------------------------------------------+------------+------------+
> >
> >
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot <[email protected]>
>
> Ok, so this patch likely needs a v2 which changes the if to this:
>
> 	if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
> 	    !IS_ERR_OR_NULL(fwnode->secondary))
> 		ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
> 					 prop, nargs_prop, nargs, index, args);
>
>
> This checks fwnode before dereferencing it. Note that it also changes the
> (ret < 0) check to (ret == -EINVAL), which makes the secondary node handling
> identical to fwnode_property_read_int_array() and
> fwnode_property_read_string_array().
>
> Danny, can you send a v2 with this change please?

Hmm... So, you are suggesting that we need to check it only for EINVAL and
ENOENT in this case the one that brings us to the NULL pointer dereference.
But I don't understand what's the difference here.


--
With Best Regards,
Andy Shevchenko



2021-11-17 00:12:47

by Daniel Scally

Subject: Re: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address

Hi Hans, Andy

On 16/11/2021 16:59, Andy Shevchenko wrote:
> On Tue, Nov 16, 2021 at 03:55:00PM +0100, Hans de Goede wrote:
>> On 11/16/21 08:41, kernel test robot wrote:
>>> FYI, we noticed the following commit (built with gcc-9):
>>>
>>> commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
>>> url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
>>> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
>>> patch link: https://lore.kernel.org/lkml/[email protected]
>>>
>>> in testcase: boot
>>>
>>> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>>>
>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>
>>>
>>> +---------------------------------------------+------------+------------+
>>> |                                             | b5013d084e | 995fe757ec |
>>> +---------------------------------------------+------------+------------+
>>> | boot_successes                              | 23         | 0          |
>>> | boot_failures                               | 0          | 22         |
>>> | BUG:kernel_NULL_pointer_dereference,address | 0          | 22         |
>>> | Oops:#[##]                                  | 0          | 22         |
>>> | EIP:fwnode_property_get_reference_args      | 0          | 22         |
>>> | Kernel_panic-not_syncing:Fatal_exception    | 0          | 22         |
>>> +---------------------------------------------+------------+------------+
>>>
>>>
>>> If you fix the issue, kindly add following tag
>>> Reported-by: kernel test robot <[email protected]>
>> Ok, so this patch likely needs a v2 which changes the if to this:
>>
>> 	if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
>> 	    !IS_ERR_OR_NULL(fwnode->secondary))
>> 		ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
>> 					 prop, nargs_prop, nargs, index, args);
>>
>>
>> This checks fwnode before dereferencing it. Note that it also changes the
>> (ret < 0) check to (ret == -EINVAL), which makes the secondary node handling
>> identical to fwnode_property_read_int_array() and
>> fwnode_property_read_string_array().
>>
>> Danny, can you send a v2 with this change please?
> Hmm... So, you are suggesting that we need to check it only for EINVAL and
> ENOENT in this case the one that brings us to the NULL pointer dereference.
> But I don't understand what's the difference here.


Sticking point; the ACPI version of .get_reference_args() returns
-ENOENT (converted from -EINVAL [1]) if the property you ask for doesn't
exist against that fwnode, which unless I'm missing something means this
won't work in our use case. This confused me for a while because we
definitely call fwnode_property_read_int_array() in sensor driver probes
through v4l2_fwnode_endpoint_alloc_parse(), but it turns out the ACPI
version of _that_ operation has no matching conversion of the error
code, so when that fails to find the property it sends back -EINVAL and
so the form that exists in fwnode_property_read_int_array() currently
works fine.


We could align them all to if (ret < 0 && !IS_ERR_OR_NULL(fwnode) &&
!IS_ERR_OR_NULL(fwnode->secondary)). This is probably my preferred
option, because I can't really see why we'd only want to do the
secondary check on -EINVAL anyway - but maybe I'm missing something here.
Alternatively we can take Hans' suggestion so they all match the existing
code, but this means we have to handle that conversion first. I
couldn't see from a cursory look that any of the direct callers check
the value of the return beyond "is it 0?", but of course it could be
done somewhere in calls to the fwnode->ops->get_reference_args()
callback instead.
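
The difference between the two conditions can be seen in a small userspace sketch. Everything named mock_* or lookup_* here is hypothetical and not kernel API. The only behaviour borrowed from the discussion above is that the ACPI-style callback reports a missing property as -ENOENT.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct fwnode_handle. */
struct mock_fwnode {
	struct mock_fwnode *secondary;
	int (*get_reference_args)(const struct mock_fwnode *node);
};

/* Models the ACPI-style primary: property absent, reported as -ENOENT. */
static int mock_acpi_missing(const struct mock_fwnode *node)
{
	(void)node;
	return -ENOENT;
}

/* Models a secondary node that does carry the property. */
static int mock_secondary_found(const struct mock_fwnode *node)
{
	(void)node;
	return 0;
}

/* Simplified dispatch: any unusable node is reported as -EINVAL here. */
static int mock_call(const struct mock_fwnode *node)
{
	if (!node || !node->get_reference_args)
		return -EINVAL;
	return node->get_reference_args(node);
}

/* Falls back to the secondary only when the primary returned -EINVAL. */
static int lookup_einval_only(const struct mock_fwnode *node)
{
	int ret = mock_call(node);

	if (ret == -EINVAL && node && node->secondary)
		ret = mock_call(node->secondary);
	return ret;
}

/* Falls back to the secondary on any negative return. */
static int lookup_any_error(const struct mock_fwnode *node)
{
	int ret = mock_call(node);

	if (ret < 0 && node && node->secondary)
		ret = mock_call(node->secondary);
	return ret;
}
```

With a primary that answers -ENOENT and a secondary that holds the property, lookup_einval_only() never consults the secondary and returns -ENOENT, whereas lookup_any_error() finds the property. Both variants return an error for a NULL node without dereferencing it.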


Thoughts?


[1]
https://elixir.bootlin.com/linux/latest/source/drivers/acpi/property.c#L680


2021-11-17 11:54:58

by Hans de Goede

Subject: Re: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address

Hi,

On 11/17/21 01:10, Daniel Scally wrote:
> Hi Hans, Andy
>
> On 16/11/2021 16:59, Andy Shevchenko wrote:
>> On Tue, Nov 16, 2021 at 03:55:00PM +0100, Hans de Goede wrote:
>>> On 11/16/21 08:41, kernel test robot wrote:
>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>
>>>> commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
>>>> url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
>>>> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
>>>> patch link: https://lore.kernel.org/lkml/[email protected]
>>>>
>>>> in testcase: boot
>>>>
>>>> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>>>>
>>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>>
>>>>
>>>> +---------------------------------------------+------------+------------+
>>>> | | b5013d084e | 995fe757ec |
>>>> +---------------------------------------------+------------+------------+
>>>> | boot_successes | 23 | 0 |
>>>> | boot_failures | 0 | 22 |
>>>> | BUG:kernel_NULL_pointer_dereference,address | 0 | 22 |
>>>> | Oops:#[##] | 0 | 22 |
>>>> | EIP:fwnode_property_get_reference_args | 0 | 22 |
>>>> | Kernel_panic-not_syncing:Fatal_exception | 0 | 22 |
>>>> +---------------------------------------------+------------+------------+
>>>>
>>>>
>>>> If you fix the issue, kindly add following tag
>>>> Reported-by: kernel test robot <[email protected]>
>>> Ok, so this patch likely needs a v2 which changes the if to this:
>>>
>>> if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
>>> !IS_ERR_OR_NULL(fwnode->secondary))
>>> ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
>>> prop, nargs_prop, nargs, index, args);
>>>
>>>
>>> So that we check fwnode before dereferencing it, note this also changes the
>>> (ret < 0) check to (ret == -EINVAL), this makes the secondary node handling
>>> identical to fwnode_property_read_int_array() and
>>> fwnode_property_read_string_array()
>>>
>>> Danny, can you send a v2 with this change please?
>> Hmm... So, you are suggesting that we need to check it only for EINVAL and
>> ENOENT in this case the one that brings us to the NULL pointer dereference.
>> But I don't understand what's the difference here.
>
>
> Sticking point; the ACPI version of .get_reference_args() returns
> -ENOENT (converted from -EINVAL [1]) if the property you ask for doesn't
> exist against that fwnode, which unless I'm missing something means this
> won't work in our use case. This confused me for a while because we
> definitely call fwnode_property_read_int_array() in sensor driver probes
> through v4l2_fwnode_endpoint_alloc_parse(), but it turns out the ACPI
> version of _that_ operation has no matching conversion of the error
> code, so when that fails to find the property it sends back -EINVAL and
> so the form that exists in fwnode_property_read_int_array() currently
> works fine.
>
>
> We could align them all to if (ret < 0 && !IS_ERR_OR_NULL(fwnode) &&
> !IS_ERR_OR_NULL(fwnode->secondary)). This is probably my preferred
> option, because I can't really see why we'd only want to do the
> secondary check on -EINVAL anyway - but maybe I'm missing something here.
> Alternatively we can take Hans's suggestion so they all match the existing
> code, but this means we have to handle that conversion first - I
> couldn't see from a cursory look that any of the direct callers check
> the value of the return beyond "is it 0?", but of course it could be
> done somewhere in calls to the fwnode->ops->get_reference_args()
> callback instead.
>
>
> Thoughts?

I missed that just checking for -EINVAL will not work for the ipu3 case
(I did not test). In that case I think using "ret < 0" as the check
is probably fine for this patch.

As for modifying the existing 2 code paths, IMHO it does make sense
to try and preserve the error code (and not try the secondary fwnode)
when the error is something other than the one indicating that the
property is not there.

So keeping those as -EINVAL is probably best, and maybe for
fwnode_find_reference, instead of (ret < 0), use:
(ret == -EINVAL || ret == -ENOENT) ?
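
To make the two conditions concrete, here is a rough userspace sketch of
the fallback; it is not kernel code and not the v2 patch. The struct, the
get_ref callback, mock_call_op() and the op_* helpers are made up for
illustration, and mock_call_op() only assumes that fwnode_call_int_op()
rejects a NULL handle with -EINVAL and a missing op with -ENXIO.

```c
/* Userspace mock of the fallback; none of these definitions are the kernel's own. */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095
/* NULL or an errno-encoded pointer, in the spirit of the kernel's IS_ERR_OR_NULL(). */
#define IS_ERR_OR_NULL(p) (!(p) || (uintptr_t)(p) >= (uintptr_t)-MAX_ERRNO)

struct fwnode_handle {
	struct fwnode_handle *secondary;
	/* Hypothetical stand-in for ops->get_reference_args(). */
	int (*get_ref)(const struct fwnode_handle *fwnode, const char *prop);
};

/* Assumed to behave like fwnode_call_int_op(): reject a bad handle before using it. */
int mock_call_op(const struct fwnode_handle *fwnode, const char *prop)
{
	if (IS_ERR_OR_NULL(fwnode))
		return -EINVAL;
	if (!fwnode->get_ref)
		return -ENXIO;
	return fwnode->get_ref(fwnode, prop);
}

int mock_get_reference_args(const struct fwnode_handle *fwnode, const char *prop)
{
	int ret = mock_call_op(fwnode, prop);

	/* Keep any error that does not mean "property not there". */
	if (ret != -EINVAL && ret != -ENOENT)
		return ret;
	/* Check fwnode itself before reading fwnode->secondary. */
	if (IS_ERR_OR_NULL(fwnode) || IS_ERR_OR_NULL(fwnode->secondary))
		return ret;
	return mock_call_op(fwnode->secondary, prop);
}

/* Mock callbacks for trying the different paths. */
int op_found(const struct fwnode_handle *f, const char *p)
{
	(void)f; (void)p;
	return 0;
}

int op_missing(const struct fwnode_handle *f, const char *p)
{
	(void)f; (void)p;
	return -ENOENT;
}

int op_io_error(const struct fwnode_handle *f, const char *p)
{
	(void)f; (void)p;
	return -EIO;
}
```

With a primary whose callback is op_missing and a secondary whose callback
is op_found, the lookup falls through to the secondary; with op_io_error on
the primary the -EIO is returned untouched; a NULL handle never gets
dereferenced.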

Regards,

Hans




2021-11-17 12:38:44

by Andy Shevchenko

Subject: Re: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address

Just realized we are discussing this w/o Sakari involved.

On Wed, Nov 17, 2021 at 12:54:51PM +0100, Hans de Goede wrote:
> On 11/17/21 01:10, Daniel Scally wrote:
> > On 16/11/2021 16:59, Andy Shevchenko wrote:
> >> On Tue, Nov 16, 2021 at 03:55:00PM +0100, Hans de Goede wrote:
> >>> On 11/16/21 08:41, kernel test robot wrote:
> >>>> FYI, we noticed the following commit (built with gcc-9):
> >>>>
> >>>> commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
> >>>> url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
> >>>> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
> >>>> patch link: https://lore.kernel.org/lkml/[email protected]
> >>>>
> >>>> in testcase: boot
> >>>>
> >>>> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> >>>>
> >>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >>>>
> >>>>
> >>>> +---------------------------------------------+------------+------------+
> >>>> | | b5013d084e | 995fe757ec |
> >>>> +---------------------------------------------+------------+------------+
> >>>> | boot_successes | 23 | 0 |
> >>>> | boot_failures | 0 | 22 |
> >>>> | BUG:kernel_NULL_pointer_dereference,address | 0 | 22 |
> >>>> | Oops:#[##] | 0 | 22 |
> >>>> | EIP:fwnode_property_get_reference_args | 0 | 22 |
> >>>> | Kernel_panic-not_syncing:Fatal_exception | 0 | 22 |
> >>>> +---------------------------------------------+------------+------------+
> >>>>
> >>>>
> >>>> If you fix the issue, kindly add following tag
> >>>> Reported-by: kernel test robot <[email protected]>
> >>> Ok, so this patch likely needs a v2 which changes the if to this:
> >>>
> >>> if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
> >>> !IS_ERR_OR_NULL(fwnode->secondary))
> >>> ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
> >>> prop, nargs_prop, nargs, index, args);
> >>>
> >>>
> >>> So that we check fwnode before dereferencing it, note this also changes the
> >>> (ret < 0) check to (ret == -EINVAL), this makes the secondary node handling
> >>> identical to fwnode_property_read_int_array() and
> >>> fwnode_property_read_string_array()
> >>>
> >>> Danny, can you send a v2 with this change please?
> >> Hmm... So, you are suggesting that we need to check it only for EINVAL and
> >> ENOENT in this case the one that brings us to the NULL pointer dereference.
> >> But I don't understand what's the difference here.
> >
> >
> > Sticking point; the ACPI version of .get_reference_args() returns
> > -ENOENT (converted from -EINVAL [1]) if the property you ask for doesn't
> > exist against that fwnode, which unless I'm missing something means this
> > won't work in our use case. This confused me for a while because we
> > definitely call fwnode_property_read_int_array() in sensor driver probes
> > through v4l2_fwnode_endpoint_alloc_parse(), but it turns out the ACPI
> > version of _that_ operation has no matching conversion of the error
> > code, so when that fails to find the property it sends back -EINVAL and
> > so the form that exists in fwnode_property_read_int_array() currently
> > works fine.
> >
> >
> > We could align them all to if (ret < 0 && !IS_ERR_OR_NULL(fwnode) &&
> > !IS_ERR_OR_NULL(fwnode->secondary)). This is probably my preferred
> > option, because I can't really see why we'd only want to do the
> > secondary check on -EINVAL anyway - but maybe I'm missing something here.
> > Alternatively we can take Hans's suggestion so they all match the existing
> > code, but this means we have to handle that conversion first - I
> > couldn't see from a cursory look that any of the direct callers check
> > the value of the return beyond "is it 0?", but of course it could be
> > done somewhere in calls to the fwnode->ops->get_reference_args()
> > callback instead.
> >
> >
> > Thoughts?
>
> I missed that just checking for -EINVAL will not work for the ipu3 case
> (I did not test). In that case I think using "ret < 0" as the check
> is probably fine for this patch.
>
> As for modifying the existing 2 code paths, IMHO it does make sense
> to try and preserve the error code (and not try the secondary fwnode)
> when the error is something other than the one indicating that the
> property is not there.
>
> So keeping those as -EINVAL is probably best, and maybe for
> fwnode_find_reference, instead of (ret < 0), use:
> (ret == -EINVAL || ret == -ENOENT) ?


Last time Sakari did a great job of error code alignments between DT, ACPI,
and SW nodes. Not sure why the above slipped through the fingers.


--
With Best Regards,
Andy Shevchenko



2021-11-19 21:47:44

by Daniel Scally

Subject: Re: [device property] 995fe757ec: BUG:kernel_NULL_pointer_dereference,address

Hi Folks

On 17/11/2021 12:38, Andy Shevchenko wrote:
> Just realized we are discussing this w/o Sakari involved.
>
> On Wed, Nov 17, 2021 at 12:54:51PM +0100, Hans de Goede wrote:
>> On 11/17/21 01:10, Daniel Scally wrote:
>>> On 16/11/2021 16:59, Andy Shevchenko wrote:
>>>> On Tue, Nov 16, 2021 at 03:55:00PM +0100, Hans de Goede wrote:
>>>>> On 11/16/21 08:41, kernel test robot wrote:
>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>
>>>>>> commit: 995fe757ecaeac44e023458af64d27655f9dbf73 ("[PATCH] device property: Check fwnode->secondary when finding properties")
>>>>>> url: https://github.com/0day-ci/linux/commits/Daniel-Scally/device-property-Check-fwnode-secondary-when-finding-properties/20211114-044259
>>>>>> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git b5013d084e03e82ceeab4db8ae8ceeaebe76b0eb
>>>>>> patch link: https://lore.kernel.org/lkml/[email protected]
>>>>>>
>>>>>> in testcase: boot
>>>>>>
>>>>>> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>>>>>>
>>>>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>>>>
>>>>>>
>>>>>> +---------------------------------------------+------------+------------+
>>>>>> | | b5013d084e | 995fe757ec |
>>>>>> +---------------------------------------------+------------+------------+
>>>>>> | boot_successes | 23 | 0 |
>>>>>> | boot_failures | 0 | 22 |
>>>>>> | BUG:kernel_NULL_pointer_dereference,address | 0 | 22 |
>>>>>> | Oops:#[##] | 0 | 22 |
>>>>>> | EIP:fwnode_property_get_reference_args | 0 | 22 |
>>>>>> | Kernel_panic-not_syncing:Fatal_exception | 0 | 22 |
>>>>>> +---------------------------------------------+------------+------------+
>>>>>>
>>>>>>
>>>>>> If you fix the issue, kindly add following tag
>>>>>> Reported-by: kernel test robot <[email protected]>
>>>>> Ok, so this patch likely needs a v2 which changes the if to this:
>>>>>
>>>>> if (ret == -EINVAL && !IS_ERR_OR_NULL(fwnode) &&
>>>>> !IS_ERR_OR_NULL(fwnode->secondary))
>>>>> ret = fwnode_call_int_op(fwnode->secondary, get_reference_args,
>>>>> prop, nargs_prop, nargs, index, args);
>>>>>
>>>>>
>>>>> So that we check fwnode before dereferencing it, note this also changes the
>>>>> (ret < 0) check to (ret == -EINVAL), this makes the secondary node handling
>>>>> identical to fwnode_property_read_int_array() and
>>>>> fwnode_property_read_string_array()
>>>>>
>>>>> Danny, can you send a v2 with this change please?
>>>> Hmm... So, you are suggesting that we need to check it only for EINVAL and
>>>> ENOENT in this case the one that brings us to the NULL pointer dereference.
>>>> But I don't understand what's the difference here.
>>>
>>>
>>> Sticking point; the ACPI version of .get_reference_args() returns
>>> -ENOENT (converted from -EINVAL [1]) if the property you ask for doesn't
>>> exist against that fwnode, which unless I'm missing something means this
>>> won't work in our use case. This confused me for a while because we
>>> definitely call fwnode_property_read_int_array() in sensor driver probes
>>> through v4l2_fwnode_endpoint_alloc_parse(), but it turns out the ACPI
>>> version of _that_ operation has no matching conversion of the error
>>> code, so when that fails to find the property it sends back -EINVAL and
>>> so the form that exists in fwnode_property_read_int_array() currently
>>> works fine.
>>>
>>>
>>> We could align them all to if (ret < 0 && !IS_ERR_OR_NULL(fwnode) &&
>>> !IS_ERR_OR_NULL(fwnode->secondary)). This is probably my preferred
>>> option, because I can't really see why we'd only want to do the
>>> secondary check on -EINVAL anyway - but maybe I'm missing something here.
>>> Alternatively we can take Hans's suggestion so they all match the existing
>>> code, but this means we have to handle that conversion first - I
>>> couldn't see from a cursory look that any of the direct callers check
>>> the value of the return beyond "is it 0?", but of course it could be
>>> done somewhere in calls to the fwnode->ops->get_reference_args()
>>> callback instead.
>>>
>>>
>>> Thoughts?
>>
>> I missed that just checking for -EINVAL will not work for the ipu3 case
>> (I did not test). In that case I think using "ret < 0" as the check
>> is probably fine for this patch.

Okedokey, I'll do a v2 on that basis

>>
>> As for modifying the existing 2 code paths, IMHO it does make sense
>> to try and preserve the error code (and not try the secondary fwnode)
>> when the error is something other than the one indicating that the
>> property is not there.

Yes, you're right of course, I agree.

>> So keeping those as -EINVAL is probably best, and maybe for
>> fwnode_find_reference, instead of (ret < 0), use:
>> (ret == -EINVAL || ret == -ENOENT) ?

Maybe... I guess left to my own devices I'd try to understand why that
conversion is there, and see if we can reconcile all the different
versions of the callbacks so they return the same errors for the same
reasons (as far as is possible).
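
As a rough illustration of what "same errors for the same reasons" could
mean at the call site, the sketch below folds the two codes this thread
associates with a missing property into one. Both helper names are
hypothetical, and whether -EINVAL or -ENOENT should be the common code is
exactly the open question, so treat the choice of -EINVAL here as an
assumption made only to match the existing (ret == -EINVAL) checks.

```c
#include <errno.h>
#include <stdbool.h>

/* True for either code this thread associates with "property not present". */
bool prop_not_found(int ret)
{
	return ret == -EINVAL || ret == -ENOENT;
}

/*
 * Hypothetical normalisation: report a missing property as -EINVAL
 * regardless of backend, and pass every other value through unchanged.
 */
int normalize_prop_error(int ret)
{
	return prop_not_found(ret) ? -EINVAL : ret;
}
```

A wrapper like this only papers over the difference; reconciling the
callbacks themselves would make it unnecessary.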

> Last time Sakari did a great job of error code alignments between DT, ACPI,
> and SW nodes. Not sure why the above slipped through the fingers.