Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: fe364a7d95c24e07e9b3f2ab917f01d6d8330bba ("dmaengine: dw: Program xBAR hardware for Elkhart Lake")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: netperf
version: netperf-x86_64-2.7-0_20210908
with the following parameters:
ip: ipv4
runtime: 300s
nr_threads: 1
cluster: cs-localhost
test: TCP_CRR
cpufreq_governor: performance
ucode: 0xb000280
test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>
[ 47.872842][ T1341] ================================================================================
[ 47.884637][ T1341] UBSAN: array-index-out-of-bounds in drivers/acpi/acpica/dswexec.c:401:12
[ 47.884644][ T1341] index -1 is out of range for type 'acpi_operand_object *[9]'
[ 47.884647][ T1341] CPU: 9 PID: 1341 Comm: systemd-udevd Not tainted 5.14.0-rc1-00001-gfe364a7d95c2-dirty #1
[ 47.884650][ T1341] Call Trace:
[ 47.889421][ T1346] IPMI message handler: version 39.2
[ 47.927593][ T1341] ubsan_epilogue+0x5/0x40
[ 47.931873][ T1341] __ubsan_handle_out_of_bounds+0x69/0x80
[ 47.943808][ T1341] acpi_ps_parse_loop+0x4a5/0x5e4
[ 47.948707][ T1341] acpi_ps_parse_aml+0x94/0x2c0
[ 47.954716][ T1341] acpi_ps_execute_method+0x15e/0x193
[ 47.959953][ T1341] acpi_ns_evaluate+0x1c7/0x25e
[ 47.964663][ T1341] acpi_evaluate_object+0x140/0x250
[ 47.969727][ T1341] acpi_evaluate_dsm+0xac/0x140
[ 47.974456][ T1341] acpi_nfit_ctl+0x2c0/0xa00 [nfit]
[ 47.979522][ T1341] ? lock_acquire+0xbb/0x2c0
[ 47.983985][ T1341] intel_bus_fwa_businfo+0x6a/0xc0 [nfit]
[ 47.989580][ T1341] intel_bus_fwa_state+0x66/0x100 [nfit]
[ 47.995086][ T1341] intel_bus_fwa_capability+0x19/0x40 [nfit]
[ 48.000933][ T1341] nvdimm_bus_firmware_visible+0x35/0x80 [libnvdimm]
[ 48.007478][ T1341] internal_create_group+0xde/0x380
[ 48.020614][ T1341] internal_create_groups+0x3d/0xc0
[ 48.033229][ T1341] ? dev_set_name+0x53/0x80
[ 48.037936][ T1341] nvdimm_bus_register+0x133/0x1c0 [libnvdimm]
[ 48.043959][ T1341] acpi_nfit_init+0xccf/0x1540 [nfit]
[ 48.049208][ T1341] ? get_object+0x40/0x40
[ 48.053409][ T1341] ? call_rcu+0x197/0x5c0
[ 48.057618][ T1341] ? lockdep_hardirqs_on_prepare+0xd4/0x180
[ 48.063392][ T1341] ? kfree+0x33b/0x5c0
[ 48.067341][ T1341] ? acpi_evaluate_object+0x229/0x250
[ 48.072592][ T1341] ? acpi_nfit_add+0x196/0x200 [nfit]
[ 48.077832][ T1341] acpi_nfit_add+0x196/0x200 [nfit]
[ 48.082897][ T1341] acpi_device_probe+0x44/0x180
[ 48.087616][ T1341] really_probe+0xb3/0x340
[ 48.106497][ T1341] __driver_attach+0x9e/0x180
[ 48.119201][ T1341] ? __device_attach_driver+0x100/0x100
[ 48.124863][ T1341] bus_for_each_dev+0x78/0xc0
[ 48.129409][ T1341] bus_add_driver+0x150/0x200
[ 48.133959][ T1341] driver_register+0x6c/0xc0
[ 48.138418][ T1341] ? 0xffffffffc065b000
[ 48.142453][ T1341] nfit_init+0x164/0x1000 [nfit]
[ 48.147269][ T1341] do_one_initcall+0x58/0x300
[ 48.151817][ T1341] ? kmem_cache_alloc_trace+0x58a/0x780
[ 48.168887][ T1341] ? aa_get_task_label+0xc0/0x300
[ 48.175164][ T1341] ? __do_sys_finit_module+0xae/0x140
[ 48.181758][ T1341] __do_sys_finit_module+0xae/0x140
[ 48.188211][ T1341] do_syscall_64+0x38/0xc0
[ 48.193162][ T1341] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 48.198928][ T1341] RIP: 0033:0x7fbf24907f59
[ 48.203215][ T1341] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
[ 48.222712][ T1341] RSP: 002b:00007fffbc5e56e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 48.222715][ T1341] RAX: ffffffffffffffda RBX: 00005607041aae50 RCX: 00007fbf24907f59
[ 48.222717][ T1341] RDX: 0000000000000000 RSI: 00007fbf2480ccad RDI: 000000000000000f
[ 48.222719][ T1341] RBP: 00007fbf2480ccad R08: 0000000000000000 R09: 0000000000000000
[ 48.265121][ T1341] R13: 00005607042343b0 R14: 0000000000020000 R15: 00005607041aae50
[ 48.274370][ T1341] ================================================================================
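For readers unfamiliar with this class of UBSAN report, here is a minimal sketch of how an index of -1 into a 9-element pointer array can arise. It is an illustration only: the thread does not show the code at dswexec.c:401, so the struct, the field names and the "count minus one" expression below are assumptions, not the ACPICA source. The sketch assumes the top-of-stack slot is addressed as operands[num_operands - 1], which evaluates to -1 when the operand count is zero.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPERANDS 9 /* matches the '[9]' in the UBSAN message */

/* Hypothetical stand-in for an interpreter walk state (not the ACPICA type). */
struct walk_state_sketch {
    void *operands[MAX_OPERANDS];
    unsigned char num_operands; /* number of operands currently on the stack */
};

/*
 * Index of the topmost operand. The unsigned char is promoted to int
 * before the subtraction, so an empty stack yields -1, not 255.
 */
static int top_operand_index(const struct walk_state_sketch *ws)
{
    assert(ws != NULL);
    return (int)ws->num_operands - 1;
}

/*
 * Guarded variant: return the address of the topmost operand slot,
 * or NULL when the stack is empty or the count is out of range,
 * instead of forming &operands[-1].
 */
static void **top_operand_slot(struct walk_state_sketch *ws)
{
    int idx = top_operand_index(ws);

    if (idx < 0 || idx >= MAX_OPERANDS)
        return NULL;
    return &ws->operands[idx];
}
```

With a zero operand count, top_operand_index() returns -1 and the guarded helper returns NULL rather than computing an out-of-bounds address. Whether the real code path needs such a guard is a question for the ACPICA maintainers.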
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation
Thanks,
Oliver Sang
+Cc: Rafael, ACPI ml
On Sun, Sep 19, 2021 at 10:41 AM kernel test robot
<[email protected]> wrote:
>
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: fe364a7d95c24e07e9b3f2ab917f01d6d8330bba ("dmaengine: dw: Program xBAR hardware for Elkhart Lake")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
I do not believe the above commit is related to the reported issue.
--
With Best Regards,
Andy Shevchenko
Hi all,
this actually crashes s2idle e.g. on Surface Book 1 and Surface Pro 4:
================================================================================
[ 294.673738] UBSAN: array-index-out-of-bounds in drivers/acpi/acpica/dswexec.c:401:12
[ 294.673748] index -1 is out of range for type 'acpi_operand_object *[9]'
[ 294.673755] CPU: 3 PID: 6477 Comm: systemd-sleep Tainted: G C 5.14.9-surface-ubsan-test #1
[ 294.673762] Hardware name: Microsoft Corporation Surface Book/Surface Book, BIOS 92.3748.768 05.04.2021
[ 294.673765] Call Trace:
[ 294.673771] dump_stack_lvl+0x4a/0x5f
[ 294.673784] dump_stack+0x10/0x12
[ 294.673792] ubsan_epilogue+0x9/0x50
[ 294.673798] __ubsan_handle_out_of_bounds+0x6f/0x80
[ 294.673805] acpi_ds_exec_end_op+0x1a0/0x79a
[ 294.673812] acpi_ps_parse_loop+0x7f5/0x8cc
[ 294.673820] acpi_ps_parse_aml+0x1bb/0x55d
[ 294.673828] acpi_ps_execute_method+0x20f/0x2d1
[ 294.673836] acpi_ns_evaluate+0x34d/0x4ef
[ 294.673841] acpi_evaluate_object+0x210/0x3da
[ 294.673848] acpi_evaluate_dsm+0xaa/0x120
[ 294.673857] ? flush_workqueue+0x19b/0x3e0
[ 294.673864] acpi_sleep_run_lps0_dsm+0x5a/0xc0
[ 294.673873] acpi_s2idle_restore_early+0x62/0x110
[ 294.673881] ? acpi_s2idle_restore_early+0x62/0x110
[ 294.673887] suspend_devices_and_enter+0x2a1/0x800
[ 294.673895] pm_suspend+0x2e5/0x420
[ 294.673900] state_store+0x85/0xf0
[ 294.673905] kobj_attr_store+0x12/0x20
[ 294.673913] sysfs_kf_write+0x3c/0x50
[ 294.673921] kernfs_fop_write_iter+0x13c/0x1b0
[ 294.673927] new_sync_write+0x117/0x1b0
[ 294.673937] vfs_write+0x1ea/0x250
[ 294.673945] ksys_write+0xa7/0xe0
[ 294.673953] __x64_sys_write+0x1a/0x20
[ 294.673961] do_syscall_64+0x5b/0xb0
[ 294.673967] ? syscall_exit_to_user_mode+0x2a/0x40
[ 294.673974] ? do_syscall_64+0x67/0xb0
[ 294.673979] ? do_syscall_64+0x67/0xb0
[ 294.673983] ? asm_exc_page_fault+0x8/0x30
[ 294.673992] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 294.674000] RIP: 0033:0x7fdd5072c1e7
[ 294.674007] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 294.674012] RSP: 002b:00007fffdcfda2b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 294.674019] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fdd5072c1e7
[ 294.674023] RDX: 0000000000000004 RSI: 00007fffdcfda370 RDI: 0000000000000004
[ 294.674026] RBP: 00007fffdcfda370 R08: 0000000000000004 R09: 000000000000000d
[ 294.674029] R10: 0000560dbe6e1128 R11: 0000000000000246 R12: 0000000000000004
[ 294.674032] R13: 0000560dc03a72d0 R14: 0000000000000004 R15: 00007fdd508078a0
[ 294.674038] ================================================================================
Best regards,
Oliver
On Sat, Oct 09, 2021 at 11:56:20PM +0200, Oliver Urbann wrote:
> Hi all,
>
> this actually crashes s2idle e.g. on Surface Book 1 and Surface Pro 4:
You mean the mentioned patch?
That's impossible. Surface Book 1 (at least) has none of the devices that
the patch touches!
--
With Best Regards,
Andy Shevchenko
Hi all,
On Tue, Oct 12, 2021 at 09:51:37PM +0300, Andy Shevchenko wrote:
> On Sat, Oct 09, 2021 at 11:56:20PM +0200, Oliver Urbann wrote:
> > Hi all,
> >
> > this actually crashes s2idle e.g. on Surface Book 1 and Surface Pro 4:
>
> You mean the mentioned patch?
>
> It's impossible. Surface Book 1 (at least) has no such devices, which that
> patch touches, at all!
Sorry about this. It seems to have been a bad bisection, possibly due to
issues in our test environment. We reran the test and cannot reproduce the
issue now.