Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: 62d528712c1db609fd5afc319378ca053ac9247e ("PCI: ACPI: PM: Power up devices in D3cold before scanning them")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: kernel-selftests
version: kernel-selftests-x86_64-a17aac1b-1_20220417
with the following parameters:
group: resctrl
ucode: 0xb000280
test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>
[ 35.970292][ T1] BUG: KASAN: slab-out-of-bounds in acpi_power_up_if_adr_present (drivers/acpi/device_pm.c:433)
[ 35.970292][ T1] Read of size 1 at addr ff1100014215fe0c by task swapper/0/1
[ 35.970292][ T1]
[ 35.970292][ T1] CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00003-g62d528712c1d #1
[ 35.970292][ T1] Call Trace:
[ 35.970292][ T1] <TASK>
[ 35.970292][ T1] dump_stack_lvl (lib/dump_stack.c:107)
[ 35.970292][ T1] print_address_description.constprop.0.cold (mm/kasan/report.c:314)
[ 35.970292][ T1] ? acpi_power_up_if_adr_present (drivers/acpi/device_pm.c:433)
[ 35.970292][ T1] ? acpi_power_up_if_adr_present (drivers/acpi/device_pm.c:433)
[ 35.970292][ T1] print_report.cold (mm/kasan/report.c:430)
[ 35.970292][ T1] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 kernel/locking/spinlock_debug.c:115)
[ 35.970292][ T1] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493)
[ 35.970292][ T1] ? acpi_power_up_if_adr_present (drivers/acpi/device_pm.c:433)
[ 35.970292][ T1] ? acpi_bus_set_power (drivers/acpi/device_pm.c:429)
[ 35.970292][ T1] acpi_power_up_if_adr_present (drivers/acpi/device_pm.c:433)
[ 35.970292][ T1] ? acpi_bus_set_power (drivers/acpi/device_pm.c:429)
[ 35.970292][ T1] device_for_each_child (drivers/base/core.c:3724)
[ 35.970292][ T1] ? device_platform_notify_remove (drivers/base/core.c:3714)
[ 35.970292][ T1] pci_acpi_setup (drivers/pci/pci-acpi.c:1379)
[ 35.970292][ T1] ? acpi_pci_remove_bus (drivers/pci/pci-acpi.c:1354)
[ 35.970292][ T1] ? lockdep_init_map_type (kernel/locking/lockdep.c:4812)
[ 35.970292][ T1] acpi_device_notify (drivers/acpi/glue.c:317)
[ 35.970292][ T1] device_add (drivers/base/core.c:2046 drivers/base/core.c:3347)
[ 35.970292][ T1] ? __fw_devlink_link_to_suppliers (drivers/base/core.c:3287)
[ 35.970292][ T1] ? up_write (arch/x86/include/asm/atomic64_64.h:172 include/linux/atomic/atomic-long.h:95 include/linux/atomic/atomic-instrumented.h:1348 kernel/locking/rwsem.c:1318 kernel/locking/rwsem.c:1567)
[ 35.970292][ T1] ? pci_init_reset_methods (drivers/pci/pci.c:5384)
[ 35.970292][ T1] pci_device_add (drivers/pci/probe.c:2559)
[ 35.970292][ T1] pci_scan_single_device (drivers/pci/probe.c:2578 drivers/pci/probe.c:2562)
[ 35.970292][ T1] ? pci_device_add (drivers/pci/probe.c:2563)
[ 35.970292][ T1] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
[ 35.970292][ T1] pci_scan_slot (drivers/pci/probe.c:2652)
[ 35.970292][ T1] pci_scan_child_bus_extend (drivers/pci/probe.c:2868)
[ 35.970292][ T1] ? pci_create_root_bus (drivers/pci/probe.c:3041)
[ 35.970292][ T1] acpi_pci_root_create (drivers/acpi/pci_root.c:933)
[ 35.970292][ T1] pci_acpi_scan_root (arch/x86/pci/acpi.c:368)
[ 35.970292][ T1] ? pci_acpi_root_init_info (arch/x86/pci/acpi.c:327)
[ 35.970292][ T1] ? decode_osc_bits+0x18a/0x18a
[ 35.970292][ T1] ? acpi_pci_find_companion (drivers/pci/pci-acpi.c:108)
[ 35.970292][ T1] acpi_pci_root_add.cold (drivers/acpi/pci_root.c:602)
[ 35.970292][ T1] ? get_root_bridge_busnr_callback (drivers/acpi/pci_root.c:522)
[ 35.970292][ T1] ? acpi_pnp_match (drivers/acpi/acpi_pnp.c:323 drivers/acpi/acpi_pnp.c:341)
[ 35.970292][ T1] ? acpi_bus_get_status_handle (drivers/acpi/bus.c:98)
[ 35.970292][ T1] acpi_bus_attach (drivers/acpi/scan.c:2177 drivers/acpi/scan.c:2225)
[ 35.970292][ T1] ? acpi_generic_device_attach (drivers/acpi/scan.c:2191)
[ 35.970292][ T1] ? __device_attach (drivers/base/dd.c:941)
[ 35.970292][ T1] ? device_bind_driver (drivers/base/dd.c:941)
[ 35.970292][ T1] acpi_bus_attach (drivers/acpi/scan.c:2245 (discriminator 3))
[ 35.970292][ T1] ? acpi_generic_device_attach (drivers/acpi/scan.c:2191)
[ 35.970292][ T1] ? __device_attach (drivers/base/dd.c:941)
[ 35.970292][ T1] ? device_bind_driver (drivers/base/dd.c:941)
[ 35.970292][ T1] acpi_bus_attach (drivers/acpi/scan.c:2245 (discriminator 3))
[ 35.970292][ T1] ? acpi_generic_device_attach (drivers/acpi/scan.c:2191)
[ 35.970292][ T1] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
[ 35.970292][ T1] ? acpi_os_signal_semaphore (drivers/acpi/osl.c:1308)
[ 35.970292][ T1] ? acpi_ut_release_read_lock (drivers/acpi/acpica/utlock.c:111)
[ 35.970292][ T1] ? acpi_bus_check_add_2 (drivers/acpi/scan.c:2113)
[ 35.970292][ T1] ? acpi_walk_namespace (drivers/acpi/acpica/nsxfeval.c:616 drivers/acpi/acpica/nsxfeval.c:554)
[ 35.970292][ T1] acpi_bus_scan (drivers/acpi/scan.c:2438)
[ 35.970292][ T1] ? acpi_bus_check_add_1 (drivers/acpi/scan.c:2420)
[ 35.970292][ T1] acpi_scan_init (drivers/acpi/scan.c:2600)
[ 35.970292][ T1] ? acpi_match_madt (drivers/acpi/scan.c:2550)
[ 35.970292][ T1] acpi_init (drivers/acpi/bus.c:1368)
[ 35.970292][ T1] ? acpi_bus_init (drivers/acpi/bus.c:1342)
[ 35.970292][ T1] ? rcu_read_lock_bh_held (kernel/rcu/update.c:120)
[ 35.970292][ T1] ? acpi_bus_init (drivers/acpi/bus.c:1342)
[ 35.970292][ T1] do_one_initcall (init/main.c:1298)
[ 35.970292][ T1] ? trace_event_raw_event_initcall_level (init/main.c:1289)
[ 35.970292][ T1] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125)
[ 35.970292][ T1] ? rcu_read_lock_bh_held (kernel/rcu/update.c:120)
[ 35.970292][ T1] ? __kmalloc (include/linux/kasan.h:234 mm/slub.c:4414)
[ 35.970292][ T1] kernel_init_freeable (init/main.c:1370 init/main.c:1387 init/main.c:1406 init/main.c:1613)
[ 35.970292][ T1] ? console_on_rootfs (init/main.c:1584)
[ 35.970292][ T1] ? rwlock_bug+0xc0/0xc0
[ 35.970292][ T1] ? rest_init (init/main.c:1494)
[ 35.970292][ T1] kernel_init (init/main.c:1504)
[ 35.970292][ T1] ret_from_fork (arch/x86/entry/entry_64.S:298)
[ 35.970292][ T1] </TASK>
[ 35.970292][ T1]
[ 35.970292][ T1] Allocated by task 0:
[ 35.970292][ T1] (stack is not available)
[ 35.970292][ T1]
[ 35.970292][ T1] The buggy address belongs to the object at ff1100014215f800
[ 35.970292][ T1] which belongs to the cache kmalloc-1k of size 1024
[ 35.970292][ T1] The buggy address is located 524 bytes to the right of
[ 35.970292][ T1] 1024-byte region [ff1100014215f800, ff1100014215fc00)
[ 35.970292][ T1]
[ 35.970292][ T1] The buggy address belongs to the physical page:
[ 35.970292][ T1] page:0000000091ef2032 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x142158
[ 35.970292][ T1] head:0000000091ef2032 order:3 compound_mapcount:0 compound_pincount:0
[ 35.970292][ T1] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
[ 35.970292][ T1] raw: 0017ffffc0010200 0000000000000000 dead000000000122 ff1100010003d080
[ 35.970292][ T1] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
[ 35.970292][ T1] page dumped because: kasan: bad access detected
[ 35.970292][ T1]
[ 35.970292][ T1] Memory state around the buggy address:
[ 35.970292][ T1] ff1100014215fd00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 35.970292][ T1] ff1100014215fd80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 35.970292][ T1] >ff1100014215fe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 35.970292][ T1] ^
[ 35.970292][ T1] ff1100014215fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 35.970292][ T1] ff1100014215ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 35.970292][ T1] ==================================================================
[ 36.528345][ T1] Disabling lock debugging due to kernel taint
[ 36.540420][ T1] pci 0000:00:1c.5: [8086:a215] type 01 class 0x060400
[ 36.547403][ T1] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
[ 36.562298][ T1] pci 0000:00:1f.0: [8086:a245] type 00 class 0x060100
[ 36.571590][ T1] pci 0000:00:1f.2: [8086:a221] type 00 class 0x058000
[ 36.578311][ T1] pci 0000:00:1f.2: reg 0x10: [mem 0x92480000-0x92483fff]
[ 36.587589][ T1] pci 0000:00:1f.4: [8086:a223] type 00 class 0x0c0500
[ 36.594320][ T1] pci 0000:00:1f.4: reg 0x10: [mem 0x200ffff54000-0x200ffff540ff 64bit]
[ 36.602321][ T1] pci 0000:00:1f.4: reg 0x20: [io 0x4000-0x401f]
[ 36.608859][ T1] pci 0000:00:1f.5: [8086:a224] type 00 class 0x0c8000
[ 36.615313][ T1] pci 0000:00:1f.5: reg 0x10: [mem 0x90000000-0x90000fff]
[ 36.623196][ T1] pci 0000:01:00.0: working around ROM BAR overlap defect
[ 36.630295][ T1] pci 0000:01:00.0: [8086:1533] type 00 class 0x020000
[ 36.637335][ T1] pci 0000:01:00.0: reg 0x10: [mem 0x92100000-0x9217ffff]
[ 36.644337][ T1] pci 0000:01:00.0: reg 0x18: [io 0x3000-0x301f]
[ 36.650324][ T1] pci 0000:01:00.0: reg 0x1c: [mem 0x92180000-0x92183fff]
[ 36.657546][ T1] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[ 36.665073][ T1] pci 0000:00:1c.0: PCI bridge to [bus 01]
[ 36.670302][ T1] pci 0000:00:1c.0: bridge window [io 0x3000-0x3fff]
[ 36.677300][ T1] pci 0000:00:1c.0: bridge window [mem 0x92100000-0x921fffff]
[ 36.685514][ T1] pci 0000:02:00.0: [1a03:1150] type 01 class 0x060400
[ 36.692466][ T1] pci 0000:02:00.0: supports D1 D2
[ 36.697295][ T1] pci 0000:02:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 36.704955][ T1] pci 0000:00:1c.5: PCI bridge to [bus 02-03]
[ 36.711300][ T1] pci 0000:00:1c.5: bridge window [io 0x2000-0x2fff]
[ 36.717300][ T1] pci 0000:00:1c.5: bridge window [mem 0x91000000-0x920fffff]
[ 36.725359][ T1] pci_bus 0000:03: extended config space not accessible
[ 36.732395][ T1] pci 0000:03:00.0: [1a03:2000] type 00 class 0x030000
[ 36.738318][ T1] pci 0000:03:00.0: reg 0x10: [mem 0x91000000-0x91ffffff]
[ 36.745309][ T1] pci 0000:03:00.0: reg 0x14: [mem 0x92000000-0x9201ffff]
[ 36.752309][ T1] pci 0000:03:00.0: reg 0x18: [io 0x2000-0x207f]
[ 36.759461][ T1] pci 0000:03:00.0: supports D1 D2
[ 36.764294][ T1] pci 0000:03:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 36.771558][ T1] pci 0000:02:00.0: PCI bridge to [bus 03]
[ 36.777305][ T1] pci 0000:02:00.0: bridge window [io 0x2000-0x2fff]
[ 36.784302][ T1] pci 0000:02:00.0: bridge window [mem 0x91000000-0x920fffff]
[ 36.791322][ T1] pci_bus 0000:00: on NUMA node 0
[ 36.801723][ T1] ACPI: PCI Root Bridge [PC01] (domain 0000 [bus 16-2f])
[ 36.808305][ T1] acpi PNP0A08:01: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 37.091596][ T1] acpi PNP0A08:01: _OSC: platform does not support [SHPCHotplug AER]
[ 37.105600][ T1] acpi PNP0A08:01: _OSC: OS now controls [PCIeHotplug PME PCIeCapability LTR]
[ 37.115792][ T1] PCI host bridge to bus 0000:16
[ 37.120298][ T1] pci_bus 0000:16: root bus resource [io 0x5000-0x6fff window]
[ 37.128297][ T1] pci_bus 0000:16: root bus resource [mem 0x9b800000-0xa63fffff window]
[ 37.136296][ T1] pci_bus 0000:16: root bus resource [mem 0x201000000000-0x201fffffffff window]
[ 37.145295][ T1] pci_bus 0000:16: root bus resource [bus 16-2f]
[ 37.151345][ T1] pci 0000:16:00.0: [8086:09a2] type 00 class 0x088000
[ 37.158525][ T1] pci 0000:16:00.1: [8086:09a4] type 00 class 0x088000
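As a quick sanity check of the KASAN report above, the "524 bytes to the right" figure can be reproduced from the addresses in the log. This is only a minimal sketch: the constants are copied from the report, and the variable names are illustrative and do not come from the kernel.

```python
# Addresses copied from the KASAN report in the log above.
ACCESS_ADDR = 0xFF1100014215FE0C   # "Read of size 1 at addr ..."
OBJECT_START = 0xFF1100014215F800  # "belongs to the object at ..."
OBJECT_SIZE = 1024                 # cache kmalloc-1k of size 1024

# End of the allocated region (exclusive), as in [start, end).
object_end = OBJECT_START + OBJECT_SIZE

# Distance of the faulting read past the end of the object.
bytes_past_end = ACCESS_ADDR - object_end

# Distance of the faulting read from the start of the object.
offset_from_start = ACCESS_ADDR - OBJECT_START

print(f"object end:        {object_end:#x}")
print(f"bytes past end:    {bytes_past_end}")
print(f"offset from start: {offset_from_start}")
```

The read lands 1548 bytes from the start of a 1024-byte object, i.e. 524 bytes past its end, which matches the region [ff1100014215f800, ff1100014215fc00) reported by KASAN. The log alone does not say which structure the code expected at that address.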
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp