(Please note that we previously reported
"[bitops] 001bea109d: BUG:KASAN:wild-memory-access_in_dmar_parse_one_rhsa"
at
https://lore.kernel.org/all/YrnGLtDXAveqXGok@xsang-OptiPlex-9020/
We have since noticed that this commit has been merged into linux-next/master
and the issue still exists, so we are reporting it again FYI.)
Greetings,
FYI, we noticed the following commit (built with gcc-11):
commit: 0e862838f290147ea9c16db852d8d494b552d38d ("bitops: unify non-atomic bitops prototypes across architectures")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: xfstests
version: xfstests-x86_64-c1144bf-1_20220627
with the following parameters:
disk: 2pmem
fs: ext4
test: ext4-dax
ucode: 0x700001c
test-description: xfstests is a regression test suite for xfs and other file systems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
on test machine: 16 threads 1 sockets Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>
[ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[ 4.676149][ T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
[ 4.683454][ T0]
[ 4.685638][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1
[ 4.694331][ T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
[ 4.703196][ T0] Call Trace:
[ 4.706334][ T0] <TASK>
[ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[ 4.714272][ T0] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 4.718632][ T0] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493)
[ 4.722903][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[ 4.728042][ T0] kasan_check_range (mm/kasan/generic.c:190)
[ 4.732750][ T0] dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[ 4.737715][ T0] dmar_walk_remapping_entries (drivers/iommu/intel/dmar.c:609)
[ 4.743375][ T0] parse_dmar_table (drivers/iommu/intel/dmar.c:671)
[ 4.748079][ T0] ? dmar_table_detect (drivers/iommu/intel/dmar.c:633)
[ 4.752872][ T0] ? dmar_free_dev_scope (drivers/iommu/intel/dmar.c:408)
[ 4.758010][ T0] ? init_dmars (drivers/iommu/intel/iommu.c:3359)
[ 4.762370][ T0] ? iommu_resume (drivers/iommu/intel/iommu.c:3419)
[ 4.766903][ T0] ? dmar_walk_dsm_resource+0x300/0x300
[ 4.772909][ T0] ? dmar_acpi_insert_dev_scope (drivers/iommu/intel/dmar.c:466)
[ 4.778655][ T0] ? dmar_check_one_atsr (drivers/iommu/intel/iommu.c:3521)
[ 4.783795][ T0] dmar_table_init (drivers/iommu/intel/dmar.c:846)
[ 4.788239][ T0] intel_prepare_irq_remapping (drivers/iommu/intel/irq_remapping.c:742)
[ 4.793811][ T0] irq_remapping_prepare (drivers/iommu/irq_remapping.c:102)
[ 4.798778][ T0] enable_IR_x2apic (arch/x86/kernel/apic/apic.c:1928)
[ 4.803395][ T0] default_setup_apic_routing (arch/x86/kernel/apic/probe_64.c:25 (discriminator 1))
[ 4.808883][ T0] apic_intr_mode_init (arch/x86/kernel/apic/apic.c:1446)
[ 4.813761][ T0] x86_late_time_init (arch/x86/kernel/time.c:101)
[ 4.818467][ T0] start_kernel (init/main.c:1101)
[ 4.822827][ T0] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358)
[ 4.828572][ T0] </TASK>
[ 4.831456][ T0] ==================================================================
[ 4.839377][ T0] Disabling lock debugging due to kernel taint
[ 4.845393][ T0] DMAR-IR: IOAPIC id 8 under DRHD base 0xfbffc000 IOMMU 0
[ 4.852446][ T0] DMAR-IR: IOAPIC id 9 under DRHD base 0xfbffc000 IOMMU 0
[ 4.859499][ T0] DMAR-IR: HPET id 0 under DRHD base 0xfbffc000
[ 4.865592][ T0] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[ 4.865594][ T0] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[ 4.882020][ T0] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode
[ 4.890554][ T0] DMAR-IR: Enabled IRQ remapping in xapic mode
[ 4.896557][ T0] x2apic: IRQ remapping doesn't support X2APIC mode
[ 4.902999][ T0] Switched APIC routing to physical flat.
[ 4.908578][ T0] masked ExtINT on CPU#0
[ 4.913011][ T0] ENABLING IO-APIC IRQs
[ 4.917035][ T0] init IO_APIC IRQs
[ 4.920694][ T0] apic 8 pin 0 not connected
[ 4.925241][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 4.939400][ T0] IOAPIC[0]: Preconfigured routing entry (8-1 -> IRQ 1 Level:0 ActiveLow:0)
[ 4.947926][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:30 Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 4.962091][ T0] IOAPIC[0]: Preconfigured routing entry (8-2 -> IRQ 0 Level:0 ActiveLow:0)
[ 4.970634][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 4.984794][ T0] IOAPIC[0]: Preconfigured routing entry (8-3 -> IRQ 3 Level:0 ActiveLow:0)
[ 4.993326][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.007482][ T0] IOAPIC[0]: Preconfigured routing entry (8-4 -> IRQ 4 Level:0 ActiveLow:0)
[ 5.016013][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.030169][ T0] IOAPIC[0]: Preconfigured routing entry (8-5 -> IRQ 5 Level:0 ActiveLow:0)
[ 5.038705][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.052859][ T0] IOAPIC[0]: Preconfigured routing entry (8-6 -> IRQ 6 Level:0 ActiveLow:0)
[ 5.061389][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.075545][ T0] IOAPIC[0]: Preconfigured routing entry (8-7 -> IRQ 7 Level:0 ActiveLow:0)
[ 5.084077][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.098235][ T0] IOAPIC[0]: Preconfigured routing entry (8-8 -> IRQ 8 Level:0 ActiveLow:0)
[ 5.106767][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.120923][ T0] IOAPIC[0]: Preconfigured routing entry (8-9 -> IRQ 9 Level:1 ActiveLow:0)
[ 5.129457][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.143610][ T0] IOAPIC[0]: Preconfigured routing entry (8-10 -> IRQ 10 Level:0 ActiveLow:0)
[ 5.152320][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.166481][ T0] IOAPIC[0]: Preconfigured routing entry (8-11 -> IRQ 11 Level:0 ActiveLow:0)
[ 5.175188][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.189344][ T0] IOAPIC[0]: Preconfigured routing entry (8-12 -> IRQ 12 Level:0 ActiveLow:0)
[ 5.198049][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.212203][ T0] IOAPIC[0]: Preconfigured routing entry (8-13 -> IRQ 13 Level:0 ActiveLow:0)
[ 5.220912][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.235066][ T0] IOAPIC[0]: Preconfigured routing entry (8-14 -> IRQ 14 Level:0 ActiveLow:0)
[ 5.243770][ T0] IOAPIC[8]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:F0FF SQ:0 SVT:1)
[ 5.257926][ T0] IOAPIC[0]: Preconfigured routing entry (8-15 -> IRQ 15 Level:0 ActiveLow:0)
[ 5.266620][ T0] apic 8 pin 16 not connected
[ 5.271234][ T0] apic 8 pin 17 not connected
[ 5.275846][ T0] apic 8 pin 18 not connected
[ 5.280465][ T0] apic 8 pin 19 not connected
[ 5.285084][ T0] apic 8 pin 20 not connected
[ 5.289694][ T0] apic 8 pin 21 not connected
[ 5.294313][ T0] apic 8 pin 22 not connected
[ 5.298924][ T0] apic 8 pin 23 not connected
[ 5.303543][ T0] apic 9 pin 0 not connected
[ 5.308076][ T0] apic 9 pin 1 not connected
[ 5.312607][ T0] apic 9 pin 2 not connected
[ 5.317133][ T0] apic 9 pin 3 not connected
[ 5.321664][ T0] apic 9 pin 4 not connected
[ 5.326197][ T0] apic 9 pin 5 not connected
[ 5.330728][ T0] apic 9 pin 6 not connected
[ 5.335254][ T0] apic 9 pin 7 not connected
[ 5.339785][ T0] apic 9 pin 8 not connected
[ 5.344308][ T0] apic 9 pin 9 not connected
[ 5.348833][ T0] apic 9 pin 10 not connected
[ 5.353451][ T0] apic 9 pin 11 not connected
[ 5.358062][ T0] apic 9 pin 12 not connected
[ 5.362672][ T0] apic 9 pin 13 not connected
[ 5.367283][ T0] apic 9 pin 14 not connected
[ 5.371902][ T0] apic 9 pin 15 not connected
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
On Thu, Jul 07, 2022 at 10:10:20PM +0800, kernel test robot wrote:
> [ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
> [ 4.676149][ T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
> [ 4.683454][ T0]
> [ 4.685638][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1
> [ 4.694331][ T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
> [ 4.703196][ T0] Call Trace:
> [ 4.706334][ T0] <TASK>
> [ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
> [ 4.714272][ T0] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
> [ 4.718632][ T0] kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493)
> [ 4.722903][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
> [ 4.728042][ T0] kasan_check_range (mm/kasan/generic.c:190)
> [ 4.732750][ T0] dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
> [ 4.737715][ T0] dmar_walk_remapping_entries (drivers/iommu/intel/dmar.c:609)
> [ 4.743375][ T0] parse_dmar_table (drivers/iommu/intel/dmar.c:671)
> [ 4.748079][ T0] ? dmar_table_detect (drivers/iommu/intel/dmar.c:633)
> [ 4.752872][ T0] ? dmar_free_dev_scope (drivers/iommu/intel/dmar.c:408)
> [ 4.758010][ T0] ? init_dmars (drivers/iommu/intel/iommu.c:3359)
> [ 4.762370][ T0] ? iommu_resume (drivers/iommu/intel/iommu.c:3419)
> [ 4.766903][ T0] ? dmar_walk_dsm_resource+0x300/0x300
> [ 4.772909][ T0] ? dmar_acpi_insert_dev_scope (drivers/iommu/intel/dmar.c:466)
> [ 4.778655][ T0] ? dmar_check_one_atsr (drivers/iommu/intel/iommu.c:3521)
> [ 4.783795][ T0] dmar_table_init (drivers/iommu/intel/dmar.c:846)
> [ 4.788239][ T0] intel_prepare_irq_remapping (drivers/iommu/intel/irq_remapping.c:742)
> [ 4.793811][ T0] irq_remapping_prepare (drivers/iommu/irq_remapping.c:102)
> [ 4.798778][ T0] enable_IR_x2apic (arch/x86/kernel/apic/apic.c:1928)
> [ 4.803395][ T0] default_setup_apic_routing (arch/x86/kernel/apic/probe_64.c:25 (discriminator 1))
> [ 4.808883][ T0] apic_intr_mode_init (arch/x86/kernel/apic/apic.c:1446)
> [ 4.813761][ T0] x86_late_time_init (arch/x86/kernel/time.c:101)
> [ 4.818467][ T0] start_kernel (init/main.c:1101)
> [ 4.822827][ T0] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358)
Seems to be related to the nodemask APIs.
--
With Best Regards,
Andy Shevchenko
From: Andy Shevchenko <[email protected]>
Date: Fri, 8 Jul 2022 15:17:40 +0300
> Seems to be related to the nodemask APIs.
It points to arch_test_bit() (node_online() -> test_bit()),
converted from a macro to a function, more precisely, to
variable_test_bit(), which I didn't touch.
...oh ok I got it!
pxm_to_node() can return %NUMA_NO_NODE, which equals -1. The
mentioned commit converts the macro to a function which now takes
`unsigned long` as @nr (bit number). So I guess it gets converted to
%ULONG_MAX.
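The conversion can be checked against the address in the KASAN report with a small user-space sketch. It assumes a 64-bit build, where an int -1 converted to `unsigned long` is ULONG_MAX, and it assumes that the instrumented test_bit() checks the word at `addr + BIT_WORD(nr)`. The helper name `checked_word_addr()` is made up for illustration, and the base address used in the usage note below (0xffffffff85115560) is not in the report: it is inferred by working backwards from the reported address.

```c
#include <stdint.h>

#define BITS_PER_LONG 64                      /* assumption: 64-bit kernel */
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)  /* index of the word holding bit nr */

/*
 * Hypothetical helper: address of the word that an instrumented
 * test_bit(nr, base) would check, modelled with 64-bit integers
 * instead of real pointers.
 */
static uint64_t checked_word_addr(uint64_t base, int node)
{
	/* An int -1 sign-extends to 0xffffffffffffffff (ULONG_MAX). */
	uint64_t nr = (uint64_t)(int64_t)node;

	/* Pointer arithmetic on unsigned long * scales by 8; wraps modulo 2^64. */
	return base + BIT_WORD(nr) * sizeof(uint64_t);
}
```

With the assumed base, `checked_word_addr(0xffffffff85115560, -1)` evaluates to 0x1fffffff85115558, which is the address in the "Read of size 8" line of the report, while valid node numbers stay within a few words of the base.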
Now the question is: what should a bitop do when it is given a negative
bit number? There are 2 possible solutions:

1. (I prefer it) A caller must check that bitop arguments are valid.
   UB for negative (== too big) bit numbers.
   dmar_parse_one_rhsa() must be fixed so that it checks the return
   value of pxm_to_node():

        int node = pxm_to_node(rhsa->proximity_domain);

-       if (!node_online(node))
+       if (node != NUMA_NO_NODE && !node_online(node))
                node = NUMA_NO_NODE;

2. My code is broken: I shouldn't have changed `long` to `unsigned long`,
   or should have changed it for {constant,variable}_test_bit() as well,
   or should do something else to make it behave as it did previously
   (it probably wasn't crashing before only by luck or...).
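Option 1 can be illustrated outside the kernel with a minimal model. Everything here is a stand-in: `online_mask`, `model_node_online()` and `sanitize_node()` are hypothetical names, and the single-word mask only models a handful of nodes. The point is that the sentinel is filtered out before it ever reaches the unchecked bit lookup.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for the online-node mask: nodes 0 and 1 online. */
static unsigned long online_mask = 0x3UL;

/* Unchecked lookup, like a plain test_bit(): nr must be a valid bit number. */
static bool model_node_online(unsigned long nr)
{
	return (online_mask >> nr) & 1UL;
}

/* Caller-side validation in the spirit of option 1. */
static int sanitize_node(int node)
{
	/* The short-circuit keeps NUMA_NO_NODE away from the bit lookup. */
	if (node != NUMA_NO_NODE && !model_node_online(node))
		node = NUMA_NO_NODE;
	return node;
}
```

In this model, an online node is returned unchanged, while an offline node and NUMA_NO_NODE both come back as NUMA_NO_NODE. Other out-of-range values remain the caller's responsibility, as option 1 states.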
Thanks,
Olek
On Mon, Jul 11, 2022 at 9:15 AM Alexander Lobakin
<[email protected]> wrote:
> It points to arch_test_bit() (node_online() -> test_bit()),
> converted from a macro to a function, more precisely, to
> variable_test_bit(), which I didn't touch.
>
> ...oh ok I got it!
>
> pxm_to_node() can return %NUMA_NO_NODE, which equals -1. The
> mentioned commit converts the macro to a function which now takes
> `unsigned long` as @nr (bit number). So I guess it gets converted to
> %ULONG_MAX.
>
> Now the question is: what should a bitop do if we have negative bit
> number? Because there are 2 solutions:
>
> 1. (I prefer it) A caller must check that bitop arguments are valid.
> UB for negative (== too big) bit numbers.
> dmar_parse_one_rhsa() must be fixed so that it will check return
> value of pxm_to_node():
>
> int node = pxm_to_node(rhsa->proximity_domain);
>
> - if (!node_online(node))
> + if (node != NUMA_NO_NODE && !node_online(node))
Would it make sense to check it inside node_online()?
> node = NUMA_NO_NODE;
>
> 2. My code is broken, I shouldn't change `long` to `unsigned long`
> or should change it for {constant,variable}_test_bit() as well
> or do something else and let it behave as it was previously
> (it wasn't crashing probably due to a good luck or...).
This is definitely a NUMA problem. Bitmaps have 2 kernel-wide users:
cpumasks and NUMA nodes. Both use negative indexes for their own
reasons, which is dangerous, as we can see here, because bitmaps
don't support negative indexes and don't handle them properly...
Can you please send a fix for dmar_parse_one_rhsa() as you suggested,
so that I can add it before the beginning of the next merge window?
Regarding the general path, this is what I've been thinking about (for a while):
- #define NUMA_NO_NODE MAX_NUMNODES;
- stronger typechecking, like you did in your series;
- introduce CONFIG_DEBUG_BITMAP to catch bad arguments
  on-the-fly.
I'm working on DEBUG_BITMAP and hope to submit it for the next merge
cycle.
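The thread does not describe how DEBUG_BITMAP would be implemented, so the following is only a guess at the kind of check such an option could perform, written as a user-space sketch. The names `checked_test_bit()` and `bad_bitop_args` are hypothetical. Because the bit number is unsigned, one range check rejects both a converted -1 and a sentinel equal to the bitmap size (the MAX_NUMNODES idea above).

```c
#include <stdbool.h>
#include <limits.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Counts rejected calls; a real debug option would more likely warn. */
static unsigned long bad_bitop_args;

/* Hypothetical bounds-checked variant of test_bit() for an nbits-wide map. */
static bool checked_test_bit(unsigned long nr, const unsigned long *map,
			     unsigned long nbits)
{
	if (nr >= nbits) {	/* also catches negative ints converted to unsigned */
		bad_bitop_args++;
		return false;
	}
	return (map[nr / MODEL_BITS_PER_LONG] >> (nr % MODEL_BITS_PER_LONG)) & 1UL;
}
```

The extra compare is the cost of the check, which is presumably why it would sit behind a debug config option rather than in the default build.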
Thanks,
Yury
From: Yury Norov <[email protected]>
Date: Mon, 11 Jul 2022 11:24:42 -0700
> On Mon, Jul 11, 2022 at 9:15 AM Alexander Lobakin
> <[email protected]> wrote:
> > It points to arch_test_bit() (node_online() -> test_bit()),
> > converted from a macro to a function, more precisely, to
> > variable_test_bit(), which I didn't touch.
> >
> > ...oh ok I got it!
> >
> > pxm_to_node() can return %NUMA_NO_NODE, which equals -1. The
> > mentioned commit converts the macro to a function which now takes
> > `unsigned long` as @nr (bit number). So I guess it gets converted to
> > %ULONG_MAX.
> >
> > Now the question is: what should a bitop do if we have negative bit
> > number? Because there are 2 solutions:
> >
> > 1. (I prefer it) A caller must check that bitop arguments are valid.
> > UB for negative (== too big) bit numbers.
> > dmar_parse_one_rhsa() must be fixed so that it will check return
> > value of pxm_to_node():
> >
> > int node = pxm_to_node(rhsa->proximity_domain);
> >
> > - if (!node_online(node))
> > + if (node != NUMA_NO_NODE && !node_online(node))
>
> Would it make sense to check it inside node_online()?
Probably, as a more global improvement. I believe it's used very
often on hot paths, where the node is known to be >= 0, so for now
I'd put the check inside this function (dmar_parse_one_rhsa()).
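For comparison, the alternative of checking inside the helper could look roughly like the sketch below. This is a user-space model, not the kernel's node_online(): `online_nodes` and `node_online_checked()` are hypothetical names, and the single-word mask assumes node numbers smaller than the width of `unsigned long`.

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical mask: only node 0 is online. */
static unsigned long online_nodes = 0x1UL;

/* Variant with the check inside the helper: every caller pays for the compare. */
static bool node_online_checked(int node)
{
	if (node < 0)		/* rejects NUMA_NO_NODE before any bit access */
		return false;
	return (online_nodes >> node) & 1UL;
}
```

The branch runs on every call, including hot paths that never pass a negative node, which is the trade-off raised above.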
>
> > node = NUMA_NO_NODE;
> >
> > 2. My code is broken, I shouldn't change `long` to `unsigned long`
> > or should change it for {constant,variable}_test_bit() as well
> > or do something else and let it behave as it was previously
> > (it wasn't crashing probably due to a good luck or...).
>
> This is definitely a NUMA problem. Bitmap has 2 kernel-wide users:
> cpumasks and numa nodes. Both use negative indexes for their
> reasons, which is dangerous, as we can see from here, because
> bitmaps don't support them and don't handle it properly...
>
> Can you please send a fix dmar_parse_one_rhsa() as you suggested,
> so that I'll add the fix before the beginning of next merge window?
Sure, I'll send it in a couple of hours. I guess it would be nice to get
an Acked-by from a maintainer of that subsystem (if it doesn't take too
long).
>
> Regarding a general path, this is what I'm thinking on (for a while):
> - #define NUMA_NO_NODE MAX_NUMNODES;
> - stronger typechecking, like you did in your series;
> - introduce CONFIG_DEBUG_BITMAP to catch bad arguments
> on-the-fly;
>
> I'm working on DEBUG_BITMAP, hopefully submit it for next merge
> cycle.
I like the idea!
Thanks,
Olek