2021-08-18 14:18:25

by kernel test robot

Subject: [mm] bdd265249f: WARNING:at_drivers/nvme/host/pci.c:#nvme_map_data[nvme]



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: bdd265249f0dc717210582c3c3cfff8674b1221f ("mm: Remove swap BIO paths and only use DIO paths [BROKEN]")
https://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git swap-dio


in testcase: vm-scalability
version: vm-scalability-x86_64-1.0-0_20210701
with the following parameters:

runtime: 300
thp_enabled: always
thp_defrag: always
nr_task: 32
nr_ssd: 1
test: swap-w-seq-mt
cpufreq_governor: performance
ucode: 0x4003006

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/


on test machine: 192 threads 4 sockets Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>



[ 63.606991][ T7780] blk_update_request: I/O error, dev nvme0n1, sector 1173958664 op 0x1:(WRITE) flags 0x8800 phys_seg 18 prio class 0
[ 63.607116][ T7809] ------------[ cut here ]------------
[ 63.620244][ T7780] Write error (-5) on dio swapfile (67112960)
[ 63.626135][ T7809] Invalid SGL for payload:28672 nents:7
[ 63.626197][ T7809] WARNING: CPU: 17 PID: 7809 at drivers/nvme/host/pci.c:713 nvme_map_data+0x7c7/0x840 [nvme]
[ 63.632675][ T7780] Write error (-5) on dio swapfile (67117056)
[ 63.632678][ T7780] Write error (-5) on dio swapfile (67121152)
[ 63.632679][ T7780] Write error (-5) on dio swapfile (67125248)
[ 63.638722][ T7809] Modules linked in:
[ 63.649310][ T7780] Write error (-5) on dio swapfile (67129344)
[ 63.649313][ T7780] Write error (-5) on dio swapfile (67133440)
[ 63.655971][ T7809] xfs
[ 63.662497][ T7780] Write error (-5) on dio swapfile (67137536)
[ 63.669045][ T7809] loop binfmt_misc intel_rapl_msr intel_rapl_common
[ 63.673413][ T7780] Write error (-5) on dio swapfile (67141632)
[ 63.680035][ T7809] skx_edac
[ 63.686534][ T7780] Write error (-5) on dio swapfile (67145728)
[ 63.686536][ T7780] Write error (-5) on dio swapfile (67149824)
[ 63.686563][ T7780] ------------[ cut here ]------------
[ 63.689689][ T7809] nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs
[ 63.696222][ T7780] WARNING: CPU: 18 PID: 7780 at block/blk-merge.c:546 __blk_rq_map_sg+0x504/0x580
[ 63.703362][ T7809] irqbypass crct10dif_pclmul crc32_pclmul blake2b_generic xor
[ 63.709915][ T7780] Modules linked in: xfs loop binfmt_misc
[ 63.713526][ T7809] ghash_clmulni_intel
[ 63.720076][ T7780] intel_rapl_msr intel_rapl_common
[ 63.726649][ T7809] zstd_compress ast drm_vram_helper raid6_pq ipmi_ssif drm_ttm_helper
[ 63.732609][ T7780] skx_edac nfit libnvdimm
[ 63.742724][ T7809] rapl
[ 63.752396][ T7780] x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
[ 63.760470][ T7809] libcrc32c
[ 63.766711][ T7780] kvm btrfs
[ 63.771419][ T7809] ttm
[ 63.777180][ T7780] irqbypass
[ 63.785879][ T7809] intel_cstate
[ 63.790796][ T7780] crct10dif_pclmul crc32_pclmul blake2b_generic xor
[ 63.794046][ T7809] drm_kms_helper syscopyarea crc32c_intel sysfillrect
[ 63.801816][ T7780] ghash_clmulni_intel zstd_compress
[ 63.805514][ T7809] sysimgblt
[ 63.809172][ T7780] ast drm_vram_helper
[ 63.812285][ T7809] ahci fb_sys_fops
[ 63.815928][ T7780] raid6_pq ipmi_ssif drm_ttm_helper rapl libcrc32c ttm intel_cstate drm_kms_helper syscopyarea crc32c_intel
[ 63.819827][ T7809] libahci nvme
[ 63.826938][ T7780] sysfillrect sysimgblt ahci fb_sys_fops libahci nvme intel_uncore nvme_core
[ 63.834238][ T7809] intel_uncore
[ 63.839958][ T7780] mei_me
[ 63.843600][ T7809] nvme_core mei_me acpi_ipmi drm ioatdma t10_pi ipmi_si mei libata intel_pch_thermal joydev dca wmi ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables
[ 63.848122][ T7780] acpi_ipmi
[ 63.852340][ T7809]
[ 63.852343][ T7809] CPU: 17 PID: 7809 Comm: usemem Not tainted 5.14.0-rc1-00535-gbdd265249f0d #1
[ 63.864906][ T7780] drm ioatdma t10_pi ipmi_si mei
[ 63.868840][ T7809] RIP: 0010:nvme_map_data+0x7c7/0x840 [nvme]
[ 63.878166][ T7780] libata intel_pch_thermal joydev dca wmi
[ 63.882133][ T7809] Code: 20 41 c2 48 c7 c7 00 26 41 c2 e8 14 cd 22 bf 8b 93 84 01 00 00 f6 43 1e 04 75 41 8b 73 28 48 c7 c7 40 ed 40 c2 e8 b5 e1 8a bf <0f> 0b 41 bd 0a 00 00 00 e9 fe fe ff ff 41 bd 09 00 00 00 e9 da fb
[ 63.885581][ T7780] ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables
[ 63.885598][ T7780] CPU: 18 PID: 7780 Comm: usemem Not tainted 5.14.0-rc1-00535-gbdd265249f0d #1
[ 63.885602][ T7780] RIP: 0010:__blk_rq_map_sg+0x504/0x580
[ 63.903256][ T7809] RSP: 0000:ffffc9002474f720 EFLAGS: 00010282
[ 63.906969][ T7780] Code: 49 8b 02 83 e0 03 41 f6 c4 03 75 6c 49 09 c4 45 89 72 08 4d 89 22 41 89 5a 0c 48 8b 7d 00 c7 04 24 01 00 00 00 e9 ba fc ff ff <0f> 0b e9 e3 fc ff ff 48 8d 7c 24 50 4c 89 f6 44 89 44 24 20 48 89
[ 63.906974][ T7780] RSP: 0000:ffffc9002517f680 EFLAGS: 00010206
[ 63.906977][ T7780] RAX: 0000000000000003 RBX: 0000000000001000 RCX: 0000000000000010
[ 63.909880][ T7809] RAX: 0000000000000000 RBX: ffff8881f47f63c0 RCX: 0000000000000027
[ 63.919294][ T7780] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff888109b4a0a0
[ 63.919296][ T7780] RBP: ffffc9002517f758 R08: 00000000ffff8881 R09: 0000000000001000
[ 63.919297][ T7780] R10: 0000000000000001 R11: ffff888208285440 R12: 0000000000001000
[ 63.922242][ T7783] blk_update_request: I/O error, dev nvme0n1, sector 1175009792 op 0x1:(WRITE) flags 0x8800 phys_seg 23 prio class 0
[ 63.924822][ T7809] RDX: 0000000000000027 RSI: 0000000000000002 RDI: ffff888c4fa57d58
[ 63.925903][ T7791] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 63.925907][ T7791] #PF: supervisor read access in kernel mode
[ 63.925908][ T7791] #PF: error_code(0x0000) - not-present page
[ 63.925910][ T7791] PGD 0 P4D 0
[ 63.925912][ T7791] Oops: 0000 [#1] SMP NOPTI
[ 63.925915][ T7791] CPU: 37 PID: 7791 Comm: usemem Not tainted 5.14.0-rc1-00535-gbdd265249f0d #1
[ 63.925919][ T7791] RIP: 0010:__blk_rq_map_sg+0x118/0x580
[ 63.925931][ T7791] Code: ff 0f 84 57 03 00 00 48 83 27 fd 48 8b 3a 44 89 44 24 20 48 89 54 24 18 e8 f5 e6 03 00 44 8b 44 24 20 48 8b 54 24 18 48 89 02 <48> 8b 30 83 e6 03 41 f6 c6 03 0f 85 1e 04 00 00 83 04 24 01 4c 09
[ 63.925933][ T7791] RSP: 0000:ffffc900246a76c0 EFLAGS: 00010202
[ 63.925936][ T7791] RAX: 0000000000000010 RBX: ffe7a0820a022881 RCX: 0000000000000010
[ 63.925937][ T7791] RDX: ffffc900246a7798 RSI: 00000000ffff8881 RDI: ffff888c8446b260
[ 63.925939][ T7791] RBP: ffe7a322cbc8c000 R08: 0000000000000000 R09: 0000000000001000
[ 63.925941][ T7791] R10: 0000000000000001 R11: ffff888208280880 R12: 0000000000000000
[ 63.925943][ T7791] R13: 0000000000000001 R14: ffff888c8b2f2300 R15: 0000000000001000
[ 63.925945][ T7791] FS: 00007f6cbd2d0700(0000) GS:ffff88984f540000(0000) knlGS:0000000000000000
[ 63.925947][ T7791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 63.925949][ T7791] CR2: 0000000000000010 CR3: 0000000c906f8004 CR4: 00000000007706e0
[ 63.925950][ T7791] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 63.925952][ T7791] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 63.925954][ T7791] PKRU: 55555554
[ 63.925955][ T7791] Call Trace:
[ 63.925963][ T7791] nvme_map_data+0xa9/0x840 [nvme]
[ 63.925970][ T7791] nvme_queue_rq+0x9c/0x240 [nvme]
[ 63.925973][ T7791] __blk_mq_try_issue_directly+0x144/0x200
[ 63.925977][ T7791] blk_mq_request_issue_directly+0x58/0xc0
[ 63.925980][ T7791] blk_mq_try_issue_list_directly+0x8a/0x100
[ 63.925983][ T7791] blk_mq_sched_insert_requests+0xb1/0x100
[ 63.925988][ T7791] blk_mq_flush_plug_list+0xfd/0x1c0
[ 63.925991][ T7791] blk_flush_plug_list+0xec/0x140
[ 63.925995][ T7791] blk_finish_plug+0x21/0x40
[ 63.925997][ T7791] shrink_lruvec+0x2a9/0x340
[ 63.926008][ T7791] ? vmpressure+0x26/0x140
[ 63.926018][ T7791] shrink_node+0x2cc/0x700
[ 63.926021][ T7791] do_try_to_free_pages+0xd2/0x400
[ 63.926025][ T7791] try_to_free_pages+0xf1/0x1c0
[ 63.926029][ T7791] __alloc_pages_slowpath+0x3be/0xd80
[ 63.926036][ T7791] ? get_page_from_freelist+0x1a6/0x400
[ 63.926040][ T7791] __alloc_pages+0x302/0x380
[ 63.926043][ T7791] do_huge_pmd_anonymous_page+0x10f/0x840
[ 63.926048][ T7791] ? task_tick_fair+0x7c/0x380
[ 63.926053][ T7791] __handle_mm_fault+0x8b5/0x900
[ 63.926059][ T7791] handle_mm_fault+0xdb/0x2c0
[ 63.926062][ T7791] do_user_addr_fault+0x1c7/0x680
[ 63.926068][ T7791] ? sched_clock_cpu+0x9/0xc0
[ 63.926073][ T7791] exc_page_fault+0x62/0x140
[ 63.926082][ T7791] ? asm_exc_page_fault+0x8/0x30
[ 63.926086][ T7791] asm_exc_page_fault+0x1e/0x30
[ 63.926089][ T7791] RIP: 0033:0x556ce03fad3b
[ 63.926091][ T7791] Code: 83 c4 08 c3 48 8d 3d 3d 23 00 00 e8 1f f6 ff ff bf 01 00 00 00 e8 85 f6 ff ff 85 d2 74 08 48 8d 04 f7 48 8b 00 c3 48 8d 04 f7 <48> 89 30 b8 00 00 00 00 c3 48 89 f8 48 29 f0 48 8d 90 00 ca 9a 3b
[ 63.926093][ T7791] RSP: 002b:00007f6cbd2cfb08 EFLAGS: 00010246
[ 63.926096][ T7791] RAX: 00007efba1800000 RBX: 000000001cdc8400 RCX: 0000000000000018
[ 63.926098][ T7791] RDX: 0000000000000000 RSI: 000000001cdc8400 RDI: 00007efaba9be000
[ 63.926100][ T7791] RBP: 000000001cdc8400 R08: 0000000061165332 R09: 00007fff025e8080
[ 63.926101][ T7791] R10: 0000000000019bd2 R11: 0000000000000246 R12: 00000000e6e42000
[ 63.926103][ T7791] R13: 00007efaba9be000 R14: 00007f6cbd2cfb9c R15: 00007f6cbd2cfca0
[ 63.926105][ T7791] Modules linked in: xfs loop binfmt_misc intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul blake2b_generic xor ghash_clmulni_intel zstd_compress ast drm_vram_helper raid6_pq ipmi_ssif drm_ttm_helper rapl libcrc32c ttm intel_cstate drm_kms_helper syscopyarea crc32c_intel sysfillrect sysimgblt ahci fb_sys_fops libahci nvme intel_uncore nvme_core mei_me acpi_ipmi drm ioatdma t10_pi ipmi_si mei libata intel_pch_thermal joydev dca wmi ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables
[ 63.926167][ T7791] CR2: 0000000000000010
[ 63.926265][ T7791] ---[ end trace ab22ba5949515e86 ]---
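The log above interleaves output from several tasks (T7780, T7809, T7783, T7791), so each "Modules linked in:" list and register dump is split across non-adjacent lines. A minimal sketch for regrouping lines by the task ID field, assuming every line keeps the "[ timestamp][ Ttid] message" prefix shown above; the function name split_by_task and the embedded SAMPLE excerpt are illustrative only and are not part of lkp-tests:

```python
import re
from collections import defaultdict

# Matches the "[ timestamp][ Ttid] message" prefix used in the log above.
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\[\s*(?P<task>T\d+)\]\s?(?P<msg>.*)$"
)


def split_by_task(lines):
    """Group log lines by task ID, keeping the original order within each task."""
    per_task = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that lack the expected prefix
        per_task[m.group("task")].append((float(m.group("ts")), m.group("msg")))
    return dict(per_task)


# A few lines copied from the report; in practice, read the attached dmesg.
SAMPLE = """\
[ 63.626135][ T7809] Invalid SGL for payload:28672 nents:7
[ 63.626197][ T7809] WARNING: CPU: 17 PID: 7809 at drivers/nvme/host/pci.c:713 nvme_map_data+0x7c7/0x840 [nvme]
[ 63.632675][ T7780] Write error (-5) on dio swapfile (67117056)
[ 63.638722][ T7809] Modules linked in:
[ 63.649310][ T7780] Write error (-5) on dio swapfile (67129344)
[ 63.655971][ T7809] xfs
"""

if __name__ == "__main__":
    for task, entries in split_by_task(SAMPLE.splitlines()).items():
        print(f"== {task} ==")
        for ts, msg in entries:
            print(f"[{ts:12.6f}] {msg}")
```

Run against the decompressed dmesg attachment, this should make it easier to read the nvme_map_data warning (T7809), the __blk_rq_map_sg warning (T7780), and the NULL pointer dereference (T7791) as three separate traces.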



To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (10.97 kB)
config-5.14.0-rc1-00535-gbdd265249f0d (178.15 kB)
job-script (8.30 kB)
dmesg.xz (40.84 kB)
job.yaml (5.60 kB)