2020-04-07 02:29:09

by Qian Cai

[permalink] [raw]
Subject: Linux-next POWER9 NULL pointer NIP since 1st Apr.

Ever since 1st Apr, linux-next starts to trigger a NULL pointer NIP on POWER9 below using
this config,

https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config

It takes a while to reproduce, so before I bury myself into bisecting and just send a head-up
to see if anyone spots anything obvious.

[ 206.744625][T13224] LTP: starting fallocate04
[ 207.601583][T27684] /dev/zero: Can't open blockdev
[ 208.674301][T27684] EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
[ 208.680347][T27684] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 208.680383][T27684] Faulting instruction address: 0x00000000
[ 208.680406][T27684] Oops: Kernel access of bad area, sig: 11 [#1]
[ 208.680439][T27684] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 208.680474][T27684] Modules linked in: ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 208.680576][T27684] CPU: 117 PID: 27684 Comm: fallocate04 Tainted: G W 5.6.0-next-20200401+ #288
[ 208.680614][T27684] NIP: 0000000000000000 LR: c0080000102c0048 CTR: 0000000000000000
[ 208.680657][T27684] REGS: c000200361def420 TRAP: 0400 Tainted: G W (5.6.0-next-20200401+)
[ 208.680700][T27684] MSR: 900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42022228 XER: 20040000
[ 208.680760][T27684] CFAR: c00800001032c494 IRQMASK: 0
[ 208.680760][T27684] GPR00: c0000000005ac3f8 c000200361def6b0 c00000000165c200 c00020107dae0bd0
[ 208.680760][T27684] GPR04: 0000000000000000 0000000000000400 0000000000000000 0000000000000000
[ 208.680760][T27684] GPR08: c000200361def6e8 c0080000102c0040 000000007fffffff c000000001614e80
[ 208.680760][T27684] GPR12: 0000000000000000 c000201fff671280 0000000000000000 0000000000000002
[ 208.680760][T27684] GPR16: 0000000000000002 0000000000040001 c00020030f5a1000 c00020030f5a1548
[ 208.680760][T27684] GPR20: c0000000015fbad8 c00000000168c654 c000200361def818 c0000000005b4c10
[ 208.680760][T27684] GPR24: 0000000000000000 c0080000103365b8 c00020107dae0bd0 0000000000000400
[ 208.680760][T27684] GPR28: c00000000168c3a8 0000000000000000 0000000000000000 0000000000000000
[ 208.681014][T27684] NIP [0000000000000000] 0x0
[ 208.681065][T27684] LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4]
[ 208.681091][T27684] Call Trace:
[ 208.681129][T27684] [c000200361def6b0] [c0000000005ac3bc] iomap_apply+0x20c/0x920 (unreliable)
iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
[ 208.681173][T27684] [c000200361def7f0] [c0000000005b4adc] iomap_bmap+0xfc/0x160
iomap_bmap at fs/iomap/fiemap.c:142
[ 208.681228][T27684] [c000200361def850] [c0080000102c2c1c] ext4_bmap+0xa4/0x180 [ext4]
ext4_bmap at fs/ext4/inode.c:3213
[ 208.681260][T27684] [c000200361def890] [c0000000004f71fc] bmap+0x4c/0x80
[ 208.681281][T27684] [c000200361def8c0] [c00800000fdb0acc] jbd2_journal_init_inode+0x44/0x1a0 [jbd2]
jbd2_journal_init_inode at fs/jbd2/journal.c:1255
[ 208.681326][T27684] [c000200361def960] [c00800001031c808] ext4_load_journal+0x440/0x860 [ext4]
[ 208.681371][T27684] [c000200361defa30] [c008000010322a14] ext4_fill_super+0x342c/0x3ab0 [ext4]
[ 208.681414][T27684] [c000200361defba0] [c0000000004cb0bc] mount_bdev+0x25c/0x290
[ 208.681478][T27684] [c000200361defc40] [c008000010310250] ext4_mount+0x28/0x50 [ext4]
[ 208.681520][T27684] [c000200361defc60] [c00000000053242c] legacy_get_tree+0x4c/0xb0
[ 208.681556][T27684] [c000200361defc90] [c0000000004c864c] vfs_get_tree+0x4c/0x130
[ 208.681593][T27684] [c000200361defd00] [c00000000050a1c8] do_mount+0xa18/0xc50
[ 208.681641][T27684] [c000200361defdd0] [c00000000050a9a8] sys_mount+0x158/0x180
[ 208.681679][T27684] [c000200361defe20] [c00000000000b3f8] system_call+0x5c/0x68
[ 208.681726][T27684] Instruction dump:
[ 208.681747][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 208.681797][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 208.681839][T27684] ---[ end trace 4e9e2bab7f1d4048 ]---
[ 208.802259][T27684]
[ 209.802373][T27684] Kernel panic - not syncing: Fatal exception

[ 215.281666][T16896] LTP: starting chown04_16
[ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 215.424289][T18297] Faulting instruction address: 0x00000000
[ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
[ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
[ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
[ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
[ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
[ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
[ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
[ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
[ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
[ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
[ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
[ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
[ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
[ 215.424914][T18297] NIP [0000000000000000] 0x0
[ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
find_free_cb at drivers/block/loop.c:2129
[ 215.424997][T18297] Call Trace:
[ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
[ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
loop_lookup at drivers/block/loop.c:2144
[ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
[ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
[ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
[ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
[ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
[ 215.425314][T18297] Instruction dump:
[ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
[ 215.642114][T18297]
[ 216.642220][T18297] Kernel panic - not syncing: Fatal exception


2020-04-07 12:44:16

by Michael Ellerman

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.

Qian Cai <[email protected]> writes:
> Ever since 1st Apr, linux-next starts to trigger a NULL pointer NIP on POWER9 below using
> this config,
>
> https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config
>
> It takes a while to reproduce, so before I bury myself into bisecting and just send a head-up
> to see if anyone spots anything obvious.
>
> [ 206.744625][T13224] LTP: starting fallocate04
> [ 207.601583][T27684] /dev/zero: Can't open blockdev
> [ 208.674301][T27684] EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
> [ 208.680347][T27684] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [ 208.680383][T27684] Faulting instruction address: 0x00000000
> [ 208.680406][T27684] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 208.680439][T27684] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> [ 208.680474][T27684] Modules linked in: ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
> [ 208.680576][T27684] CPU: 117 PID: 27684 Comm: fallocate04 Tainted: G W 5.6.0-next-20200401+ #288
> [ 208.680614][T27684] NIP: 0000000000000000 LR: c0080000102c0048 CTR: 0000000000000000
> [ 208.680657][T27684] REGS: c000200361def420 TRAP: 0400 Tainted: G W (5.6.0-next-20200401+)
> [ 208.680700][T27684] MSR: 900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42022228 XER: 20040000
> [ 208.680760][T27684] CFAR: c00800001032c494 IRQMASK: 0
> [ 208.680760][T27684] GPR00: c0000000005ac3f8 c000200361def6b0 c00000000165c200 c00020107dae0bd0
> [ 208.680760][T27684] GPR04: 0000000000000000 0000000000000400 0000000000000000 0000000000000000
> [ 208.680760][T27684] GPR08: c000200361def6e8 c0080000102c0040 000000007fffffff c000000001614e80
> [ 208.680760][T27684] GPR12: 0000000000000000 c000201fff671280 0000000000000000 0000000000000002
> [ 208.680760][T27684] GPR16: 0000000000000002 0000000000040001 c00020030f5a1000 c00020030f5a1548
> [ 208.680760][T27684] GPR20: c0000000015fbad8 c00000000168c654 c000200361def818 c0000000005b4c10
> [ 208.680760][T27684] GPR24: 0000000000000000 c0080000103365b8 c00020107dae0bd0 0000000000000400
> [ 208.680760][T27684] GPR28: c00000000168c3a8 0000000000000000 0000000000000000 0000000000000000
> [ 208.681014][T27684] NIP [0000000000000000] 0x0
> [ 208.681065][T27684] LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4]

That LR looks like it's pointing to the return from _mcount in
ext4_iomap_end(), which means we have probably crashed in ftrace
somewhere.

Did you have tracing enabled when you ran the test? Or does it do
tracing itself?

cheers

> [ 208.681091][T27684] Call Trace:
> [ 208.681129][T27684] [c000200361def6b0] [c0000000005ac3bc] iomap_apply+0x20c/0x920 (unreliable) iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
> [ 208.681173][T27684] [c000200361def7f0] [c0000000005b4adc] iomap_bmap+0xfc/0x160 iomap_bmap at fs/iomap/fiemap.c:142
> [ 208.681228][T27684] [c000200361def850] [c0080000102c2c1c] ext4_bmap+0xa4/0x180 [ext4] ext4_bmap at fs/ext4/inode.c:3213
> [ 208.681260][T27684] [c000200361def890] [c0000000004f71fc] bmap+0x4c/0x80
> [ 208.681281][T27684] [c000200361def8c0] [c00800000fdb0acc] jbd2_journal_init_inode+0x44/0x1a0 [jbd2] jbd2_journal_init_inode at fs/jbd2/journal.c:1255
> [ 208.681326][T27684] [c000200361def960] [c00800001031c808] ext4_load_journal+0x440/0x860 [ext4]
> [ 208.681371][T27684] [c000200361defa30] [c008000010322a14] ext4_fill_super+0x342c/0x3ab0 [ext4]
> [ 208.681414][T27684] [c000200361defba0] [c0000000004cb0bc] mount_bdev+0x25c/0x290
> [ 208.681478][T27684] [c000200361defc40] [c008000010310250] ext4_mount+0x28/0x50 [ext4]
> [ 208.681520][T27684] [c000200361defc60] [c00000000053242c] legacy_get_tree+0x4c/0xb0
> [ 208.681556][T27684] [c000200361defc90] [c0000000004c864c] vfs_get_tree+0x4c/0x130
> [ 208.681593][T27684] [c000200361defd00] [c00000000050a1c8] do_mount+0xa18/0xc50
> [ 208.681641][T27684] [c000200361defdd0] [c00000000050a9a8] sys_mount+0x158/0x180
> [ 208.681679][T27684] [c000200361defe20] [c00000000000b3f8] system_call+0x5c/0x68
> [ 208.681726][T27684] Instruction dump:
> [ 208.681747][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 208.681797][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 208.681839][T27684] ---[ end trace 4e9e2bab7f1d4048 ]---
> [ 208.802259][T27684]
> [ 209.802373][T27684] Kernel panic - not syncing: Fatal exception
>
> [ 215.281666][T16896] LTP: starting chown04_16
> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [ 215.424289][T18297] Faulting instruction address: 0x00000000
> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
> [ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
> [ 215.424914][T18297] NIP [0000000000000000] 0x0
> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
> find_free_cb at drivers/block/loop.c:2129
> [ 215.424997][T18297] Call Trace:
> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
> loop_lookup at drivers/block/loop.c:2144
> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
> [ 215.425314][T18297] Instruction dump:
> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
> [ 215.642114][T18297]
> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-07 13:02:20

by Qian Cai

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.

+ Steven

> On Apr 7, 2020, at 8:42 AM, Michael Ellerman <[email protected]> wrote:
>
> Qian Cai <[email protected]> writes:
>> Ever since 1st Apr, linux-next starts to trigger a NULL pointer NIP on POWER9 below using
>> this config,
>>
>> https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config
>>
>> It takes a while to reproduce, so before I bury myself into bisecting and just send a head-up
>> to see if anyone spots anything obvious.
>>
>> [ 206.744625][T13224] LTP: starting fallocate04
>> [ 207.601583][T27684] /dev/zero: Can't open blockdev
>> [ 208.674301][T27684] EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
>> [ 208.680347][T27684] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>> [ 208.680383][T27684] Faulting instruction address: 0x00000000
>> [ 208.680406][T27684] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 208.680439][T27684] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>> [ 208.680474][T27684] Modules linked in: ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 208.680576][T27684] CPU: 117 PID: 27684 Comm: fallocate04 Tainted: G W 5.6.0-next-20200401+ #288
>> [ 208.680614][T27684] NIP: 0000000000000000 LR: c0080000102c0048 CTR: 0000000000000000
>> [ 208.680657][T27684] REGS: c000200361def420 TRAP: 0400 Tainted: G W (5.6.0-next-20200401+)
>> [ 208.680700][T27684] MSR: 900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42022228 XER: 20040000
>> [ 208.680760][T27684] CFAR: c00800001032c494 IRQMASK: 0
>> [ 208.680760][T27684] GPR00: c0000000005ac3f8 c000200361def6b0 c00000000165c200 c00020107dae0bd0
>> [ 208.680760][T27684] GPR04: 0000000000000000 0000000000000400 0000000000000000 0000000000000000
>> [ 208.680760][T27684] GPR08: c000200361def6e8 c0080000102c0040 000000007fffffff c000000001614e80
>> [ 208.680760][T27684] GPR12: 0000000000000000 c000201fff671280 0000000000000000 0000000000000002
>> [ 208.680760][T27684] GPR16: 0000000000000002 0000000000040001 c00020030f5a1000 c00020030f5a1548
>> [ 208.680760][T27684] GPR20: c0000000015fbad8 c00000000168c654 c000200361def818 c0000000005b4c10
>> [ 208.680760][T27684] GPR24: 0000000000000000 c0080000103365b8 c00020107dae0bd0 0000000000000400
>> [ 208.680760][T27684] GPR28: c00000000168c3a8 0000000000000000 0000000000000000 0000000000000000
>> [ 208.681014][T27684] NIP [0000000000000000] 0x0
>> [ 208.681065][T27684] LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4]
>
> That LR looks like it's pointing to the return from _mcount in
> ext4_iomap_end(), which means we have probably crashed in ftrace
> somewhere.
>
> Did you have tracing enabled when you ran the test? Or does it do
> tracing itself?

Yes, it run ftrace at first before running LTP to trigger it,

https://github.com/cailca/linux-mm/blob/master/test.sh

echo function > /sys/kernel/debug/tracing/current_tracer
echo nop > /sys/kernel/debug/tracing/current_tracer

There is another crash with even non-NULL NIP, but then symbol behaves weird.

# ./scripts/faddr2line vmlinux sysctl_net_busy_read+0x0/0x4
skipping sysctl_net_busy_read address at 0xc0000000016804ac due to non-function symbol of type 'D'

[ 148.110969][T13115] LTP: starting chown04_16
[ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
[ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
[ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
[ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
[ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
[ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
[ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
[ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
[ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
[ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
[ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
[ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
[ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
[ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
[ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
[ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
[ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30 [loop]
[ 148.255528][T13380] Call Trace:
[ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0] idr_for_each+0xf0/0x170 (unreliable)
[ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4] loop_lookup.part.1+0x4c/0xb0 [loop]
[ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8] loop_control_ioctl+0x120/0x1d0 [loop]
[ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08] ksys_ioctl+0xd8/0x130
[ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88] sys_ioctl+0x28/0x40
[ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378] system_call+0x5c/0x68
[ 148.255699][T13380] Instruction dump:
[ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
[ 148.576663][T13380]
[ 149.576765][T13380] Kernel panic - not syncing: Fatal exception

The bisect so far indicated the bad ones always have this,

aa1a8ce53332 Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

I’ll go to bisect some more but it is going to take a while.

$ git log --oneline 4c205c84e249..8e99cf91b99b
8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
276836260301 tracing: Create set_event_notrace_pid to not trace tasks
b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact
6a13a0d7b4d1 ftrace/kprobe: Show the maxactive number on kprobe_events
8a815e6b8b88 tracing: Have the document reflect that the trace file keeps tracing enabled
c9b7a4a72ff6 ring-buffer/tracing: Have iterator acknowledge dropped events
06e0a548bad0 tracing: Do not disable tracing when reading the trace file
1039221cc278 ring-buffer: Do not disable recording when there is an iterator
07b8b10ec94f ring-buffer: Make resize disable per cpu buffer instead of total buffer
153368ce1bd0 ring-buffer: Optimize rb_iter_head_event()
ff84c50cfb4b ring-buffer: Do not die if rb_iter_peek() fails more than thrice
785888c544e0 ring-buffer: Have rb_iter_head_event() handle concurrent writer
28e3fc56a471 ring-buffer: Add page_stamp to iterator for synchronization
bc1a72afdc4a ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
ead6ecfddea5 ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
ff895103a84a tracing: Save off entry when peeking at next entry
8c77f0ba4156 selftest/ftrace: Fix function trigger test to handle trace not disabling the tracer
bf2cbe044da2 tracing: Use address-of operator on section symbols
bbd9d05618a6 gpu/trace: add a gpu total memory usage tracepoint
89b74cac7834 tools/bootconfig: Show line and column in parse error
306b69dce926 bootconfig: Support O=<builddir> option
5412e0b763e0 tracing: Remove unused TRACE_BUFFER bits
b396bfdebffc tracing: Have hwlat ts be first instance and record count of instances


>
> cheers
>
>> [ 208.681091][T27684] Call Trace:
>> [ 208.681129][T27684] [c000200361def6b0] [c0000000005ac3bc] iomap_apply+0x20c/0x920 (unreliable) iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
>> [ 208.681173][T27684] [c000200361def7f0] [c0000000005b4adc] iomap_bmap+0xfc/0x160 iomap_bmap at fs/iomap/fiemap.c:142
>> [ 208.681228][T27684] [c000200361def850] [c0080000102c2c1c] ext4_bmap+0xa4/0x180 [ext4] ext4_bmap at fs/ext4/inode.c:3213
>> [ 208.681260][T27684] [c000200361def890] [c0000000004f71fc] bmap+0x4c/0x80
>> [ 208.681281][T27684] [c000200361def8c0] [c00800000fdb0acc] jbd2_journal_init_inode+0x44/0x1a0 [jbd2] jbd2_journal_init_inode at fs/jbd2/journal.c:1255
>> [ 208.681326][T27684] [c000200361def960] [c00800001031c808] ext4_load_journal+0x440/0x860 [ext4]
>> [ 208.681371][T27684] [c000200361defa30] [c008000010322a14] ext4_fill_super+0x342c/0x3ab0 [ext4]
>> [ 208.681414][T27684] [c000200361defba0] [c0000000004cb0bc] mount_bdev+0x25c/0x290
>> [ 208.681478][T27684] [c000200361defc40] [c008000010310250] ext4_mount+0x28/0x50 [ext4]
>> [ 208.681520][T27684] [c000200361defc60] [c00000000053242c] legacy_get_tree+0x4c/0xb0
>> [ 208.681556][T27684] [c000200361defc90] [c0000000004c864c] vfs_get_tree+0x4c/0x130
>> [ 208.681593][T27684] [c000200361defd00] [c00000000050a1c8] do_mount+0xa18/0xc50
>> [ 208.681641][T27684] [c000200361defdd0] [c00000000050a9a8] sys_mount+0x158/0x180
>> [ 208.681679][T27684] [c000200361defe20] [c00000000000b3f8] system_call+0x5c/0x68
>> [ 208.681726][T27684] Instruction dump:
>> [ 208.681747][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 208.681797][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 208.681839][T27684] ---[ end trace 4e9e2bab7f1d4048 ]---
>> [ 208.802259][T27684]
>> [ 209.802373][T27684] Kernel panic - not syncing: Fatal exception
>>
>> [ 215.281666][T16896] LTP: starting chown04_16
>> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>> [ 215.424289][T18297] Faulting instruction address: 0x00000000
>> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
>> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
>> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
>> [ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
>> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
>> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
>> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
>> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
>> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
>> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
>> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
>> [ 215.424914][T18297] NIP [0000000000000000] 0x0
>> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
>> find_free_cb at drivers/block/loop.c:2129
>> [ 215.424997][T18297] Call Trace:
>> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
>> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
>> loop_lookup at drivers/block/loop.c:2144
>> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
>> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
>> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
>> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
>> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
>> [ 215.425314][T18297] Instruction dump:
>> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
>> [ 215.642114][T18297]
>> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-07 13:32:11

by Steven Rostedt

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.

On Tue, 7 Apr 2020 09:01:10 -0400
Qian Cai <[email protected]> wrote:

> + Steven
>
> > On Apr 7, 2020, at 8:42 AM, Michael Ellerman <[email protected]> wrote:
> >
> > Qian Cai <[email protected]> writes:
> >> Ever since 1st Apr, linux-next starts to trigger a NULL pointer NIP on POWER9 below using
> >> this config,
> >>
> >> https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config
> >>
> >> It takes a while to reproduce, so before I bury myself into bisecting and just send a head-up
> >> to see if anyone spots anything obvious.
> >>
> >> [ 206.744625][T13224] LTP: starting fallocate04
> >> [ 207.601583][T27684] /dev/zero: Can't open blockdev
> >> [ 208.674301][T27684] EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
> >> [ 208.680347][T27684] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> >> [ 208.680383][T27684] Faulting instruction address: 0x00000000
> >> [ 208.680406][T27684] Oops: Kernel access of bad area, sig: 11 [#1]
> >> [ 208.680439][T27684] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> >> [ 208.680474][T27684] Modules linked in: ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
> >> [ 208.680576][T27684] CPU: 117 PID: 27684 Comm: fallocate04 Tainted: G W 5.6.0-next-20200401+ #288
> >> [ 208.680614][T27684] NIP: 0000000000000000 LR: c0080000102c0048 CTR: 0000000000000000
> >> [ 208.680657][T27684] REGS: c000200361def420 TRAP: 0400 Tainted: G W (5.6.0-next-20200401+)
> >> [ 208.680700][T27684] MSR: 900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42022228 XER: 20040000
> >> [ 208.680760][T27684] CFAR: c00800001032c494 IRQMASK: 0
> >> [ 208.680760][T27684] GPR00: c0000000005ac3f8 c000200361def6b0 c00000000165c200 c00020107dae0bd0
> >> [ 208.680760][T27684] GPR04: 0000000000000000 0000000000000400 0000000000000000 0000000000000000
> >> [ 208.680760][T27684] GPR08: c000200361def6e8 c0080000102c0040 000000007fffffff c000000001614e80
> >> [ 208.680760][T27684] GPR12: 0000000000000000 c000201fff671280 0000000000000000 0000000000000002
> >> [ 208.680760][T27684] GPR16: 0000000000000002 0000000000040001 c00020030f5a1000 c00020030f5a1548
> >> [ 208.680760][T27684] GPR20: c0000000015fbad8 c00000000168c654 c000200361def818 c0000000005b4c10
> >> [ 208.680760][T27684] GPR24: 0000000000000000 c0080000103365b8 c00020107dae0bd0 0000000000000400
> >> [ 208.680760][T27684] GPR28: c00000000168c3a8 0000000000000000 0000000000000000 0000000000000000
> >> [ 208.681014][T27684] NIP [0000000000000000] 0x0
> >> [ 208.681065][T27684] LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4]
> >
> > That LR looks like it's pointing to the return from _mcount in
> > ext4_iomap_end(), which means we have probably crashed in ftrace
> > somewhere.
> >
> > Did you have tracing enabled when you ran the test? Or does it do
> > tracing itself?
>
> Yes, it run ftrace at first before running LTP to trigger it,
>
> https://github.com/cailca/linux-mm/blob/master/test.sh
>
> echo function > /sys/kernel/debug/tracing/current_tracer
> echo nop > /sys/kernel/debug/tracing/current_tracer
>
> There is another crash with even non-NULL NIP, but then symbol behaves weird.
>
> # ./scripts/faddr2line vmlinux sysctl_net_busy_read+0x0/0x4
> skipping sysctl_net_busy_read address at 0xc0000000016804ac due to non-function symbol of type 'D'
>
> [ 148.110969][T13115] LTP: starting chown04_16
> [ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
> [ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
> [ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
> [ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> [ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
> [ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
> [ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
> [ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
> [ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
> [ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
> [ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
> [ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
> [ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
> [ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
> [ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
> [ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
> [ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
> [ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30 [loop]
> [ 148.255528][T13380] Call Trace:
> [ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0] idr_for_each+0xf0/0x170 (unreliable)
> [ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4] loop_lookup.part.1+0x4c/0xb0 [loop]
> [ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8] loop_control_ioctl+0x120/0x1d0 [loop]
> [ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08] ksys_ioctl+0xd8/0x130
> [ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88] sys_ioctl+0x28/0x40
> [ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378] system_call+0x5c/0x68
> [ 148.255699][T13380] Instruction dump:
> [ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
> [ 148.576663][T13380]
> [ 149.576765][T13380] Kernel panic - not syncing: Fatal exception
>
> The bisect so far indicated the bad ones always have this,
>
> aa1a8ce53332 Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>
> I’ll go to bisect some more but it is going to take a while.
>
> $ git log --oneline 4c205c84e249..8e99cf91b99b
> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
> ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
> ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
> 276836260301 tracing: Create set_event_notrace_pid to not trace tasks

> b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
> 717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact

If it is affecting function tracing, it is probably one of the above two
commits.

-- Steve


> 6a13a0d7b4d1 ftrace/kprobe: Show the maxactive number on kprobe_events
> 8a815e6b8b88 tracing: Have the document reflect that the trace file keeps tracing enabled
> c9b7a4a72ff6 ring-buffer/tracing: Have iterator acknowledge dropped events
> 06e0a548bad0 tracing: Do not disable tracing when reading the trace file
> 1039221cc278 ring-buffer: Do not disable recording when there is an iterator
> 07b8b10ec94f ring-buffer: Make resize disable per cpu buffer instead of total buffer
> 153368ce1bd0 ring-buffer: Optimize rb_iter_head_event()
> ff84c50cfb4b ring-buffer: Do not die if rb_iter_peek() fails more than thrice
> 785888c544e0 ring-buffer: Have rb_iter_head_event() handle concurrent writer
> 28e3fc56a471 ring-buffer: Add page_stamp to iterator for synchronization
> bc1a72afdc4a ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
> ead6ecfddea5 ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
> ff895103a84a tracing: Save off entry when peeking at next entry
> 8c77f0ba4156 selftest/ftrace: Fix function trigger test to handle trace not disabling the tracer
> bf2cbe044da2 tracing: Use address-of operator on section symbols
> bbd9d05618a6 gpu/trace: add a gpu total memory usage tracepoint
> 89b74cac7834 tools/bootconfig: Show line and column in parse error
> 306b69dce926 bootconfig: Support O=<builddir> option
> 5412e0b763e0 tracing: Remove unused TRACE_BUFFER bits
> b396bfdebffc tracing: Have hwlat ts be first instance and record count of instances
>
>
> >
> > cheers
> >
> >> [ 208.681091][T27684] Call Trace:
> >> [ 208.681129][T27684] [c000200361def6b0] [c0000000005ac3bc] iomap_apply+0x20c/0x920 (unreliable) iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
> >> [ 208.681173][T27684] [c000200361def7f0] [c0000000005b4adc] iomap_bmap+0xfc/0x160 iomap_bmap at fs/iomap/fiemap.c:142
> >> [ 208.681228][T27684] [c000200361def850] [c0080000102c2c1c] ext4_bmap+0xa4/0x180 [ext4] ext4_bmap at fs/ext4/inode.c:3213
> >> [ 208.681260][T27684] [c000200361def890] [c0000000004f71fc] bmap+0x4c/0x80
> >> [ 208.681281][T27684] [c000200361def8c0] [c00800000fdb0acc] jbd2_journal_init_inode+0x44/0x1a0 [jbd2] jbd2_journal_init_inode at fs/jbd2/journal.c:1255
> >> [ 208.681326][T27684] [c000200361def960] [c00800001031c808] ext4_load_journal+0x440/0x860 [ext4]
> >> [ 208.681371][T27684] [c000200361defa30] [c008000010322a14] ext4_fill_super+0x342c/0x3ab0 [ext4]
> >> [ 208.681414][T27684] [c000200361defba0] [c0000000004cb0bc] mount_bdev+0x25c/0x290
> >> [ 208.681478][T27684] [c000200361defc40] [c008000010310250] ext4_mount+0x28/0x50 [ext4]
> >> [ 208.681520][T27684] [c000200361defc60] [c00000000053242c] legacy_get_tree+0x4c/0xb0
> >> [ 208.681556][T27684] [c000200361defc90] [c0000000004c864c] vfs_get_tree+0x4c/0x130
> >> [ 208.681593][T27684] [c000200361defd00] [c00000000050a1c8] do_mount+0xa18/0xc50
> >> [ 208.681641][T27684] [c000200361defdd0] [c00000000050a9a8] sys_mount+0x158/0x180
> >> [ 208.681679][T27684] [c000200361defe20] [c00000000000b3f8] system_call+0x5c/0x68
> >> [ 208.681726][T27684] Instruction dump:
> >> [ 208.681747][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> >> [ 208.681797][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> >> [ 208.681839][T27684] ---[ end trace 4e9e2bab7f1d4048 ]---
> >> [ 208.802259][T27684]
> >> [ 209.802373][T27684] Kernel panic - not syncing: Fatal exception
> >>
> >> [ 215.281666][T16896] LTP: starting chown04_16
> >> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> >> [ 215.424289][T18297] Faulting instruction address: 0x00000000
> >> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
> >> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> >> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
> >> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
> >> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
> >> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
> >> [ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
> >> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
> >> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
> >> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
> >> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
> >> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
> >> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
> >> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
> >> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
> >> [ 215.424914][T18297] NIP [0000000000000000] 0x0
> >> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
> >> find_free_cb at drivers/block/loop.c:2129
> >> [ 215.424997][T18297] Call Trace:
> >> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
> >> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
> >> loop_lookup at drivers/block/loop.c:2144
> >> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
> >> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
> >> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
> >> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
> >> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
> >> [ 215.425314][T18297] Instruction dump:
> >> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> >> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> >> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
> >> [ 215.642114][T18297]
> >> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-09 03:43:19

by Qian Cai

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.



> On Apr 7, 2020, at 9:30 AM, Steven Rostedt <[email protected]> wrote:
>
> On Tue, 7 Apr 2020 09:01:10 -0400
> Qian Cai <[email protected]> wrote:
>
>> + Steven
>>
>>> On Apr 7, 2020, at 8:42 AM, Michael Ellerman <[email protected]> wrote:
>>>
>>> Qian Cai <[email protected]> writes:
>>>> Ever since 1st Apr, linux-next starts to trigger a NULL pointer NIP on POWER9 below using
>>>> this config,
>>>>
>>>> https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config
>>>>
>>>> It takes a while to reproduce, so before I bury myself into bisecting and just send a head-up
>>>> to see if anyone spots anything obvious.
>>>>
>>>> [ 206.744625][T13224] LTP: starting fallocate04
>>>> [ 207.601583][T27684] /dev/zero: Can't open blockdev
>>>> [ 208.674301][T27684] EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
>>>> [ 208.680347][T27684] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>>>> [ 208.680383][T27684] Faulting instruction address: 0x00000000
>>>> [ 208.680406][T27684] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [ 208.680439][T27684] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>>>> [ 208.680474][T27684] Modules linked in: ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
>>>> [ 208.680576][T27684] CPU: 117 PID: 27684 Comm: fallocate04 Tainted: G W 5.6.0-next-20200401+ #288
>>>> [ 208.680614][T27684] NIP: 0000000000000000 LR: c0080000102c0048 CTR: 0000000000000000
>>>> [ 208.680657][T27684] REGS: c000200361def420 TRAP: 0400 Tainted: G W (5.6.0-next-20200401+)
>>>> [ 208.680700][T27684] MSR: 900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42022228 XER: 20040000
>>>> [ 208.680760][T27684] CFAR: c00800001032c494 IRQMASK: 0
>>>> [ 208.680760][T27684] GPR00: c0000000005ac3f8 c000200361def6b0 c00000000165c200 c00020107dae0bd0
>>>> [ 208.680760][T27684] GPR04: 0000000000000000 0000000000000400 0000000000000000 0000000000000000
>>>> [ 208.680760][T27684] GPR08: c000200361def6e8 c0080000102c0040 000000007fffffff c000000001614e80
>>>> [ 208.680760][T27684] GPR12: 0000000000000000 c000201fff671280 0000000000000000 0000000000000002
>>>> [ 208.680760][T27684] GPR16: 0000000000000002 0000000000040001 c00020030f5a1000 c00020030f5a1548
>>>> [ 208.680760][T27684] GPR20: c0000000015fbad8 c00000000168c654 c000200361def818 c0000000005b4c10
>>>> [ 208.680760][T27684] GPR24: 0000000000000000 c0080000103365b8 c00020107dae0bd0 0000000000000400
>>>> [ 208.680760][T27684] GPR28: c00000000168c3a8 0000000000000000 0000000000000000 0000000000000000
>>>> [ 208.681014][T27684] NIP [0000000000000000] 0x0
>>>> [ 208.681065][T27684] LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4]
>>>
>>> That LR looks like it's pointing to the return from _mcount in
>>> ext4_iomap_end(), which means we have probably crashed in ftrace
>>> somewhere.
>>>
>>> Did you have tracing enabled when you ran the test? Or does it do
>>> tracing itself?
>>
>> Yes, it run ftrace at first before running LTP to trigger it,
>>
>> https://github.com/cailca/linux-mm/blob/master/test.sh
>>
>> echo function > /sys/kernel/debug/tracing/current_tracer
>> echo nop > /sys/kernel/debug/tracing/current_tracer
>>
>> There is another crash with even non-NULL NIP, but then symbol behaves weird.
>>
>> # ./scripts/faddr2line vmlinux sysctl_net_busy_read+0x0/0x4
>> skipping sysctl_net_busy_read address at 0xc0000000016804ac due to non-function symbol of type 'D'
>>
>> [ 148.110969][T13115] LTP: starting chown04_16
>> [ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
>> [ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
>> [ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
>> [ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>> [ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
>> [ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
>> [ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
>> [ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>> [ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
>> [ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
>> [ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
>> [ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
>> [ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
>> [ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
>> [ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
>> [ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30 [loop]
>> [ 148.255528][T13380] Call Trace:
>> [ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0] idr_for_each+0xf0/0x170 (unreliable)
>> [ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4] loop_lookup.part.1+0x4c/0xb0 [loop]
>> [ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8] loop_control_ioctl+0x120/0x1d0 [loop]
>> [ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08] ksys_ioctl+0xd8/0x130
>> [ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88] sys_ioctl+0x28/0x40
>> [ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378] system_call+0x5c/0x68
>> [ 148.255699][T13380] Instruction dump:
>> [ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
>> [ 148.576663][T13380]
>> [ 149.576765][T13380] Kernel panic - not syncing: Fatal exception
>>
>> The bisect so far indicated the bad ones always have this,
>>
>> aa1a8ce53332 Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>
>> I’ll go to bisect some more but it is going to take a while.
>>
>> $ git log --oneline 4c205c84e249..8e99cf91b99b
>> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
>> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
>> ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
>> ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
>> 276836260301 tracing: Create set_event_notrace_pid to not trace tasks
>
>> b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
>> 717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact
>
> If it is affecting function tracing, it is probably one of the above two
> commits.

I tested reverting those 3 commits but it does NOT help.

276836260301 tracing: Create set_event_notrace_pid to not trace tasks
b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact

So, it is either one of those is bad.

9c94b39560c3 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
aa1a8ce53332 Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

At this point, I suspect this one in ext4_for_linus,

ac58e4fb03f9 ext4: move ext4 bmap to use iomap infrastructure

given the first a few days, it was crashed due to this new ext4_bmap -> iomap_bmap
callsite from the above commit,

LR: ext4_iomap_end

The next few days, even though it starts to crash at,

LR: find_free_cb

by the LTP chown04_16 which is still somewhat ext4 related.

# /opt/ltp/testcases/bin/chown04_16
chown04_16 0 TINFO : Found free device 0 '/dev/loop0'
chown04_16 0 TINFO : Formatting /dev/loop0 with ext2 opts='' extra opts=''
mke2fs 1.45.4 (23-Sep-2019)
chown04_16 1 TBROK : safe_macros.c:764: chown04.c:175: mount(/dev/loop0, mntpoint, ext2, 1, (nil)) failed: errno=ENODEV(19): No such device
chown04_16 2 TBROK : safe_macros.c:764: Remaining cases broken

Is it possible that iomap somewhat corrupt the NIP?

In the next a few hours, I should be able to tell if the trace merge commit is clear.

>
> -- Steve
>
>
>> 6a13a0d7b4d1 ftrace/kprobe: Show the maxactive number on kprobe_events
>> 8a815e6b8b88 tracing: Have the document reflect that the trace file keeps tracing enabled
>> c9b7a4a72ff6 ring-buffer/tracing: Have iterator acknowledge dropped events
>> 06e0a548bad0 tracing: Do not disable tracing when reading the trace file
>> 1039221cc278 ring-buffer: Do not disable recording when there is an iterator
>> 07b8b10ec94f ring-buffer: Make resize disable per cpu buffer instead of total buffer
>> 153368ce1bd0 ring-buffer: Optimize rb_iter_head_event()
>> ff84c50cfb4b ring-buffer: Do not die if rb_iter_peek() fails more than thrice
>> 785888c544e0 ring-buffer: Have rb_iter_head_event() handle concurrent writer
>> 28e3fc56a471 ring-buffer: Add page_stamp to iterator for synchronization
>> bc1a72afdc4a ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
>> ead6ecfddea5 ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
>> ff895103a84a tracing: Save off entry when peeking at next entry
>> 8c77f0ba4156 selftest/ftrace: Fix function trigger test to handle trace not disabling the tracer
>> bf2cbe044da2 tracing: Use address-of operator on section symbols
>> bbd9d05618a6 gpu/trace: add a gpu total memory usage tracepoint
>> 89b74cac7834 tools/bootconfig: Show line and column in parse error
>> 306b69dce926 bootconfig: Support O=<builddir> option
>> 5412e0b763e0 tracing: Remove unused TRACE_BUFFER bits
>> b396bfdebffc tracing: Have hwlat ts be first instance and record count of instances
>>
>>
>>>
>>> cheers
>>>
>>>> [ 208.681091][T27684] Call Trace:
>>>> [ 208.681129][T27684] [c000200361def6b0] [c0000000005ac3bc] iomap_apply+0x20c/0x920 (unreliable) iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
>>>> [ 208.681173][T27684] [c000200361def7f0] [c0000000005b4adc] iomap_bmap+0xfc/0x160 iomap_bmap at fs/iomap/fiemap.c:142
>>>> [ 208.681228][T27684] [c000200361def850] [c0080000102c2c1c] ext4_bmap+0xa4/0x180 [ext4] ext4_bmap at fs/ext4/inode.c:3213
>>>> [ 208.681260][T27684] [c000200361def890] [c0000000004f71fc] bmap+0x4c/0x80
>>>> [ 208.681281][T27684] [c000200361def8c0] [c00800000fdb0acc] jbd2_journal_init_inode+0x44/0x1a0 [jbd2] jbd2_journal_init_inode at fs/jbd2/journal.c:1255
>>>> [ 208.681326][T27684] [c000200361def960] [c00800001031c808] ext4_load_journal+0x440/0x860 [ext4]
>>>> [ 208.681371][T27684] [c000200361defa30] [c008000010322a14] ext4_fill_super+0x342c/0x3ab0 [ext4]
>>>> [ 208.681414][T27684] [c000200361defba0] [c0000000004cb0bc] mount_bdev+0x25c/0x290
>>>> [ 208.681478][T27684] [c000200361defc40] [c008000010310250] ext4_mount+0x28/0x50 [ext4]
>>>> [ 208.681520][T27684] [c000200361defc60] [c00000000053242c] legacy_get_tree+0x4c/0xb0
>>>> [ 208.681556][T27684] [c000200361defc90] [c0000000004c864c] vfs_get_tree+0x4c/0x130
>>>> [ 208.681593][T27684] [c000200361defd00] [c00000000050a1c8] do_mount+0xa18/0xc50
>>>> [ 208.681641][T27684] [c000200361defdd0] [c00000000050a9a8] sys_mount+0x158/0x180
>>>> [ 208.681679][T27684] [c000200361defe20] [c00000000000b3f8] system_call+0x5c/0x68
>>>> [ 208.681726][T27684] Instruction dump:
>>>> [ 208.681747][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 208.681797][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 208.681839][T27684] ---[ end trace 4e9e2bab7f1d4048 ]---
>>>> [ 208.802259][T27684]
>>>> [ 209.802373][T27684] Kernel panic - not syncing: Fatal exception
>>>>
>>>> [ 215.281666][T16896] LTP: starting chown04_16
>>>> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>>>> [ 215.424289][T18297] Faulting instruction address: 0x00000000
>>>> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>>>> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>>>> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
>>>> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
>>>> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
>>>> [ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>>>> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
>>>> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
>>>> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
>>>> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
>>>> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
>>>> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
>>>> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
>>>> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
>>>> [ 215.424914][T18297] NIP [0000000000000000] 0x0
>>>> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
>>>> find_free_cb at drivers/block/loop.c:2129
>>>> [ 215.424997][T18297] Call Trace:
>>>> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
>>>> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
>>>> loop_lookup at drivers/block/loop.c:2144
>>>> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
>>>> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
>>>> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
>>>> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
>>>> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
>>>> [ 215.425314][T18297] Instruction dump:
>>>> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
>>>> [ 215.642114][T18297]
>>>> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-09 10:07:28

by Qian Cai

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.



> On Apr 7, 2020, at 9:30 AM, Steven Rostedt <[email protected]> wrote:
>
> On Tue, 7 Apr 2020 09:01:10 -0400
> Qian Cai <[email protected]> wrote:
>
>> + Steven
>>
>>> On Apr 7, 2020, at 8:42 AM, Michael Ellerman <[email protected]> wrote:
>>>
>>> Qian Cai <[email protected]> writes:
>>>> Ever since 1st Apr, linux-next starts to trigger a NULL pointer NIP on POWER9 below using
>>>> this config,
>>>>
>>>> https://raw.githubusercontent.com/cailca/linux-mm/master/powerpc.config
>>>>
>>>> It takes a while to reproduce, so before I bury myself into bisecting and just send a head-up
>>>> to see if anyone spots anything obvious.
>>>>
>>>> [ 206.744625][T13224] LTP: starting fallocate04
>>>> [ 207.601583][T27684] /dev/zero: Can't open blockdev
>>>> [ 208.674301][T27684] EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
>>>> [ 208.680347][T27684] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>>>> [ 208.680383][T27684] Faulting instruction address: 0x00000000
>>>> [ 208.680406][T27684] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [ 208.680439][T27684] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>>>> [ 208.680474][T27684] Modules linked in: ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci mdio tg3 libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
>>>> [ 208.680576][T27684] CPU: 117 PID: 27684 Comm: fallocate04 Tainted: G W 5.6.0-next-20200401+ #288
>>>> [ 208.680614][T27684] NIP: 0000000000000000 LR: c0080000102c0048 CTR: 0000000000000000
>>>> [ 208.680657][T27684] REGS: c000200361def420 TRAP: 0400 Tainted: G W (5.6.0-next-20200401+)
>>>> [ 208.680700][T27684] MSR: 900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 42022228 XER: 20040000
>>>> [ 208.680760][T27684] CFAR: c00800001032c494 IRQMASK: 0
>>>> [ 208.680760][T27684] GPR00: c0000000005ac3f8 c000200361def6b0 c00000000165c200 c00020107dae0bd0
>>>> [ 208.680760][T27684] GPR04: 0000000000000000 0000000000000400 0000000000000000 0000000000000000
>>>> [ 208.680760][T27684] GPR08: c000200361def6e8 c0080000102c0040 000000007fffffff c000000001614e80
>>>> [ 208.680760][T27684] GPR12: 0000000000000000 c000201fff671280 0000000000000000 0000000000000002
>>>> [ 208.680760][T27684] GPR16: 0000000000000002 0000000000040001 c00020030f5a1000 c00020030f5a1548
>>>> [ 208.680760][T27684] GPR20: c0000000015fbad8 c00000000168c654 c000200361def818 c0000000005b4c10
>>>> [ 208.680760][T27684] GPR24: 0000000000000000 c0080000103365b8 c00020107dae0bd0 0000000000000400
>>>> [ 208.680760][T27684] GPR28: c00000000168c3a8 0000000000000000 0000000000000000 0000000000000000
>>>> [ 208.681014][T27684] NIP [0000000000000000] 0x0
>>>> [ 208.681065][T27684] LR [c0080000102c0048] ext4_iomap_end+0x8/0x30 [ext4]
>>>
>>> That LR looks like it's pointing to the return from _mcount in
>>> ext4_iomap_end(), which means we have probably crashed in ftrace
>>> somewhere.
>>>
>>> Did you have tracing enabled when you ran the test? Or does it do
>>> tracing itself?
>>
>> Yes, it run ftrace at first before running LTP to trigger it,
>>
>> https://github.com/cailca/linux-mm/blob/master/test.sh
>>
>> echo function > /sys/kernel/debug/tracing/current_tracer
>> echo nop > /sys/kernel/debug/tracing/current_tracer
>>
>> There is another crash with even non-NULL NIP, but then symbol behaves weird.
>>
>> # ./scripts/faddr2line vmlinux sysctl_net_busy_read+0x0/0x4
>> skipping sysctl_net_busy_read address at 0xc0000000016804ac due to non-function symbol of type 'D'
>>
>> [ 148.110969][T13115] LTP: starting chown04_16
>> [ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
>> [ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
>> [ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
>> [ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>> [ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
>> [ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
>> [ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
>> [ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>> [ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
>> [ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
>> [ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
>> [ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
>> [ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
>> [ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
>> [ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
>> [ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30 [loop]
>> [ 148.255528][T13380] Call Trace:
>> [ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0] idr_for_each+0xf0/0x170 (unreliable)
>> [ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4] loop_lookup.part.1+0x4c/0xb0 [loop]
>> [ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8] loop_control_ioctl+0x120/0x1d0 [loop]
>> [ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08] ksys_ioctl+0xd8/0x130
>> [ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88] sys_ioctl+0x28/0x40
>> [ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378] system_call+0x5c/0x68
>> [ 148.255699][T13380] Instruction dump:
>> [ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
>> [ 148.576663][T13380]
>> [ 149.576765][T13380] Kernel panic - not syncing: Fatal exception
>>
>> The bisect so far indicated the bad ones always have this,
>>
>> aa1a8ce53332 Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>
>> I’ll go to bisect some more but it is going to take a while.
>>
>> $ git log --oneline 4c205c84e249..8e99cf91b99b
>> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
>> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
>> ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
>> ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
>> 276836260301 tracing: Create set_event_notrace_pid to not trace tasks
>
>> b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
>> 717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact
>
> If it is affecting function tracing, it is probably one of the above two
> commits.

OK, it was narrowed down to one of those messed with mcount here,

8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
6a13a0d7b4d1 ftrace/kprobe: Show the maxactive number on kprobe_events
c9b7a4a72ff6 ring-buffer/tracing: Have iterator acknowledge dropped events
06e0a548bad0 tracing: Do not disable tracing when reading the trace file
1039221cc278 ring-buffer: Do not disable recording when there is an iterator
07b8b10ec94f ring-buffer: Make resize disable per cpu buffer instead of total buffer
153368ce1bd0 ring-buffer: Optimize rb_iter_head_event()
ff84c50cfb4b ring-buffer: Do not die if rb_iter_peek() fails more than thrice
785888c544e0 ring-buffer: Have rb_iter_head_event() handle concurrent writer
28e3fc56a471 ring-buffer: Add page_stamp to iterator for synchronization
bc1a72afdc4a ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
ead6ecfddea5 ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
ff895103a84a tracing: Save off entry when peeking at next entry
bf2cbe044da2 tracing: Use address-of operator on section symbols
bbd9d05618a6 gpu/trace: add a gpu total memory usage tracepoint
89b74cac7834 tools/bootconfig: Show line and column in parse error
306b69dce926 bootconfig: Support O=<builddir> option
5412e0b763e0 tracing: Remove unused TRACE_BUFFER bits
b396bfdebffc tracing: Have hwlat ts be first instance and record count of instances

This time happens with another random NIP.

[ 130.866777][ T8601] LTP: starting chown04_16
[ 131.004754][ T8868] kernel tried to execute exec-protected page (c000000005add4d4) - exploit attempt? (uid: 0)
[ 131.004804][ T8868] BUG: Unable to handle kernel instruction fetch
[ 131.004827][ T8868] Faulting instruction address: 0xc000000005add4d4
[ 131.004850][ T8868] Oops: Kernel access of bad area, sig: 11 [#1]
[ 131.004862][ T8868] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 131.004885][ T8868] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci tg3 mdio libahci libphy firmware_class libata dm_mirror dm_region_hash dm_log dm_mod
[ 131.004925][ T8868] CPU: 58 PID: 8868 Comm: chown04_16 Tainted: G W 5.6.0+ #6
[ 131.004939][ T8868] NIP: c000000005add4d4 LR: c00800000fb20408 CTR: c000000005add4d4
[ 131.004962][ T8868] REGS: c000001b3c32fa00 TRAP: 0400 Tainted: G W (5.6.0+)
[ 131.004993][ T8868] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
[ 131.005031][ T8868] CFAR: c00800000fb26534 IRQMASK: 0
[ 131.005031][ T8868] GPR00: c00000000097c638 c000001b3c32fc90 c000000001659400 0000000000000000
[ 131.005031][ T8868] GPR04: c00000004a96b800 c000001b3c32fd30 000000000b83d6b8 ffffffff000129f4
[ 131.005031][ T8868] GPR08: c00000004a96b800 0000000000000000 0000000000000000 c00000000164d720
[ 131.005031][ T8868] GPR12: c000000005add4d4 c000001ffffd0b00 0000000000000000 0000000000000000
[ 131.005031][ T8868] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 131.005031][ T8868] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 131.005031][ T8868] GPR24: 00007fffa4982d80 0000000000000000 c00800000fb2a488 c000001b3c32fd30
[ 131.005031][ T8868] GPR28: 0000000000000000 000000007fffffff c00800000fb20400 c000000335c88790
[ 131.005202][ T8868] NIP [c000000005add4d4] net_msg_warn+0x0/0x4
[ 131.005224][ T8868] LR [c00800000fb20408] find_free_cb+0x8/0x30 [loop]
[ 131.005243][ T8868] Call Trace:
[ 131.005253][ T8868] [c000001b3c32fc90] [c00000000097c690] idr_for_each+0xf0/0x170 (unreliable)
[ 131.005278][ T8868] [c000001b3c32fd10] [c00800000fb226c4] loop_lookup.part.1+0x4c/0xb0 [loop]
[ 131.005311][ T8868] [c000001b3c32fd50] [c00800000fb234d8] loop_control_ioctl+0x120/0x1d0 [loop]
[ 131.005344][ T8868] [c000001b3c32fdb0] [c0000000004dd778] ksys_ioctl+0xd8/0x130
[ 131.005376][ T8868] [c000001b3c32fe00] [c0000000004dd7f8] sys_ioctl+0x28/0x40
[ 131.005399][ T8868] [c000001b3c32fe20] [c00000000000b378] system_call+0x5c/0x68
[ 131.005438][ T8868] Instruction dump:
[ 131.005465][ T8868] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 131.005500][ T8868] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 131.005527][ T8868] ---[ end trace cfaf36049cb72805 ]---
[ 131.117039][ T8868]
[ 132.117139][ T8868] Kernel panic - not syncing: Fatal exception

>>
>>
>>>
>>> cheers
>>>
>>>> [ 208.681091][T27684] Call Trace:
>>>> [ 208.681129][T27684] [c000200361def6b0] [c0000000005ac3bc] iomap_apply+0x20c/0x920 (unreliable) iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
>>>> [ 208.681173][T27684] [c000200361def7f0] [c0000000005b4adc] iomap_bmap+0xfc/0x160 iomap_bmap at fs/iomap/fiemap.c:142
>>>> [ 208.681228][T27684] [c000200361def850] [c0080000102c2c1c] ext4_bmap+0xa4/0x180 [ext4] ext4_bmap at fs/ext4/inode.c:3213
>>>> [ 208.681260][T27684] [c000200361def890] [c0000000004f71fc] bmap+0x4c/0x80
>>>> [ 208.681281][T27684] [c000200361def8c0] [c00800000fdb0acc] jbd2_journal_init_inode+0x44/0x1a0 [jbd2] jbd2_journal_init_inode at fs/jbd2/journal.c:1255
>>>> [ 208.681326][T27684] [c000200361def960] [c00800001031c808] ext4_load_journal+0x440/0x860 [ext4]
>>>> [ 208.681371][T27684] [c000200361defa30] [c008000010322a14] ext4_fill_super+0x342c/0x3ab0 [ext4]
>>>> [ 208.681414][T27684] [c000200361defba0] [c0000000004cb0bc] mount_bdev+0x25c/0x290
>>>> [ 208.681478][T27684] [c000200361defc40] [c008000010310250] ext4_mount+0x28/0x50 [ext4]
>>>> [ 208.681520][T27684] [c000200361defc60] [c00000000053242c] legacy_get_tree+0x4c/0xb0
>>>> [ 208.681556][T27684] [c000200361defc90] [c0000000004c864c] vfs_get_tree+0x4c/0x130
>>>> [ 208.681593][T27684] [c000200361defd00] [c00000000050a1c8] do_mount+0xa18/0xc50
>>>> [ 208.681641][T27684] [c000200361defdd0] [c00000000050a9a8] sys_mount+0x158/0x180
>>>> [ 208.681679][T27684] [c000200361defe20] [c00000000000b3f8] system_call+0x5c/0x68
>>>> [ 208.681726][T27684] Instruction dump:
>>>> [ 208.681747][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 208.681797][T27684] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 208.681839][T27684] ---[ end trace 4e9e2bab7f1d4048 ]---
>>>> [ 208.802259][T27684]
>>>> [ 209.802373][T27684] Kernel panic - not syncing: Fatal exception
>>>>
>>>> [ 215.281666][T16896] LTP: starting chown04_16
>>>> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>>>> [ 215.424289][T18297] Faulting instruction address: 0x00000000
>>>> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>>>> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>>>> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
>>>> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
>>>> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
>>>> [ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>>>> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
>>>> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
>>>> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
>>>> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
>>>> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
>>>> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
>>>> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
>>>> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
>>>> [ 215.424914][T18297] NIP [0000000000000000] 0x0
>>>> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
>>>> find_free_cb at drivers/block/loop.c:2129
>>>> [ 215.424997][T18297] Call Trace:
>>>> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
>>>> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
>>>> loop_lookup at drivers/block/loop.c:2144
>>>> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
>>>> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
>>>> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
>>>> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
>>>> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
>>>> [ 215.425314][T18297] Instruction dump:
>>>> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>>>> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
>>>> [ 215.642114][T18297]
>>>> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-09 14:15:43

by Steven Rostedt

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.

On Thu, 9 Apr 2020 06:06:35 -0400
Qian Cai <[email protected]> wrote:

> >> I’ll go to bisect some more but it is going to take a while.
> >>
> >> $ git log --oneline 4c205c84e249..8e99cf91b99b
> >> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
> >> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
> >> ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
> >> ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
> >> 276836260301 tracing: Create set_event_notrace_pid to not trace tasks
> >
> >> b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
> >> 717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact
> >
> > If it is affecting function tracing, it is probably one of the above two
> > commits.
>
> OK, it was narrowed down to one of those messed with mcount here,

Thing is, nothing here touches mcount.

>
> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic

Touches reading the trace buffer.

> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid

Documentation.

> 6a13a0d7b4d1 ftrace/kprobe: Show the maxactive number on kprobe_events

kprobe output.

> c9b7a4a72ff6 ring-buffer/tracing: Have iterator acknowledge dropped events

Reading the buffer.

> 06e0a548bad0 tracing: Do not disable tracing when reading the trace file

Reading the buffer.

> 1039221cc278 ring-buffer: Do not disable recording when there is an iterator

Reading the buffer.

> 07b8b10ec94f ring-buffer: Make resize disable per cpu buffer instead of total buffer

Resizing the buffer.

> 153368ce1bd0 ring-buffer: Optimize rb_iter_head_event()

Reading the buffer.

> ff84c50cfb4b ring-buffer: Do not die if rb_iter_peek() fails more than thrice

Reading the buffer.

> 785888c544e0 ring-buffer: Have rb_iter_head_event() handle concurrent writer

Reading the buffer.

> 28e3fc56a471 ring-buffer: Add page_stamp to iterator for synchronization

Reading the buffer.

> bc1a72afdc4a ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()

Reading the buffer.

> ead6ecfddea5 ring-buffer: Have ring_buffer_empty() not depend on tracing stopped

Reading the buffer.

> ff895103a84a tracing: Save off entry when peeking at next entry

Reading the buffer.

> bf2cbe044da2 tracing: Use address-of operator on section symbols

Affects trace_printk()

> bbd9d05618a6 gpu/trace: add a gpu total memory usage tracepoint

New tracepoint infrastructure (just new trace events for gpu)

> 89b74cac7834 tools/bootconfig: Show line and column in parse error

Extended command line boot config.

> 306b69dce926 bootconfig: Support O=<builddir> option

Extended command line boot config

> 5412e0b763e0 tracing: Remove unused TRACE_BUFFER bits

Removed unused enums.

> b396bfdebffc tracing: Have hwlat ts be first instance and record count of instances

Affects only the hard ware latency detector (most likely not even
configured in the kernel).

So I don't understand how any of the above commits can cause a problem.

-- Steve

2020-04-10 19:23:10

by Qian Cai

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.



> On Apr 9, 2020, at 10:14 AM, Steven Rostedt <[email protected]> wrote:
>
> On Thu, 9 Apr 2020 06:06:35 -0400
> Qian Cai <[email protected]> wrote:
>
>>>> I’ll go to bisect some more but it is going to take a while.
>>>>
>>>> $ git log --oneline 4c205c84e249..8e99cf91b99b
>>>> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
>>>> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
>>>> ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
>>>> ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
>>>> 276836260301 tracing: Create set_event_notrace_pid to not trace tasks
>>>
>>>> b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
>>>> 717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact
>>>
>>> If it is affecting function tracing, it is probably one of the above two
>>> commits.
>>
>> OK, it was narrowed down to one of those messed with mcount here,
>
> Thing is, nothing here touches mcount.

Yes, you are right. I went back to test the commit just before the 5.7-trace merge request,
I did reproduce there. The thing is that this bastard could take more 6-hour to happen,
so my previous attempt did not wait long enough. Back to the square one...

>
>>
>> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
>
> Touches reading the trace buffer.
>
>> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
>
> Documentation.
>
>> 6a13a0d7b4d1 ftrace/kprobe: Show the maxactive number on kprobe_events
>
> kprobe output.
>
>> c9b7a4a72ff6 ring-buffer/tracing: Have iterator acknowledge dropped events
>
> Reading the buffer.
>
>> 06e0a548bad0 tracing: Do not disable tracing when reading the trace file
>
> Reading the buffer.
>
>> 1039221cc278 ring-buffer: Do not disable recording when there is an iterator
>
> Reading the buffer.
>
>> 07b8b10ec94f ring-buffer: Make resize disable per cpu buffer instead of total buffer
>
> Resizing the buffer.
>
>> 153368ce1bd0 ring-buffer: Optimize rb_iter_head_event()
>
> Reading the buffer.
>
>> ff84c50cfb4b ring-buffer: Do not die if rb_iter_peek() fails more than thrice
>
> Reading the buffer.
>
>> 785888c544e0 ring-buffer: Have rb_iter_head_event() handle concurrent writer
>
> Reading the buffer.
>
>> 28e3fc56a471 ring-buffer: Add page_stamp to iterator for synchronization
>
> Reading the buffer.
>
>> bc1a72afdc4a ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
>
> Reading the buffer.
>
>> ead6ecfddea5 ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
>
> Reading the buffer.
>
>> ff895103a84a tracing: Save off entry when peeking at next entry
>
> Reading the buffer.
>
>> bf2cbe044da2 tracing: Use address-of operator on section symbols
>
> Affects trace_printk()
>
>> bbd9d05618a6 gpu/trace: add a gpu total memory usage tracepoint
>
> New tracepoint infrastructure (just new trace events for gpu)
>
>> 89b74cac7834 tools/bootconfig: Show line and column in parse error
>
> Extended command line boot config.
>
>> 306b69dce926 bootconfig: Support O=<builddir> option
>
> Extended command line boot config
>
>> 5412e0b763e0 tracing: Remove unused TRACE_BUFFER bits
>
> Removed unused enums.
>
>> b396bfdebffc tracing: Have hwlat ts be first instance and record count of instances
>
> Affects only the hard ware latency detector (most likely not even
> configured in the kernel).
>
> So I don't understand how any of the above commits can cause a problem.
>
> -- Steve

2020-04-16 00:18:37

by Qian Cai

[permalink] [raw]
Subject: Re: Linux-next POWER9 NULL pointer NIP since 1st Apr.



> On Apr 10, 2020, at 3:20 PM, Qian Cai <[email protected]> wrote:
>
>
>
>> On Apr 9, 2020, at 10:14 AM, Steven Rostedt <[email protected]> wrote:
>>
>> On Thu, 9 Apr 2020 06:06:35 -0400
>> Qian Cai <[email protected]> wrote:
>>
>>>>> I’ll go to bisect some more but it is going to take a while.
>>>>>
>>>>> $ git log --oneline 4c205c84e249..8e99cf91b99b
>>>>> 8e99cf91b99b tracing: Do not allocate buffer in trace_find_next_entry() in atomic
>>>>> 2ab2a0924b99 tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
>>>>> ebed9628f5c2 selftests/ftrace: Add test to test new set_event_notrace_pid file
>>>>> ed8839e072b8 selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
>>>>> 276836260301 tracing: Create set_event_notrace_pid to not trace tasks
>>>>
>>>>> b3b1e6ededa4 ftrace: Create set_ftrace_notrace_pid to not trace tasks
>>>>> 717e3f5ebc82 ftrace: Make function trace pid filtering a bit more exact
>>>>
>>>> If it is affecting function tracing, it is probably one of the above two
>>>> commits.
>>>
>>> OK, it was narrowed down to one of those messed with mcount here,
>>
>> Thing is, nothing here touches mcount.
>
> Yes, you are right. I went back to test the commit just before the 5.7-trace merge request,
> I did reproduce there. The thing is that this bastard could take more 6-hour to happen,
> so my previous attempt did not wait long enough. Back to the square one…

OK, I starts to test all commits up to 12 hours. The progess on far is,

BAD: v5.6-rc1
GOOD: v5.5
GOOD: 153b5c566d30 Merge tag 'microblaze-v5.6-rc1' of git://git.monstr.eu/linux-2.6-microblaze

The next step I’ll be testing,

71c3a888cbca Merge tag 'powerpc-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

IF that is BAD, the merge request is the culprit. I can see a few commits are more related that others.

5290ae2b8e5f powerpc/64: Use {SAVE,REST}_NVGPRS macros
ed0bc98f8cbe powerpc/64s: Reimplement power4_idle code in C

Does it ring any bell yet?


2020-04-17 01:23:14

by Qian Cai

[permalink] [raw]
Subject: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

OK, reverted the commit,

c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)

or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,

https://lore.kernel.org/lkml/[email protected]/

[ 148.110969][T13115] LTP: starting chown04_16
[ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
[ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
[ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
[ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
[ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
[ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
[ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
[ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
[ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
[ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
[ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
[ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
[ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
[ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
[ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
[ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
[ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30 [loop]
[ 148.255528][T13380] Call Trace:
[ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0] idr_for_each+0xf0/0x170 (unreliable)
[ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4] loop_lookup.part.1+0x4c/0xb0 [loop]
[ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8] loop_control_ioctl+0x120/0x1d0 [loop]
[ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08] ksys_ioctl+0xd8/0x130
[ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88] sys_ioctl+0x28/0x40
[ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378] system_call+0x5c/0x68
[ 148.255699][T13380] Instruction dump:
[ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
[ 148.576663][T13380]
[ 149.576765][T13380] Kernel panic - not syncing: Fatal exception


2020-04-17 02:19:32

by Steven Rostedt

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

On Thu, 16 Apr 2020 21:19:10 -0400
Qian Cai <[email protected]> wrote:

> OK, reverted the commit,
>
> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>
> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,

This may be a symptom and not a cure.

>
> https://lore.kernel.org/lkml/[email protected]/
>
> [ 148.110969][T13115] LTP: starting chown04_16
> [ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
> [ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
> [ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
> [ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> [ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
> [ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
> [ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
> [ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
> [ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
> [ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
> [ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
> [ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
> [ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
> [ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
> [ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
> [ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
> [ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4

The instruction pointer is on sysctl_net_busy_read? Isn't that data and
not code?

In net/socket.c:

#ifdef CONFIG_NET_RX_BUSY_POLL
unsigned int sysctl_net_busy_read __read_mostly;
unsigned int sysctl_net_busy_poll __read_mostly;
#endif

-- Steve


> [ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30 [loop]
> [ 148.255528][T13380] Call Trace:
> [ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0] idr_for_each+0xf0/0x170 (unreliable)
> [ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4] loop_lookup.part.1+0x4c/0xb0 [loop]
> [ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8] loop_control_ioctl+0x120/0x1d0 [loop]
> [ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08] ksys_ioctl+0xd8/0x130
> [ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88] sys_ioctl+0x28/0x40
> [ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378] system_call+0x5c/0x68
> [ 148.255699][T13380] Instruction dump:
> [ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
> [ 148.576663][T13380]
> [ 149.576765][T13380] Kernel panic - not syncing: Fatal exception
>

2020-04-17 02:29:13

by Russell Currey

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

On Thu, 2020-04-16 at 22:17 -0400, Steven Rostedt wrote:
> On Thu, 16 Apr 2020 21:19:10 -0400
> Qian Cai <[email protected]> wrote:
>
> > OK, reverted the commit,
> >
> > c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility
> > with RELOCATABLE”)
> >
> > or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned
> > in this thread,
>
> This may be a symptom and not a cure.

Reverting the patch with the given config will have the same effect as
STRICT_KERNEL_RWX=n. Not discounting that it could be a bug on the
powerpc side (i.e. relocatable kernels with strict RWX on haven't been
exhaustively tested yet), but we should definitely figure out what's
going on with this bad access first.

>
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > [ 148.110969][T13115] LTP: starting chown04_16
> > [ 148.255048][T13380] kernel tried to execute exec-protected page
> > (c0000000016804ac) - exploit attempt? (uid: 0)
> > [ 148.255099][T13380] BUG: Unable to handle kernel instruction
> > fetch
> > [ 148.255122][T13380] Faulting instruction address:
> > 0xc0000000016804ac
> > [ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11
> > [#1]
> > [ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256
> > DEBUG_PAGEALLOC NUMA PowerNV
> > [ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs
> > sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class
> > dm_mirror dm_region_hash dm_log dm_mod
> > [ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted:
> > G W 5.6.0+ #7
> > [ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408
> > CTR: c0000000016804ac
> > [ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted:
> > G W (5.6.0+)
> > [ 148.255281][T13380] MSR: 9000000010009033
> > <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
> > [ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
> > [ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90
> > c000000001648200 0000000000000000
> > [ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30
> > 00000000b5e98331 ffffffff00012c9f
> > [ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000
> > 0000000000000000 c00000000163c520
> > [ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80
> > 0000000000000000 0000000000000000
> > [ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000
> > 0000000000000000 0000000000000000
> > [ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000
> > 0000000000000000 0000000000000000
> > [ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000
> > c00800000fa6a488 c0000010a6fafd30
> > [ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff
> > c00800000fa60400 c000000efd0c6780
> > [ 148.255494][T13380] NIP [c0000000016804ac]
> > sysctl_net_busy_read+0x0/0x4
>
> The instruction pointer is on sysctl_net_busy_read? Isn't that data
> and
> not code?
>
> In net/socket.c:
>
> #ifdef CONFIG_NET_RX_BUSY_POLL
> unsigned int sysctl_net_busy_read __read_mostly;
> unsigned int sysctl_net_busy_poll __read_mostly;
> #endif
>
> -- Steve
>
>
> > [ 148.255516][T13380] LR [c00800000fa60408] find_free_cb+0x8/0x30
> > [loop]
> > [ 148.255528][T13380] Call Trace:
> > [ 148.255538][T13380] [c0000010a6fafc90] [c0000000009732c0]
> > idr_for_each+0xf0/0x170 (unreliable)
> > [ 148.255572][T13380] [c0000010a6fafd10] [c00800000fa626c4]
> > loop_lookup.part.1+0x4c/0xb0 [loop]
> > [ 148.255597][T13380] [c0000010a6fafd50] [c00800000fa634d8]
> > loop_control_ioctl+0x120/0x1d0 [loop]
> > [ 148.255623][T13380] [c0000010a6fafdb0] [c0000000004ddc08]
> > ksys_ioctl+0xd8/0x130
> > [ 148.255636][T13380] [c0000010a6fafe00] [c0000000004ddc88]
> > sys_ioctl+0x28/0x40
> > [ 148.255669][T13380] [c0000010a6fafe20] [c00000000000b378]
> > system_call+0x5c/0x68
> > [ 148.255699][T13380] Instruction dump:
> > [ 148.255718][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> > XXXXXXXX XXXXXXXX XXXXXXXX
> > [ 148.255744][T13380] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> > XXXXXXXX XXXXXXXX XXXXXXXX
> > [ 148.255772][T13380] ---[ end trace a5894a74208c22ec ]---
> > [ 148.576663][T13380]
> > [ 149.576765][T13380] Kernel panic - not syncing: Fatal exception
> >

2020-04-17 02:43:19

by Qian Cai

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)



> On Apr 16, 2020, at 10:27 PM, Russell Currey <[email protected]> wrote:
>
> Reverting the patch with the given config will have the same effect as
> STRICT_KERNEL_RWX=n. Not discounting that it could be a bug on the
> powerpc side (i.e. relocatable kernels with strict RWX on haven't been
> exhaustively tested yet), but we should definitely figure out what's
> going on with this bad access first.

BTW, this bad access only happened once. The overwhelming rest of crashes are with NULL pointer NIP like below. How can you explain that STRICT_KERNEL_RWX=n would also make those NULL NIP disappear if STRICT_KERNEL_RWX is just a messenger?

[ 215.281666][T16896] LTP: starting chown04_16
[ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 215.424289][T18297] Faulting instruction address: 0x00000000
[ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
[ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
[ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
[ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
[ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
[ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
[ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
[ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
[ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
[ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
[ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
[ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
[ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
[ 215.424914][T18297] NIP [0000000000000000] 0x0
[ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
find_free_cb at drivers/block/loop.c:2129
[ 215.424997][T18297] Call Trace:
[ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
[ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
loop_lookup at drivers/block/loop.c:2144
[ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
[ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
[ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
[ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
[ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
[ 215.425314][T18297] Instruction dump:
[ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
[ 215.642114][T18297]
[ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-17 02:47:59

by Russell Currey

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

On Thu, 2020-04-16 at 22:40 -0400, Qian Cai wrote:
> > On Apr 16, 2020, at 10:27 PM, Russell Currey <[email protected]>
> > wrote:
> >
> > Reverting the patch with the given config will have the same effect
> > as
> > STRICT_KERNEL_RWX=n. Not discounting that it could be a bug on the
> > powerpc side (i.e. relocatable kernels with strict RWX on haven't
> > been
> > exhaustively tested yet), but we should definitely figure out
> > what's
> > going on with this bad access first.
>
> BTW, this bad access only happened once. The overwhelming rest of
> crashes are with NULL pointer NIP like below. How can you explain
> that STRICT_KERNEL_RWX=n would also make those NULL NIP disappear if
> STRICT_KERNEL_RWX is just a messenger?

What happens if you test with STRICT_KERNEL_RWX=y and RELOCATABLE=n,
reverting my patch? This would give us an idea of whether it's
something broken recently or if there's something else going on.

>
> [ 215.281666][T16896] LTP: starting chown04_16
> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch
> (NULL pointer?)
> [ 215.424289][T18297] Faulting instruction address: 0x00000000
> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256
> DEBUG_PAGEALLOC NUMA PowerNV
> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables
> x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata
> firmware_class dm_mirror dm_region_hash dm_log dm_mod
> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted:
> G W 5.6.0-next-20200405+ #3
> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408
> CTR: 0000000000000000
> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted:
> G W (5.6.0-next-20200405+)
> [ 215.424570][T18297] MSR: 9000000040009033
> <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20
> c00000000165ce00 0000000000000000
> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0
> 000000005f037e7d ffffffff00013bfb
> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000
> 0000000000000000 c000000001652ee0
> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600
> 0000000000000000 0000000000000000
> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000
> 0000000000000000 0000000000000007
> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000
> c00800000fbc8688 c000200b8606fcc0
> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff
> c00800000fbc0400 c00020068b8c0e70
> [ 215.424914][T18297] NIP [0000000000000000] 0x0
> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30
> [loop]
> find_free_cb at drivers/block/loop.c:2129
> [ 215.424997][T18297] Call Trace:
> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290]
> idr_for_each+0xf0/0x170 (unreliable)
> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744]
> loop_lookup.part.2+0x4c/0xb0 [loop]
> loop_lookup at drivers/block/loop.c:2144
> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558]
> loop_control_ioctl+0x120/0x1d0 [loop]
> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688]
> ksys_ioctl+0xd8/0x130
> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708]
> sys_ioctl+0x28/0x40
> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30]
> system_call_exception+0x110/0x1e0
> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0]
> system_call_common+0xf0/0x278
> [ 215.425314][T18297] Instruction dump:
> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> XXXXXXXX XXXXXXXX XXXXXXXX
> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> XXXXXXXX XXXXXXXX XXXXXXXX
> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
> [ 215.642114][T18297]
> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-17 03:21:26

by Qian Cai

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)



> On Apr 16, 2020, at 10:46 PM, Russell Currey <[email protected]> wrote:
>
> On Thu, 2020-04-16 at 22:40 -0400, Qian Cai wrote:
>>> On Apr 16, 2020, at 10:27 PM, Russell Currey <[email protected]>
>>> wrote:
>>>
>>> Reverting the patch with the given config will have the same effect
>>> as
>>> STRICT_KERNEL_RWX=n. Not discounting that it could be a bug on the
>>> powerpc side (i.e. relocatable kernels with strict RWX on haven't
>>> been
>>> exhaustively tested yet), but we should definitely figure out
>>> what's
>>> going on with this bad access first.
>>
>> BTW, this bad access only happened once. The overwhelming rest of
>> crashes are with NULL pointer NIP like below. How can you explain
>> that STRICT_KERNEL_RWX=n would also make those NULL NIP disappear if
>> STRICT_KERNEL_RWX is just a messenger?
>
> What happens if you test with STRICT_KERNEL_RWX=y and RELOCATABLE=n,
> reverting my patch? This would give us an idea of whether it's
> something broken recently or if there's something else going on.

I don’t know what did you mean by reverting your patch because that combination
can be tested as-is. Anyway, it could take a long time to reproduce, so I’ll keep it
running for up to 12-hour to confirm it could not really crash.

>
>>
>> [ 215.281666][T16896] LTP: starting chown04_16
>> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch
>> (NULL pointer?)
>> [ 215.424289][T18297] Faulting instruction address: 0x00000000
>> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256
>> DEBUG_PAGEALLOC NUMA PowerNV
>> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables
>> x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata
>> firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted:
>> G W 5.6.0-next-20200405+ #3
>> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408
>> CTR: 0000000000000000
>> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted:
>> G W (5.6.0-next-20200405+)
>> [ 215.424570][T18297] MSR: 9000000040009033
>> <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
>> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20
>> c00000000165ce00 0000000000000000
>> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0
>> 000000005f037e7d ffffffff00013bfb
>> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000
>> 0000000000000000 c000000001652ee0
>> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600
>> 0000000000000000 0000000000000000
>> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000
>> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000007
>> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000
>> c00800000fbc8688 c000200b8606fcc0
>> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff
>> c00800000fbc0400 c00020068b8c0e70
>> [ 215.424914][T18297] NIP [0000000000000000] 0x0
>> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30
>> [loop]
>> find_free_cb at drivers/block/loop.c:2129
>> [ 215.424997][T18297] Call Trace:
>> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290]
>> idr_for_each+0xf0/0x170 (unreliable)
>> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744]
>> loop_lookup.part.2+0x4c/0xb0 [loop]
>> loop_lookup at drivers/block/loop.c:2144
>> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558]
>> loop_control_ioctl+0x120/0x1d0 [loop]
>> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688]
>> ksys_ioctl+0xd8/0x130
>> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708]
>> sys_ioctl+0x28/0x40
>> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30]
>> system_call_exception+0x110/0x1e0
>> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0]
>> system_call_common+0xf0/0x278
>> [ 215.425314][T18297] Instruction dump:
>> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
>> [ 215.642114][T18297]
>> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-17 07:03:58

by Naveen N. Rao

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

Hi Qian,

Qian Cai wrote:
> OK, reverted the commit,
>
> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>
> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>
> https://lore.kernel.org/lkml/[email protected]/

Do you see any errors logged in dmesg when you see the crash?
STRICT_KERNEL_RWX changes how patch_instruction() works, so it would be
interesting to see if there are any ftrace-related errors thrown before
the crash.


- Naveen

2020-04-17 11:47:02

by Michael Ellerman

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

Steven Rostedt <[email protected]> writes:
> On Thu, 16 Apr 2020 21:19:10 -0400
> Qian Cai <[email protected]> wrote:
>
>> OK, reverted the commit,
>>
>> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>>
>> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>
> This may be a symptom and not a cure.

I think it is a cure.

But we still have a bug, which is that when STRICT_KERNEL_RWX is enabled
we have some sort of corruption going on.

Enabling STRICT_KERNEL_RWX changes our implementation of
patch_instruction() which is used by ftrace, so I suspect this is a
powerpc bug.

>> [ 148.110969][T13115] LTP: starting chown04_16
>> [ 148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
>> [ 148.255099][T13380] BUG: Unable to handle kernel instruction fetch
>> [ 148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
>> [ 148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>> [ 148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G W 5.6.0+ #7
>> [ 148.255236][T13380] NIP: c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
>> [ 148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400 Tainted: G W (5.6.0+)
>> [ 148.255281][T13380] MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>> [ 148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0
>> [ 148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000
>> [ 148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f
>> [ 148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520
>> [ 148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30
>> [ 148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780
>> [ 148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
>
> The instruction pointer is on sysctl_net_busy_read? Isn't that data and
> not code?

Yes.

But we're corrupting the text, or data, somewhere, so we can jump
anywhere.

I have another trace where vhost_init() appears to call into
proc_dointvec() before crashing. vhost_init() is an empty function.

cheers

2020-04-17 11:50:29

by Michael Ellerman

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

"Naveen N. Rao" <[email protected]> writes:
> Hi Qian,
>
> Qian Cai wrote:
>> OK, reverted the commit,
>>
>> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>>
>> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>>
>> https://lore.kernel.org/lkml/[email protected]/
>
> Do you see any errors logged in dmesg when you see the crash?
> STRICT_KERNEL_RWX changes how patch_instruction() works, so it would be
> interesting to see if there are any ftrace-related errors thrown before
> the crash.

I've been able to reproduce with STRICT_KERNEL_RWX=y and concurrently
running:

# while true; do echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done

and:

# while true; do find /lib/modules/$(uname -r) -name '*.ko' -printf "%f\n" | sed -e "s/\.ko//" | xargs -i modprobe -va {}; lsmod | awk '{print $1}' | xargs -i modprobe -vr {}; done

ie. stressing module loading/unloading and ftrace at the same time.


It's not 100% but it usually reproduces within 10-20 minutes.

It looks like sometimes our __patch_instruction() fails, and then that
somehow leads to things getting further messed up. Presumably we have
some bad error handling somewhere.

cheers

2020-04-17 11:50:57

by Qian Cai

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)



> On Apr 16, 2020, at 10:46 PM, Russell Currey <[email protected]> wrote:
>
> On Thu, 2020-04-16 at 22:40 -0400, Qian Cai wrote:
>>> On Apr 16, 2020, at 10:27 PM, Russell Currey <[email protected]>
>>> wrote:
>>>
>>> Reverting the patch with the given config will have the same effect
>>> as
>>> STRICT_KERNEL_RWX=n. Not discounting that it could be a bug on the
>>> powerpc side (i.e. relocatable kernels with strict RWX on haven't
>>> been
>>> exhaustively tested yet), but we should definitely figure out
>>> what's
>>> going on with this bad access first.
>>
>> BTW, this bad access only happened once. The overwhelming rest of
>> crashes are with NULL pointer NIP like below. How can you explain
>> that STRICT_KERNEL_RWX=n would also make those NULL NIP disappear if
>> STRICT_KERNEL_RWX is just a messenger?
>
> What happens if you test with STRICT_KERNEL_RWX=y and RELOCATABLE=n,
> reverting my patch? This would give us an idea of whether it's
> something broken recently or if there's something else going on.

That combination will crash as well. I don’t think it is broken recently though due to
the crash could happen back in 5.6-rc1 when your commit first introduced.

>
>>
>> [ 215.281666][T16896] LTP: starting chown04_16
>> [ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch
>> (NULL pointer?)
>> [ 215.424289][T18297] Faulting instruction address: 0x00000000
>> [ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
>> [ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256
>> DEBUG_PAGEALLOC NUMA PowerNV
>> [ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables
>> x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata
>> firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted:
>> G W 5.6.0-next-20200405+ #3
>> [ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408
>> CTR: 0000000000000000
>> [ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted:
>> G W (5.6.0-next-20200405+)
>> [ 215.424570][T18297] MSR: 9000000040009033
>> <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
>> [ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
>> [ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20
>> c00000000165ce00 0000000000000000
>> [ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0
>> 000000005f037e7d ffffffff00013bfb
>> [ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000
>> 0000000000000000 c000000001652ee0
>> [ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600
>> 0000000000000000 0000000000000000
>> [ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000000
>> [ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000
>> 0000000000000000 0000000000000007
>> [ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000
>> c00800000fbc8688 c000200b8606fcc0
>> [ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff
>> c00800000fbc0400 c00020068b8c0e70
>> [ 215.424914][T18297] NIP [0000000000000000] 0x0
>> [ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30
>> [loop]
>> find_free_cb at drivers/block/loop.c:2129
>> [ 215.424997][T18297] Call Trace:
>> [ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290]
>> idr_for_each+0xf0/0x170 (unreliable)
>> [ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744]
>> loop_lookup.part.2+0x4c/0xb0 [loop]
>> loop_lookup at drivers/block/loop.c:2144
>> [ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558]
>> loop_control_ioctl+0x120/0x1d0 [loop]
>> [ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688]
>> ksys_ioctl+0xd8/0x130
>> [ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708]
>> sys_ioctl+0x28/0x40
>> [ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30]
>> system_call_exception+0x110/0x1e0
>> [ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0]
>> system_call_common+0xf0/0x278
>> [ 215.425314][T18297] Instruction dump:
>> [ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> XXXXXXXX XXXXXXXX XXXXXXXX
>> [ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
>> [ 215.642114][T18297]
>> [ 216.642220][T18297] Kernel panic - not syncing: Fatal exception

2020-04-17 12:05:34

by Qian Cai

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)



> On Apr 17, 2020, at 3:01 AM, Naveen N. Rao <[email protected]> wrote:
>
> Hi Qian,
>
> Qian Cai wrote:
>> OK, reverted the commit,
>> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>> https://lore.kernel.org/lkml/[email protected]/
>
> Do you see any errors logged in dmesg when you see the crash? STRICT_KERNEL_RWX changes how patch_instruction() works, so it would be interesting to see if there are any ftrace-related errors thrown before the crash.

Yes, looks like there is a warning right after,

echo function > /sys/kernel/debug/tracing/current_tracer
echo nop > /sys/kernel/debug/tracing/current_tracer

and just before the crash,

[ T3454] ftrace-powerpc: Unexpected call sequence at 00000000de85f044: 48003d1d 7c0802a6
[ 56.870472][ T3454] ------------[ cut here ]------------
[ 56.870500][ T3454] WARNING: CPU: 52 PID: 3454 at kernel/trace/ftrace.c:2026 ftrace_bug+0x104/0x310
[ 56.870527][ T3454] Modules linked in: kvm_hv kvm ses enclosure scsi_transport_sas ip_tables x_tables xfs sd_mod i40e firmware_class aacraid dm_mirror dm_region_hash dm_log dm_mod
[ 56.870592][ T3454] CPU: 52 PID: 3454 Comm: nip.sh Not tainted 5.7.0-rc1-next-20200416 #4
[ 56.870627][ T3454] NIP: c0000000002a3ae4 LR: c0000000002a47fc CTR: c0000000002436f0
[ 56.870661][ T3454] REGS: c00000069a9ef710 TRAP: 0700 Not tainted (5.7.0-rc1-next-20200416)
[ 56.870697][ T3454] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28228222 XER: 00000000
[ 56.870748][ T3454] CFAR: c0000000002a3a2c IRQMASK: 0
[ 56.870748][ T3454] GPR00: c0000000002a47fc c00000069a9ef9a0 c0000000012f9000 ffffffffffffffea
[ 56.870748][ T3454] GPR04: c0002004e2160438 c0000007fedf0ad8 00000000614ca19d 0000000000000007
[ 56.870748][ T3454] GPR08: 0000000000000003 0000000000000000 0000000000000000 0000000000000002
[ 56.870748][ T3454] GPR12: 0000000000004000 c0000007fffd5600 0000000040000000 0000000139ae9798
[ 56.870748][ T3454] GPR16: 0000000139ae9724 0000000139a86968 0000000139a1f230 0000000139aed568
[ 56.870748][ T3454] GPR20: 00000001402af8b0 0000000000000009 0000000139a996e8 00007fffc9186d94
[ 56.870748][ T3454] GPR24: 0000000000000000 c00000069a9efc00 c00000000132cd00 c00000069a9efc40
[ 56.870748][ T3454] GPR28: c0000000011c29e8 0000000000000001 c0002004e2160438 c008000009321a64
[ 56.870969][ T3454] NIP [c0000000002a3ae4] ftrace_bug+0x104/0x310
ftrace_bug at kernel/trace/ftrace.c:2026
[ 56.870995][ T3454] LR [c0000000002a47fc] ftrace_modify_all_code+0x16c/0x210
ftrace_modify_all_code at kernel/trace/ftrace.c:2672
[ 56.871034][ T3454] Call Trace:
[ 56.871057][ T3454] [c00000069a9ef9a0] [4bffff899a9efa00] 0x4bffff899a9efa00 (unreliable)
[ 56.871086][ T3454] [c00000069a9efa20] [c0000000002a47fc] ftrace_modify_all_code+0x16c/0x210
[ 56.871125][ T3454] [c00000069a9efa50] [c000000000061b68] arch_ftrace_update_code+0x18/0x30
[ 56.871162][ T3454] [c00000069a9efa70] [c0000000002a49c4] ftrace_run_update_code+0x44/0xc0
[ 56.871199][ T3454] [c00000069a9efaa0] [c0000000002aa3c8] ftrace_startup+0xe8/0x1b0
[ 56.871236][ T3454] [c00000069a9efae0] [c0000000002aa4e0] register_ftrace_function+0x50/0xc0
[ 56.871275][ T3454] [c00000069a9efb10] [c0000000002d0468] function_trace_init+0x98/0xd0
[ 56.871312][ T3454] [c00000069a9efb40] [c0000000002c75c0] tracing_set_tracer+0x350/0x640
[ 56.871349][ T3454] [c00000069a9efbe0] [c0000000002c7a90] tracing_set_trace_write+0x1e0/0x370
[ 56.871388][ T3454] [c00000069a9efd00] [c00000000052094c] __vfs_write+0x3c/0x70
[ 56.871424][ T3454] [c00000069a9efd20] [c000000000523d4c] vfs_write+0xcc/0x200
[ 56.871461][ T3454] [c00000069a9efd70] [c0000000005240ec] ksys_write+0x7c/0x140
[ 56.871498][ T3454] [c00000069a9efdc0] [c000000000038a94] system_call_exception+0x114/0x1e0
[ 56.871535][ T3454] [c00000069a9efe20] [c00000000000c870] system_call_common+0xf0/0x278
[ 56.871570][ T3454] Instruction dump:
[ 56.871592][ T3454] 7d908120 4e800020 60000000 2b890001 409effd4 3c62ff8b 38631958 4bf4491d
[ 56.871639][ T3454] 60000000 4bffffc0 60000000 fba10068 <0fe00000> 39000001 3ce20003 3d22fed7
[ 56.871685][ T3454] irq event stamp: 95388
[ 56.871708][ T3454] hardirqs last enabled at (95387): [<c0000000001e4f94>] console_unlock+0x6a4/0x950
[ 56.871746][ T3454] hardirqs last disabled at (95388): [<c00000000000960c>] program_check_common_virt+0x2bc/0x310
[ 56.871785][ T3454] softirqs last enabled at (91222): [<c000000000a213c8>] __do_softirq+0x658/0x8d8
[ 56.871823][ T3454] softirqs last disabled at (91215): [<c00000000011b40c>] irq_exit+0x16c/0x1d0
[ 56.871859][ T3454] ---[ end trace 48f8445450a4e206 ]---
[ 56.871907][ T3454] ftrace failed to modify
[ 56.871913][ T3454] [<c008000009321a64>] show_sas_rphy_phy_identifier+0xc/0x60 [scsi_transport_sas]
show_sas_rphy_phy_identifier at drivers/scsi/scsi_transport_sas.c:1221
[ 56.871969][ T3454] actual: 1d:3d:00:48
[ 56.871996][ T3454] Setting ftrace call site to call ftrace function
[ 56.872020][ T3454] ftrace record flags: 80000001
[ 56.872054][ T3454] (1)
[ 56.872054][ T3454] expected tramp: c000000000061fac

2020-04-17 15:20:47

by Naveen N. Rao

[permalink] [raw]
Subject: Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

Qian Cai wrote:
>
>
>> On Apr 17, 2020, at 3:01 AM, Naveen N. Rao <[email protected]> wrote:
>>
>> Hi Qian,
>>
>> Qian Cai wrote:
>>> OK, reverted the commit,
>>> c55d7b5e6426 (“powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE”)
>>> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>>> https://lore.kernel.org/lkml/[email protected]/
>>
>> Do you see any errors logged in dmesg when you see the crash? STRICT_KERNEL_RWX changes how patch_instruction() works, so it would be interesting to see if there are any ftrace-related errors thrown before the crash.
>
> Yes, looks like there is a warning right after,
>
> echo function > /sys/kernel/debug/tracing/current_tracer
> echo nop > /sys/kernel/debug/tracing/current_tracer
>
> and just before the crash,
>
> [ T3454] ftrace-powerpc: Unexpected call sequence at 00000000de85f044: 48003d1d 7c0802a6
> [ 56.870472][ T3454] ------------[ cut here ]------------
> [ 56.870500][ T3454] WARNING: CPU: 52 PID: 3454 at kernel/trace/ftrace.c:2026 ftrace_bug+0x104/0x310
> [ 56.870527][ T3454] Modules linked in: kvm_hv kvm ses enclosure scsi_transport_sas ip_tables x_tables xfs sd_mod i40e firmware_class aacraid dm_mirror dm_region_hash dm_log dm_mod
> [ 56.870592][ T3454] CPU: 52 PID: 3454 Comm: nip.sh Not tainted 5.7.0-rc1-next-20200416 #4
> [ 56.870627][ T3454] NIP: c0000000002a3ae4 LR: c0000000002a47fc CTR: c0000000002436f0
> [ 56.870661][ T3454] REGS: c00000069a9ef710 TRAP: 0700 Not tainted (5.7.0-rc1-next-20200416)
> [ 56.870697][ T3454] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28228222 XER: 00000000
> [ 56.870748][ T3454] CFAR: c0000000002a3a2c IRQMASK: 0
> [ 56.870748][ T3454] GPR00: c0000000002a47fc c00000069a9ef9a0 c0000000012f9000 ffffffffffffffea
> [ 56.870748][ T3454] GPR04: c0002004e2160438 c0000007fedf0ad8 00000000614ca19d 0000000000000007
> [ 56.870748][ T3454] GPR08: 0000000000000003 0000000000000000 0000000000000000 0000000000000002
> [ 56.870748][ T3454] GPR12: 0000000000004000 c0000007fffd5600 0000000040000000 0000000139ae9798
> [ 56.870748][ T3454] GPR16: 0000000139ae9724 0000000139a86968 0000000139a1f230 0000000139aed568
> [ 56.870748][ T3454] GPR20: 00000001402af8b0 0000000000000009 0000000139a996e8 00007fffc9186d94
> [ 56.870748][ T3454] GPR24: 0000000000000000 c00000069a9efc00 c00000000132cd00 c00000069a9efc40
> [ 56.870748][ T3454] GPR28: c0000000011c29e8 0000000000000001 c0002004e2160438 c008000009321a64
> [ 56.870969][ T3454] NIP [c0000000002a3ae4] ftrace_bug+0x104/0x310
> ftrace_bug at kernel/trace/ftrace.c:2026
> [ 56.870995][ T3454] LR [c0000000002a47fc] ftrace_modify_all_code+0x16c/0x210
> ftrace_modify_all_code at kernel/trace/ftrace.c:2672
> [ 56.871034][ T3454] Call Trace:
> [ 56.871057][ T3454] [c00000069a9ef9a0] [4bffff899a9efa00] 0x4bffff899a9efa00 (unreliable)
> [ 56.871086][ T3454] [c00000069a9efa20] [c0000000002a47fc] ftrace_modify_all_code+0x16c/0x210
> [ 56.871125][ T3454] [c00000069a9efa50] [c000000000061b68] arch_ftrace_update_code+0x18/0x30
> [ 56.871162][ T3454] [c00000069a9efa70] [c0000000002a49c4] ftrace_run_update_code+0x44/0xc0
> [ 56.871199][ T3454] [c00000069a9efaa0] [c0000000002aa3c8] ftrace_startup+0xe8/0x1b0
> [ 56.871236][ T3454] [c00000069a9efae0] [c0000000002aa4e0] register_ftrace_function+0x50/0xc0
> [ 56.871275][ T3454] [c00000069a9efb10] [c0000000002d0468] function_trace_init+0x98/0xd0
> [ 56.871312][ T3454] [c00000069a9efb40] [c0000000002c75c0] tracing_set_tracer+0x350/0x640
> [ 56.871349][ T3454] [c00000069a9efbe0] [c0000000002c7a90] tracing_set_trace_write+0x1e0/0x370
> [ 56.871388][ T3454] [c00000069a9efd00] [c00000000052094c] __vfs_write+0x3c/0x70
> [ 56.871424][ T3454] [c00000069a9efd20] [c000000000523d4c] vfs_write+0xcc/0x200
> [ 56.871461][ T3454] [c00000069a9efd70] [c0000000005240ec] ksys_write+0x7c/0x140
> [ 56.871498][ T3454] [c00000069a9efdc0] [c000000000038a94] system_call_exception+0x114/0x1e0
> [ 56.871535][ T3454] [c00000069a9efe20] [c00000000000c870] system_call_common+0xf0/0x278
> [ 56.871570][ T3454] Instruction dump:
> [ 56.871592][ T3454] 7d908120 4e800020 60000000 2b890001 409effd4 3c62ff8b 38631958 4bf4491d
> [ 56.871639][ T3454] 60000000 4bffffc0 60000000 fba10068 <0fe00000> 39000001 3ce20003 3d22fed7
> [ 56.871685][ T3454] irq event stamp: 95388
> [ 56.871708][ T3454] hardirqs last enabled at (95387): [<c0000000001e4f94>] console_unlock+0x6a4/0x950
> [ 56.871746][ T3454] hardirqs last disabled at (95388): [<c00000000000960c>] program_check_common_virt+0x2bc/0x310
> [ 56.871785][ T3454] softirqs last enabled at (91222): [<c000000000a213c8>] __do_softirq+0x658/0x8d8
> [ 56.871823][ T3454] softirqs last disabled at (91215): [<c00000000011b40c>] irq_exit+0x16c/0x1d0
> [ 56.871859][ T3454] ---[ end trace 48f8445450a4e206 ]---
> [ 56.871907][ T3454] ftrace failed to modify
> [ 56.871913][ T3454] [<c008000009321a64>] show_sas_rphy_phy_identifier+0xc/0x60 [scsi_transport_sas]
> show_sas_rphy_phy_identifier at drivers/scsi/scsi_transport_sas.c:1221
> [ 56.871969][ T3454] actual: 1d:3d:00:48
> [ 56.871996][ T3454] Setting ftrace call site to call ftrace function
> [ 56.872020][ T3454] ftrace record flags: 80000001
> [ 56.872054][ T3454] (1)
> [ 56.872054][ T3454] expected tramp: c000000000061fac

Thanks. That confirms the problem.

In this particular case, we are unable to convert a module function to
call into ftrace_caller() since the _mcount() entry has not been nop-ed
out during module load. I suspect there might be another error/warning
earlier on indicating that failure. There are a few scenarios where we
are not propagating errors from patch_instruction() properly, so it is
possible that no errors were thrown previously. I will send a separate
set of patches to address that.

The reason for the original crash is a bit more subtle. On module load,
we patch _mcount() call sites of all the module functions with a nop.
However, with STRICT_KERNEL_RWX, one of these is failing. When that
happens, we end up continuing to call into _mcount(), which uses the
default module relocation stub, and not the special ftrace stub. The
default stub uses r2, which won't always be loaded with the module TOC
pointer with -mprofile-kernel. Instead, r2 likely points to the kernel
TOC, so we end up loading incorrect entries from the kernel TOC to
branch to. In all the traces we've seen so far, a kernel function has
called into a module function, wherein the module function does not set
up its own TOC (ext4_iomap_end(), find_free_cb()).

The core of the problem though is that patch_instruction() is failing in
certain scenarios. We need to find out why that is, and address that.

Russell,
You mentioned in the past that you identified an issue with
patch_instruction() during early boot:
http://lkml.kernel.org/r/[email protected]

Does that failure happen without STRICT_MODULE_RWX, with just
STRICT_KERNEL_RWX? That might be relevant here.


- Naveen