2017-06-02 06:04:59

by Eryu Guan

Subject: [v4.12-rc1 regression] nfs server crashed in fstests run

Hi all,

Starting from the 4.12-rc1 kernel, I have seen the Linux NFS server crash all the time in my fstests (xfstests) runs; the console log of an NFSv3 crash is appended to the end of this mail.

I was exporting a directory residing on XFS and loopback-mounting the NFS export on localhost. Both NFSv3 and NFSv4 can hit this crash. The crash usually happens when running test case generic/029 or generic/095.

The problem is that there is no easy and efficient way to reproduce it: I tried running only generic/029 and generic/095 in a loop 1000 times without hitting it, and running only the 'quick' group tests for 50 iterations also failed to reproduce it. The only reliable way to reproduce it seems to be running the 'auto' group tests for 20 iterations:

i=0
while [ $i -lt 20 ]; do
    ./check -nfs -g auto
    ((i++))
done
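For unattended runs like this, it may help to log when each iteration starts and stop at the first failure, so the crashing iteration number survives the reboot. A minimal sketch of such a wrapper (it assumes it is started from an fstests checkout, like the loop above):

```shell
#!/bin/sh
# Wrapper around the reproducer loop: record when each iteration starts
# and stop at the first failing run, so the failing iteration number is
# still on disk after a crash/reboot.  Assumes an fstests checkout.
log=iterations.log
i=0
while [ "$i" -lt 20 ]; do
    echo "=== iteration $i started $(date) ===" >> "$log"
    if ! ./check -nfs -g auto; then
        echo "=== iteration $i FAILED ===" >> "$log"
        break
    fi
    i=$((i + 1))
done
```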

The server usually crashed within 5 iterations, though at times it survived 10 iterations and only crashed after being left running for more. This makes the problem hard to bisect, and the bisecting is very time-consuming.

(The bisect is running now; it needs a few days to finish. My first two attempts pointed to some mm patches as the first bad commit, but reverting those patches didn't prevent the server from crashing, so I enlarged the loop count and started bisecting for a third time.)
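For reference, the protocol behind such a bisect is what `git bisect run` automates whenever the test can be scripted; a kernel panic reboots the machine, which is why these runs had to be judged by hand. A toy illustration of the automated form, on a throwaway repository with a `BUG` marker file standing in for the crash:

```shell
#!/bin/sh
# Toy git-bisect demo: build 16 commits where the "bug" (a BUG marker
# file) appears at commit 11, then let 'git bisect run' find the first
# bad commit automatically (~4 test steps for 16 commits).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
for n in $(seq 1 16); do
    [ "$n" -ge 11 ] && : > BUG      # commits 11..16 carry the bug
    echo "$n" > file
    git add -A
    git commit -qm "commit $n"
done
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
# exit 0 = good (no BUG marker), nonzero = bad
git bisect run sh -c '! test -f BUG' > /dev/null
git bisect log | tail -n 1    # last line names commit 11 as first bad
```

(`git bisect run` treats exit code 125 as "skip", which is handy for commits that fail to build.)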

If more info is needed please let me know.

Thanks,
Eryu


[88895.796834] run fstests generic/028 at 2017-06-01 00:43:18
[88900.945420] run fstests generic/029 at 2017-06-01 00:43:23
[88901.127315] BUG: unable to handle kernel paging request at ffffffffc0360e12
[88901.135095] IP: report_bug+0x64/0x100
[88901.139177] PGD 3b7c0c067
[88901.139177] P4D 3b7c0c067
[88901.142194] PUD 3b7c0e067
[88901.145209] PMD 469e33067
[88901.148225] PTE 80000004675f4161
[88901.151240]
[88901.156497] Oops: 0003 [#1] SMP
[88901.159997] Modules linked in: loop dm_mod nfsv3 nfs fscache ext4 jbd2 mbcache intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nfsd irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ipmi_ssif cdc_ether aesni_intel iTCO_wdt usbnet crypto_simd ipmi_si glue_helper sg iTCO_vendor_support wmi mii cryptd ipmi_devintf ipmi_msghandler auth_rpcgss shpchp i2c_i801 pcspkr ioatdma lpc_ich nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops igb ttm drm ptp pps_core dca megaraid_sas crc32c_intel i2c_algo_bit i2c_core
[88901.221606] CPU: 9 PID: 3278 Comm: nfsd Not tainted 4.12.0-rc3 #42
[88901.228500] Hardware name: IBM System x3650 M4 -[7915ON3]-/00J6520, BIOS -[VVE124AUS-1.30]- 11/21/2012
[88901.238885] task: ffffa10062d12d00 task.stack: ffffc2820478c000
[88901.245488] RIP: 0010:report_bug+0x64/0x100
[88901.250153] RSP: 0018:ffffc2820478f598 EFLAGS: 00010202
[88901.255980] RAX: ffffffffc0360e08 RBX: ffffffffc0301cdc RCX: 0000000000000001
[88901.263940] RDX: 0000000000000907 RSI: ffffffffc038bc80 RDI: ffffffffffff0d22
[88901.271901] RBP: ffffc2820478f5b8 R08: 0000000000000001 R09: 00000000000001cc
[88901.279861] R10: ffffffffb6a36f67 R11: 0000000000000000 R12: ffffc2820478f6e8
[88901.287821] R13: ffffffffc0351b2a R14: 00000000000003fb R15: 0000000000000004
[88901.295783] FS: 0000000000000000(0000) GS:ffffa1007f6c0000(0000) knlGS:0000000000000000
[88901.304818] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[88901.311227] CR2: ffffffffc0360e12 CR3: 00000003b7c09000 CR4: 00000000000406e0
[88901.319188] Call Trace:
[88901.321920] do_trap+0x16a/0x190
[88901.325519] do_error_trap+0x89/0x110
[88901.329652] ? xfs_do_writepage+0x65c/0x680 [xfs]
[88901.334917] ? _xfs_buf_find+0x2a2/0x510 [xfs]
[88901.339875] ? kmem_cache_alloc+0x194/0x1a0
[88901.344561] ? kmem_zone_alloc+0x96/0x100 [xfs]
[88901.349615] do_invalid_op+0x20/0x30
[88901.353604] invalid_op+0x1e/0x30
[88901.357318] RIP: 0010:xfs_do_writepage+0x65c/0x680 [xfs]
[88901.363242] RSP: 0018:ffffc2820478f790 EFLAGS: 00010206
[88901.369069] RAX: 0000000000340140 RBX: ffffc2820478f8b8 RCX: 0000000000000000
[88901.377030] RDX: 0000000000000000 RSI: ffffc2820478f9f8 RDI: fffffc2be1095400
[88901.384991] RBP: ffffc2820478f838 R08: 0000000000000002 R09: 000000000001ff20
[88901.392950] R10: ffffa1007ffd1d00 R11: 0000000000000000 R12: ffffa10053c2ea58
[88901.400911] R13: fffffc2be1095400 R14: ffffc2820478f960 R15: ffffc2820478f9f8
[88901.408876] ? page_mkclean+0x69/0xb0
[88901.412960] ? invalid_page_referenced_vma+0x90/0x90
[88901.418500] write_cache_pages+0x26f/0x510
[88901.423085] ? xfs_vm_set_page_dirty+0x1d0/0x1d0 [xfs]
[88901.428819] ? block_write_end+0x36/0x80
[88901.433193] ? generic_write_end+0x54/0xc0
[88901.437791] xfs_vm_writepages+0xb6/0xd0 [xfs]
[88901.442741] do_writepages+0x1c/0x70
[88901.446727] __filemap_fdatawrite_range+0xc6/0x100
[88901.452071] filemap_write_and_wait_range+0x41/0x90
[88901.457532] xfs_file_fsync+0x63/0x1d0 [xfs]
[88901.462294] vfs_fsync_range+0x3d/0xb0
[88901.466491] xfs_file_write_iter+0x121/0x140 [xfs]
[88901.471835] __do_readv_writev+0x2d3/0x360
[88901.476402] do_readv_writev+0x76/0xb0
[88901.480582] ? do_dentry_open+0x261/0x310
[88901.485069] ? xfs_extent_busy_ag_cmp+0x20/0x20 [xfs]
[88901.490704] vfs_writev+0x3c/0x50
[88901.494409] nfsd_vfs_write+0xd0/0x360 [nfsd]
[88901.499271] ? ima_file_check+0x20/0x30
[88901.503553] nfsd_write+0x15a/0x210 [nfsd]
[88901.508116] nfsd3_proc_write+0x9d/0x130 [nfsd]
[88901.513172] nfsd_dispatch+0xce/0x270 [nfsd]
[88901.517950] svc_process_common+0x387/0x740 [sunrpc]
[88901.523492] svc_process+0xfc/0x1c0 [sunrpc]
[88901.528248] nfsd+0xe9/0x160 [nfsd]
[88901.532137] kthread+0x109/0x140
[88901.535740] ? nfsd_destroy+0x60/0x60 [nfsd]
[88901.540501] ? kthread_park+0x60/0x60
[88901.544586] ret_from_fork+0x25/0x30
[88901.548572] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
[88901.569642] RIP: report_bug+0x64/0x100 RSP: ffffc2820478f598
[88901.575955] CR2: ffffffffc0360e12
[88901.579663] ---[ end trace 62c63b9fb594e7f4 ]---
[88901.584811] Kernel panic - not syncing: Fatal exception
[88901.590694] Kernel Offset: 0x35000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[88901.602731] Rebooting in 3 seconds..


2017-06-08 15:32:10

by J. Bruce Fields

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Wed, Jun 07, 2017 at 12:23:38PM -0700, Darrick J. Wong wrote:
> On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> > Hi all,
> >
> > Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> > time in my fstests (xfstests) runs, I appended the console log of an
> > NFSv3 crash to the end of this mail.
> >
> > I was exporting a directory resided on XFS, and loopback mounted the NFS
> > export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
> > crash usually happens when running test case generic/029 or generic/095.
> >
> > But the problem is, there's no easy and efficient way to reproduce it, I
> > tried run only generic/029 and generic/095 in a loop for 1000 times but
> > failed, I also tried run the 'quick' group tests only for 50 iterations
> > but failed again. It seems that the only reliable way to reproduce it is
> > run the 'auto' group tests for 20 iterations.
> >
> > i=0
> > while [ $i -lt 20 ]; do
> > ./check -nfs -g auto
> > ((i++))
> > done
> >
> > And usually server crashed within 5 iterations, but at times it could
> > survive 10 iterations and only crashed if you left it running for more
> > iterations. This makes it hard to bisect and bisecting is very
> > time-consuming.
> >
> > (The bisecting is running now, it needs a few days to finish. My first
> > two attempts pointed first bad commit to some mm patches, but reverting
> > that patch didn't prevent server from crashing, so I enlarged the loop
> > count and started bisecting for the third time).
> >
> > If more info is needed please let me know.
> >
> > Thanks,
> > Eryu
> >
> >
> > [88895.796834] run fstests generic/028 at 2017-06-01 00:43:18
> > [88900.945420] run fstests generic/029 at 2017-06-01 00:43:23
> > [88901.127315] BUG: unable to handle kernel paging request at ffffffffc0360e12
> > [88901.135095] IP: report_bug+0x64/0x100
> > [88901.139177] PGD 3b7c0c067
> > [88901.139177] P4D 3b7c0c067
> > [88901.142194] PUD 3b7c0e067
> > [88901.145209] PMD 469e33067
> > [88901.148225] PTE 80000004675f4161
> > [88901.151240]
> > [88901.156497] Oops: 0003 [#1] SMP
> > [88901.159997] Modules linked in: loop dm_mod nfsv3 nfs fscache ext4 jbd2 mbcache intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nfsd irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ipmi_ssif cdc_ether aesni_intel iTCO_wdt usbnet crypto_simd ipmi_si glue_helper sg iTCO_vendor_support wmi mii cryptd ipmi_devintf ipmi_msghandler auth_rpcgss shpchp i2c_i801 pcspkr ioatdma lpc_ich nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops igb ttm drm ptp pps_core dca megaraid_sas crc32c_intel i2c_algo_bit i2c_core
> > [88901.221606] CPU: 9 PID: 3278 Comm: nfsd Not tainted 4.12.0-rc3 #42
> > [88901.228500] Hardware name: IBM System x3650 M4 -[7915ON3]-/00J6520, BIOS -[VVE124AUS-1.30]- 11/21/2012
> > [88901.238885] task: ffffa10062d12d00 task.stack: ffffc2820478c000
> > [88901.245488] RIP: 0010:report_bug+0x64/0x100
> > [88901.250153] RSP: 0018:ffffc2820478f598 EFLAGS: 00010202
> > [88901.255980] RAX: ffffffffc0360e08 RBX: ffffffffc0301cdc RCX: 0000000000000001
> > [88901.263940] RDX: 0000000000000907 RSI: ffffffffc038bc80 RDI: ffffffffffff0d22
> > [88901.271901] RBP: ffffc2820478f5b8 R08: 0000000000000001 R09: 00000000000001cc
> > [88901.279861] R10: ffffffffb6a36f67 R11: 0000000000000000 R12: ffffc2820478f6e8
> > [88901.287821] R13: ffffffffc0351b2a R14: 00000000000003fb R15: 0000000000000004
> > [88901.295783] FS: 0000000000000000(0000) GS:ffffa1007f6c0000(0000) knlGS:0000000000000000
> > [88901.304818] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [88901.311227] CR2: ffffffffc0360e12 CR3: 00000003b7c09000 CR4: 00000000000406e0
> > [88901.319188] Call Trace:
> > [88901.321920] do_trap+0x16a/0x190
> > [88901.325519] do_error_trap+0x89/0x110
> > [88901.329652] ? xfs_do_writepage+0x65c/0x680 [xfs]
>
> Hmm, does gdb think this address is at:
>
> (gdb) l *(xfs_do_writepage+0x65c)
> 0x7c17c is in xfs_do_writepage (/storage/home/djwong/cdev/work/linux-xfs/include/linux/highmem.h:77).
>
> like my system? That location is preempt_disable() in kunmap_atomic(),
> which is ... odd. I tried to run your reproducer script but smashed
> into this instead:
>
> [ 826.735882] ============================================
> [ 826.736952] WARNING: possible recursive locking detected
> [ 826.737992] 4.12.0-rc4-xfsx #7 Not tainted
> [ 826.738636] --------------------------------------------
> [ 826.739700] nfsd/1416 is trying to acquire lock:
> [ 826.740423] (&stp->st_mutex){+.+.+.}, at: [<ffffffffa004cda6>] nfsd4_process_open2+0x4f6/0x1360 [nfsd]
> [ 826.741749]
> [ 826.741749] but task is already holding lock:
> [ 826.742897] (&stp->st_mutex){+.+.+.}, at: [<ffffffffa004cd3f>] nfsd4_process_open2+0x48f/0x1360 [nfsd]

I guess the most likely culprit is the weird locking at the end of
init_open_stateid().

I believe one of those locks (stp->st_mutex) is on an object that's not
yet visible outside this thread. But maybe there's something else wrong
there; I'll look... I wonder why I'm not hitting it?

Anyway, this looks likely to be a different bug from the one Eryu Guan
is hitting.

--b.

2017-06-23 07:26:58

by Eryu Guan

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

[As this bug somehow involves ext4/jbd2 changes, cc'ing the ext4 list too]

On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> Hi all,
>
> Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> time in my fstests (xfstests) runs, I appended the console log of an
> NFSv3 crash to the end of this mail.

Some follow-up updates on this bug. *My* conclusion is that commit
81378da64de6 ("jbd2: mark the transaction context with the scope
GFP_NOFS context") introduced this issue, and that reproducing it
requires ext4, XFS and NFS together.

For more details please see below.

>
> I was exporting a directory resided on XFS, and loopback mounted the NFS
> export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
> crash usually happens when running test case generic/029 or generic/095.
>
> But the problem is, there's no easy and efficient way to reproduce it, I
> tried run only generic/029 and generic/095 in a loop for 1000 times but
> failed, I also tried run the 'quick' group tests only for 50 iterations
> but failed again. It seems that the only reliable way to reproduce it is
> run the 'auto' group tests for 20 iterations.
>
> i=0
> while [ $i -lt 20 ]; do
> ./check -nfs -g auto
> ((i++))
> done
>
> And usually server crashed within 5 iterations, but at times it could
> survive 10 iterations and only crashed if you left it running for more
> iterations. This makes it hard to bisect and bisecting is very
> time-consuming.
>
> (The bisecting is running now, it needs a few days to finish. My first
> two attempts pointed first bad commit to some mm patches, but reverting
> that patch didn't prevent server from crashing, so I enlarged the loop
> count and started bisecting for the third time).

The third round of bisecting finally finished after two weeks of painful
testing, and git bisect pointed to commit 81378da64de6 ("jbd2: mark the
transaction context with the scope GFP_NOFS context") as the first bad
commit. That seemed very weird to me, because the crash always happens
in XFS code.

But this reminded me that I was exporting not only XFS but also ext4 for
NFS testing. My full test setup is:

# mount -t ext4 /dev/sda4 /export/test
# showmount -e localhost
Export list for localhost:
/export/scratch *
/export/test *

(/export/scratch is on rootfs, which is XFS)

# cat local.config
TEST_DEV=localhost:/export/test
TEST_DIR=/mnt/testarea/test
SCRATCH_DEV=localhost:/export/scratch
SCRATCH_MNT=/mnt/testarea/scratch


Then I did further confirmation tests:

1. Switched to a new branch with that jbd2 patch as HEAD, compiled the
kernel, and ran the test with both ext4 and XFS exported on the newly
compiled kernel; it crashed within 5 iterations.

2. Reverted that jbd2 patch (while it was HEAD) and ran the test with
both ext4 and XFS exported; the kernel survived 20 iterations of the
full fstests run.

3. The kernel from step 1 survived 20 iterations of the full fstests
run when I exported XFS only (created XFS on /dev/sda4 and mounted it at
/export/test).

4. The 4.12-rc1 kernel survived the same test when I exported ext4 only
(both /export/test and /export/scratch mounted as ext4; this was done on
another test host because I don't have another spare test partition).

All these facts seem to confirm that commit 81378da64de6 really is the
culprit; I just don't see how.

I've attached the git bisect log; if you need more information, please
let me know. BTW, I'm testing the 4.12-rc6 kernel now to see if the bug
has already been fixed there.

Thanks,
Eryu

> [console log snipped]


Attachments:
upstream-v4.12-rc1-nfs-panic-bisect-jbd2.log (2.57 kB)

2017-06-23 07:43:38

by Michal Hocko

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

[Let's add Jack and keep the full email for reference]

On Fri 23-06-17 15:26:56, Eryu Guan wrote:
> [As this bug somehow involves in ext4/jbd2 changes, cc ext4 list too]
>
> On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> > Hi all,
> >
> > Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> > time in my fstests (xfstests) runs, I appended the console log of an
> > NFSv3 crash to the end of this mail.
>
> Some follow-up updates on this bug. *My* conclusion is, commit
> 81378da64de6 ("jbd2: mark the transaction context with the scope
> GFP_NOFS context") introduced this issue, and this is a bug that needs
> ext4, XFS and NFS to reproduce.
>
> For more details please see below.
>
> >
> > I was exporting a directory resided on XFS, and loopback mounted the NFS
> > export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
> > crash usually happens when running test case generic/029 or generic/095.
> >
> > But the problem is, there's no easy and efficient way to reproduce it, I
> > tried run only generic/029 and generic/095 in a loop for 1000 times but
> > failed, I also tried run the 'quick' group tests only for 50 iterations
> > but failed again. It seems that the only reliable way to reproduce it is
> > run the 'auto' group tests for 20 iterations.
> >
> > i=0
> > while [ $i -lt 20 ]; do
> > ./check -nfs -g auto
> > ((i++))
> > done
> >
> > And usually server crashed within 5 iterations, but at times it could
> > survive 10 iterations and only crashed if you left it running for more
> > iterations. This makes it hard to bisect and bisecting is very
> > time-consuming.
> >
> > (The bisecting is running now, it needs a few days to finish. My first
> > two attempts pointed first bad commit to some mm patches, but reverting
> > that patch didn't prevent server from crashing, so I enlarged the loop
> > count and started bisecting for the third time).
>
> The third round of bisect finally finished after 2 weeks painful
> testings, git bisect pointed first bad to commit 81378da64de6 ("jbd2:
> mark the transaction context with the scope GFP_NOFS context"), which
> seemed very weird to me, because the crash always happens in XFS code.
>
> But this reminded me that I was not only exporting XFS for NFS testing,
> but also ext4. So my full test setup is
>
> # mount -t ext4 /dev/sda4 /export/test
> # showmount -e localhost
> Export list for localhost:
> /export/scratch *
> /export/test *
>
> (/export/scratch is on rootfs, which is XFS)
>
> # cat local.config
> TEST_DEV=localhost:/export/test
> TEST_DIR=/mnt/testarea/test
> SCRATCH_DEV=localhost:/export/scratch
> SCRATCH_MNT=/mnt/testarea/scratch
>
>
> Then I did further confirmation tests:
> 1. switch to a new branch with that jbd2 patch as HEAD and compile
> kernel, run test with both ext4 and XFS exported on this newly compiled
> kernel, it crashed within 5 iterations.
>
> 2. revert that jbd2 patch (when it was HEAD), run test with both ext4
> and XFS exported, kernel survived 20 iterations of full fstests run.
>
> 3. kernel from step 1 survived 20 iterations of full fstests run, if I
> export XFS only (create XFS on /dev/sda4 and mount it at /export/test).
>
> 4. 4.12-rc1 kernel survived the same test if I export ext4 only (both
> /export/test and /export/scratch were mounted as ext4, and this was done
> on another test host because I don't have another spare test partition)
>
>
> All these facts seem to confirm that commit 81378da64de6 really is the
> culprit, I just don't see how..
>
> I attached git bisect log, and if you need more information please let
> me know. BTW, I'm testing 4.12-rc6 kernel now to see if it's already
> been fixed there.
>
> Thanks,
> Eryu
>
> > [console log snipped]

> # bad: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
> # good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
> git bisect start 'v4.12-rc1' 'v4.11'
> # good: [221656e7c4ce342b99c31eca96c1cbb6d1dce45f] Merge tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good 221656e7c4ce342b99c31eca96c1cbb6d1dce45f
> # bad: [c6a677c6f37bb7abc85ba7e3465e82b9f7eb1d91] Merge tag 'staging-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> git bisect bad c6a677c6f37bb7abc85ba7e3465e82b9f7eb1d91
> # bad: [e579dde654fc2c6b0d3e4b77a9a4b2d2405c510e] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> git bisect bad e579dde654fc2c6b0d3e4b77a9a4b2d2405c510e
> # bad: [a96480723c287c502b02659f4b347aecaa651ea1] Merge tag 'for-linus-4.12b-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
> git bisect bad a96480723c287c502b02659f4b347aecaa651ea1
> # good: [16a12fa9aed176444fc795b09e796be41902bb08] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> git bisect good 16a12fa9aed176444fc795b09e796be41902bb08
> # good: [1684096b1ed813f621fb6cbd06e72235c1c2a0ca] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
> git bisect good 1684096b1ed813f621fb6cbd06e72235c1c2a0ca
> # bad: [9c35baf6cee9a5745d55de6f9995916dde642517] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
> git bisect bad 9c35baf6cee9a5745d55de6f9995916dde642517
> # bad: [666e5a406c3ed562e7b3ceff8b631b6067bdaead] mm: make ttu's return boolean
> git bisect bad 666e5a406c3ed562e7b3ceff8b631b6067bdaead
> # good: [a6ffdc07847e74cc244c02ab6d0351a4a5d77281] mm: use is_migrate_highatomic() to simplify the code
> git bisect good a6ffdc07847e74cc244c02ab6d0351a4a5d77281
> # bad: [2948be5acf7d798991b127ceda6eea13e52c597f] mm: do not use double negation for testing page flags
> git bisect bad 2948be5acf7d798991b127ceda6eea13e52c597f
> # good: [9070733b4efac4bf17f299a81b01c15e206f9ff5] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS
> git bisect good 9070733b4efac4bf17f299a81b01c15e206f9ff5
> # bad: [81378da64de6d33d0c200885f1de431c9a3e5ccd] jbd2: mark the transaction context with the scope GFP_NOFS context
> git bisect bad 81378da64de6d33d0c200885f1de431c9a3e5ccd
> # good: [9ba1fb2c602a6f2323e3a08ec8e7a8e33bf335f4] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio*
> git bisect good 9ba1fb2c602a6f2323e3a08ec8e7a8e33bf335f4
> # first bad commit: [81378da64de6d33d0c200885f1de431c9a3e5ccd] jbd2: mark the transaction context with the scope GFP_NOFS context


--
Michal Hocko
SUSE Labs

2017-06-23 07:52:00

by Michal Hocko

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Fri 23-06-17 09:43:34, Michal Hocko wrote:
> [Let's add Jack and keep the full email for reference]
>
> On Fri 23-06-17 15:26:56, Eryu Guan wrote:
[...]
> > Then I did further confirmation tests:
> > 1. switch to a new branch with that jbd2 patch as HEAD and compile
> > kernel, run test with both ext4 and XFS exported on this newly compiled
> > kernel, it crashed within 5 iterations.
> >
> > 2. revert that jbd2 patch (when it was HEAD), run test with both ext4
> > and XFS exported, kernel survived 20 iterations of full fstests run.
> >
> > 3. kernel from step 1 survived 20 iterations of full fstests run, if I
> > export XFS only (create XFS on /dev/sda4 and mount it at /export/test).
> >
> > 4. 4.12-rc1 kernel survived the same test if I export ext4 only (both
> > /export/test and /export/scratch were mounted as ext4, and this was done
> > on another test host because I don't have another spare test partition)
> >
> >
> > All these facts seem to confirm that commit 81378da64de6 really is the
> > culprit, I just don't see how..

AFAIR, no follow-up patches to remove GFP_NOFS have been merged into
ext4, so currently we only have 81378da64de6, and all it does is make
_all_ allocations from the transaction context implicitly GFP_NOFS. I
can imagine that if there is a GFP_KERNEL allocation in this context
(which would be incorrect AFAIU), some shrinkers will not be called as
a result, and that might lead to an observable behavior change. But
this sounds like wild speculation. The mere fact that XFS oopses and
there is no ext4 code in the backtrace is suspicious on its own. Does
this oops sound familiar to the XFS folks?

Anyway, I assume that if you revert 81378da64de6 the issue is gone?
[...]

> > > [88895.796834] run fstests generic/028 at 2017-06-01 00:43:18
> > > [88900.945420] run fstests generic/029 at 2017-06-01 00:43:23
> > > [88901.127315] BUG: unable to handle kernel paging request at ffffffffc0360e12
> > > [88901.135095] IP: report_bug+0x64/0x100
> > > [88901.139177] PGD 3b7c0c067
> > > [88901.139177] P4D 3b7c0c067
> > > [88901.142194] PUD 3b7c0e067
> > > [88901.145209] PMD 469e33067
> > > [88901.148225] PTE 80000004675f4161
> > > [88901.151240]
> > > [88901.156497] Oops: 0003 [#1] SMP
> > > [88901.159997] Modules linked in: loop dm_mod nfsv3 nfs fscache ext4 jbd2 mbcache intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nfsd irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ipmi_ssif cdc_ether aesni_intel iTCO_wdt usbnet crypto_simd ipmi_si glue_helper sg iTCO_vendor_support wmi mii cryptd ipmi_devintf ipmi_msghandler auth_rpcgss shpchp i2c_i801 pcspkr ioatdma lpc_ich nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops igb ttm drm ptp pps_core dca megaraid_sas crc32c_intel i2c_algo_bit i2c_core
> > > [88901.221606] CPU: 9 PID: 3278 Comm: nfsd Not tainted 4.12.0-rc3 #42
> > > [88901.228500] Hardware name: IBM System x3650 M4 -[7915ON3]-/00J6520, BIOS -[VVE124AUS-1.30]- 11/21/2012
> > > [88901.238885] task: ffffa10062d12d00 task.stack: ffffc2820478c000
> > > [88901.245488] RIP: 0010:report_bug+0x64/0x100
> > > [88901.250153] RSP: 0018:ffffc2820478f598 EFLAGS: 00010202
> > > [88901.255980] RAX: ffffffffc0360e08 RBX: ffffffffc0301cdc RCX: 0000000000000001
> > > [88901.263940] RDX: 0000000000000907 RSI: ffffffffc038bc80 RDI: ffffffffffff0d22
> > > [88901.271901] RBP: ffffc2820478f5b8 R08: 0000000000000001 R09: 00000000000001cc
> > > [88901.279861] R10: ffffffffb6a36f67 R11: 0000000000000000 R12: ffffc2820478f6e8
> > > [88901.287821] R13: ffffffffc0351b2a R14: 00000000000003fb R15: 0000000000000004
> > > [88901.295783] FS: 0000000000000000(0000) GS:ffffa1007f6c0000(0000) knlGS:0000000000000000
> > > [88901.304818] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [88901.311227] CR2: ffffffffc0360e12 CR3: 00000003b7c09000 CR4: 00000000000406e0
> > > [88901.319188] Call Trace:
> > > [88901.321920] do_trap+0x16a/0x190
> > > [88901.325519] do_error_trap+0x89/0x110
> > > [88901.329652] ? xfs_do_writepage+0x65c/0x680 [xfs]
> > > [88901.334917] ? _xfs_buf_find+0x2a2/0x510 [xfs]
> > > [88901.339875] ? kmem_cache_alloc+0x194/0x1a0
> > > [88901.344561] ? kmem_zone_alloc+0x96/0x100 [xfs]
> > > [88901.349615] do_invalid_op+0x20/0x30
> > > [88901.353604] invalid_op+0x1e/0x30
> > > [88901.357318] RIP: 0010:xfs_do_writepage+0x65c/0x680 [xfs]
> > > [88901.363242] RSP: 0018:ffffc2820478f790 EFLAGS: 00010206
> > > [88901.369069] RAX: 0000000000340140 RBX: ffffc2820478f8b8 RCX: 0000000000000000
> > > [88901.377030] RDX: 0000000000000000 RSI: ffffc2820478f9f8 RDI: fffffc2be1095400
> > > [88901.384991] RBP: ffffc2820478f838 R08: 0000000000000002 R09: 000000000001ff20
> > > [88901.392950] R10: ffffa1007ffd1d00 R11: 0000000000000000 R12: ffffa10053c2ea58
> > > [88901.400911] R13: fffffc2be1095400 R14: ffffc2820478f960 R15: ffffc2820478f9f8
> > > [88901.408876] ? page_mkclean+0x69/0xb0
> > > [88901.412960] ? invalid_page_referenced_vma+0x90/0x90
> > > [88901.418500] write_cache_pages+0x26f/0x510
> > > [88901.423085] ? xfs_vm_set_page_dirty+0x1d0/0x1d0 [xfs]
> > > [88901.428819] ? block_write_end+0x36/0x80
> > > [88901.433193] ? generic_write_end+0x54/0xc0
> > > [88901.437791] xfs_vm_writepages+0xb6/0xd0 [xfs]
> > > [88901.442741] do_writepages+0x1c/0x70
> > > [88901.446727] __filemap_fdatawrite_range+0xc6/0x100
> > > [88901.452071] filemap_write_and_wait_range+0x41/0x90
> > > [88901.457532] xfs_file_fsync+0x63/0x1d0 [xfs]
> > > [88901.462294] vfs_fsync_range+0x3d/0xb0
> > > [88901.466491] xfs_file_write_iter+0x121/0x140 [xfs]
> > > [88901.471835] __do_readv_writev+0x2d3/0x360
> > > [88901.476402] do_readv_writev+0x76/0xb0
> > > [88901.480582] ? do_dentry_open+0x261/0x310
> > > [88901.485069] ? xfs_extent_busy_ag_cmp+0x20/0x20 [xfs]
> > > [88901.490704] vfs_writev+0x3c/0x50
> > > [88901.494409] nfsd_vfs_write+0xd0/0x360 [nfsd]
> > > [88901.499271] ? ima_file_check+0x20/0x30
> > > [88901.503553] nfsd_write+0x15a/0x210 [nfsd]
> > > [88901.508116] nfsd3_proc_write+0x9d/0x130 [nfsd]
> > > [88901.513172] nfsd_dispatch+0xce/0x270 [nfsd]
> > > [88901.517950] svc_process_common+0x387/0x740 [sunrpc]
> > > [88901.523492] svc_process+0xfc/0x1c0 [sunrpc]
> > > [88901.528248] nfsd+0xe9/0x160 [nfsd]
> > > [88901.532137] kthread+0x109/0x140
> > > [88901.535740] ? nfsd_destroy+0x60/0x60 [nfsd]
> > > [88901.540501] ? kthread_park+0x60/0x60
> > > [88901.544586] ret_from_fork+0x25/0x30
> > > [88901.548572] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
> > > [88901.569642] RIP: report_bug+0x64/0x100 RSP: ffffc2820478f598
> > > [88901.575955] CR2: ffffffffc0360e12
> > > [88901.579663] ---[ end trace 62c63b9fb594e7f4 ]---
> > > [88901.584811] Kernel panic - not syncing: Fatal exception
> > > [88901.590694] Kernel Offset: 0x35000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > > [88901.602731] Rebooting in 3 seconds..
>
> > # bad: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
> > # good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
> > git bisect start 'v4.12-rc1' 'v4.11'
> > # good: [221656e7c4ce342b99c31eca96c1cbb6d1dce45f] Merge tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> > git bisect good 221656e7c4ce342b99c31eca96c1cbb6d1dce45f
> > # bad: [c6a677c6f37bb7abc85ba7e3465e82b9f7eb1d91] Merge tag 'staging-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> > git bisect bad c6a677c6f37bb7abc85ba7e3465e82b9f7eb1d91
> > # bad: [e579dde654fc2c6b0d3e4b77a9a4b2d2405c510e] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> > git bisect bad e579dde654fc2c6b0d3e4b77a9a4b2d2405c510e
> > # bad: [a96480723c287c502b02659f4b347aecaa651ea1] Merge tag 'for-linus-4.12b-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
> > git bisect bad a96480723c287c502b02659f4b347aecaa651ea1
> > # good: [16a12fa9aed176444fc795b09e796be41902bb08] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
> > git bisect good 16a12fa9aed176444fc795b09e796be41902bb08
> > # good: [1684096b1ed813f621fb6cbd06e72235c1c2a0ca] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
> > git bisect good 1684096b1ed813f621fb6cbd06e72235c1c2a0ca
> > # bad: [9c35baf6cee9a5745d55de6f9995916dde642517] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
> > git bisect bad 9c35baf6cee9a5745d55de6f9995916dde642517
> > # bad: [666e5a406c3ed562e7b3ceff8b631b6067bdaead] mm: make ttu's return boolean
> > git bisect bad 666e5a406c3ed562e7b3ceff8b631b6067bdaead
> > # good: [a6ffdc07847e74cc244c02ab6d0351a4a5d77281] mm: use is_migrate_highatomic() to simplify the code
> > git bisect good a6ffdc07847e74cc244c02ab6d0351a4a5d77281
> > # bad: [2948be5acf7d798991b127ceda6eea13e52c597f] mm: do not use double negation for testing page flags
> > git bisect bad 2948be5acf7d798991b127ceda6eea13e52c597f
> > # good: [9070733b4efac4bf17f299a81b01c15e206f9ff5] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS
> > git bisect good 9070733b4efac4bf17f299a81b01c15e206f9ff5
> > # bad: [81378da64de6d33d0c200885f1de431c9a3e5ccd] jbd2: mark the transaction context with the scope GFP_NOFS context
> > git bisect bad 81378da64de6d33d0c200885f1de431c9a3e5ccd
> > # good: [9ba1fb2c602a6f2323e3a08ec8e7a8e33bf335f4] xfs: use memalloc_nofs_{save,restore} instead of memalloc_noio*
> > git bisect good 9ba1fb2c602a6f2323e3a08ec8e7a8e33bf335f4
> > # first bad commit: [81378da64de6d33d0c200885f1de431c9a3e5ccd] jbd2: mark the transaction context with the scope GFP_NOFS context
>
>
> --
> Michal Hocko
> SUSE Labs

--
Michal Hocko
SUSE Labs

2017-06-23 08:12:29

by Eryu Guan

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Fri, Jun 23, 2017 at 09:51:56AM +0200, Michal Hocko wrote:
> On Fri 23-06-17 09:43:34, Michal Hocko wrote:
> > [Let's add Jack and keep the full email for reference]
> >
> > On Fri 23-06-17 15:26:56, Eryu Guan wrote:
> [...]
> > > Then I did further confirmation tests:
> > > 1. switch to a new branch with that jbd2 patch as HEAD and compile
> > > kernel, run test with both ext4 and XFS exported on this newly compiled
> > > kernel, it crashed within 5 iterations.
> > >
> > > 2. revert that jbd2 patch (when it was HEAD), run test with both ext4
> > > and XFS exported, kernel survived 20 iterations of full fstests run.
> > >
> > > 3. kernel from step 1 survived 20 iterations of full fstests run, if I
> > > export XFS only (create XFS on /dev/sda4 and mount it at /export/test).
> > >
> > > 4. 4.12-rc1 kernel survived the same test if I export ext4 only (both
> > > /export/test and /export/scratch were mounted as ext4, and this was done
> > > on another test host because I don't have another spare test partition)
> > >
> > >
> > > All these facts seem to confirm that commit 81378da64de6 really is the
> > > culprit, I just don't see how..
>
> AFAIR, no follow up patches to remove GFP_NOFS have been merged into
> ext4 so we are currently only with 81378da64de6 and all it does is that
> _all_ allocations from the transaction context are implicitly GFP_NOFS.
> I can imagine that if there is a GFP_KERNEL allocation in this context
> (which would be incorrect AFAIU) some shrinkers will not be called as a
> result and that might lead to an observable behavior change. But this
> sounds like a wild speculation. The mere fact that xfs oopses and there
> is no ext code in the backtrace is suspicious on its own. Does this oops
> sound familiar to xfs guys?
>
> Anyway, I assume that if you revert 81378da64de6 the issue is gone?
> [...]

Correct, the first thing I did after git bisect finished was to revert
81378da64de6 (with it as HEAD) and re-test, and the kernel survived 20
iterations of the fstests run, as described in steps 1 and 2 above.

Usually the kernel crashed within 5 loops, and 20 loops took around 48
hours to finish, so I think 20 is a reasonable count and I didn't loop
for even longer.

But I could still be wrong: this is not a 100% reproducible bug, so
there's still a chance that 20 iterations are not enough and that
reverting 81378da64de6 just made it even harder to reproduce.

Thanks,
Eryu


2017-06-26 12:45:27

by Dave Chinner

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Fri, Jun 23, 2017 at 09:51:56AM +0200, Michal Hocko wrote:
> On Fri 23-06-17 09:43:34, Michal Hocko wrote:
> > [Let's add Jack and keep the full email for reference]
> >
> > On Fri 23-06-17 15:26:56, Eryu Guan wrote:
> [...]
> > > [...]
>
> AFAIR, no follow up patches to remove GFP_NOFS have been merged into
> ext4 so we are currently only with 81378da64de6 and all it does is that
> _all_ allocations from the transaction context are implicitly GFP_NOFS.
> I can imagine that if there is a GFP_KERNEL allocation in this context
> (which would be incorrect AFAIU) some shrinkers will not be called as a
> result and that might lead to an observable behavior change. But this
> sounds like a wild speculation. The mere fact that xfs oopses and there
> is no ext code in the backtrace is suspicious on its own. Does this oops
> sound familiar to xfs guys?

Nope, but if it's in write_cache_pages() then it's not actually
crashing in XFS code, but in generic page cache and radix tree
traversal code, which means it involves objects allocated from slabs
and pools that are shared by both XFS and ext4.

We've had problems in the past where a use-after-free of bufferheads
in reiserfs was discovered via corruption of bufferheads in XFS code,
so maybe a similar issue is being exposed by the ext4 GFP_NOFS
changes? I.e. try debugging this by treating it as memory corruption
until we know more...

> > > > [88901.418500] write_cache_pages+0x26f/0x510

Knowing what line of code is failing would help identify what object
is problematic....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2017-06-27 13:01:43

by Michal Hocko

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Mon 26-06-17 22:39:50, Dave Chinner wrote:
> On Fri, Jun 23, 2017 at 09:51:56AM +0200, Michal Hocko wrote:
> > On Fri 23-06-17 09:43:34, Michal Hocko wrote:
> > > [Let's add Jack and keep the full email for reference]
> > >
> > > On Fri 23-06-17 15:26:56, Eryu Guan wrote:
> > [...]
> > > > [...]
> >
> > AFAIR, no follow up patches to remove GFP_NOFS have been merged into
> > ext4 so we are currently only with 81378da64de6 and all it does is that
> > _all_ allocations from the transaction context are implicitly GFP_NOFS.
> > I can imagine that if there is a GFP_KERNEL allocation in this context
> > (which would be incorrect AFAIU) some shrinkers will not be called as a
> > result and that might lead to an observable behavior change. But this
> > sounds like a wild speculation. The mere fact that xfs oopses and there
> > is no ext code in the backtrace is suspicious on its own. Does this oops
> > sound familiar to xfs guys?
>
> Nope, but if it's in write_cache_pages() then it's not actually
> crashing in XFS code, but in generic page cache and radix tree
> traversal code. Which means objects that are allocated from slabs
> and pools that are shared by both XFS and ext4.
>
> We've had problems in the past where use after free of bufferheads
> in reiserfs was discovered by corruption of bufferheads in XFS code,
> so maybe there's a similar issue being exposed by the ext4
> GFP_NOFS changes? i.e. try debugging this by treating it as memory
> corruption until we know more...

Yes, this makes a lot of sense. Maybe slab debugging can catch such
corruption earlier?
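For reference, on a CONFIG_SLUB kernel the debug checks can be turned
on from the boot command line without a rebuild (option letters per
the kernel's Documentation/vm/slub.txt; treat this as a sketch to
adapt, not a verified recipe for this particular box):

```
slub_debug=FZPU               # F=sanity checks, Z=red zoning, P=poisoning, U=alloc/free tracking
slub_debug=FZPU,buffer_head   # or limit the overhead to a single suspect cache
```

Poisoning in particular tends to turn a silent use-after-free into an
immediate, and much more localized, oops.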
--
Michal Hocko
SUSE Labs

2017-06-28 02:58:45

by Eryu Guan

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Mon, Jun 26, 2017 at 10:39:50PM +1000, Dave Chinner wrote:
> On Fri, Jun 23, 2017 at 09:51:56AM +0200, Michal Hocko wrote:
> > On Fri 23-06-17 09:43:34, Michal Hocko wrote:
> > > [Let's add Jack and keep the full email for reference]
> > >
> > > On Fri 23-06-17 15:26:56, Eryu Guan wrote:
> > [...]
> > > > [...]
> >
> > AFAIR, no follow up patches to remove GFP_NOFS have been merged into
> > ext4 so we are currently only with 81378da64de6 and all it does is that
> > _all_ allocations from the transaction context are implicitly GFP_NOFS.
> > I can imagine that if there is a GFP_KERNEL allocation in this context
> > (which would be incorrect AFAIU) some shrinkers will not be called as a
> > result and that might lead to an observable behavior change. But this
> > sounds like a wild speculation. The mere fact that xfs oopses and there
> > is no ext code in the backtrace is suspicious on its own. Does this oops
> > sound familiar to xfs guys?
>
> Nope, but if it's in write_cache_pages() then it's not actually
> crashing in XFS code, but in generic page cache and radix tree
> traversal code. Which means objects that are allocated from slabs
> and pools that are shared by both XFS and ext4.
>
> We've had problems in the past where use after free of bufferheads
> in reiserfs was discovered by corruption of bufferheads in XFS code,
> so maybe there's a similar issue being exposed by the ext4
> GFP_NOFS changes? i.e. try debugging this by treating it as memory
> corruption until we know more...
>
> > > > > [88901.418500] write_cache_pages+0x26f/0x510
>
> Knowing what line of code is failing would help identify what object
> is problematic....

This was what I replied to Darrick when he first asked for the same
information:

"
I managed to reproduce again with 4.12-rc4 kernel, call trace is

[ 704.811107] Call Trace:
[ 704.811107] do_trap+0x16a/0x190
[ 704.811107] do_error_trap+0x89/0x110
[ 704.811107] ? xfs_do_writepage+0x6c7/0x6d0 [xfs]
[ 704.811107] ? check_preempt_curr+0x7d/0x90
[ 704.811107] ? ttwu_do_wakeup+0x1e/0x150
[ 704.811107] do_invalid_op+0x20/0x30
[ 704.811107] invalid_op+0x1e/0x30

and xfs_do_writepage+0x6c7 is

(gdb) l *(xfs_do_writepage+0x6c7)
0x679e7 is in xfs_do_writepage (fs/xfs/xfs_aops.c:850).
845 int error = 0;
846 int count = 0;
847 int uptodate = 1;
848 unsigned int new_type;
849
850 bh = head = page_buffers(page);
851 offset = page_offset(page);
852 do {
853 if (offset >= end_offset)
854 break;
"

Later on, I did the same thing several times, and the crash ended up
on different lines of code. I can't remember the exact line numbers
now, but they always involved dealing with buffer heads.
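For what it's worth, the line resolved above is the page_buffers()
sanity check, which in 4.12-era kernels is roughly the following
(paraphrased from include/linux/buffer_head.h; verify against the
exact tree being tested):

```c
#define page_buffers(page)					\
	({							\
		BUG_ON(!PagePrivate(page));			\
		((struct buffer_head *)page_private(page));	\
	})
```

So the BUG fires when a page under writeback no longer has buffer
heads attached, i.e. PG_private was cleared or the buffers were torn
down underneath xfs_do_writepage(), which is consistent with treating
this as memory corruption.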

Thanks,
Eryu

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]

2017-06-28 03:04:41

by Eryu Guan

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Fri, Jun 23, 2017 at 03:26:56PM +0800, Eryu Guan wrote:
> [As this bug somehow involves in ext4/jbd2 changes, cc ext4 list too]
>
> On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> > Hi all,
> >
> > Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> > time in my fstests (xfstests) runs, I appended the console log of an
> > NFSv3 crash to the end of this mail.
>
> Some follow-up updates on this bug. *My* conclusion is, commit
> 81378da64de6 ("jbd2: mark the transaction context with the scope
> GFP_NOFS context") introduced this issue, and this is a bug that needs
> ext4, XFS and NFS to reproduce.
>
> For more details please see below.
>
> >
> > I was exporting a directory resided on XFS, and loopback mounted the NFS
> > export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
> > crash usually happens when running test case generic/029 or generic/095.
> >
> > But the problem is, there's no easy and efficient way to reproduce it, I
> > tried run only generic/029 and generic/095 in a loop for 1000 times but
> > failed, I also tried run the 'quick' group tests only for 50 iterations
> > but failed again. It seems that the only reliable way to reproduce it is
> > run the 'auto' group tests for 20 iterations.
> >
> > i=0
> > while [ $i -lt 20 ]; do
> > ./check -nfs -g auto
> > ((i++))
> > done
> >
> > And usually server crashed within 5 iterations, but at times it could
> > survive 10 iterations and only crashed if you left it running for more
> > iterations. This makes it hard to bisect and bisecting is very
> > time-consuming.
> >
> > (The bisecting is running now, it needs a few days to finish. My first
> > two attempts pointed first bad commit to some mm patches, but reverting
> > that patch didn't prevent server from crashing, so I enlarged the loop
> > count and started bisecting for the third time).
>
> The third round of bisect finally finished after two weeks of painful
> testing, and git bisect pointed first-bad to commit 81378da64de6 ("jbd2:
> mark the transaction context with the scope GFP_NOFS context"), which
> seemed very weird to me, because the crash always happens in XFS code.
>
> But this reminded me that I was not only exporting XFS for NFS testing,
> but also ext4. So my full test setup is
>
> # mount -t ext4 /dev/sda4 /export/test
> # showmount -e localhost
> Export list for localhost:
> /export/scratch *
> /export/test *
>
> (/export/scratch is on rootfs, which is XFS)
>
> # cat local.config
> TEST_DEV=localhost:/export/test
> TEST_DIR=/mnt/testarea/test
> SCRATCH_DEV=localhost:/export/scratch
> SCRATCH_MNT=/mnt/testarea/scratch
>
>
> Then I did further confirmation tests:
> 1. switch to a new branch with that jbd2 patch as HEAD and compile
> kernel, run test with both ext4 and XFS exported on this newly compiled
> kernel, it crashed within 5 iterations.
>
> 2. revert that jbd2 patch (when it was HEAD), run test with both ext4
> and XFS exported, kernel survived 20 iterations of full fstests run.
>
> 3. kernel from step 1 survived 20 iterations of full fstests run, if I
> export XFS only (create XFS on /dev/sda4 and mount it at /export/test).
>
> 4. 4.12-rc1 kernel survived the same test if I export ext4 only (both
> /export/test and /export/scratch were mounted as ext4, and this was done
> on another test host because I don't have another spare test partition)
>
>
> All these facts seem to confirm that commit 81378da64de6 really is the
> culprit, I just don't see how..
>
> I attached git bisect log, and if you need more information please let
> me know. BTW, I'm testing 4.12-rc6 kernel now to see if it's already
> been fixed there.

4.12-rc6 kernel survived 30 iterations of the full fstests run. Not sure
if it's been fixed or just made even harder to reproduce.

Thanks,
Eryu

2017-06-07 03:34:00

by Eryu Guan

[permalink] [raw]
Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> Hi all,
>
> Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> time in my fstests (xfstests) runs, I appended the console log of an
> NFSv3 crash to the end of this mail.
>
> I was exporting a directory resided on XFS, and loopback mounted the NFS
> export on localhost. Both NFSv3 and NFSv4 could hit this crash. The

I did more test runs with ext4 exported; it seems the crash is only
reproducible when the exported local filesystem is XFS. ext4 survived 20
rounds of fstests 'auto' group runs.

And this crash can still be seen with 4.12-rc4 kernel.

Thanks,
Eryu

> crash usually happens when running test case generic/029 or generic/095.
>
> But the problem is, there's no easy and efficient way to reproduce it, I
> tried run only generic/029 and generic/095 in a loop for 1000 times but
> failed, I also tried run the 'quick' group tests only for 50 iterations
> but failed again. It seems that the only reliable way to reproduce it is
> run the 'auto' group tests for 20 iterations.
>
> i=0
> while [ $i -lt 20 ]; do
> ./check -nfs -g auto
> ((i++))
> done
>
> And usually server crashed within 5 iterations, but at times it could
> survive 10 iterations and only crashed if you left it running for more
> iterations. This makes it hard to bisect and bisecting is very
> time-consuming.
>
> (The bisecting is running now, it needs a few days to finish. My first
> two attempts pointed first bad commit to some mm patches, but reverting
> that patch didn't prevent server from crashing, so I enlarged the loop
> count and started bisecting for the third time).
>
> If more info is needed please let me know.
>
> Thanks,
> Eryu
>
>
> [88895.796834] run fstests generic/028 at 2017-06-01 00:43:18
> [88900.945420] run fstests generic/029 at 2017-06-01 00:43:23
> [88901.127315] BUG: unable to handle kernel paging request at ffffffffc0360e12
> [88901.135095] IP: report_bug+0x64/0x100
> [88901.139177] PGD 3b7c0c067
> [88901.139177] P4D 3b7c0c067
> [88901.142194] PUD 3b7c0e067
> [88901.145209] PMD 469e33067
> [88901.148225] PTE 80000004675f4161
> [88901.151240]
> [88901.156497] Oops: 0003 [#1] SMP
> [88901.159997] Modules linked in: loop dm_mod nfsv3 nfs fscache ext4 jbd2 mbcache intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nfsd irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ipmi_ssif cdc_ether aesni_intel iTCO_wdt usbnet crypto_simd ipmi_si glue_helper sg iTCO_vendor_support wmi mii cryptd ipmi_devintf ipmi_msghandler auth_rpcgss shpchp i2c_i801 pcspkr ioatdma lpc_ich nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops igb ttm drm ptp pps_core dca megaraid_sas crc32c_intel i2c_algo_bit i2c_core
> [88901.221606] CPU: 9 PID: 3278 Comm: nfsd Not tainted 4.12.0-rc3 #42
> [88901.228500] Hardware name: IBM System x3650 M4 -[7915ON3]-/00J6520, BIOS -[VVE124AUS-1.30]- 11/21/2012
> [88901.238885] task: ffffa10062d12d00 task.stack: ffffc2820478c000
> [88901.245488] RIP: 0010:report_bug+0x64/0x100
> [88901.250153] RSP: 0018:ffffc2820478f598 EFLAGS: 00010202
> [88901.255980] RAX: ffffffffc0360e08 RBX: ffffffffc0301cdc RCX: 0000000000000001
> [88901.263940] RDX: 0000000000000907 RSI: ffffffffc038bc80 RDI: ffffffffffff0d22
> [88901.271901] RBP: ffffc2820478f5b8 R08: 0000000000000001 R09: 00000000000001cc
> [88901.279861] R10: ffffffffb6a36f67 R11: 0000000000000000 R12: ffffc2820478f6e8
> [88901.287821] R13: ffffffffc0351b2a R14: 00000000000003fb R15: 0000000000000004
> [88901.295783] FS: 0000000000000000(0000) GS:ffffa1007f6c0000(0000) knlGS:0000000000000000
> [88901.304818] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [88901.311227] CR2: ffffffffc0360e12 CR3: 00000003b7c09000 CR4: 00000000000406e0
> [88901.319188] Call Trace:
> [88901.321920] do_trap+0x16a/0x190
> [88901.325519] do_error_trap+0x89/0x110
> [88901.329652] ? xfs_do_writepage+0x65c/0x680 [xfs]
> [88901.334917] ? _xfs_buf_find+0x2a2/0x510 [xfs]
> [88901.339875] ? kmem_cache_alloc+0x194/0x1a0
> [88901.344561] ? kmem_zone_alloc+0x96/0x100 [xfs]
> [88901.349615] do_invalid_op+0x20/0x30
> [88901.353604] invalid_op+0x1e/0x30
> [88901.357318] RIP: 0010:xfs_do_writepage+0x65c/0x680 [xfs]
> [88901.363242] RSP: 0018:ffffc2820478f790 EFLAGS: 00010206
> [88901.369069] RAX: 0000000000340140 RBX: ffffc2820478f8b8 RCX: 0000000000000000
> [88901.377030] RDX: 0000000000000000 RSI: ffffc2820478f9f8 RDI: fffffc2be1095400
> [88901.384991] RBP: ffffc2820478f838 R08: 0000000000000002 R09: 000000000001ff20
> [88901.392950] R10: ffffa1007ffd1d00 R11: 0000000000000000 R12: ffffa10053c2ea58
> [88901.400911] R13: fffffc2be1095400 R14: ffffc2820478f960 R15: ffffc2820478f9f8
> [88901.408876] ? page_mkclean+0x69/0xb0
> [88901.412960] ? invalid_page_referenced_vma+0x90/0x90
> [88901.418500] write_cache_pages+0x26f/0x510
> [88901.423085] ? xfs_vm_set_page_dirty+0x1d0/0x1d0 [xfs]
> [88901.428819] ? block_write_end+0x36/0x80
> [88901.433193] ? generic_write_end+0x54/0xc0
> [88901.437791] xfs_vm_writepages+0xb6/0xd0 [xfs]
> [88901.442741] do_writepages+0x1c/0x70
> [88901.446727] __filemap_fdatawrite_range+0xc6/0x100
> [88901.452071] filemap_write_and_wait_range+0x41/0x90
> [88901.457532] xfs_file_fsync+0x63/0x1d0 [xfs]
> [88901.462294] vfs_fsync_range+0x3d/0xb0
> [88901.466491] xfs_file_write_iter+0x121/0x140 [xfs]
> [88901.471835] __do_readv_writev+0x2d3/0x360
> [88901.476402] do_readv_writev+0x76/0xb0
> [88901.480582] ? do_dentry_open+0x261/0x310
> [88901.485069] ? xfs_extent_busy_ag_cmp+0x20/0x20 [xfs]
> [88901.490704] vfs_writev+0x3c/0x50
> [88901.494409] nfsd_vfs_write+0xd0/0x360 [nfsd]
> [88901.499271] ? ima_file_check+0x20/0x30
> [88901.503553] nfsd_write+0x15a/0x210 [nfsd]
> [88901.508116] nfsd3_proc_write+0x9d/0x130 [nfsd]
> [88901.513172] nfsd_dispatch+0xce/0x270 [nfsd]
> [88901.517950] svc_process_common+0x387/0x740 [sunrpc]
> [88901.523492] svc_process+0xfc/0x1c0 [sunrpc]
> [88901.528248] nfsd+0xe9/0x160 [nfsd]
> [88901.532137] kthread+0x109/0x140
> [88901.535740] ? nfsd_destroy+0x60/0x60 [nfsd]
> [88901.540501] ? kthread_park+0x60/0x60
> [88901.544586] ret_from_fork+0x25/0x30
> [88901.548572] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
> [88901.569642] RIP: report_bug+0x64/0x100 RSP: ffffc2820478f598
> [88901.575955] CR2: ffffffffc0360e12
> [88901.579663] ---[ end trace 62c63b9fb594e7f4 ]---
> [88901.584811] Kernel panic - not syncing: Fatal exception
> [88901.590694] Kernel Offset: 0x35000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [88901.602731] Rebooting in 3 seconds..

2017-06-07 15:53:57

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Wed, Jun 07, 2017 at 11:33:57AM +0800, Eryu Guan wrote:
> On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> > Hi all,
> >
> > Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> > time in my fstests (xfstests) runs, I appended the console log of an
> > NFSv3 crash to the end of this mail.
> >
> > I was exporting a directory resided on XFS, and loopback mounted the NFS
> > export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
>
> I did more test runs with ext4 exported, seems it's only reproducible
> when the exported local filesystem is XFS, ext4 survived 20 rounds of
> fstests runs auto group tests.
>
> And this crash can still be seen with 4.12-rc4 kernel.

So this sounds a little more likely to be an xfs bug. I also don't see
any likely looking nfsd commits in the v4.11..v4.12-rc1 range.

But I take it you can't reproduce when running directly on xfs with no
NFS involvement?

Not knowing this code well I don't know what to suggest other than the
long painful bisect.

--b.

> Thanks,
> Eryu
>
> > [original report and console log snipped]
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-07 19:23:45

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> Hi all,
>
> Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> time in my fstests (xfstests) runs, I appended the console log of an
> NFSv3 crash to the end of this mail.
>
> I was exporting a directory resided on XFS, and loopback mounted the NFS
> export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
> crash usually happens when running test case generic/029 or generic/095.
>
> [reproducer details and oops header snipped]
> [88901.319188] Call Trace:
> [88901.321920] do_trap+0x16a/0x190
> [88901.325519] do_error_trap+0x89/0x110
> [88901.329652] ? xfs_do_writepage+0x65c/0x680 [xfs]

Hmm, does gdb think this address is at:

(gdb) l *(xfs_do_writepage+0x65c)
0x7c17c is in xfs_do_writepage (/storage/home/djwong/cdev/work/linux-xfs/include/linux/highmem.h:77).

like my system? That location is preempt_disable() in kunmap_atomic(),
which is ... odd. I tried to run your reproducer script but smashed
into this instead:

[ 826.735882] ============================================
[ 826.736952] WARNING: possible recursive locking detected
[ 826.737992] 4.12.0-rc4-xfsx #7 Not tainted
[ 826.738636] --------------------------------------------
[ 826.739700] nfsd/1416 is trying to acquire lock:
[ 826.740423] (&stp->st_mutex){+.+.+.}, at: [<ffffffffa004cda6>] nfsd4_process_open2+0x4f6/0x1360 [nfsd]
[ 826.741749]
[ 826.741749] but task is already holding lock:
[ 826.742897] (&stp->st_mutex){+.+.+.}, at: [<ffffffffa004cd3f>] nfsd4_process_open2+0x48f/0x1360 [nfsd]
[ 826.744211]
[ 826.744211] other info that might help us debug this:
[ 826.745262] Possible unsafe locking scenario:
[ 826.745262]
[ 826.746203] CPU0
[ 826.746651] ----
[ 826.747071] lock(&stp->st_mutex);
[ 826.747795] lock(&stp->st_mutex);
[ 826.748392]
[ 826.748392] *** DEADLOCK ***
[ 826.748392]
[ 826.749577] May be due to missing lock nesting notation
[ 826.749577]
[ 826.750912] 1 lock held by nfsd/1416:
[ 826.751570] #0: (&stp->st_mutex){+.+.+.}, at: [<ffffffffa004cd3f>] nfsd4_process_open2+0x48f/0x1360 [nfsd]
[ 826.753350]
[ 826.753350] stack backtrace:
[ 826.754122] CPU: 3 PID: 1416 Comm: nfsd Not tainted 4.12.0-rc4-xfsx #7
[ 826.755258] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 826.756856] Call Trace:
[ 826.757381] dump_stack+0x85/0xc7
[ 826.758029] __lock_acquire+0x1567/0x15c0
[ 826.758771] ? kvm_clock_read+0x23/0x40
[ 826.759423] lock_acquire+0xac/0x200
[ 826.760014] ? lock_acquire+0xac/0x200
[ 826.760665] ? nfsd4_process_open2+0x4f6/0x1360 [nfsd]
[ 826.761618] ? nfsd4_process_open2+0x4f6/0x1360 [nfsd]
[ 826.762549] __mutex_lock+0x75/0x9c0
[ 826.763160] ? nfsd4_process_open2+0x4f6/0x1360 [nfsd]
[ 826.763995] ? nfsd4_process_open2+0x4e1/0x1360 [nfsd]
[ 826.764836] ? nfsd4_process_open2+0x4e1/0x1360 [nfsd]
[ 826.765675] mutex_lock_nested+0x1b/0x20
[ 826.766334] ? _raw_spin_unlock+0x31/0x50
[ 826.767014] ? mutex_lock_nested+0x1b/0x20
[ 826.767690] nfsd4_process_open2+0x4f6/0x1360 [nfsd]
[ 826.768561] nfsd4_open+0x574/0x830 [nfsd]
[ 826.769248] ? nfsd4_open+0x574/0x830 [nfsd]
[ 826.769991] nfsd4_proc_compound+0x31b/0x620 [nfsd]
[ 826.770796] nfsd_dispatch+0xc8/0x260 [nfsd]
[ 826.771547] svc_process_common+0x351/0x690
[ 826.772270] svc_process+0x11f/0x380
[ 826.772879] nfsd+0x176/0x280 [nfsd]
[ 826.773539] kthread+0x117/0x150
[ 826.774124] ? nfsd_destroy+0x180/0x180 [nfsd]
[ 826.774969] ? kthread_create_on_node+0x40/0x40
[ 826.775889] ret_from_fork+0x2a/0x40
[ 826.778354] NFS: nfs4_reclaim_open_state: Lock reclaim failed!

After which the NFS mounts were all wedged up and wouldn't unmount.

<shrug>

--D

> [remainder of console log snipped]
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-08 02:27:40

by Eryu Guan

[permalink] [raw]
Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Wed, Jun 07, 2017 at 11:53:56AM -0400, J. Bruce Fields wrote:
> On Wed, Jun 07, 2017 at 11:33:57AM +0800, Eryu Guan wrote:
> > On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> > > Hi all,
> > >
> > > Starting from 4.12-rc1 kernel, I saw Linux nfs server crashes all the
> > > time in my fstests (xfstests) runs, I appended the console log of an
> > > NFSv3 crash to the end of this mail.
> > >
> > > I was exporting a directory resided on XFS, and loopback mounted the NFS
> > > export on localhost. Both NFSv3 and NFSv4 could hit this crash. The
> >
> > I did more test runs with ext4 exported, seems it's only reproducible
> > when the exported local filesystem is XFS, ext4 survived 20 rounds of
> > fstests runs auto group tests.
> >
> > And this crash can still be seen with 4.12-rc4 kernel.
>
> So this sounds a little more likely to be an xfs bug. I also don't see
> any likely looking nfsd commits in the v4.11..v4.12-rc1 range.
>
> But I take it you can't reproduce when running directly on xfs with no
> NFS involvement?

Correct, I don't see any xfs crashes in fstests run without nfs.

>
> Not knowing this code well I don't know what to suggest other than the
> long painful bisect.

Bisect is still running, automatically. I guess it still needs several
days to finish.

Thanks for looking at it!

Eryu

>
> --b.
>
> > [original report and console log snipped]
> > > [88901.412960] ? invalid_page_referenced_vma+0x90/0x90
> > > [88901.418500] write_cache_pages+0x26f/0x510
> > > [88901.423085] ? xfs_vm_set_page_dirty+0x1d0/0x1d0 [xfs]
> > > [88901.428819] ? block_write_end+0x36/0x80
> > > [88901.433193] ? generic_write_end+0x54/0xc0
> > > [88901.437791] xfs_vm_writepages+0xb6/0xd0 [xfs]
> > > [88901.442741] do_writepages+0x1c/0x70
> > > [88901.446727] __filemap_fdatawrite_range+0xc6/0x100
> > > [88901.452071] filemap_write_and_wait_range+0x41/0x90
> > > [88901.457532] xfs_file_fsync+0x63/0x1d0 [xfs]
> > > [88901.462294] vfs_fsync_range+0x3d/0xb0
> > > [88901.466491] xfs_file_write_iter+0x121/0x140 [xfs]
> > > [88901.471835] __do_readv_writev+0x2d3/0x360
> > > [88901.476402] do_readv_writev+0x76/0xb0
> > > [88901.480582] ? do_dentry_open+0x261/0x310
> > > [88901.485069] ? xfs_extent_busy_ag_cmp+0x20/0x20 [xfs]
> > > [88901.490704] vfs_writev+0x3c/0x50
> > > [88901.494409] nfsd_vfs_write+0xd0/0x360 [nfsd]
> > > [88901.499271] ? ima_file_check+0x20/0x30
> > > [88901.503553] nfsd_write+0x15a/0x210 [nfsd]
> > > [88901.508116] nfsd3_proc_write+0x9d/0x130 [nfsd]
> > > [88901.513172] nfsd_dispatch+0xce/0x270 [nfsd]
> > > [88901.517950] svc_process_common+0x387/0x740 [sunrpc]
> > > [88901.523492] svc_process+0xfc/0x1c0 [sunrpc]
> > > [88901.528248] nfsd+0xe9/0x160 [nfsd]
> > > [88901.532137] kthread+0x109/0x140
> > > [88901.535740] ? nfsd_destroy+0x60/0x60 [nfsd]
> > > [88901.540501] ? kthread_park+0x60/0x60
> > > [88901.544586] ret_from_fork+0x25/0x30
> > > [88901.548572] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
> > > [88901.569642] RIP: report_bug+0x64/0x100 RSP: ffffc2820478f598
> > > [88901.575955] CR2: ffffffffc0360e12
> > > [88901.579663] ---[ end trace 62c63b9fb594e7f4 ]---
> > > [88901.584811] Kernel panic - not syncing: Fatal exception
> > > [88901.590694] Kernel Offset: 0x35000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > > [88901.602731] Rebooting in 3 seconds..
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-08 05:37:53

by Eryu Guan

Subject: Re: [v4.12-rc1 regression] nfs server crashed in fstests run

On Wed, Jun 07, 2017 at 12:23:38PM -0700, Darrick J. Wong wrote:
> On Fri, Jun 02, 2017 at 02:04:57PM +0800, Eryu Guan wrote:
> > Hi all,
> >
> > Starting from the 4.12-rc1 kernel, I've seen the Linux NFS server
> > crash all the time in my fstests (xfstests) runs; I appended the
> > console log of an NFSv3 crash to the end of this mail.
> >
> > I was exporting a directory residing on XFS and had loopback-mounted
> > the NFS export on localhost. Both NFSv3 and NFSv4 could hit this
> > crash. The crash usually happens when running test case generic/029 or
> > generic/095.
> >
> > But the problem is, there's no easy and efficient way to reproduce it.
> > I tried running only generic/029 and generic/095 in a loop for 1000
> > iterations without hitting it, and I also tried running only the
> > 'quick' group tests for 50 iterations, again without success. It seems
> > the only reliable way to reproduce it is to run the 'auto' group tests
> > for 20 iterations.
> >
> > i=0
> > while [ $i -lt 20 ]; do
> > ./check -nfs -g auto
> > ((i++))
> > done
> >
> > And usually the server crashed within 5 iterations, but at times it
> > survived 10 iterations and only crashed after running for more
> > iterations. This makes the problem hard to bisect, and bisecting is
> > very time-consuming.
> >
> > (The bisect is running now; it will take a few days to finish. My
> > first two attempts pointed to some mm patches as the first bad
> > commit, but reverting them didn't prevent the server from crashing, so
> > I enlarged the loop count and started bisecting a third time.)
> >
> > If more info is needed please let me know.
> >
> > Thanks,
> > Eryu
> >
> >
> > [88895.796834] run fstests generic/028 at 2017-06-01 00:43:18
> > [88900.945420] run fstests generic/029 at 2017-06-01 00:43:23
> > [88901.127315] BUG: unable to handle kernel paging request at ffffffffc0360e12
> > [88901.135095] IP: report_bug+0x64/0x100
> > [88901.139177] PGD 3b7c0c067
> > [88901.139177] P4D 3b7c0c067
> > [88901.142194] PUD 3b7c0e067
> > [88901.145209] PMD 469e33067
> > [88901.148225] PTE 80000004675f4161
> > [88901.151240]
> > [88901.156497] Oops: 0003 [#1] SMP
> > [88901.159997] Modules linked in: loop dm_mod nfsv3 nfs fscache ext4 jbd2 mbcache intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nfsd irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ipmi_ssif cdc_ether aesni_intel iTCO_wdt usbnet crypto_simd ipmi_si glue_helper sg iTCO_vendor_support wmi mii cryptd ipmi_devintf ipmi_msghandler auth_rpcgss shpchp i2c_i801 pcspkr ioatdma lpc_ich nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops igb ttm drm ptp pps_core dca megaraid_sas crc32c_intel i2c_algo_bit i2c_core
> > [88901.221606] CPU: 9 PID: 3278 Comm: nfsd Not tainted 4.12.0-rc3 #42
> > [88901.228500] Hardware name: IBM System x3650 M4 -[7915ON3]-/00J6520, BIOS -[VVE124AUS-1.30]- 11/21/2012
> > [88901.238885] task: ffffa10062d12d00 task.stack: ffffc2820478c000
> > [88901.245488] RIP: 0010:report_bug+0x64/0x100
> > [88901.250153] RSP: 0018:ffffc2820478f598 EFLAGS: 00010202
> > [88901.255980] RAX: ffffffffc0360e08 RBX: ffffffffc0301cdc RCX: 0000000000000001
> > [88901.263940] RDX: 0000000000000907 RSI: ffffffffc038bc80 RDI: ffffffffffff0d22
> > [88901.271901] RBP: ffffc2820478f5b8 R08: 0000000000000001 R09: 00000000000001cc
> > [88901.279861] R10: ffffffffb6a36f67 R11: 0000000000000000 R12: ffffc2820478f6e8
> > [88901.287821] R13: ffffffffc0351b2a R14: 00000000000003fb R15: 0000000000000004
> > [88901.295783] FS: 0000000000000000(0000) GS:ffffa1007f6c0000(0000) knlGS:0000000000000000
> > [88901.304818] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [88901.311227] CR2: ffffffffc0360e12 CR3: 00000003b7c09000 CR4: 00000000000406e0
> > [88901.319188] Call Trace:
> > [88901.321920] do_trap+0x16a/0x190
> > [88901.325519] do_error_trap+0x89/0x110
> > [88901.329652] ? xfs_do_writepage+0x65c/0x680 [xfs]
>
> Hmm, does gdb think this address is at:
>
> (gdb) l *(xfs_do_writepage+0x65c)
> 0x7c17c is in xfs_do_writepage (/storage/home/djwong/cdev/work/linux-xfs/include/linux/highmem.h:77).

I managed to reproduce it again with the 4.12-rc4 kernel; the call trace is:

[ 704.811107] Call Trace:
[ 704.811107] do_trap+0x16a/0x190
[ 704.811107] do_error_trap+0x89/0x110
[ 704.811107] ? xfs_do_writepage+0x6c7/0x6d0 [xfs]
[ 704.811107] ? check_preempt_curr+0x7d/0x90
[ 704.811107] ? ttwu_do_wakeup+0x1e/0x150
[ 704.811107] do_invalid_op+0x20/0x30
[ 704.811107] invalid_op+0x1e/0x30

and xfs_do_writepage+0x6c7 resolves to:

(gdb) l *(xfs_do_writepage+0x6c7)
0x679e7 is in xfs_do_writepage (fs/xfs/xfs_aops.c:850).
845 int error = 0;
846 int count = 0;
847 int uptodate = 1;
848 unsigned int new_type;
849
850 bh = head = page_buffers(page);
851 offset = page_offset(page);
852 do {
853 if (offset >= end_offset)
854 break;

Full console log appended at the end for reference.

Thanks,
Eryu

[ 704.560907] run fstests generic/014 at 2017-06-08 13:21:04
[ 704.807749] BUG: unable to handle kernel paging request at ffffffffa06884c2
[ 704.808362] IP: report_bug+0x64/0x100
[ 704.808362] PGD 1c0c067
[ 704.808362] P4D 1c0c067
[ 704.808362] PUD 1c0d063
[ 704.808362] PMD 1f40ce067
[ 704.811107] PTE 80000001f405f161
[ 704.811107]
[ 704.811107] Oops: 0003 [#1] SMP
[ 704.811107] Modules linked in: xfs rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ipt_REJECT ip6t_REJECT xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter btrfs xor raid6_pq ppdev i2c_piix4 parport_pc i2c_core parport virtio_balloon pcspkr nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 jbd2 mbcache ata_generic pata_acpi virtio_blk virtio_net ata_piix libata virtio_pci serio_raw virtio_ring virtio floppy
[ 704.811107] CPU: 1 PID: 1443 Comm: nfsd Not tainted 4.12.0-rc4 #42
[ 704.811107] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[ 704.811107] task: ffff880210ed8000 task.stack: ffffc900016b4000
[ 704.811107] RIP: 0010:report_bug+0x64/0x100
[ 704.811107] RSP: 0018:ffffc900016b75b8 EFLAGS: 00010202
[ 704.811107] RAX: ffffffffa06884b8 RBX: ffffffffa06119b7 RCX: 0000000000000001
[ 704.811107] RDX: 0000000000000907 RSI: ffffffffa06b2c00 RDI: fffffffffffe4f3c
[ 704.811107] RBP: ffffc900016b75d8 R08: 0000000000000001 R09: 0000000000000204
[ 704.811107] R10: ffffffff81a27c2f R11: 0000000000000000 R12: ffffc900016b7708
[ 704.811107] R13: ffffffffa066d3f4 R14: 00000000000003fb R15: 0000000000000004
[ 704.811107] FS: 0000000000000000(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
[ 704.811107] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 704.811107] CR2: ffffffffa06884c2 CR3: 0000000001c09000 CR4: 00000000000006e0
[ 704.811107] Call Trace:
[ 704.811107] do_trap+0x16a/0x190
[ 704.811107] do_error_trap+0x89/0x110
[ 704.811107] ? xfs_do_writepage+0x6c7/0x6d0 [xfs]
[ 704.811107] ? check_preempt_curr+0x7d/0x90
[ 704.811107] ? ttwu_do_wakeup+0x1e/0x150
[ 704.811107] do_invalid_op+0x20/0x30
[ 704.811107] invalid_op+0x1e/0x30
[ 704.811107] RIP: 0010:xfs_do_writepage+0x6c7/0x6d0 [xfs]
[ 704.811107] RSP: 0018:ffffc900016b77b8 EFLAGS: 00010206
[ 704.811107] RAX: 0000000000340140 RBX: ffffc900016b78d8 RCX: ffff88020f82d400
[ 704.811107] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffea0007a991c0
[ 704.811107] RBP: ffffc900016b7858 R08: 0000000000000002 R09: 000000000001fec8
[ 704.811107] R10: ffff88021ffd8d00 R11: 0000000000000000 R12: ffffc900016b7980
[ 704.811107] R13: ffffea0007a991c0 R14: 0000000000004326 R15: ffff880213a00168
[ 704.811107] ? page_mkclean+0x69/0xb0
[ 704.811107] ? invalid_page_referenced_vma+0x90/0x90
[ 704.811107] write_cache_pages+0x26f/0x510
[ 704.811107] ? xfs_aops_discard_page+0x140/0x140 [xfs]
[ 704.811107] ? generic_write_end+0x7e/0xc0
[ 704.811107] xfs_vm_writepages+0xb6/0xd0 [xfs]
[ 704.811107] do_writepages+0x1c/0x70
[ 704.811107] __filemap_fdatawrite_range+0xc6/0x100
[ 704.811107] filemap_write_and_wait_range+0x41/0x90
[ 704.811107] xfs_file_fsync+0x63/0x1d0 [xfs]
[ 704.811107] vfs_fsync_range+0x3d/0xb0
[ 704.811107] xfs_file_write_iter+0x121/0x140 [xfs]
[ 704.811107] __do_readv_writev+0x2d3/0x360
[ 704.811107] do_readv_writev+0x76/0xb0
[ 704.811107] ? __wake_up+0x44/0x50
[ 704.811107] ? nfs4_put_stid+0x78/0x90 [nfsd]
[ 704.811107] vfs_writev+0x3c/0x50
[ 704.811107] nfsd_vfs_write+0xd0/0x360 [nfsd]
[ 704.811107] nfsd4_write+0x1a3/0x1d0 [nfsd]
[ 704.811107] nfsd4_proc_compound+0x3b6/0x710 [nfsd]
[ 704.811107] nfsd_dispatch+0xce/0x270 [nfsd]
[ 704.811107] svc_process_common+0x387/0x740 [sunrpc]
[ 704.811107] svc_process+0x105/0x1c0 [sunrpc]
[ 704.811107] nfsd+0xe9/0x160 [nfsd]
[ 704.811107] kthread+0x109/0x140
[ 704.811107] ? nfsd_destroy+0x60/0x60 [nfsd]
[ 704.811107] ? kthread_park+0x60/0x60
[ 704.811107] ret_from_fork+0x25/0x30
[ 704.811107] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
[ 704.811107] RIP: report_bug+0x64/0x100 RSP: ffffc900016b75b8
[ 704.811107] CR2: ffffffffa06884c2
[ 704.811107] ---[ end trace a44b2c0cad7f2afe ]---
[ 704.811107] Kernel panic - not syncing: Fatal exception
[ 704.811107] Kernel Offset: disabled
[ 704.811107] ---[ end Kernel panic - not syncing: Fatal exception