2011-03-25 20:11:59

by Jeff Layton

Subject: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

I got this today on my Fedora 15 alpha machine:

[18959.724464] BUG: Dentry ffff8801a6fe8840{i=2,n=} still in use (1) [unmount of nfs4 0:2c]
[18959.724507] ------------[ cut here ]------------
[18959.724609] kernel BUG at fs/dcache.c:947!
[18959.724691] invalid opcode: 0000 [#1] SMP
[18959.724781] last sysfs file: /sys/devices/system/cpu/cpu3/topology/thread_siblings
[18959.724926] CPU 3
[18959.724967] Modules linked in: nfs lockd nfs_acl tun fuse ip6table_filter ip6_tables ebtable_nat ebtables sco bnep l2cap bluetooth rfkill it87 adt7475 hwmon_vid rpcsec_gss_krb5 auth_rpcgss des_generic sunrpc cachefiles fscache cpufreq_ondemand powernow_k8 freq_table mperf xfs exportfs snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_usb_audio snd_hda_codec snd_hwdep virtio_net snd_usbmidi_lib kvm_amd snd_seq snd_rawmidi sp5100_tco i2c_piix4 raid1 snd_pcm snd_seq_device kvm snd_timer snd uvcvideo edac_core ppdev edac_mce_amd parport_pc parport snd_page_alloc videodev k10temp r8169 mii soundcore v4l2_compat_ioctl32 wmi uinput microcode ipv6 ata_generic pata_acpi pata_atiixp usb_storage radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[18959.725006]
[18959.725006] Pid: 17684, comm: umount.nfs4 Not tainted 2.6.38.1-6.fc15.x86_64 #1 BIOSTAR Group TA790GX 128M/TA790GX 128M
[18959.725006] RIP: 0010:[<ffffffff8113289f>] [<ffffffff8113289f>] shrink_dcache_for_umount_subtree+0x104/0x1c9
[18959.725006] RSP: 0018:ffff8801ee94bd98 EFLAGS: 00010296
[18959.725006] RAX: 0000000000000062 RBX: ffff8801a6fe8840 RCX: 000000000000affd
[18959.725006] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
[18959.725006] RBP: ffff8801ee94bdc8 R08: 0000000000000002 R09: 0000ffff00066c0a
[18959.725006] R10: 0000ffff00066c0a R11: 0000000000000000 R12: ffff8801a6fe8840
[18959.725006] R13: ffff8801974281c0 R14: ffff8801a6fcccb0 R15: 0000000000000000
[18959.725006] FS: 00007ff67dd45720(0000) GS:ffff8800cfd80000(0000) knlGS:00000000f76f3870
[18959.725006] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[18959.725006] CR2: 00007ff67dd78000 CR3: 00000001ff5fe000 CR4: 00000000000006e0
[18959.725006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[18959.725006] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[18959.725006] Process umount.nfs4 (pid: 17684, threadinfo ffff8801ee94a000, task ffff88021b674560)
[18959.725006] Stack:
[18959.725006] ffff8801f9df4658 0000000000000000 ffff8801f9df4400 ffff8801a6fe8f00
[18959.725006] ffff8801ee94bf28 ffff8801a6fccc00 ffff8801ee94bde8 ffffffff81133215
[18959.725006] ffff8801f9df4400 ffffffffa0587900 ffff8801ee94be18 ffffffff81122e29
[18959.725006] Call Trace:
[18959.725006] [<ffffffff81133215>] shrink_dcache_for_umount+0x4d/0x5f
[18959.725006] [<ffffffff81122e29>] generic_shutdown_super+0x23/0xde
[18959.725006] [<ffffffff81122f61>] kill_anon_super+0x13/0x4c
[18959.725006] [<ffffffffa056334a>] nfs4_kill_super+0x44/0x77 [nfs]
[18959.725006] [<ffffffff81123185>] deactivate_locked_super+0x26/0x4b
[18959.725006] [<ffffffff81123cc1>] deactivate_super+0x37/0x3b
[18959.725006] [<ffffffff811385dc>] mntput_no_expire+0xcc/0xd1
[18959.725006] [<ffffffff81138607>] mntput+0x26/0x28
[18959.725006] [<ffffffff81138a8e>] release_mounts+0x59/0x73
[18959.725006] [<ffffffff81138ff4>] sys_umount+0x28d/0x2da
[18959.725006] [<ffffffff81009bc2>] system_call_fastpath+0x16/0x1b
[18959.725006] Code: 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 07 48 8b 90 a8 00 00 00 48 89 34 24 48 c7 c7 ce 24 7b 81 48 89 de 31 c0 e8 1a 91 33 00 <0f> 0b 4c 8b 63 18 4c 39 e3 75 11 48 8d bb 90 00 00 00 45 31 e4
[18959.725006] RIP [<ffffffff8113289f>] shrink_dcache_for_umount_subtree+0x104/0x1c9
[18959.725006] RSP <ffff8801ee94bd98>
[18959.725006] [drm] force priority to high
[18959.725006] [drm] force priority to high
[18959.747232] [drm] force priority to high
[18959.758010] ---[ end trace 625763676adf8f53 ]---

...I'm pretty sure the umount was triggered by autofs on a filesystem
that I wasn't using at the moment. Beyond that I don't think I did much
special on the filesystem. I haven't gone to look at the package in
detail, but I expect that this is a fairly stock 2.6.38.1 kernel.
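
For reference, the check that fires here is the dentry-busy test in
shrink_dcache_for_umount_subtree(). A paraphrased sketch of the
2.6.38-era code in fs/dcache.c (not verbatim, and the exact line number
varies between kernels):

/* At umount time every dentry in the subtree should have dropped to a
 * zero refcount; a nonzero count means someone leaked a reference, and
 * the kernel BUG()s rather than free a dentry that is still in use. */
if (dentry->d_count != 0) {
	printk(KERN_ERR
	       "BUG: Dentry %p{i=%lx,n=%s} still in use (%d)"
	       " [unmount of %s %s]\n",
	       dentry,
	       dentry->d_inode ? dentry->d_inode->i_ino : 0UL,
	       dentry->d_name.name,
	       dentry->d_count,
	       dentry->d_sb->s_type->name,
	       dentry->d_sb->s_id);
	BUG();
}

The "(1)" in the message above is that leftover reference count, and
"nfs4 0:2c" is the filesystem type and device id of the superblock
being torn down.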

--
Jeff Layton <[email protected]>


2011-03-30 21:05:50

by Mark Moseley

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Mon, Mar 28, 2011 at 6:21 PM, Mark Moseley <[email protected]> wrote:
> On Fri, Mar 25, 2011 at 1:11 PM, Jeff Layton <[email protected]> wrote:
>> I got this today on my Fedora 15 alpha machine:
>>
>> [... oops trace snipped; full copy in Jeff's original message at the top of this thread ...]
>>
>> ...I'm pretty sure the umount was triggered by autofs on a filesystem
>> that I wasn't using at the moment. Beyond that I don't think I did much
>> special on the filesystem. I haven't gone to look at the package in
>> detail, but I expect that this is a fairly stock 2.6.38.1 kernel.
>
>
> I've got the same error in 2.6.38.2 (64-bit, Debian Lenny). It happens
> when umount tries to unmount a NetApp GX mount with a *lot* of
> submounts. The top-level mount is a regular (in fstab) NFSv3 mount
> over UDP. All of the submounts are handled by the kernel itself (and
> incidentally, what is the correct term for this sort of submount?),
> i.e. only the top-level mount appears in fstab and the submounts
> automagically appear underneath when used. Sort of like automounting
> but not.
>
> For completeness' sake, here's my backtrace:
>
>
> [... backtrace snipped; see the full copy later in this thread ...]
>

I'm not 100% sure it's related (but I'm going to guess it is) but on
these same boxes, they're not actually able to reboot at the end of a
graceful shutdown. After yielding that bug and continuing with the
shutdown process, it gets all the way to exec'ing:

reboot -d -f -i

and then just hangs forever. I'm guessing a thread is hung still
trying to unmount things. On another box, I triggered that bug with a
umount of one top-level mount that had subtrees. When I umount'd
another top-level mount with subtrees on that same box, it's blocked
and unkillable. That second umount also logged another bug to the
kernel logs.

In both umounts described above, the entries in /proc/mounts go away
after the umount.

Jeff, are you at liberty to do a graceful shutdown of the box you saw
that bug on? If so, does it actually reboot?

2011-03-31 19:17:55

by Mark Moseley

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Wed, Mar 30, 2011 at 6:37 PM, Jeff Layton <[email protected]> wrote:
> On Wed, 30 Mar 2011 14:55:00 -0700
> Mark Moseley <[email protected]> wrote:
>
>> > I'm not 100% sure it's related (but I'm going to guess it is) but on
>> > these same boxes, they're not actually able to reboot at the end of a
>> > graceful shutdown. After yielding that bug and continuing with the
>> > shutdown process, it gets all the way to exec'ing:
>> >
>> > reboot -d -f -i
>> >
>> > and then just hangs forever. I'm guessing a thread is hung still
>> > trying to unmount things. On another box, I triggered that bug with a
>> > umount of one top-level mount that had subtrees. When I umount'd
>> > another top-level mount with subtrees on that same box, it's blocked
>> > and unkillable. That second umount also logged another bug to the
>> > kernel logs.
>> >
>> > In both umounts described above, the entries in /proc/mounts go away
>> > after the umount.
>> >
>> > Jeff, are you at liberty to do a graceful shutdown of the box you saw
>> > that bug on? If so, does it actually reboot?
>>
>>
>> A bit more info: On the same boxes, freshly booted but with all the
>> same mounts (even the subtrees) mounted, I don't get that bug, so it
>> seems to happen just when there's been significant usage within those
>> mounts. These are all read-only mounts, if it makes a difference.
>>
>> I was however able to trigger the bug on a box that had been running
>> (web serving) for about 15 minutes. Here's a snippet from slabinfo
>> right before umount'ing (let me know if more of it would help):
>>
>
> No, it hung. Unfortunately, I didn't collect any info that told me
> where it was hung though.
>
> I believe that's fairly typical though in situations like this. The
> problem is that we likely have a dentry refcount leak somewhere. The
> vfsmount refcount was low enough to allow an unmount though. Often
> these sorts of problems lurk in error handling code...
>
>> # grep nfs /proc/slabinfo
>> [... slab stats snipped; full listing elsewhere in this thread ...]
>
> slabinfo probably won't tell us much here. If we can narrow down the
> reproducer for this though then that would definitely help.
> Actually, /proc/self/mountstats might give us something to go on...

Would turning on kmemleak and letting it run for a bit (luckily 15
mins of regular use seems to be enough to trigger this bug) tell us
anything? I can do that easily enough but I don't know if it tracks
dentry entries.

Also I'll grab mountstats as well before/after trying a umount.
Anything in particular within mountstats I should pay close attention
to?

2011-03-30 21:55:01

by Mark Moseley

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

> I'm not 100% sure it's related (but I'm going to guess it is) but on
> these same boxes, they're not actually able to reboot at the end of a
> graceful shutdown. After yielding that bug and continuing with the
> shutdown process, it gets all the way to exec'ing:
>
> reboot -d -f -i
>
> and then just hangs forever. I'm guessing a thread is hung still
> trying to unmount things. On another box, I triggered that bug with a
> umount of one top-level mount that had subtrees. When I umount'd
> another top-level mount with subtrees on that same box, it's blocked
> and unkillable. That second umount also logged another bug to the
> kernel logs.
>
> In both umounts described above, the entries in /proc/mounts go away
> after the umount.
>
> Jeff, are you at liberty to do a graceful shutdown of the box you saw
> that bug on? If so, does it actually reboot?


A bit more info: On the same boxes, freshly booted but with all the
same mounts (even the subtrees) mounted, I don't get that bug, so it
seems to happen just when there's been significant usage within those
mounts. These are all read-only mounts, if it makes a difference.

I was however able to trigger the bug on a box that had been running
(web serving) for about 15 minutes. Here's a snippet from slabinfo
right before umount'ing (let me know if more of it would help):

# grep nfs /proc/slabinfo
nfsd4_delegations      0      0    360   22    2 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_stateids         0      0    120   34    1 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_files            0      0    136   30    1 : tunables    0    0    0 : slabdata      0      0      0
nfsd4_stateowners      0      0    424   38    4 : tunables    0    0    0 : slabdata      0      0      0
nfs_direct_cache       0      0    136   30    1 : tunables    0    0    0 : slabdata      0      0      0
nfs_write_data        46     46    704   23    4 : tunables    0    0    0 : slabdata      2      2      0
nfs_read_data        207    207    704   23    4 : tunables    0    0    0 : slabdata      9      9      0
nfs_inode_cache    23901  23901   1056   31    8 : tunables    0    0    0 : slabdata    771    771      0
nfs_page             256    256    128   32    1 : tunables    0    0    0 : slabdata      8      8      0

2011-03-31 01:37:12

by Jeff Layton

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Wed, 30 Mar 2011 14:55:00 -0700
Mark Moseley <[email protected]> wrote:

> > I'm not 100% sure it's related (but I'm going to guess it is) but on
> > these same boxes, they're not actually able to reboot at the end of a
> > graceful shutdown. After yielding that bug and continuing with the
> > shutdown process, it gets all the way to exec'ing:
> >
> > reboot -d -f -i
> >
> > and then just hangs forever. I'm guessing a thread is hung still
> > trying to unmount things. On another box, I triggered that bug with a
> > umount of one top-level mount that had subtrees. When I umount'd
> > another top-level mount with subtrees on that same box, it's blocked
> > and unkillable. That second umount also logged another bug to the
> > kernel logs.
> >
> > In both umounts described above, the entries in /proc/mounts go away
> > after the umount.
> >
> > Jeff, are you at liberty to do a graceful shutdown of the box you saw
> > that bug on? If so, does it actually reboot?
>
>
> A bit more info: On the same boxes, freshly booted but with all the
> same mounts (even the subtrees) mounted, I don't get that bug, so it
> seems to happen just when there's been significant usage within those
> mounts. These are all read-only mounts, if it makes a difference.
>
> I was however able to trigger the bug on a box that had been running
> (web serving) for about 15 minutes. Here's a snippet from slabinfo
> right before umount'ing (let me know if more of it would help):
>

No, it hung. Unfortunately, I didn't collect any info that told me
where it was hung though.

I believe that's fairly typical though in situations like this. The
problem is that we likely have a dentry refcount leak somewhere. The
vfsmount refcount was low enough to allow an unmount though. Often
these sorts of problems lurk in error handling code...
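
To illustrate: the classic shape of such a leak is a forgotten dput()
on an error path. A hypothetical sketch (frob_one_entry() and
do_the_real_work() are made up for illustration, not actual NFS/VFS
code; lookup_one_len() is the real VFS helper, and it returns a dentry
with an elevated refcount):

/* Hypothetical sketch of a dentry refcount leak on an error path;
 * not actual NFS code. */
static int frob_one_entry(struct dentry *dir, const char *name)
{
	struct dentry *dentry;
	int err;

	/* lookup_one_len() takes a reference on the dentry it
	 * returns; the caller must dput() it on every path. */
	dentry = lookup_one_len(name, dir, strlen(name));
	if (IS_ERR(dentry))
		return PTR_ERR(dentry);

	err = do_the_real_work(dentry);	/* made-up helper */
	if (err)
		return err;	/* BUG: no dput() on this path */

	dput(dentry);
	return 0;
}

A reference leaked like that costs nothing visible at the time; it only
surfaces at umount, when shrink_dcache_for_umount_subtree() finds the
dentry still pinned.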

> # grep nfs /proc/slabinfo
> [... slab stats snipped; full listing elsewhere in this thread ...]

slabinfo probably won't tell us much here. If we can narrow down the
reproducer for this though then that would definitely help.
Actually, /proc/self/mountstats might give us something to go on...

--
Jeff Layton <[email protected]>

2011-03-29 01:21:47

by Mark Moseley

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Fri, Mar 25, 2011 at 1:11 PM, Jeff Layton <[email protected]> wrote:
> I got this today on my Fedora 15 alpha machine:
>
> [... oops trace snipped; full copy in Jeff's original message at the top of this thread ...]
>
> ...I'm pretty sure the umount was triggered by autofs on a filesystem
> that I wasn't using at the moment. Beyond that I don't think I did much
> special on the filesystem. I haven't gone to look at the package in
> detail, but I expect that this is a fairly stock 2.6.38.1 kernel.


I've got the same error in 2.6.38.2 (64-bit, Debian Lenny). It happens
when umount tries to unmount a NetApp GX mount with a *lot* of
submounts. The top-level mount is a regular (in fstab) NFSv3 mount
over UDP. All of the submounts are handled by the kernel itself (and
incidentally, what is the correct term for this sort of submount?),
i.e. only the top-level mount appears in fstab and the submounts
automagically appear underneath when used. Sort of like automounting
but not.

For completeness' sake, here's my backtrace:


[ 1468.950280] BUG: Dentry ffff880226133e40{i=b4670040,n=web192} still in use (1) [unmount of nfs 0:19]
[ 1468.950306] ------------[ cut here ]------------
[ 1468.959538] kernel BUG at fs/dcache.c:947!
[ 1468.960023] invalid opcode: 0000 [#1] SMP
[ 1468.960023] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
[ 1468.960023] CPU 0
[ 1468.960023] Modules linked in: ipt_addrtype raid0 md_mod ioatdma dca power_meter loop joydev fan evdev psmouse dcdbas serio_raw i5000_edac pcspkr edac_core i5k_amb shpchp pci_hotplug processor button thermal_sys sg sr_mod cdrom ehci_hcd uhci_hcd sata_sil24 sd_mod crc_t10dif [last unloaded: scsi_wait_scan]
[ 1468.960023]
[ 1468.960023] Pid: 9859, comm: umount.nfs Not tainted 2.6.38.2 #1 Dell Inc. PowerEdge 1950/0DT097
[ 1468.960023] RIP: 0010:[<ffffffff81154541>] [<ffffffff81154541>] shrink_dcache_for_umount_subtree+0x251/0x260
[ 1468.960023] RSP: 0018:ffff880198d47d58 EFLAGS: 00010292
[ 1468.960023] RAX: 000000000000006e RBX: ffff880226001a1c RCX: 000000000001ffff
[ 1468.960023] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffffffff81cf4830
[ 1468.960023] RBP: ffff880198d47d88 R08: 0000000000000006 R09: 0000000000000000
[ 1468.960023] R10: 0000000000000003 R11: 00000000ffffffff R12: ffff8802260019c0
[ 1468.960023] R13: ffff880226133e40 R14: ffff880226133ee0 R15: ffff880226001a1c
[ 1468.960023] FS: 00000304a71996e0(0000) GS:ffff8800cfc00000(0000) knlGS:0000000000000000
[ 1468.960023] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1468.960023] CR2: ffffffffff600400 CR3: 0000000001646000 CR4: 00000000000006f0
[ 1468.960023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1468.960023] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1468.960023] Process umount.nfs (pid: 9859, threadinfo ffff880198d46000, task ffff880227f91730)
[ 1468.960023] Stack:
[ 1468.960023] ffff88022505ae58 ffff880226001900 ffff88022600195c ffff88022505ac00
[ 1468.960023] ffff88022505ac00 0000000000000100 ffff880198d47db8 ffffffff811545b2
[ 1468.960023] 0000000000000282 ffff88022505ac00 ffff880224715800 ffffffff816bcc20
[ 1468.960023] Call Trace:
[ 1468.960023] [<ffffffff811545b2>] shrink_dcache_for_umount+0x62/0x80
[ 1468.960023] [<ffffffff8113d35c>] generic_shutdown_super+0x2c/0x110
[ 1468.960023] [<ffffffff8113d4d6>] kill_anon_super+0x16/0x60
[ 1468.960023] [<ffffffff8123c485>] nfs_kill_super+0x25/0x40
[ 1468.960023] [<ffffffff8113dc05>] deactivate_locked_super+0x55/0x70
[ 1468.960023] [<ffffffff8113ebc9>] deactivate_super+0x79/0x80
[ 1468.960023] [<ffffffff8115bb37>] mntput_no_expire+0x137/0x1b0
[ 1468.960023] [<ffffffff8115bbcd>] mntput+0x1d/0x40
[ 1468.960023] [<ffffffff8115bc61>] release_mounts+0x71/0x90
[ 1468.960023] [<ffffffff8115c1c8>] sys_umount+0x308/0x3c0
[ 1468.960023] [<ffffffff810030fb>] system_call_fastpath+0x16/0x1b
[ 1468.960023] Code: 8b 45 30 48 85 c0 74 07 48 8b 90 a8 00 00 00 48 8d 86 58 02 00 00 48 c7 c7 20 c7 88 81 48 89 04 24 4c 89 ee 31 c0 e8 0d 0c 4c 00 <0f> 0b eb fe 0f 0b eb fe 0f 1f 80 00 00 00 00 55 48 89 e5 41 55
[ 1468.960023] RIP [<ffffffff81154541>] shrink_dcache_for_umount_subtree+0x251/0x260
[ 1468.960023] RSP <ffff880198d47d58>
[ 1469.500150] ---[ end trace 74f032acd854ca51 ]---

2011-03-31 23:21:33

by Mark Moseley

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Thu, Mar 31, 2011 at 12:17 PM, Mark Moseley <[email protected]> wrote:
> On Wed, Mar 30, 2011 at 6:37 PM, Jeff Layton <[email protected]> wrote:
>> On Wed, 30 Mar 2011 14:55:00 -0700
>> Mark Moseley <[email protected]> wrote:
>>
>>> > I'm not 100% sure it's related (but I'm going to guess it is) but on
>>> > these same boxes, they're not actually able to reboot at the end of a
>>> > graceful shutdown. After yielding that bug and continuing with the
>>> > shutdown process, it gets all the way to exec'ing:
>>> >
>>> > reboot -d -f -i
>>> >
>>> > and then just hangs forever. I'm guessing a thread is hung still
>>> > trying to unmount things. On another box, I triggered that bug with a
>>> > umount of one top-level mount that had subtrees. When I umount'd
>>> > another top-level mount with subtrees on that same box, it's blocked
>>> > and unkillable. That second umount also logged another bug to the
>>> > kernel logs.
>>> >
>>> > In both umounts described above, the entries in /proc/mounts go away
>>> > after the umount.
>>> >
>>> > Jeff, are you at liberty to do a graceful shutdown of the box you saw
>>> > that bug on? If so, does it actually reboot?
>>>
>>>
>>> A bit more info: On the same boxes, freshly booted but with all the
>>> same mounts (even the subtrees) mounted, I don't get that bug, so it
>>> seems to happen just when there's been significant usage within those
>>> mounts. These are all read-only mounts, if it makes a difference.
>>>
>>> I was however able to trigger the bug on a box that had been running
>>> (web serving) for about 15 minutes. Here's a snippet from slabinfo
>>> right before umount'ing (let me know if more of it would help):
>>>
>>
>> No, it hung. Unfortunately, I didn't collect any info that told me
>> where it was hung though.
>>
>> I believe that's fairly typical though in situations like this. The
>> problem is that we likely have a dentry refcount leak somewhere. The
>> vfsmount refcount was low enough to allow an unmount though. Often
>> these sorts of problems lurk in error handling code...
>>
>>> # grep nfs /proc/slabinfo
>>> [... slab stats snipped; full listing elsewhere in this thread ...]
>>
>> slabinfo probably won't tell us much here. If we can narrow down the
>> reproducer for this though then that would definitely help.
>> Actually, /proc/self/mountstats might give us something to go on...
>
> Would turning on kmemleak and letting it run for a bit (luckily 15
> mins of regular use seems to be enough to trigger this bug) tell us
> anything? I can do that easily enough but I don't know if it tracks
> dentry entries.
>
> Also I'll grab mountstats as well before/after trying a umount.
> Anything in particular within mountstats I should pay close attention
> to?
>

I've been trying random things too, just to see if anything shakes out:

* I tried mounting the top-level mount with the subtree mounts via TCP
(was UDP), but I still get that error.

* Mounting normally (UDP), if I iterate through the subtrees and
unmount them *first* (so that there's zero subtrees left over) and
then unmount the top-level mount, I *don't* get that error. And a
umount at shutdown proceeds without error and the final call to reboot
works just fine (instead of hanging indefinitely). So unmounting all
the subtrees first cleans things up sufficiently.

One random/irrelevant but slightly interesting thing is that when I
trigger that error, it also bumps kernel.printk from "1 7 1 4" to "15
7 1 4". I've not seen that before.

2011-04-27 23:23:08

by Mark Moseley

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Thu, Mar 31, 2011 at 4:21 PM, Mark Moseley <[email protected]> wrote:
> On Thu, Mar 31, 2011 at 12:17 PM, Mark Moseley <[email protected]> wrote:
>> On Wed, Mar 30, 2011 at 6:37 PM, Jeff Layton <[email protected]> wrote:
>>> On Wed, 30 Mar 2011 14:55:00 -0700
>>> Mark Moseley <[email protected]> wrote:
>>>
>>>> > I'm not 100% sure it's related (but I'm going to guess it is) but on
>>>> > these same boxes, they're not actually able to reboot at the end of a
>>>> > graceful shutdown. After yielding that bug and continuing with the
>>>> > shutdown process, it gets all the way to exec'ing:
>>>> >
>>>> > reboot -d -f -i
>>>> >
>>>> > and then just hangs forever. I'm guessing a thread is hung still
>>>> > trying to unmount things. On another box, I triggered that bug with a
>>>> > umount of one top-level mount that had subtrees. When I umount'd
>>>> > another top-level mount with subtrees on that same box, it's blocked
>>>> > and unkillable. That second umount also logged another bug to the
>>>> > kernel logs.
>>>> >
>>>> > In both umounts described above, the entries in /proc/mounts go away
>>>> > after the umount.
>>>> >
>>>> > Jeff, are you at liberty to do a graceful shutdown of the box you saw
>>>> > that bug on? If so, does it actually reboot?
>>>>
>>>>
>>>> A bit more info: On the same boxes, freshly booted but with all the
>>>> same mounts (even the subtrees) mounted, I don't get that bug, so it
>>>> seems to happen just when there's been significant usage within those
>>>> mounts. These are all read-only mounts, if it makes a difference.
>>>>
>>>> I was however able to trigger the bug on a box that had been running
>>>> (web serving) for about 15 minutes. Here's a snippet from slabinfo
>>>> right before umount'ing (let me know if more of it would help):
>>>>
>>>
>>> No, it hung. Unfortunately, I didn't collect any info that told me
>>> where it was hung though.
>>>
>>> I believe that's fairly typical though in situations like this. The
>>> problem is that we likely have a dentry refcount leak somewhere. The
>>> vfsmount refcount was low enough to allow an unmount though. Often
>>> these sorts of problems lurk in error handling code...
>>>
>>>> # grep nfs /proc/slabinfo
>>>> [... slab stats snipped; full listing elsewhere in this thread ...]
>>>
>>> slabinfo probably won't tell us much here. If we can narrow down the
>>> reproducer for this though then that would definitely help.
>>> Actually, /proc/self/mountstats might give us something to go on...
>>
>> Would turning on kmemleak and letting it run for a bit (luckily 15
>> mins of regular use seems to be enough to trigger this bug) tell us
>> anything? I can do that easily enough but I don't know if it tracks
>> dentry entries.
>>
>> Also I'll grab mountstats as well before/after trying a umount.
>> Anything in particular within mountstats I should pay close attention
>> to?
>>
>
> I've been trying random things too, just to see if anything shakes out:
>
> * I tried mounting the top-level mount with the subtree mounts via TCP
> (was UDP), but I still get that error.
>
> * Mounting normally (UDP), if I iterate through the subtrees and
> unmount them *first* (so that there's zero subtrees left over) and
> then unmount the top-level mount, I *don't* get that error. And a
> umount at shutdown proceeds without error and the final call to reboot
> works just fine (instead of hanging indefinitely). So unmounting all
> the subtrees first cleans things up sufficiently.
>
> One random/irrelevant but slightly interesting thing is that when I
> trigger that error, it also bumps kernel.printk from "1 7 1 4" to "15
> 7 1 4". I've not seen that before.
>

I posted this to bugzilla a while back but I figured I'd paste it here too:

-----------------------------------------------------------------------------------------------------

I've been getting bit by the exact same bug and have been bisecting for the past
couple of weeks. It's slow going, as it can sometimes take a day for the BUG()
to show up (though it can also take just 10 minutes). And I've also seen it
more than once where something was good after a day and then BUG()'d later on,
just to make things more complicated. So the upshot is that while I feel
confident enough about this latest batch of bisecting to post it here, I
wouldn't bet my life on it. I hope this isn't a case where bisecting just shows
where the bug gets exposed but not where it actually got planted :)

Incidentally, I tried the patch from the top of this thread and it didn't seem
to make a difference. I still got bit.

I've posted on the linux-fsdevel thread that Jeff Layton started about it,
http://www.spinics.net/lists/linux-nfs/msg20280.html, if you need more details
on my setup (though I'll be happy to provide anything else you need). In that
thread you'll see that I'm not using autofs explicitly; the NetApp GX cluster
NFS appears to use autofs to do the implicit submounts (I'm not 100% sure
that's the correct terminology, so hopefully you know what I mean).

Here's my bisect log, ending up at commit e61da20a50d21725ff27571a6dff9468e4fb7146:

git bisect start 'fs'
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
# bad: [c56eb8fb6dccb83d9fe62fd4dc00c834de9bc470] Linux 2.6.38-rc1
git bisect bad c56eb8fb6dccb83d9fe62fd4dc00c834de9bc470
# good: [7c955fca3e1d8132982148267d9efcafae849bb6] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
git bisect good 7c955fca3e1d8132982148267d9efcafae849bb6
# good: [c32b0d4b3f19c2f5d29568f8b7b72b61693f1277] fs/mpage.c: consolidate code
git bisect good c32b0d4b3f19c2f5d29568f8b7b72b61693f1277
# bad: [f8206b925fb0eba3a11839419be118b09105d7b1] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
git bisect bad f8206b925fb0eba3a11839419be118b09105d7b1
# good: [a8f2800b4f7b76cecb7209cb6a7d2b14904fc711] nfsd4: fix callback restarting
git bisect good a8f2800b4f7b76cecb7209cb6a7d2b14904fc711
# bad: [6651149371b842715906311b4631b8489cebf7e8] autofs4: Clean up autofs4_free_ino()
git bisect bad 6651149371b842715906311b4631b8489cebf7e8
# good: [0ad53eeefcbb2620b6a71ffdaad4add20b450b8b] afs: add afs_wq and use it instead of the system workqueue
git bisect good 0ad53eeefcbb2620b6a71ffdaad4add20b450b8b
# good: [01c64feac45cea1317263eabc4f7ee1b240f297f] CIFS: Use d_automount() rather than abusing follow_link()
git bisect good 01c64feac45cea1317263eabc4f7ee1b240f297f
# good: [b5b801779d59165c4ecf1009009109545bd1f642] autofs4: Add d_manage() dentry operation
git bisect good b5b801779d59165c4ecf1009009109545bd1f642
# bad: [e61da20a50d21725ff27571a6dff9468e4fb7146] autofs4: Clean up inode operations
git bisect bad e61da20a50d21725ff27571a6dff9468e4fb7146
# good: [8c13a676d5a56495c350f3141824a5ef6c6b4606] autofs4: Remove unused code
git bisect good 8c13a676d5a56495c350f3141824a5ef6c6b4606

2011-05-26 13:49:55

by Jeff Layton

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Wed, 25 May 2011 16:08:15 -0400
Jeff Layton <[email protected]> wrote:

> On Wed, 27 Apr 2011 16:23:07 -0700
> Mark Moseley <[email protected]> wrote:
>
> >
> > [... bugzilla note and bisect log snipped; see Mark's 2011-04-27 message above ...]
>
> I can more or less reproduce this at will now, I think even with very
> few NFS operations on an automounted nfsv4 mount. Here's an oops from a
> 2.6.39 kernel:
>
> [ 119.419789] tun0: Features changed: 0x00004800 -> 0x00004000
> [ 178.242917] FS-Cache: Netfs 'nfs' registered for caching
> [ 178.269980] SELinux: initialized (dev 0:2c, type nfs4), uses genfs_contexts
> [ 178.282296] SELinux: initialized (dev 0:2d, type nfs4), uses genfs_contexts
> [ 523.953284] BUG: Dentry ffff8801f3084180{i=2,n=} still in use (1) [unmount of nfs4 0:2c]
> [ 523.953306] ------------[ cut here ]------------
> [ 523.954013] kernel BUG at fs/dcache.c:925!
> [ 523.954013] invalid opcode: 0000 [#1] SMP
> [ 523.954013] last sysfs file: /sys/devices/virtual/bdi/0:45/uevent
> [ 523.954013] CPU 1
> [ 523.954013] Modules linked in: nfs lockd auth_rpcgss nfs_acl tun fuse ip6table_filter ip6_tables ebtable_nat ebtables sunrpc cachefiles fscache cpufreq_ondemand powernow_k8 freq_table mperf it87 adt7475 hwmon_vid xfs snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel raid1 snd_hda_codec snd_usb_audio snd_usbmidi_lib snd_hwdep snd_seq snd_rawmidi snd_seq_device snd_pcm snd_timer snd uvcvideo ppdev videodev soundcore media sp5100_tco v4l2_compat_ioctl32 edac_core parport_pc snd_page_alloc i2c_piix4 edac_mce_amd k10temp parport wmi r8169 microcode mii virtio_net kvm_amd kvm ipv6 ata_generic pata_acpi pata_atiixp radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
> [ 523.954013]
> [ 523.954013] Pid: 8387, comm: umount.nfs4 Not tainted 2.6.39-1.fc16.x86_64 #1 BIOSTAR Group TA790GX 128M/TA790GX 128M
> [ 523.954013] RIP: 0010:[<ffffffff81131788>] [<ffffffff81131788>] shrink_dcache_for_umount_subtree+0x104/0x1c9
> [ 523.954013] RSP: 0018:ffff8801f2385d98 EFLAGS: 00010296
> [ 523.954013] RAX: 0000000000000062 RBX: ffff8801f3084180 RCX: 00000000000027a7
> [ 523.954013] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> [ 523.954013] RBP: ffff8801f2385dc8 R08: 0000000000000002 R09: 0000ffff00066c0a
> [ 523.954013] R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff8801f3084180
> [ 523.954013] R13: ffff8801f31305f0 R14: ffff8801f30ec5f0 R15: 0000000000000000
> [ 523.954013] FS: 00007f08e468d720(0000) GS:ffff88022fc80000(0000) knlGS:00000000f770f870
> [ 523.954013] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 523.954013] CR2: 00007fff8fc29ff8 CR3: 00000001f7ac4000 CR4: 00000000000006e0
> [ 523.954013] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 523.954013] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 523.954013] Process umount.nfs4 (pid: 8387, threadinfo ffff8801f2384000, task ffff8801f7a61710)
> [ 523.954013] Stack:
> [ 523.954013] ffff8801fdd74a58 0000000000000000 ffff8801fdd74800 ffff8801f3084240
> [ 523.954013] ffff8801f2385f28 ffff8801f30ec540 ffff8801f2385de8 ffffffff81132063
> [ 523.954013] ffff8801fdd74800 ffffffffa0542300 ffff8801f2385e18 ffffffff81121f15
> [ 523.954013] Call Trace:
> [ 523.954013] [<ffffffff81132063>] shrink_dcache_for_umount+0x4d/0x5f
> [ 523.954013] [<ffffffff81121f15>] generic_shutdown_super+0x23/0xde
> [ 523.954013] [<ffffffff8112204d>] kill_anon_super+0x13/0x4c
> [ 523.954013] [<ffffffffa051c436>] nfs4_kill_super+0x44/0x77 [nfs]
> [ 523.954013] [<ffffffff81122271>] deactivate_locked_super+0x26/0x4b
> [ 523.954013] [<ffffffff81122a83>] deactivate_super+0x37/0x3b
> [ 523.954013] [<ffffffff811375b3>] mntput_no_expire+0xcc/0xd1
> [ 523.954013] [<ffffffff811375de>] mntput+0x26/0x28
> [ 523.954013] [<ffffffff81137bef>] release_mounts+0x59/0x73
> [ 523.954013] [<ffffffff81138155>] sys_umount+0x28d/0x2da
> [ 523.954013] [<ffffffff814828c2>] system_call_fastpath+0x16/0x1b
> [ 523.954013] Code: 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 07 48 8b 90 a8 00 00 00 48 89 34 24 48 c7 c7 f9 ed 7a 81 48 89 de 31 c0 e8 4e 17 34 00 <0f> 0b 4c 8b 63 18 4c 39 e3 75 11 48 8d bb 90 00 00 00 45 31 e4
> [ 523.954013] RIP [<ffffffff81131788>] shrink_dcache_for_umount_subtree+0x104/0x1c9
> [ 523.954013] RSP <ffff8801f2385d98>
> [ 523.954013] [drm] force priority to high
> [ 523.954013] [drm] force priority to high
> [ 524.042397] ---[ end trace ff21a3f70c461b5e ]---
> [ 534.621786] [drm] force priority to high
> [ 534.621866] [drm] force priority to high
>
> Client rpc stats:
> calls retrans authrefrsh
> 44 0 44
>
> Client nfs v4:
> null read write commit open open_conf
> 0 0% 0 0% 1 2% 0 0% 1 2% 1 2%
> open_noat open_dgrd close setattr fsinfo renew
> 0 0% 0 0% 1 2% 1 2% 4 9% 0 0%
> setclntid confirm lock lockt locku access
> 1 2% 1 2% 0 0% 0 0% 0 0% 6 13%
> getattr lookup lookup_root remove rename link
> 7 16% 5 11% 1 2% 0 0% 0 0% 0 0%
> symlink create pathconf statfs readlink readdir
> 0 0% 0 0% 3 6% 0 0% 0 0% 3 6%
> server_caps delegreturn getacl setacl fs_locations rel_lkowner
> 7 16% 0 0% 0 0% 0 0% 0 0% 0 0%
> exchange_id create_ses destroy_ses sequence get_lease_t reclaim_comp
> 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
> layoutget layoutcommit layoutreturn getdevlist getdevinfo ds_write
> 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
> ds_commit
> 0 0%
>
> That's good because I think it may help narrow down the codepaths that
> lead to this...
>
> All I did was save a file to the mount using claws-mail and wait for
> autofs to unmount it. I'm going to see if I can narrow down the
> reproducer somewhat and whether I can take autofs out of the picture.
>

(cc'ing autofs mailing list too)

Taking autofs out of the picture seems to make the problem go away, so
this seems to be related to that. Why, exactly, I'm not sure yet. It's
always been my understanding that autofs basically gets out of the way
once the fs is mounted, but I suppose it needs some way to check for
expiration so maybe that mechanism is causing issues.
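
For context: 2.6.38 is where the VFS grew explicit hooks for these
follow-on mounts. Instead of abusing follow_link(), a filesystem now
supplies d_automount() to create the submount and d_manage() to let it
block or wave through a path walk that crosses a managed dentry, and
that transit hook is what autofs uses around mount and expire. A rough,
from-memory sketch of the 2.6.38-era ops (exact signatures may differ
between kernel versions):

/* From-memory sketch of the managed-dentry hooks added to
 * struct dentry_operations around 2.6.38; details may be off. */
struct dentry_operations {
	/* ... */
	/* a walk hit a dentry flagged DCACHE_NEED_AUTOMOUNT:
	 * create and attach the submount */
	struct vfsmount *(*d_automount)(struct path *path);
	/* called on every transit through a managed dentry; autofs
	 * uses it to hold walkers while a mount or expire is in
	 * flight */
	int (*d_manage)(struct dentry *dentry, bool rcu_walk);
};

Mark's bisect landing on the autofs4 conversion to these ops would fit
that picture: a reference imbalance somewhere in that conversion would
pin a dentry and produce exactly this umount-time BUG().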

--
Jeff Layton <[email protected]>

2011-05-25 20:08:18

by Jeff Layton

Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Wed, 27 Apr 2011 16:23:07 -0700
Mark Moseley <[email protected]> wrote:

>
> [... bugzilla note and bisect log snipped; see Mark's 2011-04-27 message above ...]

I can more or less reproduce this at will now, I think even with very
few NFS operations on an automounted nfsv4 mount. Here's an oops from a
2.6.39 kernel:

[ 119.419789] tun0: Features changed: 0x00004800 -> 0x00004000
[ 178.242917] FS-Cache: Netfs 'nfs' registered for caching
[ 178.269980] SELinux: initialized (dev 0:2c, type nfs4), uses genfs_contexts
[ 178.282296] SELinux: initialized (dev 0:2d, type nfs4), uses genfs_contexts
[ 523.953284] BUG: Dentry ffff8801f3084180{i=2,n=} still in use (1) [unmount of nfs4 0:2c]
[ 523.953306] ------------[ cut here ]------------
[ 523.954013] kernel BUG at fs/dcache.c:925!
[ 523.954013] invalid opcode: 0000 [#1] SMP
[ 523.954013] last sysfs file: /sys/devices/virtual/bdi/0:45/uevent
[ 523.954013] CPU 1
[ 523.954013] Modules linked in: nfs lockd auth_rpcgss nfs_acl tun fuse ip6table_filter ip6_tables ebtable_nat ebtables sunrpc cachefiles fscache cpufreq_ondemand powernow_k8 freq_table mperf it87 adt7475 hwmon_vid xfs snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel raid1 snd_hda_codec snd_usb_audio snd_usbmidi_lib snd_hwdep snd_seq snd_rawmidi snd_seq_device snd_pcm snd_timer snd uvcvideo ppdev videodev soundcore media sp5100_tco v4l2_compat_ioctl32 edac_core parport_pc snd_page_alloc i2c_piix4 edac_mce_amd k10temp parport wmi r8169 microcode mii virtio_net kvm_amd kvm ipv6 ata_generic pata_acpi pata_atiixp radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[ 523.954013]
[ 523.954013] Pid: 8387, comm: umount.nfs4 Not tainted 2.6.39-1.fc16.x86_64 #1 BIOSTAR Group TA790GX 128M/TA790GX 128M
[ 523.954013] RIP: 0010:[<ffffffff81131788>] [<ffffffff81131788>] shrink_dcache_for_umount_subtree+0x104/0x1c9
[ 523.954013] RSP: 0018:ffff8801f2385d98 EFLAGS: 00010296
[ 523.954013] RAX: 0000000000000062 RBX: ffff8801f3084180 RCX: 00000000000027a7
[ 523.954013] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
[ 523.954013] RBP: ffff8801f2385dc8 R08: 0000000000000002 R09: 0000ffff00066c0a
[ 523.954013] R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff8801f3084180
[ 523.954013] R13: ffff8801f31305f0 R14: ffff8801f30ec5f0 R15: 0000000000000000
[ 523.954013] FS: 00007f08e468d720(0000) GS:ffff88022fc80000(0000) knlGS:00000000f770f870
[ 523.954013] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 523.954013] CR2: 00007fff8fc29ff8 CR3: 00000001f7ac4000 CR4: 00000000000006e0
[ 523.954013] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 523.954013] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 523.954013] Process umount.nfs4 (pid: 8387, threadinfo ffff8801f2384000, task ffff8801f7a61710)
[ 523.954013] Stack:
[ 523.954013] ffff8801fdd74a58 0000000000000000 ffff8801fdd74800 ffff8801f3084240
[ 523.954013] ffff8801f2385f28 ffff8801f30ec540 ffff8801f2385de8 ffffffff81132063
[ 523.954013] ffff8801fdd74800 ffffffffa0542300 ffff8801f2385e18 ffffffff81121f15
[ 523.954013] Call Trace:
[ 523.954013] [<ffffffff81132063>] shrink_dcache_for_umount+0x4d/0x5f
[ 523.954013] [<ffffffff81121f15>] generic_shutdown_super+0x23/0xde
[ 523.954013] [<ffffffff8112204d>] kill_anon_super+0x13/0x4c
[ 523.954013] [<ffffffffa051c436>] nfs4_kill_super+0x44/0x77 [nfs]
[ 523.954013] [<ffffffff81122271>] deactivate_locked_super+0x26/0x4b
[ 523.954013] [<ffffffff81122a83>] deactivate_super+0x37/0x3b
[ 523.954013] [<ffffffff811375b3>] mntput_no_expire+0xcc/0xd1
[ 523.954013] [<ffffffff811375de>] mntput+0x26/0x28
[ 523.954013] [<ffffffff81137bef>] release_mounts+0x59/0x73
[ 523.954013] [<ffffffff81138155>] sys_umount+0x28d/0x2da
[ 523.954013] [<ffffffff814828c2>] system_call_fastpath+0x16/0x1b
[ 523.954013] Code: 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 07 48 8b 90 a8 00 00 00 48 89 34 24 48 c7 c7 f9 ed 7a 81 48 89 de 31 c0 e8 4e 17 34 00 <0f> 0b 4c 8b 63 18 4c 39 e3 75 11 48 8d bb 90 00 00 00 45 31 e4
[ 523.954013] RIP [<ffffffff81131788>] shrink_dcache_for_umount_subtree+0x104/0x1c9
[ 523.954013] RSP <ffff8801f2385d98>
[ 523.954013] [drm] force priority to high
[ 523.954013] [drm] force priority to high
[ 524.042397] ---[ end trace ff21a3f70c461b5e ]---
[ 534.621786] [drm] force priority to high
[ 534.621866] [drm] force priority to high

Client rpc stats:
calls      retrans    authrefrsh
44         0          44

Client nfs v4:
null         read         write        commit       open         open_conf
0         0% 0         0% 1         2% 0         0% 1         2% 1         2%
open_noat    open_dgrd    close        setattr      fsinfo       renew
0         0% 0         0% 1         2% 1         2% 4         9% 0         0%
setclntid    confirm      lock         lockt        locku        access
1         2% 1         2% 0         0% 0         0% 0         0% 6        13%
getattr      lookup       lookup_root  remove       rename       link
7        16% 5        11% 1         2% 0         0% 0         0% 0         0%
symlink      create       pathconf     statfs       readlink     readdir
0         0% 0         0% 3         6% 0         0% 0         0% 3         6%
server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner
7        16% 0         0% 0         0% 0         0% 0         0% 0         0%
exchange_id  create_ses   destroy_ses  sequence     get_lease_t  reclaim_comp
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
layoutget    layoutcommit layoutreturn getdevlist   getdevinfo   ds_write
0         0% 0         0% 0         0% 0         0% 0         0% 0         0%
ds_commit
0         0%

That's good because I think it may help narrow down the codepaths that
lead to this...

All I did was save a file to the mount using claws-mail and wait for
autofs to unmount it. I'm going to see if I can narrow down the
reproducer somewhat and whether I can take autofs out of the picture.
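
For anyone who wants to follow along in the source, the check that fires is
the refcount sanity check in shrink_dcache_for_umount_subtree(). Roughly,
paraphrasing fs/dcache.c of this vintage (exact lines may differ by release):

        /* every dentry in the subtree must be unreferenced at umount */
        if (dentry->d_count != 0) {
                printk(KERN_ERR
                       "BUG: Dentry %p{i=%lx,n=%s}"
                       " still in use (%d)"
                       " [unmount of %s %s]\n",
                       dentry,
                       dentry->d_inode ? dentry->d_inode->i_ino : 0UL,
                       dentry->d_name.name, dentry->d_count,
                       dentry->d_sb->s_type->name, dentry->d_sb->s_id);
                BUG();
        }

In other words, something still holds a reference to a dentry on this
superblock while it's being torn down, which is why this smells like a
dentry refcount leak.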

--
Jeff Layton <[email protected]>

2011-05-27 19:51:13

by Mark Moseley

[permalink] [raw]
Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Thu, May 26, 2011 at 8:02 AM, Ian Kent <[email protected]> wrote:
> On Thu, 2011-05-26 at 09:49 -0400, Jeff Layton wrote:
>> On Wed, 25 May 2011 16:08:15 -0400
>> Jeff Layton <[email protected]> wrote:
>>
>> > On Wed, 27 Apr 2011 16:23:07 -0700
>> > Mark Moseley <[email protected]> wrote:
>> >
>> > >
>> > > [...]
>> >
>> > I can more or less reproduce this at will now, I think even with very
>> > few NFS operations on an automounted nfsv4 mount. Here's an oops from a
>> > 2.6.39 kernel:
>> >
>> > [...]
>> >
>> > That's good because I think it may help narrow down the codepaths that
>> > lead to this...
>> >
>> > All I did was save a file to the mount using claws-mail and wait for
>> > autofs to unmount it. I'm going to see if I can narrow down the
>> > reproducer somewhat and whether I can take autofs out of the picture.
>> >
>>
>> (cc'ing autofs mailing list too)
>>
>> Taking autofs out of the picture seems to make the problem go away, so
>> this does appear to be autofs-related. Why, exactly, I'm not sure yet.
>> It's always been my understanding that autofs basically gets out of the
>> way once the fs is mounted, but I suppose it needs some way to check for
>> expiration, so maybe that mechanism is causing issues.
>
> Mmmm ... with 2.6.39 ... mmm ...
>
> With the vfs-automount code in place autofs doesn't quite get out of the
> way anymore, but only insofar as it gets a look-in during the follow while
> path walking.
>
> Your thought about the autofs expire may be a good lead. There could be a
> race between the expire check and the ref-count-less rcu-walk, in that a
> umount (possibly a kernel dentry expire) could occur while there is an
> elevated ref-count (taken by the autofs expire). I've thought about this
> possibility the couple of times I've seen reports like this but don't
> have anything concrete.
>
> But I thought that Mark sees the problem when autofs is not involved at
> all, which probably means autofs is the reproducer rather than the cause,
> or we have two bugs. Mark, can you confirm that you see this without
> autofs in use, please?
>
> Ian

Yup, we're not loading the autofs4 module:

# zgrep AUTOFS /proc/config.gz
CONFIG_AUTOFS4_FS=m
#

# lsmod | grep auto
#

I'm pretty sure that the Netapp GX cluster uses autofs routines to do its
implicit submounts though. Or at the very least, it's doing very autofs-y
things, i.e. the kernel takes care of mounting submounts on its own.
Depending on the job of the server, those submounts can number from 30 up
to 300, so there's always a decent number of submounts.

I'm more than happy to grab any debugging info that would help, and/or try
out any patches/configs/etc.

2011-05-26 15:02:38

by Ian Kent

[permalink] [raw]
Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Thu, 2011-05-26 at 09:49 -0400, Jeff Layton wrote:
> On Wed, 25 May 2011 16:08:15 -0400
> Jeff Layton <[email protected]> wrote:
>
> > On Wed, 27 Apr 2011 16:23:07 -0700
> > Mark Moseley <[email protected]> wrote:
> >
> > >
> > > [...]
> >
> > I can more or less reproduce this at will now, I think even with very
> > few NFS operations on an automounted nfsv4 mount. Here's an oops from a
> > 2.6.39 kernel:
> >
> > [...]
> >
> > That's good because I think it may help narrow down the codepaths that
> > lead to this...
> >
> > All I did was save a file to the mount using claws-mail and wait for
> > autofs to unmount it. I'm going to see if I can narrow down the
> > reproducer somewhat and whether I can take autofs out of the picture.
> >
>
> (cc'ing autofs mailing list too)
>
> Taking autofs out of the picture seems to make the problem go away, so
> this does appear to be autofs-related. Why, exactly, I'm not sure yet.
> It's always been my understanding that autofs basically gets out of the
> way once the fs is mounted, but I suppose it needs some way to check for
> expiration, so maybe that mechanism is causing issues.

Mmmm ... with 2.6.39 ... mmm ...

With the vfs-automount code in place autofs doesn't quite get out of the
way anymore, but only insofar as it gets a look-in during the follow while
path walking.
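
To illustrate, here's a simplified sketch of the managed-dentry handling
the VFS now does for each path component (not the literal fs/namei.c code,
and the hook signatures are approximate):

        unsigned managed = path->dentry->d_flags;

        if (managed & DCACHE_MANAGE_TRANSIT) {
                /* autofs4's d_manage(): can hold up the walker, e.g.
                 * while an expire of this dentry is in flight */
                ret = path->dentry->d_op->d_manage(path->dentry, false);
        }

        if (managed & DCACHE_NEED_AUTOMOUNT) {
                /* the fs (autofs4, nfs, cifs) creates the submount
                 * itself via its d_automount() op */
                mnt = path->dentry->d_op->d_automount(path);
        }

So the filesystem does get consulted on every walk through a managed
dentry, not just at mount time.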

Your thought about the autofs expire may be a good lead. There could be a
race between the expire check and the ref-count-less rcu-walk, in that a
umount (possibly a kernel dentry expire) could occur while there is an
elevated ref-count (taken by the autofs expire). I've thought about this
possibility the couple of times I've seen reports like this but don't
have anything concrete.
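
Purely speculative, but the interleaving I have in mind is something like
this (hypothetical, for illustration only):

        autofs expire                      umount of the submount
        -------------                      ----------------------
        dget(dentry);  /* ref taken */
        check whether the tree is busy
                                           deactivate_super()
                                             shrink_dcache_for_umount()
                                               d_count != 0  ->  BUG()
        dput(dentry);  /* too late */

If a window like that exists it would fit these reports, but I haven't
been able to catch it in the act.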

But I thought that Mark sees the problem when autofs is not involved at
all, which probably means autofs is the reproducer rather than the cause,
or we have two bugs. Mark, can you confirm that you see this without
autofs in use, please?

Ian



2011-05-26 14:48:08

by Ian Kent

[permalink] [raw]
Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Thu, 2011-03-31 at 12:17 -0700, Mark Moseley wrote:
> On Wed, Mar 30, 2011 at 6:37 PM, Jeff Layton <[email protected]> wrote:
> > On Wed, 30 Mar 2011 14:55:00 -0700
> > Mark Moseley <[email protected]> wrote:
> >
> >> > I'm not 100% sure it's related (but I'm going to guess it is) but on
> >> > these same boxes, they're not actually able to reboot at the end of a
> >> > graceful shutdown. After yielding that bug and continuing with the
> >> > shutdown process, it gets all the way to exec'ing:
> >> >
> >> > reboot -d -f -i
> >> >
> >> > and then just hangs forever. I'm guessing a thread is hung still
> >> > trying to unmount things. On another box, I triggered that bug with a
> >> > umount of one top-level mount that had subtrees. When I umount'd
> >> > another top-level mount with subtrees on that same box, it's blocked
> >> > and unkillable. That second umount also logged another bug to the
> >> > kernel logs.
> >> >
> >> > In both umounts described above, the entries in /proc/mounts go away
> >> > after the umount.
> >> >
> >> > Jeff, are you at liberty to do a graceful shutdown of the box you saw
> >> > that bug on? If so, does it actually reboot?
> >>
> >>
> >> A bit more info: On the same boxes, freshly booted but with all the
> >> same mounts (even the subtrees) mounted, I don't get that bug, so it
> >> seems to happen just when there's been significant usage within those
> >> mounts. These are all read-only mounts, if it makes a difference.
> >>
> >> I was however able to trigger the bug on a box that had been running
> >> (web serving) for about 15 minutes. Here's a snippet from slabinfo
> >> right before umount'ing (let me know if more of it would help):
> >>
> >
> > No, it hung. Unfortunately, I didn't collect any info that told me
> > where it was hung though.
> >
> > I believe that's fairly typical though in situations like this. The
> > problem is that we likely have a dentry refcount leak somewhere. The
> > vfsmount refcount was low enough to allow an unmount though. Often
> > these sorts of problems lurk in error handling code...
> >
> >> # grep nfs /proc/slabinfo
> >> nfsd4_delegations      0      0    360   22    2 : tunables 0 0 0 : slabdata      0      0      0
> >> nfsd4_stateids         0      0    120   34    1 : tunables 0 0 0 : slabdata      0      0      0
> >> nfsd4_files            0      0    136   30    1 : tunables 0 0 0 : slabdata      0      0      0
> >> nfsd4_stateowners      0      0    424   38    4 : tunables 0 0 0 : slabdata      0      0      0
> >> nfs_direct_cache       0      0    136   30    1 : tunables 0 0 0 : slabdata      0      0      0
> >> nfs_write_data        46     46    704   23    4 : tunables 0 0 0 : slabdata      2      2      0
> >> nfs_read_data        207    207    704   23    4 : tunables 0 0 0 : slabdata      9      9      0
> >> nfs_inode_cache    23901  23901   1056   31    8 : tunables 0 0 0 : slabdata    771    771      0
> >> nfs_page             256    256    128   32    1 : tunables 0 0 0 : slabdata      8      8      0
> >
> > slabinfo probably won't tell us much here. If we can narrow down the
> > reproducer for this though then that would definitely help.
> > Actually, /proc/self/mountstats might give us something to go on...
>
> Would turning on kmemleak and letting it run for a bit (luckily 15
> mins of regular use seems to be enough to trigger this bug) tell us
> anything? I can do that easily enough but I don't know if it tracks
> dentry entries.
>
> Also I'll grab mountstats as well before/after trying a umount.
> Anything in particular within mountstats I should pay close attention
> to?

Unfortunately, this type of problem defies attempts to work out what is
causing it.

I had a similar problem with a dentry ref count leak in autofs when the
rcu-walk and vfs-automount patches were merged concurrently in 2.6.38.
The only reason I eventually found it was thanks to the eagle eye of Al
Viro. It was actually an obvious problem, but in a case like this (where
the damage actually occurs at some point before the BUG() fires) it's
extremely difficult to work out where to look, and that's the kicker! All
that can be done is to nail down a reproducer (which Jeff is working on, I
believe) and hope we can work out a way to target the debugging from there.
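
The sort of leak Jeff mentions above usually has a depressingly simple
shape once you finally spot it. Purely illustrative (the helper is
hypothetical, this is not actual NFS or autofs code):

        struct dentry *d = dget(dentry);        /* reference taken */
        int err;

        err = do_something(d);                  /* hypothetical operation */
        if (err)
                return err;                     /* bug: no dput(), ref leaked */

        dput(d);                /* only the success path drops the ref */
        return 0;

One missed dput() on a rarely travelled error path is enough to leave
d_count elevated at umount time and trigger the BUG() above.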

You did say you have clear evidence that autofs in 2.6.38 is not part of
the problem, yes? If you are using autofs then we need to add the 2.6.39
autofs patches to ensure we don't get more confused between the two
problems.

Ian


2011-05-26 20:20:19

by Jeff Layton

[permalink] [raw]
Subject: Re: BUG() in shrink_dcache_for_umount_subtree on nfs4 mount

On Thu, 26 May 2011 23:02:32 +0800
Ian Kent <[email protected]> wrote:

> On Thu, 2011-05-26 at 09:49 -0400, Jeff Layton wrote:
> > On Wed, 25 May 2011 16:08:15 -0400
> > Jeff Layton <[email protected]> wrote:
> >
> > > On Wed, 27 Apr 2011 16:23:07 -0700
> > > Mark Moseley <[email protected]> wrote:
> > >
> > > >
> > > > [...]
> > >
> > > I can more or less reproduce this at will now, I think even with very
> > > few NFS operations on an automounted nfsv4 mount. Here's an oops from a
> > > 2.6.39 kernel:
> > >
> > > [...]
> > >
> > > That's good because I think it may help narrow down the codepaths that
> > > lead to this...
> > >
> > > All I did was save a file to the mount using claws-mail and wait for
> > > autofs to unmount it. I'm going to see if I can narrow down the
> > > reproducer somewhat and whether I can take autofs out of the picture.
> > >
> >
> > (cc'ing autofs mailing list too)
> >
> > Taking autofs out of the picture seems to make the problem go away, so
> > this does appear to be autofs-related. Why, exactly, I'm not sure yet.
> > It's always been my understanding that autofs basically gets out of the
> > way once the fs is mounted, but I suppose it needs some way to check for
> > expiration, so maybe that mechanism is causing issues.
>
> Mmmm ... with 2.6.39 ... mmm ...
>
> With the vfs-automount code in place autofs doesn't quite get out of the
> way anymore, but only insofar as it gets a look-in during the follow while
> path walking.
>
> Your thought about the autofs expire may be a good lead. There could be a
> race between the expire check and the ref-count-less rcu-walk, in that a
> umount (possibly a kernel dentry expire) could occur while there is an
> elevated ref-count (taken by the autofs expire). I've thought about this
> possibility the couple of times I've seen reports like this but don't
> have anything concrete.
>
> But I thought that Mark sees the problem when autofs is not involved at
> all, which probably means autofs is the reproducer rather than the cause,
> or we have two bugs. Mark, can you confirm that you see this without
> autofs in use, please?
>

Thanks Ian. I've gone ahead and opened a bug against Fedora for this so
that we have a place to track progress. I've written up some info on my
reproducer there, along with a bit of what I've discovered so far.

https://bugzilla.redhat.com/show_bug.cgi?id=708039

Feel free to go on the cc list if you're interested...

Cheers,
--
Jeff Layton <[email protected]>