2007-11-09 18:23:54

by Erez Zadok

Subject: OOPSes + WARNING: at fs/file_table.c:262 __fput() in -mm 2007-11-06-02-32++

Setup: FC6 system with MM snapshot broken-out-2007-11-06-02-32 and these two
patches added:

r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop.patch
r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop-checkpatch-fixes.patch

All I do is boot the system with said kernel, wait a few seconds, then shut
it down with "init 0". As services shut down, I get this:

Stopping RPC idmapd:

kernel: __fput() of writeable file with no mnt_want_write()
kernel: WARNING: at fs/file_table.c:262 __fput()
kernel: [<c010283e>] show_trace_log_lvl+0x12/0x25
kernel: [<c0103042>] show_trace+0xd/0x10
kernel: [<c010313e>] dump_stack+0x15/0x17
kernel: [<c01491a4>] __fput+0x153/0x203
kernel: [<c014952d>] fput+0x2d/0x32
kernel: [<c0146c8a>] filp_close+0x50/0x5a
kernel: [<c01104fc>] put_files_struct+0x7c/0xbe
kernel: [<c0110575>] __exit_files+0x37/0x3c
kernel: [<c01115de>] do_exit+0x1ed/0x653
kernel: [<c0111ab2>] sys_exit_group+0x0/0x11
kernel: [<c01183fc>] get_signal_to_deliver+0x3c4/0x3e8
kernel: [<c0101873>] do_notify_resume+0x8c/0x6c4
kernel: [<c0102339>] work_notifysig+0x13/0x1a
kernel: =======================

Stopping NFS statd: BUG: unable to handle kernel paging request at virtual
address 6b6b6b6b.

EIP is at kobject_uevent_env+0x3c/0x399 which was called from
kobject_uevent, called from remove_user_sysfs_dir, run_workqueue,
worker_thread, kthread, kernel_thread_helper.

After that there are several other oopses (e.g., during unmount of pipefs),
but I don't know how much to believe them given the first one or two. Sorry
I don't have a cut-n-paste of the precise stack trace, but if anyone's
interested, I have a short video file of the shutdown sequence with all of
the oopses.

Erez.
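
A note on the 6b6b6b6b fault address above: 0x6b is the byte that slab
debugging writes over freed objects (POISON_FREE in include/linux/poison.h),
so a fault at exactly that address usually points to a pointer loaded from
memory that had already been freed. The sketch below is a user-space
illustration only. The enum and helper names are hypothetical, not kernel
APIs, and a match is a hint rather than proof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Slab poison bytes, values as defined in include/linux/poison.h. */
#define POISON_INUSE 0x5a  /* allocated but not yet initialised */
#define POISON_FREE  0x6b  /* object has been freed */

/* Hypothetical classification of a faulting address (illustration only). */
enum fault_kind { FAULT_UNKNOWN, FAULT_FREED_OBJECT, FAULT_UNINIT_OBJECT };

/* Return 1 if every byte of a 32-bit value equals `byte`. */
static int is_repeated_byte(uint32_t value, uint8_t byte)
{
    return value == (uint32_t)byte * 0x01010101u;
}

/* Guess what a 32-bit faulting address says about the bad pointer. */
static enum fault_kind classify_fault_address(uint32_t addr)
{
    if (is_repeated_byte(addr, POISON_FREE))
        return FAULT_FREED_OBJECT;
    if (is_repeated_byte(addr, POISON_INUSE))
        return FAULT_UNINIT_OBJECT;
    return FAULT_UNKNOWN;
}

/* Write a one-line, human-readable guess into buf. */
static void describe_fault_address(uint32_t addr, char *buf, size_t len)
{
    const char *what = "no poison pattern recognised";

    assert(buf != NULL && len > 0);
    switch (classify_fault_address(addr)) {
    case FAULT_FREED_OBJECT:
        what = "freed-object poison, likely use-after-free";
        break;
    case FAULT_UNINIT_OBJECT:
        what = "uninitialised-object poison, likely use before init";
        break;
    case FAULT_UNKNOWN:
        break;
    }
    snprintf(buf, len, "0x%08x: %s", (unsigned)addr, what);
}
```

Under these assumptions, classify_fault_address(0x6b6b6b6b) reports a freed
object, while an address such as 0x42da9ff4 matches no poison pattern.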


2007-11-09 18:44:45

by Dave Hansen

Subject: Re: OOPSes + WARNING: at fs/file_table.c:262 __fput() in -mm 2007-11-06-02-32++

On Fri, 2007-11-09 at 13:23 -0500, Erez Zadok wrote:
>
> Setup: FC6 system with MM snapshot broken-out-2007-11-06-02-32 and these two
> patches added:
>
> r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop.patch
> r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop-checkpatch-fixes.patch
>
> All I do is boot the system with said kernel, wait a few seconds, then shut
> it down with "init 0". As services shut down, I get this:
>
> Stopping RPC idmapd:
>
> kernel: __fput() of writeable file with no mnt_want_write()
> kernel: WARNING: at fs/file_table.c:262 __fput()
> kernel: [<c010283e>] show_trace_log_lvl+0x12/0x25
> kernel: [<c0103042>] show_trace+0xd/0x10
> kernel: [<c010313e>] dump_stack+0x15/0x17
> kernel: [<c01491a4>] __fput+0x153/0x203
> kernel: [<c014952d>] fput+0x2d/0x32
> kernel: [<c0146c8a>] filp_close+0x50/0x5a
> kernel: [<c01104fc>] put_files_struct+0x7c/0xbe
> kernel: [<c0110575>] __exit_files+0x37/0x3c
> kernel: [<c01115de>] do_exit+0x1ed/0x653
> kernel: [<c0111ab2>] sys_exit_group+0x0/0x11
> kernel: [<c01183fc>] get_signal_to_deliver+0x3c4/0x3e8
> kernel: [<c0101873>] do_notify_resume+0x8c/0x6c4
> kernel: [<c0102339>] work_notifysig+0x13/0x1a

This looks like RPC idmapd had a file open for write when it tried to
exit. However, that file never had a mnt_want_write() taken for it.
This isn't a new bug, but the new debugging code found it much faster.

I'm looking for a fix now.

-- Dave
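
The invariant Dave describes (every file open for write should hold a write
reference on its mount, which is dropped when the file is finally closed) can
be sketched as a small user-space model. This is only an illustration under
stated assumptions: the struct and function names below are hypothetical, and
the real r/o bind mount patches track writers inside the kernel's own mount
and file structures, not like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a mount: counts outstanding writers. */
struct mount_model {
    int writers;
};

/* Hypothetical stand-in for an open file. */
struct file_model {
    struct mount_model *mnt;
    bool writable;    /* opened for write */
    bool took_write;  /* a write reference was taken on the mount */
};

static void model_want_write(struct mount_model *m)
{
    m->writers++;
}

static void model_drop_write(struct mount_model *m)
{
    assert(m->writers > 0);  /* dropping more than was taken is a bug */
    m->writers--;
}

/* Correct open path: a writable file takes a write reference. */
static struct file_model model_open(struct mount_model *m, bool writable)
{
    struct file_model f = { m, writable, false };

    if (writable) {
        model_want_write(m);
        f.took_write = true;
    }
    return f;
}

/* Buggy open path: the file is writable but the mount was never told. */
static struct file_model model_open_buggy(struct mount_model *m)
{
    struct file_model f = { m, true, false };

    return f;
}

/*
 * Final close. Returns true if the accounting was balanced, false for
 * the case the WARNING in this thread reports.
 */
static bool model_fput(struct file_model *f)
{
    if (!f->writable)
        return true;
    if (!f->took_write) {
        fprintf(stderr,
                "model: close of writable file with no write reference\n");
        return false;
    }
    model_drop_write(f->mnt);
    f->took_write = false;
    return true;
}
```

In this model, a file from model_open() closes cleanly and returns the writer
count to zero, while a file from model_open_buggy() makes model_fput() return
false. That second case corresponds to what the debugging check catches at
exit time, long after the unbalanced open happened.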

2007-11-09 20:10:42

by Erez Zadok

Subject: BUG: can't reboot -nf -mm 2007-11-06-02-32++

In message <[email protected]>, Erez Zadok writes:
> Setup: FC6 system with MM snapshot broken-out-2007-11-06-02-32 and these two
> patches added:
>
> r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop.patch
> r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop-checkpatch-fixes.patch

Same setup as above. Here's a simple bug. Just run:

# /sbin/reboot -n -f
Segmentation fault

# dmesg
[...]
BUG: unable to handle kernel paging request at virtual address 42da9ff4
printing eip: c1c81e24 *pde = 030fb067 *pte = 3dfdc065
Oops: 0003 [#1]
last sysfs file: /sys/kernel/uevent_seqnum
Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc

Pid: 1954, comm: reboot Not tainted (2.6.24-rc1-mm1-mm #11)
EIP: 0060:[<c1c81e24>] EFLAGS: 00010286 CPU: 0
EIP is at 0xc1c81e24
EAX: c1c152ac EBX: c0446d40 ECX: c1c81e24 EDX: c1c152ac
ESI: 01234567 EDI: 42da9ff4 EBP: c310be94 ESP: c310be8c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process reboot (pid: 1954, ti=c310a000 task=c309a000 task.ti=c310a000)
Stack: c02fad37 00000000 c310bea0 c0219d3f 28121969 c310bfb0 c0219e68 c17fe240
c310bee4 c0230cd7 c310bf14 c30ceb58 c20141fc c3043de8 c30fb4d4 c3043de8
c17fe240 00000000 c17fe240 c310bee4 c022f023 c17fe240 c310bf34 c0239003
Call Trace:
[<c020283e>] show_trace_log_lvl+0x12/0x25
[<c02028db>] show_stack_log_lvl+0x8a/0x9c
[<c0202978>] show_registers+0x8b/0x153
[<c0202b24>] die+0xe4/0x1a2
[<c0209fc2>] do_page_fault+0x3db/0x4b1
[<c038a922>] error_code+0x6a/0x70
[<c0219d3f>] kernel_restart+0x26/0x55
[<c0219e68>] sys_reboot+0xee/0x101
[<c0202212>] sysenter_past_esp+0x5f/0x91
=======================
INFO: lockdep is turned off.
Code: 00 00 00 00 00 00 00 00 1c c8 c1 6f 76 20 20 39 20 31 34 3a 35 33 3a 31 39 20 6b 65 72 6e 65 6c 3a 20 54 43 50 20 65 73 74 61 62 <6c> 69 73 68 65 64 20 68 61 73 68 20 74 61 62 6c 65 20 65 6e 74
EIP: [<c1c81e24>] 0xc1c81e24 SS:ESP 0068:c310be8c
WARNING: at lib/kref.c:33 kref_get()
[<c020283e>] show_trace_log_lvl+0x12/0x25
[<c0203038>] show_trace+0xd/0x10
[<c0203134>] dump_stack+0x15/0x17
[<c02c7f59>] kref_get+0x26/0x30
[<c02c72ef>] kobject_get+0x12/0x17
[<c02c7744>] kobject_add+0x97/0x162
[<c02164e5>] uids_user_create+0x41/0x5d
[<c0216850>] alloc_uid+0xc9/0x171
[<c0219b58>] set_user+0x1f/0x93
[<c021b36a>] sys_setreuid+0xa1/0x139
[<c0202212>] sysenter_past_esp+0x5f/0x91
=======================
BUG: atomic counter underflow at:
[<c020283e>] show_trace_log_lvl+0x12/0x25
[<c0203038>] show_trace+0xd/0x10
[<c0203134>] dump_stack+0x15/0x17
[<c02c7fb3>] kref_put+0x50/0x71
[<c02c7254>] kobject_put+0x14/0x16
[<c02c728d>] kobject_cleanup+0x37/0x3c
[<c02c729d>] kobject_release+0xb/0xd
[<c02c7fc3>] kref_put+0x60/0x71
[<c02c7254>] kobject_put+0x14/0x16
[<c0216751>] remove_user_sysfs_dir+0x77/0xad
[<c021bb52>] run_workqueue+0xba/0x182
[<c021c379>] worker_thread+0x79/0x84
[<c021ea74>] kthread+0x39/0x61
[<c0202537>] kernel_thread_helper+0x7/0x10
=======================



Erez.
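
A side note on the "Code:" line in the oops above: most of its bytes are
printable ASCII. Decoded, the printable run reads
"ov  9 14:53:19 kernel: TCP established hash table ent", which resembles a
fragment of a syslog line rather than machine instructions, so EIP was
probably pointing into data. The helper below is a minimal sketch for
decoding such a dump in user space; its name is hypothetical and it is not
part of any kernel tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Decode a space-separated oops "Code:" hex dump into text.
 * Printable ASCII bytes are copied as-is and everything else becomes '.'.
 * The <xx> markers around the byte at EIP are skipped.
 * Returns the number of bytes decoded; `out` is always NUL-terminated.
 */
static size_t decode_code_bytes(const char *hex, char *out, size_t out_len)
{
    size_t n = 0;

    assert(hex != NULL && out != NULL && out_len > 0);
    while (*hex != '\0' && n + 1 < out_len) {
        char *end;
        unsigned long byte;

        if (!isxdigit((unsigned char)*hex)) {
            hex++;  /* skip spaces and the '<' '>' markers */
            continue;
        }
        byte = strtoul(hex, &end, 16);
        if (end == hex)
            break;  /* defensive: no progress, stop */
        out[n++] = (byte < 0x80 && isprint((int)byte)) ? (char)byte : '.';
        hex = end;
    }
    out[n] = '\0';
    return n;
}
```

Passing the hex bytes from the "Code:" line (without the "Code:" prefix) and a
buffer of at least 65 bytes yields the text quoted above, preceded by dots
for the leading non-printable bytes.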