2010-04-27 18:44:29

by Xianghua Xiao

Subject: 2.6.33.3-rt16 Oops caused by umount

2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:

# reboot
# Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT 83xx Sys
Modules linked in:
NIP: c00efc68 LR: c00efc38 CTR: 00000000
REGS: ce6e3dc0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
MSR: 00009032 <EE,ME,IR,DR> CR: 24000448 XER: 00000000
DAR: 00000038, DSISR: 20000000
TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
Call Trace:
[ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
[ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
[ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
[ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
[ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
---[ end trace 17c711f9d369c3a3 ]---
------------[ cut here ]------------
Kernel BUG at c045eeac [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT 83xx Sys
Modules linked in:
NIP: c045eeac LR: c045ee84 CTR: 00000000
REGS: ce6e3a80 TRAP: 0700 Tainted: G D (2.6.33.3-rt16)
MSR: 00021032 <ME,CE,IR,DR> CR: 44004428 XER: 00000000
TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
GPR00: 00000001 ce6e3b30 cd89ccc0 c068f6b4 c045fc68 00000000 ce6e3b84 ce6e3b64
GPR08: ce6e3b5c c0690000 cd89ccc0 ce6e3b30 24004422 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 c0657824 ce6e3eb8 ce6e3b3c
GPR24: cf028ea0 cec84d1c c065781c cec86a60 00009032 c065781c c065781c ce6e3b30
NIP [c045eeac] rt_spin_lock_slowlock+0xa8/0x394
LR [c045ee84] rt_spin_lock_slowlock+0x80/0x394
Call Trace:
[ce6e3b30] [c045ee84] rt_spin_lock_slowlock+0x80/0x394 (unreliable)
[ce6e3bc0] [c045fc68] rt_spin_lock+0x58/0x90
[ce6e3bd0] [c00efbbc] file_sb_list_del+0x48/0x88
[ce6e3bf0] [c00f03ac] __fput+0x168/0x274
[ce6e3c20] [c00f04f8] fput+0x40/0x58
[ce6e3c30] [c00d3f74] remove_vma+0x78/0xd8
[ce6e3c50] [c00d4150] exit_mmap+0x17c/0x1e4
[ce6e3cc0] [c00318c0] mmput+0x6c/0x144
[ce6e3ce0] [c003712c] exit_mm+0x15c/0x190
[ce6e3d10] [c00391e8] do_exit+0xf0/0x670
[ce6e3d60] [c0014c9c] die+0x1cc/0x1d4
[ce6e3d90] [c001cdb8] bad_page_fault+0x98/0xf0
[ce6e3db0] [c0017d8c] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x6c/0xd8
LR = fs_may_remount_ro+0x3c/0xd8
[ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
[ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
[ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
[ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
38600001 4bbc95d1 801a0004 3aba0008 2f800000 419e0264 801a0018 7c4a1378
5400003a 7c400278 7c000034 5400d97e <0f000000> 83a20000 39200002 2f9d0002
---[ end trace 17c711f9d369c3a4 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: umount/0x00000001/1613, CPU#0
Modules linked in:
Call Trace:
[ce6e3800] [c000b6a0] show_stack+0xe8/0x244 (unreliable)
[ce6e3850] [c04608c0] dump_stack+0x2c/0x44
[ce6e3860] [c00282b0] __schedule_bug+0x9c/0xc4
[ce6e3880] [c045cfe8] __schedule+0x488/0x5a4
[ce6e38b0] [c045d2e0] schedule+0x40/0xa8
[ce6e38c0] [c003973c] do_exit+0x644/0x670
[ce6e3910] [c0014c9c] die+0x1cc/0x1d4
[ce6e3940] [c0014fd0] _exception+0x130/0x170
[ce6e3a30] [c00151d0] program_check_exception+0xd0/0x634
[ce6e3a70] [c0017f38] ret_from_except_full+0x0/0x4c
--- Exception: 700 at rt_spin_lock_slowlock+0xa8/0x394
LR = rt_spin_lock_slowlock+0x80/0x394
[ce6e3bc0] [c045fc68] rt_spin_lock+0x58/0x90
[ce6e3bd0] [c00efbbc] file_sb_list_del+0x48/0x88
[ce6e3bf0] [c00f03ac] __fput+0x168/0x274
[ce6e3c20] [c00f04f8] fput+0x40/0x58
[ce6e3c30] [c00d3f74] remove_vma+0x78/0xd8
[ce6e3c50] [c00d4150] exit_mmap+0x17c/0x1e4
[ce6e3cc0] [c00318c0] mmput+0x6c/0x144
[ce6e3ce0] [c003712c] exit_mm+0x15c/0x190
[ce6e3d10] [c00391e8] do_exit+0xf0/0x670
[ce6e3d60] [c0014c9c] die+0x1cc/0x1d4
[ce6e3d90] [c001cdb8] bad_page_fault+0x98/0xf0
[ce6e3db0] [c0017d8c] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x6c/0xd8
LR = fs_may_remount_ro+0x3c/0xd8
[ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
[ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
[ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
[ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
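
A note for readers decoding the first oops above: the word in angle brackets in the instruction dump, <80090028>, is the instruction at NIP. In the PowerPC D-form encoding that word is lwz r0,0x28(r9), and with GPR09 = 00000010 the load address is 0x10 + 0x28 = 0x38, which matches DAR. So the pointer in r9 was not NULL but a small bogus value. The following is only a sketch of that arithmetic; struct dform, decode_dform() and dform_ea() are names made up for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a PowerPC D-form instruction word. */
struct dform {
    unsigned opcode;  /* top 6 bits; 32 is lwz */
    unsigned rt;      /* target register number */
    unsigned ra;      /* base register number */
    int16_t  disp;    /* signed 16-bit displacement */
};

/* Split a 32-bit instruction word into its D-form fields. */
static struct dform decode_dform(uint32_t insn)
{
    struct dform d;

    d.opcode = insn >> 26;
    d.rt = (insn >> 21) & 0x1f;
    d.ra = (insn >> 16) & 0x1f;
    d.disp = (int16_t)(insn & 0xffff);
    return d;
}

/* Effective address of the access, given the 32 GPR values from the oops. */
static uint32_t dform_ea(struct dform d, const uint32_t gpr[32])
{
    uint32_t base;

    assert(gpr != NULL);
    /* For D-form loads, RA == 0 means "no base register". */
    base = d.ra ? gpr[d.ra] : 0;
    return base + (uint32_t)(int32_t)d.disp;
}
```

Feeding it 0x80090028 and the register values from the dump gives opcode 32 (lwz), RA = 9 and an effective address of 0x38.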


2010-04-27 18:56:15

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Tue, 27 Apr 2010, Xianghua Xiao wrote:

cc'ed John

> 2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:
>
> # reboot
> # Oops: Kernel access of bad area, sig: 11 [#1]
> PREEMPT 83xx Sys
> Modules linked in:
> NIP: c00efc68 LR: c00efc38 CTR: 00000000
> REGS: ce6e3dc0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
> MSR: 00009032 <EE,ME,IR,DR> CR: 24000448 XER: 00000000
> DAR: 00000038, DSISR: 20000000
> TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
> GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
> GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
> GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
> GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
> NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
> LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
> Call Trace:
> [ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
> [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
> [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
> [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
> [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xfe5f8c4
> LR = 0x10051b88
> Instruction dump:
> 38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
> 7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
> ---[ end trace 17c711f9d369c3a3 ]---
> ------------[ cut here ]------------
> Kernel BUG at c045eeac [verbose debug info unavailable]
> Oops: Exception in kernel mode, sig: 5 [#2]
> PREEMPT 83xx Sys
> Modules linked in:
> NIP: c045eeac LR: c045ee84 CTR: 00000000
> REGS: ce6e3a80 TRAP: 0700 Tainted: G D (2.6.33.3-rt16)
> MSR: 00021032 <ME,CE,IR,DR> CR: 44004428 XER: 00000000
> TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
> GPR00: 00000001 ce6e3b30 cd89ccc0 c068f6b4 c045fc68 00000000 ce6e3b84 ce6e3b64
> GPR08: ce6e3b5c c0690000 cd89ccc0 ce6e3b30 24004422 100bbc1c 0fffd000 ffffffff
> GPR16: 00000001 00000000 007fff00 00000000 00000000 c0657824 ce6e3eb8 ce6e3b3c
> GPR24: cf028ea0 cec84d1c c065781c cec86a60 00009032 c065781c c065781c ce6e3b30
> NIP [c045eeac] rt_spin_lock_slowlock+0xa8/0x394
> LR [c045ee84] rt_spin_lock_slowlock+0x80/0x394
> Call Trace:
> [ce6e3b30] [c045ee84] rt_spin_lock_slowlock+0x80/0x394 (unreliable)
> [ce6e3bc0] [c045fc68] rt_spin_lock+0x58/0x90
> [ce6e3bd0] [c00efbbc] file_sb_list_del+0x48/0x88
> [ce6e3bf0] [c00f03ac] __fput+0x168/0x274
> [ce6e3c20] [c00f04f8] fput+0x40/0x58
> [ce6e3c30] [c00d3f74] remove_vma+0x78/0xd8
> [ce6e3c50] [c00d4150] exit_mmap+0x17c/0x1e4
> [ce6e3cc0] [c00318c0] mmput+0x6c/0x144
> [ce6e3ce0] [c003712c] exit_mm+0x15c/0x190
> [ce6e3d10] [c00391e8] do_exit+0xf0/0x670
> [ce6e3d60] [c0014c9c] die+0x1cc/0x1d4
> [ce6e3d90] [c001cdb8] bad_page_fault+0x98/0xf0
> [ce6e3db0] [c0017d8c] handle_page_fault+0x7c/0x80
> --- Exception: 300 at fs_may_remount_ro+0x6c/0xd8
> LR = fs_may_remount_ro+0x3c/0xd8
> [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
> [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
> [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
> [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xfe5f8c4
> LR = 0x10051b88
> Instruction dump:
> 38600001 4bbc95d1 801a0004 3aba0008 2f800000 419e0264 801a0018 7c4a1378
> 5400003a 7c400278 7c000034 5400d97e <0f000000> 83a20000 39200002 2f9d0002
> ---[ end trace 17c711f9d369c3a4 ]---
> Fixing recursive fault but reboot is needed!
> BUG: scheduling while atomic: umount/0x00000001/1613, CPU#0
> Modules linked in:
> Call Trace:
> [ce6e3800] [c000b6a0] show_stack+0xe8/0x244 (unreliable)
> [ce6e3850] [c04608c0] dump_stack+0x2c/0x44
> [ce6e3860] [c00282b0] __schedule_bug+0x9c/0xc4
> [ce6e3880] [c045cfe8] __schedule+0x488/0x5a4
> [ce6e38b0] [c045d2e0] schedule+0x40/0xa8
> [ce6e38c0] [c003973c] do_exit+0x644/0x670
> [ce6e3910] [c0014c9c] die+0x1cc/0x1d4
> [ce6e3940] [c0014fd0] _exception+0x130/0x170
> [ce6e3a30] [c00151d0] program_check_exception+0xd0/0x634
> [ce6e3a70] [c0017f38] ret_from_except_full+0x0/0x4c
> --- Exception: 700 at rt_spin_lock_slowlock+0xa8/0x394
> LR = rt_spin_lock_slowlock+0x80/0x394
> [ce6e3bc0] [c045fc68] rt_spin_lock+0x58/0x90
> [ce6e3bd0] [c00efbbc] file_sb_list_del+0x48/0x88
> [ce6e3bf0] [c00f03ac] __fput+0x168/0x274
> [ce6e3c20] [c00f04f8] fput+0x40/0x58
> [ce6e3c30] [c00d3f74] remove_vma+0x78/0xd8
> [ce6e3c50] [c00d4150] exit_mmap+0x17c/0x1e4
> [ce6e3cc0] [c00318c0] mmput+0x6c/0x144
> [ce6e3ce0] [c003712c] exit_mm+0x15c/0x190
> [ce6e3d10] [c00391e8] do_exit+0xf0/0x670
> [ce6e3d60] [c0014c9c] die+0x1cc/0x1d4
> [ce6e3d90] [c001cdb8] bad_page_fault+0x98/0xf0
> [ce6e3db0] [c0017d8c] handle_page_fault+0x7c/0x80
> --- Exception: 300 at fs_may_remount_ro+0x6c/0xd8
> LR = fs_may_remount_ro+0x3c/0xd8
> [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
> [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
> [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
> [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xfe5f8c4
> LR = 0x10051b88
>

2010-04-27 20:23:24

by john stultz

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Tue, 2010-04-27 at 20:56 +0200, Thomas Gleixner wrote:
> On Tue, 27 Apr 2010, Xianghua Xiao wrote:
>
> cc'ed John
>
> > 2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:
> >
> > # reboot
> > # Oops: Kernel access of bad area, sig: 11 [#1]
> > PREEMPT 83xx Sys
> > Modules linked in:
> > NIP: c00efc68 LR: c00efc38 CTR: 00000000
> > REGS: ce6e3dc0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
> > MSR: 00009032 <EE,ME,IR,DR> CR: 24000448 XER: 00000000
> > DAR: 00000038, DSISR: 20000000
> > TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
> > GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
> > GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
> > GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
> > GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
> > NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
> > LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
> > Call Trace:
> > [ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
> > [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
> > [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
> > [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
> > [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
> > --- Exception: c01 at 0xfe5f8c4
> > LR = 0x10051b88
> > Instruction dump:
> > 38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
> > 7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
> > ---[ end trace 17c711f9d369c3a3 ]---

Hey Xianghua,
What filesystem was this on? And what architecture?

thanks
-john

2010-04-27 20:31:12

by john stultz

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Tue, 2010-04-27 at 13:23 -0700, john stultz wrote:
> On Tue, 2010-04-27 at 20:56 +0200, Thomas Gleixner wrote:
> > On Tue, 27 Apr 2010, Xianghua Xiao wrote:
> >
> > cc'ed John
> >
> > > 2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:
> > >
> > > # reboot
> > > # Oops: Kernel access of bad area, sig: 11 [#1]
> > > PREEMPT 83xx Sys
> > > Modules linked in:
> > > NIP: c00efc68 LR: c00efc38 CTR: 00000000
> > > REGS: ce6e3dc0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
> > > MSR: 00009032 <EE,ME,IR,DR> CR: 24000448 XER: 00000000
> > > DAR: 00000038, DSISR: 20000000
> > > TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
> > > GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
> > > GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
> > > GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
> > > GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
> > > NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
> > > LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
> > > Call Trace:
> > > [ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
> > > [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
> > > [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
> > > [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
> > > [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
> > > --- Exception: c01 at 0xfe5f8c4
> > > LR = 0x10051b88
> > > Instruction dump:
> > > 38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
> > > 7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
> > > ---[ end trace 17c711f9d369c3a3 ]---
>
> Hey Xianghua,
> What filesystem was this on? And what architecture?

Also a .config would be helpful.

thanks
-john

2010-04-27 20:54:42

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Tue, Apr 27, 2010 at 3:30 PM, john stultz <[email protected]> wrote:
> On Tue, 2010-04-27 at 13:23 -0700, john stultz wrote:
>> On Tue, 2010-04-27 at 20:56 +0200, Thomas Gleixner wrote:
>> > On Tue, 27 Apr 2010, Xianghua Xiao wrote:
>> >
>> > cc'ed John
>> >
>> > > 2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:
>> > >
>> > > # reboot
>> > > # Oops: Kernel access of bad area, sig: 11 [#1]
>> > > PREEMPT 83xx Sys
>> > > Modules linked in:
>> > > NIP: c00efc68 LR: c00efc38 CTR: 00000000
>> > > REGS: ce6e3dc0 TRAP: 0300   Not tainted  (2.6.33.3-rt16)
>> > > MSR: 00009032 <EE,ME,IR,DR>  CR: 24000448  XER: 00000000
>> > > DAR: 00000038, DSISR: 20000000
>> > > TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
>> > > GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
>> > > GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
>> > > GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
>> > > GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
>> > > NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
>> > > LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
>> > > Call Trace:
>> > > [ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
>> > > [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
>> > > [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
>> > > [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
>> > > [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
>> > > --- Exception: c01 at 0xfe5f8c4
>> > >     LR = 0x10051b88
>> > > Instruction dump:
>> > > 38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
>> > > 7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
>> > > ---[ end trace 17c711f9d369c3a3 ]---
>>
>> Hey Xianghua,
>>       What filesystem was this on? And what architecture?
>
> Also a .config would be helpful.
>
> thanks
> -john
>
>
>

John,
it's ext2 and powerpc 834x. config.gz is attached.
the same config is used on 2.6.33.2-rt13 which did not show this umount oops.
Thanks!
Xianghua


Attachments:
config.gz (12.57 kB)

2010-04-27 22:08:25

by Uwaysi Bin Kareem

Subject: re: 2.6.33.3-rt16 - not reaching login.

Hi. I tried the 2.6.33.3-rt16 patch and it did not work: first there was
a compile failure, and then, after applying the inode fix, the system
does not reach the login screen. It does run some scripts before that, though.

Peace Be With You,
Uwaysi Bin Kareem.

2010-04-28 06:01:26

by john stultz

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Tue, 2010-04-27 at 15:54 -0500, Xianghua Xiao wrote:
> On Tue, Apr 27, 2010 at 3:30 PM, john stultz <[email protected]> wrote:
> > On Tue, 2010-04-27 at 13:23 -0700, john stultz wrote:
> >> On Tue, 2010-04-27 at 20:56 +0200, Thomas Gleixner wrote:
> >> > On Tue, 27 Apr 2010, Xianghua Xiao wrote:
> >> > > 2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:
> >> > >
> >> > > # reboot
> >> > > # Oops: Kernel access of bad area, sig: 11 [#1]
> >> > > PREEMPT 83xx Sys
> >> > > Modules linked in:
> >> > > NIP: c00efc68 LR: c00efc38 CTR: 00000000
> >> > > REGS: ce6e3dc0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
> >> > > MSR: 00009032 <EE,ME,IR,DR> CR: 24000448 XER: 00000000
> >> > > DAR: 00000038, DSISR: 20000000
> >> > > TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
> >> > > GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
> >> > > GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
> >> > > GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
> >> > > GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
> >> > > NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
> >> > > LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
> >> > > Call Trace:
> >> > > [ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
> >> > > [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
> >> > > [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
> >> > > [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
> >> > > [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
> >> > > --- Exception: c01 at 0xfe5f8c4
> >> > > LR = 0x10051b88
> >> > > Instruction dump:
> >> > > 38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
> >> > > 7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
> >> > > ---[ end trace 17c711f9d369c3a3 ]---
> >>
> >> Hey Xianghua,
> >> What filesystem was this on? And what architecture?
> >
> it's ext2 and powerpc 834x. config.gz is attached.
> the same config is used on 2.6.33.2-rt13 which did not show this umount oops.

So I've not been able to reproduce the issue, but I have found a few
problems in hunting down the issue Luis reported, and one of them may be
affecting you here.

Could you try the patch below and let me know if it resolves it for you?

thanks
-john


Fix 3 logic bugs in the vfs-scalability patches.

1) Typo that could cause a deadlock in do_umount
2) Improve MNT_MOUNTED handling on cloned rootfs
3) Fix might_sleep in atomic in put_mnt_ns

These may not be totally correct, as I still am chasing down some
namespace issues triggered by unshare().

Signed-off-by: John Stultz <[email protected]>

diff --git a/fs/namespace.c b/fs/namespace.c
index 5459a05..8c5d60b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1233,7 +1233,7 @@ static int do_umount(struct vfsmount *mnt, int flags)
                  */
                 vfsmount_write_lock();
                 if (count_mnt_count(mnt) != 2) {
-                        vfsmount_write_lock();
+                        vfsmount_write_unlock();
                         return -EBUSY;
                 }
                 vfsmount_write_unlock();
@@ -1376,6 +1376,12 @@ struct vfsmount *copy_tree(struct vfsmount *mnt, struct dentry *dentry,
         if (!q)
                 goto Enomem;
         q->mnt_mountpoint = mnt->mnt_mountpoint;
+        /*
+         * We don't call attach_mnt on rootfs, so set
+         * it as mounted here.
+         */
+        WARN_ON(q->mnt_flags & MNT_MOUNTED);
+        q->mnt_flags |= MNT_MOUNTED;

         p = mnt;
         list_for_each_entry(r, &mnt->mnt_mounts, mnt_child) {
@@ -2513,17 +2519,15 @@ void put_mnt_ns(struct mnt_namespace *ns)
 {
         struct vfsmount *root;
         LIST_HEAD(umount_list);
-        spinlock_t *lock;

-        lock = &get_cpu_var(vfsmount_lock);
-        if (!atomic_dec_and_lock(&ns->count, lock)) {
-                put_cpu_var(vfsmount_lock);
+        vfsmount_write_lock();
+        if (!atomic_dec_and_test(&ns->count)){
+                vfsmount_write_unlock();
                 return;
         }
         root = ns->root;
         ns->root = NULL;
-        spin_unlock(lock);
-        put_cpu_var(vfsmount_lock);
+        vfsmount_write_unlock();

         down_write(&namespace_sem);
         vfsmount_write_lock();
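
For readers following item 1 of the changelog: the do_umount hunk fixes an early-return path that took the lock a second time instead of dropping it. The sketch below shows only the shape of that bug, with a toy counter standing in for the lock; struct toy_lock, toy_lock_acquire(), toy_lock_release() and both check_busy_*() functions are hypothetical and are not the vfsmount locking code.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for a non-recursive lock: depth counts outstanding acquisitions. */
struct toy_lock {
    int depth;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->depth > 0);   /* releasing an unheld lock is a bug */
    l->depth--;
}

/* Buggy shape: the early-return path acquires again instead of releasing. */
static int check_busy_buggy(struct toy_lock *l, int count)
{
    toy_lock_acquire(l);
    if (count != 2) {
        toy_lock_acquire(l);    /* the typo: should be a release */
        return -EBUSY;
    }
    toy_lock_release(l);
    return 0;
}

/* Fixed shape: every exit path releases exactly once. */
static int check_busy_fixed(struct toy_lock *l, int count)
{
    toy_lock_acquire(l);
    if (count != 2) {
        toy_lock_release(l);
        return -EBUSY;
    }
    toy_lock_release(l);
    return 0;
}
```

In the toy version the buggy path merely returns with depth 2. With a real non-recursive lock the second acquire would never return, which is the deadlock the changelog refers to.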



2010-04-28 15:21:29

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, Apr 28, 2010 at 1:01 AM, john stultz <[email protected]> wrote:
> On Tue, 2010-04-27 at 15:54 -0500, Xianghua Xiao wrote:
>> On Tue, Apr 27, 2010 at 3:30 PM, john stultz <[email protected]> wrote:
>> > On Tue, 2010-04-27 at 13:23 -0700, john stultz wrote:
>> >> On Tue, 2010-04-27 at 20:56 +0200, Thomas Gleixner wrote:
>> >> > On Tue, 27 Apr 2010, Xianghua Xiao wrote:
>> >> > > 2.6.33.2-rt13 worked fine, however on 2.6.33.3-rt16, when I do reboot, it oops:
>> >> > >
>> >> > > # reboot
>> >> > > # Oops: Kernel access of bad area, sig: 11 [#1]
>> >> > > PREEMPT 83xx Sys
>> >> > > Modules linked in:
>> >> > > NIP: c00efc68 LR: c00efc38 CTR: 00000000
>> >> > > REGS: ce6e3dc0 TRAP: 0300   Not tainted  (2.6.33.3-rt16)
>> >> > > MSR: 00009032 <EE,ME,IR,DR>  CR: 24000448  XER: 00000000
>> >> > > DAR: 00000038, DSISR: 20000000
>> >> > > TASK = cd89ccc0[1613] 'umount' THREAD: ce6e2000
>> >> > > GPR00: 00000000 ce6e3e70 cd89ccc0 ce6e3ddc 22222222 00000000 ce6e3e24 ce6e3e04
>> >> > > GPR08: 00008000 00000010 cdfa2130 cdfa26e0 44000442 100bbc1c 0fffd000 ffffffff
>> >> > > GPR16: 00000001 00000000 007fff00 00000000 00000000 00000001 ce6e3eb8 00000021
>> >> > > GPR24: 00000060 00000000 00000000 ceb94c40 00000000 ceb94cc0 c065781c ce6e3e70
>> >> > > NIP [c00efc68] fs_may_remount_ro+0x6c/0xd8
>> >> > > LR [c00efc38] fs_may_remount_ro+0x3c/0xd8
>> >> > > Call Trace:
>> >> > > [ce6e3e70] [c00efc38] fs_may_remount_ro+0x3c/0xd8 (unreliable)
>> >> > > [ce6e3e90] [c00f1198] do_remount_sb+0x11c/0x164
>> >> > > [ce6e3eb0] [c0113a3c] do_mount+0x538/0x86c
>> >> > > [ce6e3f10] [c0113e30] sys_mount+0xc0/0x120
>> >> > > [ce6e3f40] [c00178d8] ret_from_syscall+0x0/0x38
>> >> > > --- Exception: c01 at 0xfe5f8c4
>> >> > >     LR = 0x10051b88
>> >> > > Instruction dump:
>> >> > > 38000000 817d00c0 3bbd00c0 60088000 814b0000 2f8a0000 419e0008 7c00522c
>> >> > > 7f8be800 419e004c 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
>> >> > > ---[ end trace 17c711f9d369c3a3 ]---
>> >>
>> >> Hey Xianghua,
>> >>       What filesystem was this on? And what architecture?
>> >
>> it's ext2 and powerpc 834x. config.gz is attached.
>> the same config is used on 2.6.33.2-rt13 which did not show this umount oops.
>
> So I've not been able to reproduce the issue, but I have found a few
> problems in hunting down the issue Luis reported, and one of them may be
> affecting you here.
>
> Could you try the patch below and let me know if it resolves it for you?
>
> thanks
> -john
>
>
> Fix 3 logic bugs in the vfs-scalability patches.
>
> 1) Typo that could cause a deadlock in do_umount
> 2) Improve MNT_MOUNTED handling on cloned rootfs
> 3) Fix might_sleep in atomic in put_mnt_ns
>
> These may not be totally correct, as I still am chasing down some
> namespace issues triggered by unshare().
>
> Signed-off-by: John Stultz <[email protected]>
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 5459a05..8c5d60b 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -1233,7 +1233,7 @@ static int do_umount(struct vfsmount *mnt, int flags)
>                 */
>                vfsmount_write_lock();
>                if (count_mnt_count(mnt) != 2) {
> -                       vfsmount_write_lock();
> +                       vfsmount_write_unlock();
>                        return -EBUSY;
>                }
>                vfsmount_write_unlock();
> @@ -1376,6 +1376,12 @@ struct vfsmount *copy_tree(struct vfsmount *mnt, struct dentry *dentry,
>        if (!q)
>                goto Enomem;
>        q->mnt_mountpoint = mnt->mnt_mountpoint;
> +       /*
> +        * We don't call attach_mnt on rootfs, so set
> +        * it as mounted here.
> +        */
> +       WARN_ON(q->mnt_flags & MNT_MOUNTED);
> +       q->mnt_flags |= MNT_MOUNTED;
>
>        p = mnt;
>        list_for_each_entry(r, &mnt->mnt_mounts, mnt_child) {
> @@ -2513,17 +2519,15 @@ void put_mnt_ns(struct mnt_namespace *ns)
>  {
>        struct vfsmount *root;
>        LIST_HEAD(umount_list);
> -       spinlock_t *lock;
>
> -       lock = &get_cpu_var(vfsmount_lock);
> -       if (!atomic_dec_and_lock(&ns->count, lock)) {
> -               put_cpu_var(vfsmount_lock);
> +       vfsmount_write_lock();
> +       if (!atomic_dec_and_test(&ns->count)){
> +               vfsmount_write_unlock();
>                return;
>        }
>        root = ns->root;
>        ns->root = NULL;
> -       spin_unlock(lock);
> -       put_cpu_var(vfsmount_lock);
> +       vfsmount_write_unlock();
>
>        down_write(&namespace_sem);
>        vfsmount_write_lock();
>
>
>
>
>

John,
Just tried the patch, still got umount hang, please see below.
Thanks!
Xianghua

# umount hda2
# reboot
# Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT 834x SYS
Modules linked in:
NIP: c009ddd8 LR: c009dda8 CTR: 00000000
REGS: ce0f1dd0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
MSR: 00009032 <EE,ME,IR,DR> CR: 24000444 XER: 00000000
DAR: 00000028, DSISR: 20000000
TASK = ceb65ab0[973] 'umount' THREAD: ce0f0000
GPR00: 00000000 ce0f1e80 ceb65ab0 ce0f1dfc 22222222 00000000 ce0f1e44 ce0f1e24
GPR08: 00008000 00000000 cf17cc50 cf17c978 44000442 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a8 00000000 ce0f1ec8
GPR24: 00000021 00000060 cebaec40 00000000 00000021 cebaecc0 00000001 c051221c
NIP [c009ddd8] fs_may_remount_ro+0x58/0xd0
LR [c009dda8] fs_may_remount_ro+0x28/0xd0
Call Trace:
[ce0f1e80] [c009dda8] fs_may_remount_ro+0x28/0xd0 (unreliable)
[ce0f1ea0] [c009ef1c] do_remount_sb+0x138/0x178
[ce0f1ec0] [c00bdbe8] do_mount+0x54c/0x840
[ce0f1f10] [c00bdfac] sys_mount+0xd0/0xfc
[ce0f1f40] [c0014208] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b44
Instruction dump:
38000000 817d00c0 60088000 3bbd00c0 814b0000 2f8a0000 419e0008 7c00522c
7f8be800 419e0060 812b000c 81290040 <80090028> 2f800000 419e0028 a009006e
---[ end trace faefbff1ebfe68f9 ]---
------------[ cut here ]------------
Kernel BUG at c03ae294 [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT 834x SYS
Modules linked in:
NIP: c03ae294 LR: c03ae26c CTR: 00000000
REGS: ce0f1af0 TRAP: 0700 Tainted: G D (2.6.33.3-rt16)
MSR: 00021032 <ME,CE,IR,DR> CR: 24004428 XER: 00000000
TASK = ceb65ab0[973] 'umount' THREAD: ce0f0000
GPR00: 00000001 ce0f1ba0 ceb65ab0 00000001 11111111 00000000 ce0f1bf4 ce0f1bd4
GPR08: ce0f1bcc 00000000 ceb65ab0 ce0f0000 24004422 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a8 c0512224 ce0f1ec8
GPR24: ce0f1bac cf0281a0 cec1ee84 c051221c cec1fdb0 00009032 ceba4b80 ceba4b80
NIP [c03ae294] rt_spin_lock_slowlock+0x90/0x348
LR [c03ae26c] rt_spin_lock_slowlock+0x68/0x348
Call Trace:
[ce0f1ba0] [c03ae26c] rt_spin_lock_slowlock+0x68/0x348 (unreliable)
[ce0f1c30] [c009dd48] file_sb_list_del+0x34/0x6c
[ce0f1c50] [c009e458] __fput+0x154/0x254
[ce0f1c80] [c0085530] remove_vma+0x64/0xd0
[ce0f1c90] [c0085704] exit_mmap+0x168/0x1c4
[ce0f1cf0] [c0022fe4] mmput+0x70/0x138
[ce0f1d10] [c0027c80] exit_mm+0x148/0x170
[ce0f1d40] [c0029e7c] do_exit+0x508/0x614
[ce0f1d90] [c0011ce0] die+0x19c/0x1a4
[ce0f1db0] [c001822c] bad_page_fault+0x98/0xd0
[ce0f1dc0] [c00146a8] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x58/0xd0
LR = fs_may_remount_ro+0x28/0xd0
[ce0f1ea0] [c009ef1c] do_remount_sb+0x138/0x178
[ce0f1ec0] [c00bdbe8] do_mount+0x54c/0x840
[ce0f1f10] [c00bdfac] sys_mount+0xd0/0xfc
[ce0f1f40] [c0014208] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b44
Instruction dump:
38600001 4bc70781 801b0004 3adb0008 2f800000 419e027c 801b0018 7c4a1378
5400003a 7c400278 7c000034 5400d97e <0f000000> 83c20000 39200002 2f9e0002
---[ end trace faefbff1ebfe68fa ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: umount/0x00000001/973, CPU#0
Modules linked in:
Call Trace:
[ce0f18f0] [c0009d14] show_stack+0x70/0x1b8 (unreliable)
[ce0f1930] [c001e8cc] __schedule_bug+0x90/0x94
[ce0f1950] [c03ac910] __schedule+0x2ac/0x390
[ce0f1970] [c03acb98] schedule+0x28/0x54
[ce0f1980] [c0029df4] do_exit+0x480/0x614
[ce0f19d0] [c0011ce0] die+0x19c/0x1a4
[ce0f19f0] [c0011f64] _exception+0x138/0x16c
[ce0f1ae0] [c0014854] ret_from_except_full+0x0/0x4c
--- Exception: 700 at rt_spin_lock_slowlock+0x90/0x348
LR = rt_spin_lock_slowlock+0x68/0x348
[ce0f1c30] [c009dd48] file_sb_list_del+0x34/0x6c
[ce0f1c50] [c009e458] __fput+0x154/0x254
[ce0f1c80] [c0085530] remove_vma+0x64/0xd0
[ce0f1c90] [c0085704] exit_mmap+0x168/0x1c4
[ce0f1cf0] [c0022fe4] mmput+0x70/0x138
[ce0f1d10] [c0027c80] exit_mm+0x148/0x170
[ce0f1d40] [c0029e7c] do_exit+0x508/0x614
[ce0f1d90] [c0011ce0] die+0x19c/0x1a4
[ce0f1db0] [c001822c] bad_page_fault+0x98/0xd0
[ce0f1dc0] [c00146a8] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x58/0xd0
LR = fs_may_remount_ro+0x28/0xd0
[ce0f1ea0] [c009ef1c] do_remount_sb+0x138/0x178
[ce0f1ec0] [c00bdbe8] do_mount+0x54c/0x840
[ce0f1f10] [c00bdfac] sys_mount+0xd0/0xfc
[ce0f1f40] [c0014208] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b44


#

2010-04-28 19:39:07

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, 28 Apr 2010, Xianghua Xiao wrote:
> Thomas,
> I applied the patch and re-ran the test, but none of the conditions
> in your patch was hit.
> In your patch I changed:
>
> if (!file->f_path) {
> to
> if(!(&(file->f_path))){
> Otherwise it won't compile, as f_path is not a pointer.

True :)

> # reboot
> # Oops: Kernel access of bad area, sig: 11 [#1]

Ok. Can you please enable CONFIG_DEBUG_LIST ?

Thanks,

tglx

2010-04-28 17:54:56

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, Apr 28, 2010 at 11:34 AM, Thomas Gleixner <[email protected]> wrote:
> On Wed, 28 Apr 2010, Xianghua Xiao wrote:
>> Just tried the patch, still got umount hang, please see below.
>
> Can you please apply the patch below and provide the debug output ?
>
> Thanks,
>
>        tglx
> ---
>  fs/file_table.c |   22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> Index: linux-2.6-tip/fs/file_table.c
> ===================================================================
> --- linux-2.6-tip.orig/fs/file_table.c
> +++ linux-2.6-tip/fs/file_table.c
> @@ -410,7 +410,27 @@ int fs_may_remount_ro(struct super_block
>                list = &sb->s_files;
>  #endif
>                list_for_each_entry(file, list, f_u.fu_list) {
> -                       struct inode *inode = file->f_path.dentry->d_inode;
> +                       struct inode *inode;
> +
> +                       if (!file->f_path) {
> +                               printk(KERN_ERR "file %p fpath == NULL\n",
> +                                      file);
> +                               continue;
> +                       }
> +
> +                       if (!file->f_path.dentry) {
> +                               printk(KERN_ERR "file %p dentry == NULL\n",
> +                                      file);
> +                               continue;
> +                       }
> +
> +                       if (!file->f_path.dentry->d_inode) {
> +                               printk(KERN_ERR "file %p d_inode == NULL\n",
> +                                      file);
> +                               continue;
> +                       }
> +
> +                       inode = file->f_path.dentry->d_inode;
>
>                        /* File with pending delete? */
>                        if (inode->i_nlink == 0)
>
Thomas,
I applied the patch and re-ran the test, but none of the conditions
in your patch was hit.
In your patch I changed:

if (!file->f_path) {
to
if(!(&(file->f_path))){
Otherwise it won't compile, as f_path is not a pointer.

Thanks,
Xianghua

# reboot
# Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT 834x SYS
Modules linked in:
NIP: c009d5e0 LR: c009d69c CTR: 00000001
REGS: cde87dd0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
MSR: 00009032 <EE,ME,IR,DR> CR: 24000424 XER: 20000000
DAR: 2e657490, DSISR: 20000000
TASK = ce99e9f0[1404] 'umount' THREAD: cde86000
GPR00: 00007000 cde87e80 ce99e9f0 00000024 00003da7 ffffffff c0542548 00020000
GPR08: c054292c 2e657468 0001ffff cde12b58 24000422 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 cde87ec8
GPR24: 00000021 00000060 c045b5a8 c045b5c4 c050cd6c ce953488 00008000 cde12940
NIP [c009d5e0] fs_may_remount_ro+0x88/0x150
LR [c009d69c] fs_may_remount_ro+0x144/0x150
Call Trace:
[cde87e80] [c009d69c] fs_may_remount_ro+0x144/0x150 (unreliable)
[cde87ea0] [c009e5dc] do_remount_sb+0x138/0x178
[cde87ec0] [c00bd25c] do_mount+0x54c/0x840
[cde87f10] [c00bd620] sys_mount+0xd0/0xfc
[cde87f40] [c0014208] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
817f0000 2f8b0000 419e0008 7c005a2c 7f9fe800 419e0080 813f000c 2f890000
419e00a8 81290024 2f890000 419e00b4 <80090028> 2f800000 419e0028 a009006e
---[ end trace 3fba518eec56e584 ]---
------------[ cut here ]------------
Kernel BUG at c03ad89c [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT 834x SYS
Modules linked in:
NIP: c03ad89c LR: c03ad874 CTR: c0121220
REGS: cde87b00 TRAP: 0700 Tainted: G D (2.6.33.3-rt16)
MSR: 00021032 <ME,CE,IR,DR> CR: 84004428 XER: 00000000
TASK = ce99e9f0[1404] 'umount' THREAD: cde86000
GPR00: 00000001 cde87bb0 ce99e9f0 00000001 000002ac 000002ac 00008000 00000000
GPR08: 00000000 00000000 ce99e9f0 cde86000 24004422 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 c050cd74
GPR24: 00000021 cf0231a0 cec19b34 c050cd6c cec1a9a8 00009032 cdf877a0 cdf877a0
NIP [c03ad89c] rt_spin_lock_slowlock+0x84/0x318
LR [c03ad874] rt_spin_lock_slowlock+0x5c/0x318
Call Trace:
[cde87bb0] [c03ad874] rt_spin_lock_slowlock+0x5c/0x318 (unreliable)
[cde87c30] [c009d3a8] file_sb_list_del+0x34/0x6c
[cde87c50] [c009db38] __fput+0x154/0x254
[cde87c80] [c0084bfc] remove_vma+0x64/0xd0
[cde87c90] [c0084dd0] exit_mmap+0x168/0x1c4
[cde87cf0] [c0022fd8] mmput+0x70/0x138
[cde87d10] [c0027c8c] exit_mm+0x148/0x170
[cde87d40] [c0029e88] do_exit+0x508/0x614
[cde87d90] [c0011ce0] die+0x19c/0x1a4
[cde87db0] [c001822c] bad_page_fault+0x98/0xd0
[cde87dc0] [c00146a8] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x88/0x150
LR = fs_may_remount_ro+0x144/0x150
[cde87ea0] [c009e5dc] do_remount_sb+0x138/0x178
[cde87ec0] [c00bd25c] do_mount+0x54c/0x840
[cde87f10] [c00bd620] sys_mount+0xd0/0xfc
[cde87f40] [c0014208] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
38600001 4bc71179 801b0004 3afb0008 2f800000 419e0270 801b0010 7c4a1378
5400003a 7c400278 7c000034 5400d97e <0f000000> 83c20000 39200002 2f9e0002
---[ end trace 3fba518eec56e585 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: umount/0x00000001/1404, CPU#0
Modules linked in:
Call Trace:
[cde87900] [c0009d14] show_stack+0x70/0x1b8 (unreliable)
[cde87940] [c001e8cc] __schedule_bug+0x90/0x94
[cde87960] [c03ac0f8] __schedule+0x2ac/0x390
[cde87980] [c03ac380] schedule+0x28/0x54
[cde87990] [c0029e00] do_exit+0x480/0x614
[cde879e0] [c0011ce0] die+0x19c/0x1a4
[cde87a00] [c0011f64] _exception+0x138/0x16c
[cde87af0] [c0014854] ret_from_except_full+0x0/0x4c
--- Exception: 700 at rt_spin_lock_slowlock+0x84/0x318
LR = rt_spin_lock_slowlock+0x5c/0x318
[cde87c30] [c009d3a8] file_sb_list_del+0x34/0x6c
[cde87c50] [c009db38] __fput+0x154/0x254
[cde87c80] [c0084bfc] remove_vma+0x64/0xd0
[cde87c90] [c0084dd0] exit_mmap+0x168/0x1c4
[cde87cf0] [c0022fd8] mmput+0x70/0x138
[cde87d10] [c0027c8c] exit_mm+0x148/0x170
[cde87d40] [c0029e88] do_exit+0x508/0x614
[cde87d90] [c0011ce0] die+0x19c/0x1a4
[cde87db0] [c001822c] bad_page_fault+0x98/0xd0
[cde87dc0] [c00146a8] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x88/0x150
LR = fs_may_remount_ro+0x144/0x150
[cde87ea0] [c009e5dc] do_remount_sb+0x138/0x178
[cde87ec0] [c00bd25c] do_mount+0x54c/0x840
[cde87f10] [c00bd620] sys_mount+0xd0/0xfc
[cde87f40] [c0014208] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88

2010-04-28 20:07:01

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, Apr 28, 2010 at 2:38 PM, Thomas Gleixner <[email protected]> wrote:
> On Wed, 28 Apr 2010, Xianghua Xiao wrote:
>> Thomas,
>> I applied the patch and re-ran the test, but none of the conditions in
>> your patch were hit.
>> In your patch I changed:
>>
>> if (!file->f_path) {
>> to
>> if(!(&(file->f_path))){
>> Otherwise it won't compile, as f_path is not a pointer.
>
> True :)
>
>> # reboot
>> # Oops: Kernel access of bad area, sig: 11 [#1]
>
> Ok. Can you please enable CONFIG_DEBUG_LIST ?
>
> Thanks,
>
>        tglx
>
I turned that on, but could not find any difference in the oops log.
If I remount it rw and then ro, the remount ro causes a similar oops.
Thanks,
Xianghua

# reboot
# Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT 834x SYS
Modules linked in:
NIP: c009ca1c LR: c009c9cc CTR: 00000000
REGS: cde43dd0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
MSR: 00009032 <EE,ME,IR,DR> CR: 24000444 XER: 20000000
DAR: 31c554a2, DSISR: 20000000
TASK = ce9219d0[1396] 'umount' THREAD: cde42000
GPR00: 0000001d cde43e80 ce9219d0 c0454910 000002ac 000002ac 00008000 00000000
GPR08: 00007fff 31c5547a c0454910 cea82b78 44000442 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 cde43ec8
GPR24: 00000021 00000060 c045869c c04586b8 c050bd6c ce951488 00008000 cea82960
NIP [c009ca1c] fs_may_remount_ro+0x88/0x150
LR [c009c9cc] fs_may_remount_ro+0x38/0x150
Call Trace:
[cde43e80] [c009c9cc] fs_may_remount_ro+0x38/0x150 (unreliable)
[cde43ea0] [c009da10] do_remount_sb+0x138/0x178
[cde43ec0] [c00bc420] do_mount+0x54c/0x840
[cde43f10] [c00bc7e4] sys_mount+0xd0/0xfc
[cde43f40] [c00141e8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
817f0000 2f8b0000 419e0008 7c005a2c 7f9fe800 419e0080 813f000c 2f890000
419e00a8 81290024 2f890000 419e00b4 <80090028> 2f800000 419e0028 a009006e
---[ end trace cd3eb2ed5361fbce ]---
------------[ cut here ]------------
kernel BUG at kernel/rtmutex.c:808!
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT 834x SYS
Modules linked in:
NIP: c03aa79c LR: c03aa774 CTR: c011fbfc
REGS: cde43b00 TRAP: 0700 Tainted: G D (2.6.33.3-rt16)
MSR: 00021032 <ME,CE,IR,DR> CR: 82004428 XER: 00000000
TASK = ce9219d0[1396] 'umount' THREAD: cde42000
GPR00: 00000001 cde43bb0 ce9219d0 00000001 000002ac 000002ac 00008000 00000000
GPR08: 00000000 00000000 ce9219d0 cde42000 22004422 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 c050bd74
GPR24: 00000021 cf0231a0 cec19b34 c050bd6c cec1a9a8 00009032 cde52080 cde52080
NIP [c03aa79c] rt_spin_lock_slowlock+0x84/0x318
LR [c03aa774] rt_spin_lock_slowlock+0x5c/0x318
Call Trace:
[cde43bb0] [c03aa774] rt_spin_lock_slowlock+0x5c/0x318 (unreliable)
[cde43c30] [c009c7e4] file_sb_list_del+0x34/0x6c
[cde43c50] [c009cf6c] __fput+0x154/0x254
[cde43c80] [c00843dc] remove_vma+0x64/0xd0
[cde43c90] [c00845b0] exit_mmap+0x168/0x1c4
[cde43cf0] [c0022f48] mmput+0x7c/0x124
[cde43d10] [c0027ba8] exit_mm+0x148/0x170
[cde43d40] [c0029d90] do_exit+0x500/0x60c
[cde43d90] [c0011cc0] die+0x19c/0x1a4
[cde43db0] [c00181e0] bad_page_fault+0x98/0xd0
[cde43dc0] [c0014688] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x88/0x150
LR = fs_may_remount_ro+0x38/0x150
[cde43ea0] [c009da10] do_remount_sb+0x138/0x178
[cde43ec0] [c00bc420] do_mount+0x54c/0x840
[cde43f10] [c00bc7e4] sys_mount+0xd0/0xfc
[cde43f40] [c00141e8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
38600001 4bc74275 801b0004 3afb0008 2f800000 419e0270 801b0010 7c4a1378
5400003a 7c400278 7c000034 5400d97e <0f000000> 83c20000 39200002 2f9e0002
---[ end trace cd3eb2ed5361fbcf ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: umount/0x00000001/1396, CPU#0
Modules linked in:
Call Trace:
[cde43900] [c0009d0c] show_stack+0x70/0x1b8 (unreliable)
[cde43940] [c001e8c8] __schedule_bug+0x90/0x94
[cde43960] [c03a9024] __schedule+0x2ac/0x390
[cde43980] [c03a92ac] schedule+0x28/0x54
[cde43990] [c0029d08] do_exit+0x478/0x60c
[cde439e0] [c0011cc0] die+0x19c/0x1a4
[cde43a00] [c0011f44] _exception+0x138/0x16c
[cde43af0] [c0014834] ret_from_except_full+0x0/0x4c
--- Exception: 700 at rt_spin_lock_slowlock+0x84/0x318
LR = rt_spin_lock_slowlock+0x5c/0x318
[cde43c30] [c009c7e4] file_sb_list_del+0x34/0x6c
[cde43c50] [c009cf6c] __fput+0x154/0x254
[cde43c80] [c00843dc] remove_vma+0x64/0xd0
[cde43c90] [c00845b0] exit_mmap+0x168/0x1c4
[cde43cf0] [c0022f48] mmput+0x7c/0x124
[cde43d10] [c0027ba8] exit_mm+0x148/0x170
[cde43d40] [c0029d90] do_exit+0x500/0x60c
[cde43d90] [c0011cc0] die+0x19c/0x1a4
[cde43db0] [c00181e0] bad_page_fault+0x98/0xd0
[cde43dc0] [c0014688] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x88/0x150
LR = fs_may_remount_ro+0x38/0x150
[cde43ea0] [c009da10] do_remount_sb+0x138/0x178
[cde43ec0] [c00bc420] do_mount+0x54c/0x840
[cde43f10] [c00bc7e4] sys_mount+0xd0/0xfc
[cde43f40] [c00141e8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88

2010-04-28 16:34:43

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, 28 Apr 2010, Xianghua Xiao wrote:
> Just tried the patch, still got umount hang, please see below.

Can you please apply the patch below and provide the debug output ?

Thanks,

tglx
---
 fs/file_table.c |   22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

Index: linux-2.6-tip/fs/file_table.c
===================================================================
--- linux-2.6-tip.orig/fs/file_table.c
+++ linux-2.6-tip/fs/file_table.c
@@ -410,7 +410,27 @@ int fs_may_remount_ro(struct super_block
               list = &sb->s_files;
 #endif
               list_for_each_entry(file, list, f_u.fu_list) {
-                       struct inode *inode = file->f_path.dentry->d_inode;
+                       struct inode *inode;
+
+                       if (!file->f_path) {
+                               printk(KERN_ERR "file %p fpath == NULL\n",
+                                      file);
+                               continue;
+                       }
+
+                       if (!file->f_path.dentry) {
+                               printk(KERN_ERR "file %p dentry == NULL\n",
+                                      file);
+                               continue;
+                       }
+
+                       if (!file->f_path.dentry->d_inode) {
+                               printk(KERN_ERR "file %p d_inode == NULL\n",
+                                      file);
+                               continue;
+                       }
+
+                       inode = file->f_path.dentry->d_inode;

                       /* File with pending delete? */
                       if (inode->i_nlink == 0)

2010-04-28 20:22:21

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, 28 Apr 2010, Xianghua Xiao wrote:
> On Wed, Apr 28, 2010 at 2:38 PM, Thomas Gleixner <[email protected]> wrote:
> > On Wed, 28 Apr 2010, Xianghua Xiao wrote:
> >> Thomas,
> >> I applied the patch and re-ran the test, but none of the conditions in
> >> your patch were hit.
> >> In your patch I changed:
> >>
> >> if (!file->f_path) {
> >> to
> >> if(!(&(file->f_path))){
> >> Otherwise it won't compile, as f_path is not a pointer.
> >
> > True :)
> >
> >> # reboot
> >> # Oops: Kernel access of bad area, sig: 11 [#1]
> >
> > Ok. Can you please enable CONFIG_DEBUG_LIST ?
> >
> > Thanks,
> >
> >        tglx
> >
> I turned that on, but could not find any difference in the oops log.
> If I remount it rw and then ro, the remount ro causes a similar oops.
> Thanks,
> Xianghua
>
> # reboot
> # Oops: Kernel access of bad area, sig: 11 [#1]
> PREEMPT 834x SYS
> Modules linked in:
> NIP: c009ca1c LR: c009c9cc CTR: 00000000

Can you please decode the code lines with

# addr2line -e vmlinux 0xc009ca1c 0xc009c9cc

You need to enable CONFIG_DEBUG_INFO to get real line numbers.

Thanks,

tglx

2010-04-28 21:22:50

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, Apr 28, 2010 at 3:22 PM, Thomas Gleixner <[email protected]> wrote:
> On Wed, 28 Apr 2010, Xianghua Xiao wrote:
>> On Wed, Apr 28, 2010 at 2:38 PM, Thomas Gleixner <[email protected]> wrote:
>> > On Wed, 28 Apr 2010, Xianghua Xiao wrote:
>> >> Thomas,
>> >> I applied the patch and re-ran the test, but none of the conditions in
>> >> your patch were hit.
>> >> In your patch I changed:
>> >>
>> >> if (!file->f_path) {
>> >> to
>> >> if(!(&(file->f_path))){
>> >> Otherwise it won't compile, as f_path is not a pointer.
>> >
>> > True :)
>> >
>> >> # reboot
>> >> # Oops: Kernel access of bad area, sig: 11 [#1]
>> >
>> > Ok. Can you please enable CONFIG_DEBUG_LIST ?
>> >
>> > Thanks,
>> >
>> >        tglx
>> >
>> I turned that on, but could not find any difference in the oops log.
>> If I remount it rw and then ro, the remount ro causes a similar oops.
>> Thanks,
>> Xianghua
>>
>> # reboot
>> # Oops: Kernel access of bad area, sig: 11 [#1]
>> PREEMPT 834x SYS
>> Modules linked in:
>> NIP: c009ca1c LR: c009c9cc CTR: 00000000
>
> Can you please decode the code lines with
>
> # addr2line -e vmlinux 0xc009ca1c 0xc009c9cc
>
> You need to enable CONFIG_DEBUG_INFO to get real line numbers.
>
> Thanks,
>
>        tglx

Here it is, thanks!
Xianghua

# reboot
# Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT 834x SYS
Modules linked in:
NIP: c009ded8 LR: c009de88 CTR: 00000000
REGS: cde51dd0 TRAP: 0300 Not tainted (2.6.33.3-rt16)
MSR: 00009032 <EE,ME,IR,DR> CR: 24000444 XER: 00000000
DAR: 00000030, DSISR: 20000000
TASK = ce99d580[1404] 'umount' THREAD: cde50000
GPR00: 0000001d cde51e80 ce99d580 cde51dfc 22222222 00000000 cde51e44 cde51e24
GPR08: cde51e1c 00000008 ce99d580 cdf77c90 44000442 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 cde51ec8
GPR24: 00000021 00000060 c045a09c c045a0b8 c051321c cdf084c0 00008000 cdf779b8
NIP [c009ded8] fs_may_remount_ro+0x88/0x150
LR [c009de88] fs_may_remount_ro+0x38/0x150
Call Trace:
[cde51e80] [c009de88] fs_may_remount_ro+0x38/0x150 (unreliable)
[cde51ea0] [c009ef50] do_remount_sb+0x138/0x178
[cde51ec0] [c00bd9c0] do_mount+0x54c/0x840
[cde51f10] [c00bdd84] sys_mount+0xd0/0xfc
[cde51f40] [c00141e8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
817f0000 2f8b0000 419e0008 7c005a2c 7f9fe800 419e0080 813f000c 2f890000
419e00a8 81290040 2f890000 419e00b4 <80090028> 2f800000 419e0028 a009006e
---[ end trace 8efa68ffffb3f0d2 ]---
------------[ cut here ]------------
kernel BUG at kernel/rtmutex.c:808!
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT 834x SYS
Modules linked in:
NIP: c03ac1fc LR: c03ac1d4 CTR: 00000000
REGS: cde51af0 TRAP: 0700 Tainted: G D (2.6.33.3-rt16)
MSR: 00021032 <ME,CE,IR,DR> CR: 24004428 XER: 00000000
TASK = ce99d580[1404] 'umount' THREAD: cde50000
GPR00: 00000001 cde51ba0 ce99d580 00000001 11111111 00000000 cde51bf4 cde51bd4
GPR08: cde51bcc 00000000 ce99d580 cde50000 24004422 100bbc1c 0fffd000 ffffffff
GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 c0513224 cde51ec8
GPR24: cde51bac cf0281a0 cec21e84 c051321c cec22a60 00009032 cde0e060 cde0e060
NIP [c03ac1fc] rt_spin_lock_slowlock+0x90/0x348
LR [c03ac1d4] rt_spin_lock_slowlock+0x68/0x348
Call Trace:
[cde51ba0] [c03ac1d4] rt_spin_lock_slowlock+0x68/0x348 (unreliable)
[cde51c30] [c009dc24] file_sb_list_del+0x34/0x6c
[cde51c50] [c009e44c] __fput+0x154/0x27c
[cde51c80] [c0085588] remove_vma+0x64/0xd0
[cde51c90] [c008575c] exit_mmap+0x168/0x1c4
[cde51cf0] [c0023054] mmput+0x7c/0x124
[cde51d10] [c0027c9c] exit_mm+0x148/0x170
[cde51d40] [c0029e84] do_exit+0x500/0x60c
[cde51d90] [c0011cc0] die+0x19c/0x1a4
[cde51db0] [c00181e0] bad_page_fault+0x98/0xd0
[cde51dc0] [c0014688] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x88/0x150
LR = fs_may_remount_ro+0x38/0x150
[cde51ea0] [c009ef50] do_remount_sb+0x138/0x178
[cde51ec0] [c00bd9c0] do_mount+0x54c/0x840
[cde51f10] [c00bdd84] sys_mount+0xd0/0xfc
[cde51f40] [c00141e8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88
Instruction dump:
38600001 4bc72915 801b0004 3adb0008 2f800000 419e027c 801b0018 7c4a1378
5400003a 7c400278 7c000034 5400d97e <0f000000> 83c20000 39200002 2f9e0002
---[ end trace 8efa68ffffb3f0d3 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: umount/0x00000001/1404, CPU#0
Modules linked in:
Call Trace:
[cde518f0] [c0009d0c] show_stack+0x70/0x1b8 (unreliable)
[cde51930] [c001e9c8] __schedule_bug+0x90/0x94
[cde51950] [c03aa8a4] __schedule+0x2ac/0x390
[cde51970] [c03aab2c] schedule+0x28/0x54
[cde51980] [c0029dfc] do_exit+0x478/0x60c
[cde519d0] [c0011cc0] die+0x19c/0x1a4
[cde519f0] [c0011f44] _exception+0x138/0x16c
[cde51ae0] [c0014834] ret_from_except_full+0x0/0x4c
--- Exception: 700 at rt_spin_lock_slowlock+0x90/0x348
LR = rt_spin_lock_slowlock+0x68/0x348
[cde51c30] [c009dc24] file_sb_list_del+0x34/0x6c
[cde51c50] [c009e44c] __fput+0x154/0x27c
[cde51c80] [c0085588] remove_vma+0x64/0xd0
[cde51c90] [c008575c] exit_mmap+0x168/0x1c4
[cde51cf0] [c0023054] mmput+0x7c/0x124
[cde51d10] [c0027c9c] exit_mm+0x148/0x170
[cde51d40] [c0029e84] do_exit+0x500/0x60c
[cde51d90] [c0011cc0] die+0x19c/0x1a4
[cde51db0] [c00181e0] bad_page_fault+0x98/0xd0
[cde51dc0] [c0014688] handle_page_fault+0x7c/0x80
--- Exception: 300 at fs_may_remount_ro+0x88/0x150
LR = fs_may_remount_ro+0x38/0x150
[cde51ea0] [c009ef50] do_remount_sb+0x138/0x178
[cde51ec0] [c00bd9c0] do_mount+0x54c/0x840
[cde51f10] [c00bdd84] sys_mount+0xd0/0xfc
[cde51f40] [c00141e8] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfe5f8c4
LR = 0x10051b88

2010-04-28 21:46:41

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, 28 Apr 2010, Xianghua Xiao wrote:
> On Wed, Apr 28, 2010 at 3:22 PM, Thomas Gleixner <[email protected]> wrote:
> >
> > Can you please decode the code lines with
> >
> > # addr2line -e vmlinux 0xc009ca1c 0xc009c9cc
> >
> > You need to enable CONFIG_DEBUG_INFO to get real line numbers.
>
> # reboot
> # Oops: Kernel access of bad area, sig: 11 [#1]
> PREEMPT 834x SYS
> Modules linked in:
> NIP: c009ded8 LR: c009de88 CTR: 00000000

Again. Can you please decode the code lines with

# addr2line -e vmlinux 0xc009ded8 0xc009de88

Please run the above shell command in the directory where your kernel
compile output resides. If you compiled with O=BUILD_DIR then cd to
$BUILD_DIR otherwise you will find vmlinux in the root of your kernel
source tree. Please provide the output.

Thanks,

tglx

2010-04-28 23:34:06

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, Apr 28, 2010 at 4:46 PM, Thomas Gleixner <[email protected]> wrote:
> On Wed, 28 Apr 2010, Xianghua Xiao wrote:
>> On Wed, Apr 28, 2010 at 3:22 PM, Thomas Gleixner <[email protected]> wrote:
>> >
>> > Can you please decode the code lines with
>> >
>> > # addr2line -e vmlinux 0xc009ca1c 0xc009c9cc
>> >
>> > You need to enable CONFIG_DEBUG_INFO to get real line numbers.
>>
>> # reboot
>> # Oops: Kernel access of bad area, sig: 11 [#1]
>> PREEMPT 834x SYS
>> Modules linked in:
>> NIP: c009ded8 LR: c009de88 CTR: 00000000
>
> Again. Can you please decode the code lines with
>
> # addr2line -e vmlinux 0xc009ded8 0xc009de88
>
> Please run the above shell command in the directory where your kernel
> compile output resides. If you compiled with O=BUILD_DIR then cd to
> $BUILD_DIR otherwise you will find vmlinux in the root of your kernel
> source tree. Please provide the output.
>
> Thanks,
>
>        tglx
>

Here it is. Just in case, I also attached the related source file:
addr2line -e vmlinux c009ded8 c009de88
/home/xxiao/xxiao/linux-2.6.33.3/fs/file_table.c:436
/home/xxiao/xxiao/linux-2.6.33.3/fs/file_table.c:440

thanks,


Attachments:
file_table.c (12.09 kB)

2010-04-30 16:51:21

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Thu, Apr 29, 2010 at 10:15 AM, Phil Carmody
<[email protected]> wrote:
> On 28/04/10 19:54 +0200, ext Xianghua Xiao wrote:
>> On Wed, Apr 28, 2010 at 11:34 AM, Thomas Gleixner <[email protected]> wrote:
> ...
>> Thomas,
>> I applied the patch and re-ran the test, but none of the conditions in
>> your patch were hit.
>> In your patch I changed:
>>
>> if (!file->f_path) {
>> to
>> if(!(&(file->f_path))){
>> Otherwise it won't compile, as f_path is not a pointer.
>
> That check is completely bogus. The address of _anything_ (whose address
> can be taken) is _always_ non-null.
>
> Phil
>

Thomas,

I confirm that 2.6.33.3-rt15 worked fine, reboot/umount will not oops.

Thanks,
Xianghua
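
Phil's remark quoted above can be illustrated outside the kernel. The
following is a minimal user-space sketch, not kernel code: the mock_*
structures and both helper functions are hypothetical stand-ins that keep
only the fields touched by the debug patch. It shows that the address of
the embedded f_path member is always the file pointer plus a fixed offset,
so only the dentry and d_inode checks can ever trigger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct mock_inode  { unsigned int i_nlink; };
struct mock_dentry { struct mock_inode *d_inode; };
struct mock_path   { void *mnt; struct mock_dentry *dentry; };
struct mock_file {
    unsigned int f_mode;        /* placeholder so f_path has a non-zero offset */
    struct mock_path f_path;    /* embedded structure, not a pointer */
};

/*
 * &file->f_path is computed from the file pointer itself, so for a valid
 * file it can never be NULL. Returns 1 when the member address equals
 * file + offsetof(struct mock_file, f_path), which is always the case.
 */
static int f_path_is_at_fixed_offset(const struct mock_file *file)
{
    return (uintptr_t)&file->f_path ==
           (uintptr_t)file + offsetof(struct mock_file, f_path);
}

/*
 * The two checks from the debug patch that can actually trigger:
 * a NULL dentry, or a NULL d_inode (returned as-is to the caller).
 */
static struct mock_inode *inode_or_null(const struct mock_file *file)
{
    assert(file != NULL);
    if (!file->f_path.dentry)
        return NULL;
    return file->f_path.dentry->d_inode;
}
```

Note that inode_or_null() returns any non-NULL d_inode unchanged, so a
corrupted pointer such as 0x8 would pass these checks just like a valid
one. That would be consistent with none of the printk lines showing up
in the logs.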

2010-04-30 17:13:57

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Thu, 29 Apr 2010, Xianghua Xiao wrote:
> I confirm that 2.6.33.3-rt15 worked fine, reboot/umount will not oops.

Just pushed out 2.6.33.3-rt17 which has the problem fixed. Can you
please verify that it works for you as well ?

Thanks,

tglx

2010-04-30 17:49:19

by Xianghua Xiao

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Fri, Apr 30, 2010 at 5:02 AM, Thomas Gleixner <[email protected]> wrote:
> On Thu, 29 Apr 2010, Xianghua Xiao wrote:
>> I confirm that 2.6.33.3-rt15 worked fine, reboot/umount will not oops.
>
> Just pushed out 2.6.33.3-rt17 which has the problem fixed. Can you
> please verify that it works for you as well ?
>
> Thanks,
>
>        tglx
>

Yes it works fine now.
Thanks a lot for the efforts!
Xianghua

2010-04-30 18:51:23

by Thomas Gleixner

Subject: Re: 2.6.33.3-rt16 Oops caused by umount

On Wed, 28 Apr 2010, Xianghua Xiao wrote:
> Here it is. Just in case, I also attached the related source file:
> addr2line -e vmlinux c009ded8 c009de88
> /home/xxiao/xxiao/linux-2.6.33.3/fs/file_table.c:436

The code line in all traces is

if (inode->i_nlink == 0)

and interestingly enough the inode pointer is not NULL. In the various
traces it holds random values: 0x00000008, 0x00000010, 0x2e657468,
0x31c5547a

Can you please verify whether 2.6.33.3-rt15 works ?

Thanks,

tglx
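
The relation between the pointer values listed above and the faulting
addresses in the oops logs can be checked mechanically. The sketch below
is illustrative and not part of the thread. It decodes the instruction
word marked <80090028> in each "Instruction dump" as a PowerPC D-form
load (lwz r0,0x28(r9)) and adds the displacement to the GPR09 value from
the same trace. The helper names are made up for this example, and
reading 0x28 as the offset of i_nlink within struct inode on this build
is an assumption that the thread does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Base register field (RA) of a PowerPC D-form instruction word. */
static unsigned int lwz_base_reg(uint32_t insn)
{
    return (insn >> 16) & 0x1f;
}

/* Sign-extended 16-bit displacement (D) of a D-form instruction word. */
static int32_t lwz_displacement(uint32_t insn)
{
    return (int16_t)(insn & 0xffff);
}

/*
 * Effective address of "lwz rT, D(rA)" given the value held in rA.
 * The primary opcode must be 32 (lwz); rA == 0 would mean "no base
 * register", a case this sketch does not handle.
 */
static uint32_t lwz_effective_address(uint32_t insn, uint32_t base)
{
    assert((insn >> 26) == 32);
    assert(lwz_base_reg(insn) != 0);
    return base + (uint32_t)lwz_displacement(insn);
}
```

With the GPR09 values from the four traces (0x00000008, 0x00000010,
0x2e657468, 0x31c5547a), lwz_effective_address(0x80090028, ...) yields
0x00000030, 0x00000038, 0x2e657490 and 0x31c554a2, which match the DAR
lines of the corresponding oops reports.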