2003-02-11 20:52:09

by Dave Jones

Subject: jfs breakage in 2.5.60

Running fsx & fsstress in parallel gets a load of oopsen in the
logs, here's the first one..

Dave

Feb 12 04:30:16 mesh kernel: BUG at fs/jfs/jfs_logmgr.c:1438 assert(log->cqueue.head == NULL)
Feb 12 04:30:16 mesh kernel: ------------[ cut here ]------------
Feb 12 04:30:16 mesh kernel: kernel BUG at fs/jfs/jfs_logmgr.c:1438!
Feb 12 04:30:16 mesh kernel: invalid operand: 0000
Feb 12 04:30:16 mesh kernel: CPU: 0
Feb 12 04:30:16 mesh kernel: EIP: 0060:[<c02963ee>] Not tainted
Feb 12 04:30:16 mesh kernel: EFLAGS: 00010296
Feb 12 04:30:16 mesh kernel: EIP is at jfs_flush_journal+0x1de/0x2b0
Feb 12 04:30:16 mesh kernel: eax: 00000044 ebx: 00000320 ecx: c3772680 edx: c0578678
Feb 12 04:30:16 mesh kernel: esi: c3742000 edi: c72fea00 ebp: c3743f70 esp: c3743f44
Feb 12 04:30:16 mesh kernel: ds: 007b es: 007b ss: 0068
Feb 12 04:30:16 mesh kernel: Process fsstress (pid: 1427, threadinfo=c3742000 task=c3773980)
Feb 12 04:30:16 mesh kernel: Stack: c0518308 c0518735 0000059e c0518764 c10bbeb8 c3742000 c1287994 c72feae4
Feb 12 04:30:16 mesh kernel: c72fec40 c72fec00 00000001 c3743f80 c02774d5 c72fea00 00000001 c3743fb0
Feb 12 04:30:16 mesh kernel: c017052e c72fec00 00000001 c1287994 c72fec00 4000988c c3743fb0 c0193125
Feb 12 04:30:16 mesh kernel: Call Trace:
Feb 12 04:30:16 mesh kernel: [<c02774d5>] jfs_sync_fs+0x25/0x30
Feb 12 04:30:16 mesh kernel: [<c017052e>] sync_filesystems+0x2fe/0x450
Feb 12 04:30:16 mesh kernel: [<c0193125>] sync_inodes+0x25/0xb0
Feb 12 04:30:16 mesh kernel: [<c01694fb>] sys_sync+0x3b/0x50
Feb 12 04:30:16 mesh kernel: [<c010a5f7>] syscall_call+0x7/0xb
Feb 12 04:30:16 mesh kernel:
Feb 12 04:30:16 mesh kernel: Code: 0f 0b 9e 05 35 87 51 c0 eb 8c e8 93 8b e8 ff e9 15 ff ff ff


--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs


2003-02-11 21:30:29

by Dave Kleikamp

Subject: Re: jfs breakage in 2.5.60

On Tuesday 11 February 2003 14:57, Dave Jones wrote:
> Running fsx & fsstress in parallel gets a load of oopsen in the
> logs, here's the first one..
>
> Dave

Thanks Dave,
I think I know what the problem is here. I tried to adapt some code I
used at unmount for the sync_fs call, but I screwed up. The routine is
waiting until all the journal writes have completed but it doesn't
anticipate (or prevent) new activity. The code times out after a while
because a BUG at unmount time is easier to track down than a hang.
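[Editorial sketch, not part of the original message.] The failure mode described above can be modelled in user space: a flush routine waits a bounded time for a queue to drain, which succeeds when nothing else is writing (as at unmount) but not when other tasks keep queueing work (as during sync under fsstress). Every name here (`sim_log`, `sim_tick`, `sim_flush`) is hypothetical and the tick-based timing is an assumption made for illustration; this is not the JFS code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a journal commit queue. */
struct sim_log {
    int pending;       /* writes queued but not yet completed */
    int new_per_tick;  /* writes other tasks enqueue per tick (0 = quiesced) */
};

/* One tick of simulated time: one write completes, new writes may arrive. */
void sim_tick(struct sim_log *log)
{
    if (log->pending > 0)
        log->pending--;
    log->pending += log->new_per_tick;
}

/*
 * Wait at most max_ticks for the queue to drain.
 * Returns true if the queue emptied. A false return corresponds to the
 * point where a "queue must be empty" assertion would fire after the timeout.
 */
bool sim_flush(struct sim_log *log, int max_ticks)
{
    for (int i = 0; i < max_ticks && log->pending > 0; i++)
        sim_tick(log);
    return log->pending == 0;
}
```

With `new_per_tick` set to 0 the queue drains within the limit; with steady new activity it never empties, so an assertion placed after the wait would trip even though nothing is actually hung.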

I'll work on a real patch, but this should work in the meantime.
(Pardon the compiler warning.)

===== fs/jfs/super.c 1.33 vs edited =====
--- 1.33/fs/jfs/super.c Fri Jan 17 14:17:14 2003
+++ edited/fs/jfs/super.c Tue Feb 11 15:36:25 2003
@@ -396,7 +396,6 @@
 	.write_inode = jfs_write_inode,
 	.delete_inode = jfs_delete_inode,
 	.put_super = jfs_put_super,
-	.sync_fs = jfs_sync_fs,
 	.write_super_lockfs = jfs_write_super_lockfs,
 	.unlockfs = jfs_unlockfs,
 	.statfs = jfs_statfs,
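
[Editorial sketch, not part of the original message.] The workaround relies on `sync_fs` being an optional entry in the operations table: once the pointer is left unset, a caller that checks for NULL simply skips the hook, so the flush path is never entered from sync. The user-space model below shows that pattern. The types and functions (`sim_sb`, `sim_super_ops`, `sim_sync_one`) are hypothetical, and the NULL check on the caller side is an assumption about how such a table is consumed, not a quote of the kernel source.

```c
#include <stddef.h>

struct sim_sb;

/* Hypothetical operations table with an optional sync hook. */
struct sim_super_ops {
    void (*put_super)(struct sim_sb *sb);
    int  (*sync_fs)(struct sim_sb *sb, int wait);  /* may be NULL */
};

struct sim_sb {
    const struct sim_super_ops *ops;
    int sync_calls;  /* counts how often the hook actually ran */
};

/* Example hook: only records that it was called. */
int sim_sync_fs(struct sim_sb *sb, int wait)
{
    (void)wait;
    sb->sync_calls++;
    return 0;
}

/*
 * Caller side: invoke the hook only if the filesystem provides one.
 * Returns 1 if the hook ran, 0 if it was skipped.
 */
int sim_sync_one(struct sim_sb *sb, int wait)
{
    if (sb->ops->sync_fs == NULL)
        return 0;
    sb->ops->sync_fs(sb, wait);
    return 1;
}
```

Dropping the initializer for the hook, as the patch does for `.sync_fs`, leaves the member NULL in a static struct, which is what makes the one-line removal sufficient as a stopgap in this model.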

--
David Kleikamp
IBM Linux Technology Center