I got this oops running the following:
mkreiserfs /dev/hdb3
mount /dev/hdb3 /test
mkdir /test/d
cd /test
`fsx-linux -c 2 linux-2.4.20.tar.bz2' on a console and
`fsstress -d -d -n 10000 -p 10' on a second console.
The oops happened just a few seconds after I ran `vmstat 1'
on a third console. It is reproducible.
ksymoops 2.4.8 on i686 2.4.21-pre4. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.21-pre4/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
Mar 11 21:46:45 odyssey kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000000d
Mar 11 21:46:45 odyssey kernel: c0172a60
Mar 11 21:46:45 odyssey kernel: *pde = 00000000
Mar 11 21:46:45 odyssey kernel: Oops: 0000
Mar 11 21:46:45 odyssey kernel: CPU: 0
Mar 11 21:46:45 odyssey kernel: EIP: 0010:[<c0172a60>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Mar 11 21:46:45 odyssey kernel: EFLAGS: 00010256
Mar 11 21:46:45 odyssey kernel: eax: 00000000 ebx: 00000000 ecx: 00000000 edx: 00000001
Mar 11 21:46:45 odyssey kernel: esi: 00000000 edi: ddf87d7c ebp: 00000001 esp: ddf87cf4
Mar 11 21:46:45 odyssey kernel: ds: 0018 es: 0018 ss: 0018
Mar 11 21:46:45 odyssey kernel: Process fsstress (pid: 6617, stackpage=ddf87000)
Mar 11 21:46:45 odyssey kernel: Stack: 00000001 ddf87e80 ddf87d7c ddf87e38 00000001 00001000 c017355a cddb37c0
Mar 11 21:46:45 odyssey kernel: ddf87e80 00000001 00000000 ddf87db0 dc8c4a00 00000037 ddf87df8 ddf87e80
Mar 11 21:46:45 odyssey kernel: 00000000 00000100 cb577018 ddf87db0 ddf87d78 00000000 00000001 00000000
Mar 11 21:46:45 odyssey kernel: Call Trace: [<c017355a>] [<c01729da>] [<c01329c1>] [<c0120343>] [<c0120182>]
Mar 11 21:46:45 odyssey kernel: [<c01202c1>] [<c0175845>] [<c01729c0>] [<c0123cee>] [<c0125b30>] [<c012f986>]
Mar 11 21:46:45 odyssey kernel: [<c0106d27>]
Mar 11 21:46:45 odyssey kernel: Code: 3b 42 0c 74 2b 8b 44 24 1c 8b 90 ac 00 00 00 8b 42 30 50 51
>>EIP; c0172a60 <convert_tail_for_hole+60/110> <=====
>>edi; ddf87d7c <_end+1dcc0124/2056e408>
>>esp; ddf87cf4 <_end+1dcc009c/2056e408>
Trace; c017355a <reiserfs_get_block+a4a/e40>
Trace; c01729da <reiserfs_get_block_direct_io+1a/40>
Trace; c01329c1 <generic_direct_IO+c1/130>
Trace; c0120343 <mark_dirty_kiobuf+33/60>
Trace; c0120182 <get_user_pages+f2/180>
Trace; c01202c1 <map_user_kiobuf+b1/100>
Trace; c0175845 <reiserfs_direct_io+25/30>
Trace; c01729c0 <reiserfs_get_block_direct_io+0/40>
Trace; c0123cee <generic_file_direct_IO+1ae/230>
Trace; c0125b30 <generic_file_write+670/710>
Trace; c012f986 <sys_write+96/f0>
Trace; c0106d27 <system_call+33/38>
Code; c0172a60 <convert_tail_for_hole+60/110>
00000000 <_EIP>:
Code; c0172a60 <convert_tail_for_hole+60/110> <=====
0: 3b 42 0c cmp 0xc(%edx),%eax <=====
Code; c0172a63 <convert_tail_for_hole+63/110>
3: 74 2b je 30 <_EIP+0x30> c0172a90 <convert_tail_for_hole+90/110>
Code; c0172a65 <convert_tail_for_hole+65/110>
5: 8b 44 24 1c mov 0x1c(%esp,1),%eax
Code; c0172a69 <convert_tail_for_hole+69/110>
9: 8b 90 ac 00 00 00 mov 0xac(%eax),%edx
Code; c0172a6f <convert_tail_for_hole+6f/110>
f: 8b 42 30 mov 0x30(%edx),%eax
Code; c0172a72 <convert_tail_for_hole+72/110>
12: 50 push %eax
Code; c0172a73 <convert_tail_for_hole+73/110>
13: 51 push %ecx
1 warning issued. Results may not be reliable.
Hello!
On Tue, Mar 11, 2003 at 10:16:12PM +0000, Lorenzo Allegrucci wrote:
> `fsx-linux -c 2 linux-2.4.20.tar.bz2' on a console and
> `fsstress -d -d -n 10000 -p 10' on a second console.
> The oops happened just a few seconds after I ran `vmstat 1'
> on a third console. It is reproducible.
Ah, this looks like a known direct I/O problem.
There was a direct I/O fix in 2.4.21-pre5, but
some time later people at IBM made us aware of another direct I/O problem;
see the patch below (it should help in your case).
Please try 2.4.21-pre5 plus this patch, and if it still does not work, please tell us.
Thank you. (Note that this patch has not yet passed through our full patch verification
process, but it works for me and for Chris Mason, who wrote most of the code in the patch.)
Bye,
Oleg
===== fs/reiserfs/inode.c 1.42 vs edited =====
--- 1.42/fs/reiserfs/inode.c Thu Feb 13 15:42:42 2003
+++ edited/fs/reiserfs/inode.c Fri Mar 7 09:54:11 2003
@@ -469,7 +469,7 @@
tail_end = (tail_start | (bh_result->b_size - 1)) + 1 ;
index = tail_offset >> PAGE_CACHE_SHIFT ;
- if (index != hole_page->index) {
+ if ( !hole_page || index != hole_page->index) {
tail_page = grab_cache_page(inode->i_mapping, index) ;
retval = -ENOMEM;
if (!tail_page) {
@@ -1810,7 +1810,12 @@
flush_dcache_page(page) ;
kunmap(page) ;
if (buffer_mapped(bh) && bh->b_blocknr != 0) {
- mark_buffer_dirty(bh) ;
+ if (!atomic_set_buffer_dirty(bh)) {
+ set_buffer_flushtime(bh);
+ refile_buffer(bh);
+ buffer_insert_inode_data_queue(bh, p_s_inode);
+ balance_dirty();
+ }
}
}
UnlockPage(page) ;
@@ -2158,6 +2163,9 @@
struct kiobuf *iobuf, unsigned long blocknr,
int blocksize)
{
+ lock_kernel();
+ reiserfs_commit_for_tail(inode);
+ unlock_kernel();
return generic_direct_IO(rw, inode, iobuf, blocknr, blocksize,
reiserfs_get_block_direct_io) ;
}
===== fs/reiserfs/journal.c 1.25 vs edited =====
--- 1.25/fs/reiserfs/journal.c Tue Aug 20 15:39:48 2002
+++ edited/fs/reiserfs/journal.c Fri Mar 7 09:54:14 2003
@@ -2649,32 +2649,52 @@
inode->u.reiserfs_i.i_trans_id = SB_JOURNAL(inode->i_sb)->j_trans_id ;
}
-static int reiserfs_inode_in_this_transaction(struct inode *inode) {
- if (inode->u.reiserfs_i.i_trans_id == SB_JOURNAL(inode->i_sb)->j_trans_id ||
- inode->u.reiserfs_i.i_trans_id == 0) {
- return 1;
- }
- return 0 ;
+void reiserfs_update_tail_transaction(struct inode *inode) {
+
+ inode->u.reiserfs_i.i_tail_trans_index = SB_JOURNAL_LIST_INDEX(inode->i_sb);
+
+ inode->u.reiserfs_i.i_tail_trans_id = SB_JOURNAL(inode->i_sb)->j_trans_id ;
+}
+
+static void __commit_trans_index(struct inode *inode, unsigned long id,
+ unsigned long index)
+{
+ struct reiserfs_journal_list *jl ;
+ struct reiserfs_transaction_handle th ;
+ struct super_block *sb = inode->i_sb ;
+
+ jl = SB_JOURNAL_LIST(sb) + index;
+
+ /* is it from the current transaction, or from an unknown transaction? */
+ if (id == SB_JOURNAL(sb)->j_trans_id) {
+ journal_join(&th, sb, 1) ;
+ journal_end_sync(&th, sb, 1) ;
+ } else if (jl->j_trans_id == id) {
+ flush_commit_list(sb, jl, 1) ;
+ }
+ /* if the transaction id does not match, this list is long since flushed
+ ** and we don't have to do anything here
+ */
}
+void reiserfs_commit_for_tail(struct inode *inode) {
+ unsigned long id = inode->u.reiserfs_i.i_tail_trans_id;
+ unsigned long index = inode->u.reiserfs_i.i_tail_trans_index;
+ /* for tails, if this info is unset there's nothing to commit */
+ if (id && index)
+ __commit_trans_index(inode, id, index);
+}
void reiserfs_commit_for_inode(struct inode *inode) {
- struct reiserfs_journal_list *jl ;
- struct reiserfs_transaction_handle th ;
- struct super_block *sb = inode->i_sb ;
-
- jl = SB_JOURNAL_LIST(sb) + inode->u.reiserfs_i.i_trans_index ;
-
- /* is it from the current transaction, or from an unknown transaction? */
- if (reiserfs_inode_in_this_transaction(inode)) {
- journal_join(&th, sb, 1) ;
- reiserfs_update_inode_transaction(inode) ;
- journal_end_sync(&th, sb, 1) ;
- } else if (jl->j_trans_id == inode->u.reiserfs_i.i_trans_id) {
- flush_commit_list(sb, jl, 1) ;
- }
- /* if the transaction id does not match, this list is long since flushed
- ** and we don't have to do anything here
- */
+ unsigned long id = inode->u.reiserfs_i.i_trans_id;
+ unsigned long index = inode->u.reiserfs_i.i_trans_index;
+
+ /* for the whole inode, assume unset id or index means it was
+ * changed in the current transaction. More conservative
+ */
+ if (!id || !index)
+ reiserfs_update_inode_transaction(inode) ;
+
+ __commit_trans_index(inode, id, index);
}
void reiserfs_restore_prepared_buffer(struct super_block *p_s_sb,
===== fs/reiserfs/tail_conversion.c 1.16 vs edited =====
--- 1.16/fs/reiserfs/tail_conversion.c Thu Feb 13 15:42:42 2003
+++ edited/fs/reiserfs/tail_conversion.c Fri Mar 7 09:54:15 2003
@@ -133,6 +133,7 @@
inode->u.reiserfs_i.i_first_direct_byte = U32_MAX;
+ reiserfs_update_tail_transaction(inode);
return 0;
}
===== include/linux/reiserfs_fs.h 1.26 vs edited =====
--- 1.26/include/linux/reiserfs_fs.h Mon Jan 20 13:19:30 2003
+++ edited/include/linux/reiserfs_fs.h Fri Mar 7 09:58:17 2003
@@ -1558,7 +1558,9 @@
#define JOURNAL_BUFFER(j,n) ((j)->j_ap_blocks[((j)->j_start + (n)) % JOURNAL_BLOCK_COUNT])
void reiserfs_commit_for_inode(struct inode *) ;
+void reiserfs_commit_for_tail(struct inode *) ;
void reiserfs_update_inode_transaction(struct inode *) ;
+void reiserfs_update_tail_transaction(struct inode *) ;
void reiserfs_wait_on_write_block(struct super_block *s) ;
void reiserfs_block_writes(struct reiserfs_transaction_handle *th) ;
void reiserfs_allow_writes(struct super_block *s) ;
===== include/linux/reiserfs_fs_i.h 1.8 vs edited =====
--- 1.8/include/linux/reiserfs_fs_i.h Fri Aug 9 19:22:34 2002
+++ edited/include/linux/reiserfs_fs_i.h Fri Mar 7 09:54:16 2003
@@ -53,6 +53,13 @@
** flushed */
unsigned long i_trans_id ;
unsigned long i_trans_index ;
+
+ /* direct io needs to make sure the tail is on disk to avoid
+ * buffer alias problems. This records the transaction last
+ * involved in a direct->indirect conversion for this file
+ */
+ unsigned long i_tail_trans_id;
+ unsigned long i_tail_trans_index;
};
#endif