LinuxLists.cc - jbd commit data buffers EIO error

2008-11-26 00:18:06

Subject: jbd commit data buffers EIO error

Eric and I are looking at an EIO error in journal_submit_data_buffers()
in commit.c.

If the buffer is unlocked and not dirty, journal_submit_data_buffers()
used to assume that somebody else (pdflush) had already flushed the dirty
data to disk. Now it also checks the buffer's uptodate bit, to try to catch
a disk IO error if the buffer was submitted by pdflush. This behavior was
introduced with
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=cbe5f466f6995e10a10c7ae66d6dc8608f08a6b8

We consistently hit this EIO in the blocksize < pagesize case, on multiple
archs (ppc64, and x86 with 1k block size), with the fsstress test, where
there is no real disk IO error.

journal_submit_data_buffers() in commit.c

} else if (!locked && buffer_locked(bh)) {
        __journal_file_buffer(jh, commit_transaction,
                                BJ_Locked);
        jbd_unlock_bh_state(bh);
        put_bh(bh);
} else {
        BUFFER_TRACE(bh, "writeout complete: unfile");
        if (unlikely(!buffer_uptodate(bh)))
------->        err = -EIO;
        __journal_unfile_buffer(jh);
        jbd_unlock_bh_state(bh);
        if (locked)
                unlock_buffer(bh);
        journal_remove_journal_head(bh);
        /* One for our safety reference, other for
         * journal_remove_journal_head() */
        put_bh(bh);
        release_data_buffer(bh);
}

And it prints a false warning message:

JBD: Detected IO errors while flushing file data on sdb4

The buffer heads were attached to the sync_data_list from
ext3_ordered_writepage()->journal_dirty_data_fn, which is called on every
buffer head of that dirty page. But I saw many cases where a buffer is not
dirty but is still added to the journal dirty data buffer list (when block
size < pagesize). Is this the right thing to do? If the attached buffer is
not dirty, why does jbd still need to keep track of it?

In the fsstress test case, the debug info shows that buffers attached to
sync_data_list could originally be mapped & not dirty & not uptodate &
not locked (not sure what kind of bh is in that state?). That confuses
journal_submit_data_buffers(), which then returns a false EIO alert
(as the bh is !uptodate).
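
To make the state above concrete, here is a minimal userspace sketch of
the decision, not kernel code. The struct, enum and function names are
made up for illustration, and the branch order is a simplification of the
excerpt from commit.c (the real code also takes the buffer lock before
submitting).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the buffer state bits discussed above.
 * This is NOT struct buffer_head, only a model for illustration. */
struct model_bh {
    bool mapped;
    bool dirty;
    bool uptodate;
    bool locked;
};

/* Possible outcomes for one buffer on the sync data list
 * (hypothetical names). */
enum model_action {
    MODEL_SUBMIT,      /* dirty: commit writes it out itself */
    MODEL_FILE_LOCKED, /* locked by someone else: wait on it later */
    MODEL_UNFILE       /* clean and unlocked: assumed already written */
};

/* Simplified version of the branches in the excerpt. *err is set to
 * -EIO when a buffer that is being unfiled is not uptodate. */
static enum model_action model_submit_one(const struct model_bh *bh, int *err)
{
    assert(bh != NULL && err != NULL);

    if (bh->dirty)
        return MODEL_SUBMIT;
    if (bh->locked)
        return MODEL_FILE_LOCKED;
    if (!bh->uptodate)
        *err = -EIO;   /* taken even if no write was ever issued */
    return MODEL_UNFILE;
}
```

In this model, a buffer that is mapped, not dirty, not uptodate and not
locked falls through to the last branch and sets -EIO, although nothing
was ever written for it.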

Adding a hack in journal_dirty_data_fn() to check the buffer dirty bit
fixed the false alert issue. It seems too easy, so I want to make sure
that's the right thing to do.
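
Roughly, the idea of the hack can be modeled in userspace like this. It
is only a sketch and not the actual patch: the struct and function names
are hypothetical, and the existing check on the mapped bit is an
assumption about what the callback already does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of one buffer of a page (not struct buffer_head). */
struct sketch_bh {
    bool mapped;
    bool dirty;
    bool filed;   /* true once put on the modeled sync data list */
};

/* Hypothetical model of the per-buffer callback. Returns 1 if the
 * buffer was filed, 0 if it was skipped. With check_dirty set, clean
 * buffers are not filed, which is the hack described above. */
static int sketch_dirty_data_fn(struct sketch_bh *bh, bool check_dirty)
{
    assert(bh != NULL);

    if (!bh->mapped)                 /* assumed pre-existing check */
        return 0;
    if (check_dirty && !bh->dirty)   /* the added dirty-bit check */
        return 0;
    bh->filed = true;
    return 1;
}

/* Walk all buffers of one page, as is done for every buffer head of a
 * dirty page. Returns how many buffers ended up filed. */
static size_t sketch_file_page(struct sketch_bh *bhs, size_t n,
                               bool check_dirty)
{
    size_t filed = 0;

    for (size_t i = 0; i < n; i++)
        filed += (size_t)sketch_dirty_data_fn(&bhs[i], check_dirty);
    return filed;
}
```

For a page with four 1k buffers of which only one is dirty, the model
files all four buffers without the check and only the dirty one with it.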

Thanks,
Mingming