2006-05-02 18:46:50

by Jan Kara

Subject: [PATCH] JBD checkpoint cleanup strikes back

Hello,

attached is a new version of the patch splitting checkpoint lists into
two. The motivation is twofold: to prevent possible false assertion
failures when we fail to notice progress while processing the checkpoint
list, and to simplify checkpoint list handling.
The patch was already in 2.6.16-rc3 but was dropped because of
problems with OCFS2. I've now tracked down the problem - OCFS2 relies on
the fact that buffer does not have journal_head if it is not on any
transaction's list. It assumes that if the buffer has buffer_jbd set,
then it is journaled and hence the node has uptodate data in the
buffer. My patch broke that assumption as in one path I forgot to call
journal_remove_journal_head() and hence I was leaving behind some
buffers not attached to any transaction but with journal_head. Now that
leak is fixed and OCFS2 seems to work fine also with my patch.
Andrew, could you please put the patch into -mm? Thanks.
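
For readers following the OCFS2 problem described above, here is a toy model of the invariant, as a sketch only. It is not kernel code: the `toy_*` types and functions are hypothetical stand-ins, and it assumes that "has a journal_head" and "buffer_jbd is set" can be treated as the same condition. It shows how dropping a buffer from a checkpoint list without the cleanup step leaves a buffer that still looks journaled although it is on no transaction's list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins; NOT the kernel's struct journal_head / struct buffer_head. */
struct toy_journal_head {
    void *transaction;    /* running or committing transaction, or NULL */
    void *cp_transaction; /* transaction whose checkpoint list holds us, or NULL */
};

struct toy_buffer {
    struct toy_journal_head *jh; /* non-NULL plays the role of buffer_jbd() */
};

/* Assumption of this model: "jbd bit set" is the same as "has a journal head". */
static bool toy_buffer_jbd(const struct toy_buffer *bh)
{
    return bh->jh != NULL;
}

static bool toy_on_some_list(const struct toy_buffer *bh)
{
    return bh->jh != NULL &&
           (bh->jh->transaction != NULL || bh->jh->cp_transaction != NULL);
}

/* The invariant OCFS2 is described as relying on:
 * a buffer that looks journaled must be on some transaction's list. */
static bool toy_invariant_holds(const struct toy_buffer *bh)
{
    return !toy_buffer_jbd(bh) || toy_on_some_list(bh);
}

/* Hypothetical analogue of journal_remove_journal_head():
 * detach the journal head once no transaction references it. */
static void toy_remove_journal_head(struct toy_buffer *bh)
{
    if (bh->jh != NULL && !toy_on_some_list(bh))
        bh->jh = NULL;
}

/* Drop a buffer from its checkpoint list. With cleanup == false this
 * models the path that forgot the cleanup call. */
static void toy_drop_from_checkpoint(struct toy_buffer *bh, bool cleanup)
{
    assert(bh->jh != NULL);
    bh->jh->cp_transaction = NULL;
    if (cleanup)
        toy_remove_journal_head(bh);
}

/* Exercise both paths; returns 0 if the model behaves as described. */
int toy_demo(void)
{
    int txn = 0; /* dummy object standing in for a transaction */
    struct toy_journal_head jh_a = { NULL, &txn };
    struct toy_journal_head jh_b = { NULL, &txn };
    struct toy_buffer leaked = { &jh_a };
    struct toy_buffer clean = { &jh_b };

    toy_drop_from_checkpoint(&leaked, false);
    toy_drop_from_checkpoint(&clean, true);

    printf("without cleanup: jbd=%d invariant=%d\n",
           toy_buffer_jbd(&leaked), toy_invariant_holds(&leaked));
    printf("with cleanup:    jbd=%d invariant=%d\n",
           toy_buffer_jbd(&clean), toy_invariant_holds(&clean));

    return (!toy_invariant_holds(&leaked) && toy_invariant_holds(&clean)) ? 0 : 1;
}
```

Calling `toy_demo()` from a small `main` prints the state of both buffers. In the model, only the path with the cleanup step leaves the buffer without a journal head, which is the state a filesystem checking the jbd flag would expect.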

Honza

--
Jan Kara <[email protected]>
SuSE CR Labs


Attachments:
(No filename) (994.00 B)
jbd-2.6.17-rc3-1-checkpoint_list_split.diff (19.16 kB)

2006-05-02 23:47:37

by Mark Fasheh

Subject: Re: [PATCH] JBD checkpoint cleanup strikes back

On Tue, May 02, 2006 at 08:46:46PM +0200, Jan Kara wrote:
> The patch was already in 2.6.16-rc3 but was dropped because of
> problems with OCFS2. I've now tracked down the problem - OCFS2 relies on
> the fact that buffer does not have journal_head if it is not on any
> transaction's list. It assumes that if the buffer has buffer_jbd set,
> then it is journaled and hence the node has uptodate data in the
> buffer. My patch broke that assumption as in one path I forgot to call
> journal_remove_journal_head() and hence I was leaving behind some
> buffers not attached to any transaction but with journal_head. Now that
> leak is fixed and OCFS2 seems to work fine also with my patch.
Ahh, ok that makes sense and it definitely sounds like the type of thing
that would cause OCFS2 to pick up stale data.

> Andrew, could you please put the patch into -mm? Thanks.
Without commenting any further on the patch, I can definitely offer up some
more testing on my end should Andrew decide to pick this up.
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
[email protected]

2006-05-03 10:25:54

by Jan Kara

Subject: Re: [PATCH] JBD checkpoint cleanup strikes back

> On Tue, May 02, 2006 at 08:46:46PM +0200, Jan Kara wrote:
> > The patch was already in 2.6.16-rc3 but was dropped because of
> > problems with OCFS2. I've now tracked down the problem - OCFS2 relies on
> > the fact that buffer does not have journal_head if it is not on any
> > transaction's list. It assumes that if the buffer has buffer_jbd set,
> > then it is journaled and hence the node has uptodate data in the
> > buffer. My patch broke that assumption as in one path I forgot to call
> > journal_remove_journal_head() and hence I was leaving behind some
> > buffers not attached to any transaction but with journal_head. Now that
> > leak is fixed and OCFS2 seems to work fine also with my patch.
> Ahh, ok that makes sense and it definitely sounds like the type of thing
> that would cause OCFS2 to pick up stale data.
>
> > Andrew, could you please put the patch into -mm? Thanks.
> Without commenting any further on the patch, I can definitely offer up some
> more testing on my end should Andrew decide to pick this up.
Testing is always welcome :). Thanks. I did some basic testing myself, but
the more tests the better.

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs