2008-01-31 16:19:28

by Josef Bacik

[permalink] [raw]
Subject: [PATCH] jbd: fix assertion failure in journal_next_log_block

Hello,

The way jbd tries to determine if there is enough space left on the journal in
order to start a new transaction is looking at the space left in the journal and
the space needed for the committing transaction, which is 1/4 of the journal +
the number of t_outstanding_credits for that transaction. In this case its
assumed that t_outstanding_credits accurately represents the number of credits
waiting to be written to the journal, but this sometimes isn't the case. The
transaction has two counters for buffers, t_outstanding_credits which is used in
conjunction with handles that are added to the transaction, and t_nr_buffers
which is only incremented/decremented when buffers are added/removed from the
transaction and are actually destined to be journaled. Normally these two
counters are the same, however there are cases where the committing transaction
can have buffers moved to the next running transaction, for example any buffers
on the committing transactions t_reserved list would be moved to the next
(running) transaction, and if it had been dirtied in the process it would
immediately make it onto the t_updates list, which would increment t_nr_buffers
but not t_outstanding_credits. So you get into this situation where
t_nr_buffers (the actual number of buffers that are on the transaction) is
greater than the number of buffers accounted for via t_outstanding_credits.
This presents a problem since as we loop through writting buffers to the
journal, we decrement t_outstanding_credits, and if t_nr_buffers is more than
t_outstanding_credits then we end up with a negative number for
t_outstanding_credits, which means we start saying we need less than 1/4 of the
journal for our committing transaction and allow more transactions than we can
handle to start, and then bam we fail because journal_next_log_block doesn't
have enough free blocks in order to handle the request. This has been tested
and fixes the issue (which could not be reproduced by me but several other
people could get it to reproduce using postmark), and although I couldn't
reproduce the assertion, I could very easily reproduce the situation where
t_outstanding_credits was < than t_nr_buffers.


Signed-off-by: Josef Bacik <[email protected]>


diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
index 31853eb..e385a5c 100644
--- a/fs/jbd/commit.c
+++ b/fs/jbd/commit.c
@@ -561,13 +561,6 @@ void journal_commit_transaction(journal_t *journal)
continue;
}

- /*
- * start_this_handle() uses t_outstanding_credits to determine
- * the free space in the log, but this counter is changed
- * by journal_next_log_block() also.
- */
- commit_transaction->t_outstanding_credits--;
-
/* Bump b_count to prevent truncate from stumbling over
the shadowed buffer! @@@ This can go if we ever get
rid of the BJ_IO/BJ_Shadow pairing of buffers. */
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 4f302d2..c0f93f5 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -580,7 +580,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
stats.u.run.rs_logging = jiffies;
stats.u.run.rs_flushing = jbd2_time_diff(stats.u.run.rs_flushing,
stats.u.run.rs_logging);
- stats.u.run.rs_blocks = commit_transaction->t_outstanding_credits;
+ stats.u.run.rs_blocks = commit_transaction->t_nr_buffers;
stats.u.run.rs_blocks_logged = 0;

descriptor = NULL;
@@ -655,13 +655,6 @@ void jbd2_journal_commit_transaction(journal_t *journal)
continue;
}

- /*
- * start_this_handle() uses t_outstanding_credits to determine
- * the free space in the log, but this counter is changed
- * by jbd2_journal_next_log_block() also.
- */
- commit_transaction->t_outstanding_credits--;
-
/* Bump b_count to prevent truncate from stumbling over
the shadowed buffer! @@@ This can go if we ever get
rid of the BJ_IO/BJ_Shadow pairing of buffers. */
diff --git a/include/linux/jbd.h b/include/linux/jbd.h
index d9ecd13..076d1fc 100644
--- a/include/linux/jbd.h
+++ b/include/linux/jbd.h
@@ -1056,7 +1056,7 @@ static inline int jbd_space_needed(journal_t *journal)
int nblocks = journal->j_max_transaction_buffers;
if (journal->j_committing_transaction)
nblocks += journal->j_committing_transaction->
- t_outstanding_credits;
+ t_nr_buffers;
return nblocks;
}

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 2cbf6fd..60800a0 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1168,7 +1168,7 @@ static inline int jbd_space_needed(journal_t *journal)
int nblocks = journal->j_max_transaction_buffers;
if (journal->j_committing_transaction)
nblocks += journal->j_committing_transaction->
- t_outstanding_credits;
+ t_nr_buffers;
return nblocks;
}



2008-01-31 19:35:48

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

On Jan 31, 2008 11:14 -0500, Josef Bacik wrote:
[snip excellent analysis]
> So you get into this situation where
> t_nr_buffers (the actual number of buffers that are on the transaction) is
> greater than the number of buffers accounted for via t_outstanding_credits.
> This presents a problem since as we loop through writting buffers to the
> journal, we decrement t_outstanding_credits, and if t_nr_buffers is more than
> t_outstanding_credits then we end up with a negative number for
> t_outstanding_credits
>
> Signed-off-by: Josef Bacik <[email protected]>

Do you know what kernel this problem was introduced in, or is this a
long standing problem? Presumably the same is needed for jbd2?

Once we have some decent amount of testing going on with ext4, I think
it makes sense to merge the jbd2 changes back into jbd and return to
a single code base, since there is nothing in the jbd2 code that ext3
can't also work with (i.e. all of the changes are properly isolated
with compatibility flags and such).

> @@ -1056,7 +1056,7 @@ static inline int jbd_space_needed(journal_t *journal)
> int nblocks = journal->j_max_transaction_buffers;
> if (journal->j_committing_transaction)
> nblocks += journal->j_committing_transaction->
> - t_outstanding_credits;
> + t_nr_buffers;

(trivial) this can be moved back onto the previous line.

> @@ -1168,7 +1168,7 @@ static inline int jbd_space_needed(journal_t *journal)
> int nblocks = journal->j_max_transaction_buffers;
> if (journal->j_committing_transaction)
> nblocks += journal->j_committing_transaction->
> - t_outstanding_credits;
> + t_nr_buffers;

Same...

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

2008-01-31 21:57:59

by Josef Bacik

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

On Thu, Jan 31, 2008 at 12:35:43PM -0700, Andreas Dilger wrote:
> On Jan 31, 2008 11:14 -0500, Josef Bacik wrote:
> [snip excellent analysis]
> > So you get into this situation where
> > t_nr_buffers (the actual number of buffers that are on the transaction) is
> > greater than the number of buffers accounted for via t_outstanding_credits.
> > This presents a problem since as we loop through writting buffers to the
> > journal, we decrement t_outstanding_credits, and if t_nr_buffers is more than
> > t_outstanding_credits then we end up with a negative number for
> > t_outstanding_credits
> >
> > Signed-off-by: Josef Bacik <[email protected]>
>
> Do you know what kernel this problem was introduced in, or is this a
> long standing problem? Presumably the same is needed for jbd2?
>
> Once we have some decent amount of testing going on with ext4, I think
> it makes sense to merge the jbd2 changes back into jbd and return to
> a single code base, since there is nothing in the jbd2 code that ext3
> can't also work with (i.e. all of the changes are properly isolated
> with compatibility flags and such).
>
> > @@ -1056,7 +1056,7 @@ static inline int jbd_space_needed(journal_t *journal)
> > int nblocks = journal->j_max_transaction_buffers;
> > if (journal->j_committing_transaction)
> > nblocks += journal->j_committing_transaction->
> > - t_outstanding_credits;
> > + t_nr_buffers;
>
> (trivial) this can be moved back onto the previous line.
>
> > @@ -1168,7 +1168,7 @@ static inline int jbd_space_needed(journal_t *journal)
> > int nblocks = journal->j_max_transaction_buffers;
> > if (journal->j_committing_transaction)
> > nblocks += journal->j_committing_transaction->
> > - t_outstanding_credits;
> > + t_nr_buffers;
>
> Same...
>

The original issue was reported on RHEL4, so thats 2.6.9, and looking through
the old-bkcvs git tree I can't see where this was introduced, so it's probably
existed before that. The same problem looks to exist in jbd2 though I haven't
tested it myself, I just went ahead and included the fixes. Here is the updated
patch, thanks much for the comments.

Signed-off-by: Josef Bacik <[email protected]>


diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
index 31853eb..e385a5c 100644
--- a/fs/jbd/commit.c
+++ b/fs/jbd/commit.c
@@ -561,13 +561,6 @@ void journal_commit_transaction(journal_t *journal)
continue;
}

- /*
- * start_this_handle() uses t_outstanding_credits to determine
- * the free space in the log, but this counter is changed
- * by journal_next_log_block() also.
- */
- commit_transaction->t_outstanding_credits--;
-
/* Bump b_count to prevent truncate from stumbling over
the shadowed buffer! @@@ This can go if we ever get
rid of the BJ_IO/BJ_Shadow pairing of buffers. */
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 4f302d2..c0f93f5 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -580,7 +580,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
stats.u.run.rs_logging = jiffies;
stats.u.run.rs_flushing = jbd2_time_diff(stats.u.run.rs_flushing,
stats.u.run.rs_logging);
- stats.u.run.rs_blocks = commit_transaction->t_outstanding_credits;
+ stats.u.run.rs_blocks = commit_transaction->t_nr_buffers;
stats.u.run.rs_blocks_logged = 0;

descriptor = NULL;
@@ -655,13 +655,6 @@ void jbd2_journal_commit_transaction(journal_t *journal)
continue;
}

- /*
- * start_this_handle() uses t_outstanding_credits to determine
- * the free space in the log, but this counter is changed
- * by jbd2_journal_next_log_block() also.
- */
- commit_transaction->t_outstanding_credits--;
-
/* Bump b_count to prevent truncate from stumbling over
the shadowed buffer! @@@ This can go if we ever get
rid of the BJ_IO/BJ_Shadow pairing of buffers. */
diff --git a/include/linux/jbd.h b/include/linux/jbd.h
index d9ecd13..eaeb3db 100644
--- a/include/linux/jbd.h
+++ b/include/linux/jbd.h
@@ -1055,8 +1055,7 @@ static inline int jbd_space_needed(journal_t *journal)
{
int nblocks = journal->j_max_transaction_buffers;
if (journal->j_committing_transaction)
- nblocks += journal->j_committing_transaction->
- t_outstanding_credits;
+ nblocks += journal->j_committing_transaction->t_nr_buffers;
return nblocks;
}

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 2cbf6fd..acf9d34 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1167,8 +1167,7 @@ static inline int jbd_space_needed(journal_t *journal)
{
int nblocks = journal->j_max_transaction_buffers;
if (journal->j_committing_transaction)
- nblocks += journal->j_committing_transaction->
- t_outstanding_credits;
+ nblocks += journal->j_committing_transaction->t_nr_buffers;
return nblocks;
}


2008-02-01 00:50:30

by Mingming Cao

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

On Thu, 2008-01-31 at 16:52 -0500, Josef Bacik wrote:
> On Thu, Jan 31, 2008 at 12:35:43PM -0700, Andreas Dilger wrote:
> > On Jan 31, 2008 11:14 -0500, Josef Bacik wrote:
> > [snip excellent analysis]
> > > So you get into this situation where
> > > t_nr_buffers (the actual number of buffers that are on the transaction) is
> > > greater than the number of buffers accounted for via t_outstanding_credits.
> > > This presents a problem since as we loop through writting buffers to the
> > > journal, we decrement t_outstanding_credits, and if t_nr_buffers is more than
> > > t_outstanding_credits then we end up with a negative number for
> > > t_outstanding_credits
> > >
> > > Signed-off-by: Josef Bacik <[email protected]>
> >
> > Do you know what kernel this problem was introduced in, or is this a
> > long standing problem? Presumably the same is needed for jbd2?
> >
> > Once we have some decent amount of testing going on with ext4, I think
> > it makes sense to merge the jbd2 changes back into jbd and return to
> > a single code base, since there is nothing in the jbd2 code that ext3
> > can't also work with (i.e. all of the changes are properly isolated
> > with compatibility flags and such).
> >
> > > @@ -1056,7 +1056,7 @@ static inline int jbd_space_needed(journal_t *journal)
> > > int nblocks = journal->j_max_transaction_buffers;
> > > if (journal->j_committing_transaction)
> > > nblocks += journal->j_committing_transaction->
> > > - t_outstanding_credits;
> > > + t_nr_buffers;
> >
> > (trivial) this can be moved back onto the previous line.
> >
> > > @@ -1168,7 +1168,7 @@ static inline int jbd_space_needed(journal_t *journal)
> > > int nblocks = journal->j_max_transaction_buffers;
> > > if (journal->j_committing_transaction)
> > > nblocks += journal->j_committing_transaction->
> > > - t_outstanding_credits;
> > > + t_nr_buffers;
> >
> > Same...
> >
>
> The original issue was reported on RHEL4, so thats 2.6.9, and looking through
> the old-bkcvs git tree I can't see where this was introduced, so it's probably
> existed before that. The same problem looks to exist in jbd2 though I haven't
> tested it myself, I just went ahead and included the fixes. Here is the updated
> patch, thanks much for the comments.
>

Added to ext4 patch queue.

Thanks,
Mingming
> Signed-off-by: Josef Bacik <[email protected]>
>
>
> diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
> index 31853eb..e385a5c 100644
> --- a/fs/jbd/commit.c
> +++ b/fs/jbd/commit.c
> @@ -561,13 +561,6 @@ void journal_commit_transaction(journal_t *journal)
> continue;
> }
>
> - /*
> - * start_this_handle() uses t_outstanding_credits to determine
> - * the free space in the log, but this counter is changed
> - * by journal_next_log_block() also.
> - */
> - commit_transaction->t_outstanding_credits--;
> -
> /* Bump b_count to prevent truncate from stumbling over
> the shadowed buffer! @@@ This can go if we ever get
> rid of the BJ_IO/BJ_Shadow pairing of buffers. */
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index 4f302d2..c0f93f5 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -580,7 +580,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> stats.u.run.rs_logging = jiffies;
> stats.u.run.rs_flushing = jbd2_time_diff(stats.u.run.rs_flushing,
> stats.u.run.rs_logging);
> - stats.u.run.rs_blocks = commit_transaction->t_outstanding_credits;
> + stats.u.run.rs_blocks = commit_transaction->t_nr_buffers;
> stats.u.run.rs_blocks_logged = 0;
>
> descriptor = NULL;
> @@ -655,13 +655,6 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> continue;
> }
>
> - /*
> - * start_this_handle() uses t_outstanding_credits to determine
> - * the free space in the log, but this counter is changed
> - * by jbd2_journal_next_log_block() also.
> - */
> - commit_transaction->t_outstanding_credits--;
> -
> /* Bump b_count to prevent truncate from stumbling over
> the shadowed buffer! @@@ This can go if we ever get
> rid of the BJ_IO/BJ_Shadow pairing of buffers. */
> diff --git a/include/linux/jbd.h b/include/linux/jbd.h
> index d9ecd13..eaeb3db 100644
> --- a/include/linux/jbd.h
> +++ b/include/linux/jbd.h
> @@ -1055,8 +1055,7 @@ static inline int jbd_space_needed(journal_t *journal)
> {
> int nblocks = journal->j_max_transaction_buffers;
> if (journal->j_committing_transaction)
> - nblocks += journal->j_committing_transaction->
> - t_outstanding_credits;
> + nblocks += journal->j_committing_transaction->t_nr_buffers;
> return nblocks;
> }
>
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 2cbf6fd..acf9d34 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1167,8 +1167,7 @@ static inline int jbd_space_needed(journal_t *journal)
> {
> int nblocks = journal->j_max_transaction_buffers;
> if (journal->j_committing_transaction)
> - nblocks += journal->j_committing_transaction->
> - t_outstanding_credits;
> + nblocks += journal->j_committing_transaction->t_nr_buffers;
> return nblocks;
> }
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-02-05 17:27:31

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

Hello,

Sorry for replying a bit late but I'm currently falling behind in
maling-list reading...

> The way jbd tries to determine if there is enough space left on the journal in
> order to start a new transaction is looking at the space left in the journal and
> the space needed for the committing transaction, which is 1/4 of the journal +
> the number of t_outstanding_credits for that transaction. In this case its
> assumed that t_outstanding_credits accurately represents the number of credits
Yes.

> waiting to be written to the journal, but this sometimes isn't the case. The
> transaction has two counters for buffers, t_outstanding_credits which is used in
> conjunction with handles that are added to the transaction, and t_nr_buffers
> which is only incremented/decremented when buffers are added/removed from the
> transaction and are actually destined to be journaled. Normally these two
t_nr_buffers actually represents number of buffers on BJ_Metadata list
and nobody uses it (except for the assertion in
__journal_temp_unlink_buffer()). t_outstanding_credits is supposed to be *the*
counter making sure we don't write more than we have credits for.

> counters are the same, however there are cases where the committing transaction
> can have buffers moved to the next running transaction, for example any buffers
> on the committing transactions t_reserved list would be moved to the next
> (running) transaction, and if it had been dirtied in the process it would
> immediately make it onto the t_updates list, which would increment t_nr_buffers
You probably mean t_buffers list here...
> but not t_outstanding_credits. So you get into this situation where
But which moving and dirtying do you mean? The caller which dirties
the buffer must make sure that he has acquired enough credits for the
transaction where the buffer ends up... So if there were not enough
buffers in the running transaction where we refiled the buffer it is a
bug in the caller which dirties the buffer.

> t_nr_buffers (the actual number of buffers that are on the transaction) is
> greater than the number of buffers accounted for via t_outstanding_credits.
> This presents a problem since as we loop through writting buffers to the
> journal, we decrement t_outstanding_credits, and if t_nr_buffers is more than
> t_outstanding_credits then we end up with a negative number for
> t_outstanding_credits, which means we start saying we need less than 1/4 of the
> journal for our committing transaction and allow more transactions than we can
> handle to start, and then bam we fail because journal_next_log_block doesn't
> have enough free blocks in order to handle the request. This has been tested
> and fixes the issue (which could not be reproduced by me but several other
> people could get it to reproduce using postmark), and although I couldn't
> reproduce the assertion, I could very easily reproduce the situation where
> t_outstanding_credits was < than t_nr_buffers.
I suppose you see the assertion J_ASSERT(journal->j_free > 1); to
fail, right? I don't see how your patch could help avoid that assertion.
You've just removed accounting of t_outstanding_credits which has no
impact on the real number of free blocks in the journal stored in
j_free. Anyway, if you can reproduce t_outstanding_credits <
t_nr_buffers, then there's something fishy. Are you able to reproduce it
also with a current kernel?
Thanks for looking into the problem :)

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs

2008-02-05 18:59:05

by Josef Bacik

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

On Tuesday 05 February 2008 12:27:31 pm Jan Kara wrote:
> Hello,
>
> Sorry for replying a bit late but I'm currently falling behind in
> maling-list reading...
>
> > The way jbd tries to determine if there is enough space left on the
> > journal in order to start a new transaction is looking at the space left
> > in the journal and the space needed for the committing transaction, which
> > is 1/4 of the journal + the number of t_outstanding_credits for that
> > transaction. In this case its assumed that t_outstanding_credits
> > accurately represents the number of credits
>
> Yes.
>
> > waiting to be written to the journal, but this sometimes isn't the case.
> > The transaction has two counters for buffers, t_outstanding_credits which
> > is used in conjunction with handles that are added to the transaction,
> > and t_nr_buffers which is only incremented/decremented when buffers are
> > added/removed from the transaction and are actually destined to be
> > journaled. Normally these two
>
> t_nr_buffers actually represents number of buffers on BJ_Metadata list
> and nobody uses it (except for the assertion in
> __journal_temp_unlink_buffer()). t_outstanding_credits is supposed to be
> *the* counter making sure we don't write more than we have credits for.
>
> > counters are the same, however there are cases where the committing
> > transaction can have buffers moved to the next running transaction, for
> > example any buffers on the committing transactions t_reserved list would
> > be moved to the next (running) transaction, and if it had been dirtied in
> > the process it would immediately make it onto the t_updates list, which
> > would increment t_nr_buffers
>
> You probably mean t_buffers list here...
>
> > but not t_outstanding_credits. So you get into this situation where
>
> But which moving and dirtying do you mean? The caller which dirties
> the buffer must make sure that he has acquired enough credits for the
> transaction where the buffer ends up... So if there were not enough
> buffers in the running transaction where we refiled the buffer it is a
> bug in the caller which dirties the buffer.
>

You know now that you say that I feel like an idiot, you are right the only way
for something to actually end up on that list was if somebody dirtied it and if
they did it would have had to been accounted for at some point on the running
transaction.

> > t_nr_buffers (the actual number of buffers that are on the transaction)
> > is greater than the number of buffers accounted for via
> > t_outstanding_credits. This presents a problem since as we loop through
> > writting buffers to the journal, we decrement t_outstanding_credits, and
> > if t_nr_buffers is more than t_outstanding_credits then we end up with a
> > negative number for
> > t_outstanding_credits, which means we start saying we need less than 1/4
> > of the journal for our committing transaction and allow more transactions
> > than we can handle to start, and then bam we fail because
> > journal_next_log_block doesn't have enough free blocks in order to handle
> > the request. This has been tested and fixes the issue (which could not
> > be reproduced by me but several other people could get it to reproduce
> > using postmark), and although I couldn't reproduce the assertion, I could
> > very easily reproduce the situation where t_outstanding_credits was <
> > than t_nr_buffers.
>
> I suppose you see the assertion J_ASSERT(journal->j_free > 1); to
> fail, right? I don't see how your patch could help avoid that assertion.
> You've just removed accounting of t_outstanding_credits which has no
> impact on the real number of free blocks in the journal stored in
> j_free. Anyway, if you can reproduce t_outstanding_credits <
> t_nr_buffers, then there's something fishy. Are you able to reproduce it
> also with a current kernel?
> Thanks for looking into the problem :)
>

Well my patch helped avoid the assertion because t_outstanding_credits was going
negative therefore we were letting transactions start when we shouldn't be, and
eventually we would end up with too much of the journal in use and we'd assert.
Course I can't reproduce where t_outstanding_credits < t_nr_buffers upstream
(again I feel like an idiot, should have tested that first). Thanks for
looking at this Jan.

Mingming, would you mind pulling this patch out of the patch queue please since
its wrong? Thanks much,

Josef

2008-02-06 11:06:02

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

On Tue 05-02-08 13:59:05, Josef Bacik wrote:
> On Tuesday 05 February 2008 12:27:31 pm Jan Kara wrote:
> > Hello,
> >
> > Sorry for replying a bit late but I'm currently falling behind in
> > maling-list reading...
> >
> > > The way jbd tries to determine if there is enough space left on the
> > > journal in order to start a new transaction is looking at the space left
> > > in the journal and the space needed for the committing transaction, which
> > > is 1/4 of the journal + the number of t_outstanding_credits for that
> > > transaction. In this case its assumed that t_outstanding_credits
> > > accurately represents the number of credits
> >
> > Yes.
> >
> > > waiting to be written to the journal, but this sometimes isn't the case.
> > > The transaction has two counters for buffers, t_outstanding_credits which
> > > is used in conjunction with handles that are added to the transaction,
> > > and t_nr_buffers which is only incremented/decremented when buffers are
> > > added/removed from the transaction and are actually destined to be
> > > journaled. Normally these two
> >
> > t_nr_buffers actually represents number of buffers on BJ_Metadata list
> > and nobody uses it (except for the assertion in
> > __journal_temp_unlink_buffer()). t_outstanding_credits is supposed to be
> > *the* counter making sure we don't write more than we have credits for.
> >
> > > counters are the same, however there are cases where the committing
> > > transaction can have buffers moved to the next running transaction, for
> > > example any buffers on the committing transactions t_reserved list would
> > > be moved to the next (running) transaction, and if it had been dirtied in
> > > the process it would immediately make it onto the t_updates list, which
> > > would increment t_nr_buffers
> >
> > You probably mean t_buffers list here...
> >
> > > but not t_outstanding_credits. So you get into this situation where
> >
> > But which moving and dirtying do you mean? The caller which dirties
> > the buffer must make sure that he has acquired enough credits for the
> > transaction where the buffer ends up... So if there were not enough
> > buffers in the running transaction where we refiled the buffer it is a
> > bug in the caller which dirties the buffer.
> >
>
> You know now that you say that I feel like an idiot, you are right the only way
> for something to actually end up on that list was if somebody dirtied it and if
> they did it would have had to been accounted for at some point on the running
> transaction.
Never mind :) I also made several mistakes before I learned how JBD
works...

> > > t_nr_buffers (the actual number of buffers that are on the transaction)
> > > is greater than the number of buffers accounted for via
> > > t_outstanding_credits. This presents a problem since as we loop through
> > > writting buffers to the journal, we decrement t_outstanding_credits, and
> > > if t_nr_buffers is more than t_outstanding_credits then we end up with a
> > > negative number for
> > > t_outstanding_credits, which means we start saying we need less than 1/4
> > > of the journal for our committing transaction and allow more transactions
> > > than we can handle to start, and then bam we fail because
> > > journal_next_log_block doesn't have enough free blocks in order to handle
> > > the request. This has been tested and fixes the issue (which could not
> > > be reproduced by me but several other people could get it to reproduce
> > > using postmark), and although I couldn't reproduce the assertion, I could
> > > very easily reproduce the situation where t_outstanding_credits was <
> > > than t_nr_buffers.
> >
> > I suppose you see the assertion J_ASSERT(journal->j_free > 1); to
> > fail, right? I don't see how your patch could help avoid that assertion.
> > You've just removed accounting of t_outstanding_credits which has no
> > impact on the real number of free blocks in the journal stored in
> > j_free. Anyway, if you can reproduce t_outstanding_credits <
> > t_nr_buffers, then there's something fishy. Are you able to reproduce it
> > also with a current kernel?
> > Thanks for looking into the problem :)
> >
>
> Well my patch helped avoid the assertion because t_outstanding_credits was going
> negative therefore we were letting transactions start when we shouldn't be, and
> eventually we would end up with too much of the journal in use and we'd assert.
> Course I can't reproduce where t_outstanding_credits < t_nr_buffers upstream
> (again I feel like an idiot, should have tested that first). Thanks for
> looking at this Jan.
Glad to hear we don't have a bug there :) Anyway, assertion that
t_outstanding_credits doesn't go below zero would be a useful assertion to
add...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2008-02-06 18:47:17

by Mingming Cao

[permalink] [raw]
Subject: Re: [PATCH] jbd: fix assertion failure in journal_next_log_block

On Tue, 2008-02-05 at 13:59 -0500, Josef Bacik wrote:
> On Tuesday 05 February 2008 12:27:31 pm Jan Kara wrote:
> > Hello,
> >
> > Sorry for replying a bit late but I'm currently falling behind in
> > maling-list reading...
> >
> > > The way jbd tries to determine if there is enough space left on the
> > > journal in order to start a new transaction is looking at the space left
> > > in the journal and the space needed for the committing transaction, which
> > > is 1/4 of the journal + the number of t_outstanding_credits for that
> > > transaction. In this case its assumed that t_outstanding_credits
> > > accurately represents the number of credits
> >
> > Yes.
> >
> > > waiting to be written to the journal, but this sometimes isn't the case.
> > > The transaction has two counters for buffers, t_outstanding_credits which
> > > is used in conjunction with handles that are added to the transaction,
> > > and t_nr_buffers which is only incremented/decremented when buffers are
> > > added/removed from the transaction and are actually destined to be
> > > journaled. Normally these two
> >
> > t_nr_buffers actually represents number of buffers on BJ_Metadata list
> > and nobody uses it (except for the assertion in
> > __journal_temp_unlink_buffer()). t_outstanding_credits is supposed to be
> > *the* counter making sure we don't write more than we have credits for.
> >
> > > counters are the same, however there are cases where the committing
> > > transaction can have buffers moved to the next running transaction, for
> > > example any buffers on the committing transactions t_reserved list would
> > > be moved to the next (running) transaction, and if it had been dirtied in
> > > the process it would immediately make it onto the t_updates list, which
> > > would increment t_nr_buffers
> >
> > You probably mean t_buffers list here...
> >
> > > but not t_outstanding_credits. So you get into this situation where
> >
> > But which moving and dirtying do you mean? The caller which dirties
> > the buffer must make sure that he has acquired enough credits for the
> > transaction where the buffer ends up... So if there were not enough
> > buffers in the running transaction where we refiled the buffer it is a
> > bug in the caller which dirties the buffer.
> >
>
> You know now that you say that I feel like an idiot, you are right the only way
> for something to actually end up on that list was if somebody dirtied it and if
> they did it would have had to been accounted for at some point on the running
> transaction.
>
> > > t_nr_buffers (the actual number of buffers that are on the transaction)
> > > is greater than the number of buffers accounted for via
> > > t_outstanding_credits. This presents a problem since as we loop through
> > > writting buffers to the journal, we decrement t_outstanding_credits, and
> > > if t_nr_buffers is more than t_outstanding_credits then we end up with a
> > > negative number for
> > > t_outstanding_credits, which means we start saying we need less than 1/4
> > > of the journal for our committing transaction and allow more transactions
> > > than we can handle to start, and then bam we fail because
> > > journal_next_log_block doesn't have enough free blocks in order to handle
> > > the request. This has been tested and fixes the issue (which could not
> > > be reproduced by me but several other people could get it to reproduce
> > > using postmark), and although I couldn't reproduce the assertion, I could
> > > very easily reproduce the situation where t_outstanding_credits was <
> > > than t_nr_buffers.
> >
> > I suppose you see the assertion J_ASSERT(journal->j_free > 1); to
> > fail, right? I don't see how your patch could help avoid that assertion.
> > You've just removed accounting of t_outstanding_credits which has no
> > impact on the real number of free blocks in the journal stored in
> > j_free. Anyway, if you can reproduce t_outstanding_credits <
> > t_nr_buffers, then there's something fishy. Are you able to reproduce it
> > also with a current kernel?
> > Thanks for looking into the problem :)
> >
>
> Well my patch helped avoid the assertion because t_outstanding_credits was going
> negative therefore we were letting transactions start when we shouldn't be, and
> eventually we would end up with too much of the journal in use and we'd assert.
> Course I can't reproduce where t_outstanding_credits < t_nr_buffers upstream
> (again I feel like an idiot, should have tested that first). Thanks for
> looking at this Jan.
>
> Mingming, would you mind pulling this patch out of the patch queue please since
> its wrong? Thanks much,
>

Sure, done!

Mingming